US-12620186-B2 - Augmenting video or external environment with 3D graphics
Abstract
A rendering device is provided for rendering a scene, wherein the rendering device is configured to display a video object or allow visual pass-through of an external environment to a viewer and to render the scene as augmentation of the video object or the external environment. The scene may be rendered in a hybrid manner, involving local rendering of a part of the scene by the rendering device and remote rendering of another part of the scene by a remote rendering system. Both the rendering device and the remote rendering system may have access to scene descriptor data which may be indicative of changes in a state of the scene over time. As such, the rendering device may request remote rendering of a part of the scene at a particular temporal state, e.g., to anticipate network and/or processing latencies.
Inventors
- Emmanouil Potetsianakis
- Aschwin Steven Reinier Brandt
- Emmanuel Thomas
Assignees
- KONINKLIJKE KPN N.V.
- NEDERLANDSE ORGANISATIE VOOR TOEGEPAST-NATUURWETENSCHAPPELIJK ONDERZOEK TNO
Dates
- Publication Date: 2026-05-05
- Application Date: 2022-06-27
- Priority Date: 2021-07-09
Claims (20)
- 1. A computer-implemented method of rendering a scene using a rendering device, wherein the rendering device is configured to display a video object or allow visual pass-through of an external environment to a viewer and to render the scene as augmentation of the video object or the external environment, the method comprising by the rendering device: obtaining scene descriptor data, wherein the scene descriptor data identifies a set of 3D graphics objects representing at least part of the scene, wherein the scene descriptor data is indicative of changes in a state of the scene over time; based on at least the scene descriptor data, determining a first part of the scene to be rendered locally by the rendering device and a second part of the scene to be rendered remotely; determining a temporal marker to indicate a second time instance which is ahead in time of a first time instance at which the rendering device currently renders the scene; requesting a remote rendering system having access to the scene descriptor data to render the second part of the scene to obtain prerendered scene data comprising a prerendered version of the second part of the scene, wherein the requesting comprises providing the temporal marker to the remote rendering system to request the remote rendering system to render the second part of the scene at the state corresponding to the temporal marker; obtaining the prerendered scene data from the remote rendering system; rendering the scene as the augmentation of the video object or the external environment, the rendering comprising locally rendering the first part of the scene and including the prerendered version of the second part of the scene.
- 2. The computer-implemented method according to claim 1, wherein the scene descriptor data describes the state of the scene at a plurality of time instances on a scene timeline to indicate the changes in the state of the scene over time, and wherein the temporal marker is indicative of a time instance on the scene timeline.
- 3. The computer-implemented method according to claim 1, wherein the scene descriptor data is indicative of at least one of: a presence, a position, an orientation, and an appearance, of a respective object, in the scene over time.
- 4. The computer-implemented method according to claim 1, further comprising determining the temporal marker based on at least one of: a network latency of a network between the rendering device and the remote rendering system; a latency of, or scheduling constraints associated with, the rendering of the second part of the scene by the remote rendering system.
- 5. The computer-implemented method according to claim 1, wherein the scene descriptor data comprises object identifiers of respective ones of the set of 3D graphics objects, wherein: the requesting the remote rendering system comprises providing respective object identifiers of one or more 3D graphics objects representing the second part of the scene to the remote rendering system; and/or the receiving of the prerendered scene data comprises receiving metadata comprising respective object identifiers of one or more 3D graphics objects which are prerendered in the prerendered scene data.
- 6. The computer-implemented method according to claim 5, wherein the metadata is received repeatedly, and wherein the method further comprises stopping or starting a local rendering of a 3D graphical object based on an object identifier of the 3D graphical object appearing in or disappearing from the metadata.
- 7. The computer-implemented method according to claim 1, wherein the rendering of the scene is adaptable to a content of the video object or of the external environment, and wherein the method further comprises determining content data which at least in part characterizes the content and providing the content data to the remote rendering system.
- 8. The computer-implemented method according to claim 7, wherein the content data comprises at least one of: lighting data characterizing a lighting in the content; depth data characterizing a depth of the content; and image or video data providing a visual representation of the content.
- 9. The computer-implemented method according to claim 1, wherein the determining of the first part of the scene to be rendered locally and the second part of the scene to be rendered remotely comprises determining whether a 3D graphical object is to be rendered locally or remotely based on at least one of: a latency, bandwidth or reliability, of a network between the rendering device and the remote rendering system; a latency, bandwidth or reliability, of a network between the rendering device and a content server, which content server is configured to host 3D graphics data defining the set of 3D graphics objects of the scene; a computational load of the rendering device; a computational load of the remote rendering system; a battery level of the rendering device; a complexity of the 3D graphical object; and a scene distance between a viewpoint in the scene from which the scene is rendered by the rendering device and the 3D graphical object in the scene.
- 10. The computer-implemented method according to claim 1, wherein the scene descriptor data comprises a graph comprising branches indicating a hierarchical relation between 3D graphics objects, wherein the determining of the first part of the scene to be rendered locally and the second part of the scene to be rendered remotely comprises determining per branch whether to render the 3D graphics objects of a respective branch locally or remotely.
- 11. A computer-implemented method of rendering part of a scene for a rendering device, wherein the rendering device is configured to display a video object or allow visual pass-through of an external environment to a viewer and to render the scene as augmentation of the video object or the external environment, the method comprising by a remote rendering system: obtaining scene descriptor data, wherein the scene descriptor data identifies a set of 3D graphics objects representing at least part of the scene, wherein the scene descriptor data is indicative of changes in a state of the scene over time; receiving a request to render part of the scene, wherein the request comprises a temporal marker which indicates a second time instance which is ahead in time of a first time instance at which the rendering device renders the scene at the time the rendering device determines the temporal marker; based on the scene descriptor data and the request, rendering the part of the scene at the state corresponding to the temporal marker to obtain prerendered scene data comprising a prerendered version of the part of the scene; providing the prerendered scene data to the rendering device.
- 12. The computer-implemented method according to claim 11, further comprising generating metadata comprising respective object identifiers of one or more 3D graphics objects which are prerendered in the prerendered scene data and providing the metadata with the prerendered scene data to the rendering device.
- 13. The computer-implemented method according to claim 11, further comprising adjusting the part of the scene which is requested to be rendered by omitting 3D graphics objects from the rendering based on one of: a computational load of the remote rendering system; a latency, bandwidth or reliability, of a network between the rendering device and the remote rendering system.
- 14. A non-transitory computer-readable medium comprising data representing a computer program, the computer program comprising instructions for causing a processor system to perform the method according to claim 11.
- 15. A non-transitory computer-readable medium comprising data representing a computer program, the computer program comprising instructions for causing a processor system to perform the method according to claim 1.
- 16. A rendering device configured to display a video object or allow visual pass-through of an external environment to a viewer and to render a scene comprising 3D graphics objects as augmentation of the video object or the external environment, the rendering device comprising: a network interface to a network; a data storage comprising scene descriptor data, wherein the scene descriptor data identifies a set of 3D graphics objects representing at least part of the scene, wherein the scene descriptor data is indicative of changes in a state of the scene over time; a processor subsystem configured to: based on at least the scene descriptor data, determine a first part of the scene to be rendered locally by the rendering device and a second part of the scene to be rendered remotely; determine a temporal marker to indicate a second time instance which is ahead in time of a first time instance at which the rendering device currently renders the scene; via the network interface, request a remote rendering system having access to the scene descriptor data to render the second part of the scene to obtain prerendered scene data comprising a prerendered version of the second part of the scene, wherein said requesting comprises to provide the temporal marker to the remote rendering system to request the remote rendering system to render the second part of the scene at the state corresponding to the temporal marker; via the network interface, receive the prerendered scene data from the remote rendering system; render the scene as the augmentation of the video object or the external environment, the rendering comprising to locally render the first part of the scene and to include the prerendered version of the second part of the scene.
- 17. The rendering device according to claim 16, wherein the scene descriptor data describes the state of the scene at a plurality of time instances on a scene timeline to indicate the changes in the state of the scene over time, and wherein the temporal marker is indicative of a time instance on the scene timeline.
- 18. The rendering device according to claim 16, wherein the scene descriptor data is indicative of at least one of: a presence, a position, an orientation, and an appearance, of a respective object, in the scene over time.
- 19. The rendering device according to claim 16, wherein the processor subsystem is configured to determine the temporal marker further based on at least one of: a network latency of a network between the rendering device and the remote rendering system; a latency of, or scheduling constraints associated with, the rendering of the second part of the scene by the remote rendering system.
- 20. The rendering device according to claim 16, wherein the scene descriptor data comprises object identifiers of respective ones of the set of 3D graphics objects, wherein: to request the remote rendering system comprises to provide respective object identifiers of one or more 3D graphics objects representing the second part of the scene to the remote rendering system; and/or to receive the prerendered scene data comprises to receive metadata comprising respective object identifiers of one or more 3D graphics objects which are prerendered in the prerendered scene data.
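Purely as an informal, non-limiting illustration of the client-side steps recited in claims 1, 4 and 5 above, the following sketch shows one possible way a rendering device might split a scene into locally and remotely rendered parts, derive a temporal marker from expected latencies, and form a remote rendering request. All names, keys and numeric values (choose_split, temporal_marker, "triangles", the latency figures) are hypothetical and do not appear in the claims.

```python
# Informal sketch (not part of the claims) of the client-side flow of claims
# 1, 4 and 5: split the scene, place a temporal marker ahead of the current
# render time to absorb network and rendering latency, and request remote
# rendering of the remote part at the state corresponding to that marker.
import time

def choose_split(scene_objects, complexity_threshold=1000):
    """Render complex objects remotely, simple ones locally (claim 9 lists
    further factors such as network quality, battery level and scene distance)."""
    local, remote = [], []
    for obj in scene_objects:
        (remote if obj["triangles"] > complexity_threshold else local).append(obj)
    return local, remote

def temporal_marker(now, network_latency_s, remote_render_latency_s):
    """Claim 4: place the marker ahead of the current render time by the
    expected network and remote-rendering latencies."""
    return now + network_latency_s + remote_render_latency_s

scene = [
    {"id": "table", "triangles": 200},
    {"id": "city_model", "triangles": 5_000_000},
]
local_part, remote_part = choose_split(scene)
marker = temporal_marker(now=time.time(), network_latency_s=0.040,
                         remote_render_latency_s=0.020)

# Claim 5: the request carries the object identifiers of the remote part and
# the temporal marker whose scene state is to be rendered remotely.
request = {"object_ids": [o["id"] for o in remote_part], "temporal_marker": marker}
print("render locally:", [o["id"] for o in local_part])
print("remote request:", request)
```

In practice the split decision may weigh any of the factors listed in claim 9, and the temporal marker may be expressed as a time instance on a scene timeline as in claim 2.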
Description
This application is the U.S. National Stage of International Application No. PCT/EP2022/067618, filed Jun. 27, 2022, which designates the U.S., published in English, and claims priority under 35 U.S.C. § 119 or 365(c) to European Application No. 21184721.5, filed Jul. 9, 2021. The entire teachings of the above applications are incorporated herein by reference.

TECHNICAL FIELD

The invention relates to a computer-implemented method for rendering a scene using a rendering device, wherein the rendering device is configured to display a video object or allow visual pass-through of an external environment to a viewer and to render the scene as augmentation of the video object or the external environment. The invention further relates to the rendering device and to a remote rendering system and computer-implemented method for rendering part of a scene for the rendering device. The invention further relates to a computer-readable medium comprising data for causing a processor system to perform any of the computer-implemented methods.

BACKGROUND

In Augmented Reality (AR) or Mixed Reality (MR), computer graphics-based 3D objects may be combined with the physical reality perceived by a viewer so as to augment the physical reality for the viewer. For example, a head-worn AR or MR rendering device may visually pass through the external environment to a viewer, for example via a transparent or semitransparent portion of the rendering device, and may be configured to render one or more computer graphics-based 3D objects of a scene and display them in such a way that they are overlaid, mixed, blended or in any other way visually combined with the external environment from the perception of the viewer. Thereby, the outside world may be augmented with digital content, such as informative content (e.g., navigation directions) or entertainment (e.g., game characters), etc.

Such computer graphics-based 3D objects (henceforth also simply referred to as ‘3D graphics objects’) may also be rendered to augment a video-based representation of the physical reality. For example, a head-worn AR/MR rendering device may comprise a forward-facing camera and may record and display the recorded video in real-time or near real-time to a viewer while augmenting the video with rendered 3D graphics objects. Also, other types of videos may be augmented with 3D graphics objects. For example, in Virtual Reality (VR), a so-called panoramic video or omnidirectional video may be augmented with one or more 3D graphics objects. AR, MR and VR are together also referred to as Extended Reality (XR).

It is known to perform the rendering of 3D graphics objects for XR-augmentation at a user's rendering device, which rendering is in the following also referred to as ‘local’ rendering. It is also known to perform the rendering at a remote rendering system, which rendering is in the following also referred to as ‘remote’ rendering. For example, [1] describes in section 6.2.4 a viewport of a scene being entirely rendered by an XR server. It is said that the XR server generates the XR media on the fly based on incoming tracking and sensor information, for example using a game engine. The generated XR media is provided for the viewport in a 2D format (flattened), encoded and delivered over a 5G network. The tracking and sensor information is delivered in the reverse direction. In the XR device, the media decoders decode the media, and the viewport is directly rendered without using the viewport information.
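To make the fully remote ("viewport rendering in the network") approach attributed to [1] more concrete, the following is a minimal, illustrative sketch of one iteration of that loop: the device uploads tracking/sensor data, the server renders and encodes a flattened 2D viewport, and the device decodes and displays it. This is not code from [1] or from this application; all names (Pose, render_viewport, encode_frame, decode_frame, display) are hypothetical placeholders.

```python
# Minimal sketch of the remote viewport rendering loop: tracking data flows
# from device to server, a rendered and encoded 2D viewport flows back.
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) of the viewer
    orientation: tuple   # quaternion (w, x, y, z)

def render_viewport(pose: Pose) -> bytes:
    """Server side: render the scene for the given pose into a flattened 2D frame."""
    return b"rgb-frame-for-" + repr(pose).encode()   # placeholder pixel data

def encode_frame(frame: bytes) -> bytes:
    """Server side: video-encode the rendered frame for streaming to the device."""
    return b"encoded:" + frame                        # placeholder codec

def decode_frame(packet: bytes) -> bytes:
    """Device side: decode the received video packet back into a displayable frame."""
    return packet[len(b"encoded:"):]

def display(frame: bytes) -> None:
    """Device side: present the decoded frame; the viewport is shown as received."""
    print("displaying", frame[:40], b"...")

# One iteration of the loop described for [1].
pose = Pose(position=(0.0, 1.6, 0.0), orientation=(1.0, 0.0, 0.0, 0.0))
packet = encode_frame(render_viewport(pose))   # remote rendering system
display(decode_frame(packet))                  # rendering device
```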
Remote rendering may address some problems associated with local rendering. For example, if the scene to be rendered is too complex, e.g., requiring more rendering performance than available, or if the rendering device is battery-operated and running low on battery, it may be desirable to have the scene rendered by a remote rendering system, instead of rendering the scene locally. However, remote rendering may be disadvantageous in other situations, for example if network connectivity is poor or if it is desirable to render an object at (ultra-)low latency.

A scene may also be partially rendered by a remote rendering system and partially by a local rendering device, with such rendering being referred to as ‘hybrid’ or ‘split’ rendering. For example, [2] describes rendering more complex content, such as computer-aided design models, on a remote computing system and streaming the rendered models to a client device executing an application that generates and/or displays the models. The client device may augment the remotely rendered content with locally rendered content, such as lightweight, lower-latency content (e.g., user interface/user experience elements, inking, content correlated with a real-world object such as an articulated hand, etc.). The client device also may perform depth-correct blending to provide occlusion relations between remotely and locally rendered content. However, [2] assumes the scene to be rendered to be largely static, with the exception of indivi