EP-3719613-B1 - RENDERING CAPTIONS FOR MEDIA CONTENT
Inventors
- LEPPÄNEN, Jussi
- MATE, SUJEET
- LEHTINIEMI, ARTO
- ERONEN, ANTTI
Dates
- Publication Date
- 20260506
- Application Date
- 20190401
Claims (15)
- An apparatus (600) comprising means for rendering visual and/or audio virtual reality content, wherein the virtual reality content comprises a virtual reality object (330, 350) associated with a caption (340) comprising textual content related to the virtual reality object (330, 350) the caption (340) is associated with, and wherein the caption (340) is rendered within a field of view of a user (310); determining a level of interest with respect to the virtual reality object (330, 350) based on an amount of time the user (310) has spent looking at and/or listening to the virtual reality object (330, 350); determining a change of an orientation of the user and a perceived position of the user within the virtual reality content; rendering the virtual reality object (330, 350) associated with the caption (340) such that it is no longer visually rendered completely within the field of view of the user; and determining if the caption (340) is to be rendered based at least partly on the determined level of interest associated with the virtual reality object (330, 350) and if the determined level of interest is higher than a threshold value, continuing to render the caption (340), wherein when the virtual reality object (330, 350) is outside the field of view of the user (310), rendering the caption (340) includes indicating a distance or a direction to the virtual reality object (330, 350).
- An apparatus (600) according to claim 1, wherein the textual content is determined using speech recognition and automatic translation.
- An apparatus (600) according to any previous claim, wherein determining the level of interest with respect to the virtual reality object (330, 350) comprises at least one of: detecting gaze of the user (310), detecting orientation of the user's (310) head, or the user (310) zooming to the virtual reality content.
- An apparatus (600) according to any previous claim, wherein determining the level of interest with respect to the virtual reality object (330, 350) is based on the user (310) zooming to the virtual reality object (330, 350), which causes audio virtual reality content associated with the virtual reality object (330, 350) to be rendered to the user (310) and the level of interest is determined to be higher than the threshold value.
- An apparatus (600) according to claim 4, further comprising means for attenuating the audio virtual reality content associated with the virtual reality object (330, 350) and continuing to render the caption (340).
- An apparatus (600) according to any previous claim, further comprising means for detecting a change in the field of view, which comprises at least one of detecting movement of the user (310), or a part of the user (310).
- An apparatus (600) according to claim 6, wherein the determining if the caption (340) is to be displayed after the detected change in the field of view is further based on a virtual distance between the virtual reality object (330, 350) and the user (310) after the movement.
- A method comprising rendering visual and/or audio virtual reality content, wherein the virtual reality content comprises a virtual reality object (330, 350) associated with a caption (340) comprising textual content related to the virtual reality object (330, 350) the caption (340) is associated with, and wherein the caption (340) is rendered within a field of view of a user (310); determining a level of interest with respect to the virtual reality object (330, 350) based on an amount of time the user (310) has spent looking at and/or listening to the virtual reality object (330, 350); determining a change of an orientation of the user (310) and a perceived position of the user within the virtual reality content; rendering the virtual reality object (330, 350) associated with the caption (340) such that it is no longer visually rendered completely within the field of view of the user; and determining if the caption (340) is to be rendered based at least partly on the determined level of interest associated with the virtual reality object (330, 350) and if the determined level of interest is higher than a threshold value, continuing to render the caption (340), wherein when the virtual reality object (330, 350) is outside the field of view of the user (310), rendering the caption (340) includes indicating a distance or a direction to the virtual reality object (330, 350).
- A method according to claim 8, wherein determining the level of interest with respect to the virtual reality object comprises at least one of: detecting gaze of the user (310), detecting orientation of the user's (310) head, or the user (310) zooming to the virtual reality content.
- A method according to any of claims 8 or 9, wherein determining the level of interest with respect to the virtual reality object (330, 350) is based on the user (310) zooming to the virtual reality object (330, 350), which causes audio virtual reality content associated with the virtual reality object (330, 350) to be rendered to the user (310) and the level of interest is determined to be higher than the threshold value.
- A method according to any of claims 8 to 10, further comprising detecting a change in the field of view, which comprises at least one of detecting movement of a user (310), or a part of the user (310), or the user (310) zooming to the virtual reality content.
- A computer program product comprising instructions which, when executed by an apparatus (600) according to claim 1, cause the apparatus to perform: rendering visual and/or audio virtual reality content, wherein the virtual reality content comprises a virtual reality object (330, 350) associated with a caption (340) comprising textual content related to the virtual reality object (330, 350) the caption (340) is associated with, and wherein the caption (340) is rendered within a field of view of a user (310); determining a level of interest with respect to the virtual reality object (330, 350) based on an amount of time the user (310) has spent looking at and/or listening to the virtual reality object (330, 350); determining a change of an orientation of the user (310) and a perceived position of the user within the virtual reality content; rendering the virtual reality object (330, 350) associated with the caption (340) such that it is no longer visually rendered completely within the field of view of the user; and determining if the caption (340) is to be rendered based at least partly on the determined level of interest associated with the virtual reality object (330, 350) and if the determined level of interest is higher than a threshold value, continuing to render the caption (340), wherein when the virtual reality object (330, 350) is outside the field of view of the user (310), rendering the caption (340) includes indicating a distance or a direction to the virtual reality object (330, 350).
- A computer program product according to claim 12, wherein determining the level of interest with respect to the virtual reality object (330, 350) comprises at least one of: detecting gaze of the user (310), detecting orientation of the user's head, or the user (310) zooming to the virtual reality content.
- A computer program product according to claim 12 or 13, wherein determining the level of interest with respect to the virtual reality object (330, 350) is based on the user (310) zooming to the virtual reality object (330, 350), which causes audio virtual reality content associated with the virtual reality object (330, 350) to be rendered to the user (310) and the level of interest is determined to be higher than the threshold value.
- A computer program product according to any of claims 12 to 14, further comprising instructions for detecting a change in the field of view, which comprises at least one of detecting movement of a user, or a part of the user, or receiving an input that is determined to represent movement of the user.
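The caption-rendering decision recited in claims 1, 8 and 12 can be illustrated with a short sketch. This is a hypothetical illustration only, not an implementation from the patent: all names (VRObject, render_caption, the 2-second threshold, the planar distance/bearing computation) are assumptions chosen for clarity.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VRObject:
    """Illustrative stand-in for a virtual reality object (330, 350)."""
    position: Tuple[float, float, float]  # (x, y, z) in scene coordinates
    caption: str                          # textual content of the caption (340)
    view_time: float = 0.0                # seconds spent looking at / listening to it


INTEREST_THRESHOLD = 2.0  # assumed threshold value, in seconds of attention


def level_of_interest(obj: VRObject) -> float:
    # Per the claims, interest is based on the amount of time the user
    # has spent looking at and/or listening to the object.
    return obj.view_time


def render_caption(obj: VRObject, in_field_of_view: bool,
                   user_position: Tuple[float, float, float]) -> Optional[str]:
    """Decide whether, and in what form, to render the caption."""
    if in_field_of_view:
        return obj.caption
    if level_of_interest(obj) > INTEREST_THRESHOLD:
        # Object left the field of view but interest exceeds the threshold:
        # keep the caption and indicate distance and direction to the object.
        dx = obj.position[0] - user_position[0]
        dz = obj.position[2] - user_position[2]
        distance = math.hypot(dx, dz)               # planar distance
        bearing = math.degrees(math.atan2(dx, dz))  # direction relative to +z
        return f"{obj.caption} ({distance:.1f} m at {bearing:.0f}°)"
    return None  # interest below threshold: stop rendering the caption
```

In this sketch, a caption for an object outside the field of view survives only while the accumulated attention time exceeds the threshold, matching the conditional "continuing to render" step of the independent claims.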
Description
FIELD

The present application relates to rendering of computer-generated content.

BACKGROUND

Rendering computer-generated content may be utilized in creating a desirable user experience. The computer-generated content may include visual content as well as audio and/or haptic content. Various devices are capable of rendering computer-generated content, and the content may have been captured using cameras and microphones, it may be computer-generated, or it may be a combination of both.

US2013044128 discloses that a user interface includes a virtual object having an appearance in context with a real environment of a user using a see-through, near-eye augmented reality display device system. A virtual type of object and at least one real world object are selected based on compatibility criteria for forming a physical connection, such as attachment, supporting or integration of the virtual object with the at least one real object. Other appearance characteristics, e.g. color, size or shape, of the virtual object are selected for satisfying compatibility criteria with the selected at least one real object.

US2018204386 discloses techniques for displaying navigation information on a mobile device, including a method that comprises obtaining an indication of a position and an indication of a direction associated with the mobile device; using the indication of the position, the indication of the direction, information regarding identities of POIs within a geographic region of interest, and information regarding areas associated with the POIs to determine at least one relevant POI, of the POIs, that is associated with the position and direction; and displaying at least one visual indication associated with each of the at least one relevant POI on the mobile device. The appearance of the at least one visual indication is dependent on at least one of a distance from the mobile device of the relevant POI associated with the visual indication or the presence of a known physical barrier between the mobile device and that relevant POI.

WO2016048633 discloses that a method may include detecting, in image data, an object and a gesture; in response to detecting the object in the image data, providing data indicative of the detected object; in response to detecting the gesture in the image data, providing data indicative of the detected gesture; and modifying the image data using the data indicative of the detected object and the data indicative of the detected gesture.

US 2018/0253144 relates to an augmented reality interactive system and discloses dynamically changing signs, each corresponding to a selectable object in the user's field of view. The rendering includes altering a first display attribute of a given sign of a plurality of signs based on determining that the user's field of view is centered on the given sign.

BRIEF DESCRIPTION

The scope of protection sought for various embodiments is set out by the independent claims. Dependent claims define further embodiments included in the scope of protection. The exemplary embodiments and features, if any, described in this specification that do not fall under the scope of the independent claims are to be interpreted as examples useful for understanding various embodiments of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 illustrates an example embodiment of an apparatus for rendering virtual reality content.
Figure 2 illustrates an example of providing audio content.
Figures 3a-3d illustrate an example embodiment of rendering captions.
Figure 4 illustrates an example embodiment of rendering captions for a virtual reality scene.
Figure 5 illustrates a flow chart according to an example embodiment.
Figure 6 illustrates an example embodiment of an apparatus.
DETAILED DESCRIPTION

Creating a user experience comprises rendering content such as visual content and/or audio content. In some example embodiments, the user experience may be enhanced by utilizing haptic feedback as well. Augmented reality provides an enhanced user experience by enhancing a physical environment with computer-generated content. The computer-generated content may comprise visual content, audio content and/or haptic feedback provided to the user. Yet the user may still sense the surrounding physical environment and is thereby not fully immersed in the augmented reality content. Mixed reality provides a user experience similar to augmented reality, but in mixed reality the added computer-generated content may be anchored to the real-world content and may be perceived to interact with real-world objects. For ease of explanation, the umbrella term virtual reality is used from here on, and it covers augmented reality, mixed reality and virtual reality. Virtual reality provides an immersive user experience by rendering visual content that fills the user's field of view. The user experience may further be enhanced by rendering audio content as well. The virtual reality content can therefore be c