EP-4740455-A1 - FINAL VIEW GENERATION USING OFFSET AND/OR ANGLED SEE-THROUGH CAMERAS IN VIDEO SEE-THROUGH (VST) EXTENDED REALITY (XR)
Abstract
According to an embodiment of the disclosure, a method may include obtaining images of a scene captured using a plurality of see-through cameras of a video see-through (VST) extended reality (XR) device. The method may include applying a passthrough transformation to the images in order to generate transformed images. The method may include displaying the transformed images on one or more display panels of the VST XR device. According to an embodiment of the disclosure, the passthrough transformation may be based on a first transformation between viewpoints of the plurality of see-through cameras and viewpoints of viewpoint-matched virtual cameras and a second transformation that aligns principal points of the plurality of see-through cameras and principal points of the one or more display panels.
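As an illustrative aside (not part of the patent text): the second transformation described above, which aligns the principal points of the see-through cameras with those of the display panels, can be sketched as a pixel-space homography built from camera and display intrinsic matrices. All intrinsic values and variable names below are hypothetical assumptions, chosen only to make the idea concrete.

```python
import numpy as np

# Hypothetical intrinsics (not from the patent): a see-through camera and a
# display panel that share a focal length but have different principal points.
fx = fy = 1000.0
cam_pp = np.array([640.0, 360.0])    # camera principal point (cx, cy), assumed
disp_pp = np.array([660.0, 380.0])   # display-panel principal point, assumed

K_cam = np.array([[fx, 0.0, cam_pp[0]],
                  [0.0, fy, cam_pp[1]],
                  [0.0, 0.0, 1.0]])
K_disp = np.array([[fx, 0.0, disp_pp[0]],
                   [0.0, fy, disp_pp[1]],
                   [0.0, 0.0, 1.0]])

# Principal-point alignment as a homography: with shared focal lengths,
# H = K_disp @ inv(K_cam) reduces to a pixel-space translation.
H_align = K_disp @ np.linalg.inv(K_cam)

# A pixel at the camera principal point should land on the display
# principal point after alignment.
p = np.array([cam_pp[0], cam_pp[1], 1.0])
q = H_align @ p
q = q[:2] / q[2]
print(q)  # [660. 380.]
```

In this simplified setting the alignment is a pure translation in pixel coordinates; with differing focal lengths the same construction would also rescale the image.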
Inventors
- XIONG, YINGEN
Assignees
- Samsung Electronics Co., Ltd.
Dates
- Publication Date: 2026-05-13
- Application Date: 2024-06-28
Claims (15)
- A method comprising: identifying a passthrough transformation associated with a video see-through (VST) extended reality (XR) device (101), the VST XR device (101) comprising a plurality of see-through cameras (202a-202b) that are at least one of offset from forward axes (206a-206b) extending from expected locations of a user's eyes (204a-204b) when using the VST XR device (101) or rotated such that optical axes (208a-208b) of the plurality of see-through cameras (202a-202b) are angled relative to the forward axes (206a-206b); obtaining images of a scene captured using the plurality of see-through cameras (202a-202b); applying the passthrough transformation to the images in order to generate transformed images; and displaying the transformed images on one or more display panels (302) of the VST XR device (101); wherein the passthrough transformation is based on a first transformation between viewpoints of the plurality of see-through cameras (202a-202b) and viewpoints of viewpoint-matched virtual cameras (510a-510b; 610; 710; 810; 910; 1010a-1010b; 1102a-1102b) and a second transformation that aligns principal points of the plurality of see-through cameras (202a-202b) and principal points of the one or more display panels (302).
- The method of Claim 1, wherein the passthrough transformation is further based on a rectification to map image frames (1106a-1106b) of viewpoint-matched virtual cameras (510a-510b; 610; 710; 810; 910; 1010a-1010b; 1102a-1102b) to image frames (1202a-1202b) of virtual rendering cameras.
- The method of any one of Claims 1 and 2, wherein the transformed images provide a wider field of view than a field of view at the expected locations of the user's eyes.
- The method of any one of Claims 1 to 3, wherein the optical axes (208a-208b) of the plurality of see-through cameras (202a-202b) are not parallel to each other.
- The method of any one of Claims 1 to 4, wherein the first transformation is based on an interpupillary distance associated with the expected locations of the user's eyes (204a-204b).
- The method of any one of Claims 1 to 5, wherein the plurality of see-through cameras (202a-202b) are positioned above or below the forward axes (206a-206b).
- The method of any one of Claims 1 to 6, wherein the optical axes (208a-208b) of the plurality of see-through cameras (202a-202b) are at least one of angled outward or angled downward relative to the forward axes (206a-206b).
- A video see-through (VST) extended reality (XR) device (101) comprising: a plurality of see-through cameras (202a-202b) configured to obtain images of a scene, wherein the plurality of see-through cameras (202a-202b) are at least one of offset from forward axes (206a-206b) extending from expected locations of a user's eyes (204a-204b) when using the VST XR device (101) or rotated such that optical axes (208a-208b) of the plurality of see-through cameras (202a-202b) are angled relative to the forward axes (206a-206b); one or more display panels (302); a memory storing one or more instructions; and at least one processor communicatively coupled to the memory, wherein the at least one processor executes the one or more instructions stored in the memory to cause the VST XR device (101) to: identify a passthrough transformation associated with the VST XR device (101); apply the passthrough transformation to the images in order to generate transformed images; and initiate display of the transformed images on the one or more display panels (302); wherein the passthrough transformation is based on a first transformation between viewpoints of the plurality of see-through cameras (202a-202b) and viewpoints of viewpoint-matched virtual cameras (510a-510b; 610; 710; 810; 910; 1010a-1010b; 1102a-1102b) and a second transformation that aligns principal points of the plurality of see-through cameras (202a-202b) and principal points of the one or more display panels (302).
- The VST XR device (101) of Claim 8, wherein the passthrough transformation is further based on a rectification to map image frames (1106a-1106b) of viewpoint-matched virtual cameras (510a-510b; 610; 710; 810; 910; 1010a-1010b; 1102a-1102b) to image frames (1202a-1202b) of virtual rendering cameras.
- The VST XR device (101) of any one of Claims 8 and 9, wherein the transformed images provide a wider field of view than a field of view at the expected locations of the user's eyes.
- The VST XR device (101) of any one of Claims 8 to 10, wherein the optical axes of the plurality of see-through cameras are not parallel to each other.
- The VST XR device (101) of any one of Claims 8 to 11, wherein the first transformation is based on an interpupillary distance associated with the expected locations of the user's eyes.
- The VST XR device (101) of any one of Claims 8 to 12, wherein the plurality of see-through cameras are positioned above or below the forward axes.
- The VST XR device (101) of any one of Claims 8 to 13, wherein the optical axes of the plurality of see-through cameras are at least one of angled outward or angled downward relative to the forward axes.
- A machine-readable medium containing instructions that, when executed, cause at least one processor of a video see-through (VST) extended reality (XR) device (101) to: identify a passthrough transformation associated with the VST XR device (101), the VST XR device (101) comprising a plurality of see-through cameras (202a-202b) that are at least one of offset from forward axes (206a-206b) extending from expected locations of a user's eyes (204a-204b) when using the VST XR device (101) or rotated such that optical axes (208a-208b) of the plurality of see-through cameras (202a-202b) are angled relative to the forward axes (206a-206b); obtain images of a scene captured using the plurality of see-through cameras (202a-202b); apply the passthrough transformation to the images in order to generate transformed images; and initiate display of the transformed images on one or more display panels (302) of the VST XR device (101); wherein the passthrough transformation is based on a first transformation between viewpoints of the plurality of see-through cameras (202a-202b) and viewpoints of viewpoint-matched virtual cameras (510a-510b; 610; 710; 810; 910; 1010a-1010b; 1102a-1102b) and a second transformation that aligns principal points of the plurality of see-through cameras (202a-202b) and principal points of the one or more display panels (302).
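As a hedged illustration of the claimed first transformation (the mapping between see-through camera viewpoints and viewpoint-matched virtual camera viewpoints): under a strong simplifying assumption not stated in the claims, namely that the scene lies on a fronto-parallel plane at a known depth, shifting the viewpoint by the camera-to-eye offset induces a planar homography. The intrinsics, offset, and depth below are purely illustrative assumptions, not values from the patent.

```python
import numpy as np

def viewpoint_match_homography(K, t, depth):
    """Planar homography H = K (I + t n^T / d) K^-1 for a fronto-parallel
    plane at depth d (normal n = [0, 0, 1]) and viewpoint translation t.
    A simplified sketch of a viewpoint-matching transformation, not the
    patent's actual method."""
    n = np.array([0.0, 0.0, 1.0])
    return K @ (np.eye(3) + np.outer(t, n) / depth) @ np.linalg.inv(K)

# Assumed intrinsics of a see-through camera (hypothetical values).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# Assumed offset of the see-through camera from the eye position, in metres,
# and an assumed scene depth of 2 m.
t = np.array([0.02, -0.01, 0.0])
H = viewpoint_match_homography(K, t, depth=2.0)

# Map the image-centre pixel through the first transformation: it shifts by
# fx * tx / d = 10 px horizontally and fy * ty / d = -5 px vertically.
p = H @ np.array([640.0, 360.0, 1.0])
p = p[:2] / p[2]   # [650., 355.]
```

In practice the shift is depth-dependent, which is why a single planar homography is only an approximation; per-pixel depth would be needed for exact viewpoint matching.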
Description
FINAL VIEW GENERATION USING OFFSET AND/OR ANGLED SEE-THROUGH CAMERAS IN VIDEO SEE-THROUGH (VST) EXTENDED REALITY (XR)

This disclosure relates generally to extended reality (XR) systems and processes. More specifically, this disclosure relates to final view generation using offset and/or angled see-through cameras in video see-through (VST) XR.

Extended reality (XR) systems are becoming increasingly popular, and numerous applications have been and are being developed for XR systems. Some XR systems (such as augmented reality or "AR" systems and mixed reality or "MR" systems) can enhance a user's view of his or her current environment by overlaying digital content (such as information or virtual objects) over the user's view of the current environment. For example, some XR systems can seamlessly blend virtual objects generated by computer graphics with real-world scenes.

For a more complete understanding of this disclosure and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which: FIGURE 1 illustrates an example network configuration including an electronic device in accordance with this disclosure; FIGURES 2A through 2G illustrate example configurations of see-through cameras in a video see-through (VST) extended reality (XR) device in accordance with this disclosure; FIGURES 3A through 3C illustrate an example modification of a field of view using a VST XR device having a specified configuration of see-through cameras and resulting changes to captured see-through images in accordance with this disclosure; FIGURE 4 illustrates an example functional architecture supporting final view generation using offset and/or angled see-through cameras in VST XR in accordance with this disclosure; FIGURES 5A through 10C illustrate example determinations of a first transformation for various configurations of see-through cameras in a VST XR device in accordance with this disclosure; FIGURE 11 illustrates an example of how a determined first transformation may be used as part of a passthrough transformation in accordance with this disclosure; FIGURES 12 through 16 illustrate example rectifications of viewpoint-matched virtual image pairs as part of a passthrough transformation in accordance with this disclosure; FIGURES 17 through 21 illustrate example determinations of a second transformation for various configurations of see-through cameras in a VST XR device in accordance with this disclosure; and FIGURE 22 illustrates an example method for final view generation using offset and/or angled see-through cameras in VST XR in accordance with this disclosure.

Before undertaking the DETAILED DESCRIPTION below, it may be advantageous to set forth definitions of certain words and phrases used throughout this patent document. The terms "transmit," "receive," and "communicate," as well as derivatives thereof, encompass both direct and indirect communication. The terms "include" and "comprise," as well as derivatives thereof, mean inclusion without limitation. The term "or" is inclusive, meaning and/or. The phrase "associated with," as well as derivatives thereof, means to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, have a relationship to or with, or the like.

Moreover, various functions described below can be implemented or supported by one or more computer programs, each of which is formed from computer readable program code and embodied in a computer readable medium. The terms "application" and "program" refer to one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, related data, or a portion thereof adapted for implementation in a suitable computer readable program code.
The phrase "computer readable program code" includes any type of computer code, including source code, object code, and executable code. The phrase "computer readable medium" includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory. A "non-transitory" computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals. A non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.

As used here, terms and phrases such as "have," "may have," "include," or "may include" a feature (like a number, function, operation, or component such as a part) indicate the existence of the feature and do not exclude the existence of oth