US-12619299-B2 - Systems and methods for interactive viewing of three-dimensional content using anatomical tracking
Abstract
Interactive viewing of three-dimensional content includes producing anatomical tracking data representative of the movement of an anatomical feature of a user, rendering three-dimensional target content, and manipulating the target content in response to the anatomical tracking data such that the user perceives himself or herself to be changing location with respect to a scene or object, or perceives being tracked by a virtual character. A system for interactive viewing includes a point cloud capture device by which at least one anatomical feature of the user can be tracked, a display by which the target content is displayed to the user, and a computer operatively connected to the point cloud capture device and the display. The computer includes non-transitory computer-readable media (CRM) storing instructions by which the target content is manipulated based on input of the anatomical tracking data.
Inventors
- Barry Spencer
- Julian George Spencer
- Jeremy Egenberger
Assignees
- d3Labs Inc.
Dates
- Publication Date: 2026-05-05
- Application Date: 2024-08-29
Claims (20)
- 1 . A method of creating an immersive three-dimensional viewing experience, the method comprising: capturing images of at least one anatomical feature of a user while the user is situated in a zone in front of a screen of a display device, and producing image data comprising a digital representation of at least part of the user from the captured images; generating anatomical tracking data, including by tracking one or more relative positions of the at least one anatomical feature of the user using the image data; rendering target content comprising a virtual scene, a virtual object, or a virtual character on the screen of the display device in such a way that the target content is displayed in three dimensions to the user; and executing an interactive viewing mode in which a movement of the user effects a manipulation of the target content, the interactive viewing mode being selected from the group consisting of: rotating in its entirety the virtual scene, the virtual object, or the virtual character constituting the target content, based on the anatomical tracking data, as the one or more relative positions of the at least one anatomical feature of the user change; translating in its entirety the virtual scene, the virtual object, or the virtual character constituting the target content, based on the anatomical tracking data, as the one or more relative positions of the at least one anatomical feature of the user change; and animating the virtual scene, the virtual object, or the virtual character constituting the target content, based on the anatomical tracking data, as the one or more relative positions of the at least one anatomical feature of the user change.
- 2 . The method as claimed in claim 1 , wherein the capturing of image data comprises acquiring point cloud data of the at least one anatomical feature of the user.
- 3 . The method as claimed in claim 2 , wherein the generating of the anatomical tracking data includes detecting from the point cloud data changes in position of the at least one anatomical feature, and inferring changes in the position of the at least one anatomical feature of the user relative to the screen of the display device from said changes in the position of the at least one anatomical feature.
- 4 . The method as claimed in claim 3 , wherein the generating of the anatomical tracking data includes detecting from the point cloud data changes in the position of the user's head and inferring changes in the position of the user's eyes relative to the screen from said changes in position of the user's head.
- 5 . The method as claimed in claim 3 , wherein the rendering of the target content comprises calculating vectors for pixels of the screen, each of the vectors originating from a position of the user's eyes and pointing toward a respective pixel based on a coordinate system of the screen.
- 6 . The method as claimed in claim 1 , wherein the rendering of the target content comprises calculating vectors for pixels of the screen, each of the vectors originating from a position of the user's eyes and pointing toward a respective pixel based on a coordinate system of the screen.
- 7 . The method as claimed in claim 1 , further comprising detecting a blob in an image of the user represented by the image data as the user enters the zone in front of the screen of the display device, and wherein the anatomical tracking data is generated only once the blob in the image data has been detected.
- 8 . The method as claimed in claim 1 , wherein the execution of the interactive viewing mode comprises dynamically determining yaw and pitch of a vector from a position of eyes of the user to a center of the screen of the display device and rotating in its entirety the virtual scene, the virtual object, or the virtual character constituting the target content based on changes in the yaw and the pitch of the vector.
- 9 . The method as claimed in claim 1 , wherein the execution of the interactive viewing mode comprises determining translation of a position of eyes of the user in directions toward or away from the screen of the display device, and translating in its entirety the virtual scene, the virtual object, or the virtual character constituting the target content toward or away from the user based on said translation of the position of the eyes of the user.
- 10 . A method of creating an immersive three-dimensional viewing experience, the method comprising: acquiring image data of a user while the user is situated in front of a screen of a display device, the acquiring of the image data comprising producing point cloud data of at least one anatomical feature of the user; generating anatomical tracking data using the image data by inferring from the point cloud data a position of the at least one anatomical feature of the user relative to the screen of the display device, the anatomical tracking data changing dynamically as the user moves the at least one anatomical feature; rendering a three-dimensional (3D) model of target content and displaying the model to the user via the screen of the display device; and executing an interactive viewing mode in which the 3D model of the target content is translated or rotated or is animated based on changes in the anatomical tracking data, whereby movement of the at least one anatomical feature by the user manipulates the target content.
- 11 . The method as claimed in claim 10 , further comprising detecting a blob in an image of the user represented by the image data as the user enters a zone in front of the screen of the display device, and wherein the anatomical tracking data is generated only once the blob in the image data has been detected.
- 12 . The method as claimed in claim 10 , wherein the 3D model is rendered with respect to a Cartesian coordinate system having an origin at a central position of the screen of the display device, X and Y axes are mapped along the screen of the display device and a Z axis extends through the screen at the origin, the rendering of the 3D model of the target content comprises rendering a virtual scene of objects, and the executing of the interactive viewing mode comprises rotating the virtual scene in its entirety in the X-Z plane of the coordinate system and changing a field of view of the virtual scene by the user, based on changes in the anatomical tracking data indicating that the at least one anatomical feature of the user is moving in the X-Z plane.
- 13 . The method as claimed in claim 10 , wherein the 3D model is rendered with respect to a Cartesian coordinate system having an origin at a central position of the screen of the display device, X and Y axes are mapped along the screen of the display device and a Z axis extends through the screen at the origin, the rendering of the 3D model of the target content comprises rendering a virtual scene of objects, and the executing of the interactive viewing mode comprises translating the virtual scene in its entirety along the Z axis of the coordinate system and changing a field of view of the virtual scene by the user, based on changes in the anatomical tracking data indicating that the at least one anatomical feature of the user is moving in a direction along or parallel to the Z axis.
- 14 . The method as claimed in claim 10 , wherein the 3D model is rendered with respect to a Cartesian coordinate system having an origin at a central position of the screen of the display device, X and Y axes are mapped along the screen of the display device and a Z axis extends through the screen at the origin, the rendering of the 3D model of the target content comprises rendering a virtual scene of objects, and the executing of the interactive viewing mode comprises translating one of the virtual objects within the virtual scene in a direction along or parallel to the Z axis of the coordinate system while maintaining a field of view of the virtual scene by the user, based on changes in the anatomical tracking data indicating that the at least one anatomical feature of the user is moving in a direction along or parallel to the Z axis.
- 15 . The method as claimed in claim 10 , wherein the 3D model is rendered with respect to a Cartesian coordinate system having an origin at a central position of the screen of the display device, X and Y axes are mapped along the screen of the display device and a Z axis extends through the screen at the origin, the rendering of the 3D model of the target content comprises rendering a virtual scene of objects, and the executing of the interactive viewing mode comprises rotating one of the virtual objects within the virtual scene about the Y axis of the coordinate system while maintaining a field of view of the virtual scene by the user, based on changes in the anatomical tracking data indicating that the at least one anatomical feature of the user is moving in a direction in the X-Z plane of the coordinate system.
- 16 . The method as claimed in claim 10 , wherein the 3D model is rendered with respect to a Cartesian coordinate system having an origin at a central position of the screen of the display device, X and Y axes are mapped along the screen of the display device and a Z axis extends through the screen at the origin, the rendering of the 3D model of the target content comprises rendering a virtual scene including a virtual character, and the executing of the interactive viewing mode comprises animating the virtual character while maintaining a field of view of the virtual scene by the user, based on changes in the anatomical tracking data indicating that the at least one anatomical feature of the user is moving, wherein the animating of the virtual character comprises rotating or translating select features of the virtual character within the virtual scene.
- 17 . The method as claimed in claim 10 , the generating of the anatomical tracking data further comprising detecting from the point cloud data changes in the position of the user's head and inferring changes in the position of the user's eyes relative to the screen from said changes in the position of the user's head.
- 18 . A machine having a processing unit, and non-transitory computer-readable media (CRM) storing operating instructions and digital target content representing a virtual scene, a virtual object, or a virtual character, the processing unit being configured to execute the operating instructions to: render a three-dimensional (3D) model of the digital target content and display the 3D model to a user via a screen of a display device; generate anatomical tracking data using point cloud capture data, the anatomical tracking data representative of a position of an anatomical feature of the user such that the anatomical tracking data changes dynamically as the user moves the anatomical feature; and execute an interactive viewing mode in which a movement of the anatomical feature of the user manipulates the digital target content, the interactive viewing mode being selected from a group consisting of: rotating in its entirety the virtual scene, the virtual object, or the virtual character constituting the digital target content, based on the anatomical tracking data, as the position of the anatomical feature of the user changes, translating in its entirety the virtual scene, the virtual object, or the virtual character constituting the digital target content, based on the anatomical tracking data, as the position of the anatomical feature of the user changes, and animating the virtual scene, the virtual object, or the virtual character constituting the digital target content, based on the anatomical tracking data, as the position of the anatomical feature of the user changes.
- 19 . The machine as claimed in claim 18 , wherein the processing unit is configured to execute the operating instructions to: selectively execute interactive viewing modes in which a movement of the anatomical feature of the user manipulates the digital target content based on the anatomical tracking data, the interactive viewing modes selected from the group including: a virtual scene rotation mode of rotating in its entirety the virtual scene including a plurality of distinct objects constituting the digital target content, a virtual scene translation mode of translating in its entirety the virtual scene including a plurality of distinct objects constituting the digital target content, a virtual object rotation mode of rotating in its entirety the virtual object constituting the digital target content, a virtual object translation mode of translating in its entirety the virtual object constituting the digital target content, and a virtual character animation mode of animating features of the virtual character to move along with the movement of the anatomical feature of the user.
- 20 . An interactive viewing system for creating an immersive three-dimensional viewing experience, the system comprising the machine as claimed in claim 18 , and further comprising: at least one point cloud capture device that captures images of a subject and generates point cloud data of a three-dimensional representation of the subject; and a display device having a screen; wherein the machine is in operative communication with the at least one point cloud capture device so as to receive image data from the at least one point cloud capture device and generate the anatomical tracking data, and is operatively connected to the display device so as to display the three-dimensional (3D) model of the digital target content on the screen of the display device.
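The sketches below illustrate how the tracking and manipulation steps recited in the claims might be realized in code. They are minimal, non-authoritative examples, not the patent's implementation; all function names, coordinate conventions, and constants are assumptions introduced for illustration.

The first sketch corresponds to the eye-position inference of claims 3, 4, and 17: locate the head in the point cloud, then offset downward from the head top by a typical head-to-eye distance. A screen-centered coordinate system is assumed (origin at the screen center, X/Y in the screen plane, Y up, Z toward the user), and the offset constant is an illustrative tuning value.

```python
import numpy as np

# Assumed vertical offset from the top of the head down to eye level, in
# millimeters; an illustrative tuning constant, not a value from the patent.
HEAD_TOP_TO_EYES_MM = 110.0

def infer_eye_position(head_points: np.ndarray) -> np.ndarray:
    """Estimate the midpoint of the user's eyes from a point cloud that has
    been segmented to the head, in screen-centered coordinates.

    head_points: (N, 3) array of [x, y, z] samples in millimeters.
    """
    centroid = head_points.mean(axis=0)   # rough head center in X and Z
    top_y = head_points[:, 1].max()       # highest sample on the head
    eye_y = top_y - HEAD_TOP_TO_EYES_MM   # drop from head top to eye level
    return np.array([centroid[0], eye_y, centroid[2]])
```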
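Claims 5 and 6 recite calculating, for each screen pixel, a vector originating at the user's eyes and pointing toward that pixel in the screen's coordinate system. A minimal sketch, assuming the screen's physical dimensions are known in millimeters and pixels lie in the Z = 0 plane (the parameter names are illustrative):

```python
import numpy as np

def pixel_view_vectors(eye_pos, screen_w_mm, screen_h_mm, res_x, res_y):
    """For every pixel, compute a unit vector originating at the user's
    eye position and pointing toward that pixel's physical location on
    the screen plane (Z = 0 in screen coordinates)."""
    # Physical X of each pixel column and Y of each pixel row, in
    # millimeters from the screen center; Y increases upward.
    xs = (np.arange(res_x) + 0.5) / res_x * screen_w_mm - screen_w_mm / 2
    ys = screen_h_mm / 2 - (np.arange(res_y) + 0.5) / res_y * screen_h_mm
    px, py = np.meshgrid(xs, ys)                       # (res_y, res_x)
    pixels = np.stack([px, py, np.zeros_like(px)], axis=-1)
    vectors = pixels - np.asarray(eye_pos)             # eye -> pixel
    return vectors / np.linalg.norm(vectors, axis=-1, keepdims=True)
```

For example, for a 1920 x 1080 screen measuring 600 mm by 340 mm with the eyes 700 mm in front of its center, `pixel_view_vectors(np.array([0.0, 0.0, 700.0]), 600, 340, 1920, 1080)` yields a (1080, 1920, 3) array of unit view vectors, one per pixel.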
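Claims 7 and 11 gate the generation of tracking data on detecting a blob in the image data as the user enters the zone. One plausible realization, sketched here over a depth image using SciPy connected-component labeling; the zone limits and size threshold are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def user_blob_detected(depth_mm: np.ndarray,
                       near_mm: float = 500.0,
                       far_mm: float = 2500.0,
                       min_pixels: int = 5000) -> bool:
    """Return True once a sufficiently large connected region (blob) of
    depth samples appears inside the interaction zone in front of the
    screen; per claims 7 and 11, anatomical tracking data would be
    generated only after this returns True.

    depth_mm: (H, W) depth image in millimeters from the capture device.
    """
    in_zone = (depth_mm > near_mm) & (depth_mm < far_mm)
    labels, n = ndimage.label(in_zone)        # connected components
    if n == 0:
        return False
    sizes = ndimage.sum(in_zone, labels, np.arange(1, n + 1))
    return bool(sizes.max() >= min_pixels)
```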
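Claim 8 determines the yaw and pitch of a vector between the user's eyes and the screen center and rotates the target content in its entirety as those angles change. A sketch under the same screen-centered coordinate assumptions as above; the angles are computed for the eye position as seen from the screen origin, which is the eyes-to-center vector of the claim up to a sign convention:

```python
import numpy as np

def yaw_pitch_of_eyes(eye_pos):
    """Yaw and pitch (radians) of the eye position relative to the screen
    center; both are zero when the user is directly in front of the
    screen on the +Z axis."""
    x, y, z = float(eye_pos[0]), float(eye_pos[1]), float(eye_pos[2])
    yaw = np.arctan2(x, z)                   # rotation about the Y (up) axis
    pitch = np.arctan2(y, np.hypot(x, z))    # elevation above the X-Z plane
    return yaw, pitch

def rotate_scene(scene_points, yaw, pitch):
    """Rotate all scene vertices in their entirety by the given yaw (about
    Y) and pitch (about X), so the content turns to follow the user."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    r_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    r_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return scene_points @ (r_y @ r_x).T      # scene_points: (N, 3) vertices
```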
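Claims 9 and 13 translate the content toward or away from the user as the eyes move along the Z axis. A sketch with an assumed rest distance and gain, both illustrative tuning parameters rather than values from the patent:

```python
import numpy as np

def translate_scene_with_eye_depth(scene_points, eye_z_mm,
                                   ref_z_mm=1000.0, gain=1.0):
    """Translate the entire scene along the Z axis as the user's eyes move
    toward or away from the screen; ref_z_mm is the assumed rest distance
    of the eyes and gain scales how strongly the scene follows the user."""
    dz = gain * (eye_z_mm - ref_z_mm)        # signed change in eye depth
    return scene_points + np.array([0.0, 0.0, dz])
```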
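Claims 12 and 15 both rotate content about a vertical axis as the user moves in the X-Z plane: the whole scene about the screen origin in claim 12, or a single object about its own center, leaving the rest of the scene fixed, in claim 15. Both cases reduce to one rotation helper with a different pivot, sketched as follows:

```python
import numpy as np

def xz_angle(eye_pos):
    """Angle of the eye position about the Y axis, measured in the X-Z
    plane; it changes as the user moves laterally in front of the screen."""
    return float(np.arctan2(eye_pos[0], eye_pos[2]))

def rotate_about_y(points, angle, pivot=None):
    """Rotate a set of vertices about a vertical (Y) axis through `pivot`.
    With all scene vertices and the default pivot at the screen origin,
    this gives the whole-scene rotation of claim 12; with one object's
    vertices and that object's center as pivot, it gives the per-object
    rotation of claim 15 (the user's field of view of the rest of the
    scene stays fixed)."""
    if pivot is None:
        pivot = np.zeros(3)
    c, s = np.cos(angle), np.sin(angle)
    r = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (points - pivot) @ r.T + pivot
```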
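Finally, claims 18 and 19 select among interactive viewing modes at runtime. The dispatcher below ties the earlier sketches together per frame; it is runnable when combined with the helper functions above, and the enum names, the dispatch structure, and the choice of which modes to flesh out are all assumptions:

```python
from enum import Enum, auto
import numpy as np

class ViewingMode(Enum):
    SCENE_ROTATION = auto()       # rotate the virtual scene in its entirety
    SCENE_TRANSLATION = auto()    # translate the virtual scene in its entirety
    OBJECT_ROTATION = auto()      # rotate a single virtual object in place
    OBJECT_TRANSLATION = auto()   # translate a single virtual object
    CHARACTER_ANIMATION = auto()  # animate features of a virtual character

def update_frame(mode, scene_points, eye_pos):
    """Apply one frame of manipulation to the target content according to
    the active interactive viewing mode, driven by the tracked eye position."""
    if mode is ViewingMode.SCENE_ROTATION:
        yaw, pitch = yaw_pitch_of_eyes(eye_pos)
        return rotate_scene(scene_points, yaw, pitch)
    if mode is ViewingMode.SCENE_TRANSLATION:
        return translate_scene_with_eye_depth(scene_points, float(eye_pos[2]))
    if mode is ViewingMode.OBJECT_ROTATION:
        return rotate_about_y(scene_points, xz_angle(eye_pos),
                              pivot=scene_points.mean(axis=0))
    # OBJECT_TRANSLATION and CHARACTER_ANIMATION would follow the same
    # pattern, driving per-object offsets or rig parameters from eye_pos.
    return scene_points
```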
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application claims the benefit of priority of U.S. provisional patent application No. 63/536,007, filed on Aug. 31, 2023. The present application is also related to U.S. patent application Ser. No. 18/478,795, filed on Sep. 29, 2023, entitled “DISPLAY OF THREE-DIMENSIONAL SCENES WITH CHANGING PERSPECTIVES”, which issued as U.S. Pat. No. 12,051,149 on Jul. 30, 2024, and which claims the benefit of priority to U.S. provisional patent application No. 63/412,798, filed on Oct. 3, 2022. These applications are hereby incorporated by reference in their entirety, including any appendices.

BACKGROUND
The present technology relates to virtual reality (VR) systems and methods that create an immersive three-dimensional viewing experience for users. In particular, the present technology relates to systems and methods that simulate scenes of a “virtual world” and allow a user to navigate that world.

Virtual reality is a simulated experience that immerses users in a virtual world. Applications of virtual reality include entertainment (especially video games), education and training, and business uses such as real estate tours and office meetings. Strictly speaking, virtual reality takes place in a completely virtual environment, but it may be considered as also encompassing related technologies such as augmented virtual reality and mixed reality, which are sometimes referred to as extended reality (XR). In augmented virtual reality, virtual objects are overlaid on a real-world environment; in mixed reality, a virtual environment is combined with the real world.

Conventionally, virtual reality is enabled through the use of VR headsets and/or related hand-held peripherals. Although effective in many cases, these devices have certain qualities or requirements that form barriers to their ubiquitous use and thereby limit access to virtual worlds. Examples of these barriers include high cost, disease transmission, discomfort, potential for injury, and venue limitations.

More specifically, VR headsets can be cost-prohibitive for a variety of potential applications because they typically include a combination of near-eye displays, integrated audio, tracking hardware, and an on-board computing device. VR headsets shared between users in homes, schools, museums, workplaces, and various public venues create opportunities for, or are at least perceived as posing a risk of, the spread of disease because of the headset's inherent proximity to each user's mouth, nose, and eyes. Moreover, some users are reluctant to use VR headsets because of the risk of eye strain, nausea, dizziness, myopia, radiation exposure, and/or disorientation. Further, the use of VR headsets that require 100% immersion can make users unaware of their physical surroundings; this creates risks to both health and nearby property, making some users reluctant to use VR headsets. Likewise, utility-powered VR headsets require wiring from an outlet or computer to the VR headset, and this wiring can inhibit movement, thereby also representing a risk of personal injury or damage to property. Finally, because VR headsets include integrated display components, public venues such as those employing kiosks and digital signage do not accommodate the use of VR headsets.

In addition, as mentioned above, VR devices include a combination of near-eye displays, integrated audio, tracking hardware, and an on-board computing device. Collectively, these components form a single integrated solution.
However, users who already possess multi-purpose components such as display devices, computing devices, audio devices, or devices that could be used for tracking purposes are unable to leverage these as components to customize or create VR headsets. For example, a user who has a gaming computer with a new state-of-the-art GPU is unable to use this component to replace the GPU in a VR headset. Still further, battery-powered VR headsets require periodic recharging, which represents another barrier to their wide-scale adoption.

VR solutions that require exclusive access to a user's eyes, ears, and/or hands create situations in which the related senses are wholly or partially unable to interact with the physical world. These solutions thus impede a user's ability to freely access information from devices such as computers, smartphones, telephones, doorbells, intercoms, televisions, alarms, alerts from public services, or other people in their vicinity. Access to important and timely notifications may therefore be blocked, and the risk of missing these notifications makes some users reluctant to use currently available VR solutions.

VR headsets and the like also limit a user's ability to multitask. Specifically, users of conventional VR technology are limited in their ability to use tools including, but not limited to, appliances, hardware-related tools, cooking implements, and writing implements. For example, it is not possible to safely cook while wearing a VR headset. Accordi