CN-121986286-A - Method for managing scene rendering based on confidence of pose prediction
Abstract
Some embodiments of a method may include obtaining a first predicted frame display time, obtaining first predicted pose information representing a prediction of a user pose at the first predicted frame display time, determining first pose confidence information indicating a confidence level of the first predicted pose information, selecting a frame for display based at least in part on the first pose confidence information, and causing the selected frame to be displayed. When the pose confidence information indicates a sufficiently high confidence, a newly rendered frame is obtained for display. When the pose confidence information indicates low confidence, a previously rendered frame may be used for display after re-projection.
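The selection logic summarized in the abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation; the threshold value, the `PoseConfidence` record, and all function names are hypothetical.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff, chosen only for illustration

@dataclass
class PoseConfidence:
    level: float  # scalar confidence in [0.0, 1.0] (hypothetical encoding)

def select_frame(confidence, render_new_frame, previous_frame, reproject):
    """Sufficiently high confidence: obtain a newly rendered frame.
    Low confidence: reuse the previously rendered frame after re-projection."""
    if confidence.level >= CONFIDENCE_THRESHOLD:
        return render_new_frame()
    return reproject(previous_frame)
```

In a real AR runtime the renderer and the re-projection (time-warp) step would operate on frame buffers; strings stand in for frames here only to keep the sketch self-contained.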
Inventors
- P. Joyte
- Patrice Hills Forest
- L. Fontaine
- S. Lailifer
- E. Feveld Daxie
- S. ONO
Assignees
- InterDigital CE Patent Holdings
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2024-08-07
- Priority Date
- 2023-08-09
Claims (20)
- 1. A method, comprising: obtaining a first predicted frame display time; obtaining first predicted pose information, wherein the first predicted pose information represents a prediction of a user pose at the first predicted frame display time; obtaining first pose confidence information, wherein the first pose confidence information indicates a confidence level of the first predicted pose information; selecting a frame for display based at least in part on the first pose confidence information; and causing the selected frame to be displayed.
- 2. The method of claim 1, further comprising: obtaining a second predicted frame display time; and performing a re-projection of the selected frame based on the second predicted frame display time before causing the selected frame to be displayed.
- 3. The method of any one of claims 1 to 2, wherein: in response to determining that the first pose confidence information indicates a confidence level at least as great as a threshold, a newly rendered frame is selected as the selected frame for display.
- 4. The method of claim 3, further comprising rendering the selected frame based on a first predicted pose.
- 5. The method of any one of claims 1 to 2, wherein: in response to determining that the first pose confidence information indicates a confidence level below a threshold, a previously rendered frame is selected as the selected frame for display.
- 6. The method of any of claims 1-5, wherein the first pose confidence information comprises a plurality of flags.
- 7. The method of any of claims 1 to 6, wherein the first pose confidence information includes at least a position validity flag and an orientation validity flag.
- 8. The method of any one of claims 1 to 7, wherein the first pose confidence information includes at least a position validity flag and an orientation validity flag, and wherein in response to determining that both the position validity flag and the orientation validity flag are set, a newly rendered frame is selected as the selected frame for display.
- 9. The method of any one of claims 1 to 8, wherein the first pose confidence information includes at least a position validity flag and an orientation validity flag, and wherein in response to determining that at least one of the position validity flag and the orientation validity flag is not set, a previously rendered frame is selected as the selected frame for display.
- 10. The method of any of claims 1 to 9, wherein the first pose confidence information includes at least a position validity flag, an orientation validity flag, a position tracking flag, and an orientation tracking flag.
- 11. The method of any one of claims 1 to 10, further comprising: obtaining second predicted pose information, the second predicted pose information representing a prediction of a user pose at a display time of the selected frame; and obtaining second pose confidence information, the second pose confidence information indicating a confidence level of the second predicted pose information.
- 12. The method of claim 11, further comprising calculating a pose error based on a difference between the first predicted pose information and the second predicted pose information.
- 13. The method of claim 12, further comprising determining whether to calculate the pose error based on the first pose confidence information and the second pose confidence information, the pose error being calculated in response to determining to calculate the pose error.
- 14. The method of claim 12, further comprising determining whether the pose error is valid based on the first pose confidence information and the second pose confidence information.
- 15. The method of any of claims 1-14, wherein obtaining the first pose confidence information comprises determining the first pose confidence information.
- 16. An apparatus, the apparatus comprising: a processor; and a non-transitory computer-readable medium storing instructions which, when executed by the processor, are operable to cause the apparatus to perform the method of any one of claims 1 to 15.
- 17. A method, comprising: obtaining a first predicted frame display time; obtaining first predicted pose information, wherein the first predicted pose information represents a prediction of a user pose at the first predicted frame display time; determining first pose confidence information, the first pose confidence information indicating a confidence level of the first predicted pose information; transmitting, to an edge application server, the determined first pose confidence information indicating the confidence level of the first predicted pose information; selecting a frame for display based at least in part on the first pose confidence information; and causing the selected frame to be displayed.
- 18. The method of claim 17, further comprising: obtaining a second predicted frame display time; and performing a re-projection of the selected frame based on the second predicted frame display time before causing the selected frame to be displayed.
- 19. The method of any of claims 17 to 18, wherein the first pose confidence information comprises a plurality of flags.
- 20. The method of any of claims 17 to 19, wherein the first pose confidence information includes at least a position validity flag and an orientation validity flag.
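Claims 7 through 10 describe confidence information carried as validity and tracking flags. A bitmask sketch of the selection rule in claims 8 and 9 might look like the following; the flag names and bit layout are assumptions (loosely modeled on OpenXR-style pose state bits), not the encoding defined by the patent.

```python
from enum import IntFlag

class PoseFlags(IntFlag):
    # Assumed bit layout for illustration only.
    POSITION_VALID = 0x1
    ORIENTATION_VALID = 0x2
    POSITION_TRACKED = 0x4
    ORIENTATION_TRACKED = 0x8

def use_newly_rendered_frame(flags: PoseFlags) -> bool:
    """Per claims 8-9: select a newly rendered frame only when both the
    position validity flag and the orientation validity flag are set;
    otherwise fall back to a previously rendered frame."""
    required = PoseFlags.POSITION_VALID | PoseFlags.ORIENTATION_VALID
    return (flags & required) == required
```

The tracking flags of claim 10 could refine this further, for example by distinguishing a fully tracked pose from one that is merely valid by extrapolation.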
Description
Method for managing scene rendering based on confidence of pose prediction
Cross Reference to Related Applications
The present application claims the benefit of European patent application EP23306698, entitled "METHOD TO MANAGE SCENE RENDERING BASED ON THE CONFIDENCE OF THE POSE PREDICTION", filed in 2023, and European patent application EP23306353, entitled "METHOD TO MANAGE SCENE RENDERING BASED ON THE CONFIDENCE OF THE POSE PREDICTION", filed in 2023, which are hereby incorporated by reference in their entirety.
Background
Computer-generated virtual elements are displayed to a user in an Augmented Reality (AR), Extended Reality (XR), Virtual Reality (VR), and/or Mixed Reality (MR) experience, for example in the user's real environment or in a virtual environment, using various devices such as VR headsets, optical see-through glasses, or video see-through devices such as smartphones, tablets, and headsets. Pose prediction is used to compensate for the round-trip time required to render a virtual scene. For each view, pose prediction allows the final rendering to be adjusted by taking into account the movement of the AR device during the rendering computation. For each view, pose information may be predicted at the beginning of a rendering cycle based on previous (past) pose values at a given time. For example, embedded device sensors (such as depth or color cameras) and Inertial Measurement Units (IMUs) may be used to collect pose information for user pose estimation.
Disclosure of Invention
Embodiments described herein include methods used in video encoding and decoding (collectively, "coding").
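The pose prediction described in the Background, extrapolating a future pose from past pose samples to cover the rendering round trip, can be sketched as a constant-velocity extrapolation. Real AR runtimes use sensor fusion over IMU and camera data; this simplified stand-in, with hypothetical function and parameter names, only illustrates the idea.

```python
def predict_pose_position(t_prev, p_prev, t_curr, p_curr, t_display):
    """Extrapolate the head position to the predicted frame display time,
    assuming constant velocity between the last two pose samples.
    Positions are 3-element [x, y, z] lists; times are in seconds."""
    dt = t_curr - t_prev
    scale = (t_display - t_curr) / dt  # how far past the latest sample we predict
    return [c + (c - p) * scale for p, c in zip(p_prev, p_curr)]
```

The further `t_display` lies beyond the latest sample, the less reliable the extrapolation, which is precisely what the pose confidence information described in this document is meant to capture.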
A first example method according to some embodiments may include obtaining a first predicted frame display time, obtaining first predicted pose information representing a prediction of a user pose at the first predicted frame display time, obtaining first pose confidence information indicating a confidence level of the first predicted pose information, selecting a frame for display based at least in part on the first pose confidence information, and causing the selected frame to be displayed. Some embodiments of the first example method may further comprise obtaining a second predicted frame display time and performing a re-projection of the selected frame based on the second predicted frame display time prior to causing the selected frame to be displayed. For some embodiments of the first example method, in response to determining that the first pose confidence information indicates a confidence level at least as great as a threshold, a newly rendered frame is selected as the selected frame for display. Some embodiments of the first example method may further include rendering the selected frame based on the first predicted pose. For some embodiments of the first example method, in response to determining that the first pose confidence information indicates a confidence level below a threshold, a previously rendered frame is selected as the selected frame for display. For some embodiments of the first example method, the first pose confidence information comprises a plurality of flags. For some embodiments of the first example method, the first pose confidence information includes at least a position validity flag and an orientation validity flag.
For some embodiments of the first example method, the first pose confidence information may include at least a position validity flag and an orientation validity flag, and in response to determining that both the position validity flag and the orientation validity flag are set, a newly rendered frame may be selected as the selected frame for display. For some embodiments of the first example method, the first pose confidence information may include at least a position validity flag and an orientation validity flag, and in response to determining that at least one of the position validity flag and the orientation validity flag is not set, a previously rendered frame may be selected as the selected frame for display. For some embodiments of the first example method, the first pose confidence information may include at least a position validity flag, an orientation validity flag, a position tracking flag, and an orientation tracking flag. Some embodiments of the first example method may further include obtaining second predicted pose information representing a prediction of a user pose at a display time of the selected frame and obtaining second pose confidence information indicating a confidence level of the second predicted pose information. Some embodiments of the first example method may further include calculating a pose error based on a difference between the first predicted pose information and the second predicted pose information. Some embodiments of the first example method may further include determining whether to calculate the pose error based on the first pose confidence information and the second pose confidence information.
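The pose error between the first and second predicted poses could be computed as a translation distance plus a rotation angle. The sketch below uses one common metric (Euclidean distance for position, quaternion angular difference for orientation); the document does not specify which metric its embodiments use, so treat this as an assumption.

```python
import math

def pose_error(pos1, quat1, pos2, quat2):
    """Return (translation error in meters, rotation error in radians)
    between two predicted poses. Positions are [x, y, z]; orientations
    are unit quaternions (w, x, y, z)."""
    trans_err = math.dist(pos1, pos2)
    # abs() makes the metric insensitive to the q / -q double cover;
    # min() guards acos against floating-point overshoot past 1.0.
    dot = abs(sum(a * b for a, b in zip(quat1, quat2)))
    rot_err = 2.0 * math.acos(min(1.0, dot))
    return trans_err, rot_err
```

Such an error, gated by the validity flags of both poses as in claims 13 and 14, could feed back into the confidence estimation for subsequent frames.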