US-12627784-B2 - Head-mounted electronic device with display recording capability

Abstract

A head-mounted device is provided that includes a variety of subsystems for generating extended reality content, displaying the extended reality content, and recording the extended reality content. The device can include a graphics rendering pipeline configured to render virtual content, tracking sensors configured to obtain user tracking information, a virtual content compositor configured to composite virtual frames based on the virtual content and the user tracking information, cameras configured to capture a video feed, a media merging compositor configured to overlay the composited virtual frames and the video feed to output merged video frames having a first frame rate for display, and a recording pipeline configured to record content having a second frame rate different than the first frame rate. The recording pipeline can record content exhibiting a higher quality than the content being displayed. A portion of the recorded content containing sensitive information can optionally be blurred.
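The abstract's split between a displayed stream at one frame rate and a recorded stream at a different frame rate can be pictured with a minimal sketch. The concrete 90 Hz and 30 Hz rates and all names below are illustrative assumptions, not from the patent; the disclosure only requires that the two rates differ (the multiple case appears in claim 4):

// Minimal sketch of the two-pipeline split: one camera feed, two independent
// merges. Rates and names are assumptions for illustration only.
struct MergedFrame { let index: Int }

let displayHz = 90                      // first frame rate (display path)
let recordingHz = 30                    // second frame rate (recording path)
let step = displayHz / recordingHz      // here the display rate is a multiple

var displayed: [MergedFrame] = []
var recorded: [MergedFrame] = []

for i in 0..<displayHz {                // one second of display frame times
    // Display path: composited virtual frames overlaid on the camera feed.
    displayed.append(MergedFrame(index: i))
    if i % step == 0 {
        // Recording path: an independent merge at the recording rate, which
        // may use higher-quality virtual content than what is displayed.
        recorded.append(MergedFrame(index: i))
    }
}
print(displayed.count, recorded.count)  // 90 displayed, 30 recorded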

Inventors

  • James C McIlree
  • Jacob Wilson
  • Jérôme Decoodt
  • Pierre d'Herbemont
  • Seyedpooya Mirhosseini
  • Xutao Jiang

Assignees

  • APPLE INC.

Dates

Publication Date
2026-05-12
Application Date
2023-10-17

Claims (20)

  1. A method, comprising: with one or more cameras, capturing a video feed; with a gaze sensor, obtaining gaze information; with a graphics rendering pipeline, rendering first and second virtual content in accordance with first and second depth information, respectively; merging the first virtual content with the video feed based on the gaze information to output a first set of merged video frames in accordance with a first parameter; displaying the first set of merged video frames; and with a recording pipeline, merging the second virtual content with the video feed to output a second set of merged video frames in accordance with a second parameter.
  2. The method of claim 1, wherein the first set of merged video frames is output in accordance with a display frame rate, wherein the first set of merged video frames is displayed at the display frame rate, and wherein the second set of merged video frames is output in accordance with a recording frame rate different than the display frame rate, the method further comprising: storing the second set of merged video frames at the recording frame rate and as recorded content.
  3. The method of claim 2, wherein the recording frame rate is less than the display frame rate.
  4. The method of claim 2, wherein the display frame rate is a multiple of the recording frame rate.
  5. The method of claim 1, wherein displaying the first set of merged video frames comprises displaying the first set of merged video frames as augmented reality (AR) or mixed reality (MR) content.
  6. The method of claim 1, further comprising: with one or more tracking sensors, obtaining user tracking information; and with a virtual content compositor, generating descriptors based on the user tracking information or the gaze information, wherein each of the descriptors lists a plurality of image correction functions or image correction parameters applied to the first virtual content prior to being merged with the video feed.
  7. The method of claim 6, wherein obtaining user tracking information comprises obtaining sensor data selected from the group consisting of: head pose information and hands gesture information.
  8. The method of claim 6, wherein the plurality of image correction functions or image correction parameters in at least some of the descriptors includes an image warping parameter, a point of view correction parameter, a foveation parameter, or a lens distortion compensation parameter.
  9. The method of claim 6, wherein the plurality of image correction functions or image correction parameters in at least some of the descriptors includes a brightness adjustment parameter, a color shift parameter, or a chromatic aberration correction parameter.
  10. The method of claim 6, further comprising: storing the descriptors in shared memory that is accessible to the virtual content compositor and the recording pipeline; and with the recording pipeline, retrieving the descriptors from the shared memory and compositing the second virtual content based on the retrieved descriptors.
  11. The method of claim 1, further comprising: storing the second set of merged video frames as recorded content; and reducing a field of view of the second set of merged video frames prior to storing the second set of merged video frames as the recorded content.
  12. The method of claim 1, further comprising: with one or more tracking sensors, obtaining user tracking information; and with a virtual content compositor, compositing virtual frames based on the first virtual content and the user tracking information or the gaze information, wherein merging the first virtual content with the video feed comprises merging the composited virtual frames with the video feed to output the first set of merged video frames, wherein the first set of merged video frames has a first quality, and wherein the second set of merged video frames has a second quality higher than the first quality.
  13. The method of claim 12, further comprising: disabling a display of the electronic device to prevent the display from presenting the first set of merged video frames of the first quality.
  14. The method of claim 12, wherein rendering the first virtual content comprises: rendering first unfoveated content intended for a first eye and having a first resolution; rendering second unfoveated content intended for a second eye and having a second resolution less than the first resolution; and conveying the first unfoveated content to the recording pipeline without conveying the second unfoveated content to the recording pipeline.
  15. The method of claim 14, wherein: the first set of merged video frames is displayed at a first frame rate; and the second set of merged video frames has a second frame rate that is different than the first frame rate.
  16. The method of claim 1, further comprising: generating a first composited frame based on the first virtual content and user tracking information or the gaze information, wherein merging the first virtual content with the video feed comprises merging the first composited frame with the video feed to output a first frame of the first set of merged video frames; creating a compositor descriptor listing a plurality of image correction functions or image correction parameters applied when generating the first composited frame; and generating a second composited frame based on the second virtual content and the compositor descriptor, wherein merging the second virtual content with the video feed comprises merging the second composited frame with the video feed to output a second frame of the second set of merged video frames, and wherein a first portion of the second frame is different than a corresponding first portion of the first frame while a second portion of the second frame is identical to a corresponding second portion of the first frame.
  17. The method of claim 16, wherein the first portion of the first frame being displayed shows sensitive information, and wherein the first portion of the second frame being recorded is blurred to obfuscate the sensitive information.
  18. A method, comprising: with a camera, capturing a video feed; merging first virtual content with the video feed to output a first set of merged video frames having a first resolution; displaying the first set of merged video frames; and with a recording pipeline, merging second virtual content with the video feed to output a second set of merged video frames having a second resolution higher than the first resolution and recording the second set of merged video frames.
  19. A method, comprising: with one or more cameras, capturing a video feed; with a graphics rendering pipeline, rendering first and second virtual content; with one or more tracking sensors, obtaining user tracking information; with a first compositor, generating a descriptor based on the user tracking information and compositing first virtual frames based on the first virtual content and the descriptor; storing the descriptor in memory that is accessible to the first compositor and a second compositor; and with the second compositor, retrieving the descriptor from the memory and compositing second virtual frames based on the second virtual content and the retrieved descriptor.
  20. The method of claim 19, further comprising: merging the first virtual frames with the video feed to output a first set of merged video frames; displaying the first set of merged video frames; merging the second virtual frames with the video feed to output a second set of merged video frames; and storing the second set of merged video frames.
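Claims 6, 10, and 19 describe descriptors that a display-side compositor generates and that the recording pipeline later retrieves from shared memory so it can apply the same corrections. The sketch below illustrates that hand-off; every type and function name is invented for illustration, and a lock-guarded dictionary stands in for whatever shared memory an actual implementation would use.

import Foundation

// A descriptor records which corrections the display-side compositor applied
// to a virtual frame (cf. claims 8 and 9: warping, point-of-view correction,
// foveation, lens distortion, brightness, color shift, chromatic aberration).
struct CompositorDescriptor {
    let frameIndex: Int
    let corrections: [String: Double]   // e.g. ["warp": 0.8, "povCorrection": 1.0]
}

// Shared-memory stand-in (claims 10 and 19): a store accessible to both the
// display-side virtual content compositor and the recording pipeline.
final class DescriptorStore {
    private var descriptors: [Int: CompositorDescriptor] = [:]
    private let lock = NSLock()

    func publish(_ d: CompositorDescriptor) {
        lock.lock(); defer { lock.unlock() }
        descriptors[d.frameIndex] = d
    }

    func retrieve(frameIndex: Int) -> CompositorDescriptor? {
        lock.lock(); defer { lock.unlock() }
        return descriptors[frameIndex]
    }
}

// Display path: composite a virtual frame and publish its descriptor.
func displayComposite(frameIndex: Int, store: DescriptorStore) {
    let d = CompositorDescriptor(frameIndex: frameIndex,
                                 corrections: ["warp": 0.8, "povCorrection": 1.0])
    store.publish(d)
    // ...composite and display the frame using d.corrections...
}

// Recording path: replay the same corrections on the second virtual content.
func recordComposite(frameIndex: Int, store: DescriptorStore) {
    guard let d = store.retrieve(frameIndex: frameIndex) else { return }
    // ...composite the recorded frame using d.corrections...
    print("recording frame \(frameIndex) with corrections \(d.corrections)")
}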

Description

This application claims the benefit of U.S. Provisional Patent Application No. 63/433,340, filed Dec. 16, 2022, which is hereby incorporated by reference herein in its entirety.

FIELD

This relates generally to electronic devices and, more particularly, to electronic devices such as head-mounted devices.

BACKGROUND

Electronic devices such as head-mounted devices can include hardware and software subsystems for performing gaze tracking, hands tracking, and head pose tracking on a user. Such electronic devices can also include a graphics rendering module for generating virtual content that is presented on a display of the electronic device. The electronic device can also include a compositor that adjusts the virtual content based on the user tracking information prior to displaying the virtual content. The adjusted virtual content can then be output on the display to the user.

It can be challenging to record the content that is displayed to the user. The displayed content may be output at a first frame rate, while the electronic device may record content at a second frame rate that is different from the first frame rate. In such scenarios, the displayed content cannot simply be copied as the recorded content.

SUMMARY

An electronic device such as a head-mounted device with recording capabilities is provided.

An aspect of this disclosure provides a method of operating an electronic device that includes capturing a video feed using one or more cameras, merging first virtual content with the video feed to output a first set of merged video frames in accordance with a first parameter, displaying the first set of merged video frames, and merging, using a recording pipeline, second virtual content with the video feed to output a second set of merged video frames in accordance with a second parameter. The first set of merged video frames can be output in accordance with a first frame rate and/or a first set of image correction/adjustment parameters optionally encoded in the form of one or more compositor descriptors, whereas the second set of merged video frames can be output in accordance with a second frame rate different from the first frame rate and/or a second set of image correction/adjustment parameters optionally encoded in the form of one or more compositor descriptors. The first virtual content can be rendered using a graphics rendering pipeline. A virtual content compositor can generate the first and second sets of image correction/adjustment parameters based on user tracking information. The second virtual content can be generated based on the first virtual content and the second set of image correction/adjustment parameters optionally retrieved from the virtual content compositor.

An aspect of the disclosure provides a method of operating an electronic device that includes rendering virtual content using a graphics rendering pipeline, obtaining user tracking information using one or more tracking sensors, compositing virtual frames based on the virtual content and the user tracking information using a virtual content compositor, capturing a video feed using one or more cameras, merging the composited virtual frames and the captured video feed to output merged video frames having a first quality, and recording, using a recording pipeline, content having a second quality higher than the first quality. The merged video frames can be displayed. The display can optionally be disabled. The recorded content can be generated based on unfoveated content associated with only one eye.
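The one-eye point above (and claim 14) can be illustrated with a short sketch: the renderer produces an unfoveated buffer per eye at different resolutions, and only the higher-resolution buffer is conveyed to the recording pipeline. The type names and the 2048/1024 resolutions below are assumptions for illustration only.

// Sketch of the one-eye recording path (claim 14); all names and sizes are
// illustrative assumptions.
struct EyeBuffer {
    let eye: String              // "left" or "right"
    let width: Int, height: Int
    let foveated: Bool
}

// The renderer produces unfoveated content for each eye; one eye gets the
// higher resolution (claim 14's "first resolution").
let leftUnfoveated  = EyeBuffer(eye: "left",  width: 2048, height: 2048, foveated: false)
let rightUnfoveated = EyeBuffer(eye: "right", width: 1024, height: 1024, foveated: false)

// Only the higher-resolution eye's unfoveated buffer feeds the recording
// pipeline; the other eye's buffer is never conveyed to it.
func selectForRecording(_ buffers: [EyeBuffer]) -> EyeBuffer? {
    buffers.filter { !$0.foveated }
           .max { $0.width * $0.height < $1.width * $1.height }
}

let recordingSource = selectForRecording([leftUnfoveated, rightUnfoveated])
// recordingSource is the 2048x2048 left-eye buffer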
An aspect of the disclosure provides a method of operating an electronic device that includes generating a first composited frame based on virtual content and user tracking information, displaying a first frame that is formed by merging the first composited frame with a passthrough video frame, creating a compositor descriptor listing a plurality of image correction functions or parameters applied when generating the first composited frame, generating a second composited frame based on the virtual content and the compositor descriptor, and recording a second frame that is formed by merging the second composited frame with the passthrough video frame. A first portion of the second frame can be different from a corresponding first portion of the first frame, while a second portion of the second frame can be identical to a corresponding second portion of the first frame. Any sensitive information in the first portion of the first frame being displayed can be shown to the user, whereas that sensitive information in the first portion of the second frame being recorded can be blurred.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a top view of an illustrative head-mounted device in accordance with some embodiments.

FIG. 2 is a schematic diagram of an illustrative head-mounted device in accordance with some embodiments.

FIG. 3 is a diagram showing illustrative display and recording pipelines within a head-mounted device in accordance with some embodiments.
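The sensitive-information aspect of the summary above (and claim 17) amounts to recording a frame that matches the displayed frame everywhere except in one obfuscated region. A minimal sketch, assuming a grayscale pixel buffer and a simple box blur; the disclosure does not specify a blur method, and all names here are invented:

// Blur one rectangular region of the recorded frame; the rest of the frame
// stays identical to the displayed frame (cf. claim 16's "second portion").
struct Frame {
    var pixels: [[Double]]   // grayscale, pixels[row][col]
}

func blurRegion(of frame: Frame, rows: Range<Int>, cols: Range<Int>, radius: Int = 1) -> Frame {
    var out = frame
    for r in rows {
        for c in cols {
            var sum = 0.0, n = 0.0
            for dr in -radius...radius {
                for dc in -radius...radius {
                    let rr = r + dr, cc = c + dc
                    guard frame.pixels.indices.contains(rr),
                          frame.pixels[rr].indices.contains(cc) else { continue }
                    sum += frame.pixels[rr][cc]
                    n += 1
                }
            }
            out.pixels[r][c] = sum / n   // box-blur average over the window
        }
    }
    return out
}

// An 8x8 frame whose "sensitive" region (rows/cols 2..<5) holds value 1.0.
var clear = Array(repeating: Array(repeating: 0.0, count: 8), count: 8)
for r in 2..<5 { for c in 2..<5 { clear[r][c] = 1.0 } }

let displayedFrame = Frame(pixels: clear)                          // shown in the clear
let recordedFrame = blurRegion(of: displayedFrame, rows: 2..<5, cols: 2..<5)
// recordedFrame matches displayedFrame outside the region; inside it, the
// values are smeared, obfuscating the sensitive content in the recording.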