JP-7855639-B2 - Technology for displaying and capturing images
Inventors
- Alexander Menzies
- Tobias Rick
- Alexandre Da Veiga
- Bryce L. Schmitchen
- Vedant Saran
- Brian Klein
- Michael I. Weinstein
- Tsao-Wei Fan
Assignees
- Apple Inc.
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2024-05-30
- Priority Date
- 2024-05-10
Claims (13)
- A device comprising: a set of cameras; a set of displays; memory; and one or more processors operably coupled to the memory, wherein the one or more processors are configured to execute instructions causing the one or more processors to: capture an image stream using the set of cameras; generate a first set of transformed images from the image stream by applying, to the images in the image stream, a perspective correction operation that transforms a viewpoint of the images in the image stream; display the first set of transformed images using the set of displays; receive a capture request; and generate a set of stereo output images in response to receiving the capture request, wherein generating the set of stereo output images comprises: selecting a set of images from the image stream; generating a second set of transformed images from the set of images by applying, to the set of images, an image correction operation, different from the perspective correction operation, that projects the images of the image stream onto a target image plane; and generating the set of stereo output images using the second set of transformed images.
- The device according to claim 1, wherein generating the set of stereo output images comprises generating a fused stereo output image from two or more pairs of images from the set of images.
- The device according to claim 1, wherein the set of stereo output images includes stereo video.
- The device according to claim 3, wherein generating the set of stereo output images further comprises performing a video stabilization operation on the second set of transformed images.
- The device according to any one of claims 1 to 4, wherein the step of generating the set of stereo output images includes the step of generating metadata associated with the set of stereo output images.
- The device according to claim 5, wherein the metadata includes field-of-view information of the set of cameras.
- The device according to claim 5, wherein the metadata includes pose information for the set of stereo output images.
- The device according to claim 5, wherein the metadata includes a set of default disparity values for the set of stereo output images.
- The device according to claim 8, wherein the step of generating metadata associated with the set of stereo output images includes the step of selecting the set of default disparity values based on the scene captured by the set of stereo output images.
- A method comprising: capturing an image stream using a set of cameras of a device; generating a first set of transformed images from the image stream by applying, to the images in the image stream, a perspective correction operation that transforms a viewpoint of the images in the image stream; displaying the first set of transformed images using a set of displays of the device; receiving a capture request; and generating a set of stereo output images in response to receiving the capture request, wherein generating the set of stereo output images comprises: selecting a set of images from the image stream; generating a second set of transformed images from the set of images by applying, to the set of images, an image correction operation, different from the perspective correction operation, that projects the images of the image stream onto a target image plane; and generating the set of stereo output images using the second set of transformed images.
- The method according to claim 10, wherein generating the set of stereo output images further comprises generating a fused stereo output image from two or more pairs of images from the set of images.
- The method according to claim 10, wherein the set of stereo output images includes stereo video.
- The method according to claim 12, wherein generating the set of stereo output images further comprises performing a video stabilization operation on the second set of transformed images.
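Taken together, the claims describe a dual-path pipeline: every frame of the image stream is perspective-corrected for live display, while a capture request routes selected frames through a separate correction that projects them onto a target image plane. A minimal sketch of that routing in Python, using simple placeholder warps in place of the real transforms (all function names and the frame-index trigger are illustrative assumptions, not the patent's implementation):

```python
import numpy as np

def perspective_correct(frame):
    """Display path: stand-in for the viewpoint (perspective) correction.
    A real system would reproject the frame using scene depth and the
    offset between the user's eyes and the cameras."""
    return np.flipud(frame)  # placeholder warp

def rectify_to_plane(frame):
    """Capture path: stand-in for the image correction that projects a
    frame onto a shared target image plane."""
    return np.fliplr(frame)  # placeholder warp

def process_stream(stream, capture_request_frame=None):
    """Every frame feeds the live display; once a capture request arrives,
    selected frames are additionally routed through the capture path."""
    displayed, stereo_inputs = [], []
    for i, frame in enumerate(stream):
        displayed.append(perspective_correct(frame))
        if capture_request_frame is not None and i >= capture_request_frame:
            stereo_inputs.append(rectify_to_plane(frame))
    return displayed, stereo_inputs
```

The point of the two separate paths is that the display transform optimizes for the viewer's comfort in real time, while the capture transform optimizes the saved stereo pair, so neither constrains the other.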
Description
Cross-reference to related applications: This application is a non-provisional of, and claims the benefit under 35 U.S.C. § 119(e) of, U.S. Provisional Patent Application No. 63/470,081, filed May 31, 2023, the contents of which are incorporated herein by reference as if fully disclosed herein.

The embodiments described herein generally relate to processing image streams separately for real-time display and for media capture events.

Augmented reality systems can be used to generate partially or entirely simulated environments (e.g., virtual reality environments, mixed reality environments, etc.) in which virtual content can replace or extend the physical world. Simulated environments can provide users with engaging experiences and are used in games, personal communication, virtual travel, healthcare, and many other contexts. In some cases, the simulated environment may include information captured from the user's environment. For example, a passthrough mode of an augmented reality system may use one or more displays of the system to present images of the user's physical environment, allowing the user to perceive that environment through the display(s). The user's environment may be captured using one or more cameras of the augmented reality system. Depending on the camera placement in the augmented reality system, the viewpoint difference between the user and the cameras, if not corrected, may cause the images displayed in passthrough mode to inaccurately reflect the user's physical environment. This may cause user discomfort or otherwise negatively impact the user experience in passthrough mode. The embodiments described herein relate to systems, devices, and methods for performing separate image processing.
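The viewpoint difference described above is commonly removed by back-projecting each pixel to a 3-D point using a depth estimate and re-projecting it from the eye's position. A minimal pinhole-model sketch of that idea (the intrinsics `K`, the translation-only offset, and the function name are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def reproject_pixel(uv, depth, K, t_eye_from_cam):
    """Move one pixel from the camera viewpoint to the user's eye viewpoint.
    Assumes a pinhole camera (intrinsics K) and, for brevity, a purely
    translational offset between camera and eye (no rotation)."""
    u, v = uv
    # Back-project the pixel to a 3-D point in camera coordinates.
    p_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Express the point in the eye's coordinate frame.
    p_eye = p_cam + t_eye_from_cam
    # Project back to pixel coordinates.
    q = K @ p_eye
    return q[:2] / q[2]
```

With a zero offset the pixel maps to itself; a nonzero offset shifts it by an amount that shrinks with scene depth, which is why the correction needs per-pixel depth rather than a single global warp.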
Some embodiments relate to a method that includes: capturing an image stream using a set of cameras of a device; generating a first set of transformed images from the image stream by applying a first transformation operation to the images in the image stream; and displaying the first set of transformed images on a set of displays of the device. The method further includes receiving a capture request and generating a set of stereo output images in response to receiving the capture request. Generating the set of stereo output images includes: selecting a set of images from the image stream; generating a second set of transformed images from the set of images by applying a second transformation operation, different from the first transformation operation, to the set of images; and generating the set of stereo output images using the second set of transformed images.

In some variations of these methods, generating the set of stereo output images includes generating a fused stereo output image from two or more pairs of images in the set. Additionally or alternatively, the set of stereo output images includes a stereo video. In some of these variations, generating the set of stereo output images includes performing a video stabilization operation on the second set of transformed images. In some cases, the first transformation operation is a perspective correction operation. Additionally or alternatively, the second transformation operation may be an image correction operation. Generating the set of stereo output images may include generating metadata associated with the set of stereo output images. The metadata may include field-of-view information for the set of cameras, pose information for the set of stereo output images, and/or a set of default disparity values for the set of stereo output images.
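The default-disparity metadata mentioned above can be related to scene geometry through the standard stereo relation d = f·B/Z (disparity equals focal length times baseline over depth). A hedged sketch, where the container fields and the idea of deriving the default from a representative scene depth are assumptions for illustration, not the patent's stated method:

```python
from dataclasses import dataclass

@dataclass
class StereoMetadata:
    """Illustrative container for the metadata kinds named in the text."""
    fov_deg: tuple               # field-of-view information for the camera set
    pose: list                   # pose information for the stereo output images
    default_disparity_px: float  # scene-dependent default disparity value

def default_disparity_px(focal_px, baseline_m, scene_depth_m):
    """Disparity in pixels for a point at scene_depth_m, from d = f * B / Z."""
    return focal_px * baseline_m / scene_depth_m
```

For example, a 500-pixel focal length and a 64 mm baseline give a 16-pixel disparity for a subject 2 m away; a playback device could use such a default to place the stereo pair at a comfortable depth before any per-pixel analysis.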
In some cases, generating the metadata associated with the set of stereo output images includes selecting the set of default disparity values based on the scene captured by the set of stereo output images. In some variations, virtual content is added to the first set of transformed images.

Some embodiments relate to a device comprising a set of cameras, a set of displays, memory, and one or more processors operably coupled to the memory, wherein the one or more processors are configured to execute instructions causing the one or more processors to perform any of the methods described above. Similarly, other embodiments relate to a non-transitory computer-readable medium containing instructions that, when executed by at least one computing device, cause the at least one computing device to perform operations including any of the steps of the methods described above.

Further embodiments relate to a method comprising capturing a first set of image pairs of a scene using a first camera and a second camera of a device. Each image pair in the first set of image pairs comprises a first image captured by the first camera and a second image captured by the second camera. The method includes generating a first set of transformed image pairs from the firs