CN-122002016-A - Generating alternative image views from stereoscopic disparity data
Abstract
The present disclosure relates to generating alternative image views from stereo disparity data. The methods presented herein provide for generating an alternative view from disparity data captured for one or more objects in a scene. The process may be performed using an embedded processor with direct memory access (DMA) or other limited-capacity hardware. An intermediate representation may be generated, which is a two-dimensional (2D) histogram view of the disparity data. The intermediate representation may be transformed into an alternative view image, such as a bird's eye view image, using the embedded processor. Morphological or similar filtering may be performed on one or more objects in the intermediate representation using filters of the same size, regardless of the distance of the objects from the plane of the camera used to capture the disparity data.
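To illustrate the intermediate representation described above, the following Python sketch accumulates a 2D histogram over angle and distance from a disparity map. It is a minimal sketch under stated assumptions, not the disclosed implementation: the pinhole relations (depth equals focal length times baseline divided by disparity, angle taken from the column offset), the uniform binning, and every name used (`disparity_to_histogram`, `focal_px`, `baseline_m`, `cx`, `half_fov_rad`) are hypothetical and introduced only for illustration.

```python
import math


def disparity_to_histogram(disparity, focal_px, baseline_m, cx,
                           n_angle_bins, n_dist_bins, max_dist_m, half_fov_rad):
    """Accumulate a 2D histogram indexed as hist[distance_bin][angle_bin].

    Assumptions (not taken from the source): a rectified pinhole stereo
    pair, disparity given in pixels, and uniform bins in angle and distance.
    """
    hist = [[0] * n_angle_bins for _ in range(n_dist_bins)]
    for row in disparity:
        for u, d in enumerate(row):
            if d <= 0:
                continue  # skip invalid pixels and points at infinity
            dist = focal_px * baseline_m / d       # depth from the camera plane
            angle = math.atan2(u - cx, focal_px)   # horizontal angle of column u
            if dist >= max_dist_m or abs(angle) >= half_fov_rad:
                continue  # outside the histogram's range
            a = int((angle + half_fov_rad) / (2 * half_fov_rad) * n_angle_bins)
            r = int(dist / max_dist_m * n_dist_bins)
            hist[r][a] += 1
    return hist


# Tiny hand-made disparity map: a near object (d = 8) and one far pixel (d = 2).
example = [[0, 8, 8, 0],
           [0, 8, 8, 2]]
hist = disparity_to_histogram(example, focal_px=8.0, baseline_m=1.0, cx=2.0,
                              n_angle_bins=4, n_dist_bins=8,
                              max_dist_m=8.0, half_fov_rad=0.5)
```

Because each input row is consumed independently, the same accumulation could in principle be run over rectangular tiles of the disparity map, which is one way a processor without access to the full image might proceed.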
Inventors
- B. Kisakanin
- HONG QINGYU
Assignees
- NVIDIA Corporation (辉达公司)
Dates
- Publication Date
- 20260508
- Application Date
- 20251105
- Priority Date
- 20241105
Claims (20)
- 1. A system, comprising: at least one embedded processor having direct memory access (DMA) functionality to: generate a two-dimensional (2D) histogram view of one or more objects in an environment based in part on parallax data of the one or more objects, the 2D histogram view being a function of angle and distance relative to a plane of at least one camera used to generate the parallax data; and generate a bird's eye view image of the one or more objects by transforming the 2D histogram view.
- 2. The system of claim 1, wherein the at least one embedded processor has no access to external memory used in generating the bird's eye view image.
- 3. The system of claim 1, wherein the system is further configured to determine the parallax data using image data captured with the at least one camera.
- 4. The system of claim 1, wherein the at least one camera is a stereoscopic camera assembly or a pair of matched camera sensors.
- 5. The system of claim 1, wherein the bird's-eye view image is generated in part by generating a list of object centroids and statistics using the 2D histogram view and transforming the list into a corresponding list in a coordinate system of the bird's-eye view image.
- 6. The system of claim 1, wherein data transfer for the at least one embedded processor is performed on a rectangular area of image data using the DMA functionality.
- 7. The system of claim 1, wherein the at least one embedded processor is further to: receive stereoscopic image data from the at least one camera; and generate, using the stereoscopic image data, a disparity map including the disparity data of the one or more objects.
- 8. The system of claim 1, wherein the parallax data comprises data obtained from at least one additional sensor.
- 9. The system of claim 1, wherein the system comprises at least one of: a system for performing a simulation operation; a system for performing a simulation operation to test or verify an autonomous machine application; a system for performing digital twin operations; a system for performing light transport simulation; a system for rendering a graphical output; a system for performing a deep learning operation; a system for performing a generative AI operation using a large language model (LLM); a system for performing a generative AI operation using a vision language model (VLM); a system for performing a generative AI operation using a multimodal language model (MMLM); a system for deploying one or more language models using an operating system (OS) level virtualization container that communicates with the one or more language models using one or more application programming interfaces (APIs); a system implemented using edge devices; a system for generating or presenting virtual reality (VR) content; a system for generating or presenting augmented reality (AR) content; a system for generating or presenting mixed reality (MR) content; a system comprising one or more virtual machines (VMs); a system implemented at least in part in a data center; a system for performing hardware testing using simulation; a system for synthetic data generation; a collaborative content creation platform for 3D assets; or a system implemented at least in part using cloud computing resources.
- 10. At least one embedded processor having direct memory access (DMA) functionality to generate a bird's-eye view image of a scene by generating an intermediate histogram from parallax data of the scene as a function of angle and distance with respect to a camera plane, and transforming the intermediate histogram into the bird's-eye view image.
- 11. The at least one embedded processor of claim 10, wherein the intermediate histogram includes a representation of one or more objects in the scene, and wherein the at least one embedded processor is further configured to perform a connected component analysis on the intermediate histogram to identify pixel locations associated with the one or more objects.
- 12. The at least one embedded processor of claim 11, wherein the at least one embedded processor is further configured to generate a list of object centroids and statistics of the one or more objects using the intermediate histogram and transform the list into a corresponding list in a coordinate system of the bird's eye view image.
- 13. The at least one embedded processor of claim 10, wherein the at least one embedded processor has no access to a full set of image data stored in an external memory for generating the intermediate histogram or the bird's eye view image.
- 14. The at least one embedded processor of claim 10, wherein data transfer for the at least one embedded processor is performed on a rectangular area of image data using the DMA functionality.
- 15. The at least one embedded processor of claim 10, wherein the at least one embedded processor is further configured to determine the parallax data using at least image data captured with a stereo camera assembly.
- 16. The at least one embedded processor of claim 10, wherein the at least one embedded processor is included in at least one of: a system for performing a simulation operation; a system for performing a simulation operation to test or verify an autonomous machine application; a system for performing digital twin operations; a system for performing light transport simulation; a system for rendering a graphical output; a system for performing a deep learning operation; a system implemented using edge devices; a system for generating or presenting virtual reality (VR) content; a system for generating or presenting augmented reality (AR) content; a system for generating or presenting mixed reality (MR) content; a system comprising one or more virtual machines (VMs); a system implemented at least in part in a data center; a system for performing hardware testing using simulation; a system for synthetic data generation; a system for performing a generative AI operation using a large language model (LLM); a system for performing a generative AI operation using a vision language model (VLM); a system for performing a generative AI operation using a multimodal language model (MMLM); a system for deploying one or more language models using an operating system (OS) level virtualization container that communicates with the one or more language models using one or more application programming interfaces (APIs); a collaborative content creation platform for 3D assets; or a system implemented at least in part using cloud computing resources.
- 17. A computer-implemented method, comprising: generating an intermediate histogram representation of a parallax image using an embedded processor with direct memory access (DMA); identifying a location in the intermediate histogram representation associated with one or more objects; and transforming, using the embedded processor and based in part on the location, the intermediate histogram representation into a bird's eye view image comprising a representation of the one or more objects.
- 18. The computer-implemented method of claim 17, wherein the embedded processor has no access to external memory used in generating the intermediate histogram representation or the bird's eye view image.
- 19. The computer-implemented method of claim 17, further comprising: performing, using the embedded processor, a connected component analysis on the intermediate histogram representation to identify the location associated with the one or more objects.
- 20. The computer-implemented method of claim 18, further comprising: generating, using the embedded processor, a list of object centroids and statistics of the one or more objects from the intermediate histogram representation; and transforming, using the embedded processor, the list into a corresponding list in a coordinate system of the bird's eye view image.
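The connected component analysis and centroid-list transformation recited in the claims can be sketched as follows. This is a hedged illustration only: the 4-connectivity, the count-weighted centroid, the uniform angle and distance bins, and all names (`component_centroids`, `centroids_to_bev`, `min_count`, `half_fov_rad`, `max_dist_m`) are assumptions made here and are not taken from the claims or the description.

```python
import math
from collections import deque


def component_centroids(hist, min_count=1):
    """Label 4-connected groups of occupied bins in hist[distance][angle].

    Returns one (row, col, mass) tuple per component, where row and col are
    count-weighted centroids in bin units and mass is the summed bin count.
    min_count must be at least 1 so that every component has nonzero mass.
    """
    rows, cols = len(hist), len(hist[0])
    seen = [[False] * cols for _ in range(rows)]
    out = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or hist[r0][c0] < min_count:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            mass = sum_r = sum_c = 0.0
            while queue:  # breadth-first flood fill of one component
                r, c = queue.popleft()
                w = hist[r][c]
                mass += w
                sum_r += w * (r + 0.5)  # use bin centers
                sum_c += w * (c + 0.5)
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and hist[nr][nc] >= min_count):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            out.append((sum_r / mass, sum_c / mass, mass))
    return out


def centroids_to_bev(centroids, n_angle_bins, n_dist_bins, max_dist_m, half_fov_rad):
    """Map (row, col, mass) centroids to (x_m, z_m, mass) top-down coordinates.

    Assumes rows bin the distance from the camera plane and columns bin the
    horizontal angle, so the lateral offset is x = z * tan(angle).
    """
    bev = []
    for row, col, mass in centroids:
        z = row / n_dist_bins * max_dist_m
        angle = col / n_angle_bins * 2 * half_fov_rad - half_fov_rad
        bev.append((z * math.tan(angle), z, mass))
    return bev


hist = [[0, 0, 0, 0],
        [0, 2, 2, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
objects = component_centroids(hist)
bev_objects = centroids_to_bev(objects, n_angle_bins=4, n_dist_bins=4,
                               max_dist_m=8.0, half_fov_rad=0.5)
```

Transforming only the short list of centroids, rather than every histogram bin, keeps the final step small. Whether the claimed method orders the steps this way is not stated in the claims, so this ordering is also an assumption.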
Description
Generating alternative image views from stereoscopic disparity data
Technical Field
The present disclosure relates to the transformation of image data between different views or representations and, more particularly, in one or more non-limiting embodiments, to generating an intermediate image representation from a set of disparity data that allows for processing and transformation using limited-capacity resources.
Background
In various computing operations, it is desirable to determine the location of various objects in a scene or geographic area. This may include, for example and without limitation, analyzing captured image information to support tasks such as navigation, positioning, controlled interaction, and collision avoidance for robots and for autonomous or semi-autonomous vehicles or machines. Performing operations such as those involving image recognition and computer vision can require a significant amount of resource capacity, including the ability to access memory having sufficient capacity to store an entire image. Tasks such as generating a bird's eye view (BEV) representation of a scene from captured disparity data are difficult, if not impossible, to perform using resources of limited capacity, such as an embedded processor that does not access external memory. Moreover, tasks such as morphological filtering and motion analysis are resource intensive when performed on a bird's eye view image, because objects at different distances may be captured with different levels of quality or different amounts of information.
Drawings
Various embodiments according to the present disclosure will be described with reference to the accompanying drawings, in which:
FIGS. 1A, 1B, 1C, and 1D illustrate image views that may be generated from captured image data in accordance with at least one embodiment;
FIG. 2A illustrates an intermediate image that may be generated using captured image data in accordance with at least one embodiment;
FIG. 2B illustrates views of similar objects in a bird's eye view (BEV), or top-down, image and in an intermediate histogram image in accordance with at least one embodiment;
FIG. 3 illustrates corresponding image data blocks in a disparity image and an intermediate histogram image in accordance with at least one embodiment;
FIG. 4 illustrates corresponding image data blocks in an intermediate histogram image and a bird's eye view image in accordance with at least one embodiment;
FIG. 5 illustrates an example process that may be performed using an embedded processor to generate a bird's eye view image from disparity image data in accordance with at least one embodiment;
FIG. 6 illustrates an example system including an embedded processor with direct memory access (DMA) functionality in accordance with at least one embodiment;
FIG. 7A illustrates a comparison of the amount of detail captured for an object at different distances from a camera in accordance with at least one embodiment;
FIG. 7B illustrates the different filter sizes required to process the same amount of detail information for objects at different distances in a bird's eye view image in accordance with at least one embodiment;
FIG. 8 illustrates a comparison of filter sizes that may be used to process the same amount of detail information for objects at different distances in a bird's eye view image and in an intermediate histogram image in accordance with at least one embodiment;
FIG. 9 illustrates an example process that may be performed to apply morphological filtering to an intermediate histogram image using a single filter size for objects at different distances in accordance with at least one embodiment;
FIG. 10 illustrates components of a distributed system that can be used to generate, process, and provide sensor-based content in accordance with at least one embodiment;
FIG. 11 illustrates an example computing environment in which one or more devices operate to process data using a system on a chip (SoC) in accordance with at least one embodiment;
FIG. 12 illustrates an example data center system in accordance with at least one embodiment;
FIG. 13 illustrates a computer system in accordance with at least one embodiment;
FIG. 14 illustrates a computer system in accordance with at least one embodiment;
FIG. 15 illustrates at least a portion of a graphics processor in accordance with one or more embodiments;
FIG. 16 illustrates at least a portion of a graphics processor in accordance with one or more embodiments;
FIG. 17A illustrates an example of an autonomous vehicle in accordance with at least one embodiment;
FIG. 17B illustrates an example of camera positions and fields of view for the autonomous vehicle of FIG. 17A in accordance with at least one embodiment;
FIG. 17C is a block diagram illustrating an example system architecture of the autonomous vehicle of FIG. 17A in accordance with at least one embodiment; and
FIG. 17D is a schematic diagram of a system for communication between a cloud-based server and the autonomous vehicle of FIG. 17A in accordanc