US-12626383-B1 - Depth information generation based on LiDAR and camera integration
Abstract
A device for generating depth information includes processing circuitry configured to: determine a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determine a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assign different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generate depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.
Inventors
- Louis Joseph Kerofsky
- Madhumitha Sakthi
- Varun Ravi Kumar
- Senthil Kumar Yogamani
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-12-16
Claims (20)
- 1. A device for generating depth information, the device comprising: one or more memories; and processing circuitry coupled to the one or more memories, wherein the processing circuitry is configured to: determine a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determine a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assign different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generate depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.
- 2. The device of claim 1, wherein to generate the depth information, the processing circuitry is configured to: determine which points in the assigned LiDAR frames correspond to which pixels in the camera frame based on capture angles of the pixels and point angles of the points in the assigned LiDAR frames; determine depth values for the points in the assigned LiDAR frames; and generate the depth information for the pixels in the different portions of the camera frame based on the determined depth values for the points in the assigned LiDAR frames and the correspondence of the points in the assigned LiDAR frames and the pixels in the camera frame.
- 3. The device of claim 1, wherein to determine the plurality of LiDAR frame timestamps each indicative of when the respective LiDAR frame of the plurality of LiDAR frames is captured, the processing circuitry is configured to determine a first LiDAR frame timestamp indicative of when a first LiDAR frame is captured and a second LiDAR frame timestamp indicative of when a second LiDAR frame is captured, wherein to assign different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on the capture angle of the different portions of the camera frame, the processing circuitry is configured to: determine that the first LiDAR frame is closer in time to a first portion of the different portions of the camera frame than the second LiDAR frame is to the first portion based on a first capture angle of the first portion, the camera frame timestamp, and the first LiDAR frame timestamp; assign the first portion to the first LiDAR frame; determine that the second LiDAR frame is closer in time to a second portion of the different portions of the camera frame than the first LiDAR frame is to the second portion based on a second capture angle of the second portion, the camera frame timestamp, and the second LiDAR frame timestamp; and assign the second portion to the second LiDAR frame, and wherein to generate the depth information for the pixels in the different portions of the camera frame based on the assigned LiDAR frames, the processing circuitry is configured to generate depth information for pixels in the first portion based on depth information from the first LiDAR frame, and generate depth information for pixels in the second portion based on depth information from the second LiDAR frame.
- 4. The device of claim 1, wherein to generate the depth information, the processing circuitry is configured to: determine respective time differences between the camera frame timestamp and the respective LiDAR frame timestamps of the assigned LiDAR frames; determine a speed of a vehicle that includes a camera used for capturing the camera frame and a LiDAR used for capturing the LiDAR frames; scale depth values from the assigned LiDAR frames based on the speed and the respective time differences; and generate the depth information for the pixels in the different portions of the camera frame based on the scaled depth values of the assigned LiDAR frames.
- 5. The device of claim 4, wherein to generate the depth information for the pixels in the different portions of the camera frame based on the scaled depth values of the assigned LiDAR frames, the processing circuitry is configured to: generate depth information for a first column of pixels in the camera frame based on scaled depth values of a first LiDAR frame assigned to the first column of pixels; and generate depth information for a second column of pixels in the camera frame based on scaled depth values of a second LiDAR frame assigned to the second column of pixels, wherein the first column and the second column correspond to different capture angles.
- 6. The device of claim 1, wherein the processing circuitry is further configured to access timing information of when the different portions of the camera frame are captured based on a rolling shutter of a camera that captured the camera frame, and wherein to assign different portions of the camera frame to different LiDAR frames, the processing circuitry is configured to assign, at a pixel-level, different pixels of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames, the capture angle of the different portions of the camera frame, and the timing information.
- 7. The device of claim 1, wherein the capture angle range is a yaw angle range, and the capture angle is a yaw angle.
- 8. The device of claim 1, wherein the processing circuitry is configured to control an operating parameter of a vehicle based on the depth information.
- 9. The device of claim 8, wherein the operating parameter comprises one of a braking parameter or a path planning parameter.
- 10. A method of generating depth information, the method comprising: determining a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determining a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assigning different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generating depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.
- 11. The method of claim 10, wherein generating the depth information comprises: determining which points in the assigned LiDAR frames correspond to which pixels in the camera frame based on capture angles of the pixels and point angles of the points in the assigned LiDAR frames; determining depth values for the points in the assigned LiDAR frames; and generating the depth information for the pixels in the different portions of the camera frame based on the determined depth values for the points in the assigned LiDAR frames and the correspondence of the points in the assigned LiDAR frames and the pixels in the camera frame.
- 12. The method of claim 10, wherein determining the plurality of LiDAR frame timestamps each indicative of when the respective LiDAR frame of the plurality of LiDAR frames is captured comprises determining a first LiDAR frame timestamp indicative of when a first LiDAR frame is captured and a second LiDAR frame timestamp indicative of when a second LiDAR frame is captured, wherein assigning different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on the capture angle of the different portions of the camera frame comprises: determining that the first LiDAR frame is closer in time to a first portion of the different portions of the camera frame than the second LiDAR frame is to the first portion based on a first capture angle of the first portion, the camera frame timestamp, and the first LiDAR frame timestamp; assigning the first portion to the first LiDAR frame; determining that the second LiDAR frame is closer in time to a second portion of the different portions of the camera frame than the first LiDAR frame is to the second portion based on a second capture angle of the second portion, the camera frame timestamp, and the second LiDAR frame timestamp; and assigning the second portion to the second LiDAR frame, and wherein generating the depth information for the pixels in the different portions of the camera frame based on the assigned LiDAR frames comprises generating depth information for pixels in the first portion based on depth information from the first LiDAR frame, and generating depth information for pixels in the second portion based on depth information from the second LiDAR frame.
- 13. The method of claim 10, wherein generating the depth information comprises: determining respective time differences between the camera frame timestamp and the respective LiDAR frame timestamps of the assigned LiDAR frames; determining a speed of a vehicle that includes a camera used for capturing the camera frame and a LiDAR used for capturing the LiDAR frames; scaling depth values from the assigned LiDAR frames based on the speed and the respective time differences; and generating the depth information for the pixels in the different portions of the camera frame based on the scaled depth values of the assigned LiDAR frames.
- 14. The method of claim 13, wherein generating the depth information for the pixels in the different portions of the camera frame based on the scaled depth values of the assigned LiDAR frames comprises: generating depth information for a first column of pixels in the camera frame based on scaled depth values of a first LiDAR frame assigned to the first column of pixels; and generating depth information for a second column of pixels in the camera frame based on scaled depth values of a second LiDAR frame assigned to the second column of pixels, wherein the first column and the second column correspond to different capture angles.
- 15. The method of claim 10, further comprising: accessing timing information of when the different portions of the camera frame are captured based on a rolling shutter of a camera that captured the camera frame, wherein assigning different portions of the camera frame to different LiDAR frames comprises assigning, at a pixel-level, different pixels of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames, the capture angle of the different portions of the camera frame, and the timing information.
- 16. The method of claim 10, wherein the capture angle range is a yaw angle range, and the capture angle is a yaw angle.
- 17. The method of claim 10, further comprising controlling an operating parameter of a vehicle based on the depth information.
- 18. The method of claim 17, wherein the operating parameter comprises one of a braking parameter or a path planning parameter.
- 19. A computer-readable storage medium storing instructions thereon that when executed cause one or more processors to: determine a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determine a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assign different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generate depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.
- 20. The computer-readable storage medium of claim 19, wherein the instructions that cause the one or more processors to generate the depth information comprise instructions that cause the one or more processors to: determine which points in the assigned LiDAR frames correspond to which pixels in the camera frame based on capture angles of the pixels and point angles of the points in the assigned LiDAR frames; determine depth values for the points in the assigned LiDAR frames; and generate the depth information for the pixels in the different portions of the camera frame based on the determined depth values for the points in the assigned LiDAR frames and the correspondence of the points in the assigned LiDAR frames and the pixels in the camera frame.
Description
TECHNICAL FIELD

The disclosure relates to image processing, including depth data generation.

BACKGROUND

A camera captures a camera frame that includes image content, but may not provide sufficient depth information. A LiDAR system captures a LiDAR frame that includes a point cloud with three-dimensional coordinates for the points. Processing circuitry may determine depth information for pixels in the camera frame using the three-dimensional coordinates of corresponding points in the LiDAR frame.

SUMMARY

In general, this disclosure describes example techniques to generate depth information for pixels in a camera frame based on points in a LiDAR frame, accounting for the LiDAR and the camera capturing frames at different times, as well as for movement of a vehicle that includes the LiDAR and the camera. The LiDAR and the camera may capture frames at different times because they operate at different frame rates and because the LiDAR takes a scan time to capture one frame. As such, camera frames and LiDAR frames typically do not align exactly, and movement of the vehicle compounds the misalignment.

This temporal misalignment can produce discrepancies in the depth information generated for a camera frame, particularly when the vehicle is moving. Inaccurate depth data, in turn, degrades the performance of systems that rely on it, such as autonomous vehicles, advanced driver assistance systems (ADAS), object detection systems, and neural radiance field (NeRF) systems that generate image content.

In one or more examples of the disclosure, processing circuitry may assign different LiDAR frames to different portions of the camera frame based on timestamps for when the LiDAR frames and the camera frame were captured, as well as based on the capture angle of the different portions of the camera frame. The processing circuitry may generate depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames. The different LiDAR frames to which the different portions of the camera frame are assigned may be referred to as assigned LiDAR frames. For example, the processing circuitry may determine which points in the assigned LiDAR frames correspond to which pixels in the camera frame, determine depth values for the points in the assigned LiDAR frames, and generate the depth information for the pixels in the different portions of the camera frame based on the determined depth values and the correspondence between the points and the pixels. In some examples, the processing circuitry may scale the depth values for the points based on the speed of the vehicle to compensate for the vehicle's motion. In some examples, the processing circuitry may further adjust the depth values to account for a rolling shutter, where the camera captures different rows or columns of pixels at different times. By assigning different portions of the camera frame to different LiDAR frames based on these timestamps and the capture angles of the camera frame portions, the processing circuitry can generate more accurate depth information for the pixels in the camera frame.
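To make the timing-based assignment concrete, here is a minimal Python sketch of the two steps the summary describes: assigning camera columns to LiDAR frames by comparing timestamps at each yaw angle, and then reading depth from the nearest-angle point in the assigned frame. It is an illustration rather than the patent's implementation: all function and variable names are hypothetical, and it assumes a global-shutter camera, start-of-scan LiDAR timestamps, and a sweep that starts at yaw 0 and advances uniformly over the scan time.

```python
import numpy as np

def assign_lidar_frames(camera_ts, lidar_frame_ts, scan_time, column_yaws):
    """Step 1: for each camera column, pick the LiDAR frame whose sweep
    passed that column's yaw angle closest in time to the camera exposure.

    camera_ts      : float, timestamp of the (global-shutter) camera frame
    lidar_frame_ts : (N,) start-of-scan timestamp of each LiDAR frame
    scan_time      : float, time the LiDAR takes to sweep the full 360 degrees
    column_yaws    : (W,) yaw angle of each camera column, in degrees
    Returns a (W,) array of indices into lidar_frame_ts.
    """
    lidar_frame_ts = np.asarray(lidar_frame_ts, dtype=float)
    column_yaws = np.asarray(column_yaws, dtype=float)
    # Fraction of the 360-degree sweep elapsed when the beam points at each yaw.
    sweep_frac = (column_yaws % 360.0) / 360.0
    # (N, W): the time each LiDAR frame actually measured each yaw angle.
    measure_ts = lidar_frame_ts[:, None] + sweep_frac[None, :] * scan_time
    # Assign each column to the frame measured nearest the camera timestamp.
    return np.argmin(np.abs(measure_ts - camera_ts), axis=0)

def depth_for_column(column_yaw, point_yaws, point_depths):
    """Step 2: within the assigned LiDAR frame, take the depth of the point
    whose yaw angle is nearest to the column's yaw angle."""
    point_yaws = np.asarray(point_yaws, dtype=float)
    idx = int(np.argmin(np.abs(point_yaws - column_yaw)))
    return point_depths[idx]
```

For instance, with lidar_frame_ts = [0.0, 0.1], scan_time = 0.1, and camera_ts = 0.1, columns at small yaw angles are assigned to the second sweep, which measures them near t = 0.1, while columns at large yaw angles stay with the first sweep, which reached them just before the exposure; this is the two-frame case described in claims 3 and 12.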
Accordingly, the example techniques may better ensure that the generated depth information corresponds accurately to the image content, even though the LiDAR and the camera capture data at different times and even when the vehicle is moving. The processing circuitry may scale the depth values from the LiDAR frames based on the speed of the vehicle and other factors, further improving the accuracy of the generated depth information.

In one example, the disclosure describes a device for generating depth information, the device comprising: one or more memories; and processing circuitry coupled to the one or more memories, wherein the processing circuitry is configured to: determine a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determine a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assign different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generate depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.

In one example, the disclosure describes a method of generating depth information, the method comprising: determining a camera frame timestamp indicative of when a camera frame is captured, the camera frame including image content across a capture angle range; determining a plurality of LiDAR frame timestamps each indicative of when a respective LiDAR frame of a plurality of LiDAR frames is captured over a scan time; assigning different portions of the camera frame to different LiDAR frames based on the camera frame timestamp and respective LiDAR frame timestamps of the different LiDAR frames and based on a capture angle of the different portions of the camera frame, the different LiDAR frames to which the different portions of the camera frame are assigned being assigned LiDAR frames; and generating depth information for pixels in the different portions of the camera frame based on the assigned LiDAR frames.
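Claims 4 and 13 state only that depth values are scaled based on the vehicle speed and the respective time differences, without fixing a formula. The sketch below is one plausible reading under strong assumptions: straight-line forward motion, a static scene, and projection of the travel onto each point's viewing ray via its yaw angle. Every name in it is hypothetical rather than taken from the patent.

```python
import numpy as np

def scale_depths_for_motion(depths, point_yaws, lidar_ts, camera_ts, speed):
    """Adjust LiDAR depths for ego-motion between the LiDAR measurement and
    the camera exposure (illustrative only; assumes straight-line motion).

    depths     : (P,) measured point depths, in meters
    point_yaws : (P,) yaw of each point, in degrees (0 = straight ahead)
    lidar_ts   : scalar or (P,) timestamp(s) at which the points were measured
    camera_ts  : scalar timestamp of the camera frame
    speed      : forward speed of the vehicle, in m/s
    """
    # Positive dt means the camera fired after the LiDAR measured the point.
    dt = camera_ts - np.asarray(lidar_ts, dtype=float)
    # Forward travel during dt, projected onto each viewing ray: a static
    # point straight ahead gets the full correction, points to the side less.
    travel = speed * dt * np.cos(np.radians(np.asarray(point_yaws, dtype=float)))
    return np.asarray(depths, dtype=float) - travel
```

A rolling-shutter refinement in the spirit of claims 6 and 15 would replace the single camera_ts with per-column or per-pixel capture times derived from the shutter timing before computing dt, enabling the pixel-level assignment those claims describe.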