CN-122023493-A - Depth estimation optimization method and system for monocular vision image


Abstract

The application discloses a depth estimation optimization method and system for a monocular vision image. The method comprises: acquiring a single-frame true color image output by a monocular camera; determining prior objects in the single-frame true color image together with their corresponding semantic categories and image regions; determining a corresponding physical size constraint based on the semantic type of each prior object; determining a depth constraint of the prior object based on the physical size constraint and the image region; constructing an optimization objective function based on the depth constraint of the prior object in combination with an output result of an objective function; and determining a depth estimation optimization result of the single-frame true color image based on the optimization objective function. Working within the limits of a monocular vision image, the method uses the depth constraints of the prior objects to introduce physical size priors into the depth estimation of the single-frame true color image, so that the depth estimation result is optimized and carries physical scale meaning.

Inventors

LIN XIAO

Assignees

枢途科技(深圳)有限公司

Dates

Publication Date: 20260512
Application Date: 20260224

Claims (10)

1. A method for optimizing depth estimation of a monocular vision image, comprising: acquiring a single-frame true color image output by a monocular camera; determining prior objects in the single-frame true color image and their corresponding semantic categories and image regions; determining a corresponding physical size constraint based on the semantic type of the prior object; determining a depth constraint of the prior object based on the physical size constraint and the image region; constructing an optimization objective function based on the depth constraint of the prior object in combination with an output result of an objective function, wherein the objective function is used for performing depth estimation on the single-frame true color image; and determining a depth estimation optimization result of the single-frame true color image based on the optimization objective function.
2. The method of claim 1, wherein determining the prior objects in the single-frame true color image and their corresponding semantic categories and image regions comprises: performing image recognition on the single-frame true color image using a preset model to obtain at least one prior object and its corresponding semantic category and image region, wherein the preset model comprises an image recognition model or a large model.
3. The method of claim 1, wherein determining the corresponding physical size constraint based on the semantic type of the prior object comprises: for each prior object currently determined, acquiring a corresponding physical size range from a preset prior knowledge base based on the semantic type of the prior object, wherein the preset prior knowledge base comprises a plurality of sample semantic types and corresponding sample physical size ranges; and determining the physical size constraint corresponding to the prior object based on the physical size range.
4. The method of claim 1, wherein determining the depth constraint of the prior object based on the physical size constraint and the image region comprises: determining a scale parameter of the prior object in the single-frame true color image based on the image region; and determining the depth constraint of the prior object based on the physical size constraint and the scale parameter, wherein the depth constraint is used for limiting the range of the depth estimation value of the prior object.
5. The method of claim 4, wherein determining the depth constraint of the prior object based on the physical size constraint and the scale parameter comprises: substituting the physical size constraint and the scale parameter into a target model to determine the depth constraint of the prior object, wherein the target model is $Z_i = \dfrac{f \cdot S_i}{s_i}$, where $Z_i$ represents the estimated depth of the prior object, $f$ represents the focal length of the monocular camera, $S_i$ represents the physical size of the prior object, $s_i$ represents the scale parameter of the prior object, and $i$ represents the index of the prior object.
6. The method of claim 1, wherein constructing the optimization objective function based on the depth constraint of the prior object in combination with the output result of the objective function comprises: determining a corresponding object depth constraint term based on the depth constraint of the prior object; determining a corresponding depth data consistency constraint term based on the output result of the objective function; and constructing the optimization objective function based on the depth data consistency constraint term and the object depth constraint term, wherein the optimization objective function is used for finding the minimum of the sum of the depth data consistency constraint term and the object depth constraint term.
7. The method of claim 6, wherein the optimization objective function comprises $D^{*} = \arg\min_{D}\left[E_{data}(D, D_0) + \lambda \sum_{i} E_{obj}^{(i)}(D)\right]$, where $D^{*}$ represents the output result of the optimization objective function, $E_{data}$ represents the depth data consistency constraint term, $D$ represents the depth being solved for, $D_0$ represents the output result of the objective function, $\lambda$ represents the preset weight parameter, $E_{obj}^{(i)}$ represents the object depth constraint term of the prior object, and $i$ denotes the $i$-th prior object.
8. A depth estimation optimization system for monocular vision images, comprising: a single-frame image acquisition unit, configured to acquire a single-frame true color image output by a monocular camera; a prior information acquisition unit, configured to determine prior objects in the single-frame true color image and their corresponding semantic categories and image regions; a size constraint determining unit, configured to determine a corresponding physical size constraint based on the semantic type of the prior object; a depth constraint determining unit, configured to determine a depth constraint of the prior object based on the physical size constraint and the image region; an objective function optimizing unit, configured to construct an optimization objective function based on the depth constraint of the prior object in combination with an output result of an objective function, wherein the objective function is used for performing depth estimation on the single-frame true color image; and a depth estimation optimization unit, configured to determine a depth estimation optimization result of the single-frame true color image based on the optimization objective function.
9. A storage medium comprising a stored program, wherein the program, when executed by a processor, performs the depth estimation optimization method for monocular vision images according to any one of claims 1-7.
10. An electronic device, comprising a processor, a memory and a bus, wherein the processor is connected with the memory through the bus; the memory is used for storing a program, and the processor is used for running the program, wherein the program, when run by the processor, performs the depth estimation optimization method for monocular vision images according to any one of claims 1-7.
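As an illustration of claims 4 to 7, the following Python sketch derives a depth range for one prior object from a pinhole relation (depth = focal length × physical size / image extent) and then refines an initial depth map against that range. It is a minimal sketch under stated assumptions, not the implementation claimed here: the function names, the bounding-box convention, the use of pixel extent as the scale parameter, and the squared-hinge form of the object depth term are all hypothetical choices, since the claims do not fix the form of either constraint term.

```python
from typing import List, Tuple


def depth_range(focal_px: float,
                size_range_m: Tuple[float, float],
                extent_px: float) -> Tuple[float, float]:
    """Map a physical size range to a depth range via Z = f * S / s.

    Assumption: the scale parameter is the object's extent in pixels and
    the focal length is expressed in pixels.
    """
    if extent_px <= 0:
        raise ValueError("extent_px must be positive")
    s_min, s_max = size_range_m
    return focal_px * s_min / extent_px, focal_px * s_max / extent_px


def refine_depth(d0: List[List[float]],
                 box: Tuple[int, int, int, int],
                 z_range: Tuple[float, float],
                 lam: float) -> List[List[float]]:
    """Minimise, per pixel inside the box, the assumed objective
    (d - d0)^2 + lam * (max(0, z_min - d)^2 + max(0, d - z_max)^2).

    The objective is separable and convex, so its minimiser has the
    closed form d = (d0 + lam * clip(d0, z_min, z_max)) / (1 + lam).
    Pixels outside the box keep their initial estimate.
    """
    x0, y0, x1, y1 = box  # hypothetical convention: end-exclusive pixel box
    z_min, z_max = z_range
    out = [row[:] for row in d0]
    for y in range(y0, y1):
        for x in range(x0, x1):
            target = min(max(d0[y][x], z_min), z_max)
            out[y][x] = (d0[y][x] + lam * target) / (1.0 + lam)
    return out


if __name__ == "__main__":
    # Illustrative numbers only: 1000 px focal length, 400 px object extent.
    z_range = depth_range(1000.0, (1.5, 1.9), 400.0)
    initial = [[10.0, 4.0], [2.0, 10.0]]
    print(z_range, refine_depth(initial, (0, 0, 2, 1), z_range, lam=1.0))
```

With this choice of penalty, a pixel whose initial depth already lies inside the range is left unchanged, while an out-of-range pixel is pulled toward the nearest bound by an amount controlled by the weight `lam`.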

Description

Depth estimation optimization method and system for monocular vision image

Technical Field

The application relates to the technical field of computer vision, and in particular to a depth estimation optimization method and system for monocular vision images.

Background

In application scenarios such as robotics, driver assistance and autonomous driving, and augmented reality, a system typically needs depth information about its environment to perform spatial perception, path planning, or interactive decision-making. In practical engineering, many systems are equipped only with a monocular RGB camera because of cost, power consumption, and system complexity, and therefore cannot directly acquire depth information with physical scale meaning. Depth estimation from monocular images is thus one of the key directions of research and application.

However, monocular depth estimation is a typical ill-posed problem: without additional constraints, a single two-dimensional image cannot uniquely determine the corresponding three-dimensional structure. The prior art mostly relies on deep learning models that learn statistical features from images to predict depth, but the depth output by such methods generally has only relative meaning, and physical scale consistency across different scenes and different objects is difficult to guarantee.

Disclosure of Invention

The application provides a depth estimation optimization method and system for a monocular vision image, which aim to optimize the depth estimation result of the monocular vision image so that the optimized result has physical scale meaning.
In order to achieve the above object, the present application provides the following technical solutions:

A depth estimation optimization method for a monocular vision image, comprising: acquiring a single-frame true color image output by a monocular camera; determining prior objects in the single-frame true color image and their corresponding semantic categories and image regions; determining a corresponding physical size constraint based on the semantic type of the prior object; determining a depth constraint of the prior object based on the physical size constraint and the image region; constructing an optimization objective function based on the depth constraint of the prior object in combination with an output result of an objective function, wherein the objective function is used for performing depth estimation on the single-frame true color image; and determining a depth estimation optimization result of the single-frame true color image based on the optimization objective function.

Optionally, determining the prior objects in the single-frame true color image and their corresponding semantic categories and image regions includes: performing image recognition on the single-frame true color image using a preset model to obtain at least one prior object and its corresponding semantic category and image region, wherein the preset model comprises an image recognition model or a large model.

Optionally, determining the corresponding physical size constraint based on the semantic type of the prior object includes: for each prior object currently determined, acquiring a corresponding physical size range from a preset prior knowledge base based on the semantic type of the prior object, wherein the preset prior knowledge base comprises a plurality of sample semantic types and corresponding sample physical size ranges; and determining the physical size constraint corresponding to the prior object based on the physical size range.
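The knowledge-base lookup described above can be sketched in Python as follows. This is a minimal sketch, assuming the preset prior knowledge base is a plain mapping from semantic type to a (minimum, maximum) size range in metres. The table name, the function name, and every size value are hypothetical placeholders chosen for illustration and do not come from the application.

```python
from typing import Dict, Optional, Tuple

# Hypothetical knowledge base: semantic type -> (min, max) physical height in
# metres. The entries are illustrative placeholders, not values from the source.
PRIOR_SIZES_M: Dict[str, Tuple[float, float]] = {
    "person": (1.5, 1.9),
    "door": (1.9, 2.2),
    "car": (1.4, 1.8),
}


def size_constraint(semantic_type: str,
                    kb: Dict[str, Tuple[float, float]] = PRIOR_SIZES_M
                    ) -> Optional[Tuple[float, float]]:
    """Return the physical size range for a semantic type, or None when the
    type is not in the knowledge base (the object then yields no constraint)."""
    size_range = kb.get(semantic_type.strip().lower())
    if size_range is None:
        return None
    lo, hi = size_range
    if not 0 < lo <= hi:
        raise ValueError(f"invalid size range for {semantic_type!r}: {size_range}")
    return lo, hi


if __name__ == "__main__":
    for label in ("Person", "door", "unicorn"):
        print(label, size_constraint(label))
```

Returning `None` for an unknown type is one possible design choice: an object without a usable size prior simply contributes no depth constraint instead of a guessed one.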
Optionally, determining the depth constraint of the prior object based on the physical size constraint and the image region includes: determining a scale parameter of the prior object in the single-frame true color image based on the image region; and determining the depth constraint of the prior object based on the physical size constraint and the scale parameter, wherein the depth constraint is used for limiting the range of the depth estimation value of the prior object.

Optionally, determining the depth constraint of the prior object based on the physical size constraint and the scale parameter includes: substituting the physical size constraint and the scale parameter into a target model to determine the depth constraint of the prior object, wherein the target model is $Z_i = \dfrac{f \cdot S_i}{s_i}$, where $Z_i$ represents the estimated depth of the prior object, $f$ represents the focal length of the monocular camera, $S_i$ represents the physical size of the prior object, $s_i$ represents the scale parameter of the prior object, and $i$ represents the index of the prior object.

Optionally, constructing the optimization objective function based on the depth constraint of the prior object in combination with the output result of the objective function includes: determining a corresponding object depth constraint term based on the depth constraint of t