US-12620111-B2 - Image processing apparatus, image processing method, and storage medium
Abstract
An image processing apparatus obtains a first captured image from a first image capturing device and a second captured image from a second image capturing device, and a first image capturing parameter in capturing the first captured image and a second image capturing parameter in capturing the second captured image. Three-dimensional shape data indicating a shape of an object image-captured by the first and second image capturing devices is obtained. Parallax information on the first and second captured images is obtained based on a search point included in the first captured image and a corresponding point corresponding to the search point and included in the second captured image. A search target region limiting a range of a pixel set as the search point is set in the first captured image based on the first and second image capturing parameters and the three-dimensional shape data.
Inventors
- Miki WAKUI
Assignees
- CANON KABUSHIKI KAISHA
Dates
- Publication Date
- 20260505
- Application Date
- 20230712
- Priority Date
- 20220721
Claims (16)
- 1 . An image processing apparatus comprising: one or more hardware processors; and one or more memories storing one or more programs configured to be executed by the one or more hardware processors, the one or more programs including instructions for: obtaining a first captured image obtained by image capturing by a first image capturing device and a second captured image obtained by image capturing by a second image capturing device; obtaining a first image capturing parameter, which is an image capturing parameter in a case where the first captured image is captured, and a second image capturing parameter, which is an image capturing parameter in a case where the second captured image is captured; obtaining three-dimensional shape data indicating a shape of an object image-captured by the first image capturing device and the second image capturing device; obtaining parallax information on the first captured image and the second captured image based on a search point included in the first captured image and a corresponding point corresponding to the search point and included in the second captured image; and setting a search target region limiting a range of a pixel set as the search point in the first captured image based on the first image capturing parameter, the second image capturing parameter, and the three-dimensional shape data; wherein based on the first image capturing parameter and the three-dimensional shape data, a first position in the three-dimensional shape data that corresponds to a pixel of the first captured image is obtained, based on the second image capturing parameter, a second position in the second captured image that corresponds to the first position is obtained, a first depth value indicating a distance from the second position to the first position is obtained, based on the second image capturing parameter and the three-dimensional shape data, a depth map corresponding to the second captured image is obtained, based on the depth map, 
a second depth value corresponding to the second position is obtained, and based on the first depth value and the second depth value, the search target region is set.
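The depth-consistency test recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a pinhole camera model with intrinsics `K` and pose `(R, t)`, and the helper names (`project`, `in_search_region`) and the tolerance `tol` are hypothetical.

```python
import numpy as np

def project(K, R, t, X):
    """Project world point X into a pinhole camera with intrinsics K and
    pose (R, t); returns the pixel coordinate and the depth along the
    optical axis."""
    Xc = R @ X + t              # world -> camera coordinates
    depth = Xc[2]
    uv = K @ (Xc / depth)       # perspective division
    return uv[:2], depth

def in_search_region(K2, R2, t2, first_position, depth_map, tol=0.05):
    """Claim 1 sketch: project the "first position" (a point on the 3-D
    shape data) into the second camera to get the "second position" and
    the "first depth value", read the "second depth value" from the
    rendered depth map, and keep the pixel only if the two depths agree
    (cf. claim 7: difference within a predetermined value), i.e. the
    point is not occluded in the second view."""
    uv, d1 = project(K2, R2, t2, first_position)
    u, v = int(round(uv[0])), int(round(uv[1]))
    h, w = depth_map.shape
    if not (0 <= u < w and 0 <= v < h):
        return False            # outside the second camera's viewing angle
    d2 = depth_map[v, u]        # second depth value
    return abs(d1 - d2) <= tol
```

When the depth-map value is smaller than the projected point's depth, something nearer occludes the point in the second view, so the pixel is excluded from the search target region.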
- 2 . The image processing apparatus according to claim 1 , wherein the parallax information is obtained by identifying the corresponding point corresponding to the search point included in the search target region from the second captured image.
- 3 . The image processing apparatus according to claim 1 , wherein the search target region is set as an image region corresponding to a part of the object shown in both the first captured image and the second captured image out of an image region in the first captured image in which the object is shown.
- 4 . The image processing apparatus according to claim 1 , wherein the first position is obtained by back-projecting a position corresponding to the pixel of the first captured image to the three-dimensional shape data by using the first image capturing parameter.
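The back-projection of claim 4 can be sketched as intersecting the pixel's viewing ray with the shape data. As an illustrative assumption, a plane `n.X = d` stands in for the three-dimensional shape data, and the camera sits at the world origin looking along +Z; the function name is hypothetical.

```python
import numpy as np

def back_project_to_plane(K, pixel, n, d):
    """Claim 4 sketch: back-project a pixel through intrinsics K onto the
    plane n.X = d (a stand-in for the three-dimensional shape data)."""
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing-ray direction
    s = d / (n @ ray)                               # scale so that n.(s*ray) = d
    return s * ray                                  # the "first position"
```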
- 5 . The image processing apparatus according to claim 1 , wherein the second position is obtained by projecting the first position to the second captured image by using the second image capturing parameter.
- 6 . The image processing apparatus according to claim 1 , wherein the depth map is obtained by projecting a position of each of a plurality of elements forming the three-dimensional shape data to the second captured image by using the second image capturing parameter.
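The depth-map generation of claim 6 amounts to a z-buffer render of the shape data into the second camera. A minimal sketch, assuming the "elements" are 3-D points and a pinhole model (all names hypothetical):

```python
import numpy as np

def render_depth_map(K, R, t, points, shape):
    """Claim 6 sketch: project each element of the shape data into the
    second camera and keep the nearest depth per pixel (a simple z-buffer)."""
    h, w = shape
    depth = np.full((h, w), np.inf)
    for X in points:
        Xc = R @ X + t
        if Xc[2] <= 0:
            continue                       # behind the camera
        uv = K @ (Xc / Xc[2])
        u, v = int(round(uv[0])), int(round(uv[1]))
        if 0 <= u < w and 0 <= v < h:
            depth[v, u] = min(depth[v, u], Xc[2])
    return depth
```

Keeping the minimum depth per pixel is what lets the later depth comparison detect occlusion: an occluded element's own depth exceeds the map value at its pixel.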
- 7 . The image processing apparatus according to claim 1 , wherein the search target region is set as an image region including a pixel of the first captured image in which a magnitude of a difference between the first depth value and the second depth value is equal to or smaller than a predetermined value.
- 8 . The image processing apparatus according to claim 1 , wherein in a case where a position of an element forming the three-dimensional shape data is within a viewing angle of each of the first image capturing device and the second image capturing device, the search target region is set as an image region including a pixel of the first captured image that corresponds to the position of the element.
- 9 . The image processing apparatus according to claim 8 , wherein based on the three-dimensional shape data and the first image capturing parameter, whether the position of the element is within the viewing angle of the first image capturing device is determined, and based on the three-dimensional shape data and the second image capturing parameter, whether the position of the element is within the viewing angle of the second image capturing device is determined.
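The viewing-angle tests of claims 8 and 9 can be sketched as a simple in-frustum check per camera, again under an assumed pinhole model (function names hypothetical):

```python
import numpy as np

def within_view(K, R, t, X, shape):
    """Claim 9 sketch: an element is within a camera's viewing angle if it
    lies in front of the camera and projects inside the image bounds."""
    h, w = shape
    Xc = R @ X + t
    if Xc[2] <= 0:
        return False
    uv = K @ (Xc / Xc[2])
    return bool(0 <= uv[0] < w and 0 <= uv[1] < h)

def visible_to_both(K1, R1, t1, K2, R2, t2, X, shape):
    """Claim 8 sketch: the element contributes to the search target region
    only if both image capturing devices see it."""
    return within_view(K1, R1, t1, X, shape) and within_view(K2, R2, t2, X, shape)
```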
- 10 . The image processing apparatus according to claim 8 , wherein the pixel of the first captured image that corresponds to the position of the element is identified by projecting the position of the element to the first captured image by using the first image capturing parameter.
- 11 . The image processing apparatus according to claim 1 , wherein the search target region is set in the first captured image after being transformed based on the first image capturing parameter and the second image capturing parameter.
- 12 . The image processing apparatus according to claim 11 , wherein the transformation of the first captured image is transformation by stereo parallelization.
- 13 . The image processing apparatus according to claim 1 , wherein the one or more programs further include an instruction for: correcting the three-dimensional shape data based on the parallax information, the first image capturing parameter, and the second image capturing parameter.
- 14 . The image processing apparatus according to claim 1 , wherein the one or more programs further include an instruction for: evaluating whether a combination of the first image capturing device and the second image capturing device is appropriate as two image capturing devices forming a stereo camera based on information indicating the search target region and the parallax information.
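Claim 14 does not specify the evaluation criterion. As a purely illustrative assumption, one plausible score for a camera pair is the fraction of object pixels that fall inside the search target region, i.e. the part of the object visible from both cameras; the function name and the metric itself are hypothetical, not from the patent.

```python
import numpy as np

def pair_coverage(search_region_mask, object_mask):
    """Hypothetical claim-14 score: fraction of object pixels inside the
    search target region. A low score suggests the pair shares little of
    the object and is a poor stereo camera for it. (Illustrative metric,
    not specified by the patent.)"""
    obj = object_mask.sum()
    return float((search_region_mask & object_mask).sum()) / obj if obj else 0.0
```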
- 15 . An image processing method comprising the steps of: obtaining a first captured image obtained by image capturing by a first image capturing device and a second captured image obtained by image capturing by a second image capturing device; obtaining a first image capturing parameter, which is an image capturing parameter in a case where the first captured image is captured, and a second image capturing parameter, which is an image capturing parameter in a case where the second captured image is captured; obtaining three-dimensional shape data indicating a shape of an object image-captured by the first image capturing device and the second image capturing device; obtaining parallax information on the first captured image and the second captured image based on a search point included in the first captured image and a corresponding point corresponding to the search point and included in the second captured image; and setting a search target region limiting a range of a pixel set as the search point in the first captured image based on the first image capturing parameter, the second image capturing parameter, and the three-dimensional shape data; wherein based on the first image capturing parameter and the three-dimensional shape data, a first position in the three-dimensional shape data that corresponds to a pixel of the first captured image is obtained, based on the second image capturing parameter, a second position in the second captured image that corresponds to the first position is obtained, a first depth value indicating a distance from the second position to the first position is obtained, based on the second image capturing parameter and the three-dimensional shape data, a depth map corresponding to the second captured image is obtained, based on the depth map, a second depth value corresponding to the second position is obtained, and based on the first depth value and the second depth value, the search target region is set.
- 16 . A non-transitory computer readable storage medium storing a program for causing a computer to perform a control method of an image processing apparatus, the control method comprising the steps of: obtaining a first captured image obtained by image capturing by a first image capturing device and a second captured image obtained by image capturing by a second image capturing device; obtaining a first image capturing parameter, which is an image capturing parameter in a case where the first captured image is captured, and a second image capturing parameter, which is an image capturing parameter in a case where the second captured image is captured; obtaining three-dimensional shape data indicating a shape of an object image-captured by the first image capturing device and the second image capturing device; obtaining parallax information on the first captured image and the second captured image based on a search point included in the first captured image and a corresponding point corresponding to the search point and included in the second captured image; and setting a search target region limiting a range of a pixel set as the search point in the first captured image based on the first image capturing parameter, the second image capturing parameter, and the three-dimensional shape data; wherein based on the first image capturing parameter and the three-dimensional shape data, a first position in the three-dimensional shape data that corresponds to a pixel of the first captured image is obtained, based on the second image capturing parameter, a second position in the second captured image that corresponds to the first position is obtained, a first depth value indicating a distance from the second position to the first position is obtained, based on the second image capturing parameter and the three-dimensional shape data, a depth map corresponding to the second captured image is obtained, based on the depth map, a second depth value corresponding to the second 
position is obtained, and based on the first depth value and the second depth value, the search target region is set.
Description
BACKGROUND

Field

The present disclosure relates to a technique for obtaining the parallax of an object shown in multiple captured images.

Description of the Related Art

There are techniques for obtaining the three-dimensional shape of an object from multiple images. For example, stereo matching is one method of obtaining distance information indicating the three-dimensional shape of an object shown in multiple images. In stereo matching, first, a first captured image and a second captured image are obtained by two image capturing devices, corresponding to the right and left image capturing devices of a stereo camera, arranged at an interval. Next, a feature point of the object extracted from the first captured image, captured by one image capturing device, is used as a search point, and the corresponding point is searched for and identified in the second captured image, captured by the other image capturing device. Parallax information is obtained by comparing the coordinates of the search point with those of the corresponding point. The obtained parallax information and image capturing parameters, such as the position and focal length of each image capturing device, are then used to obtain distance information (a depth value) to the object by the principle of triangulation or the like. With the above procedure, it is possible to obtain information indicating the three-dimensional shape of the object. International Publication No. WO2020/183711 discloses a technique in which the parallax is predicted based on a distance to the object measured by a three-dimensional measurement method, and stereo matching is performed by setting a search range for the corresponding point in the second captured image by using the predicted parallax.
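The stereo-matching and triangulation steps described above can be sketched for a rectified image pair, where the corresponding point lies on the same row and the depth follows Z = f·B/d. This is a generic textbook sketch (sum-of-absolute-differences matching), not the patent's method; all names and the window/range parameters are hypothetical.

```python
import numpy as np

def match_along_row(left_row, right_row, x, patch=3, max_disp=20):
    """For a search point at column x in the left row, find the
    corresponding column in the right row by minimising the sum of
    absolute differences over a small window, and return the disparity
    x_left - x_right (rectified pair assumed)."""
    best_cost, best_xr = np.inf, x
    for d in range(max_disp + 1):
        xr = x - d
        if xr - patch < 0:
            break
        cost = np.abs(left_row[x - patch:x + patch + 1]
                      - right_row[xr - patch:xr + patch + 1]).sum()
        if cost < best_cost:
            best_cost, best_xr = cost, xr
    return x - best_xr

def depth_from_disparity(f, baseline, disparity):
    """Triangulation for a rectified pair: Z = f * B / d."""
    return f * baseline / disparity
```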
If there are multiple objects, or if the shape of the object is irregular, the corresponding point for a search point identified in the first captured image may not be shown in the second captured image because it is shielded by another object or by the object itself. In other cases, the corresponding point is not shown in the second captured image because the object is not within the viewing angle of the other image capturing device, for example where the object is so close to one image capturing device that the corresponding point is cut off at the image edge. In such cases, the technique disclosed in International Publication No. WO2020/183711 performs stereo matching using wrong parallax information, and correct distance information cannot be obtained. There has thus been a possibility of using inaccurate parallax information in obtaining a three-dimensional shape from multiple images.
SUMMARY

An image processing apparatus according to the present disclosure comprises: one or more hardware processors; and one or more memories storing one or more programs configured to be executed by the one or more hardware processors, the one or more programs including instructions for: obtaining a first captured image obtained by image capturing by a first image capturing device and a second captured image obtained by image capturing by a second image capturing device; obtaining a first image capturing parameter, which is an image capturing parameter in a case where the first captured image is captured, and a second image capturing parameter, which is an image capturing parameter in a case where the second captured image is captured; obtaining three-dimensional shape data indicating a shape of an object image-captured by the first image capturing device and the second image capturing device; obtaining parallax information on the first captured image and the second captured image based on a search point included in the first captured image and a corresponding point corresponding to the search point and included in the second captured image; and setting a search target region limiting a range of a pixel set as the search point in the first captured image based on the first image capturing parameter, the second image capturing parameter, and the three-dimensional shape data. Further features of the present disclosure will become apparent from the following description of exemplary embodiments with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an example of a functional configuration of an image processing apparatus according to Embodiment 1;
FIG. 2 is a block diagram illustrating an example of a hardware configuration of the image processing apparatus according to Embodiment 1;
FIG. 3 is a flowchart illustrating an example of a processing flow by a CPU included in the image processing apparatus according to Embodiment 1