DE-112024002271-T5 - IMAGE PROCESSING DEVICE, IMAGE PROCESSING SYSTEM AND IMAGE PROCESSING METHOD


Abstract

An image processing device is provided that uses AI technology and is capable of generating an additional training image without relying on experience or know-how, thereby increasing the accuracy of the additional training image. The image processing device comprises an inference unit that, based on an intermediate layer output of an image recognition model obtained by inference with input of a target image and a candidate image group into the image recognition model, obtains and outputs a feature set relating to the target image and a feature set relating to a candidate image contained in the candidate image group. It also includes a training image generation unit that generates a training image, to be used for training the target image, according to a feature set similarity, which is a similarity between the feature set relating to the target image and the feature set relating to the candidate image. The feature sets are output from the inference unit.
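The comparison of feature sets summarized above can be illustrated with a minimal sketch. The source does not specify the similarity measure or the data layout, so the sketch assumes cosine similarity and treats each feature set as a flat vector of floats taken from an intermediate layer. The names `cosine_similarity` and `select_similar_images` and the threshold value are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the source does not name a metric; cosine similarity is
    used here only as a common choice for intermediate-layer features.
    """
    if len(a) != len(b):
        raise ValueError("feature sets must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero feature set matches nothing
    return dot / (norm_a * norm_b)


def select_similar_images(
    target_features: Sequence[float],
    candidate_features: Dict[str, Sequence[float]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs whose similarity exceeds the threshold,
    ordered from most to least similar to the target image."""
    scored = [
        (name, cosine_similarity(target_features, features))
        for name, features in candidate_features.items()
    ]
    selected = [(name, sim) for name, sim in scored if sim > threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets standing in for intermediate-layer outputs.
target = [1.0, 0.0]
candidates = {"a": [2.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
similar = select_similar_images(target, candidates, threshold=0.5)
```

In this sketch, candidates `a` and `c` exceed the threshold and `b` does not. In a real system the feature vectors would come from the image recognition model itself rather than from hand-written lists.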

Inventors

Ming Liu
Goichi Ono
Hiroaki Ito
Masashi Takada

Assignees

ASTEMO, LTD.

Dates

Publication Date: 20260513
Application Date: 20240611
Priority Date: 20231030

Claims (9)

1. Image processing device comprising: an inference unit which, based on an intermediate layer output of an image recognition model obtained by inference with input of a target image and a candidate image group into the image recognition model, obtains a feature set with respect to the target image and a feature set with respect to a candidate image contained in the candidate image group and outputs the feature sets, and a training image generation unit which generates a training image to be used for training the target image according to a feature set similarity, which is a similarity between the feature set with respect to the target image and the feature set with respect to the candidate image, wherein the feature sets are output from the inference unit.
2. Image processing device according to Claim 1, wherein the training image generation unit outputs the feature set similarity and the generated training image, and wherein the image processing device has a generation count adjustment unit that determines a generation count of the training image based on the feature set similarity and the training image output from the training image generation unit.
3. Image processing device according to Claim 1, wherein the training image generation unit has a similar image selection unit that selects, from a plurality of candidate images contained in the candidate image group, a similar image whose feature set similarity with the target image exceeds a threshold, and the training image generation unit uses the similar image selected by the similar image selection unit to generate the training image.
4. Image processing device according to Claim 2, wherein the generation count adjustment unit reduces the generation count of the training image as the feature set similarity increases.
5. Image processing device according to Claim 1, wherein the inference unit outputs the feature sets and furthermore outputs a result of the inference as a reliability of the target image, the training image generation unit outputs the feature set similarity and the generated training image, and the image processing device has a generation count adjustment unit which receives, as inputs, the feature set similarity and the training image output from the training image generation unit and the reliability output from the inference unit, and determines a generation count of the training image on the basis of the feature set similarity and the reliability.
6. Image processing device according to Claim 5, wherein the generation count adjustment unit reduces the generation count of the training image as the reliability increases.
7. Image processing device according to Claim 1, wherein the inference unit recognizes an image captured by an in-vehicle camera.
8. Image processing system comprising: the image processing device according to Claim 1 and a server connected to the image processing device via a network for communication, wherein the server comprises a learning unit that trains the image recognition model based on the training image, and a model storage unit that stores the trained model.
9. Image processing method comprising: obtaining a feature set relating to a target image and a feature set relating to a candidate image contained in a candidate image group, based on an intermediate layer output of an image recognition model obtained by inference with input of the target image and the candidate image group into the image recognition model; outputting the feature sets; and generating a training image to be used for training the target image according to a feature set similarity, which is a similarity between the output feature set relating to the target image and the output feature set relating to the candidate image.
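The generation count adjustment recited in the claims (fewer training images as the feature set similarity or the reliability increases) can be sketched as follows. The claims state only the direction of the adjustment, not a formula, so the linear scaling, the function name `generation_count`, and the bounds `max_count` and `min_count` are all assumptions made for illustration.

```python
def generation_count(
    similarity: float,
    reliability: float = 0.0,
    max_count: int = 10,
    min_count: int = 1,
) -> int:
    """Number of training images to generate for one selected candidate.

    Assumption: both inputs lie in [0, 1] and the count shrinks linearly
    with each of them. The source only says the count is reduced as the
    similarity or the reliability increases.
    """
    s = min(max(similarity, 0.0), 1.0)   # clamp similarity to [0, 1]
    r = min(max(reliability, 0.0), 1.0)  # clamp reliability to [0, 1]
    scale = (1.0 - s) * (1.0 - r)        # higher similarity or reliability gives a smaller scale
    return max(min_count, round(max_count * scale))


# Similarity only (reliability left at its default of 0.0).
low_similarity_count = generation_count(0.2)
high_similarity_count = generation_count(0.9)

# Similarity and reliability combined.
combined_count = generation_count(0.2, reliability=0.5)
```

With these assumed defaults, a candidate that already resembles the target closely yields fewer generated images than a dissimilar one, and a confident inference result reduces the count further. The floor `min_count` keeps at least one image per selected candidate, which is a design choice of this sketch and not something the source states.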

Description

Technical field

The present invention relates to an image processing device, an image processing system and an image processing method.

State of the art

There is a known image processing technology that uses a camera mounted on a vehicle to detect the outside world and uses the result for automated driving and the like.

This technology must accurately identify a target from a captured image, which is why AI technology is used and why improved retraining techniques for outside world recognition are required.

In AI-based outside world recognition with an in-vehicle camera, evaluating a trained AI model and then collecting and generating additional training images (augmentation images) to improve accuracy on falsely detected images relies heavily on the developer's experience and know-how.

Therefore, the accuracy of the additional training images is unstable, making an increase in accuracy desirable. A technology is needed that can collect and generate additional training images without relying on the developer's experience and know-how, and that can increase accuracy.

Patent document 1 describes a technology that generates a derived image from a training candidate image, calculates a similarity (using the G component of an RGB image) between the training image and the derived image, compares the calculated similarity with a determination threshold, and stores, as a training image, a training candidate image whose similarity is lower than the determination threshold.

Patent document 2 describes a technology that compares a template image area with additional training candidate images, sets a threshold for the result of the comparison, and selects all additional training candidate images whose similarity to the template image area is higher than the set threshold, thereby generating training images.

Prior art documents

Patent documents

Patent document 1: WO-2017-109854-A
Patent document 2: JP-2022-182149-A

Summary of the invention

Problems to be solved by the invention

The similarity between the training image and the derived image described in patent document 1 differs from the similarity used in AI technology. It is therefore difficult to apply the technology of patent document 1 to a device that uses AI technology to select a training candidate image, and a sufficient increase in accuracy may not be obtained.

Furthermore, since images are selected based solely on the similarity threshold, the number of selected images can become large, so that an optimal number of images for retraining is not obtained and improving accuracy is expected to be difficult.

The similarity described in patent document 2 likewise differs from the similarity used in AI technology. As with patent document 1, it is difficult to apply the technology of patent document 2 to a device that uses AI technology to select the training candidate image, and a sufficient increase in accuracy may not be obtained.

Here too, since images are selected based solely on the similarity threshold, the number of selected images can become large, so that an optimal number of images for retraining is not obtained and improving accuracy is expected to be difficult.

One objective of the present invention is to provide an image processing device and an image processing method that use AI technology and are capable of generating an additional training image without recourse to experience and know-how, and of increasing the accuracy of the additional training image.

Means of solving the problems

To achieve the above-described objective, the present invention is configured as follows.

An image processing device comprises an inference unit which, based on an intermediate layer output of an image recognition model obtained by inference with input of a target image and a candidate image group into the image recognition model, obtains a feature set with respect to the target image and a feature set with respect to a candidate image contained in the candidate image group and outputs the feature sets, and a training image generation unit which generates a training image to be used for training the target image according to a feature set similarity, which is a similarity between the feature set with respect to the target image and the feature set with respect to the candidate image, wherein the feature sets are output from the inference unit.

Advantageous effects of the invention

According to the present invention, it is possible to provide an image processing device and an image processing method that use AI technology and are capable of generating an additional training image without recourse to experience and know-how, and of increasing the accuracy of the additional training image.

Brief descript