CN-114627357-B - Determining image shares related to image classifier decisions

Abstract

A method for measuring the share of an input image on which an image classifier bases its decision to assign the input image to one or more classes of a predefined classification. The method comprises: processing the input image by the image classifier into an intermediate product by means of a convolution layer; mapping the intermediate product by the image classifier to a classification score for at least one target class; determining a disturbance in the space of the intermediate product by means of inverse images, which the image classifier preferentially assigns to classes other than the target class; providing a binary mask with the same number of pixels as the intermediate product; creating from the intermediate product a variant in which the pixels selected by the binary mask are replaced by the corresponding pixels of the disturbance; mapping the variant by the image classifier to a classification score for a predefined class; and determining from that classification score, by means of a quality function, how well the binary mask describes the decision-related share of the input image that is to be determined.

Inventors

A. M. Munoz Delgado

Assignees

Robert Bosch GmbH (罗伯特·博世有限公司)

Dates

Publication Date: 20260512
Application Date: 20211213
Priority Date: 20201214

Claims (14)

1. A method (100) for measuring a share (2a) of an input image (2) on which an image classifier (1) bases its decision regarding the assignment of the input image (2) to one or more classes of a predefined classification, the method having the steps of:
- processing (110) the input image (2) by the image classifier (1) into a first intermediate product (3) by one or more convolution layers;
- mapping (120), by the image classifier (1), the first intermediate product (3) to a first classification score (7) for at least one target class;
- determining (130) a disturbance (6) in the space of the first intermediate product (3) from one or more inverse images (5a-5c), which the image classifier (1) preferentially assigns to at least one class other than the target class, wherein the disturbance (6) is formed (131) from at least one second intermediate product (3') into which the image classifier (1) processes one or more inverse images (5a-5c), wherein at least one inverse image is selected from a plurality of inverse images such that the second intermediate product (3') formed from it by the image classifier is closest, according to a predefined distance metric, to the first intermediate product (3) formed from the input image, and wherein a cosine distance between the vectors containing the pixel values of the respective intermediate products is used as the distance metric;
- providing (140) at least one binary mask (4) having the same number of pixels as the first intermediate product (3);
- creating (150) at least one variant (3*) from the first intermediate product (3), in which variant the pixels selected by the binary mask (4) are replaced by the corresponding pixels of the disturbance (6), wherein the decision-related shares (190) of the first intermediate product (3) determined from one or more binary masks (4) are converted (190) by upsampling into the decision-related shares (2a) of the input image (2) that are to be determined, and wherein a variant (3*) x'_L, formed using a binary mask m and a disturbance P_L(x) for the input image x, is written as: [formula not reproduced in the source], wherein x is the input image (2) and f_L(x) is the first intermediate product (3) in the latent space produced by the image classifier (1);
- mapping (160) the variant (3*) by the image classifier (1) to a second classification score (7*) for a predefined class;
- determining (170), from the second classification score (7*) and by means of a quality function (8), a rating (8a) of how well the binary mask (4) describes the decision-related share (2a) of the input image (2) that is to be determined, wherein the quality rating R_{x,f}(m) of each binary mask m corresponds to a third classification score f_c that the image classifier (1) assigns to the variant x'_L: R_{x,f}(m) = f_c(x'_L), wherein, when N binary masks m_i are drawn in the space of the intermediate product, the evaluation figure is written as: [formula not reproduced in the source], wherein the subscript f denotes the image classifier (1) and E[m] denotes the expected value of the binary mask.
2. The method (100) according to claim 1, wherein a first intermediate product (3) is selected (111), which first intermediate product (3) is mapped to at least one first classification score (7) by a classifier layer in the image classifier (1).
3. The method (100) according to claim 1 or 2, wherein the disturbance (6) is formed (131a) by averaging, or by forming other aggregate statistics, over a plurality of second intermediate products (3') into which the image classifier (1) processes different inverse images (5a-5c).
4. The method (100) according to claim 1, wherein a plurality of binary masks (4) is provided (141) and wherein the decision-related share (2a) of the input image (2) that is to be determined is determined (180) from the totality of the binary masks (4) and their ratings (8a) by the quality function (8).
5. The method (100) according to claim 4, wherein at least one inverse image (5a-5c) is randomly selected (132) for rating each binary mask (4).
6. The method (100) according to claim 4 or 5, wherein the decision-related shares (2a) of the input image (2) are evaluated (181) as a sum of the binary masks (4), each weighted with its rating (8a) by the quality function (8).
7. The method (100) according to claim 1 or 2, wherein the predefined class for which the second classification score (7*) is determined from the variant (3*) is the target class (161), and wherein the quality function (8) comprises a comparison (171) of the second classification score (7*) with the first classification score (7) determined for the first intermediate product (3).
8. The method (100) according to claim 1 or 2, wherein an image of a mass-produced product is selected (106) as the input image (2) and wherein the classes of the classification represent a quality assessment of the product.
9. The method (100) according to claim 8, wherein the determined share (2a) of the input image (2) on which the image classifier (1) bases its decision is compared (200) with a share (2b) of the input image (2) that has been determined to be relevant for the quality assessment of the product by observing the same product in other imaging modalities, and wherein a quality rating (1a) for the image classifier (1) is determined (210) from the result (200a) of the comparison (200).
10. The method (100) according to claim 1 or 2, wherein an image of a traffic situation recorded from a vehicle is selected (107) as the input image (2) and wherein the classes of the classification represent an evaluation of the traffic situation (50), wherein a future behavior of the vehicle is planned on the basis of the evaluation of the traffic situation.
11. The method (100) according to claim 10, wherein the determined share (2a) of the input image (2) on which the image classifier (1) bases its decision is compared (220) with a share (2b) of the input image (2) known to be relevant for the assessment of the traffic situation, and wherein a quality rating (1a) for the image classifier (1) is determined (230) from the result (220a) of the comparison.
12. A computer program product comprising machine-readable instructions which, when executed on one or more computers, cause the one or more computers to perform the method (100) according to any one of claims 1 to 11.
13. A machine-readable data carrier and/or a download product, wherein the machine-readable data carrier and/or download product has a computer program according to claim 12.
14. A computer, wherein the computer has a computer program according to claim 12 and/or has a machine-readable data carrier and/or a downloaded product according to claim 13.
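As an illustration of claims 1, 4 and 6, the following Python sketch shows one possible way to select the closest inverse image by cosine distance, to combine several binary masks into a share map weighted by their ratings, and to upsample that map. It is a minimal sketch under stated assumptions, not the claimed implementation: all function names are hypothetical, intermediate products are flattened into plain lists, the normalization by N times the empirical mean of the masks is an assumed reading of the expected value E[m] mentioned in claim 1 (the formula itself is not reproduced in this text), and nearest-neighbour upsampling is only one of several possible choices.

```python
import math

def cosine_distance(u, v):
    """Cosine distance between two flattened intermediate products."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def select_inverse(first, candidates):
    """Index of the second intermediate product closest to `first`."""
    return min(range(len(candidates)),
               key=lambda i: cosine_distance(first, candidates[i]))

def aggregate_masks(masks, ratings):
    """Sum of binary masks weighted by their ratings.

    Assumption: the sum is normalized by N * E[m], with E[m] estimated
    as the mean value over all mask pixels.
    """
    n = len(masks)
    size = len(masks[0])
    expected = sum(sum(m) for m in masks) / (n * size)
    if expected == 0.0:
        raise ValueError("all masks are empty")
    return [sum(r * m[j] for m, r in zip(masks, ratings)) / (n * expected)
            for j in range(size)]

def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2-D grid given as a list of rows."""
    return [[v for v in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

# Toy data: a 2-pixel intermediate product and two candidate inverse images.
first = [1.0, 0.0]
candidates = [[0.0, 1.0], [2.0, 0.1]]
best = select_inverse(first, candidates)

# Three masks over a flattened 2x2 intermediate product, with made-up ratings.
masks = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
ratings = [0.9, 0.2, 0.8]
share_map = aggregate_masks(masks, ratings)
upsampled = upsample_nearest([share_map[0:2], share_map[2:4]], 2)
```

In this toy example the second candidate is selected because it points in almost the same direction as the first intermediate product, and the pixels kept by highly rated masks receive the largest values in the share map.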

Description

Determining image shares related to image classifier decisions

Technical Field

The invention relates to checking (Kontrolle) the behavior of trainable image classifiers, which can be used, for example, for the quality control of mass-produced products or for the at least partially automated driving of vehicles.

Background

In the mass production of products, the quality of production usually has to be checked continuously. The aim is to identify quality problems as early as possible so that their cause can be eliminated quickly and without losing an excessive number of units of the product as rejects.

Optical inspection of the geometry and/or surface of a product is fast and non-destructive. WO 2018/197 074 A1 discloses an inspection apparatus in which an object can be exposed to a large number of illumination situations, an image of the object being recorded with a camera in each of these illumination situations. The topography (Topographie) of the object is evaluated from these images.

An image of the product can also be assigned directly to one of a plurality of classes of a predefined classification by an image classifier based on an artificial neural network. On this basis, the product can be assigned to one of a plurality of predefined quality classes. In the simplest case, this classification is binary ("OK" / "not OK").

In the at least partially automated driving of vehicles, trainable image classifiers are likewise used to evaluate the traffic situation, or at least to examine its content (Gehalt) of objects.

Disclosure of Invention

Within the scope of the invention, a method has been developed for measuring the share (Anteil) of an input image on which an image classifier bases its decision regarding the assignment of the input image to one or more classes of a predefined classification.
In this method, the input image is first processed by the image classifier into an intermediate product by one or more convolution layers. The intermediate product has a significantly lower dimensionality than the input image and indicates the activation (Aktivierung) of the features that the convolution layers have successively identified in the input image up to that point. The more convolution layers have contributed to the intermediate product, the more complex the features whose activation it shows. The intermediate product may, for example, comprise a plurality of feature maps, each generated by applying filter kernels (Filterkerne) to the input image or to an intermediate product previously generated from the input image. The intermediate product thus belongs to the "latent space" (latenter Raum) within the image classifier.

The intermediate product is mapped by the image classifier to a classification score (Klassifikations-Score) for at least one target class.

A disturbance in the space of the intermediate product is determined from one or more inverse images (Gegen-Bilder), which the image classifier assigns preferentially (vorrangig) to at least one class other than the target class.

At least one binary mask is then provided that has the same number of pixels as the intermediate product. Each pixel of the mask can take only two different values, for example 0 and 1, or "true" and "false".

At least one variant (Abwandlung) is created from the intermediate product. In this variant, the pixels selected by the binary mask are replaced by the corresponding pixels of the disturbance. For example, all pixels whose value in the mask is 0 or "false" may be replaced in the variant by the corresponding pixels of the disturbance. The variant is mapped by the image classifier to a classification score for a predefined class.
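The creation and scoring of a variant described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the method as implemented by the applicant: `make_variant` and `toy_head` are hypothetical names, the intermediate product is flattened into a plain list, the mask convention (0 means "replace by the disturbance") follows the example given in the text, and `toy_head` is a stand-in for the classifier layers that follow the convolution layers (here simply the mean activation, taken as the target-class score).

```python
def make_variant(intermediate, disturbance, mask):
    """Replace the pixels whose mask value is 0 by the disturbance pixels.

    All three arguments are flattened pixel lists of equal length.
    """
    if not (len(intermediate) == len(disturbance) == len(mask)):
        raise ValueError("intermediate, disturbance and mask must match in size")
    return [f if m == 1 else p
            for f, p, m in zip(intermediate, disturbance, mask)]

def toy_head(features):
    """Hypothetical classifier head: target-class score as mean activation."""
    return sum(features) / len(features)

# Flattened 2x2 intermediate product of the input image (made-up values).
f_x = [0.9, 0.8, 0.1, 0.2]
# Disturbance, e.g. the intermediate product of an inverse image (made-up).
p_x = [0.0, 0.1, 0.0, 0.1]
# Binary mask: keep the first two pixels, replace the last two.
mask = [1, 1, 0, 0]

variant = make_variant(f_x, p_x, mask)
score_original = toy_head(f_x)      # score for the intermediate product
score_variant = toy_head(variant)   # score for the variant
```

Comparing `score_variant` with `score_original` is one way to judge how much of the target-class evidence lies in the pixels that the mask leaves unchanged.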
A quality function (Gütefunktion) is used to determine from this classification score a measure of how well the binary mask describes the decision-related share of the input image that is to be determined.

By a suitable choice of the quality function and of the class to which the classification score of the variant relates, different aspects of the decision-related share of the input image can be studied.

If the predefined class for which the classification score of the variant is determined is, for example, the target class, the quality function may comprise a comparison of this classification score with the classification score determined for the intermediate product. It is then possible, for example, to provide a mask that leaves a defined small region of the intermediate product unchanged while the rest of the intermediate product is subjected to the disturbance. It can then be investigated, for example, whether the unchanged region is so important for the assignment of the intermediate product to the target class that this assignment can no longer be shaken by applying the disturbance in other regions of the interme