
CN-114616568-B - Defense generator, method and computer-readable storage medium for preventing attacks on an AI unit


Abstract

The application relates to a defense generator (20) for dynamically generating at least one AI defense module (16). The core feature of the application is to determine the distribution function of the model data. The application is based on the assumption that the model data belong to a model manifold, i.e. have similar statistical behavior. It can thus be determined for an input data set whether data of the input data set can be associated with an adversarial attack. For example, if a statistical anomaly is found in the input data set, it may be determined that the data of the input data set are associated with an adversarial attack.

Inventors

  • Felix Arsen
  • Florence Fabian Gersner
  • Frank Crechmir
  • Stephen Xinze

Assignees

  • Neurocat GmbH (纽罗卡特有限责任公司)

Dates

Publication Date
2026-05-08
Application Date
2020-10-13
Priority Date
2019-10-14

Claims (13)

  1. A defense generator (20) for dynamically generating at least one AI defense module (16), comprising: - a chunking unit (21) for determining at least one chunk (26, 53, 61) of model data (15), wherein the model data (15) are associated with an AI unit (30) and the at least one chunk (26, 53, 61) represents at least a subset of the model data (15); - an aggregation unit (22) for determining aggregated data (27), wherein the aggregated data (27) assign at least one key value to the at least one chunk (26, 53, 61); - a distribution unit (23) for determining a distribution function (28, 65) of the aggregated data (27); - an inference unit (24) for determining at least one inference configuration (29) using the distribution function (28, 65); - a data conversion unit (25) for generating at least one AI defense module (16) of the AI unit (30) using the at least one inference configuration (29), wherein the at least one AI defense module (16) is adapted to perform the following steps on an input data set (5, 14) of the AI unit (30): o determining whether an attack on the AI unit (30) can be associated with the input data set (5, 14), and/or o determining a second input data set (14) by using a data transformation, wherein an attack on the AI unit (30) cannot be associated with the second input data set (14); wherein the inference unit (24) is configured to receive a reconstruction indication (242) and to determine the inference configuration (29) from the reconstruction indication (242), the reconstruction indication (242) representing whether a data transformation involves the whole model data (15), a random subset of the model data (15) and/or an importance-based selection of the model data (15). (An illustrative sketch of this pipeline is given after the claims.)
  2. The defense generator (20) of claim 1, wherein the chunking unit (21) is configured to receive a chunk indication (51, 211), wherein the chunk indication (51, 211) is user-definable and represents a kernel size, a stride and/or an offset, and wherein the chunking unit (21) is configured to determine the at least one chunk (26, 53, 61) using the chunk indication (51, 211).
  3. The defense generator (20) according to any one of claims 1 to 2, wherein the aggregation unit (22) is configured to receive an aggregation indication (221), wherein the aggregation indication (221) and/or the at least one key value in each case represents a singular value decomposition, a convolution, an average, a median and/or a variance of the at least one chunk (26, 53, 61), and wherein the aggregation unit (22) is configured to determine the aggregated data (27) from the aggregation indication (221).
  4. The defense generator (20) according to any one of claims 1 to 2, wherein the aggregation unit (22) is configured to receive the chunk indication (211, 51) from the chunking unit (21), and wherein the aggregation unit (22) is configured to determine the aggregated data (27) from the chunk indication (51, 211) and the at least one chunk (26, 53, 61).
  5. The defense generator (20) according to any one of claims 1 to 2, wherein the defense generator (20) is configured to receive a target definition (12), wherein the inference unit (24) is configured to determine the inference configuration (29) from the target definition (12), and wherein the data conversion unit (25) is adapted to select, depending on the inference configuration (29), whether to perform the following operations: o determining whether the input data set (5, 14) can be associated with an attack on the AI unit (30), and/or o determining a second input data set (14) by using a data transformation, wherein an attack on the AI unit (30) cannot be associated with the second input data set (14).
  6. The defense generator (20) according to any one of claims 1 to 2, wherein the distribution unit (23) is configured to receive a distribution indication (231) and to determine the distribution function (65) from the distribution indication (231), wherein the distribution indication (231) represents an explicit or implicit distribution function (28).
  7. The defense generator (20) according to any one of claims 1 to 2, wherein the inference unit (24) is configured to receive at least one threshold (241) and to determine the inference configuration from the at least one threshold (241), wherein the at least one threshold (241) indicates that the AI defense module (16) performs the data transformation when the at least one threshold (241) is exceeded.
  8. The defense generator (20) according to any one of claims 1 to 2, wherein the data transformation forms a sampling method.
  9. A defense system (10) against attacks on AI units (30), the defense system comprising: - an input unit (13) for receiving input data and/or an input model as model data (15); - a defense generator (20) according to any one of claims 1 to 8 for receiving the model data (15) and generating at least one AI defense module (16); - an AI unit (30) for using the at least one AI defense module (16), before performing a regression and/or classification, for: o determining whether an attack on the AI unit (30) can be associated with an input data set (5, 14) of the AI unit (30), and/or o determining a second input data set (14) by using a data transformation, wherein an attack on the AI unit (30) cannot be associated with the second input data set (14), and using the second input data set (14) in the regression and/or classification.
  10. A method of dynamically generating AI defense modules (16), the method comprising the steps of: - determining at least one chunk (26, 53, 61) of model data (15), wherein the model data (15) are associated with an AI unit (30) and the at least one chunk (26, 53, 61) represents at least a subset of the model data (15); - determining aggregated data (27), wherein the aggregated data (27) assign at least one key value to the at least one chunk (26, 53, 61); - determining a distribution function (28, 65) of the aggregated data (27); - determining at least one inference configuration (29) using the distribution function (28, 65); - generating at least one AI defense module (16) of the AI unit (30) using the at least one inference configuration (29), wherein the at least one AI defense module (16) is adapted, for an input data set (5, 14) of the AI unit (30), to perform: o determining whether an attack on the AI unit (30) can be associated with the input data set (5, 14), and/or o determining a second input data set (14) by using a data transformation, wherein an attack on the AI unit (30) cannot be associated with the second input data set (14); wherein the method further comprises receiving a reconstruction indication (242), wherein the inference configuration (29) is determined from the reconstruction indication (242), and wherein the reconstruction indication (242) represents whether a data transformation involves the whole model data (15), a random subset of the model data (15) and/or an importance-based selection of the model data (15).
  11. A method for preventing attacks on AI units (30), the method comprising the steps of: - generating at least one AI defense module (16) according to claim 10; - determining whether an input data set (5, 14) of the AI unit (30) can be associated with an attack on the AI unit (30) by using the at least one AI defense module (16), and/or - determining a second input data set (14) by using the at least one AI defense module (16), wherein an attack on the AI unit (30) cannot be associated with the second input data set (14), and using the second input data set (14) in a regression and/or classification.
  12. The method of claim 11, wherein the method further comprises: - receiving a target definition (12); - determining an inference configuration (29) from the target definition (12); - selecting, according to the target definition (12), whether to perform the following operations: o determining whether an attack on the AI unit (30) can be associated with the input data set (5, 14), and/or o determining the second input data set (14) by using a data transformation, wherein an attack on the AI unit (30) cannot be associated with the second input data set (14).
  13. A computer-readable storage medium containing instructions that, when executed by at least one processor, cause the at least one processor to implement the method of any one of claims 10 to 12.
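
The following sketch illustrates, in Python, one way the claimed pipeline could be realized. All function names, the choice of mean and variance as key values, and the diagonal-Gaussian distribution function are illustrative assumptions, not the patent's reference design.

```python
# Hypothetical sketch of the claimed defense-generator pipeline (claims 1-3, 7):
# chunking with a kernel/stride (chunking unit), key values per chunk
# (aggregation unit), an explicit distribution function (distribution unit),
# and a threshold test applied by the generated AI defense module.
import numpy as np

def make_chunks(data: np.ndarray, kernel: int, stride: int, offset: int = 0):
    """Chunking unit: slide a window of size `kernel` over the flattened data."""
    flat = data.ravel()[offset:]
    return [flat[i:i + kernel] for i in range(0, len(flat) - kernel + 1, stride)]

def aggregate(chunks):
    """Aggregation unit: assign key values (here: mean and variance) to each chunk."""
    return np.array([[c.mean(), c.var()] for c in chunks])

def fit_distribution(agg: np.ndarray):
    """Distribution unit: fit an explicit distribution (diagonal Gaussian)."""
    return agg.mean(axis=0), agg.std(axis=0) + 1e-12

def defense_module(x, mu, sigma, kernel, stride, threshold):
    """Generated AI defense module: flag a statistical anomaly in an input data set."""
    agg = aggregate(make_chunks(x, kernel, stride))
    z = np.abs((agg - mu) / sigma)      # z-scores of the key values per chunk
    return bool(z.max() > threshold)    # True -> input may be an adversarial attack

# Usage: fit on the model data, then screen an incoming input data set.
model_data = np.random.randn(10_000)    # stand-in for the model data (15)
mu, sigma = fit_distribution(aggregate(make_chunks(model_data, kernel=64, stride=32)))
suspicious = defense_module(np.random.randn(1_000), mu, sigma, 64, 32, threshold=6.0)
```

Here `threshold` plays the role of the threshold (241) of claim 7: only when it is exceeded does the module treat the input as anomalous and, depending on the inference configuration, trigger the data transformation.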

Description

Defense generator, method and computer-readable storage medium for preventing attacks on an AI unit

Technical Field

The present application relates to a defense generator for dynamically generating at least one AI defense module, a defense system against attacks on AI units, a method of dynamically generating AI defense modules, a method of preventing attacks on AI units, and a computer-readable storage medium.

Background

Most machine learning methods are known to be susceptible to adversarial perturbations. Robustness against adversarial perturbations is therefore a major challenge in the development of machine learning methods. An adversarial perturbation occurs when the data to be classified are changed in a way that a human observer cannot notice, but that prevents the AI unit from classifying the data correctly. Multiple misclassifications may thus occur. In image classification, for example in the segmentation of image data, an adversarial perturbation may be caused by noise superimposed on the input image. Such noise can be inserted into the input image in a manner that is unrecognizable to a human observer. Adversarial perturbations, however, do not occur in the natural environment, i.e. in sensor data as typically provided. One situation in which adversarial perturbations do occur is when the AI unit is under attack: an attacker modifies the data provided to the AI unit for classification so that they can no longer be classified correctly. This can lead to significant safety risks, especially in safety-critical applications such as highly automated driving. For example, if a misclassification occurs during highly automated driving, the vehicle may fail to recognize a stop sign or misidentify it, or may not stop at a red light, posing a significant risk to other road users. One goal is therefore to reduce the risk of adversarial attacks. A few isolated approaches to this goal exist, but they are tailored to specific attacks, which means that, up to now, countermeasures could only be taken once an attack had occurred. At the same time, the number of possible ways of attacking AI units with adversarial perturbations is, for practical purposes, unlimited. It is therefore an object of the present application to provide a method of protecting AI units from attack. In particular, it is an object of the application to identify adversarial attacks. More particularly, it is an object of the application to devise an AI unit that is robust against attacks. More specifically, it is an object of the application to make it possible to generate an unlimited number of defense modules for AI units.
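
As a concrete illustration of the kind of imperceptible perturbation described in the background above, the following toy example (the linear scorer, the budget `eps`, and the data are illustrative assumptions, not taken from the patent) shows how a change of at most 0.01 per pixel can shift a classifier score by `eps` times the L1 norm of the weights:

```python
# Toy FGSM-style perturbation against a linear scorer; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(784)        # weights of a toy linear "classifier"
x = rng.random(784)                 # a clean, flattened input image

eps = 0.01                          # per-pixel budget, invisible to a human observer
x_adv = x - eps * np.sign(w)        # step against the sign of the gradient (= w here)

# No pixel moves by more than eps, yet the score drops by exactly eps * ||w||_1:
print(w @ x, w @ x_adv, eps * np.abs(w).sum())
```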
Disclosure of Invention

The object of the application is achieved by a defense generator for dynamically generating at least one AI defense module, a defense system against attacks on AI units, a method of dynamically generating AI defense modules, a method of preventing attacks on AI units, and a computer-readable storage medium.

In particular, the above object is achieved by a defense generator for dynamically generating at least one AI defense module, the defense generator comprising: - a chunking unit for determining at least one chunk of model data, wherein the model data are associated with the AI unit and the at least one chunk represents at least a subset of the model data; - an aggregation unit for determining aggregated data, wherein the aggregated data assign at least one, in particular mathematical, key value to the at least one chunk; - a distribution unit for determining a distribution function of the aggregated data; - an inference unit for determining at least one inference configuration using the distribution function; - a data conversion unit for generating at least one AI defense module for the AI unit using the at least one inference configuration, wherein the at least one AI defense module is adapted to perform the following steps on an input data set of the AI unit: o determining whether an attack on the AI unit can be associated with the input data set, and/or o determining a second input data set by using a data transformation, wherein an attack on the AI unit cannot be associated with the second input data set.

The core of the application is to determine the distribution function of the model data. The application is based on the assumption that the model data belong to a model manifold, in other words that the model data have similar statistical behavior. Thus, for an input data set comprising, for example, images from an RGB camera, it can be determined whether the data of the input data set can be associated with an adversarial attack. For example, if a statistical anomaly can be detected in the input data set, it can be determined that the data of the input data set belong to an adversarial attack. The application furthermore comprises the possibility that a second input data set can be determined for an input data set which is not statistically inconspicuous, i.e. for which a statistical anomaly has been detected.
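
A minimal sketch of such a data transformation follows, reusing the hypothetical helpers from the sketch after the claims. The re-sampling strategy below is an assumption in the spirit of claim 8 (the data transformation as a sampling method), not the patent's reference design.

```python
# Hypothetical data transformation: chunks whose key values deviate from the
# fitted distribution are re-drawn from it, yielding a second input data set
# with which an attack can no longer be associated (cf. claims 8 and 10).
import numpy as np

def transform_input(x, mu, sigma, kernel, stride, threshold, rng=None):
    rng = rng or np.random.default_rng()
    x2 = x.astype(float).ravel()    # astype() copies, so x itself is untouched
    for start in range(0, len(x2) - kernel + 1, stride):
        chunk = x2[start:start + kernel]
        key = np.array([chunk.mean(), chunk.var()])
        if np.max(np.abs((key - mu) / sigma)) > threshold:
            # Sampling-based conversion: redraw the chunk so its key values
            # match the learned distribution of the model data.
            x2[start:start + kernel] = rng.normal(mu[0], np.sqrt(max(mu[1], 0.0)), kernel)
    return x2.reshape(x.shape)
```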