CN-122020649-A - Neural network architecture backdoor attack method and system

Abstract

The invention discloses a three-dimensional point cloud architecture backdoor attack method and system. Without introducing any backdoor training samples, multi-level parameter-free sub-detectors are embedded in the forward computation structure of a point cloud network, and a cross-layer feature-consistency decision path and a joint gating mechanism are constructed, so that a specific trigger pattern is detected and responded to at the structural level. When the trigger is absent, the network inference path remains unchanged and the performance of the original main task is not affected. Because the backdoor trigger logic and structural form are fixed in the network architecture, the backdoor can still be activated stably after the model is fine-tuned or retrained. Compared with backdoor methods that rely on data poisoning or weight tampering, the method has notable advantages in detectability, trigger stability and architecture portability, and is suited to security analysis and countermeasure research on three-dimensional point cloud models.

Inventors

XU BIN
Li Zhuangtong
LU WENHAO
DONG ZHENJIANG

Assignees

Nanjing University of Posts and Telecommunications (南京邮电大学)

Dates

Publication Date: 20260512
Application Date: 20260116

Claims (6)

1. A neural network architecture backdoor attack method for the model architecture of a three-dimensional point cloud classification neural network, comprising the following steps:
Step 1, constructing a phase-aligned binary index over the backbone of the three-dimensional point cloud classification network;
Step 2, embedding a parameter-free geometric sub-detector g0 at the input side, which reads relative geometric statistics and generates a trigger signal;
Step 3, embedding a parameter-free sub-detector g1 at an intermediate feature layer, which rechecks the trigger evidence and injects a cross-layer "breadcrumb" hint;
Step 4, embedding a parameter-free semantic verification sub-detector g2 at a high layer, which performs a joint gating decision together with g0 and g1;
Step 5, activating the backdoor mapping only when the joint gate holds, and applying a minimally invasive bias to the target-class output to achieve targeted hijacking.
2. In step 1, the phase-aligned binary index of the three-dimensional point cloud classification network backbone is constructed as follows:
Step 1.1, the input point cloud $P=\{p_i\in\mathbb{R}^3\}_{i=1}^{N}$ is mean-centered and scale-normalized so that the subsequent trigger statistics are stable under translation and scale change:
$$\mu=\frac{1}{N}\sum_{i=1}^{N}p_i,\qquad s=\max_{i}\lVert p_i-\mu\rVert_2,\qquad x_i=\frac{p_i-\mu}{s},\qquad X=\{x_i\}_{i=1}^{N}$$
where N is the number of points in the point cloud and $\mu$ is its centroid; subtracting $\mu$ translates the whole point cloud to a coordinate system whose origin is the centroid, which removes the differences in spatial position between samples (mean removal); s is a scale factor, preferably the maximum Euclidean distance from any centered point to the origin, used to scale the point cloud to a uniform range; $x_i$ is a normalized point coordinate and X is the normalized point cloud set. This processing ensures that all subsequent statistics that depend only on the relative structure of the points, or on ratios, remain stable under translation and global scaling, so that uniform thresholds can be used across different samples and different data augmentation conditions;
Step 1.2, to obtain a stable one-dimensional partitioning direction, the first principal component direction is preferably used as the principal axis u. Let the point cloud matrix be $M=[x_1,\dots,x_N]^{\top}\in\mathbb{R}^{N\times 3}$; its covariance is
$$C=\frac{1}{N}M^{\top}M$$
and the unit eigenvector of C corresponding to its largest eigenvalue is taken as u, with $\lVert u\rVert_2=1$. Projecting the points onto u yields a one-dimensional sequence:
$$t_i=x_i^{\top}u,\qquad i=1,\dots,N$$
where each row of M is the three-dimensional coordinate of one point; C describes the overall variance and correlation of the point cloud along the three coordinate axes; u is the direction of maximum variance of the point cloud and in most cases gives a reproducible partitioning direction consistent with the overall shape of the point cloud; the constraint $\lVert u\rVert_2=1$ prevents the projection values from depending on the length of the axis vector; and $t_i$ is the scalar projection of the i-th point onto the principal axis. The advantage of using the PCA principal axis as the partitioning direction is that the axis rotates together with the point cloud, so the relative order along the axis still reflects the internal relative structure of the point cloud, which lays the foundation for subsequently constructing a rotation-robust trigger based on internal relative geometric differences;
Step 1.3, the median of $\{t_i\}$ is taken as the threshold and the point cloud is divided into two halves:
$$\tau=\operatorname{median}(\{t_i\}_{i=1}^{N}),\qquad A=\{i\mid t_i\le\tau\},\qquad B=\{i\mid t_i>\tau\}$$
where $\operatorname{median}(\cdot)$ is the median operator, used to obtain a robust threshold on the one-dimensional sequence. The median is chosen because, on the one hand, it balances the number of points in the two halves as far as possible and prevents extreme points or a few outliers from severely skewing the partition, and on the other hand, it is insensitive to a small number of outliers and improves the stability of the subsequent statistics. A and B are index sets used to extract the corresponding half-region from the points or from the feature tensor of any layer;
Step 1.4, to eliminate the interchange of the two half-region labels caused by the sign ambiguity of the principal axis direction, a low-tail quantile compactness comparison is preferably used so that half-region B always denotes the more compact (more suspicious) half. For a half-region $S\in\{A,B\}$, its center is
$$c_S=\frac{1}{|S|}\sum_{i\in S}x_i$$
the set of radial distances of the half-region is $R_S=\{\lVert x_i-c_S\rVert_2\mid i\in S\}$, and the low-tail quantile is taken as the compactness measure:
$$\kappa(S)=Q_{\alpha}(R_S)$$
where $Q_{\alpha}(\cdot)$ denotes the $\alpha$-quantile; a smaller $\kappa(S)$ indicates a tighter low-tail group in the half-region. If $\kappa(B)>\kappa(A)$, the two half-region labels are exchanged:
$$(A,B)\leftarrow(B,A)$$
After the exchange, half-region B is semantically fixed as the more compact half, which provides a unified orientation for the subsequent trigger statistics (such as the compactness ratio and the density ratio);
Step 1.5, the binary index obtained in steps 1.3 and 1.4 is reused as a structural constant throughout the forward pass. For the point/feature tensor of the l-th layer, $H^{(l)}\in\mathbb{R}^{N\times d_l}$, a half-region feature extraction operator is defined:
$$H^{(l)}_{A}=H^{(l)}[A,:],\qquad H^{(l)}_{B}=H^{(l)}[B,:]$$
where $H^{(l)}_{A}$ denotes selecting the rows indexed by the set A. The same A/B index is reused by the trigger sub-detectors at the input layer, the intermediate layer and the high layer, so that the statistics are computed over the same half-regions across layers, inter-layer phase drift is avoided, and a basis is provided for the subsequent joint gating decision.
3. In step 2, the input-side sub-detector g0 is placed between the original input point cloud and the first local aggregation operator; it computes statistics directly in the geometric domain on the two half-regions A and B obtained in step 1 and outputs a binary trigger signal $g_0\in\{0,1\}$, specifically as follows:
Step 2.1, based on the low-tail compactness $\kappa(S)$ defined in step 1, the half-region compactness ratio is constructed as a trigger quantity:
$$\rho_c=\frac{\kappa(B)}{\kappa(A)}$$
where $\rho_c<1$ indicates that half-region B is more compact than half-region A in the low-tail sense; owing to the exchange rule of step 1, $\rho_c$ generally shows a more pronounced tendency toward small values on trigger samples;
Step 2.2, to reduce sensitivity to a single neighborhood scale, a multi-scale neighbor distance statistic is preferably introduced. For any half-region S and each point $i\in S$, let $\mathcal{N}^{S}_{k}(i)$ be its set of k nearest neighbors within the half-region, and define the average neighbor distance of the point:
$$d_{i,k}(S)=\frac{1}{k}\sum_{j\in\mathcal{N}^{S}_{k}(i)}\lVert x_i-x_j\rVert_2$$
The low-tail density statistic of the half-region at scale k is
$$\delta_k(S)=Q_{\alpha}(\{d_{i,k}(S)\mid i\in S\})$$
Given a multi-scale set $\mathcal{K}$, the multi-scale density-ratio trigger quantity $\rho_d$ is constructed by aggregating the per-scale ratios $\delta_k(B)/\delta_k(A)$ over $k\in\mathcal{K}$, for example as their mean:
$$\rho_d=\frac{1}{|\mathcal{K}|}\sum_{k\in\mathcal{K}}\frac{\delta_k(B)}{\delta_k(A)}$$
where $\mathcal{K}$ is the set of neighbor scales and each $k\in\mathcal{K}$ is the number of neighbors used to construct the k-nearest-neighbor set $\mathcal{N}^{S}_{k}(i)$ within half-region S; $\rho_d<1$ indicates that half-region B is denser in the low-tail sense, which matches the local-clustering design goal of the trigger. Both $\kappa(\cdot)$ and $\delta_k(\cdot)$ are built from distance measurements: if the point cloud is scaled uniformly, the numerator and denominator of each ratio are scaled by the same factor, so $\rho_c$ and $\rho_d$ remain stable under global scale change; and because the statistics are based on relative distances rather than absolute coordinates, they also remain stable under translation;
Step 2.3, to provide an adjustable trade-off between suppressing missed detections and suppressing false triggers, the compactness and density criteria are fused by an M-of-2 majority rule. Two indicator quantities are defined:
$$I_c=\mathbb{1}[\rho_c<\tau_c],\qquad I_d=\mathbb{1}[\rho_d<\tau_d]$$
where $\mathbb{1}[\cdot]$ is the indicator function and $\tau_c$ and $\tau_d$ are thresholds controlling the sensitivity of the compactness and density triggers. Applying the majority rule gives
$$g_0=\mathbb{1}[I_c+I_d\ge M],\qquad M\in\{1,2\}$$
where M=1 means that the detector fires when at least one criterion is satisfied, which favors fewer missed detections, and M=2 means that it fires only when both criteria are satisfied, which favors fewer false triggers. Because the detector consists only of deterministic statistics, quantile operations and threshold comparisons, it is a module without learnable parameters and can be embedded directly into the inference graph.
4. In step 3, the intermediate-layer sub-detector g1 is placed at a representative intermediate feature layer, after several local aggregations have been completed and before global aggregation. Let the point-level feature tensor of this layer be $H^{(m)}\in\mathbb{R}^{N\times d_m}$, with the feature vector of the i-th point denoted $h^{(m)}_i$. g1 reuses the binary index of step 1, rechecks the trigger evidence in the feature domain, and, when the recheck succeeds, writes a low-bandwidth "breadcrumb" hint for the subsequent layers, specifically as follows:
Step 3.1, using the feature extraction operator of step 1, obtain
$$H^{(m)}_{A}=H^{(m)}[A,:],\qquad H^{(m)}_{B}=H^{(m)}[B,:]$$
where $H^{(m)}_{A}$ denotes taking from the whole-layer feature tensor $H^{(m)}$ all channel features of the points whose indices belong to A, giving the feature sub-matrix of half-region A, and $H^{(m)}_{B}$ is obtained for half-region B in the same way. The feature centers (mean vectors) of the half-regions are then computed:
$$c^{(m)}_{S}=\frac{1}{|S|}\sum_{i\in S}h^{(m)}_i,\qquad S\in\{A,B\}$$
where $c^{(m)}_{S}$ is the center of half-region S in the intermediate-layer feature space, that is, the average representation of all point features of the half-region;
Step 3.2, the low-tail compactness ratio and the multi-scale low-tail density ratio are computed in the feature domain and used as recheck criteria. The set of radial distances of a half-region in the feature domain is defined as
$$R^{(m)}_{S}=\{\lVert h^{(m)}_i-c^{(m)}_{S}\rVert_2\mid i\in S\}$$
where $R^{(m)}_{S}$ is the set of Euclidean distances from each point feature in half-region S to the center feature of the half-region and represents the degree of dispersion or aggregation within the half-region. The low-tail compactness of the feature domain follows as
$$\kappa^{(m)}(S)=Q_{\alpha}(R^{(m)}_{S})$$
where $\kappa^{(m)}(S)$ is the low-tail quantile of the distance set; it is not dominated by a small number of outliers far from the center, but instead focuses on the compactness of the denser, more central portion of the points. The feature-domain compactness ratio is constructed accordingly:
$$\rho^{(m)}_{c}=\frac{\kappa^{(m)}(B)}{\kappa^{(m)}(A)}$$
where $\rho^{(m)}_{c}$ is the compactness ratio of the intermediate-layer feature domain; $\rho^{(m)}_{c}<1$ indicates that half-region B is more compact than half-region A in the low-tail sense. Because the exchange rule of step 1 semantically fixes half-region B as the more compact half, this ratio is more stable and more markedly small on trigger samples, and thus serves as a recheck. Further, to keep the multi-scale statistics consistent with step 2, the k-nearest-neighbor set $\mathcal{N}^{S,(m)}_{k}(i)$ within the same half-region is defined in the feature domain, together with the average neighbor distance of the point:
$$d^{(m)}_{i,k}(S)=\frac{1}{k}\sum_{j\in\mathcal{N}^{S,(m)}_{k}(i)}\lVert h^{(m)}_i-h^{(m)}_j\rVert_2$$
the low-tail density at scale k:
$$\delta^{(m)}_{k}(S)=Q_{\alpha}(\{d^{(m)}_{i,k}(S)\mid i\in S\})$$
and the multi-scale density ratio $\rho^{(m)}_{d}$, obtained by aggregating the per-scale ratios $\delta^{(m)}_{k}(B)/\delta^{(m)}_{k}(A)$ over $k\in\mathcal{K}$ in the same way as in step 2. Here $\rho^{(m)}_{c}<1$ and $\rho^{(m)}_{d}<1$ indicate, respectively, that half-region B is relatively tighter and denser in the feature domain, reflecting the stable difference of the trigger evidence after amplification by the intermediate layers;
Step 3.3, the trigger signal of the intermediate-layer detector g1 is generated by the M-of-2 majority rule:
$$I^{(m)}_{c}=\mathbb{1}[\rho^{(m)}_{c}<\tau^{(m)}_{c}],\qquad I^{(m)}_{d}=\mathbb{1}[\rho^{(m)}_{d}<\tau^{(m)}_{d}]$$
where $\mathbb{1}[\cdot]$ is the indicator function, which outputs 1 when the condition is satisfied and 0 otherwise, and $\tau^{(m)}_{c}$ and $\tau^{(m)}_{d}$ are the feature-domain thresholds. The intermediate-layer trigger signal is
$$g_1=\mathbb{1}[I^{(m)}_{c}+I^{(m)}_{d}\ge M]$$
where M has the same meaning as in step 2 and sets the trade-off between suppressing missed detections and suppressing false triggers; $g_1$ is a parameter-free, deterministic binary output that takes part in the joint gating decision of step 4;
Step 3.4, to reduce borderline misses and stabilize the trigger evidence without changing the clean distribution, an amplitude-controlled, parameter-free write is applied only to half-region B when $g_1=1$. Let $\tilde{H}^{(m)}$ denote the feature tensor after the write. The breadcrumb write satisfies the constraint of being the identity when not triggered: when $g_1=0$,
$$\tilde{H}^{(m)}=H^{(m)}$$
and when $g_1=1$, one of two parameter-free write forms is preferably applied to $H^{(m)}_{B}$: weak per-channel scaling by a preset constant vector $\eta$ through channel-wise multiplication $\odot$, or weak per-channel translation by a preset constant vector. The amplitude of $\eta$ is controlled so that the write remains approximately an identity with respect to the main task.
5. In step 4, the high-level sub-detector g2 is placed at a high-level feature layer before global aggregation; it performs the final confirmation of the trigger evidence and, together with $g_0$ output in step 2 and $g_1$ output in step 3, constructs a joint gating signal, specifically as follows:
Step 4.1, let the high-level point-level feature tensor be $H^{(h)}\in\mathbb{R}^{N\times d_h}$. Based on the binary index, the high-level half-region feature representations are obtained:
$$H^{(h)}_{A}=H^{(h)}[A,:],\qquad H^{(h)}_{B}=H^{(h)}[B,:]$$
where A and B are the binary indexes generated in step 1 and reused across layers. Reusing the same index at the high layer ensures that g2, g0 and g1 compute their statistics on the same half-regions and that cross-layer phase alignment is maintained;
Step 4.2, in the high-level feature domain, the low-tail compactness statistic and the multi-scale low-tail density statistic of half-regions A and B are computed with the same low-tail quantile operator $Q_{\alpha}(\cdot)$ and the same multi-scale set $\mathcal{K}$ as in steps 2 and 3, and two ratios are constructed:
$$\rho^{(h)}_{c}=\frac{\kappa^{(h)}(B)}{\kappa^{(h)}(A)}$$
together with $\rho^{(h)}_{d}$, obtained by aggregating the per-scale ratios $\delta^{(h)}_{k}(B)/\delta^{(h)}_{k}(A)$ over $k\in\mathcal{K}$ in the same way as in step 2, where $\kappa^{(h)}(\cdot)$ is the low-tail compactness statistic of a high-level half-region and $\delta^{(h)}_{k}(\cdot)$ is the low-tail density statistic at scale k;
Step 4.3, the high-level trigger signal $g_2$ is generated by the same majority rule as above:
$$g_2=\mathbb{1}[I^{(h)}_{c}+I^{(h)}_{d}\ge M]$$
where $I^{(h)}_{c}$ and $I^{(h)}_{d}$ indicate whether $\rho^{(h)}_{c}$ and $\rho^{(h)}_{d}$ fall below their respective thresholds, and M is the majority-rule threshold, consistent with steps 2 and 3;
Step 4.4, the cross-layer-consistent joint gating signal G is constructed as
$$G=g_0\wedge g_1\wedge g_2$$
6. A backdoor attack system for the model architecture of a three-dimensional point cloud classification neural network, characterized in that it is used to implement the three-dimensional point cloud architecture backdoor attack method according to any one of claims 1-5, and comprises a binary index generation module, an input-side geometry detection module, an intermediate feature recheck and breadcrumb writing module, a high-level semantic verification and joint gating module, and an output hijacking control module;
the binary index generation module is used to perform statistical analysis on the input three-dimensional point cloud before model inference starts, to generate the binary index reused across layers, and to fix the semantic phase of each half-region;
the input-side geometry detection module is used to perform statistical detection on the geometric distribution of the point cloud based on the binary index, to construct a parameter-free geometric trigger criterion, and to generate a first trigger signal;
the intermediate feature recheck and breadcrumb writing module is used to recheck the trigger evidence in the feature domain at an intermediate feature layer of the point cloud network, and to write cross-layer hint information into the corresponding half-region when the recheck conditions are met;
the high-level semantic verification and joint gating module is used to verify the trigger evidence in the high-level semantic feature space, and to perform a joint gating decision on the trigger signals from the input side, the intermediate layer and the high layer, generating a backdoor activation control signal;
the output hijacking control module is used to apply a preset bias to the target-class output to achieve targeted hijacking when the backdoor activation control signal indicates that the trigger holds, and to keep the model output unchanged when it does not.
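
The geometric preprocessing of steps 1.1 to 1.4 in claim 2 (centering and scaling, principal-axis projection, median split, and fixing the half-region labels by low-tail compactness) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `build_binary_index`, the 1/N covariance normalization, and the default `alpha=0.1` are hypothetical choices that the text does not specify, and the sketch assumes the points are not all identical (otherwise the scale factor is zero).

```python
import numpy as np

def build_binary_index(points, alpha=0.1):
    """Return (X, A, B): normalized points and the two half-region index arrays.

    alpha is the low-tail quantile level; its value here is an assumption.
    """
    P = np.asarray(points, dtype=float)

    # Step 1.1: remove the centroid, then scale by the largest distance to the origin.
    centered = P - P.mean(axis=0)
    s = np.linalg.norm(centered, axis=1).max()
    X = centered / s

    # Step 1.2: principal axis = unit eigenvector of the largest eigenvalue of C.
    C = X.T @ X / len(X)
    _, eigvecs = np.linalg.eigh(C)      # eigh returns eigenvalues in ascending order
    u = eigvecs[:, -1]
    t = X @ u                           # scalar projection of every point onto u

    # Step 1.3: split the points at the median projection.
    tau = np.median(t)
    A = np.flatnonzero(t <= tau)
    B = np.flatnonzero(t > tau)

    # Step 1.4: low-tail compactness kappa(S); B must be the more compact half.
    def kappa(S):
        radial = np.linalg.norm(X[S] - X[S].mean(axis=0), axis=1)
        return np.quantile(radial, alpha)

    if kappa(B) > kappa(A):
        A, B = B, A
    return X, A, B
```

Because the eigenvector sign returned by `eigh` is arbitrary, the compactness comparison in the last step is what makes the A/B labels reproducible; the returned index arrays can then be used to slice the rows of any per-point tensor, as in step 1.5.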

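The half-region statistics of claim 3 (low-tail compactness, multi-scale low-tail neighbor density, their B-to-A ratios, and the M-of-2 vote) can be checked numerically with a short NumPy sketch. Everything named here is hypothetical: the function names, the quantile level `alpha=0.1`, the scale set `(4, 8, 16)`, and the use of the mean to aggregate per-scale ratios are assumptions, since the text fixes none of them. The sketch takes the index arrays `A` and `B` as given and uses a brute-force distance matrix, so it is only suitable for small point sets.

```python
import numpy as np

def low_tail_compactness(X, S, alpha=0.1):
    """kappa(S): alpha-quantile of the distances from the points of S to their centroid."""
    pts = X[S]
    return np.quantile(np.linalg.norm(pts - pts.mean(axis=0), axis=1), alpha)

def low_tail_density(X, S, k, alpha=0.1):
    """delta_k(S): alpha-quantile of the mean distance to the k nearest neighbours inside S."""
    pts = X[S]
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    nearest = np.sort(dist, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    return np.quantile(nearest.mean(axis=1), alpha)

def half_region_ratios(X, A, B, scales=(4, 8, 16), alpha=0.1):
    """Return (rho_c, rho_d); values below 1 mean B is tighter / denser than A."""
    rho_c = low_tail_compactness(X, B, alpha) / low_tail_compactness(X, A, alpha)
    per_scale = [low_tail_density(X, B, k, alpha) / low_tail_density(X, A, k, alpha)
                 for k in scales]
    return rho_c, float(np.mean(per_scale))       # mean over scales is an assumption

def m_of_2(rho_c, rho_d, tau_c, tau_d, M=2):
    """M-of-2 vote: 1 if at least M of the two ratios fall below their thresholds."""
    return int(int(rho_c < tau_c) + int(rho_d < tau_d) >= M)
```

Calling `half_region_ratios` on a point set and again on a uniformly scaled and shifted copy returns the same ratios up to floating-point error, which illustrates the scale and translation stability argued in step 2.2. The same functions apply unchanged to per-point feature matrices, which is how the feature-domain ratios of claims 4 and 5 are defined.
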
Description

Neural network architecture backdoor attack method and system

Technical Field

The invention relates to the technical fields of artificial intelligence security, three-dimensional point cloud deep learning and model architecture security, and in particular to a backdoor attack method and system for the model architecture of a three-dimensional point cloud classification neural network.

Background

With the development of three-dimensional perception technology, deep learning models based on three-dimensional point clouds are widely applied in autonomous driving, robotics, industrial inspection, augmented reality and other fields. Existing mainstream three-dimensional point cloud classification and recognition models, such as neural network structures based on point-set modeling or dynamic-graph neighborhood modeling, have been deployed in a variety of practical scenarios.

At the same time, the security problems of deep neural networks have become increasingly prominent. Conventional backdoor attacks typically inject samples carrying a trigger pattern into the dataset during the training phase, so that in the inference phase the model outputs the attacker's preset result once it detects the trigger pattern. However, this type of approach depends heavily on the training process and on model parameter updates, and is easily weakened or eliminated by data auditing, model fine-tuning or retraining.

In recent years, a new form of security threat has emerged: the architecture-level backdoor attack. Such an attack does not depend on poisoning the training data or on the learned parameters of the model. Instead, it embeds specific detection, propagation and gating logic in the forward computation structure of the model, so that the model produces targeted output hijacking when specific structural conditions are met. Because the attack logic is fixed at the level of the model architecture, the backdoor is highly concealed and strongly resistant to retraining.
However, existing research on architectural backdoors has focused mainly on two-dimensional image or text models, and there is no effective, systematic solution for three-dimensional point cloud models. Because three-dimensional point cloud data is unordered, sparse and strongly geometrically invariant, traditional architectural backdoor methods that rely on a fixed trigger pattern or explicit semantic features are difficult to migrate directly. How to construct a stable, concealed and deployable architectural backdoor attack method for three-dimensional point cloud models, without damaging the normal function of the model, is therefore a technical problem to be solved.

Disclosure of Invention

The invention aims to solve the following technical problems. Existing three-dimensional point cloud backdoor attacks depend on data poisoning or weight tampering in the training stage and easily fail under model fine-tuning, retraining and common point cloud preprocessing. In addition, three-dimensional point clouds lack a high-level semantic carrier into which a trigger can be stably implanted, so the semantic trigger mechanisms commonly used in the two-dimensional field are difficult to migrate directly. The invention therefore provides a backdoor attack method and system for the model architecture of a three-dimensional point cloud classification neural network, which, without changing the training data or the learned weights of the victim model, inserts multi-stage parameter-free sub-detectors and a cross-layer consistency gating path into the forward pass of a point cloud classification network, and sequentially completes the detection, propagation, confirmation and hijacking stages of the trigger signal. The method fixes the backdoor logic in the forward computational graph and is approximately an identity with respect to the main task when not triggered.
Compared with existing backdoor attack methods that rely on data poisoning or weight tampering in the training stage, the method fixes the trigger criteria and the hijacking logic in an architecture-level, parameter-free cross-layer gating mechanism. This improves the trigger stability and interference resistance of the backdoor under common defensive processing such as data augmentation and outlier removal, and strengthens its resistance to post-processing operations such as model fine-tuning and retraining. Even if a user fine-tunes or completely retrains a downloaded three-dimensional point cloud classification model on private data, the method can still maintain a high attack success rate and long-term effectiveness. Compared with traditional backdoor schemes, it therefore poses a stronger and more durable attack threat. The invention discloses a backdoor attack method and system for the model architecture of a three-dimensional point cloud classification neural network, comprising the following steps: Step 1, co