CN-120671200-B - Model fingerprint-based semi-fragile watermark tampering positioning method
Abstract
The invention discloses a semi-fragile watermark tamper localization method based on model fingerprints. The method comprises: initializing semi-fragile samples; generating target labels; selecting standard images; designing a loss function; updating the semi-fragile samples; verifying and iterating; extracting and grouping model parameters; compressing each group of model weights and generating a model fingerprint; embedding the model fingerprint in the generated semi-fragile samples; initializing a counter; verifying each semi-fragile sample in a loop; comparing the prediction results with the target labels; calculating the authentication accuracy and judging whether the model has been modified without authorization; extracting and grouping the parameters of the suspicious model; compressing each group of suspicious model weights and generating suspicious feature vectors; extracting the model fingerprint embedded in the semi-fragile samples; converting the binary sequence to hexadecimal; and comparing the feature sequences to determine the tampered positions. The method improves the accuracy of model content authentication, achieves accurate tamper localization without affecting model performance, enhances model security, and supports the sustainable development of the technology.
Inventors
- ZHAO JUAN
- Sun Yudao
- SONG XIANGYU
- SUN SHIJIE
- LI WEI
- PEI LILI
- YANG ZHIHAI
Assignees
- Chang'an University (长安大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2025-06-20
Claims (7)
- 1. A semi-fragile watermark tamper localization method based on model fingerprints, characterized by comprising the following steps:
  Step 1, initializing semi-fragile samples: randomly selecting a portion of samples from the standard sample set and initializing them as semi-fragile samples;
  Step 2, generating target labels: using a key to randomly assign to the initialized semi-fragile samples target labels that differ from their original labels;
  Step 3, standard image selection: according to the generated target label, randomly selecting a standard image with the corresponding label from the standard sample set;
  Step 4, designing a loss function: ensuring that the model output is consistent with the target label and improving the transferability of the semi-fragile sample;
  Step 5, updating the semi-fragile sample: minimizing the total loss function by gradient descent and updating the semi-fragile sample, so that the semi-fragile sample matches the target label and has high transferability;
  Step 6, verification and iteration: verifying whether the updated sample satisfies the conditions that its prediction result is consistent with the target label and its similarity measure is smaller than the threshold; if so, keeping the sample, and if not, continuing to iterate;
  Step 7, model parameter extraction and grouping: extracting the weight information of the deep neural network model and grouping it in order;
  Step 8, compressing each group of model weights and generating the model fingerprint: compressing each group of weight sequences, generating feature vectors, and finally converting the feature vectors into a binary sequence to generate the model fingerprint;
  Step 9, embedding the model fingerprint in the generated semi-fragile samples: applying a discrete wavelet transform to the semi-fragile sample, embedding the model fingerprint in the frequency domain coefficients, and generating the final sample by the inverse transform;
  Step 10, initializing a counter: initializing a counter to 0 to record the number of samples for which the model output is consistent with the expected label;
  Step 11, verifying each semi-fragile sample in a loop: predicting the semi-fragile sample with the suspicious model to obtain an output label;
  Step 12, comparing the prediction result with the target label: comparing the label output by the model with the expected label of the semi-fragile sample, and incrementing the counter by 1 if the two are consistent;
  Step 13, calculating the authentication accuracy and judging whether the model has been modified without authorization: calculating the authentication accuracy from the value of the counter, comparing it with a preset threshold, and judging whether the model has been maliciously tampered with;
  Step 14, suspicious model parameter extraction and grouping: extracting the parameters of the suspicious model and grouping them in the same way as for the original model;
  Step 15, compressing each group of suspicious model weights and generating the suspicious feature vector: compressing each group of suspicious model weights, generating feature vectors, and concatenating them into the suspicious feature vector;
  Step 16, extracting the model fingerprint embedded in the semi-fragile samples: extracting the embedded model fingerprint from each semi-fragile sample and, applying a majority rule, selecting the sequence that occurs most frequently as the final model fingerprint;
  Step 17, converting the binary sequence to hexadecimal: converting the extracted model fingerprint into a hexadecimal feature sequence;
  Step 18, comparing the feature sequences and determining the tamper location: comparing the feature sequence of the suspicious model with that of the original model, recording the positions that do not match, calculating their indices, and determining the approximate tampered region;
  In step 4, the loss L_1, which ensures that the model output is consistent with the target label, is [formula not reproduced], where τ denotes the confidence-interval parameter: the larger τ is, the higher the confidence with which the model predicts the semi-fragile sample X as label T, and the smaller τ is, the lower that confidence; the loss function also includes the difference between the logits vector of the semi-fragile sample X and the logits vector of the standard image I, expressed mathematically as the loss L_2, [formula not reproduced], where S(X) and S(I) are the logits vectors that the model M produces for the semi-fragile sample X and the standard image I, respectively;
  In step 5, the total loss L_t is [formula not reproduced], where α < 0 is a weight factor that adjusts the ratio of the two loss components; the semi-fragile sample is then updated by gradient descent to minimize the total loss L_t, expressed mathematically as [formula not reproduced], where l_r is the learning rate and [symbol not reproduced] is the direction of the gradient of the loss function with respect to X.
- 2. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 2, a confidence-interval parameter and a logits vector difference are used to define the loss function.
- 3. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 8 and said step 15, each group of weight sequences is compressed using the Brotli compression method and the feature vectors are generated with a BLAKE2 hash function.
- 4. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 6, the two conditions are expressed mathematically as [formula not reproduced], where [expression not reproduced] ensures that the prediction for the updated semi-fragile sample X is consistent with the assigned target label T, and [expression not reproduced] denotes the mean square error between the original sample O and the updated semi-fragile sample X.
- 5. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 8, the process of compressing the i-th group weight sequence is expressed as [formula not reproduced], where [symbol not reproduced] denotes Brotli compression and [symbol not reproduced] denotes the compression result of the i-th group weight sequence; for each group of compressed parameters, a model feature vector is generated using the BLAKE2 hash function, the feature vector of the i-th group of compressed parameters being expressed as [formula not reproduced], where [symbol not reproduced] denotes the BLAKE2 hash function; the feature vectors of all groups of compressed parameters are then concatenated to generate the feature vector R of the model; finally, the feature vector R is converted into a binary sequence to generate the model fingerprint F, expressed mathematically as [formula not reproduced], where [symbol not reproduced] is the function that converts a hexadecimal feature vector into a binary sequence.
- 6. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 9, a discrete wavelet transform is applied to the semi-fragile sample X to obtain its frequency domain coefficients [symbol not reproduced]; the model fingerprint is embedded in these frequency domain coefficients by spread spectrum, expressed mathematically as [formula not reproduced], where β is a parameter that controls the embedding strength of the model fingerprint; finally, the frequency domain coefficients containing the model fingerprint [symbol not reproduced] are passed through the inverse discrete wavelet transform to generate the final semi-fragile sample X, expressed as [formula not reproduced], where [symbol not reproduced] denotes the inverse discrete wavelet transform function.
- 7. The semi-fragile watermark tamper localization method based on model fingerprints according to claim 1, wherein in said step 13, the authentication accuracy is calculated as [formula not reproduced]; finally, whether the model has been maliciously tampered with is judged from the authentication accuracy and a preset threshold: if the authentication accuracy exceeds the threshold, the model is judged to be normal or unmodified; otherwise, the model is judged to have been maliciously tampered with.
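The formulas referenced in claims 1, 4, 5 and 7 did not survive in this text. The block below is one plausible reading, not the patent's own notation. It assumes a Carlini-Wagner style margin for L_1 (which matches the stated role of τ), a squared Euclidean norm for L_2, and a plain gradient step where the claim speaks only of "the direction of the gradient". The symbols S(X)_i, ε, W_i, C_i, R_i, n, c, N and the operator names are all introduced here for illustration.

```latex
% Assumed symbols: S(X)_i is the i-th logit of model M on X, \varepsilon the
% similarity threshold, W_i the i-th weight group, n the number of groups,
% c the final counter value, N the number of semi-fragile samples.
\begin{align*}
L_1 &= \max\Bigl(\max_{i \neq T} S(X)_i - S(X)_T,\; -\tau\Bigr)
      && \text{(claim 1, step 4)} \\
L_2 &= \lVert S(X) - S(I) \rVert_2^2
      && \text{(claim 1, step 4)} \\
L_t &= L_1 + \alpha L_2
      && \text{(claim 1, step 5)} \\
X &\leftarrow X - l_r \, \nabla_X L_t
      && \text{(claim 1, step 5)} \\
M(X) &= T, \qquad \mathrm{MSE}(O, X) < \varepsilon
      && \text{(claim 4)} \\
C_i &= \mathrm{Brotli}(W_i), \qquad R_i = \mathrm{BLAKE2}(C_i)
      && \text{(claim 5)} \\
R &= R_1 \,\Vert\, R_2 \,\Vert\, \cdots \,\Vert\, R_n, \qquad F = \mathrm{HexToBin}(R)
      && \text{(claim 5)} \\
\mathrm{Acc} &= \frac{c}{N}
      && \text{(claim 7)}
\end{align*}
```

Under this reading, minimizing L_1 drives the target logit at least τ above every other logit, which is why a larger τ corresponds to a more confident prediction of T.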
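The verification and localization steps of claim 1 (steps 10 to 13 and 16 to 18) can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function names, the `predict` callable, and the assumptions that accuracy is the matched count divided by the number of samples and that every weight group contributes a fixed number of hexadecimal digits are all hypothetical.

```python
from collections import Counter


def authentication_accuracy(predict, samples, target_labels):
    """Steps 10-13: fraction of samples whose predicted label equals the target."""
    count = 0  # step 10: counter starts at 0
    for sample, target in zip(samples, target_labels):  # step 11: loop over samples
        if predict(sample) == target:  # step 12: compare output with expected label
            count += 1
    return count / len(samples)  # step 13 (assumed form: count / number of samples)


def is_unmodified(accuracy, threshold):
    """Claim 7: the model is judged normal when accuracy exceeds the threshold."""
    return accuracy > threshold


def majority_fingerprint(extracted_fingerprints):
    """Step 16: keep the bit string that occurs most often across samples."""
    return Counter(extracted_fingerprints).most_common(1)[0][0]


def bits_to_hex(bits):
    """Step 17: binary string to hexadecimal string, preserving leading zeros."""
    return f"{int(bits, 2):0{len(bits) // 4}x}"


def locate_tampering(original_hex, suspicious_hex, hex_per_group):
    """Step 18: indices of weight groups whose hex digests do not match."""
    n_groups = len(original_hex) // hex_per_group
    mismatched = []
    for g in range(n_groups):
        start, end = g * hex_per_group, (g + 1) * hex_per_group
        if original_hex[start:end] != suspicious_hex[start:end]:
            mismatched.append(g)
    return mismatched
```

A mismatched group index maps back to a range of weights through the grouping used in step 7, which is what makes the localization coarse: it identifies the tampered group, not the individual weight.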
Description
Model fingerprint-based semi-fragile watermark tampering positioning method
Technical Field
The invention relates to the technical field of artificial intelligence and network security, in particular to a semi-fragile watermark tamper localization method based on model fingerprints.
Background
Amid the current wave of artificial intelligence, deep neural network models have rapidly become a core engine of technical innovation by virtue of their strong learning ability and generalization performance. They have shown great potential and value in fields such as image recognition, speech processing, and natural language processing, and have greatly advanced both the technology and its applications. However, as models are increasingly commercialized, deep neural network models also face unprecedented security challenges.
On the one hand, when a model is uploaded to a public environment such as a cloud platform or an open source community, the risk of malicious tampering increases markedly because such environments may lack strict security controls and monitoring mechanisms. Malicious attackers may damage the integrity of the model by tampering with model parameters, inserting malicious code, and the like, thereby degrading the performance and accuracy of the model. Such tampering may not only cause the model to make errors in practical applications, but may also raise serious security and legal issues, such as privacy disclosure and data tampering.
On the other hand, conventional model protection methods, such as digital watermarking techniques, can verify the integrity of a model to some extent but have a number of limitations. These methods tend to be too sensitive to small changes in the model, which makes it difficult to distinguish normal model updates from malicious tampering.
Once a model has been tampered with, these methods often provide only a simple integrity verification result and no effective tamper localization information. This means that even when tampering is detected, its specific location and extent cannot be determined accurately, which makes the model much harder to repair and maintain.
To address this challenge, the present invention proposes a semi-fragile neural network watermarking method. The method generates a set of special semi-fragile samples whose outputs remain stable and accurate under normal model processing but become clearly abnormal on a maliciously tampered model. By analyzing the model's outputs on the semi-fragile samples, it can quickly be judged whether the model has been tampered with. More importantly, the method can further locate the specific tampered position, which directly supports repair and maintenance of the model. The semi-fragile neural network watermarking method thus improves the security of the model and helps ensure its reliability and stability in commercial applications.
Disclosure of Invention
In view of this, the present invention provides a semi-fragile watermark tamper localization method based on model fingerprints.
To solve the above technical problems, the invention adopts the following technical scheme:
The semi-fragile watermark tamper localization method based on model fingerprints comprises the following steps:
Step 1, initializing semi-fragile samples: randomly selecting a portion of samples from the standard sample set and initializing them as semi-fragile samples;
Step 2, generating target labels: using a key to randomly assign to the initialized semi-fragile samples target labels that differ from their original labels;
Step 3, standard image selection: according to the generated target label, randomly selecting a standard image with the corresponding label from the standard sample set;
Step 4, designing a loss function: ensuring that the model output is consistent with the target label and improving the transferability of the semi-fragile sample;
Step 5, updating the semi-fragile sample: minimizing the total loss function by gradient descent and updating the semi-fragile sample, so that the semi-fragile sample matches the target label and has high transferability;
Step 6, verification and iteration: verifying whether the updated sample satisfies the conditions that its prediction result is consistent with the target label and its similarity measure is smaller than the threshold; if so, keeping the sample, and if not, continuing to iterate;
Step 7, model parameter extraction and grouping: extracting the weight information of the deep neural network model and grouping it in order;
Step 8, compressing each group of model weights and generating the model fingerprint: compressing each group of weight sequences, generating feature vectors, and finally converting the feature vectors into a binary sequence to generate the model fingerprint;
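Steps 7 and 8 above (group the weights, compress each group, hash it, and convert the concatenated digests to bits) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `zlib` from the standard library stands in for the Brotli compressor named in claim 3 so that the example has no third-party dependency, weights are serialized as little-endian float32, and the function names, group size, and 4-byte digest size are hypothetical choices.

```python
import hashlib
import struct
import zlib


def group_weights(weights, group_size):
    """Step 7: split a flat weight sequence into consecutive groups."""
    return [weights[i:i + group_size] for i in range(0, len(weights), group_size)]


def group_digest(group, digest_size=4):
    """Step 8 (per group): compress the group's bytes, then hash the result."""
    raw = struct.pack(f"<{len(group)}f", *group)  # assumed float32 serialization
    compressed = zlib.compress(raw)               # stand-in for Brotli (assumption)
    return hashlib.blake2b(compressed, digest_size=digest_size).hexdigest()


def model_fingerprint(weights, group_size=4, digest_size=4):
    """Step 8: concatenate per-group hex digests and convert them to a bit string."""
    digests = [group_digest(g, digest_size) for g in group_weights(weights, group_size)]
    feature_hex = "".join(digests)
    # Each hexadecimal digit expands to exactly four bits.
    bits = "".join(f"{int(ch, 16):04b}" for ch in feature_hex)
    return feature_hex, bits


if __name__ == "__main__":
    weights = [0.1 * i for i in range(12)]
    feature_hex, bits = model_fingerprint(weights)
    print(feature_hex, len(bits))
```

Because every group is hashed independently, changing a weight alters only the digest of the group that contains it, which is the property the later localization steps rely on.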