EP-4736045-A1 - SYSTEM AND METHOD OF PREVENTING STEGANOGRAPHIC ATTACKS
Abstract
The present invention relates generally to the technological field of cyber security. More specifically, the present invention relates to preventing cyber-attacks caused by delivering steganographically encoded malware through machine-learning (ML)-based model distribution. In a general aspect, the invention may be directed to a method of preventing steganographic attacks by at least one processor. The method may include receiving an initial machine-learning (ML)-based model; applying a steganalysis procedure to the initial ML-based model, to determine a probability of steganographic data presence in the initial ML-based model; and applying at least one alteration to the initial ML-based model, according to the determined probability, thereby obtaining a disarmed ML-based model.
Inventors
- DUBIN, Ran
- GILKAROV, Daniel
Assignees
- Ariel Scientific Innovations Ltd
Dates
- Publication Date: 2026-05-06
- Application Date: 2024-07-02
Claims (20)
- 1. A method of preventing steganographic attacks by at least one processor, the method comprising: receiving an initial machine-learning (ML)-based model; applying a steganalysis procedure to the initial ML-based model, to determine a probability of steganographic data presence in the initial ML-based model; and applying at least one alteration to the initial ML-based model, according to the determined probability, thereby obtaining a disarmed ML-based model.
- 2. The method of claim 1, wherein the initial ML-based model is characterized by a set of parameters; and wherein applying the steganalysis procedure comprises: generating an input image representation of the initial ML-based model, based on the set of parameters thereof; inferring, on the input image representation, a respectively pretrained Siamese-Neural-Network (SNN)-based model, to calculate a distance metric value, representing a degree of pertinence of the input image representation to at least one class that indicates steganographic data presence in ML-based models; and determining the probability of the steganographic data presence based on the calculated distance metric value.
- 3. The method of claim 2, wherein the initial ML-based model is an artificial neural network model, and the set of parameters includes weight coefficients; and wherein said generating the input image representation comprises: for one or more of said weight coefficients, extracting, from a value thereof represented in a binary format, a sequence of m n-bit blocks; forming the input image representation from m image representation parts, wherein, for 1 ≤ z ≤ m, the z-th of the m image representation parts comprises a plurality of pixels, each having a color coded based on the z-th n-bit block in the sequence of the n-bit blocks of a respective weight coefficient of said weight coefficients.
- 4. The method of claim 3, wherein values of said weight coefficients are stored in a floating-point-number representation comprising a mantissa portion and an exponent portion; and wherein said sequence of m n-bit blocks represents, at least in part, the mantissa portion of the respective weight coefficient.
- 5. The method according to any one of claims 2-4, wherein the respectively pretrained SNN-based model is configured to: receive the input image representation; perform a first vector embedding procedure on the input image representation, to calculate an input feature vector representation in a predefined feature space; calculate the distance metric value representing at least one of (i) a first distance being a distance, in the feature space, between the input feature vector representation and a centroid vector representation of a first class, said centroid vector representation of the first class being calculated as an average of steganographically-nonaffected feature vector representations, calculated by performing the first vector embedding procedure on image representations of ML-based models not containing steganographic data; and (ii) a second distance being a distance, in the feature space, between the input feature vector representation and a centroid vector representation of a second class, said centroid vector representation of the second class being calculated as an average of steganographically-affected feature vector representations, calculated by performing the first vector embedding procedure on image representations of ML-based models containing steganographic data.
- 6. The method according to any one of claims 1-5, wherein the initial ML-based model is characterized by a set of parameters; and wherein applying the steganalysis procedure comprises performing a second vector embedding procedure on the initial ML-based model, based on the set of parameters thereof, thereby obtaining an initial vector representation of the initial ML-based model; inferring, on the initial vector representation, an autoencoder ML-based model, respectively pretrained to reconstruct vector representations of ML-based models containing no steganographic data, thereby obtaining a reconstructed vector representation of the initial ML-based model; calculating a reconstruction error value, based on the initial and reconstructed vector representations of the initial ML-based model; and determining the probability of the steganographic data presence, based on the reconstruction error value.
- 7. The method of claim 6, further comprising classifying the initial ML-based model as containing the steganographic data, provided that the determined probability of the steganographic data presence surpasses a predefined threshold.
- 8. The method of claim 7, further comprising receiving a training set of the ML-based models containing no steganographic data, wherein each ML-based model of the training set is characterized by the set of parameters; performing the second vector embedding procedure on each ML-based model of the training set, based on the set of parameters thereof, thereby obtaining a vector representation of each ML-based model of the training set; based on the vector representations of the ML-based models of the training set, training the autoencoder ML-based model to reconstruct a vector representation of a target ML-based model containing no steganographic data.
- 9. The method of claim 8, further comprising for at least one ML-based model of the training set, determining an architecture of the respective ML-based model; calculating a plurality of permutations of the respective ML-based model, based on the determined architecture; and augmenting the training set with the calculated plurality of permutations.
- 10. The method according to any one of claims 8-9, further comprising receiving at least one test ML-based model in an original version, containing no steganographic data; wherein the at least one test ML-based model is characterized by the set of the parameters; inserting the steganographic data into the at least one test ML-based model, thereby obtaining an altered version of the at least one test ML-based model; for both the original version and the altered version, (i) performing the second vector embedding procedure on the at least one test ML-based model in a respective version, based on the set of the parameters of the at least one test ML-based model, thereby obtaining an initial vector representation of the at least one test ML-based model in a respective version; (ii) inferring the autoencoder ML-based model on the initial vector representation of the at least one test ML-based model in the respective version, thereby obtaining a reconstructed vector representation of the test ML-based model in the respective version; and (iii) calculating a reconstruction error value of the respective version, based on the initial and reconstructed vector representations of the at least one test ML-based model in the respective version; and setting the predefined threshold, based on the reconstruction error values of the original and altered versions of the at least one test ML-based model.
- 11. The method according to any one of claims 1-10, wherein the initial ML-based model is characterized by a set of parameters; and wherein applying the steganalysis procedure comprises performing a second vector embedding procedure on the initial ML-based model, based on the set of the parameters thereof, thereby obtaining a vector representation of the initial ML-based model; inferring, on the vector representation of the initial ML-based model, a respectively pretrained classifying ML-based model, to calculate a probability of a pertinence of the initial ML-based model to at least one class that indicates steganographic data presence; and determining the probability of the steganographic data presence based on the probability of pertinence of the initial ML-based model to said at least one class.
- 12. The method of claim 11, further comprising receiving a training set of the ML-based models in original versions containing no steganographic data and in altered versions containing steganographic data, wherein each ML-based model of the training set is characterized by the set of parameters and is respectively labeled by pertinence to at least one class that indicates steganographic data presence; performing the second vector embedding procedure on each ML-based model of the training set, based on the set of parameters thereof, thereby obtaining a vector representation of each ML-based model of the training set; and based on the vector representations of the ML-based models of the training set, training the classifying ML-based model to classify a target ML-based model by pertinence to said at least one class.
- 13. The method according to any one of claims 1-12, wherein applying the at least one alteration to the initial ML-based model comprises performing quantization of the initial ML-based model.
- 14. The method of claim 13, wherein the initial ML-based model is characterized by at least one parameter; and performing quantization of the initial ML-based model comprises defining a rounding multiple, based on the determined probability; and rounding a value of the at least one parameter of the initial ML-based model to the defined rounding multiple.
- 15. The method of claim 13, wherein the initial ML-based model is characterized by at least one parameter; and performing quantization of the initial ML-based model comprises defining a number of Least Significant Bits (LSBs) for omission, based on the determined probability; and omitting the defined number of LSBs of the at least one parameter of the initial ML-based model.
- 16. The method according to any one of claims 1-15, wherein the initial ML-based model is characterized by at least one parameter; and applying the at least one alteration to the initial ML-based model comprises adding random noise data to Least Significant Bits (LSBs) of the at least one parameter of the initial ML-based model.
- 17. The method of claim 16, wherein applying the at least one alteration to the initial ML-based model further comprises defining an intensity of random noise data, based on the determined probability.
- 18. The method according to any one of claims 1-17, wherein the initial ML-based model is received in a first serialization format; and wherein applying the at least one alteration to the initial ML-based model comprises reserializing the initial ML-based model to be represented in a second serialization format.
- 19. The method according to any one of claims 1-18, wherein the initial ML-based model is an artificial neural network model, and the set of parameters includes weight and bias coefficients.
- 20. The method of claim 19, wherein the second vector embedding procedure comprises inferring a respective ML-based model on an input data sample, to obtain a calculated output value; calculating a loss function, based on the calculated output value and a predefined desired output label; calculating a gradient of the loss function with respect to weight coefficients of the respective ML-based model; and obtaining a respective vector representation of the respective ML-based model, based on said weight coefficients and the calculated gradient.
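The image representation of claims 3-4 amounts to slicing each weight's bit pattern into fixed-size blocks and gathering the z-th block of every weight into the z-th image part. The following minimal Python sketch illustrates one possible reading of that step, under stated assumptions: float32 weights, 8-bit blocks with m = 3 (covering the 23-bit mantissa plus one exponent bit), and each part reduced to a single row of grayscale pixels; the helper name `weight_image_parts` is illustrative, not from the patent.

```python
import struct

def weight_image_parts(weights, m=3):
    """Split the low 8*m bits of each float32 weight into m 8-bit blocks;
    for m=3 this covers the 23-bit mantissa plus one exponent bit.
    The z-th block of every weight becomes one pixel of image part z."""
    parts = [[] for _ in range(m)]
    for w in weights:
        # reinterpret the float32 bit pattern as an unsigned 32-bit integer
        bits = struct.unpack("<I", struct.pack("<f", w))[0]
        for z in range(m):
            # z-th 8-bit block, counted from the least significant byte
            parts[z].append((bits >> (8 * z)) & 0xFF)
    return parts  # each part is a row of grayscale pixel values (0-255)

parts = weight_image_parts([0.5, -1.25, 3.14])
```

A 2-D image would then be formed by reshaping each row to the model's layer geometry; the claim leaves that layout open.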
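The reconstruction-error test of claims 6-7 can be sketched as follows. A real embodiment would use a trained autoencoder; here the autoencoder is replaced by a toy stand-in (`autoencode`, a hypothetical rounding function that "reconstructs" only coarse clean-model statistics), so the block shows only the error computation and thresholding, not an actual detector.

```python
def reconstruction_error(v, v_rec):
    # mean squared error between the initial and reconstructed vectors
    return sum((a - b) ** 2 for a, b in zip(v, v_rec)) / len(v)

def flag_model(vector, autoencode, threshold):
    """Claim 7: classify the model as containing steganographic data when
    the reconstruction error surpasses a predefined threshold."""
    return reconstruction_error(vector, autoencode(vector)) > threshold

# Toy stand-in for the trained autoencoder: snaps each coordinate to one
# decimal place, as if reconstructing typical clean-model statistics.
autoencode = lambda v: [round(x, 1) for x in v]

clean = [0.1, 0.2, 0.3]          # reconstructs with near-zero error
stego = [0.137, 0.244, 0.318]    # LSB tampering perturbs the statistics
```

The threshold itself would be set as in claim 10, from the error distributions of known-clean and deliberately altered test models.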
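The probability-scaled quantization of claims 14-15 can be sketched as follows, assuming float32 parameters; `disarm_weight` and `max_lsbs` are illustrative assumptions, not terms from the patent.

```python
import struct

def disarm_weight(w, probability, max_lsbs=16):
    """Zero a number of least-significant bits of a float32 weight that
    grows with the detected steganography probability (cf. claim 15).
    Destroying the LSBs corrupts any payload hidden there while only
    slightly perturbing the weight's value."""
    k = round(probability * max_lsbs)  # higher suspicion -> more bits omitted
    bits = struct.unpack("<I", struct.pack("<f", w))[0]
    bits &= ~((1 << k) - 1) & 0xFFFFFFFF  # clear the k least significant bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(disarm_weight(3.14, probability=1.0))  # 3.125: low 16 bits zeroed
```

Claim 16's alternative, adding random noise to the LSBs, could similarly XOR `random.getrandbits(k)` into the low bits instead of clearing them, with `k` again scaled by the determined probability (claim 17).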
Description
SYSTEM AND METHOD OF PREVENTING STEGANOGRAPHIC ATTACKS

CROSS REFERENCE TO RELATED APPLICATIONS

[001] This application claims the benefit of priority of U.S. Provisional Patent Application No. 63/524,681, filed July 2, 2023, the contents of which are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[002] The present invention relates generally to the technological field of cyber security. More specifically, the present invention relates to preventing cyber-attacks caused by delivering steganographically encoded malware through machine-learning (ML)-based model distribution.

BACKGROUND OF THE INVENTION

[003] Information technology security, or cybersecurity, has become an imperative aspect of contemporary computer systems and networks, as they remain vulnerable to cyberattacks.

[004] A cyberattack is a deliberate attempt to access data, functions, or other restricted areas of a system without authorization, potentially with malicious intent. Cyberattacks may be aimed at stealing, altering, or destroying target information by hacking into susceptible systems. As a result of a cyberattack, the confidentiality, integrity, or availability of target resources may be compromised. Moreover, the damage may extend beyond the resources initially identified as vulnerable, to include, e.g., further resources of the organization that owns the initial resource, and the resources of other involved parties (customers, suppliers). Cyberattacks have thus become increasingly sophisticated and dangerous.

[005] Even though many approaches and techniques of cyberattack prevention are known in the prior art, the technological field of information security is constantly evolving, struggling to keep pace with both rapidly developing malware and new ways of hiding and distributing it.

[006] Over recent years, ML-based model sharing has gained increased popularity and has become ubiquitous.
Unfortunately, this has also provoked an increase in cyber-attacks aimed at open-source repositories. Fast adoption in the industry, a lack of awareness, and the exploitability of ML-based models have made them an attractive malware carrier.

[007] Recently, malware has been found hidden in neural network ML-based models. In most cases, this is done by replacing bits of model weights (Least Significant Bits (LSBs)) with malware bits using steganographic embedding (insertion) methods. By embedding malware in neurons, the malware can be delivered covertly, with minor or no impact on the performance of the neural network.

[008] Steganography-based techniques of hiding and distributing malware are considered among the most effective available today. Efforts to prevent the spread of malicious ML-based models are known in the art. These efforts are based on scanning ML-based models with open-source malware scanners in order to detect malicious weight serialization and vulnerabilities that can lead to malware code execution when the model is loaded. However, such efforts have shown poor results.

SUMMARY OF THE INVENTION

[009] Accordingly, there is a need for a system and method of preventing steganographic attacks which would improve the technological field of cyber security by increasing the efficiency of detecting and disarming malicious data steganographically concealed inside ML-based models, and thereby increase the cyber safety of ML-based model distribution.

[0010] In a general aspect, the invention may be directed to a method of preventing steganographic attacks by at least one processor.
The method may include receiving an initial machine-learning (ML)-based model; applying a steganalysis procedure to the initial ML-based model, to determine a probability of steganographic data presence in the initial ML-based model; and applying at least one alteration to the initial ML-based model, according to the determined probability, thereby obtaining a disarmed ML-based model.

[0011] In another general aspect, the invention may be directed to a system for preventing steganographic attacks. The system may include a non-transitory memory device, wherein modules of instruction code are stored, and at least one processor associated with the memory device and configured to execute the modules of instruction code. Upon execution of said modules of instruction code, the at least one processor may be configured to: receive an initial machine-learning (ML)-based model; apply a steganalysis procedure to the initial ML-based model, to determine a probability of steganographic data presence in the initial ML-based model; and apply at least one alteration to the initial ML-based model, according to the determined probability, thereby obtaining a disarmed ML-based model.

[0012] In some embodiments, the initial ML-based model may be characterized by a set of parameters; and applying the steganalysis procedure may include: generating an input image representation of the