US-12626095-B2 - Machine learning outputs with high confidence explanations

Abstract

A malware classification system provides improved confidence in explanations of neural network classification outputs using methods such as weighting or masking when training the neural network to train the network on a sample resembling or including the explanation. The explanation in some examples comprises a subset of a hierarchical input vector that is responsible for the neural network's classification output. In another example the neural network has an inner portion configured to reduce the weight of elements of the output not significantly contributing to the explanation of the output, such as by reducing the weight of as many such outputs to zero as is practical in generating the desired output.
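
The idea of an explanation as a subset of a hierarchical input can be illustrated with a minimal sketch. It is not the patented method: the classifier `toy_classifier`, the token set `SUSPICIOUS`, the sample layout, and the greedy top-down pruning strategy are all hypothetical stand-ins chosen for illustration, and the sketch covers only explanation generation, not the training of the network on the explanation.

```python
import copy

# Hypothetical indicator tokens; a real system would use a trained network.
SUSPICIOUS = {"CreateRemoteThread", "WriteProcessMemory"}


def leaves(node):
    """Yield every leaf value of a nested dict/list structure."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaves(value)
    elif isinstance(node, list):
        for value in node:
            yield from leaves(value)
    else:
        yield node


def toy_classifier(sample):
    """Stand-in classifier: fraction of suspicious tokens present in the sample."""
    return len(SUSPICIOUS & set(leaves(sample))) / len(SUSPICIOUS)


def _prune(root, node, classify, threshold):
    """Drop whole subtrees of `node` while `classify(root)` stays >= threshold."""
    if isinstance(node, dict):
        for key in list(node.keys()):
            child = node.pop(key)
            if classify(root) >= threshold:
                continue  # the whole subtree was not needed for the output
            node[key] = child  # needed: restore it, then try pruning inside it
            _prune(root, child, classify, threshold)
    elif isinstance(node, list):
        i = 0
        while i < len(node):
            child = node.pop(i)
            if classify(root) >= threshold:
                continue  # removed; the next element now sits at index i
            node.insert(i, child)
            _prune(root, child, classify, threshold)
            i += 1


def explain(sample, classify, threshold):
    """Return a pruned copy of `sample` that still scores >= threshold."""
    result = copy.deepcopy(sample)
    _prune(result, result, classify, threshold)
    return result


sample = {
    "imports": ["CreateRemoteThread", "GetTickCount", "WriteProcessMemory"],
    "sections": [{"name": ".text"}, {"name": ".rsrc"}],
    "strings": ["hello"],
}
print(explain(sample, toy_classifier, threshold=1.0))
```

Pruning works on hierarchical groupings first (a whole key or list element is removed at once) and descends into a group only when the group turns out to be needed, so the surviving subset remains a valid, smaller sample of the same structure.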

Inventors

TOMAS PEVNY

Assignees

AVAST Software s.r.o.

Dates

Publication Date: 2026-05-12
Application Date: 2021-10-01

Claims (20)

1. A method of generating an explanation of output from a machine-learning system, comprising: receiving an input data string having a hierarchical structure; analyzing the input data string using a machine learning module to automatically generate an output corresponding to the received input data string; generating an explanation of the output comprising a subset of the input data string that is responsible for the output; and training the machine learning module using a loss function optimized with an inner optimization that generates a sparse output corresponding to the explanation of the output, wherein the explanation comprises a subset of the input data string selected according to hierarchical groupings within the input data string, and wherein the training uses the explanation as a training input to update the machine learning module based on the hierarchical structure.
2. The method of generating an explanation of output from a machine-learning system of claim 1, further comprising applying a weighting or masking function when training the machine learning module using the generated explanation of the output and the output, configured to improve the output generated when the generated explanation of output is provided as input.
3. The method of generating an explanation of output from a machine-learning system of claim 2, wherein the loss function used to train the machine learning module is optimized using an inner optimization for an estimated output and the weighting or masking function.
4. The method of generating an explanation of output from a machine-learning system of claim 3, wherein the loss function optimization is trained using second-order stochastic gradient descent.
5. The method of generating an explanation of output from a machine-learning system of claim 1, further comprising constructing the machine learning module using a hierarchy of the input data string.
6. The method of generating an explanation of output from a machine-learning system of claim 1, wherein the machine learning module is a neural network.
7. The method of generating an explanation of output from a machine-learning system of claim 6, wherein the neural network comprises a hierarchical multiple-instance-learning neural network.
8. The method of generating an explanation of output from a machine-learning system of claim 7, wherein the neural network comprises an outer optimization of a loss function, and an inner optimization for one or more parameters of the loss function.
9. The method of generating an explanation of output from a machine-learning system of claim 6, wherein generating an explanation of the output is performed using logic other than the neural network.
10. A method of generating an explanation of output from a machine-learning system, comprising: receiving an input data string having a hierarchical structure; analyzing the input data string using a machine learning module to automatically generate an output corresponding to the received input data string; and generating an explanation of the output comprising a subset of the input data string that is responsible for the output; the machine learning module comprising a neural network having an inner portion and an outer portion, wherein the inner portion is configured to reduce or zero the weight of elements of the output while still producing a desired output and wherein the outer portion aggregates a sparse output vector to produce a classification output, wherein the inner portion generates instance-level outputs that correspond to hierarchical segments of the input data string and are reduced or zeroed based on the hierarchical groupings, and wherein the outer portion aggregates the instance-level outputs according to the hierarchical structure.
11. The method of generating an explanation of output from a machine-learning system, of claim 10, wherein reducing or zeroing the weight of elements of the output while still producing a desired output comprises reducing the weight to zero.
12. The method of generating an explanation of output from a machine-learning system, of claim 11, wherein reducing or zeroing the weight of elements of the output while still producing a desired output to zero comprises reducing the weights of as many elements of the output to zero as can be achieved while still generating the desired output.
13. The method of generating an explanation of output from a machine-learning system, of claim 10, further comprising regularizing one or more outputs of the inner layer to be either zero or the largest observed value in an inner layer output vector comprising the one or more outputs of the inner layer.
14. The method of generating an explanation of output from a machine-learning system of claim 10, further comprising constructing the machine learning module using a hierarchy of the input data string.
15. The method of generating an explanation of output from a machine-learning system of claim 10, wherein the machine learning module is a neural network.
16. The method of generating an explanation of output from a machine-learning system of claim 15, wherein the neural network comprises a hierarchical multiple-instance-learning neural network.
17. The method of generating an explanation of output from a machine-learning system of claim 15, wherein the inner portion outputs instances of the input and the outer portion outputs aggregated output of the inner portion.
18. The method of generating an explanation of output from a machine-learning system of claim 10, wherein the inner portion and the outer portion of the machine learning module are configured to improve the confidence of the machine learning module in the explanation.
19. The method of generating an explanation of output from a machine-learning system of claim 10, wherein the inner portion and outer portion of the machine learning module are configured to reduce the size of explanation of output.
20. A machine learning system, comprising: a processor and a memory; and a machine-readable medium with instructions stored thereon, the instructions when executed on the processor operable to cause the processor to: receive an input data string having a hierarchical structure; analyze the input data string using a machine learning module to automatically generate an output corresponding to the received input data string; generate an explanation of the output comprising a subset of the input data string that is responsible for the output; and train the machine learning module using the generated explanation of the output and the output, wherein the explanation is used as a training input and wherein the machine learning module comprises a neural network with an inner portion producing a sparse output corresponding to an explanation of the output and an outer portion aggregating the sparse output to produce the classification, wherein the inner portion produces a sparse output corresponding to an explanation that comprises a subset of the input data string selected according to hierarchical groupings within the input data string, and wherein the explanation is provided as a training input to update the neural network using the hierarchical structure of the input data string.
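
The inner-portion and outer-portion arrangement recited in claims 10 to 12 can be pictured with a small sketch. Everything in it is an assumption made for illustration: the function names, the ReLU instance scores, the sum-and-sigmoid aggregation, and the greedy post-hoc zeroing are hypothetical, whereas the claims describe a network whose inner portion is configured, for example through training, to produce the sparse output itself.

```python
import math


def inner_portion(instances, weights):
    """Hypothetical inner portion: one non-negative score per instance (ReLU)."""
    return [
        max(0.0, sum(w * x for w, x in zip(weights, instance)))
        for instance in instances
    ]


def outer_portion(scores, bias):
    """Hypothetical outer portion: sum the instance scores, then apply a sigmoid."""
    return 1.0 / (1.0 + math.exp(-(sum(scores) + bias)))


def sparsify(scores, bias, threshold):
    """Zero as many instance scores as possible while the output stays >= threshold.

    Greedy illustration: try the smallest scores first and undo any zeroing
    that would push the aggregated output below the threshold.
    """
    sparse = list(scores)
    for i in sorted(range(len(sparse)), key=lambda j: sparse[j]):
        kept = sparse[i]
        sparse[i] = 0.0
        if outer_portion(sparse, bias) < threshold:
            sparse[i] = kept  # this instance is needed for the desired output
    return sparse


# Four instances (for example, four segments of one hierarchical sample).
instances = [[1.0, 0.0], [0.0, 1.0], [3.0, 1.0], [0.2, 0.1]]
scores = inner_portion(instances, weights=[1.0, 0.5])
sparse = sparsify(scores, bias=-2.0, threshold=0.5)
explanation = [i for i, s in enumerate(sparse) if s != 0.0]
print(sparse, explanation)
```

The indices of the instances whose scores survive zeroing identify the part of the input that the aggregated classification actually depends on, which is the role the claims assign to the sparse output.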

Description

FIELD
The invention relates generally to security in computerized systems, and more specifically to data-driven automated malware classification with human-readable explanations.
BACKGROUND
Computers are valuable tools in large part for their ability to communicate with other computer systems and retrieve information over computer networks. Networks typically comprise an interconnected group of computers, linked by wire, fiber optic, radio, or other data transmission means, to provide the computers with the ability to transfer information from computer to computer. The Internet is perhaps the best-known computer network, and enables millions of people to access millions of other computers such as by viewing web pages, sending e-mail, or by performing other computer-to-computer communication.
But, because the size of the Internet is so large and Internet users are so diverse in their interests, it is not uncommon for malicious users to attempt to communicate with other users' computers in a manner that poses a danger to the other users. For example, a hacker may attempt to log in to a corporate computer to steal, delete, or change information. Computer viruses or Trojan horse programs may be distributed to other computers or unknowingly downloaded such as through email, download links, or smartphone apps. Further, computer users within an organization such as a corporation may on occasion attempt to perform unauthorized network communications, such as running file sharing programs or transmitting corporate secrets from within the corporation's network to the Internet.
For these and other reasons, many computer systems employ a variety of safeguards designed to protect computer systems against certain threats. Firewalls are designed to restrict the types of communication that can occur over a network, antivirus programs are designed to prevent malicious code from being loaded or executed on a computer system, and malware detection programs are designed to detect remailers, keystroke loggers, and other software that is designed to perform undesired operations such as stealing information from a computer or using the computer for unintended purposes. Similarly, web site scanning tools are used to verify the security and integrity of a website, and to identify and fix potential vulnerabilities.
For example, antivirus software installed on a personal computer or in a firewall may use characteristics of known malicious data to look for other potentially malicious data, and block it. In a personal computer, the user is typically notified of the potential threat, and given the option to delete the file or allow the file to be accessed normally. A firewall similarly inspects network traffic that passes through it, permitting passage of desirable network traffic while blocking undesired network traffic based on a set of rules.
Tools such as these rely upon having an accurate and robust ability to detect potential threats, minimizing the number of false positive detections that interrupt normal computer operation while catching substantially all malware that poses a threat to computers and the data they handle. Accurately identifying and classifying new threats is therefore an important part of antimalware systems, and a subject of much research and effort. But, determining whether a new file is malicious or benign can be difficult and time-consuming, even when human researchers are simply confirming a machine-based determination. It is therefore desirable to provide machine-based malware determinations and classifications that reduce the workload on human malware researchers.
SUMMARY
One example embodiment of the invention comprises a machine learning system such as a neural network trained on an explanation vector as input and a result such as a domain classification output, improving the neural network's confidence in the explanation.
In a more detailed example, an input data string with a hierarchical structure is received and analyzed using a machine learning module to generate an output, and an explanation of the output is generated comprising a subset of the input data string that is responsible for the output. A weighting or masking function is applied when training the machine learning module using the generated explanation of the output and the output itself, and is configured to improve the output generated when the generated explanation of output is provided as input. In a further example, the loss function used to train the machine learning module is optimized using an inner optimization for the estimated output and the weighting or masking function.
In another example, an input data string with a hierarchical structure is again received and analyzed using a machine learning module to generate an output, and an explanation of the output is generated comprising a subset of the input data string that is responsible for the output. The machine learning module comprises a neural network having an inner portion