US-12621321-B2 - Automatic generation of cause and effect attack predictions models via threat intelligence data
Abstract
A method for predicting a future stage of an attack on a computer system. The method comprises performing, by the computer system, linguistic analysis on threat intelligence reports, where the threat intelligence reports comprise known stages of the attack. The method also comprises processing, by the computer system, the linguistic analysis with a transition matrix to determine probabilities of cause-and-effect relationships between the known stages of the attack, updating, by the computer system, a probability model based on the probabilities determined by the transition matrix, and predicting, by the computer system, the future stage of the attack based on the probability model and attack classifications.
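The abstract's pipeline (mine known attack-stage sequences from threat intelligence reports, accumulate them into a transition matrix of cause-and-effect probabilities, then predict the next stage) can be approximated as a first-order Markov model. The sketch below is illustrative only, not the patented implementation; the stage names, report data, and function names are hypothetical.

```python
# Illustrative sketch: count stage-to-stage transitions mined from reports,
# normalize the counts into a transition matrix, and predict the most
# likely next stage. All data here is invented for demonstration.
from collections import defaultdict

def build_transition_matrix(report_stage_sequences):
    """Turn observed stage sequences into per-stage probability rows."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in report_stage_sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    matrix = {}
    for cur, nxts in counts.items():
        total = sum(nxts.values())
        matrix[cur] = {nxt: n / total for nxt, n in nxts.items()}
    return matrix

def predict_next_stage(matrix, current_stage):
    """Return the highest-probability successor stage, if any is known."""
    dist = matrix.get(current_stage, {})
    return max(dist, key=dist.get) if dist else None

# Hypothetical stage sequences extracted from three reports.
reports = [
    ["initial-access", "execution", "persistence"],
    ["initial-access", "execution", "exfiltration"],
    ["initial-access", "execution", "persistence"],
]
m = build_transition_matrix(reports)
print(predict_next_stage(m, "execution"))  # persistence (seen in 2 of 3 reports)
```

In the real system the sequences would come from linguistic analysis of the reports rather than being hand-labeled, and the model would be updated as new intelligence arrives.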
Inventors
- Avi Chesla
- Sergei Edelstein
Assignees
- Cybereason Inc.
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-03-31
Claims (20)
- 1. A method for predicting a future stage of an attack on a computer system, comprising: performing, by the computer system, linguistic analysis on threat intelligence reports, the threat intelligence reports comprising known stages of the attack; processing, by the computer system, the linguistic analysis with a transition matrix that represents probabilities of transitions between multiple attack stages to determine probabilities of cause-and-effect relationships between the known stages of the attack; updating, by the computer system, a probability model based on the probabilities determined by the transition matrix using new threat intelligence data processed through the transition matrix; and predicting, by the computer system, a probabilistic sequence of multiple future stages in the attack based on the probability model and a plurality of attack classifications.
- 2. The method of claim 1, further comprising: choosing, by the computer system, the probability model from a plurality of probability models.
- 3. The method of claim 1, further comprising: setting, by the computer system, the plurality of attack classifications based on an evidence data set including logs that are collected from at least one of security tools, network devices, identity management systems, cloud workspace applications, and endpoint operating systems.
- 4. The method of claim 1, further comprising: performing, by the computer system, the linguistic analysis initially using a pre-trained natural language processing (NLP) model configured to predict missing textual terms.
- 5. The method of claim 4, further comprising: performing, by the computer system, the NLP model using a Bidirectional Encoder Representations from Transformers (BERT) machine learning algorithm for creating NLP predictive models based on textual data in an unsupervised manner.
- 6. The method of claim 1, further comprising: predicting, by the computer system, the future stage in the attack based on a plurality of probability models and combining the predictions from the plurality of probability models as a combined prediction of the future stage in the attack.
- 7. The method of claim 6, wherein the plurality of probability models includes a Quasi-Linear prediction model and a Bayesian Belief Network prediction model.
- 8. The method of claim 1, further comprising: predicting, by the computer system, the future stage in the attack by specifying at least one of attack tactics, attack techniques, attack sub-techniques, and attack software identity.
- 9. The method of claim 1, further comprising: generating, by the computer system, the probability model by: performing matrix decomposition of the transition matrix using a plurality of matrix decomposition methods, scoring the decomposition for each of the plurality of matrix decomposition methods, selecting one of the matrix decompositions based on the scoring, and generating the probability model using the selected one of the matrix decompositions.
- 10. The method of claim 9, further comprising: performing, by the computer system, the matrix decomposition as a main matrix including non-cyclic transitions of the known stages of the attack and a supplement matrix including cyclic transitions of the known stages of the attack.
- 11. A non-transitory computer readable medium comprising one or more sequences of instructions, which, when executed by a processor, cause a computer system to predict a future stage of an attack on another computer system by performing operations comprising: performing, by the computer system, linguistic analysis on threat intelligence reports, the threat intelligence reports comprising known stages of the attack; processing, by the computer system, the linguistic analysis with a transition matrix that represents probabilities of transitions between multiple attack stages to determine probabilities of cause-and-effect relationships between the known stages of the attack; updating, by the computer system, a probability model based on the probabilities determined by the transition matrix using new threat intelligence data processed through the transition matrix; and predicting, by the computer system, a probabilistic sequence of multiple future stages in the attack based on the probability model and a plurality of attack classifications.
- 12. The non-transitory computer readable medium of claim 11, further comprising: choosing, by the computer system, the probability model from a plurality of probability models.
- 13. The non-transitory computer readable medium of claim 11, further comprising: setting, by the computer system, the plurality of attack classifications based on an evidence data set including logs that are collected from at least one of security tools, network devices, identity management systems, cloud workspace applications, and endpoint operating systems.
- 14. The non-transitory computer readable medium of claim 11, further comprising: performing, by the computer system, the linguistic analysis initially using a pre-trained natural language processing (NLP) model configured to predict missing textual terms.
- 15. The non-transitory computer readable medium of claim 14, further comprising: performing, by the computer system, the NLP model using a Bidirectional Encoder Representations from Transformers (BERT) machine learning algorithm for creating NLP predictive models based on textual data in an unsupervised manner.
- 16. The non-transitory computer readable medium of claim 11, further comprising: predicting, by the computer system, the future stage in the attack based on a plurality of probability models and combining the predictions from the plurality of probability models as a combined prediction of the future stage in the attack.
- 17. The non-transitory computer readable medium of claim 16, wherein the plurality of probability models includes a Quasi-Linear prediction model and a Bayesian Belief Network prediction model.
- 18. The non-transitory computer readable medium of claim 11, further comprising: predicting, by the computer system, the future stage in the attack by specifying at least one of attack tactics, attack techniques, attack sub-techniques, and attack software identity.
- 19. The non-transitory computer readable medium of claim 11, further comprising: generating, by the computer system, the probability model by: performing matrix decomposition of the transition matrix using a plurality of matrix decomposition methods, scoring the decomposition for each of the plurality of matrix decomposition methods, selecting one of the matrix decompositions based on the scoring, and generating the probability model using the selected one of the matrix decompositions.
- 20. The non-transitory computer readable medium of claim 19, further comprising: performing, by the computer system, the matrix decomposition as a main matrix including non-cyclic transitions of the known stages of the attack and a supplement matrix including cyclic transitions of the known stages of the attack.
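Claims 6-7 (and 16-17) combine predictions from multiple probability models, such as a Quasi-Linear model and a Bayesian Belief Network model, and claims 1 and 11 call for predicting a probabilistic sequence of multiple future stages. A toy sketch of that combination, under the simplifying assumption that each model can be reduced to a stage-transition matrix and that the ensemble is a weighted average (the matrices, weights, and stage names below are invented for illustration, not taken from the patent):

```python
import numpy as np

STAGES = ["recon", "initial-access", "lateral-movement", "exfiltration"]

# Hypothetical per-model transition matrices (each row sums to 1): one
# standing in for a quasi-linear model, one for a BBN-derived model.
quasi_linear = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.8, 0.2],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.0, 1.0],
])
bbn_derived = np.array([
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.6, 0.4],
    [0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 1.0],
])

def combined_sequence(current, steps, weights=(0.5, 0.5)):
    """Average the two models, then greedily walk the most likely
    successors to produce a predicted sequence of future stages."""
    t = weights[0] * quasi_linear + weights[1] * bbn_derived
    idx, seq = STAGES.index(current), []
    for _ in range(steps):
        idx = int(np.argmax(t[idx]))
        seq.append(STAGES[idx])
    return seq

print(combined_sequence("recon", 3))
# ['initial-access', 'lateral-movement', 'exfiltration']
```

A real ensemble would propagate full probability distributions rather than greedy argmax picks, but the averaging step shows how two models' transition estimates can be merged into one combined prediction.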
Description
CROSS-REFERENCE TO RELATED APPLICATION

The present application claims priority to U.S. Provisional Patent Application 63/362,286, filed on Mar. 31, 2022, the entire contents of which are incorporated herein for all purposes by this reference. The present application is related to U.S. Pat. No. 11,228,610, issued Jan. 18, 2022, and U.S. Pat. No. 10,673,903, issued Jun. 2, 2020, the entire contents of which are hereby incorporated herein for all purposes by this reference.

FIELD

Embodiments disclosed herein generally relate to a system for automatic generation of cause-and-effect attack prediction models via threat intelligence data.

BACKGROUND

The frequency and sophistication of cyberattacks are on the rise. As a result, the time it takes organizations to detect, investigate, respond to, and contain attacks is unacceptable, leaving them vulnerable to threats such as data theft, data manipulation, identity theft, ransomware, and more. One reason detection, investigation, response, and containment take too long is that existing security solutions are designed to detect and respond to attacks based on current and historical evidence and alerts, without the capability to predict the attacker's next steps and prevent them. These conventional solutions therefore fail to provide a proactive way to deal with attacks.

SUMMARY

In some embodiments, a method is implemented for predicting a future stage of an attack on a computer system. The method comprises performing, by the computer system, linguistic analysis on threat intelligence reports, where the threat intelligence reports comprise known stages of the attack.
The method also comprises processing, by the computer system, the linguistic analysis with a transition matrix to determine probabilities of cause-and-effect relationships between the known stages of the attack, updating, by the computer system, a probability model based on the probabilities determined by the transition matrix, and predicting, by the computer system, the future stage of the attack based on the probability model and attack classifications.

BRIEF DESCRIPTION OF DRAWINGS

So that the manner in which the above-recited features of the present disclosure may be understood in detail, a more particular description of the disclosure, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this disclosure and are therefore not to be considered limiting of its scope, for the disclosure may admit to other equally effective embodiments.

FIG. 1A depicts a block diagram of a computing environment, according to example embodiments of the present disclosure.
FIG. 1B depicts a flowchart of attack prediction, according to example embodiments of the present disclosure.
FIG. 2A depicts a table of transition types, according to example embodiments of the present disclosure.
FIG. 2B depicts a table of section qualification, according to example embodiments of the present disclosure.
FIG. 2C depicts a table of direct transition qualification, according to example embodiments of the present disclosure.
FIG. 2D depicts a table of indirect transition qualification, according to example embodiments of the present disclosure.
FIG. 2E depicts a relations matrix, according to example embodiments of the present disclosure.
FIG. 3 depicts a flow chart of a linguistic analysis procedure, according to example embodiments of the present disclosure.
FIG. 4 depicts a transition matrix, according to example embodiments of the present disclosure.
FIG. 5 depicts a process for transition matrix generation, according to example embodiments of the present disclosure.
FIG. 6A depicts an upper propagation matrix, according to example embodiments of the present disclosure.
FIG. 6B depicts a quasi-linear prediction model, according to example embodiments of the present disclosure.
FIG. 7 depicts a probability addition rule, according to example embodiments of the present disclosure.
FIG. 8 depicts a BBN influence diagram, according to example embodiments of the present disclosure.
FIG. 9 depicts a probability propagation, according to example embodiments of the present disclosure.
FIG. 10 depicts an illustration of BBN model advantages, according to example embodiments of the present disclosure.
FIG. 11 depicts a prediction model generation process, according to example embodiments of the present disclosure.
FIG. 12 depicts transition cases, according to example embodiments of the present disclosure.
FIG. 13 depicts a simple ordering decomposition, according to example embodiments of the present disclosure.
FIG. 14 depicts an influence diagram conditional probability table for a single influence node, according to example embodiments of the present disclosure.
FIG. 15 depicts influence diagram conditional probability tables for multi-influence nodes, according to example embodiments of the present disclosure.
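Claims 9-10 and the figures describing an upper propagation matrix (FIG. 6A) and a simple ordering decomposition (FIG. 13) suggest splitting the transition matrix into a main matrix of non-cyclic (forward) transitions and a supplement matrix of cyclic transitions. Assuming the stages are ordered along the attack lifecycle, one such decomposition simply splits the matrix at the diagonal. This sketch is an assumption about how such a scheme could work, not the patented method; the matrix values are invented:

```python
import numpy as np

def decompose_transitions(t):
    """Split a stage-transition matrix into a 'main' matrix of forward
    (non-cyclic, strictly upper-triangular) transitions and a 'supplement'
    matrix of self-loops and backward (cyclic) transitions. Assumes the
    stage order along both axes follows the attack lifecycle."""
    main = np.triu(t, k=1)   # strictly above the diagonal: forward moves
    supplement = t - main    # diagonal and below: repeats and back-moves
    return main, supplement

# Hypothetical 3-stage transition matrix; row 1 contains a 0.1 backward
# transition and a 0.2 self-loop that belong in the supplement matrix.
t = np.array([
    [0.0, 0.6, 0.4],
    [0.1, 0.2, 0.7],
    [0.0, 0.3, 0.7],
])
main, supp = decompose_transitions(t)
```

Keeping the acyclic part separate makes the main matrix nilpotent, so forward probability propagation through the attack stages terminates, while the supplement matrix preserves the cyclic behavior for models that can represent it.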