US-12627683-B2 - Risk management simulator for multi-cloud data components

Abstract

Techniques described herein relate to a method for performing threat simulations. The method includes preparing a multi-cloud infrastructure (MCI) for live data collection, wherein: the MCI includes components, a first component of the components is associated with a first cloud service provider, and a second component of the components is associated with a second cloud service provider; obtaining, after the preparing, live data from the MCI; generating a prediction using the live data and a prediction model, wherein the prediction specifies whether the live data indicates that there is a threat within the MCI; making a first determination that the prediction indicates a threat associated with the MCI has been identified; and in response to the first determination: performing threat remediation using the prediction and a diagnostic repository.
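The predict-then-remediate flow summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the disclosure: the two-feature samples, the labels, the function names `knn_predict` and `handle_live_data`, and the contents of `DIAGNOSTIC_REPOSITORY` are all hypothetical. A K-Nearest Neighbors classifier is used only because the claims name it as one possible prediction model.

```python
from collections import Counter
from math import dist

# Hypothetical labeled training points: (features, label).
# Assumed features: (failed logins per minute, data transfer rate in MB/s).
TRAINING = [
    ((1.0, 5.0), "benign"),
    ((2.0, 4.0), "benign"),
    ((1.5, 6.0), "benign"),
    ((40.0, 90.0), "exfiltration"),
    ((45.0, 80.0), "exfiltration"),
    ((38.0, 95.0), "exfiltration"),
]

# Hypothetical diagnostic repository: threat label -> remediation actions.
DIAGNOSTIC_REPOSITORY = {
    "exfiltration": ["block outbound traffic", "revoke session tokens"],
}


def knn_predict(sample, training=TRAINING, k=3):
    """Return the majority label among the k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def handle_live_data(sample):
    """Predict on one live sample; look up remediation if a threat is found."""
    prediction = knn_predict(sample)
    if prediction == "benign":
        return prediction, []
    # Unknown threat labels yield an empty action list in this sketch.
    return prediction, DIAGNOSTIC_REPOSITORY.get(prediction, [])
```

In this sketch a sample such as `(42.0, 85.0)` falls closest to the hypothetical exfiltration points, so `handle_live_data` returns that label together with the actions stored for it, while a sample near the benign cluster returns no actions.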

Inventors

Avinash Kumar
Mahesh Reddy Appireddygari Venkataramana

Assignees

DELL PRODUCTS L.P.

Dates

Publication Date: 2026-05-12
Application Date: 2024-05-30

Claims (16)

1. A method for performing threat simulations, comprising: preparing a multi-cloud infrastructure (MCI) for live data collection, wherein: the MCI comprises a plurality of components, a first component of the plurality of components is associated with a first cloud service provider, and a second component of the plurality of components is associated with a second cloud service provider; obtaining, after the preparing, live data from the MCI; generating a prediction using the live data and a prediction model, wherein the prediction specifies whether the live data indicates that there is a threat within the MCI; making a first determination that the prediction indicates a threat associated with the MCI has been identified; in response to the first determination: performing threat remediation using the prediction and a diagnostic repository; after performing threat remediation: making a second determination that a model update event is identified; in response to the second determination: obtaining refinement training data; performing data processing on the refinement training data to obtain processed refinement training data; performing, after performing data processing, feature generation on the processed refinement training data to obtain featured refinement training data; generating an updated prediction model using the featured refinement training data; updating simulation parameters of the updated prediction model based on a prior prediction model performance, wherein the simulation parameters comprise weights and thresholds; performing model validation using the updated prediction model, comprising: generating a confusion matrix using the prediction model and validation training data, calculating an error associated with the prediction model, and comparing the error, a true positive percentage, a true negative percentage, a false positive percentage, and a false negative percentage to corresponding user configurable thresholds; and initiating threat simulation using the updated prediction model.
2. The method of claim 1, wherein the live data comprises a first portion associated with the first component and a second portion associated with the second component.
3. The method of claim 2, wherein the first portion comprises: network traffic information associated with the first component; a geographic location of the first component; system activities associated with the first component; user behavior associated with the first component; user authentication logs associated with the first component; and data transfer rates associated with the first component.
4. The method of claim 1, wherein the diagnostic repository specifies at least one action to perform to remediate the threat associated with the prediction.
5. The method of claim 1, wherein the prediction model comprises a K-Nearest Neighbors prediction model.
6. The method of claim 1, further comprising: prior to preparing the MCI for live data collection: identifying an initial prediction model training event associated with the MCI; obtaining, in response to the identifying, training data associated with the MCI; performing data preprocessing on the training data to obtain processed training data; performing feature generation using the processed training data to obtain featured training data; generating the prediction model using the featured training data; performing model validation using the prediction model; and initiating threat simulation using the prediction model.
7. The method of claim 1, wherein the refinement training data comprises additional data compared to the training data.
8. The method of claim 7, wherein the refinement training data comprises labeled live data generated by a user.
9. A non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for performing threat simulations, the method comprising: preparing a multi-cloud infrastructure (MCI) for live data collection, wherein: the MCI comprises a plurality of components, a first component of the plurality of components is associated with a first cloud service provider, and a second component of the plurality of components is associated with a second cloud service provider; obtaining, after the preparing, live data from the MCI; generating a prediction using the live data and a prediction model, wherein the prediction specifies whether the live data indicates that there is a threat within the MCI; making a first determination that the prediction indicates a threat associated with the MCI has been identified; in response to the first determination: performing threat remediation using the prediction and a diagnostic repository; after performing threat remediation: making a second determination that a model update event is identified; in response to the second determination: obtaining refinement training data; performing data processing on the refinement training data to obtain processed refinement training data; performing, after performing data processing, feature generation on the processed refinement training data to obtain featured refinement training data; generating an updated prediction model using the featured refinement training data; updating simulation parameters of the updated prediction model based on a prior prediction model performance, wherein the simulation parameters comprise weights and thresholds; performing model validation using the updated prediction model, comprising: generating a confusion matrix using the prediction model and validation training data, calculating an error associated with the prediction model, and comparing the error, a true positive percentage, a true negative percentage, a false positive percentage, and a false negative percentage to corresponding user configurable thresholds; and initiating threat simulation using the updated prediction model.
10. The non-transitory computer readable medium of claim 9, wherein the live data comprises a first portion associated with the first component and a second portion associated with the second component.
11. The non-transitory computer readable medium of claim 10, wherein the first portion comprises: network traffic information associated with the first component; a geographic location of the first component; system activities associated with the first component; user behavior associated with the first component; user authentication logs associated with the first component; and data transfer rates associated with the first component.
12. The non-transitory computer readable medium of claim 9, wherein the diagnostic repository specifies at least one action to perform to remediate the threat associated with the prediction.
13. The non-transitory computer readable medium of claim 9, wherein the prediction model comprises a K-Nearest Neighbors prediction model.
14. The non-transitory computer readable medium of claim 9, wherein the method further comprises: prior to preparing the MCI for live data collection: identifying an initial prediction model training event associated with the MCI; obtaining, in response to the identifying, training data associated with the MCI; performing data preprocessing on the training data to obtain processed training data; performing feature generation using the processed training data to obtain featured training data; generating the prediction model using the featured training data; performing model validation using the prediction model; and initiating threat simulation using the prediction model.
15. The non-transitory computer readable medium of claim 9, wherein the refinement training data comprises additional data compared to the training data.
16. The non-transitory computer readable medium of claim 15, wherein the refinement training data comprises labeled live data generated by a user.
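The model validation step recited in the independent claims (build a confusion matrix, compute an error, and compare five metrics to user configurable thresholds) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation. The names `confusion_counts` and `validate`, the threshold keys, and the example values are hypothetical. The sketch also assumes that each percentage is taken over all validation samples, that the error is the share of misclassified samples, and that the error and false rates are upper bounds while the true rates are lower bounds; the claims do not specify any of these details.

```python
def confusion_counts(actual, predicted):
    """Count (TP, TN, FP, FN) for boolean threat labels."""
    tp = tn = fp = fn = 0
    for is_threat, flagged in zip(actual, predicted):
        if is_threat and flagged:
            tp += 1
        elif not is_threat and not flagged:
            tn += 1
        elif not is_threat and flagged:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def validate(actual, predicted, thresholds):
    """Return (passed, metrics), with metrics as percentages of all samples."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("validation data is empty")
    metrics = {
        "error": 100 * (fp + fn) / total,  # assumed: misclassification rate
        "true_positive": 100 * tp / total,
        "true_negative": 100 * tn / total,
        "false_positive": 100 * fp / total,
        "false_negative": 100 * fn / total,
    }
    # Assumed direction of each check: "true" rates must reach their
    # threshold, everything else must stay at or below it.
    lower_bounded = {"true_positive", "true_negative"}
    passed = all(
        metrics[name] >= thresholds[name]
        if name in lower_bounded
        else metrics[name] <= thresholds[name]
        for name in metrics
    )
    return passed, metrics


# Hypothetical validation labels and model outputs (True = threat).
ACTUAL = [True, True, True, True, False, False, False, False, False, False]
PREDICTED = [True, True, True, False, False, False, False, False, False, True]

# Hypothetical user configurable thresholds, in percent.
THRESHOLDS = {
    "error": 25,
    "true_positive": 25,
    "true_negative": 40,
    "false_positive": 15,
    "false_negative": 15,
}
```

With these example inputs the counts are 3 true positives, 5 true negatives, 1 false positive, and 1 false negative, so the error is 20 percent and `validate(ACTUAL, PREDICTED, THRESHOLDS)` passes; tightening the error threshold to 10 makes the same model fail validation.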

Description

BACKGROUND

Computing devices may provide services for users. To provide the services, the computing devices may obtain other services from other computing devices included in a computing environment. The computing devices in the computing environment may be susceptible to threats from nefarious users. To protect the computing devices and data in the computing environment, the threats may be searched for and identified. Identified threats may be remediated to mitigate damages associated with the identified threats.

BRIEF DESCRIPTION OF DRAWINGS

Certain embodiments of the invention will be described with reference to the accompanying drawings. However, the accompanying drawings illustrate only certain aspects or implementations of the invention by way of example and are not meant to limit the scope of the claims.

FIG. 1.1 shows a diagram of a system in accordance with one or more embodiments disclosed herein.
FIG. 1.2 shows a diagram of a training data repository in accordance with one or more embodiments disclosed herein.
FIG. 2.1 shows a flowchart of a method for training an initial prediction model in accordance with one or more embodiments disclosed herein.
FIG. 2.2 shows a flowchart of a method for performing risk simulation in accordance with one or more embodiments disclosed herein.
FIG. 2.3 shows a flowchart of a method for updating a prediction model in accordance with one or more embodiments disclosed herein.
FIG. 3 shows a diagram of a computing device in accordance with one or more embodiments disclosed herein.

DETAILED DESCRIPTION

Specific embodiments will now be described with reference to the accompanying figures. In the following description, numerous details are set forth as examples of the embodiments disclosed herein. It will be understood by those skilled in the art that one or more embodiments disclosed herein may be practiced without these specific details and that numerous variations or modifications may be possible without departing from the scope of the embodiments disclosed herein. Certain details known to those of ordinary skill in the art are omitted to avoid obscuring the description.

In the following description of the figures, any component described with regard to a figure, in various embodiments disclosed herein, may be equivalent to one or more like-named components described with regard to any other figure. For brevity, descriptions of these components will not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments disclosed herein, any description of the components of a figure is to be interpreted as an optional embodiment, which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.

Throughout this application, elements of figures may be labeled as A to N. As used herein, the aforementioned labeling means that the element may include any number of items and does not require that the element include the same number of elements as any other item labeled as A to N. For example, a data structure may include a first element labeled as A and a second element labeled as N. This labeling convention means that the data structure may include any number of the elements. A second data structure, also labeled as A to N, may also include any number of elements. The number of elements of the first data structure and the number of elements of the second data structure may be the same or different.

In general, embodiments of the invention relate to methods, systems, and/or non-transitory computer readable mediums for performing risk simulations for a multi-cloud infrastructure (MCI).

Large-scale infrastructures may be crucial to the operations of businesses and organizations across industries. However, their adoption has been accompanied by an escalation in cyber threats, especially when managing multi-cloud-based data components. Sophisticated cyberattacks may pose a substantial risk to these infrastructures, potentially resulting in severe breaches, data loss, and operational disruptions. Conventional cybersecurity measures, often reliant on reactive strategies like signature-based detection, fall short in the face of these evolving threats. There is an evident need for a proactive solution that can accurately anticipate, simulate, and counteract potential cyber threats in real time, specifically tailored to the challenges of multi-cloud-based data components. This solution must integrate seamlessly with existing cybersecurity frameworks while harnessing the power of advanced technologies like machine learning to provide a comprehensive and adaptive defense m