
CN-121997346-A - Data vulnerability analysis method and system based on artificial intelligence


Abstract

The invention relates to the technical field of data analysis, and in particular to a data vulnerability analysis method and system based on artificial intelligence. The method comprises: obtaining log record data, and performing matching and sorting according to the log record data to obtain a circulation data set; constructing node connection relations according to the circulation data set to determine a complete path map; obtaining node access records and screening potential hidden danger nodes to obtain a hidden danger node list; obtaining the data interaction frequency and performing threshold comparison according to it to obtain a risk association chain catalog; performing data analysis according to the risk association chain catalog to obtain an abnormal operation sequence; performing validity comparison according to the abnormal operation sequence to obtain a preliminary comparison set; performing threshold comparison to obtain a final vulnerability position; and performing repair path simulation based on the final vulnerability position to obtain an optimized circulation track. The method achieves vulnerability location and path repair across the full link of the data flow.

Inventors

  • LI JIANG

Assignees

  • 中亿鼎盛建设集团有限公司

Dates

Publication Date
2026-05-08
Application Date
2026-04-09

Claims (9)

  1. An artificial intelligence-based data vulnerability analysis method, characterized by comprising the following steps: acquiring log record data, and performing matching and sorting according to the log record data to obtain a circulation data set; constructing node connection relations according to the circulation data set, and determining a complete path map; acquiring node access records based on the complete path map, and screening potential hidden danger nodes according to the node access records to obtain a hidden danger node list; acquiring the data interaction frequency based on the hidden danger node list, and performing threshold comparison according to the data interaction frequency to obtain a risk association chain catalog; performing data analysis according to the risk association chain catalog to obtain an abnormal operation sequence; performing validity comparison according to the abnormal operation sequence to obtain a preliminary comparison set, and performing threshold comparison according to the preliminary comparison set to obtain a final vulnerability position; and performing repair path simulation based on the final vulnerability position to obtain an optimized circulation track.
  2. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein acquiring log record data and performing matching and sorting according to the log record data to obtain a circulation data set comprises: acquiring log record data, and extracting a timestamp and an event identifier from the log record data to obtain a log data set; matching and sorting the event identifiers according to the log data set to obtain a circulation path set; and screening the circulation path set to obtain missing circulation sequences, and performing supplementary repair on the missing circulation sequences to obtain the circulation data set.
  3. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein constructing node connection relations according to the circulation data set and determining a complete path map comprises: extracting path nodes based on the circulation data set to obtain preliminary path nodes; constructing connection relations among nodes according to the preliminary path nodes to obtain direct connection paths; and adding auxiliary information to the preliminary path nodes based on the direct connection paths to obtain complete path nodes, and collating the direct connection paths and the complete path nodes to obtain the complete path map.
  4. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein acquiring node access records based on the complete path map and screening potential hidden danger nodes according to the node access records to obtain a hidden danger node list comprises: acquiring a node operation log based on the complete path map, extracting node access records from the node operation log, and, if a node access record does not conform to a preset permission standard, marking the node as a suspected hidden danger node, and integrating the marked nodes to obtain a preliminary marking result; and extracting access frequency data for the preliminary marking result, and, if the access frequency data exceeds a preset frequency threshold, marking the node as a potential hidden danger node, and integrating the marked nodes to obtain the hidden danger node list.
  5. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein acquiring the data interaction frequency based on the hidden danger node list and performing threshold comparison according to the data interaction frequency to obtain a risk association chain catalog comprises: acquiring, from the hidden danger node list, the data interaction frequency of nodes in a cluster, judging a chain to be a risk key chain if the data interaction frequency exceeds a preset frequency threshold, and integrating the risk key chains to obtain a preliminary screening set; and analyzing the interaction fluctuation according to the preliminary screening set, judging a chain to be a risk association chain if the interaction fluctuation exceeds a preset fluctuation upper limit, and collecting the risk association chains to obtain the risk association chain catalog.
  6. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein performing data analysis according to the risk association chain catalog to obtain an abnormal operation sequence comprises: extracting node data volume information from the risk association chain catalog, judging the node data volume information to be redundant data volume information if it exceeds a preset data volume threshold, and extracting and collating the redundant data volume information to obtain a preliminary extraction set; and extracting operation time periods from the preliminary extraction set, classifying the operation time periods to obtain a classification result, and judging an operation to be abnormal if the classification result deviates from a preset classification standard, to obtain the abnormal operation sequence.
  7. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein performing validity comparison according to the abnormal operation sequence to obtain a preliminary comparison set, and performing threshold comparison according to the preliminary comparison set to obtain a final vulnerability position comprises: acquiring information modification operations from the abnormal operation sequence and performing validity comparison on them, judging an information modification operation to be an illegal operation if it does not conform to a modification validity standard, and integrating the illegal operations to obtain the preliminary comparison set; and calculating the proportion of illegal operations according to the preliminary comparison set, determining the corresponding node to be a vulnerability node if the proportion exceeds a preset illegal-operation threshold, and integrating the vulnerability nodes to obtain the final vulnerability position.
  8. The artificial intelligence-based data vulnerability analysis method of claim 1, wherein performing repair path simulation based on the final vulnerability position to obtain an optimized circulation track comprises: extracting interaction information from the final vulnerability position and comparing it against a preset interaction standard, and judging a path node to be a node to be adjusted if its interaction information exceeds the preset interaction standard, to obtain a set to be adjusted; calculating the information redundancy for the set to be adjusted, and sorting according to the information redundancy to obtain a priority repair order; and performing simulated correction according to the priority repair order and generating corrected paths to obtain the optimized circulation track.
  9. An artificial intelligence-based data vulnerability analysis system, comprising: a data acquisition module, used for acquiring log record data and performing matching and sorting according to the log record data to obtain a circulation data set; a map construction module, used for constructing node connection relations according to the circulation data set and determining a complete path map; a hidden danger identification module, used for acquiring node access records based on the complete path map and screening potential hidden danger nodes according to the node access records to obtain a hidden danger node list; a risk judging module, used for acquiring the data interaction frequency based on the hidden danger node list and performing threshold comparison according to the data interaction frequency to obtain a risk association chain catalog; an anomaly analysis module, used for performing data analysis according to the risk association chain catalog to obtain an abnormal operation sequence; a vulnerability positioning module, used for performing validity comparison according to the abnormal operation sequence to obtain a preliminary comparison set, and performing threshold comparison according to the preliminary comparison set to obtain a final vulnerability position; and a path repairing module, used for performing repair path simulation based on the final vulnerability position to obtain an optimized circulation track.
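
As an illustration only, the path map construction of claim 3 and the two-stage screening of claim 4 might be sketched in Python as follows. The claims do not specify data formats, permission standards, or threshold values, so the sample flows, the role-based permission table, the threshold of 2, and all function names below are assumptions made for this sketch rather than details of the claimed method.

```python
from collections import Counter

# Hypothetical circulation data set: each flow is an ordered list of node names.
FLOWS = {
    "flow-a": ["client", "gateway", "database"],
    "flow-b": ["client", "auth", "database"],
}

# Hypothetical node access records: (node, accessor role).
ACCESS_RECORDS = [
    ("database", "admin"),
    ("database", "guest"),
    ("database", "guest"),
    ("database", "guest"),
    ("auth", "guest"),
    ("gateway", "service"),
]

# Assumed permission standard: the roles allowed to access each node.
ALLOWED_ROLES = {
    "client": {"guest", "service", "admin"},
    "gateway": {"service", "admin"},
    "auth": {"service", "admin"},
    "database": {"admin"},
}

FREQUENCY_THRESHOLD = 2  # assumed preset access frequency threshold


def build_path_map(flows):
    """Map each node to the set of nodes it connects to directly."""
    graph = {}
    for nodes in flows.values():
        for node in nodes:
            graph.setdefault(node, set())
        for src, dst in zip(nodes, nodes[1:]):
            graph[src].add(dst)
    return graph


def screen_hidden_danger_nodes(graph, records, allowed, threshold):
    """Two-stage screening: permission check first, then access frequency."""
    in_map = [(node, role) for node, role in records if node in graph]
    # Stage 1: a node is suspected if any access violates the permission table.
    suspected = {
        node for node, role in in_map if role not in allowed.get(node, set())
    }
    # Stage 2: keep suspected nodes whose total access count exceeds the threshold.
    access_counts = Counter(node for node, _ in in_map)
    return sorted(node for node in suspected if access_counts[node] > threshold)


graph = build_path_map(FLOWS)
hidden_danger_nodes = screen_hidden_danger_nodes(
    graph, ACCESS_RECORDS, ALLOWED_ROLES, FREQUENCY_THRESHOLD
)
print(hidden_danger_nodes)
```

In this sample data, both "auth" and "database" fail the assumed permission check, but only "database" is accessed more often than the assumed threshold, so it alone reaches the hidden danger node list.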

Description

Data vulnerability analysis method and system based on artificial intelligence

Technical Field

The invention relates to the technical field of data analysis, and in particular to a data vulnerability analysis method and system based on artificial intelligence.

Background

In the prior art, a system log is scanned using rule matching and statistical analysis, operation records with abnormal access permissions are identified, potential security risks are judged by setting an access frequency threshold, and, once abnormal access is detected, the relevant nodes are marked and alarm information is generated.

However, when facing complex data interaction scenarios, the prior art lacks the ability to analyze correlations in complex data. It cannot comprehensively capture the correlation of data between different links, has difficulty restoring the complete circulation path of data from source to destination, and easily overlooks abnormal behavior hidden in a large data network. Because of this limitation, abnormal operations dispersed across nodes are difficult to analyze together, so security hazards produced jointly by multiple nodes cannot be identified, and the specific location and propagation path of a problem are even harder to determine.

In summary, the prior art lacks the ability to track and repair the full link of the data flow, which results in inaccurate vulnerability location in complex network environments.

Disclosure of Invention

The invention provides a data vulnerability analysis method and system based on artificial intelligence, which are used to achieve vulnerability location analysis across the full link of the data flow.
In order to solve the above technical problems, the present invention provides a data vulnerability analysis method based on artificial intelligence, including: acquiring log record data, and performing matching and sorting according to the log record data to obtain a circulation data set; constructing node connection relations according to the circulation data set, and determining a complete path map; acquiring node access records based on the complete path map, and screening potential hidden danger nodes according to the node access records to obtain a hidden danger node list; acquiring the data interaction frequency based on the hidden danger node list, and performing threshold comparison according to the data interaction frequency to obtain a risk association chain catalog; performing data analysis according to the risk association chain catalog to obtain an abnormal operation sequence; performing validity comparison according to the abnormal operation sequence to obtain a preliminary comparison set, and performing threshold comparison according to the preliminary comparison set to obtain a final vulnerability position; and performing repair path simulation based on the final vulnerability position to obtain an optimized circulation track.

In an optional implementation, acquiring the log record data and performing matching and sorting according to the log record data to obtain a circulation data set includes: acquiring log record data, and extracting a timestamp and an event identifier from the log record data to obtain a log data set; matching and sorting the event identifiers according to the log data set to obtain a circulation path set; and screening the circulation path set to obtain missing circulation sequences, and performing supplementary repair on the missing circulation sequences to obtain the circulation data set.
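
The log-processing step described above could be sketched roughly as follows in Python. The log line layout, the use of a flow identifier plus step index as the event identifier, and the "UNKNOWN" placeholder used for supplementary repair are all assumptions made for illustration; the source does not specify how missing circulation sequences are detected or repaired.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log lines: "<ISO timestamp> <flow id> <step index> <node>".
RAW_LOGS = [
    "2026-01-01T10:00:02 flow-a 2 gateway",
    "2026-01-01T10:00:00 flow-a 0 client",
    "2026-01-01T10:00:05 flow-a 3 database",
    "2026-01-01T10:01:00 flow-b 0 client",
    "2026-01-01T10:01:01 flow-b 1 auth",
]


def parse_logs(lines):
    """Extract the timestamp and event identifier fields from each log line."""
    records = []
    for line in lines:
        ts, flow_id, step, node = line.split()
        records.append({
            "ts": datetime.fromisoformat(ts),
            "flow": flow_id,
            "step": int(step),
            "node": node,
        })
    return records


def build_circulation_paths(records):
    """Group records by flow id and order each group by timestamp."""
    paths = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["ts"]):
        paths[rec["flow"]].append(rec)
    return dict(paths)


def find_missing_steps(path):
    """Return step indices absent between step 0 and the highest step seen."""
    seen = {rec["step"] for rec in path}
    return [step for step in range(max(seen) + 1) if step not in seen]


def repair_path(path):
    """Fill each gap with a placeholder record (an assumed repair strategy)."""
    repaired = list(path)
    for step in find_missing_steps(path):
        repaired.append({
            "ts": None,
            "flow": path[0]["flow"],
            "step": step,
            "node": "UNKNOWN",
        })
    return sorted(repaired, key=lambda r: r["step"])


paths = build_circulation_paths(parse_logs(RAW_LOGS))
circulation_data_set = {flow: repair_path(path) for flow, path in paths.items()}
for flow, path in circulation_data_set.items():
    print(flow, [rec["node"] for rec in path])
```

Here flow-a has no record for step 1, so the sketch flags it as a missing circulation sequence and inserts a placeholder; a real system would presumably recover the missing record from another source rather than mark it unknown.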
In an optional embodiment, constructing node connection relations according to the circulation data set and determining a complete path map includes: extracting path nodes based on the circulation data set to obtain preliminary path nodes; constructing connection relations among nodes according to the preliminary path nodes to obtain direct connection paths; and adding auxiliary information to the preliminary path nodes based on the direct connection paths to obtain complete path nodes, and collating the direct connection paths and the complete path nodes to obtain the complete path map.

In an optional implementation, acquiring node access records based on the complete path map and screening potential hidden danger nodes according to the node access records to obtain a hidden danger node list includes: acquiring a node operation log based on the complete path map, extracting node access records from the node operation log, and, if a node access record does not conform to a preset permission standard, marking the node as a suspected hidden danger node, and integrating the marked nodes to obtain a preliminary marking result; and extracting access frequency data for the preliminary marking result, and, if the access frequency data exceeds a preset frequency threshold, marking the node as a potential hidden danger node, and integrating the marked nodes to obtain the hidden danger node list.

In an