CN-121389153-B - Data security assessment method and system based on artificial intelligence

Abstract

The invention relates to the technical field of data security assessment, and in particular to a data security assessment method and system based on artificial intelligence. The method comprises: acquiring security event information and setting an evaluation feature vector; constructing a security assessment architecture based on a deep belief network; determining a mean absolute error value and, if that value is smaller than a preset error threshold, confirming that the correction is effective; generating a security assessment report based on the probability distribution for which the correction is effective; and configuring security protection strategies in combination with the different stages. The method addresses the technical problems that security event information from the different stages of the digital full life cycle is difficult to process effectively, that a security assessment report cannot directly guide the configuration of security protection strategies, and that the security protection strategies are of limited effectiveness. By incorporating a mean absolute error correction mechanism, it keeps the risk assessment consistent with the actual risk distribution, reduces the deviation between the assessment result and the actual risk, improves the effectiveness of the security protection strategies, and safeguards data security across the digital full life cycle.

Inventors

LI GANG
Xia Huanzhao
ZHANG BINWU

Assignees

开元华创科技(集团)有限公司

Dates

Publication Date: 20260512
Application Date: 20251010

Claims (8)

1. A data security assessment method based on artificial intelligence, the method comprising: acquiring security event information of a digital full life cycle, extracting time-series features and semantic sequence features through feature processing, and setting an evaluation feature vector; constructing a security assessment architecture based on a deep belief network, wherein the evaluation feature vector serves as the input-layer neurons and a feature mapping relationship is constructed through multiple layers of restricted Boltzmann machines, the first-layer restricted Boltzmann machine performing a nonlinear transformation on the time-series features and the second-layer restricted Boltzmann machine processing the cross-correlation between the semantic sequence features and the time-series features; determining a mean absolute error value against an actual risk distribution and, when the mean absolute error value is smaller than a preset error threshold, confirming that the correction is effective, which comprises: determining a probability adjustment coefficient through the path rules of a decision tree, adjusting the corresponding probability values in proportion to the feature contribution degree, and keeping the sum of the adjusted probability values corresponding to the data security risk states under each corrected data security risk level equal to 1; determining the mean absolute error value against the actual risk distribution on a verification set; when the mean absolute error value is not smaller than the preset error threshold, performing backtracking analysis on the feature selection and path rules of the corresponding one or more decision trees and re-optimizing the probability adjustment coefficients; and, based on the probability distribution corresponding to the data security risk levels for which the correction is effective, generating a security assessment report containing confidence intervals, key risk causes, and risk evolution trends of the data security risk levels; and, based on that security assessment report, configuring security protection strategies in combination with the different stages of the digital full life cycle.
2. The data security assessment method based on artificial intelligence of claim 1, wherein the feature mapping relationship is constructed through multiple layers of restricted Boltzmann machines, the method further comprising: outputting, by the output layer, the data security risk level using a softmax activation function; and pre-training the deep belief network with a historical security event data set and optimizing the network parameters through a contrastive divergence algorithm, so that the prediction accuracy of the security assessment architecture for the data security risk level on the verification set meets a preset accuracy threshold.
3. The data security assessment method based on artificial intelligence of claim 1, wherein the network parameters are optimized through a contrastive divergence algorithm, the method further comprising: generating a plurality of decision trees based on the security assessment architecture, taking the information gain ratio as the feature splitting criterion, and selecting the key features in the evaluation feature vector that are highly correlated with the probability deviation as the basis for splitting; and performing, by each decision tree, error correction on the probability distribution that the deep belief network outputs for the data security risk levels.
4. The data security assessment method based on artificial intelligence of claim 3, wherein each decision tree takes as its target value the cross-entropy loss between the probability distribution output by the deep belief network and the actual risk labels; and the target value is minimized by gradient descent, iteratively optimizing the splitting thresholds and path weights of the corresponding decision tree.
5. The data security assessment method based on artificial intelligence of claim 1, wherein configuring security protection strategies in combination with the different stages of the digital full life cycle comprises: in the data acquisition stage of the digital full life cycle, configuring a role-based access control strategy according to the permission features of the accessing subject and the historical abnormal access records in the evaluation feature vector, setting a multi-factor authentication mechanism for the acquisition of sensitive data, and limiting the acquisition frequency and the upper limit of the data volume for a single subject; and in the data storage stage of the digital full life cycle, encrypting data at rest according to the storage-medium vulnerability features in the vulnerability scanning results and the encryption strength index in the security event information, configuring an off-site disaster recovery backup scheme, and making the backup frequency positively correlated with the data update frequency.
6. The data security assessment method based on artificial intelligence of claim 5, wherein the method comprises: in the data transmission stage, based on the risk distribution over transmission periods in the time-series features and the transmission protocol vulnerability information in the semantic sequence features, enabling an encrypted transmission protocol, applying fragment encryption and digital signatures to the transmitted data packets, and setting a timeout retransmission mechanism and real-time blocking rules for abnormal transmission behavior; and in the data use stage, deploying a data desensitization mechanism according to the predicted risk evolution trend and the analysis of key risk causes, and masking sensitive fields in scenarios where they are not required.
7. The data security assessment method based on artificial intelligence of claim 6, wherein the security event information includes access logs, abnormal behavior records, vulnerability scanning results, and an encryption strength index; encryption algorithm complexity and key rotation frequency features are extracted from the encryption strength index and quantized into an encryption security coefficient, the encryption security coefficient serving as a correction factor of the probability adjustment coefficient for floating adjustment of the risk probability, and the encryption strength index being used to measure the strength of security protection in the data transmission stage and the data storage stage; and the data transmission link state is cross-validated against the IP address attribution in the access logs, and cross-region abnormal transmission behavior is identified and added as a supplement to the abnormal behavior records.
8. A data security assessment system based on artificial intelligence, for performing the steps of the data security assessment method based on artificial intelligence according to any one of claims 1 to 7, the system comprising: a feature processing module, configured to acquire security event information of the digital full life cycle, extract time-series features and semantic sequence features through feature processing, and set an evaluation feature vector; a feature mapping module, configured to construct a security assessment architecture based on a deep belief network, wherein the evaluation feature vector serves as the input-layer neurons and a feature mapping relationship is constructed through multiple layers of restricted Boltzmann machines, the first-layer restricted Boltzmann machine performing a nonlinear transformation on the time-series features and the second-layer restricted Boltzmann machine processing the cross-correlation between the semantic sequence features and the time-series features; a preset error threshold comparison module, configured to determine a mean absolute error value against the actual risk distribution and to confirm that the correction is effective when the mean absolute error value is smaller than a preset error threshold; and a security protection strategy configuration module, configured to generate a security assessment report based on the probability distribution corresponding to the data security risk levels for which the correction is effective, and to configure security protection strategies in combination with the different stages of the digital full life cycle; wherein the preset error threshold comparison module is further configured to: determine a probability adjustment coefficient through the path rules of a decision tree, adjust the corresponding probability values in proportion to the feature contribution degree, and keep the sum of the adjusted probability values corresponding to the data security risk states under each corrected data security risk level equal to 1; and, when the mean absolute error value is not smaller than the preset error threshold, perform backtracking analysis on the feature selection and path rules of the corresponding one or more decision trees, re-optimize the probability adjustment coefficients, and, based on the probability distribution corresponding to the data security risk levels for which the correction is effective, generate a security assessment report containing confidence intervals, key risk causes, and risk evolution trends of the data security risk levels.
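
The correction loop recited in claims 1 and 8 (scale the probabilities, renormalize them so they sum to 1, then compare the mean absolute error with a threshold) can be illustrated with a minimal Python sketch. The claims do not disclose how the coefficients are derived from the decision-tree path rules, so the coefficient values below are hypothetical inputs, and the function names, the three risk levels, and the threshold of 0.05 are likewise assumptions made for illustration only.

```python
from typing import List, Sequence


def apply_adjustment(probs: Sequence[float], coeffs: Sequence[float]) -> List[float]:
    """Scale each risk-level probability by its coefficient, then renormalize to sum to 1."""
    if len(probs) != len(coeffs):
        raise ValueError("probs and coeffs must have the same length")
    scaled = [p * c for p, c in zip(probs, coeffs)]
    total = sum(scaled)
    if total <= 0.0:
        raise ValueError("adjusted probabilities must have a positive sum")
    return [s / total for s in scaled]


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of the absolute differences between two distributions."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("distributions must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def correction_is_effective(corrected: Sequence[float],
                            actual: Sequence[float],
                            threshold: float) -> bool:
    """The correction counts as effective only if the MAE is strictly below the threshold."""
    return mean_absolute_error(corrected, actual) < threshold


# Hypothetical values: network output for three risk levels (low, medium, high),
# coefficients assumed to come from decision-tree path rules, and the
# distribution observed on a verification set.
dbn_probs = [0.5, 0.3, 0.2]
coeffs = [0.8, 1.0, 1.5]
actual = [0.4, 0.35, 0.25]

corrected = apply_adjustment(dbn_probs, coeffs)
# When this flag is True, the claims call for backtracking over the trees'
# feature selection and path rules and re-optimizing the coefficients.
needs_backtracking = not correction_is_effective(corrected, actual, threshold=0.05)
```

In this sketch the renormalization step is what keeps the corrected values a valid probability distribution; the backtracking step itself is not shown because the claims do not specify its procedure.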

Description

Data security assessment method and system based on artificial intelligence

Technical Field

The invention relates to the technical field of data security assessment, and in particular to a data security assessment method and system based on artificial intelligence.

Background

The full data life cycle covers multiple stages, including acquisition, storage, transmission, and use. As a key step in protecting data across that life cycle, data security assessment makes it possible to identify potential risks in time, quantify the security situation, and provide a basis for formulating targeted protection strategies. Current data security assessment methods mostly rely on single-feature analysis and struggle to process the complex and varied security event information of the digital full life cycle, so the assessment result deviates substantially from the actual risk distribution. In addition, assessment reports can hardly guide the configuration of security protection strategies for the different stages directly, so risk response lags and the fine-grained requirements of security protection across the full data life cycle cannot be met. In summary, in the prior art it is difficult to process the security event information of the different stages of the digital full life cycle effectively, and a security assessment report cannot directly guide the configuration of security protection strategies, so the security protection strategies are of limited effectiveness.

Disclosure of Invention

The application provides a data security assessment method and system based on artificial intelligence, which aim to solve the technical problems in the prior art that security event information from the different stages of the digital full life cycle is difficult to process effectively, that a security assessment report cannot directly guide the configuration of security protection strategies, and that the security protection strategies are of limited effectiveness.
In view of the above problems, the technical scheme of the application is as follows: The application provides a data security assessment method based on artificial intelligence, which comprises: acquiring security event information of a digital full life cycle, extracting time-series features and semantic sequence features through feature processing, and setting an evaluation feature vector; constructing a security assessment architecture based on a deep belief network, wherein the evaluation feature vector serves as the input-layer neurons and a feature mapping relationship is constructed through multiple layers of restricted Boltzmann machines, the first-layer restricted Boltzmann machine performing a nonlinear transformation on the time-series features and the second-layer restricted Boltzmann machine processing the cross-correlation between the semantic sequence features and the time-series features; determining a mean absolute error value against the actual risk distribution and, when the mean absolute error value is smaller than a preset error threshold, confirming that the correction is effective; and generating a security assessment report based on the probability distribution corresponding to the data security risk levels for which the correction is effective, and configuring security protection strategies in combination with the different stages of the digital full life cycle. The output layer outputs the data security risk level using a softmax activation function; the deep belief network is pre-trained with a historical security event data set, and the network parameters are optimized through a contrastive divergence algorithm, so that the prediction accuracy of the security assessment architecture for the data security risk level on the verification set meets a preset accuracy threshold.
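
The layer-wise pre-training and softmax output described above can be sketched as follows. This is a minimal pure-Python illustration, not the disclosed implementation: the class name `TinyRBM`, the layer sizes, the learning rate, the toy binary features, and the way the second layer receives the first layer's activations concatenated with the semantic features are all assumptions, and the final softmax is applied to untrained placeholder logits. Each update is a single step of contrastive divergence (CD-1).

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRBM:
    """A small restricted Boltzmann machine trained with one-step contrastive divergence."""

    def __init__(self, n_visible: int, n_hidden: int, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.w = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v: List[float]) -> List[float]:
        return [sigmoid(self.b_h[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_h))]

    def visible_probs(self, h: List[float]) -> List[float]:
        return [sigmoid(self.b_v[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_v))]

    def cd1_update(self, v0: List[float], lr: float = 0.1) -> None:
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: sample hidden states, reconstruct, re-infer hidden.
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Move parameters toward the data statistics, away from the reconstruction.
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])


# Toy, hypothetical feature vectors (not from the source).
temporal = [[1.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
semantic = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

# Layer 1 transforms the time-series features.
rbm1 = TinyRBM(n_visible=4, n_hidden=3, seed=1)
for _ in range(50):
    for t in temporal:
        rbm1.cd1_update(t)

# Layer 2 (assumed wiring) sees layer-1 activations joined with semantic features.
rbm2 = TinyRBM(n_visible=3 + 2, n_hidden=3, seed=2)
for _ in range(50):
    for t, s in zip(temporal, semantic):
        rbm2.cd1_update(rbm1.hidden_probs(t) + s)

# Placeholder output: top-layer activations used as logits for three risk levels.
top = rbm2.hidden_probs(rbm1.hidden_probs(temporal[0]) + semantic[0])
risk_probs = softmax(top)
```

A real system would train a supervised output layer on labeled risk levels and check its accuracy on a verification set; the sketch stops at unsupervised pre-training, which is the part the paragraph attributes to contrastive divergence.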
Preferably, a probability adjustment coefficient is determined through the path rules of a decision tree, the corresponding probability values are adjusted in proportion to the feature contribution degree, the sum of the adjusted probability values corresponding to the data security risk states under each corrected data security risk level is kept equal to 1, and the mean absolute error value against the actual risk distribution is determined on the verification set. Preferably, when the mean absolute error value is not smaller than the preset error threshold, backtracking analysis is performed on the feature selection and path rules of the corresponding one or more decision trees, the probability adjustment coefficients are re-optimized, and a security assessment report containing confidence intervals, key risk causes, and risk evolution trends of the data security risk levels is generated based on the probability distribution corresponding to the data security risk levels for which the correction is effective. Preferably, based on the security assessment architecture, a plurality of decision trees are generated, an information gain ratio is used as a feature splitting crite