CN-122020700-A - Data isolation and privacy protection method and system based on data security large model

Abstract

The invention discloses a data isolation and privacy protection method and system based on a data security large model, and relates to the technical field of data isolation and privacy protection. The method includes step S1: collecting data to be processed and performing cleaning, deduplication and format standardization on the data to obtain standardized data. By introducing a pre-trained data security large model, the invention achieves intelligent division and dynamic adjustment of data security levels, overcoming the limitation of the traditional fixed isolation mode. Physical isolation, logical isolation or port isolation strategies can be matched to data of different security levels according to characteristics such as data sensitivity, usage scenario and circulation range, which improves the flexibility and accuracy of data isolation. Combined with the privacy risk assessment result output by the large model, homomorphic encryption, differential privacy or data desensitization is applied selectively, so that the usability of the data in different scenarios is preserved to the greatest extent.
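The matching of security levels to controls described above can be sketched as a simple lookup. This is a minimal illustration, not the patented implementation: the level-to-strategy and level-to-technique pairings follow the claims of this publication, while the names `SecurityLevel`, `ISOLATION_STRATEGY`, `PRIVACY_TECHNIQUE` and `select_controls` are hypothetical. The claims also condition the privacy technique on the privacy risk value R, but the thresholds are not reproduced in this text, so that condition is deliberately left out.

```python
from enum import Enum


class SecurityLevel(Enum):
    """The three security levels named in the publication."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Level-to-isolation pairing as stated in claim 4.
ISOLATION_STRATEGY = {
    SecurityLevel.HIGH: "physical isolation",
    SecurityLevel.MEDIUM: "logical isolation",
    SecurityLevel.LOW: "port isolation",
}

# Level-to-technique pairing as stated in claim 5.
# Assumption: the additional condition on the privacy risk value R is
# omitted here because its thresholds are not reproduced in the text.
PRIVACY_TECHNIQUE = {
    SecurityLevel.HIGH: "homomorphic encryption",
    SecurityLevel.MEDIUM: "differential privacy",
    SecurityLevel.LOW: "data desensitization",
}


def select_controls(level: SecurityLevel) -> tuple[str, str]:
    """Return the (isolation strategy, privacy technique) pair for a level."""
    return ISOLATION_STRATEGY[level], PRIVACY_TECHNIQUE[level]
```

For example, `select_controls(SecurityLevel.MEDIUM)` returns the logical isolation and differential privacy pair. A table-driven lookup keeps the rules easy to swap when the level of a data item is re-evaluated.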

Inventors

  • WANG FAPENG
  • FANG ANKANG
  • SUN XIANG
  • LIU YINGYING

Assignees

  • Nanjing Advanced Computing Industry Development Co., Ltd. (南京先进计算产业发展有限公司)

Dates

Publication Date
2026-05-12
Application Date
2025-12-11

Claims (10)

  1. A data isolation and privacy protection method based on a data security large model, characterized by comprising the following steps: S1, collecting data to be processed, and performing cleaning, deduplication and format standardization on the data to obtain standardized data; S2, inputting the standardized data into a pre-trained data security large model, wherein the data security large model extracts security features of the data, the security features comprising a data sensitivity, a data usage scenario weight and a data flow range weight, calculating a data security level score L according to formula (1), and dividing the data into three security levels of high, medium and low according to the score L; formula (1): [formula not reproduced]; S3, configuring a corresponding isolation strategy according to the security level divided in step S2; S4, outputting a data privacy risk assessment result by the data security large model, calculating a privacy risk value R by formula (2), and processing the data with a corresponding privacy protection technology in combination with the data security level; formula (2): [formula not reproduced]; and S5, monitoring in real time the circulation, access and usage of the isolated data.
  2. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S1 the data cleaning comprises invalid data elimination and error correction, and the format standardization unifies the data formats according to preset XML and JSON data format specifications.
  3. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S2 the training process of the data security large model comprises collecting massive data security samples, the samples comprising historical security event data and data characteristic data, training on the samples with a deep learning algorithm, calculating a model loss value Loss through formula (3), and iteratively optimizing the model parameters until Loss converges; formula (3): [formula not reproduced]; wherein the terms of formula (3), whose symbols are not reproduced, are the true label of each sample, the model's predicted label for that sample, and the sample index.
  4. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S3 the isolation policy configuration rule is: high-security-level data adopts physical isolation, being stored on an independent physical server that is forbidden from connecting to an external network; medium-security-level data adopts logical isolation, with the data access range limited through a virtual private network and an access control list; and low-security-level data adopts port isolation, with the data restricted to transmission only through preset TCP/UDP ports.
  5. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S4 the privacy protection technology configuration rule is: when the data is of high security level and the privacy risk value R meets the corresponding threshold condition [condition not reproduced], a homomorphic encryption technology is adopted, and the data is encrypted through formula (4) to obtain a ciphertext; formula (4): [formula not reproduced]; wherein the parameters of formula (4) are a preset generator, a random number and a large prime number; when the data is of medium security level and R meets the corresponding threshold condition [condition not reproduced], a differential privacy technology is adopted, and Laplace noise is added to the data set through formula (5); formula (5): [formula not reproduced]; wherein the terms of formula (5) are the raw data, the data after noise addition, the sensitivity of the function and the privacy budget; and when the data is of low security level and R meets the corresponding threshold condition [condition not reproduced], a data desensitization technology is adopted to remove the sensitive fields of identity card number and mobile phone number from the data.
  6. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S5, when a security risk is detected, the security level, isolation policy and privacy protection technology of the data are automatically adjusted; the security risk detection analyzes the data access log in real time through the data security large model; and when risk events such as unauthorized access or abnormal data circulation are detected, a security level adjustment is triggered, the data security level score L is recalculated, and the isolation policy and privacy protection technology are updated.
  7. The data isolation and privacy protection method based on a data security large model of claim 3, wherein the deep learning algorithm adopts a hybrid model of a Transformer and a CNN-LSTM, a batch gradient descent method is used to optimize the parameters during model training, the learning rate is set to 0.001-0.01, and the number of iterations is set to 100-200.
  8. The data isolation and privacy protection method based on a data security large model of claim 5, wherein the homomorphic encryption technology adopts a partially homomorphic encryption or fully homomorphic encryption algorithm, the privacy budget in the differential privacy technology takes a value in the range 0.1-1.0, and the scale parameter of the Laplace noise is [expression not reproduced].
  9. The data isolation and privacy protection method based on a data security large model of claim 1, wherein in step S5 the indexes monitored in real time include the data access IP address, access time, data transmission volume and data operation type, and a security risk event is determined when an index exceeds a preset threshold.
  10. The data isolation and privacy protection system based on a data security large model according to any one of claims 1 to 9, characterized by comprising a data preprocessing module, a data security large model module, a dynamic isolation module, a privacy protection module and a monitoring and adjusting module; the data preprocessing module is used for collecting data to be processed, executing cleaning, deduplication and format standardization operations, and outputting standardized data; the data security large model module is obtained by training on massive data security samples in advance, has built-in calculation logic corresponding to formulas (1), (2) and (3), and is used for extracting security features of the standardized data, calculating the data security level score L and the privacy risk value R, dividing security levels, evaluating privacy risks, and monitoring the security state of the data in real time; the dynamic isolation module is in communication connection with the data security large model module, receives the security level result output by that module, and executes physical isolation, logical isolation or port isolation according to the isolation policy configuration rule; the privacy protection module is in communication connection with the data security large model module, receives the security level and the privacy risk value R output by that module, executes homomorphic encryption, differential privacy or data desensitization according to the privacy protection technology configuration rule, and has built-in encryption and noise-adding logic corresponding to formulas (4) and (5); the monitoring and adjusting module is in communication connection with the data security large model module, the dynamic isolation module and the privacy protection module respectively, receives the monitoring data output by the data security large model module, and, when a security risk is detected, triggers the dynamic isolation module to adjust the isolation strategy and the privacy protection module to adjust the privacy protection technology, while notifying the data security large model module to recalculate the data security level score L and the privacy risk value R.
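The differential privacy branch of claims 5 and 8 can be illustrated with the standard Laplace mechanism. This is a sketch under stated assumptions: formula (5) and the scale expression of claim 8 are not reproduced in this text, so the code uses the textbook form (noisy value = raw value + Laplace noise with scale equal to sensitivity divided by privacy budget), which may or may not match the patent's exact formula. The function names and the `rng` parameter are hypothetical.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale of the Laplace noise in the textbook mechanism.

    Assumption: scale = sensitivity / epsilon; the publication's own
    expression is not reproduced in the source text.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def add_laplace_noise(values, sensitivity, epsilon, rng=None):
    """Return a noisy copy of `values` with independent Laplace noise added."""
    rng = rng or random.Random()
    rate = 1.0 / laplace_scale(sensitivity, epsilon)
    # The difference of two independent exponential draws with the same
    # rate follows a zero-mean Laplace distribution with scale 1 / rate.
    return [v + rng.expovariate(rate) - rng.expovariate(rate) for v in values]
```

Usage: `add_laplace_noise([10.0, 20.0], sensitivity=1.0, epsilon=0.5)` perturbs each value with noise of scale 2.0. Within the 0.1-1.0 budget range given in claim 8, a smaller budget gives a larger scale, and therefore stronger privacy at the cost of accuracy.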

Description

Data isolation and privacy protection method and system based on data security large model

Technical Field

The invention belongs to the technical field of data isolation and privacy protection, and particularly relates to a data isolation and privacy protection method and system based on a data security large model.

Background

With the in-depth development of the digital economy, data has become a core production factor and is widely applied in key fields such as finance, healthcare and government affairs. As the value of data is mined, data is circulated, shared and used across scenarios far more frequently, which creates a firm demand for secure data isolation and protection of private information. Data must not be accessed or tampered with without authorization during interactions among multiple parties and across multiple links, and sensitive information such as user identity numbers, transaction records and medical records must not be leaked. How to balance the availability and the security of data has become a core challenge in the digital transformation of every industry.
The prior art has the following technical defects. First, data isolation strategies lack dynamic adaptability: security level division depends on manually preset rules, no intelligent model is introduced to comprehensively evaluate dynamic characteristics such as the data usage scenario and circulation range, and data is graded only by fixed thresholds, so that high-level data is over-isolated and resources are wasted while low-level data is under-isolated and vulnerable to attack. Second, privacy protection technology is disconnected from the security state of the data: there is no mechanism for quantitatively evaluating privacy leakage probability and loss value, and a uniform protection technology is applied to data of different security levels, so that high-risk data is insufficiently protected and easy to crack while low-risk data is over-protected and loses usability. Third, the isolation and privacy protection modules are separate, with no data interaction or linked adjustment, and no real-time monitoring and dynamic feedback mechanism is designed, so the usage state of data after isolation cannot be tracked and strategies are difficult to adjust in time when the security state of the data changes. Fourth, model training is disconnected from actual security requirements: even where simple machine learning techniques are introduced, the models are not trained on real security event data, so security features are difficult to extract accurately and model parameters are not iteratively optimized. Therefore, we provide a data isolation and privacy protection method and system based on a data security large model to solve the above problems.
Disclosure of Invention

The invention aims to provide a data isolation and privacy protection method and system based on a data security large model, which solve the prior-art problems that data isolation lacks dynamic adaptation and is insufficiently accurate, that privacy protection is disconnected from the security state, that discrete modules do not cooperate so that full-life-cycle security management and control is missing, and that model training is divorced from actual requirements.

In order to solve the technical problems, the invention is realized by the following technical scheme. The invention relates to a data isolation and privacy protection method based on a data security large model, which comprises the following steps: S1, collecting data to be processed, and performing cleaning, deduplication and format standardization on the data to obtain standardized data; S2, inputting the standardized data into a pre-trained data security large model, wherein the data security large model extracts security features of the data, the security features comprising a data sensitivity, a data usage scenario weight and a data flow range weight, calculating a data security level score L according to formula (1), and dividing the data into three security levels of high, medium and low according to the score L; formula (1): [formula not reproduced]; S3, configuring a corresponding isolation strategy according to the security level divided in step S2; S4, outputting a data privacy risk assessment result by the data security large model, calculating a privacy risk value R by formula (2), and processing the data with a corresponding privacy protection technology in combination with the data security level; formula (2): [formula not reproduced]; and S5, monitoring in real time the circulation, access and usage of the isolated data. The invention further provides that in the step S1, the data cleaning inc