CN-122027205-A - Data sharing security decision method and system applying artificial intelligence


Abstract

The invention provides a data sharing security decision method and system applying artificial intelligence. Data sharing task description information and an associated talent original data set are received and input into a context-aware mapping network to generate a task data association strength mapping table. A data protection policy generator matches the mapping table against a preset basic security tag set and performs conflict coordination between data sharing requirements and security constraints, generating a fine-grained data protection guidance document. A structured desensitization conversion operation is performed on the corresponding data units in the talent original data set to obtain an intermediate data form, and a matched access verification rule set and transmission encryption rule set are generated from the access right rules and transmission encryption rules. The intermediate data form, the access verification rule set and the transmission encryption rule set are deployed to a controlled collaborative computing environment for joint data sharing analysis, in which permission verification is performed on data access behavior and encryption compliance checks are performed on data transmission; a sharing analysis result is output after the joint analysis operation is completed. The invention helps ensure the compliance and reliability of the shared analysis result.

Inventors

CAI XIAOHANG
TIAN ZHIQIANG
LONG QUANBO
DONG FEIYAN

Assignees

贵州大数据人才开发有限公司

Dates

Publication Date: 2026-05-12
Application Date: 2025-12-30

Claims (10)

1. A data sharing security decision method applying artificial intelligence, the method comprising: receiving data sharing task description information and an associated talent original data set; inputting the data sharing task description information and the talent original data set into a preset context-aware mapping network, performing intent recognition on the task description information through a semantic parsing layer of the context-aware mapping network, performing entity relationship extraction on data units in the talent original data set through an entity association layer, and generating a task data association strength mapping table by combining the intent recognition result and the entity relationship extraction result; invoking a preconfigured data protection policy generator, performing policy matching between the task data association strength mapping table and a preset basic security tag set, performing conflict coordination between data sharing requirements and security constraints through a multi-objective weighing layer of the policy generator, and generating a fine-grained data protection guidance document for each data unit in the talent original data set, wherein the document comprises a data desensitization rule, an access right rule and a transmission encryption rule; performing, according to the data desensitization rule in the fine-grained data protection guidance document, a structured desensitization conversion operation on the corresponding data units in the talent original data set to obtain an intermediate data form meeting the security requirement, and generating a matched access verification rule set and a matched transmission encryption rule set according to the access right rule and the transmission encryption rule; and deploying the intermediate data form, the access verification rule set and the transmission encryption rule set to a preset controlled collaborative computing environment and performing joint data sharing analysis, wherein a rule verification engine built into the environment invokes the access verification rule set in real time to perform permission verification on data access behavior and invokes the transmission encryption rule set to perform encryption compliance checks on the data transmission process, and outputting, after the joint analysis operation is completed, a sharing analysis result meeting the requirements of the data sharing task description information.
2. The method according to claim 1, wherein inputting the data sharing task description information and the talent original data set into the preset context-aware mapping network, performing intent recognition on the task description information through the semantic parsing layer, performing entity relationship extraction on the data units in the talent original data set through the entity association layer, and generating the task data association strength mapping table by combining the intent recognition result and the entity relationship extraction result comprises: inputting the data sharing task description information into the semantic parsing layer of the context-aware mapping network, performing word segmentation on the data usage scene description in the task description information to obtain a plurality of scene description word units, performing part-of-speech tagging on the scene description word units, and selecting the noun word units and verb word units as key scene elements; performing dependency syntactic analysis on the data requirement type description, extracting core requirement entity words and requirement modifier words, and constructing requirement entity-modifier association pairs; inputting the key scene elements and the requirement entity-modifier association pairs into an intent classifier of the semantic parsing layer, and performing sequence labeling through a bidirectional LSTM network in the intent classifier to generate a task intent tag sequence; inputting the talent original data set into the entity association layer of the context-aware mapping network, performing named entity recognition on text data in the data units to extract data entity names and entity types, and performing entity attribute extraction on numerical data to obtain an entity attribute key-value pair set; performing entity relationship triple extraction on the entity names, entity types and entity attribute key-value pair set through a relation extraction model of the entity association layer to generate an entity relationship triple set, each triple containing a head entity, a relation type and a tail entity; and inputting the task intent tag sequence and the entity relationship triple set into a fusion calculation layer of the context-aware mapping network, calculating a semantic similarity value between each intent tag and each entity relationship triple, and constructing the task data association strength mapping table based on the similarity values, wherein the row dimension of the mapping table is the intent tags in the task intent tag sequence, the column dimension is the triple identifiers in the entity relationship triple set, and each element is the corresponding semantic similarity value.
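As an illustration of the mapping table described in claim 2, the following minimal Python sketch builds a table whose rows are intent tags, whose columns are triple identifiers, and whose elements are similarity values. The claim does not specify the similarity measure or how tags and triples are embedded; cosine similarity over pre-computed embedding vectors is an assumption made here, and the names `cosine_similarity` and `build_mapping_table` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_mapping_table(intent_vectors, triple_vectors):
    """Return {intent_tag: {triple_id: similarity}}.

    intent_vectors: dict mapping intent tag -> embedding (assumed given)
    triple_vectors: dict mapping triple identifier -> embedding (assumed given)
    """
    return {
        tag: {
            triple_id: cosine_similarity(tag_vec, triple_vec)
            for triple_id, triple_vec in triple_vectors.items()
        }
        for tag, tag_vec in intent_vectors.items()
    }


if __name__ == "__main__":
    # Toy two-dimensional embeddings, invented for illustration only.
    intents = {"QUERY_SKILL": [1.0, 0.0], "QUERY_EMPLOYER": [0.0, 1.0]}
    triples = {"t1": [0.9, 0.1], "t2": [0.2, 0.8]}
    for tag, row in build_mapping_table(intents, triples).items():
        print(tag, {k: round(v, 3) for k, v in row.items()})
```

A nested dictionary keeps the row and column identifiers explicit; a dense matrix would serve equally well when the tag and triple sets are large.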
3. The method of claim 2, wherein inputting the key scene elements and the requirement entity-modifier association pairs into the intent classifier of the semantic parsing layer, and performing sequence labeling through the bidirectional LSTM network in the intent classifier to generate the task intent tag sequence comprises: performing word vector conversion on the noun word units and verb word units in the key scene elements to generate a scene element word vector sequence, and performing word vector concatenation on the entity words and modifier words in the requirement entity-modifier association pairs to generate a requirement association pair word vector sequence; inputting the scene element word vector sequence and the requirement association pair word vector sequence into an input layer of the bidirectional LSTM network, extracting time-sequence features of the word vector sequences from left to right through a forward LSTM unit to obtain a forward time-sequence feature vector, and extracting time-sequence features of the word vector sequences from right to left through a backward LSTM unit to obtain a backward time-sequence feature vector; concatenating the forward time-sequence feature vector and the backward time-sequence feature vector along the feature dimension to generate a bidirectional fusion feature vector, inputting the bidirectional fusion feature vector into a fully connected layer of the classifier, and performing a nonlinear transformation through an activation function to obtain an intent classification feature vector; invoking a CRF layer of the intent classifier to perform sequence labeling decoding on the intent classification feature vector, calculating a tag transition probability matrix, and searching for the tag path with the maximum probability based on the Viterbi algorithm to generate an initial intent tag sequence; and performing a post-processing operation on the initial intent tag sequence to remove redundant tags and conflicting tags, obtaining an optimized task intent tag sequence.
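The Viterbi decoding step named in claim 3 can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: per-position tag scores (standing in for the classifier output) and tag transition scores are assumed to be given in log space, and the function name `viterbi_decode` and the toy tags are hypothetical. It does not reproduce the BiLSTM or the training of the CRF layer.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag path.

    emissions:   list of dicts, one per position, mapping tag -> log score
    transitions: dict mapping (previous_tag, current_tag) -> log score
    tags:        list of all tag names
    """
    if not emissions:
        return []
    # Best score of any path ending in each tag at position 0.
    score = {tag: emissions[0][tag] for tag in tags}
    backpointers = []
    for step in emissions[1:]:
        new_score, pointer = {}, {}
        for cur in tags:
            # Best predecessor for the current tag.
            prev = max(tags, key=lambda p: score[p] + transitions[(p, cur)])
            new_score[cur] = score[prev] + transitions[(prev, cur)] + step[cur]
            pointer[cur] = prev
        score = new_score
        backpointers.append(pointer)
    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    tags = ["A", "B"]
    emissions = [{"A": 0.0, "B": -1.0}, {"A": -1.0, "B": 0.0}, {"A": -1.0, "B": 0.0}]
    free = {(p, c): 0.0 for p in tags for c in tags}
    print(viterbi_decode(emissions, free, tags))
    # Penalising the A -> B transition changes the best path.
    penalised = dict(free)
    penalised[("A", "B")] = -10.0
    print(viterbi_decode(emissions, penalised, tags))
```

The second call shows why the transition matrix matters: a strongly penalised transition can override the locally best tag at an individual position.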
4. The method of claim 2, wherein performing entity relationship triple extraction on the entity names, entity types and entity attribute key-value pair set through the relation extraction model of the entity association layer to generate the entity relationship triple set containing a head entity, a relation type and a tail entity comprises: inputting the entity names, entity types and entity attribute key-value pair set into a feature construction layer of the relation extraction model, performing character-level feature extraction on each entity name to generate an entity name character feature vector, performing one-hot encoding on each entity type to generate an entity type encoding vector, and performing word vector conversion on the keys and values in the entity attribute key-value pair set respectively to generate attribute key word vectors and attribute value word vectors; performing feature fusion on the entity name character feature vector, the entity type encoding vector, the attribute key word vectors and the attribute value word vectors to generate an entity comprehensive feature vector, inputting the entity comprehensive feature vector into a Bi-GRU layer of the relation extraction model, and modeling time-sequence dependencies of the feature vector through gated recurrent units to obtain entity time-sequence feature vectors; invoking an attention mechanism layer of the relation extraction model to calculate attention scores over the entity time-sequence feature vectors, generating an entity attribute attention weight vector, and performing weighted summation of the entity time-sequence feature vectors based on the weight vector to obtain an attribute-association-enhanced entity feature vector; inputting the attribute-association-enhanced entity feature vector into a convolution layer of the relation extraction model, performing local feature extraction on the feature vector through multi-scale convolution kernels to obtain multi-scale entity relationship feature maps, and performing global max pooling on the multi-scale entity relationship feature maps to generate an entity relationship feature vector; and inputting the entity relationship feature vector into a classification layer of the relation extraction model, calculating a probability value for each relation type through a softmax function, selecting the relation type with the highest probability value as the target relation type between the entities, constructing an entity relationship triple containing the head entity name, the target relation type and the tail entity name, and combining a plurality of entity relationship triples to form the entity relationship triple set.
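The final classification step of claim 4 (softmax over relation types, then selecting the most probable type to form a triple) can be sketched as follows. The logits are assumed to come from the upstream layers, which are not reproduced here; the function names, relation type names and example entities are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_triple(head, tail, logits, relation_types):
    """Pick the most probable relation type and return ((head, relation, tail), probability)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return (head, relation_types[best], tail), probs[best]


if __name__ == "__main__":
    relation_types = ["works_at", "has_skill", "graduated_from"]  # illustrative only
    triple, prob = build_triple("Zhang San", "Python", [0.1, 2.0, -1.0], relation_types)
    print(triple, round(prob, 3))
```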
5. The method of claim 1, wherein the invoking the preconfigured data protection policy generator performs policy matching on the task data association strength mapping table and the preset basic security tag set, and the conflict coordination is performed on the data sharing requirement and the security constraint condition through the multi-objective weighing layer of the policy generator, so as to generate the fine-grained data protection guidance document for each data unit in the talent original data set, including: invoking a strategy input layer of a data protection strategy generator, and receiving a task data association strength mapping table and a preset basic security tag set, wherein the basic security tag set comprises a data sensitivity tag, a data access level tag and a data transmission security tag; according to the affiliated relation between the entity relation triplet and the data units in the talent original data set, the association strength normalization value corresponding to each triplet is aggregated to the data units affiliated to the triplet to generate the comprehensive association strength value of each data unit; respectively carrying out tag matching on the high-association data unit, the medium-association data unit and the low-association data unit with data sensitivity tags in the basic security tag set to generate initial security tag combinations of the data units; inputting the initial security tag combination into a multi-target weighing layer of a data protection strategy generator, extracting data use frequency parameters, data processing aging parameters and data precision requirement parameters in data sharing requirements through a constraint condition analysis module in the layer, and extracting data desensitization level parameters, access right control parameters and transmission encryption strength parameters in security constraint conditions; Performing conflict detection on the data use frequency parameter and the 
data desensitization level parameter by a conflict coordination algorithm of the multi-target weighing layer, performing conflict detection on the data processing aging parameter and the access right control parameter, performing conflict detection on the data precision requirement parameter and the transmission encryption strength parameter, and generating a conflict coordination result; And adjusting tag parameters in the initial security tag combination according to the conflict coordination result, generating an optimized target security tag combination, matching a corresponding protection policy from a preset policy rule base based on the target security tag combination, and generating a fine-grained data protection guide document containing data desensitization rules, access right rules and transmission encryption rules.
6. The method according to claim 5, wherein the conflict coordination algorithm for the multi-objective balancing layer performs conflict detection on the data usage frequency parameter and the data desensitization level parameter, performs conflict detection on the data processing aging parameter and the access right control parameter, performs conflict detection on the data precision requirement parameter and the transmission encryption strength parameter, and generates a conflict coordination result, and the method comprises: Inputting data using frequency parameters of the divided high-association data unit, medium-association data unit and low-association data unit into a first conflict detection module of a conflict coordination algorithm respectively, acquiring a desensitization frequency threshold interval corresponding to the association intensity level of the data unit, judging whether the data using frequency parameters exceed the upper limit value or are lower than the lower limit value of the corresponding desensitization frequency threshold interval, if yes, judging that the data using frequency parameters of the data unit and the data desensitization level parameters have first conflicts, and generating a first conflict type identifier and a conflict degree quantization value; Inputting the data processing aging parameters into a second conflict detection module of the conflict coordination algorithm, acquiring a preset authority verification aging threshold, calculating the absolute value of the difference between the data processing aging parameters and the authority verification aging threshold, and if the absolute value of the difference exceeds a preset aging tolerance range, judging that the data processing aging parameters and the access authority control parameters have second conflict, and generating a second conflict type identifier and a conflict degree quantized value; inputting the data precision requirement parameter into a third 
conflict detection module of a conflict coordination algorithm, acquiring a preset encryption precision loss threshold value, calculating a theoretical precision loss value after transmission encryption processing based on the data precision requirement parameter, and if the theoretical precision loss value exceeds the encryption precision loss threshold value, judging that a third conflict exists between the data precision requirement parameter and the transmission encryption strength parameter, and generating a third conflict type identifier and a conflict degree quantized value; inputting the first conflict type identifier, the second conflict type identifier, the third conflict type identifier and the corresponding conflict degree quantized value into a priority ordering module of a conflict coordination algorithm, and carrying out priority ordering on each conflict type according to a preset conflict priority rule to generate a conflict priority sequence; Based on the conflict priority sequence and the conflict degree quantification value, a conflict resolution strategy library of a conflict coordination algorithm is called, corresponding conflict resolution rules are matched, and a conflict coordination result comprising conflict types, conflict priorities, conflict resolution rules and rule application sequences is generated.
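The three conflict checks and the priority ordering in claim 6 can be sketched as below. Several details are assumptions rather than part of the claim: the conflict degree is quantified as the amount by which a bound is exceeded, the theoretical precision loss is taken as an input because the claim does not give its formula, and all names (`Conflict`, `detect_*`, `prioritize`) and numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conflict:
    kind: str      # "first", "second" or "third" conflict type identifier
    degree: float  # conflict degree quantization value (assumed: excess over the bound)


def detect_frequency_conflict(use_frequency, threshold_interval) -> Optional[Conflict]:
    """First conflict: usage frequency falls outside the desensitization frequency interval."""
    lower, upper = threshold_interval
    if use_frequency > upper:
        return Conflict("first", use_frequency - upper)
    if use_frequency < lower:
        return Conflict("first", lower - use_frequency)
    return None


def detect_aging_conflict(processing_aging, verification_threshold, tolerance) -> Optional[Conflict]:
    """Second conflict: |processing aging - verification aging threshold| exceeds the tolerance."""
    diff = abs(processing_aging - verification_threshold)
    return Conflict("second", diff - tolerance) if diff > tolerance else None


def detect_precision_conflict(theoretical_loss, loss_threshold) -> Optional[Conflict]:
    """Third conflict: theoretical precision loss after encryption exceeds the threshold."""
    if theoretical_loss > loss_threshold:
        return Conflict("third", theoretical_loss - loss_threshold)
    return None


def prioritize(conflicts, priority_rule):
    """Order detected conflicts by preset priority (lower value first), then by degree."""
    found = [c for c in conflicts if c is not None]
    return sorted(found, key=lambda c: (priority_rule[c.kind], -c.degree))


if __name__ == "__main__":
    detected = [
        detect_frequency_conflict(120, (10, 100)),
        detect_aging_conflict(5.0, 2.0, 1.0),
        detect_precision_conflict(0.01, 0.05),
    ]
    order = prioritize(detected, {"first": 1, "second": 0, "third": 2})
    print([(c.kind, c.degree) for c in order])
```

The ordered list would then be matched against a conflict resolution policy library, which is outside the scope of this sketch.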
7. The method of claim 5, wherein the adjusting tag parameters in the initial security tag combination according to the conflict coordination result to generate an optimized target security tag combination, matching a corresponding protection policy from a preset policy rule base based on the target security tag combination, and generating a fine-grained data protection guidance document including a data desensitization rule, an access right rule, and a transmission encryption rule, includes: resolving conflict release rules and rule application sequences in conflict coordination results, sequentially extracting tag adjustment parameters corresponding to each conflict type according to the rule application sequences, wherein the first conflict type corresponds to the data desensitization level adjustment parameters, the second conflict type corresponds to the access right control adjustment parameters, and the third conflict type corresponds to the transmission encryption strength adjustment parameters; For each data unit with conflict, superposing the corresponding data desensitization level adjustment parameter to the data desensitization level parameter in the initial security tag combination of the data unit to generate an adjusted target data desensitization level parameter; the corresponding access right control adjustment parameters are overlapped to the access right control parameters in the initial security tag combination of the data unit, and target access right control parameters are generated; the corresponding transmission encryption intensity adjustment parameters are overlapped to the transmission encryption intensity parameters in the initial security tag combination of the data unit, and a target transmission encryption intensity parameter is generated; Combining the target data desensitization level parameter, the target access authority control parameter and the target transmission encryption strength parameter to form an optimized target 
security tag combination, and associating a unique target security tag combination for each data unit; Loading a protection strategy template matched with the target security tag combination from a preset strategy rule library, wherein the strategy rule library comprises a plurality of templates corresponding to different security tag combinations, and each template comprises a preset data desensitization rule template, an access right rule template and a transmission encryption rule template; adjusting rule parameters in a protection strategy template according to specific parameter values in a target security tag combination, determining a desensitization algorithm type and a desensitization field range in a data desensitization rule according to target data desensitization level parameters, determining a role authority list and authority timeliness in an access authority rule according to target access authority control parameters, and determining an encryption protocol and a key management mode in a transmission encryption rule according to target transmission encryption strength parameters; The method comprises the steps of organizing the regulated rule parameters into structured rule entries, wherein the data desensitization rule entries comprise desensitization algorithm identifications, field indexes to be desensitized and desensitization replacement rules, the access authority rule entries comprise role identifications, permission operation lists and authority effective time periods, the transmission encryption rule entries comprise encryption algorithm identifications, key generation modes and transmission check rules, and integrating a plurality of rule entries into a fine-grained data protection guide document after sorting according to association strength normalized values of data units.
8. The method according to claim 1, wherein the step of performing a structured desensitization conversion operation on corresponding data units in the talent original data set according to the data desensitization rule in the fine-grained data protection guideline document to obtain an intermediate data form meeting security requirements, and generating a matched access verification rule set and transmission encryption rule set according to the access permission rule and the transmission encryption rule, includes: Analyzing data desensitization rule entries in the fine-grained data protection guide document, extracting a desensitization algorithm type, a desensitization field position and a desensitization replacement character, and positioning a data unit to be desensitized and a target field in the unit in the talent original data set according to the desensitization field position; Invoking a desensitization processing function matched with the desensitization algorithm type, and performing desensitization conversion operation on the original data value of the target field, wherein if the desensitization algorithm type is replacement desensitization, replacing part of characters in the target field by using desensitization replacement characters; Replacing the target field value in the original data unit with the desensitized data field value to obtain a preliminary desensitized data unit, performing data integrity check on the preliminary desensitized data unit, checking whether the format of the data unit after desensitization is in accordance with a preset data structure specification, and if not, re-performing desensitization conversion operation until the format is in accordance with the specification to obtain an intermediate data form in accordance with the safety requirement; Analyzing access authority rule entries in a fine-grained data protection guidance document, extracting an access main role type, a data operation authority list and an authority 
validity period, matching corresponding identity verification modes from a preset role authority mapping library according to the access main role type, determining an allowed operation type set according to the data operation authority list, determining an authority effective time interval according to the authority validity period, combining the identity verification modes, the operation type set and the authority effective time interval to generate an access verification rule entry, and forming an access verification rule set by a plurality of access verification rule entries; Analyzing transmission encryption rule entries in the fine-grained data protection guidance document, extracting encryption algorithm types, key length parameters and encryption modes, determining corresponding key generation modes according to the encryption algorithm types, determining byte lengths of keys according to the key length parameters, determining data grouping modes and filling modes according to the encryption modes, combining the key generation modes, the key byte lengths, the data grouping modes and the filling modes to generate transmission encryption rule entries, and forming a transmission encryption rule set by the multiple transmission encryption rule entries.
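The replacement desensitization described in claim 8 (replacing part of the characters in a target field with a replacement character) might look like the following sketch. The number of characters kept at each end, the choice to mask short values entirely, and the names `mask_field` and `desensitize_unit` are assumptions made for illustration; the claim only states that part of the characters are replaced.

```python
def mask_field(value, keep_head, keep_tail, replacement="*"):
    """Replace the middle of a string, keeping keep_head and keep_tail characters."""
    if keep_head < 0 or keep_tail < 0:
        raise ValueError("keep_head and keep_tail must be non-negative")
    if keep_head + keep_tail >= len(value):
        # Assumption: values too short to keep both ends are masked entirely.
        return replacement * len(value)
    middle = len(value) - keep_head - keep_tail
    return value[:keep_head] + replacement * middle + value[len(value) - keep_tail:]


def desensitize_unit(unit, target_fields, keep_head=3, keep_tail=4, replacement="*"):
    """Return a copy of a data unit (dict) with the target fields masked."""
    masked = dict(unit)  # the original data unit is left untouched
    for name in target_fields:
        if name in masked:
            masked[name] = mask_field(str(masked[name]), keep_head, keep_tail, replacement)
    return masked


if __name__ == "__main__":
    record = {"name": "Zhang San", "phone": "13812345678"}  # invented example data
    print(desensitize_unit(record, ["phone"]))
```

Because the masked value keeps its original length, length-based format checks applied afterwards are unaffected by this particular rule.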
9. The method of claim 8, wherein the performing data integrity check on the preliminary desensitized data unit, checking whether the format of the data unit after desensitization processing meets a preset data structure specification, if not, performing desensitization conversion operation again until the format meets the specification, and obtaining an intermediate data form meeting safety requirements, includes: Acquiring a preset data structure specification document, and analyzing data unit format requirements in the document, wherein the data unit format requirements comprise field number requirements, field data type requirements, field length range requirements and field-to-field logic relationship requirements; Comparing the actual field number of the preliminary desensitization data unit with the field number requirement, if the actual field number is less than the field number requirement, judging that the format does not accord with the specification, and generating a field missing error identifier; Checking whether the actual data type of each field in the preliminary desensitization data unit is consistent with the field data type requirement, if the fields with inconsistent data types exist, judging that the format is not in accordance with the specification, and generating a data type error identifier; Checking whether the actual length of each field in the preliminary desensitization data unit is within the field length range, if the field length exceeds the range, judging that the format is not in accordance with the specification, and generating a field length error mark; Verifying whether the actual logic relationship between the fields in the preliminary desensitization data unit is consistent with the logic relationship requirement between the fields, if the logic relationship is inconsistent, judging that the format is not in accordance with the specification, and generating a logic relationship error identifier; Summarizing field missing 
error identification, field redundancy error identification, data type error identification, field length error identification and logic relation error identification, if no error identification exists, judging that the format of the preliminary desensitization data unit meets the specification, determining the format as an intermediate data form, if the error identification exists, adjusting parameters in the desensitization conversion operation according to the type of the error identification, adjusting the position of the desensitization field or the number of replacement characters, and carrying out the desensitization conversion operation again until the generated desensitization data unit passes all format verification, thus obtaining the intermediate data form meeting the safety requirement.
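The format checks listed in claim 9 can be sketched as a function that returns the set of error identifiers for a preliminary desensitized data unit. The shape of the specification (a dict of field name to type and length range, plus a list of relation predicates), the identifier strings and the function name are hypothetical. The claim lists a field redundancy error identifier without stating how it is generated; treating "more fields than required" as redundancy is an assumption.

```python
def validate_unit(unit, spec):
    """Return the set of error identifiers for a data unit; an empty set means it conforms.

    spec["fields"]:    dict of field name -> (expected type, min length, max length)
    spec["relations"]: optional list of predicates taking the unit and returning bool
    """
    errors = set()
    expected = spec["fields"]
    if len(unit) < len(expected):
        errors.add("FIELD_MISSING")
    if len(unit) > len(expected):
        errors.add("FIELD_REDUNDANT")  # assumed rule, not spelled out in the claim
    for name, (field_type, min_len, max_len) in expected.items():
        if name not in unit:
            continue
        value = unit[name]
        if not isinstance(value, field_type):
            errors.add("DATA_TYPE")
            continue
        if not min_len <= len(str(value)) <= max_len:
            errors.add("FIELD_LENGTH")
    for relation in spec.get("relations", []):
        if not relation(unit):
            errors.add("LOGIC_RELATION")
    return errors


if __name__ == "__main__":
    spec = {
        "fields": {
            "name": (str, 1, 20),
            "phone": (str, 11, 11),
            "start_year": (int, 4, 4),
            "end_year": (int, 4, 4),
        },
        # Predicates use .get so they tolerate missing fields.
        "relations": [lambda u: u.get("start_year", 0) <= u.get("end_year", 0)],
    }
    unit = {"name": "Z**", "phone": "138****5678", "start_year": 2015, "end_year": 2020}
    print(validate_unit(unit, spec))
```

In the loop described by the claim, a non-empty result would trigger an adjustment of the desensitization parameters and another conversion pass until the set comes back empty.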
10. A computer system, comprising: a memory for storing computer-executable instructions or computer programs; and a processor for implementing the data sharing security decision method applying artificial intelligence of any of claims 1 to 9 when executing the computer-executable instructions or computer programs stored in the memory.

Description

Data sharing security decision method and system applying artificial intelligence

Technical Field

The application relates to the field of data processing, and in particular to a data sharing security decision method and system applying artificial intelligence.

Background

With the deepening application of artificial intelligence in the field of data security, analyzing and controlling security risks in the data sharing process makes it possible to protect data privacy and security while meeting data sharing requirements. At present, a fixed security rule set is usually preconfigured, the original data is desensitized before sharing, and the data access range is controlled according to a preset access permission list; some methods also introduce a simple rule matching algorithm to adjust the security policy during sharing. In practice, however, the security policy of existing methods is often poorly adapted to the specific sharing requirement. In complex data sharing scenarios, over-protection limits effective use of the data, while under-protection introduces data security risks. In addition, data desensitization and security rule formulation are mostly independent steps, so the desensitized data form easily fails to match the security rules. Together, these problems affect the accuracy and reliability of data sharing security decisions.

Disclosure of Invention

The invention provides a data sharing security decision method and system applying artificial intelligence.
In a first aspect, an embodiment of the invention provides a data sharing security decision method applying artificial intelligence, comprising: receiving data sharing task description information and an associated talent original data set; inputting the data sharing task description information and the talent original data set into a preset context-aware mapping network, performing intent recognition on the task description information through a semantic parsing layer of the network, performing entity relationship extraction on data units in the talent original data set through an entity association layer, and generating a task data association strength mapping table by combining the intent recognition result and the entity relationship extraction result; invoking a preconfigured data protection policy generator, performing policy matching between the task data association strength mapping table and a preset basic security tag set, performing conflict coordination between data sharing requirements and security constraints through a multi-objective weighing layer of the policy generator, and generating a fine-grained data protection guidance document for each data unit in the talent original data set, wherein the document comprises a data desensitization rule, an access right rule and a transmission encryption rule; performing, according to the data desensitization rule in the fine-grained data protection guidance document, a structured desensitization conversion operation on the corresponding data units in the talent original data set to obtain an intermediate data form meeting the security requirement, and generating a matched access verification rule set and a matched transmission encryption rule set according to the access right rule and the transmission encryption rule; and deploying the intermediate data form, the access verification rule set and the transmission encryption rule set to a preset controlled collaborative computing environment and performing joint data sharing analysis, wherein a rule verification engine built into the environment invokes the access verification rule set in real time to perform permission verification on data access behavior and invokes the transmission encryption rule set to perform encryption compliance checks on the data transmission process, and outputting, after the joint analysis operation is completed, a sharing analysis result meeting the requirements of the data sharing task description information.
In a second aspect, embodiments of the present invention provide a computer system comprising a memory and a processor, the processor implementing the above method when executing computer-executable instructions or computer programs stored in the memory.
The embodiments of the application have the following beneficial effects: The invention receives data sharing task description information and a talent original data set, and uses the context-aware mapping network to perform intent recognition on the task description information and entity relationship extraction on the data units, generating a task data association strength mapping table. This achieves semantic-level dynamic association between the data sharing task and the data units, raising the matching between task requirements and data entities from traditional static keyword matching to deep association based on intent and entity relationships, and thereby laying a foundation for accurately generating the subsequent security policy. The data protection policy generator is then invoked to perform policy matching between the task data association strength mapping table and the basic security tag set, and conflict coordination is performed on the data sharing requirements and security constraints by utilizing a mult