CN-122019649-A - Verification method and related device for conversion field types among data sources
Abstract
The invention discloses a verification method and a related device for conversion field types between data sources, belonging to the technical field of data conversion verification. The method comprises: obtaining field source information and field converted information of data to be verified; analyzing, with a pre-trained field type mapping verification model and based on the field source information and the field converted information, a field semantic feature set of the data to be verified, the field semantic feature set comprising a field type semantic data set and a field content semantic data set; and analyzing, based on the field semantic feature set, a field type conversion verification feature value and a field content coherence feature value of the data to be verified.
Inventors
- LU DAYONG
- ZHAO JIXING
- CAO LI
- FANG WEIWEI
- XIAO LISHENG
- XU HONGWEI
- LI WEI
- WANG BO
- ZHANG CHAO
- HAN BAOGENG
Assignees
- Huaneng Chaohu Power Generation Co., Ltd. (华能巢湖发电有限责任公司)
- Xi'an Thermal Power Research Institute Co., Ltd. (西安热工研究院有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-21
Claims (10)
- 1. A method for verifying conversion field types between data sources, comprising the steps of: acquiring field source information and field converted information of data to be verified; analyzing, based on the field source information and the field converted information of the data to be verified and using a pre-trained field type mapping verification model, to obtain a field semantic feature set of the data to be verified; analyzing, according to the field semantic feature set of the data to be verified, to obtain a field type conversion verification feature value and a field content coherence feature value of the data to be verified; and performing joint verification processing on the data to be verified using the field type conversion verification feature value and the field content coherence feature value.
- 2. The method for verifying conversion field types between data sources according to claim 1, wherein the field semantic feature set of the data to be verified comprises a field type semantic data set and a field content semantic data set; the field type semantic data set comprises a field name similarity feature value, a field type matching feature value, a field format structure feature value, a field precision retention feature value, a field constraint retention feature value and a field dependency semantic feature value; and the field content semantic data set comprises a field coverage feature value, a field missing feature value, a field truncation feature value, a field anomaly feature value, a field content retention feature value and a field distribution compatibility feature value.
- 3. The method for verifying conversion field types between data sources according to claim 1, wherein analyzing, based on the field source information and the field converted information of the data to be verified and using the pre-trained field type mapping verification model, to obtain the field semantic feature set of the data to be verified specifically comprises: preprocessing the field source information and the field converted information of the data to be verified; inputting the preprocessed field source information and field converted information into the pre-trained field type mapping verification model, wherein the field type mapping verification model comprises an input layer, a field mapping matching layer, a field deconstruction layer and an output layer arranged in sequence; receiving, in the input layer, the preprocessed field source information and field converted information; matching, in the field mapping matching layer, the preprocessed field source information against the field converted information to obtain a field matching feature vector set of the data to be verified; extracting, in the field deconstruction layer, a field verification mapping feature vector of the data to be verified based on the field matching feature vector set; and outputting, in the output layer, the field type semantic data set and the field content semantic data set of the data to be verified based on the field verification mapping feature vector.
- 4. The method for verifying conversion field types between data sources according to claim 1, wherein the analysis of the field type conversion verification feature value specifically comprises: analyzing, according to the field type semantic data set in the field semantic feature set, to obtain a field structure mapping feature set of the data to be verified, wherein the field structure mapping feature set comprises a structure alignment feature value and a structure logic feature value; and calculating the field type conversion verification feature value of the data to be verified from the structure alignment feature value and the structure logic feature value by a specific calculation formula [formula and symbols not reproduced in this text], whose terms are: the field type conversion verification feature value of the data to be verified; the structure alignment feature value of the data to be verified; the structure alignment adjustment coefficient stored in the database; the structure logic feature value of the data to be verified; the structure logic adjustment coefficient stored in the database; and the cooperative adjustment coefficient stored in the database.
- 5. The method for verifying conversion field types between data sources according to claim 4, wherein the analysis of the field content coherence feature value comprises: analyzing, according to the field content semantic data set in the field semantic feature set, to obtain a content conversion feature set of the data to be verified, wherein the content conversion feature set comprises a field content completeness feature value and a field continuation feature value; and calculating the field content coherence feature value of the data to be verified from the field content completeness feature value and the field continuation feature value by a specific calculation formula [formula and symbols not reproduced in this text], whose terms are: the field content coherence feature value of the data to be verified; the field content completeness feature value of the data to be verified; the content completeness adjustment coefficient stored in the database; the field continuation feature value of the data to be verified; the field continuation adjustment coefficient stored in the database; the smoothing adjustment coefficient stored in the database; and the interaction adjustment coefficient stored in the database.
- 6. The method for verifying conversion field types between data sources according to claim 5, wherein the analysis of the field structure mapping feature set of the data to be verified comprises: weighting the field type matching feature value, the field format structure feature value and the field precision retention feature value in the field type semantic data set to obtain the structure alignment feature value of the data to be verified; and weighting the field name similarity feature value, the field constraint retention feature value and the field dependency semantic feature value in the field type semantic data set to obtain the structure logic feature value of the data to be verified; and wherein the analysis of the content conversion feature set of the data to be verified comprises: weighting the field coverage feature value, the field missing feature value, the field truncation feature value and the field anomaly feature value in the field content semantic data set to obtain the field content completeness feature value of the data to be verified; and weighting the field content retention feature value and the field distribution compatibility feature value in the field content semantic data set to obtain the field continuation feature value of the data to be verified.
- 7. The method for verifying conversion field types between data sources according to claim 1, wherein performing joint verification processing on the data to be verified using the field type conversion verification feature value and the field content coherence feature value specifically comprises: normalizing the field type conversion verification feature value and the field content coherence feature value of the data to be verified; comparing the normalized field type conversion verification feature value and the normalized field content coherence feature value, respectively, against a plurality of preset groups of verification intervals to obtain a judgment result; and applying, based on the judgment result, a corresponding verification handling strategy to the data to be verified.
- 8. A system for verifying conversion field types between data sources, comprising: a data acquisition module for acquiring field source information and field converted information of data to be verified; a semantic feature analysis module for analyzing, based on the field source information and the field converted information of the data to be verified and using a pre-trained field type mapping verification model, to obtain a field semantic feature set of the data to be verified; a feature value calculation module for analyzing, according to the field semantic feature set of the data to be verified, to obtain a field type conversion verification feature value and a field content coherence feature value of the data to be verified; and a joint verification module for performing joint verification processing on the data to be verified using the field type conversion verification feature value and the field content coherence feature value.
- 9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method for verifying conversion field types between data sources according to any one of claims 1-7.
- 10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method for verifying conversion field types between data sources according to any one of claims 1-7.
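The weighting step of claim 6 and the interval-based judgment of claim 7 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the weights, the normalization bounds, the interval boundaries, the strategy names and the rule of taking the stricter of the two judgments are all hypothetical. The combining formulas of claims 4 and 5 are not available in this text, so the sketch takes the two final feature values as direct inputs.

```python
# Hypothetical verification intervals and handling strategies, ordered from
# most to least severe. Lower bounds are inclusive; the last upper bound too.
INTERVALS = [
    (0.0, 0.5, "reject"),
    (0.5, 0.8, "manual_review"),
    (0.8, 1.0, "pass"),
]


def weighted(values, weights):
    """Weighted average of feature values (claim 6 style weighting).

    The weights are illustrative assumptions; the claims do not state them.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def normalize(value, lo=0.0, hi=1.0):
    """Min-max normalize into [0, 1], clipping out-of-range inputs (assumed scheme)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def judge(score):
    """Return (severity_index, strategy) for a normalized score."""
    for index, (low, high, strategy) in enumerate(INTERVALS):
        is_last = index == len(INTERVALS) - 1
        if low <= score < high or (is_last and score == high):
            return index, strategy
    raise ValueError(f"score {score} is outside every interval")


def joint_verify(type_value, content_value, lo=0.0, hi=1.0):
    """Judge both feature values separately, then keep the stricter outcome."""
    results = [judge(normalize(v, lo, hi)) for v in (type_value, content_value)]
    return min(results)[1]  # smallest index is the most severe interval


# Structure alignment value from type matching, format structure and
# precision retention, using made-up feature values and weights.
alignment = weighted([0.9, 0.8, 1.0], [0.5, 0.3, 0.2])
print(round(alignment, 2))        # 0.89
print(joint_verify(0.92, 0.61))   # manual_review
```

Keeping the two judgments separate until the last step means a field whose type mapping scores well can still be held back by poor content coherence, which matches the joint character of the verification described in claim 7.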
Description
Verification method and related device for conversion field types among data sources
Technical Field
The invention belongs to the technical field of data conversion verification, and relates to a verification method and a related device for conversion field types among data sources.
Background
With the wide application of big data and cloud computing across industries, data exchange within an enterprise and between different business systems is increasingly frequent. Because field structures, type definitions and content semantics differ among data sources, field type mappings easily become inconsistent or wrong during data migration, system integration or cross-platform synchronization, causing data anomalies, logic distortion or business processing errors and thereby impairing the consistency and usability of the data systems. Traditional data verification methods rely mainly on field type comparison or regular-expression format matching. They statically compare only the type labels of the source field and the target field and ignore the dynamic consistency of the semantic association between field names and content structure, so they cannot fully reflect how a field corresponds at the semantic layer, the content layer and the constraint layer after type conversion. Verification accuracy and stability are therefore low for heterogeneous databases, cross-system formats or multilingual field naming.
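To make the limitation of static type-label comparison concrete, the following Python sketch shows one way such a traditional check might look. The compatibility table and the function names are hypothetical and chosen only for illustration; they are not taken from any cited method.

```python
import re

# Hypothetical table: target base types accepted for each source base type.
COMPATIBLE = {
    "VARCHAR": {"VARCHAR", "TEXT"},
    "DECIMAL": {"DECIMAL", "NUMERIC"},
    "INT": {"INT", "BIGINT"},
}


def base_type(declared):
    """Strip length or precision arguments, e.g. 'VARCHAR(255)' -> 'VARCHAR'."""
    match = re.match(r"\s*([A-Za-z_]+)", declared)
    if match is None:
        raise ValueError(f"cannot parse type declaration: {declared!r}")
    return match.group(1).upper()


def static_type_check(source_type, target_type):
    """Label-only comparison: looks at type names, never at the data."""
    allowed = COMPATIBLE.get(base_type(source_type), set())
    return base_type(target_type) in allowed


# Both mappings pass although they can truncate text or lose precision.
print(static_type_check("VARCHAR(255)", "VARCHAR(50)"))    # True
print(static_type_check("DECIMAL(18,4)", "DECIMAL(10,2)"))  # True
print(static_type_check("INT", "VARCHAR(20)"))              # False
```

Because the check reads only the type labels, the first two mappings are accepted even though values longer than 50 characters, or with more than two decimal places, would not survive the conversion intact.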
In the prior art, patent CN111858647B discloses a method for verifying conversion field types between data sources. In that method, the fields in the source database and the fields in the target database are mapped to each other in a mapper and the mapping result is checked; if the source database and the target database belong to the same data source, the mapping result is the field type corresponding to the original field type. A dedicated database is generated to store the detailed conversion checks, and a check table is generated from the check results. According to that disclosure, the method avoids data transmission failures caused by mismatched field mapping types between the source database and the target database, makes conversion more accurate and convenient, reduces the generation of abnormal data to the greatest extent, and ensures efficient, rapid migration of large volumes of data.
This method has clear limitations. First, it struggles to identify the deep associations between fields at the semantic layer and the content layer: when the source field and the converted field differ in naming, structure or content distribution, the verification result lacks semantic explanatory power and cannot truly reflect the correspondence between fields. Second, it does not comprehensively evaluate fields in terms of structural definition, constraint retention, content integrity, distribution stability and the like, so even when the type mapping is correct the field content may still contain latent errors such as truncation, missing values or constraint conflicts, which affect the final usability and consistency of the data. In view of the foregoing, a new verification method is needed to solve the problem of jointly verifying semantic consistency and content consistency during field conversion.
Disclosure of Invention
The invention aims to provide a verification method and a related device for conversion field types among data sources, so as to solve the technical problems in the prior art that field semantics and content consistency are difficult to identify comprehensively and that data is easily distorted even when the mapping is correct.
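The content-level risks named in the limitations above (truncation and missing values behind a correct type mapping) can be measured directly on the data. The Python sketch below is a minimal illustration that assumes row-aligned source and converted columns; the metric names and the prefix-based truncation test are assumptions made for this example, not definitions from the invention.

```python
def content_metrics(source_values, converted_values):
    """Compare one field before and after conversion, row by row.

    Returns the fraction of comparable rows that were retained unchanged,
    went missing, were truncated, or changed in some other way. Rows whose
    source value is None are skipped because there is nothing to lose.
    """
    if len(source_values) != len(converted_values):
        raise ValueError("columns must be row-aligned")
    counts = {"retained": 0, "missing": 0, "truncated": 0, "changed": 0}
    compared = 0
    for src, dst in zip(source_values, converted_values):
        if src is None:
            continue
        compared += 1
        if dst is None:
            counts["missing"] += 1
        elif str(dst) == str(src):
            counts["retained"] += 1
        elif str(src).startswith(str(dst)):
            # Assumed heuristic: a proper prefix of the source is a truncation.
            counts["truncated"] += 1
        else:
            counts["changed"] += 1
    if compared == 0:
        return {name: 0.0 for name in counts}
    return {name: count / compared for name, count in counts.items()}


source = ["Zhang Wei", "Li Na", "Alexander", None]
converted = ["Zhang Wei", None, "Alex", None]
print(content_metrics(source, converted))
```

In this made-up column, one of the three comparable rows is retained, one is missing and one is truncated, even though a label-only type check on the column would report no problem.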
To achieve this purpose, the invention adopts the following technical scheme. In a first aspect, the present invention provides a method for verifying conversion field types between data sources, comprising the steps of: acquiring field source information and field converted information of data to be verified; analyzing, based on the field source information and the field converted information of the data to be verified and using a pre-trained field type mapping verification model, to obtain a field semantic feature set of the data to be verified; analyzing, according to the field semantic feature set of the data to be verified, to obtain a field type conversion verification characteristic value and a field content coherence characteristic value of the data to be verified; and adopting the field type conversion verification characteristic value and the field content coherence characteris