CN-121979944-A - Data cross-system docking method and system based on dynamic metadata analysis

Abstract

The invention relates to the technical field of data processing and heterogeneous system integration, and in particular to a data cross-system docking method and system based on dynamic metadata analysis. The method comprises: obtaining a dynamic data packet of a source system and a preset metadata structure of a target system; calculating a metadata semantic entropy that represents uncertainty based on structural difference characteristics; comparing the entropy value with a preset safety threshold and selecting either a full-scale dynamic mapping mode or a core field intersection mode to generate a docking instruction; writing the data into the target system; and adjusting subsequent semantic entropy calculation parameters based on feedback from the writing result.

Inventors

  • XIA YU
  • SUN YI

Assignees

  • 北京啄木鸟云健康科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2026-04-08

Claims (8)

  1. A data cross-system docking method based on dynamic metadata parsing, wherein the method is applied to a data processing system that connects a source system and a target storage system, and the method comprises the following steps: acquiring a dynamic data packet sent by the source system and a preset metadata structure of the target storage system, wherein the dynamic data packet is in the form of a weakly typed key-value pair set, and the preset metadata structure defines standard constraint parameters of target fields; calculating a metadata semantic entropy representing uncertainty of a data structure based on structural difference characteristics between a key name set in the dynamic data packet and the preset metadata structure, and preparing the dynamic data packet for parsing; numerically comparing the metadata semantic entropy with a preset safety threshold, and selecting a full-scale dynamic mapping mode or a core field intersection mode according to the comparison result to generate a corresponding data docking instruction; and writing the data in the dynamic data packet into the target storage system through the data docking instruction, and adjusting subsequent semantic entropy calculation parameters based on feedback from the writing result.
  2. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein calculating the metadata semantic entropy representing uncertainty of the data structure based on the structural difference characteristics between the key name set in the dynamic data packet and the preset metadata structure comprises: extracting the key names and value types of all key-value pairs in the dynamic data packet, and constructing current transmission protocol characteristics; calculating an unknown key name ratio value, a key name character edit distance value, and a data type conflict rate value between the current transmission protocol characteristics and the preset metadata structure; and performing a weighted summation of the unknown key name ratio value, the key name character edit distance value, and the data type conflict rate value based on preset weight coefficients to obtain the metadata semantic entropy.
  3. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein selecting the full-scale dynamic mapping mode or the core field intersection mode according to the comparison result comprises: activating the full-scale dynamic mapping mode in response to the metadata semantic entropy being less than or equal to the preset safety threshold; and, in the full-scale dynamic mapping mode, directly mapping all key names in the dynamic data packet to column names of the target storage system, and generating a structured query language statement containing all fields as the data docking instruction.
  4. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein selecting the full-scale dynamic mapping mode or the core field intersection mode according to the comparison result further comprises: activating the core field intersection mode in response to the metadata semantic entropy being greater than the preset safety threshold; in the core field intersection mode, identifying the fields marked as core service fields in the preset metadata structure, and calculating the intersection of the key name set of the dynamic data packet and the core service fields; and extracting only the data belonging to the intersection to generate a structured query language statement as the data docking instruction, while discarding the non-core fields of the dynamic data packet that are not contained in the intersection.
  5. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein preparing the dynamic data packet for parsing further comprises: detecting whether the dynamic data packet contains a temporary field conforming to a debugging naming rule or an unexpected protocol variation characteristic; when the protocol variation characteristic is detected, adding a preset numerical penalty term to the metadata semantic entropy to increase its value; and cleaning the dynamic data packet by removing the temporary field, and taking the cleaned data as the data to be processed.
  6. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein writing the data in the dynamic data packet into the target storage system through the data docking instruction comprises: establishing a connection channel with the target storage system by using a database connection pool; executing the data docking instructions in batches under a transaction isolation level; and, if a database lock or a type conversion exception is triggered during execution, rolling back the current transaction, adjusting the corresponding risk weight coefficient to a preset saturation upper limit value, and forcibly switching the processing strategy for subsequent data packets to the core field intersection mode.
  7. The data cross-system docking method based on dynamic metadata parsing according to claim 1, wherein the source system comprises a heterogeneous service system, and the dynamic data packet is a data stream in a semi-structured format; the preset metadata structure comprises column name definitions, data type constraints, and primary key index information of a database table; and the metadata semantic entropy is used to quantify the potential pollution risk that the structural uncertainty of the dynamic data packet poses to the target storage system.
  8. A data cross-system docking system based on dynamic metadata parsing, operating in a data processing system and used to implement the data cross-system docking method based on dynamic metadata parsing according to any one of claims 1 to 7, comprising the following modules: a data acquisition module configured to acquire a dynamic data packet sent by the source system and a preset metadata structure of the target storage system, wherein the dynamic data packet is in the form of a weakly typed key-value pair set; an entropy calculation module configured to calculate a metadata semantic entropy representing uncertainty of a data structure based on structural difference characteristics between a key name set in the dynamic data packet and the preset metadata structure, and to prepare the dynamic data packet for parsing; a strategy decision module configured to numerically compare the metadata semantic entropy with a preset safety threshold, and to select a full-scale dynamic mapping mode or a core field intersection mode according to the comparison result so as to generate a corresponding data docking instruction; and an execution feedback module configured to write the data in the dynamic data packet into the target storage system through the data docking instruction, and to adjust subsequent semantic entropy calculation parameters based on feedback from the writing result.
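
The entropy calculation of claim 2 and the mode selection of claims 3 and 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the schema, the weight values, the threshold, the table name, the normalization of the edit distance, and every function name below are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


# Hypothetical preset metadata structure:
# column name -> (expected Python type, marked as core service field).
# The sketch assumes the schema is non-empty.
SCHEMA = {
    "patient_id": (int, True),
    "name": (str, True),
    "age": (int, False),
}


@dataclass
class Weights:
    """Assumed preset weight coefficients; the source gives no values."""
    unknown: float = 0.5
    distance: float = 0.3
    conflict: float = 0.2


def semantic_entropy(packet: dict, schema: dict, w: Weights) -> float:
    """Weighted sum of unknown-key ratio, edit distance, and type conflict rate."""
    keys = list(packet)
    if not keys:
        return 0.0
    unknown = [k for k in keys if k not in schema]
    known = [k for k in keys if k in schema]
    unknown_ratio = len(unknown) / len(keys)
    # Assumption: each unknown key contributes its normalized distance to the
    # nearest schema column, averaged over all keys in the packet.
    distance = sum(
        min(edit_distance(k, c) / max(len(k), len(c)) for c in schema)
        for k in unknown
    ) / len(keys)
    conflicts = sum(1 for k in known if not isinstance(packet[k], schema[k][0]))
    conflict_rate = conflicts / len(known) if known else 0.0
    return (w.unknown * unknown_ratio
            + w.distance * distance
            + w.conflict * conflict_rate)


def build_instruction(packet, schema, entropy, threshold, table="target_table"):
    """Return (mode, parameterized INSERT statement, parameter list)."""
    if entropy <= threshold:
        # Full-scale dynamic mapping: every key becomes a column name.
        mode, cols = "full", list(packet)
    else:
        # Core field intersection: keep only keys marked as core fields.
        core = {c for c, (_, is_core) in schema.items() if is_core}
        mode, cols = "core", [k for k in packet if k in core]
    if not cols:
        raise ValueError("no writable fields in packet")
    # Key names are interpolated into SQL here; real code would have to
    # validate or quote them. Values are always bound as parameters.
    placeholders = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return mode, sql, [packet[c] for c in cols]


packet = {"patient_id": 2, "nmae": "B", "debug_tmp": "x", "age": "30"}
entropy = semantic_entropy(packet, SCHEMA, Weights())
print(build_instruction(packet, SCHEMA, entropy, threshold=0.3))
```

In this sketch the misspelled key `nmae`, the extra key `debug_tmp`, and the string-valued `age` all raise the entropy, so the example packet falls back to the core field intersection mode and only `patient_id` is written.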

Description

Data cross-system docking method and system based on dynamic metadata analysis

Technical Field

The invention relates to the technical field of data processing and heterogeneous system integration, and in particular to a data cross-system docking method and system based on dynamic metadata analysis.

Background

In current heterogeneous system data integration environments, a source system often sends dynamic data packets in the form of weakly typed key-value pair sets, and the interface protocol often undergoes unannounced structural drift, field expansion, or data type changes. While the conventional approach is reasonably efficient when the protocol is stable, it is extremely prone to triggering integrity constraint violations, deadlocks, or type conversion exceptions in the target database when the upstream system introduces misspelled field names, incompatible data types, or stray debugging data. Such a rigid mechanism lacks quantitative evaluation of, and adaptive defense against, data structure uncertainty, which results in frequent crashes of ETL tasks, wholesale rejection of writes of critical service data, and difficulty in balancing system throughput against safety. Therefore, how to construct a data docking method that can quantify the risk of a data structure, adaptively switch between the full-scale mapping strategy and the core field protection strategy according to the degree of risk, and guarantee that core data continues to be persisted to the database in a protocol drift environment has become a technical problem to be solved.

Disclosure of Invention

The invention aims to provide a data cross-system docking method and system based on dynamic metadata analysis, in order to solve the technical problems of ETL task crashes and loss of key service data caused by interface protocol drift, field expansion, or data type changes in existing heterogeneous system integration. The technical scheme of the invention is as follows:

A data cross-system docking method based on dynamic metadata analysis is applied to a data processing system that connects a source system and a target storage system, and comprises the following steps: acquiring a dynamic data packet sent by the source system and a preset metadata structure of the target storage system, wherein the dynamic data packet is in the form of a weakly typed key-value pair set, and the preset metadata structure defines standard constraint parameters of target fields; calculating a metadata semantic entropy representing uncertainty of a data structure based on structural difference characteristics between a key name set in the dynamic data packet and the preset metadata structure, and preparing the dynamic data packet for parsing; numerically comparing the metadata semantic entropy with a preset safety threshold, and selecting a full-scale dynamic mapping mode or a core field intersection mode according to the comparison result to generate a corresponding data docking instruction; and writing the data in the dynamic data packet into the target storage system through the data docking instruction, and adjusting subsequent semantic entropy calculation parameters based on feedback from the writing result.
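
The final write-and-feedback step might look roughly like the following Python sketch. It uses the standard-library `sqlite3` module as a stand-in for the target storage system and triggers the failure with an unknown column rather than a real lock or type conversion exception. `FeedbackState`, `SATURATION_LIMIT`, `write_batch`, and the table `t` are hypothetical names, and raising a weight to a saturation upper limit follows the behavior described in claim 6.

```python
import sqlite3

SATURATION_LIMIT = 1.0  # assumed saturation upper limit for a risk weight


class FeedbackState:
    """Parameters that later entropy calculations would read (hypothetical)."""

    def __init__(self) -> None:
        self.conflict_weight = 0.2    # assumed initial risk weight coefficient
        self.force_core_mode = False  # True forces core field intersection mode


def write_batch(conn: sqlite3.Connection, statements, state: FeedbackState) -> bool:
    """Run (sql, params) pairs in one transaction and feed the result back."""
    try:
        # Used as a context manager, a sqlite3 connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            for sql, params in statements:
                conn.execute(sql, params)
        return True
    except sqlite3.Error:
        # Feedback on failure: saturate the risk weight and force the
        # conservative mode for subsequent data packets.
        state.conflict_weight = SATURATION_LIMIT
        state.force_core_mode = True
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (patient_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
state = FeedbackState()

good = [("INSERT INTO t (patient_id, name) VALUES (?, ?)", (1, "A"))]
bad = [
    ("INSERT INTO t (patient_id, name) VALUES (?, ?)", (2, "B")),
    ("INSERT INTO t (patient_id, nmae) VALUES (?, ?)", (3, "C")),  # unknown column
]
print(write_batch(conn, good, state), write_batch(conn, bad, state))
print(state.force_core_mode, conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

Because the failing batch is rolled back as a whole, the row with `patient_id` 2 is not persisted either, which mirrors the all-or-nothing batch execution described for the write step.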

Preferably, calculating the metadata semantic entropy representing uncertainty of the data structure based on the structural difference characteristics between the key name set in the dynamic data packet and the preset metadata structure comprises: extracting the key names and value types of all key-value pairs in the dynamic data packet, and constructing current transmission protocol characteristics; calculating an unknown key name ratio value, a key name character edit distance value, and a data type conflict rate value between the current transmission protocol characteristics and the preset metadata structure (Schema); and performing a weighted summation of the unknown key name ratio value, the key name character edit distance value, and the data type conflict rate value based on preset weight coefficients to obtain the metadata semantic entropy.

Preferably, selecting the full-scale dynamic mapping mode or the core field intersection mode according to the comparison result comprises: activating the full-scale dynamic mapping mode in response to the metadata semantic entropy being less than or equal to the preset safety threshold; and, in the full-scale dynamic mapping mode, directly mapping all key names in the dynamic data packet to column names of the target storage system, and generating a Structured Query Language (SQL) statement containing all fields as the data docking instruction.

Preferably, selecting the full-scale dynamic mapping mode or the core field intersection mode according to the comparison result further comprises: activating the core field intersection mode in response to the metadata semantic entropy being greater than the preset safety threshold; in the core field inte