CN-121009123-B - Metadata-based business data lineage tracking method, system and terminal
Abstract
The invention relates to a metadata-based business data lineage tracking method, system and terminal, and belongs to the technical field of business data processing. The method comprises: monitoring metadata change events of a cloud platform during business processing; querying a logical mapping table according to the metadata change event to locate the affected logical ID; deriving the lineage from a constructed lineage relationship graph according to the affected logical ID; and, according to the derivation result, marking lineage that directly depends on the affected logical ID as invalid and marking downstream lineage that indirectly depends on the affected logical ID as suspicious. The invention improves data governance efficiency and the timeliness of lineage tracking.
Inventors
- MENG QINGGUO
Assignees
- 北京流金岁月科技有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20250812
Claims (6)
- 1. A metadata-based business data lineage tracking method, comprising the steps of: monitoring metadata change events of a cloud platform during business processing; querying a logical mapping table according to the metadata change event to locate the affected logical ID; deriving the lineage from a constructed lineage relationship graph according to the affected logical ID, wherein the lineage relationship graph is a graphical representation describing the dependency relationships between data or entities, in which nodes represent logical IDs and edges represent dependency relationships; and, according to the derivation result, marking lineage that directly depends on the affected logical ID as invalid and marking downstream lineage that indirectly depends on the affected logical ID as suspicious; wherein the step of deriving the lineage from the constructed lineage relationship graph according to the affected logical ID comprises: obtaining, from the constructed lineage relationship graph, the dependent lineage of the directly dependent fields and indirectly dependent fields associated with the affected logical ID; calculating a primary influence score of the dependent lineage according to lineage path features, wherein the lineage path features comprise a dependency depth weight and a business weight, the dependency depth weight covers direct dependence and indirect dependence, and the business weight is dynamically weighted based on field business attributes; screening out the dependent lineage whose primary influence score is greater than a set score value; calculating a secondary influence score of the screened dependent lineage according to historical business impact in combination with the primary influence score; and screening out the dependent lineage whose secondary influence score is greater than a scoring threshold as target dependent lineage, wherein the target dependent lineage comprises directly dependent lineage and indirectly dependent lineage; wherein the step of marking lineage that directly depends on the affected logical ID as invalid according to the derivation result comprises: acquiring the urgency of business processing; judging, according to the urgency, whether the affected logical ID needs to be repaired; if yes, analyzing the problem type of the affected logical ID; determining a repair scheme according to the problem type; and, after repairing the affected logical ID, updating the lineage relationship network and recording repair information; and wherein the step of marking downstream lineage that indirectly depends on the affected logical ID as suspicious according to the derivation result comprises: acquiring a suspicious-mark duration that is longer than the repair duration for repairing the affected logical ID; before the suspicious-mark duration reaches a set alarm duration, judging whether the downstream lineage that indirectly depends on the affected logical ID is normal; if yes, cancelling the suspicious mark; and if not, triggering manual review and outputting alarm information.
- 2. The metadata-based business data lineage tracking method as claimed in claim 1, wherein the step of acquiring the urgency of business processing comprises: acquiring a range influence score for the extent to which the affected logical ID influences core business; acquiring a depth influence score for the depth to which the affected logical ID influences the business; acquiring an urgency influence score for the repair urgency; and calculating the urgency of business processing according to the range influence score, the depth influence score and the urgency influence score.
- 3. The metadata-based business data lineage tracking method as claimed in claim 1, wherein the step of querying the logical mapping table according to the metadata change event to locate the affected logical ID comprises: acquiring a change attribute of the metadata change event; judging whether the change attribute is a conventional change; if yes, calling a standard logical mapping table and locating the affected logical ID from the standard logical mapping table; and if not, constructing a temporary logical mapping table according to the current business scenario and locating the affected logical ID from the temporary logical mapping table.
- 4. The metadata-based business data lineage tracking method as claimed in claim 3, wherein the step of constructing the temporary logical mapping table according to the current business scenario comprises: collecting the business requirements of the current business scenario; calculating a scenario score of the current business scenario according to the business requirements; calling a historical logical mapping table matched with the scenario score; extracting mapping logic matched with the business requirements from the matched historical logical mapping table to construct a mapping logic set; and splitting and reorganizing the mapping logic in the mapping logic set to generate the temporary logical mapping table.
- 5. A metadata-based business data lineage tracking system that performs the metadata-based business data lineage tracking method according to any one of claims 1 to 4, comprising: a data monitoring module for monitoring metadata change events of a cloud platform during business processing; a data searching module for querying the logical mapping table according to the metadata change event to locate the affected logical ID; a data analysis module for deriving the lineage from the constructed lineage relationship graph according to the affected logical ID; and a data processing module for marking, according to the derivation result, lineage that directly depends on the affected logical ID as invalid and downstream lineage that indirectly depends on the affected logical ID as suspicious.
- 6. A terminal, comprising: a memory storing a metadata-based business data lineage tracking program; and a processor for executing the program stored in the memory to implement the metadata-based business data lineage tracking method according to any one of claims 1 to 4.
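As an informal illustration of the derivation and marking steps recited in claim 1, the following minimal Python sketch walks a toy lineage graph from an affected logical ID, marks direct dependents as invalid and indirect dependents as suspicious, and then applies a two-stage score filter. The claims do not disclose concrete formulas or data structures, so everything here is an assumption made for illustration: the sample graph, the names `LINEAGE`, `BUSINESS_WEIGHTS`, `HISTORY_RATES`, `dependency_depths`, `mark_lineage` and `select_targets`, the `1 / depth` depth weight, and the multiplicative use of a historical impact rate.

```python
from collections import deque

# Hypothetical lineage graph: each logical ID maps to the logical IDs
# that depend on it directly (its immediate downstream neighbours).
LINEAGE = {
    "order.amount": ["invoice.total", "report.daily_sales"],
    "invoice.total": ["ledger.revenue"],
    "report.daily_sales": [],
    "ledger.revenue": ["dashboard.kpi"],
    "dashboard.kpi": [],
}

# Hypothetical per-field business weights and historical impact rates.
BUSINESS_WEIGHTS = {
    "invoice.total": 1.0,
    "report.daily_sales": 0.4,
    "ledger.revenue": 1.6,
    "dashboard.kpi": 0.6,
}
HISTORY_RATES = {"invoice.total": 0.5, "ledger.revenue": 0.25}


def dependency_depths(graph, affected_id):
    """Breadth-first search returning {dependent_id: shortest depth}."""
    depths = {}
    queue = deque([(affected_id, 0)])
    while queue:
        node, depth = queue.popleft()
        for child in graph.get(node, []):
            # Skip nodes already reached and guard against cycles
            # that lead back to the affected ID itself.
            if child not in depths and child != affected_id:
                depths[child] = depth + 1
                queue.append((child, depth + 1))
    return depths


def mark_lineage(graph, affected_id):
    """Mark direct dependents 'invalid' and indirect ones 'suspicious'."""
    return {
        node: "invalid" if depth == 1 else "suspicious"
        for node, depth in dependency_depths(graph, affected_id).items()
    }


def select_targets(depths, business_weights, history_rates,
                   first_cut, second_cut):
    """Two-stage filter: primary score first, then secondary score."""
    targets = {}
    for node, depth in depths.items():
        # Assumed primary score: depth weight (1 / depth) times business weight.
        primary = (1.0 / depth) * business_weights.get(node, 1.0)
        if primary <= first_cut:
            continue
        # Assumed secondary score: primary score scaled by historical impact.
        secondary = primary * (1.0 + history_rates.get(node, 0.0))
        if secondary > second_cut:
            targets[node] = secondary
    return targets


if __name__ == "__main__":
    depths = dependency_depths(LINEAGE, "order.amount")
    print(mark_lineage(LINEAGE, "order.amount"))
    print(select_targets(depths, BUSINESS_WEIGHTS, HISTORY_RATES, 0.3, 0.5))
```

With this sample graph, a change to `order.amount` marks `invoice.total` and `report.daily_sales` as invalid and marks `ledger.revenue` and `dashboard.kpi` as suspicious. With the assumed cut-offs of 0.3 and 0.5, the target set keeps one direct dependent (`invoice.total`) and one indirect dependent (`ledger.revenue`), which matches the claim's statement that the target dependent lineage may include both kinds.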
Description
Metadata-based business data lineage tracking method, system and terminal
Technical Field
The present invention relates to the technical field of business data processing, and in particular to a metadata-based business data lineage tracking method, system and terminal.
Background
As digital transformation deepens, enterprise business systems grow increasingly complex, and data flows across cloud platforms, distributed systems and multi-source heterogeneous environments, forming a vast data ecosystem. Data lineage is the core tool for describing the relationships among all stages of data, from generation through processing to consumption, and its importance is increasingly evident: it supports key scenarios such as data quality monitoring, fault tracing and compliance auditing, and it is the foundation on which enterprises manage data assets and mine their value. Especially in fields such as finance and e-commerce, which place extremely high demands on data consistency and reliability, lineage tracking capability directly affects the accuracy of business decisions and the ability to control risk. At present, mainstream data lineage tracking methods fall into two types. The first is tracking based on static configuration files: dependencies between tables and fields are extracted, and a lineage graph is constructed, by parsing ETL scripts, SQL statements or the configuration files of data integration tools. The second is tracking based on runtime log collection: lineage links are generated dynamically by recording the input and output information of data processing tasks.
For example, some tools analyze the data flow between tasks by monitoring the job logs of computing engines such as Spark and Flink, while other technologies embed probes in the data pipeline to collect data transfer logs in real time and construct the lineage relationship. These methods mostly rely on direct parsing of physical-layer metadata (e.g., database table names and field names), and updates to the lineage relationship typically lag behind the actual data changes. On the one hand, static configuration parsing must be triggered and updated manually, and runtime log collection has a delay of seconds to minutes; neither can meet the real-time tracking requirements of high-frequency metadata changes in the dynamic data environment of a cloud platform, so the lineage relationship easily drifts out of step with the actual data links. On the other hand, a lineage relationship built directly on physical metadata can hardly isolate the business logic layer from changes in underlying storage (such as table renaming and partition adjustment): when an upstream data structure is slightly adjusted, the whole downstream dependency chain may be misjudged as invalid, and a large number of invalid dependencies then need manual intervention and verification, resulting in low data governance efficiency.
Disclosure of Invention
In order to improve data governance efficiency and the timeliness of lineage tracking, the invention provides a metadata-based business data lineage tracking method, system and terminal.
In a first aspect, the present invention provides a metadata-based business data lineage tracking method, which adopts the following technical scheme: a metadata-based business data lineage tracking method comprising the following steps: monitoring metadata change events of a cloud platform during business processing; querying a logical mapping table according to the metadata change event to locate the affected logical ID; deriving the lineage from a constructed lineage relationship graph according to the affected logical ID; and, according to the derivation result, marking lineage that directly depends on the affected logical ID as invalid and marking downstream lineage that indirectly depends on the affected logical ID as suspicious. By adopting this technical scheme, changes to data structures or attributes can be captured in real time by monitoring the metadata change events of the cloud platform, without manually triggering updates, which avoids the delay of the traditional after-the-fact audit mode and ensures that the lineage relationship is updated in a timely manner. Based on the affected logical ID located through the logical mapping table, the dependency relationships are derived in combination with the lineage relationship graph, so that directly and indirectly dependent downstream data can be accurately identified and the error rate of manual investigation is reduced. Directly dependent lineage is marked as invalid and indirectly dependent lineage is marked as suspicious, which realizes refined management of the lineage relationship state and helps the user to rapidly distinguish th