CN-116975164-B - Data processing method and device, storage medium and electronic equipment
Abstract
Embodiments of the present invention relate to the field of data warehouse technology, and more particularly to a data processing method and apparatus, a storage medium, and an electronic device. The method comprises: performing layering processing on a cluster of data tables to be processed, to determine the layer of the data warehouse to which each data table belongs; performing data lineage analysis on the DWD tables of the data detail layer of the data warehouse, and filtering the DWD tables based on the lineage analysis results; and determining a substitution mapping model according to the lineage between the data tables of the data operation layer model of the data warehouse and those of the filtered data detail layer model. The method and apparatus can resolve unreasonable dependence on ODS tables, improve the utilization of data warehouse assets while enriching the construction of the warehouse's intermediate layers, and make the data warehouse models easier to identify. Computation cost is also reduced and data usage efficiency is improved.
Inventors
- HUANG KAI
- LUO TAN
Assignees
- 杭州网易云音乐科技有限公司 (Hangzhou NetEase Cloud Music Technology Co., Ltd.)
Dates
- Publication Date
- 20260505
- Application Date
- 20230727
Claims (10)
- 1. A data processing method, the method comprising: performing layering processing on a cluster of data tables to be processed, to determine the layer of the data warehouse to which each data table belongs; obtaining the data lineage relations of each DWD table to other layers, and filtering out the DWD tables whose lineage relations are non-standard lineage relations, wherein the non-standard lineage relations comprise any one of a reference relation of the DWD table to an ADS table and a referenced relation of the DWD table to a DWS table; acquiring a corresponding field-level lineage relation according to the reference relation between each DWD table in the data detail layer model and an ODS table in the data operation layer model; performing lineage analysis on each ADS table of the data application layer and each DWS table of the data service layer of the data warehouse to obtain a corresponding ODS table to be analyzed, matching the ODS table to be analyzed against the substitution mapping model, marking the ADS table and/or DWS table corresponding to each successfully matched ODS table to be analyzed, and generating a corresponding reference governance optimization strategy for the marked ADS table and/or DWS table based on the successfully matched substitution mapping model; and, in response to a model analysis request triggered by a service to be processed, calculating the substitution mapping model corresponding to the service to be processed, determining, when an ODS table contained in the substitution mapping model is identified as being used, the replaceable DWD table corresponding to that ODS table, and generating corresponding model optimization information according to the replaceable DWD table.
- 2. The data processing method according to claim 1, wherein the layering processing of the cluster of data tables to be processed comprises: determining the layer of the data warehouse to which a data table to be processed belongs according to the table name and/or the storage path of that data table; or determining the layer of the data warehouse to which the data table belongs in response to a mounting operation on the data table to be processed.
- 3. The data processing method of claim 1, wherein the method further comprises: when the target DWD table in the successfully matched substitution mapping model is unavailable, performing lineage field expansion on the target DWD table based on the lineage fields corresponding to the ODS table to be analyzed.
- 4. The data processing method of claim 1, wherein the method further comprises: in response to a model optimization request for a to-be-processed task of a target type, calculating the substitution mapping model corresponding to the task to be processed; and performing optimization analysis on the data warehouse model corresponding to the task to be processed based on the substitution mapping model, and generating a corresponding model optimization governance strategy, wherein the model optimization governance strategy comprises a reference relation substitution scheme for the data tables of each layer of the data warehouse.
- 5. A data processing apparatus, the apparatus comprising: a data table hierarchical analysis module, configured to perform layering processing on a cluster of data tables to be processed, to determine the layer of the data warehouse to which each data table belongs; a data table filtering module, configured to obtain the data lineage relations of each DWD table to other layers and to filter out the DWD tables whose lineage relations are non-standard lineage relations, wherein the non-standard lineage relations comprise any one of a reference relation of the DWD table to an ADS table and a referenced relation of the DWD table to a DWS table; a substitution mapping model calculation module, configured to acquire a corresponding field-level lineage relation according to the reference relation between each DWD table in the data detail layer model and an ODS table in the data operation layer model; a reference governance optimization strategy generation module, configured to perform lineage analysis on each ADS table of the data application layer and each DWS table of the data service layer of the data warehouse to obtain a corresponding ODS table to be analyzed; and a model optimization information calculation module, configured to, in response to a model analysis request triggered by a service to be processed, calculate the substitution mapping model corresponding to the service to be processed, determine, when an ODS table contained in the substitution mapping model is identified as being used, the replaceable DWD table corresponding to that ODS table, and generate corresponding model optimization information according to the replaceable DWD table.
- 6. The data processing apparatus according to claim 5, wherein the data table hierarchical analysis module is configured to determine the layer of the data warehouse to which a data table to be processed belongs according to the table name and/or the storage path of that data table, or to determine the layer of the data warehouse to which the data table belongs in response to a mounting operation on the data table to be processed.
- 7. The data processing apparatus of claim 5, wherein the apparatus further comprises: a lineage field expansion module, configured to perform lineage field expansion on the target DWD table based on the lineage fields corresponding to the ODS table to be analyzed when the target DWD table in the successfully matched substitution mapping model is unavailable.
- 8. The data processing apparatus of claim 5, wherein the apparatus further comprises: a request response module, configured to, in response to a model optimization request for a to-be-processed task of a target type, calculate the substitution mapping model corresponding to the task to be processed, perform optimization analysis on the data warehouse model corresponding to the task to be processed based on the substitution mapping model, and generate a corresponding model optimization governance strategy, wherein the model optimization governance strategy comprises a reference relation substitution scheme for the data tables of each layer of the data warehouse.
- 9. A storage medium having stored thereon a computer program, which when executed by a processor implements the data processing method of any of claims 1 to 4.
- 10. An electronic device, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the data processing method of any of claims 1 to 4 via execution of the executable instructions.
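As a rough illustration of the flow recited in claims 1 and 3, the following Python sketch builds a substitution mapping from field-level lineage and uses it to flag ADS/DWS tables that read ODS tables directly. Everything in it is an assumption made for illustration: the function names, the tuple and dictionary layouts, the table names in the usage note, and the reading of an "unavailable" DWD table as one that does not yet carry every ODS field the downstream table needs. It is not the patented implementation.

```python
from collections import defaultdict


def build_substitution_mapping(field_edges):
    """Build {ods_table: {dwd_table: set of ODS fields the DWD table carries}}.

    field_edges is an iterable of (dwd_table, dwd_field, ods_table, ods_field)
    tuples, a hypothetical encoding of field-level lineage.
    """
    mapping = defaultdict(lambda: defaultdict(set))
    for dwd_table, _dwd_field, ods_table, ods_field in field_edges:
        mapping[ods_table][dwd_table].add(ods_field)
    return {ods: dict(candidates) for ods, candidates in mapping.items()}


def suggest_replacements(ods_usage, mapping):
    """Flag ADS/DWS tables whose ODS reads could go through a DWD table.

    ods_usage is {downstream_table: {ods_table: set of ODS fields read}}.
    """
    suggestions = []
    for downstream, reads in sorted(ods_usage.items()):
        for ods_table, fields in sorted(reads.items()):
            candidates = mapping.get(ods_table)
            if not candidates:
                continue  # no DWD table is derived from this ODS table
            # Pick the DWD table that already covers the most needed fields;
            # ties go to the alphabetically first table name.
            best = max(sorted(candidates),
                       key=lambda table: len(candidates[table] & fields))
            suggestions.append({
                "table": downstream,          # marked ADS/DWS table
                "replace": ods_table,         # ODS table it reads directly
                "with": best,                 # replaceable DWD table
                # Assumed reading of claim 3: fields to extend the DWD table with.
                "fields_to_add": sorted(fields - candidates[best]),
            })
    return suggestions
```

For example, if a hypothetical `dwd_play_di` carries fields `uid` and `sid` of `ods_play_log`, and `ads_top_songs` reads `sid` and `ts` from `ods_play_log`, the sketch marks `ads_top_songs`, proposes `dwd_play_di` as the replacement, and lists `ts` as a field to add to the DWD table.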
Description
Data processing method and device, storage medium and electronic equipment
Technical Field
Embodiments of the present invention relate to the field of data warehouse technology, and more particularly to a data processing method and apparatus, a storage medium, and an electronic device.
Background
This section is intended to provide background or context for the embodiments of the invention; the description herein is not admitted to be prior art by inclusion in this section.
A data warehouse (DW) can generally be divided into a data application layer (Application Data Service, ADS), a data service layer (Data Warehouse Service, DWS), a data detail layer (Data Warehouse Details, DWD), and a data operation layer (Operation Data Store, ODS). When a data warehouse is governed, the effectiveness of governance of the warehouse models can be measured against certain indicators, and computing different indicators requires the data tables of particular warehouse layers. In some technical solutions, the indicators reflect the table entries that a table references, but they can only expose the more obvious reference errors and convention violations. They cannot accurately judge whether a table entry is referenced reasonably, so unreasonable table references remain.
Disclosure of Invention
There is therefore a need for an improved data processing method and apparatus, storage medium, and electronic device that can provide a more reasonable scheme for table references when data warehouse models are built. In this context, embodiments of the present invention are intended to provide a data processing method and apparatus, a storage medium, and an electronic device.
According to one aspect of the disclosure, a data processing method is provided. The method comprises: performing layering processing on a cluster of data tables to be processed, to determine the layer of the data warehouse to which each data table belongs; performing data lineage analysis on the DWD tables of the data detail layer of the data warehouse, and filtering the DWD tables based on the lineage analysis results; and determining a substitution mapping model according to the lineage between the data tables of the data operation layer model of the data warehouse and those of the filtered data detail layer model.
In an exemplary embodiment of the disclosure, the layering processing of the cluster of data tables to be processed comprises: determining the layer of the data warehouse to which a data table to be processed belongs according to the table name and/or the storage path of that data table; or determining the layer of the data warehouse to which the data table belongs in response to a mounting operation on the data table to be processed.
In an exemplary embodiment of the disclosure, performing lineage analysis on the DWD tables of the data detail layer of the data warehouse and filtering the DWD tables based on the lineage analysis results comprises: obtaining the lineage relations of each DWD table to other layers, and filtering out the DWD tables whose lineage relations are non-standard lineage relations, wherein the non-standard lineage relations comprise any one of a reference relation of the DWD table to an ADS table and a referenced relation of the DWD table to a DWS table.
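The layering and filtering steps described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `ods_`/`dwd_`/`dws_`/`ads_` name prefixes and the layer-named path segments are hypothetical naming conventions, the function names are invented for illustration, and the non-standard lineage rule is read here as "a DWD table that takes an ADS or DWS table as its upstream".

```python
from typing import Optional

LAYERS = ("ods", "dwd", "dws", "ads")

# Assumed reading of "non-standard lineage": a DWD table must not
# read from the layers that sit above it in the warehouse.
NON_STANDARD_UPSTREAM = {"ads", "dws"}


def infer_layer(table_name: str, storage_path: str = "") -> Optional[str]:
    """Infer the warehouse layer from the table name, then the storage path."""
    prefix = table_name.lower().split("_", 1)[0]
    if prefix in LAYERS:
        return prefix
    # Fall back to a path segment that names a layer (hypothetical layout).
    for segment in storage_path.lower().strip("/").split("/"):
        if segment in LAYERS:
            return segment
    return None  # unknown; a manual "mount" step would have to assign it


def filter_dwd_tables(upstream, layer_of):
    """Return the DWD tables whose upstream tables are all acceptable.

    upstream: {table: set of tables it reads from}
    layer_of: {table: layer name}
    """
    kept = []
    for table, sources in upstream.items():
        if layer_of.get(table) != "dwd":
            continue
        if any(layer_of.get(src) in NON_STANDARD_UPSTREAM for src in sources):
            continue  # non-standard lineage: drop from the candidate set
        kept.append(table)
    return sorted(kept)
```

Only the tables that survive this filter would then be used as replacement candidates when the substitution mapping model is built.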
In an exemplary embodiment of the disclosure, determining the substitution mapping model according to the lineage between the data tables of the data operation layer model of the data warehouse and those of the filtered data detail layer model comprises: determining the data operation layer model corresponding to each DWD table according to the reference relation between each DWD table in the data detail layer model and the ODS table in the data operation layer model, so as to generate the substitution mapping model.
In an exemplary embodiment of the disclosure, the method further comprises: obtaining a corresponding field-level lineage relation according to the reference relation between the DWD table and the ODS table, and determining the data operation layer model corresponding to each DWD table based on the field-level lineage relation, so as to generate the substitution mapping model.
In an exemplary embodiment of the disclosure, the method further comprises: performing lineage analysis on each ADS table of the data application layer and each DWS table of the data service layer of the data warehouse to obtain a corresponding ODS table to be analyzed, matching the ODS table to be analyzed against the substitution mapping model, marking the ADS table and/or DWS table corresponding to each successfully matched ODS table to be analyzed, and generating a corresponding reference governance optimization strateg