CN-122018961-A - Data dependency relationship determination method, device, electronic equipment and storage medium
Abstract
The disclosure provides a data dependency relationship determination method, an apparatus, an electronic device, and a storage medium, and relates to the technical field of artificial intelligence, in particular to the technical field of big data. The method includes: during execution of a data synchronization task on data to be processed, extracting a target code segment from configuration information of the data synchronization task, wherein the target code segment indicates the conversion operation to be performed on the data to be processed from an initial state to a target state during execution of the data synchronization task, and the data synchronization task is used for transmitting the data to be processed, after conversion, from a source end to a target end so that, in a target period, the data to be processed in the initial state at the source end is substantially the same as the data to be processed in the target state at the target end; and parsing the target code segment to obtain a data dependency relationship between the initial state and the target state, wherein the data dependency relationship is used for tracing the processing path of the data to be processed during execution of the data synchronization task.
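The two steps named in the abstract (extract the segment that describes the conversion, then parse it into dependencies) can be illustrated with a minimal Python sketch. The configuration layout (`reader`, `writer`, `columns`, `transforms`), the rule that columns are paired by position, and every function and class name below are assumptions made for illustration only; the disclosure does not specify them.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDependency:
    source: str     # field in the initial state, e.g. "orders.id"
    target: str     # field in the target state, e.g. "dw_orders.order_id"
    operation: str  # conversion applied between the two


def extract_target_segment(task_config: dict) -> dict:
    """Keep only the part of the task configuration that describes the conversion."""
    reader, writer = task_config["reader"], task_config["writer"]
    return {
        "source_table": reader["table"],
        "target_table": writer["table"],
        # Hypothetical rule: source and target columns are paired by position.
        "pairs": list(zip(reader["columns"], writer["columns"])),
        "transforms": task_config.get("transforms", {}),
    }


def parse_dependencies(segment: dict) -> list[FieldDependency]:
    """Turn the extracted segment into field-level dependencies."""
    deps = []
    for src, dst in segment["pairs"]:
        # A target column without a declared transform is treated as a plain copy.
        operation = segment["transforms"].get(dst, "copy")
        deps.append(
            FieldDependency(
                source=f"{segment['source_table']}.{src}",
                target=f"{segment['target_table']}.{dst}",
                operation=operation,
            )
        )
    return deps


# Hypothetical task configuration, for illustration only.
task_config = {
    "task_id": "sync_orders_daily",
    "reader": {"table": "orders", "columns": ["id", "amount"]},
    "writer": {"table": "dw_orders", "columns": ["order_id", "amount_cny"]},
    "transforms": {"amount_cny": "multiply_by_rate"},
}
dependencies = parse_dependencies(extract_target_segment(task_config))
```

Because only the configuration is read, a sketch of this kind never touches the synchronized data itself, which is consistent with the abstract's point that dependencies are derived from the task configuration.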
Inventors
- WANG SHUAI
- REN QIQIANG
- YE QING
Assignees
- Beijing Baidu Netcom Science and Technology Co., Ltd. (北京百度网讯科技有限公司)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-29
Claims (14)
- 1. A data dependency determination method, comprising: extracting a target code segment from configuration information of a data synchronization task during execution of the data synchronization task on data to be processed, wherein the target code segment indicates a conversion operation to be performed on the data to be processed from an initial state to a target state during execution of the data synchronization task, and the data synchronization task is used for transmitting the data to be processed, after conversion, from a source end to a target end so that, in a target period, the data to be processed in the initial state at the source end is substantially the same as the data to be processed in the target state at the target end; and parsing the target code segment to obtain a data dependency relationship between the initial state and the target state, wherein the data dependency relationship is used for tracing a processing path of the data to be processed during execution of the data synchronization task.
- 2. The method of claim 1, wherein the extracting a target code segment from configuration information of a data synchronization task during execution of the data synchronization task on data to be processed comprises: determining, during execution of the data synchronization task on the data to be processed, a heterogeneous field configuration type associated with the conversion operation, wherein the heterogeneous field configuration type indicates a configuration form of a mapping relationship between an initial state field and a target state field; and extracting the target code segment from the configuration information of the data synchronization task according to the heterogeneous field configuration type.
- 3. The method of claim 2, wherein the heterogeneous field configuration type comprises a configuration rule indicating the mapping relationship between the initial state field and the target state field; and the extracting the target code segment from the configuration information of the data synchronization task according to the heterogeneous field configuration type comprises: extracting, from the configuration information, a first code segment associated with the initial state field and a second code segment associated with the target state field; and calling, according to the configuration rule, a field mapping method to process the first code segment and the second code segment to obtain the target code segment.
- 4. The method of claim 2, wherein the heterogeneous field configuration type comprises a structured query statement indicating the mapping relationship between the initial state field and the target state field; and the extracting the target code segment from the configuration information of the data synchronization task according to the heterogeneous field configuration type comprises: during execution of the data synchronization task on the data to be processed, parsing the structured query statement to obtain a selection statement field in the structured query statement, wherein the selection statement field defines a type of conversion operation to be performed from the initial state field to the target state field; extracting, from the selection statement field, a first code segment associated with the initial state field and a second code segment associated with the target state field; and calling, according to the conversion operation type, a field mapping method to process the first code segment and the second code segment to obtain the target code segment.
- 5. The method of any one of claims 1-4, wherein the parsing the target code segment to obtain a data dependency relationship between the initial state and the target state comprises: extracting, from the target code segment and according to a structural form of the data to be processed, a code segment to be parsed that is associated with the structural form; and parsing the code segment to be parsed to obtain the data dependency relationship associated with the structural form.
- 6. The method of claim 5, wherein the structural form comprises at least one of table data, field data, and file data; and the parsing the code segment to be parsed to obtain the data dependency relationship associated with the structural form comprises at least one of the following: parsing the code segment to be parsed that is associated with the table data to obtain a data dependency relationship between the table data; parsing the code segment to be parsed that is associated with the field data to obtain a data dependency relationship between the field data; and parsing the code segment to be parsed that is associated with the file data to obtain a data dependency relationship between the file data.
- 7. The method of claim 6, wherein the parsing the target code segment to obtain a data dependency relationship between the initial state and the target state further comprises: in response to determining that the data to be processed comprises at least two structural forms, combining at least two data dependency relationships associated with the at least two structural forms according to the subordinate relationship between the at least two structural forms, to obtain the data dependency relationship between the initial state and the target state.
- 8. The method of any one of claims 1-7, further comprising: acquiring data synchronization task information; combining the data synchronization task information with the target code segment to obtain a combined code segment; and parsing the combined code segment to obtain the data dependency relationship and an association relationship between the data dependency relationship and the data synchronization task.
- 9. The method of any one of claims 1-8, further comprising: acquiring a data dependency graph in response to determining that execution of the data synchronization task is completed, wherein the data dependency graph comprises nodes and edges between the nodes, the nodes comprise initial state nodes and target state nodes, and each edge represents the data dependency relationship between the nodes it connects; and in response to an operation on a target edge between an initial state node and a target state node in the data dependency graph, displaying data synchronization task information and information on the conversion operation to be performed on the data to be processed from the initial state to the target state.
- 10. The method of claim 9, wherein the initial state node comprises table data and a plurality of initial fields associated with the table data, and the target state node comprises the table data and a plurality of target fields associated with the table data, the method further comprising: in response to an operation on any initial field of the plurality of initial fields, displaying an associated edge between that initial field and any target field of the plurality of target fields, wherein the initial field and the target field connected by the associated edge have a data dependency relationship.
- 11. A data dependency determination apparatus, comprising: an extraction module for extracting a target code segment from configuration information of a data synchronization task during execution of the data synchronization task on data to be processed, wherein the target code segment indicates a conversion operation to be performed on the data to be processed from an initial state to a target state during execution of the data synchronization task, and the data synchronization task is used for transmitting the data to be processed, after conversion, from a source end to a target end so that, in a target period, the data to be processed in the initial state at the source end is substantially the same as the data to be processed in the target state at the target end; and an analysis module for parsing the target code segment to obtain a data dependency relationship between the initial state and the target state, wherein the data dependency relationship is used for tracing a processing path of the data to be processed during execution of the data synchronization task.
- 12. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
- 13. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-10.
- 14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-10.
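As a hedged illustration of claims 4, 9 and 10, the Python sketch below parses the select list of a simple structured query statement into (initial field, target field, conversion operation) triples and stores them as edges of a small dependency graph that can answer the two display queries. The parser is deliberately simplified: it only handles `col`, `col AS alias` and `FUNC(col) AS alias` items with no nested calls or commas inside expressions. The names `parse_select_fields` and `DependencyGraph`, the table `dw_orders` and the task identifier are hypothetical and do not come from the disclosure.

```python
import re


def parse_select_fields(sql):
    """Return (source_table, [(initial_field, target_field, operation), ...])."""
    match = re.search(r"select\s+(.*?)\s+from\s+(\w+)", sql, re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("not a simple SELECT ... FROM statement")
    select_list, table = match.groups()
    mappings = []
    for item in select_list.split(","):
        # Split "expr AS alias"; without an alias the expression names the target.
        parts = re.split(r"\s+as\s+", item.strip(), maxsplit=1, flags=re.IGNORECASE)
        expr = parts[0].strip()
        alias = parts[1].strip() if len(parts) == 2 else expr
        call = re.fullmatch(r"(\w+)\((\w+)\)", expr)
        if call:
            operation, source = call.group(1).upper(), call.group(2)
        else:
            operation, source = "COPY", expr
        mappings.append((source, alias, operation))
    return table, mappings


class DependencyGraph:
    """Edges from initial-state fields to target-state fields."""

    def __init__(self):
        self.edges = {}      # (initial_field, target_field) -> edge details
        self.by_source = {}  # initial_field -> list of target fields

    def add_edge(self, source, target, operation, task_id):
        self.edges[(source, target)] = {"operation": operation, "task": task_id}
        self.by_source.setdefault(source, []).append(target)

    def edge_info(self, source, target):
        """Details a viewer could display when an edge is selected (claim 9)."""
        return self.edges[(source, target)]

    def targets_of(self, source):
        """Target fields to highlight when an initial field is selected (claim 10)."""
        return list(self.by_source.get(source, []))


sql = "SELECT id AS order_id, UPPER(status) AS status_code, amount FROM orders"
table, mappings = parse_select_fields(sql)
graph = DependencyGraph()
for source, target, operation in mappings:
    graph.add_edge(f"{table}.{source}", f"dw_orders.{target}", operation,
                   task_id="sync_orders_daily")
```

A production system would more plausibly rely on a full SQL parser than on regular expressions; the regex approach is used here only to keep the example self-contained.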
Description
Data dependency relationship determination method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to the technical field of big data, and more specifically to a data dependency relationship determination method, an apparatus, an electronic device, and a storage medium.
Background
A data dependency relationship, also called data lineage, clearly and completely records the relationships among the source, movement, conversion process, and destination of data across its whole life cycle, and is a reliable basis for data management and anomaly tracing. A data synchronization engine can complete data synchronization tasks efficiently by executing extraction, loading, and conversion operations on massive data. However, the data synchronization engine lacks a built-in capability for tracking data dependencies, so it is difficult to trace the flow path of data while a data synchronization task is being executed.
Disclosure of Invention
The disclosure provides a data dependency relationship determination method, a data dependency relationship determination apparatus, an electronic device, and a storage medium.
According to one aspect of the disclosure, a data dependency relationship determination method is provided. The method includes: extracting a target code segment from configuration information of a data synchronization task during execution of the data synchronization task on data to be processed, wherein the target code segment indicates a conversion operation to be performed on the data to be processed from an initial state to a target state during execution of the data synchronization task, and the data synchronization task is used for transmitting the data to be processed, after conversion, from a source end to a target end so that, in a target period, the data to be processed in the initial state at the source end is substantially the same as the data to be processed in the target state at the target end; and parsing the target code segment to obtain a data dependency relationship between the initial state and the target state, wherein the data dependency relationship is used for tracing a processing path of the data to be processed during execution of the data synchronization task. According to another aspect of the present disclosure, there is provided a data dependency relationship determination apparatus including an extraction module and an analysis module.
The extraction module is used for extracting a target code segment from configuration information of a data synchronization task during execution of the data synchronization task on data to be processed, wherein the target code segment indicates a conversion operation to be performed on the data to be processed from an initial state to a target state during execution of the data synchronization task, and the data synchronization task is used for transmitting the data to be processed, after conversion, from a source end to a target end so that, in a target period, the data to be processed in the initial state at the source end is substantially the same as the data to be processed in the target state at the target end. The analysis module is used for parsing the target code segment to obtain a data dependency relationship between the initial state and the target state, wherein the data dependency relationship is used for tracing the processing path of the data to be processed during execution of the data synchronization task. According to another aspect of the present disclosure, there is provided an electronic device comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the method described above. According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method described above.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method described above. It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are provided for a better understanding of the present solution and are not to be construed as limiting the present disclosure.