
CN-121996287-A - Visual data stream configuration method, system, storage medium and program product


Abstract

A visual data stream configuration method, system, storage medium and program product, relating to the field of big data resource services. The method comprises: receiving a modification instruction triggered after a user selects a target flow, and obtaining a modification annotation corresponding to the target flow; retrieving the graphical node configuration and connection relation metadata corresponding to the target flow, and converting the target flow into initial source code; loading a domain constraint knowledge base, and modifying the initial source code in combination with the modification annotation to generate modified source code; performing syntax structure analysis on the modified source code, determining the positions at which the modified source code differs from the initial source code, and generating a graphical instruction set and a modification annotation representing the changes to graphical nodes; modifying, on a visual data flow graph interface, the original graphical metadata corresponding to the target flow to obtain modified graphical metadata; and annotating the modified graphical metadata according to the modification annotation to obtain a target data flow graph. Implementing the application can improve the efficiency of flow modification.

Inventors

  • DU ZHIQIANG
  • MA XIAOPING
  • Bie Wenjin
  • HE REN
  • YANG WEIKANG
  • ZHAO JIANMIN
  • DONG ZIHENG
  • Yuan Zengwang
  • ZHU YAN

Assignees

  • 江苏守正耘创大数据科技有限公司

Dates

Publication Date
20260508
Application Date
20260115

Claims (10)

  1. A visual data stream configuration method, applied to a data processing system, the method comprising: receiving a modification instruction triggered after a user selects a target flow on a visual data flow graph interface, and obtaining a modification annotation corresponding to the target flow; retrieving graphical node configuration and connection relation metadata corresponding to the target flow from a metadata base, and converting the target flow into initial source code; loading a domain constraint knowledge base containing compliance rules for processing medical data, and modifying the initial source code in combination with the domain constraint knowledge base and the modification annotation to generate modified source code; performing syntax structure analysis on the modified source code, determining the positions at which the modified source code differs from the initial source code, and generating a graphical instruction set and a modification annotation representing the changes to graphical nodes; modifying, based on the graphical instruction set, the original graphical metadata corresponding to the target flow on the visual data flow graph interface to obtain modified graphical metadata; and annotating the modified graphical metadata according to the modification annotation to obtain a target data flow graph.
  2. The method of claim 1, wherein after the step of receiving a modification instruction triggered after a user selects a target flow on a visual data flow graph interface and obtaining a modification annotation corresponding to the target flow, the method further comprises: reading an editing state record table of the target flow, and extracting the department code and the modification annotation content of the currently active editing session; when it is determined, from the department codes of the current user and of the active editing session, that different departments are editing concurrently, calculating a word vector for the modification annotation content of each department; and performing a cosine similarity calculation on the word vectors to obtain a modification semantic overlap, and generating a modification operation ordering queue based on the modification semantic overlap and the modification permission priority of each department, wherein the modification operation ordering queue comprises an execution sequence identifier and a dependency mark for each modification operation, and is used as an input parameter for processing subsequent modification instructions to determine the execution position of the current user's modification operation.
  3. The method of claim 1, wherein after the step of loading a domain constraint knowledge base containing compliance rules for processing medical data and modifying the initial source code in combination with the domain constraint knowledge base and the modification annotation to generate modified source code, the method further comprises: extracting a creation timestamp of the metadata in the target flow, and determining all rule entries from a baseline version to the current version according to the creation timestamp and a version timetable of the compliance rules; performing rule syntax parsing on all the rule entries to generate a rule change difference table; performing semantic vectorization on each rule entry in the rule change difference table, and calculating semantic distance values between the rules of each version; marking each rule entry as a compatible type or a conflicting type based on the semantic distance values; and storing the rule entries of the different types separately in a hierarchical constraint knowledge base for hierarchical rule verification in subsequent code modification.
  4. The method of claim 3, wherein after the step of performing rule syntax parsing on all the rule entries to generate a rule change difference table, the method further comprises: determining the total number of medical data processing nodes involved in the modified source code; when the total number of nodes exceeds a medical real-time processing threshold, classifying the medical data processing nodes by medical service priority, placing nodes related to patient vital signs in a high-priority queue and nodes related to historical medical record statistics in a low-priority queue; performing immediate syntax parsing on the high-priority queue to generate urgent graphical instruction fragments, and performing deferred parsing on the low-priority queue to generate regular graphical instruction fragments; and merging the urgent graphical instruction fragments and the regular graphical instruction fragments according to the time sequence of the medical data stream to generate a graphical instruction set that conforms to the medical treatment time sequence.
  5. The method of claim 1, wherein after the step of retrieving graphical node configuration and connection relation metadata corresponding to the target flow from a metadata base, the method further comprises: reading a medical data table identifier in the graphical node configuration, and sending a table structure query instruction to the HIS system and the test equipment database corresponding to the medical data table identifier; receiving an admission and discharge time field returned by the HIS system and the test equipment database, and determining the data definition type of the admission and discharge time field; and when it is determined from the data definition type that the admission and discharge time field has changed from a date type to a timestamp type, adding a time format conversion function to the data processing parameters of the graphical node configuration.
  6. The method of claim 1, wherein after the step of loading a domain constraint knowledge base containing compliance rules for processing medical data and modifying the initial source code in combination with the domain constraint knowledge base and the modification annotation to generate modified source code, the method further comprises: extracting a role code and a permission bit mask from the user session information, and matching the operation keywords in the modification annotation against a permission operation mapping table to obtain a permission verification result; when the permission verification result indicates that verification has failed, filtering out of the modification annotation the operation instructions that exceed the permission scope to generate a permission-filtered modification instruction set; and, when the code modification is performed, inserting a permission verification mark into the comment line of the code fragment to form modified source code containing permission verification information.
  7. The method of claim 1, wherein the step of loading a domain constraint knowledge base containing compliance rules for processing medical data and modifying the initial source code in combination with the domain constraint knowledge base and the modification annotation to generate modified source code comprises: performing semantic analysis on the modification annotation, and identifying the atomic operations and the compound business operations in the modification annotation; loading the domain constraint knowledge base containing compliance rules for processing medical data, retrieving from the domain constraint knowledge base the plurality of data processing steps corresponding to each compound business operation, and determining a logic processing mode; parsing the initial source code into abstract syntax tree form to obtain an initial abstract syntax tree; determining, based on the atomic operations and the logic processing mode, the code fragments that replace the corresponding business logic in the initial source code to obtain a code fragment set; and structurally reconstructing a plurality of target positions of the initial abstract syntax tree according to the logical dependencies of the code fragment set to obtain a reconstructed abstract syntax tree, and converting the reconstructed abstract syntax tree into the modified source code.
  8. A data processing system, comprising one or more processors and a memory coupled to the one or more processors, the memory being configured to store computer program code comprising computer instructions that the one or more processors invoke to cause the data processing system to perform the method of any one of claims 1-7.
  9. A computer-readable storage medium comprising instructions which, when run on a data processing system, cause the data processing system to perform the method of any one of claims 1-7.
  10. A computer program product, characterized in that the computer program product, when run on a data processing system, causes the data processing system to perform the method of any one of claims 1-7.
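
The difference-location step of claim 1 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the generated source code is Python and that each graphical node corresponds to one top-level assignment; the instruction names ADD_NODE, UPDATE_NODE and REMOVE_NODE and both function names are hypothetical and do not come from the application.

```python
import ast


def node_statements(source):
    """Map each top-level assigned name to a position-free dump of its expression."""
    steps = {}
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            # ast.dump omits line/column attributes by default, so pure
            # re-formatting of the code does not register as a change.
            steps[stmt.targets[0].id] = ast.dump(stmt.value)
    return steps


def graphical_instructions(initial_source, modified_source):
    """Compare two versions of the flow code and list the node-level changes."""
    before = node_statements(initial_source)
    after = node_statements(modified_source)
    instructions = []
    for name in after:
        if name not in before:
            instructions.append(("ADD_NODE", name))
        elif after[name] != before[name]:
            instructions.append(("UPDATE_NODE", name))
    for name in before:
        if name not in after:
            instructions.append(("REMOVE_NODE", name))
    return instructions
```

Comparing syntax trees rather than raw text means only structural changes produce instructions; how the application itself maps code positions to graphical nodes is not specified in the claims.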
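
The concurrent-edit ordering of claim 2 can be sketched as follows. The claim does not say how word vectors are computed, so this sketch assumes simple bag-of-words counts; the class names, the `overlap_threshold` parameter and the rule "higher priority runs first" are likewise hypothetical choices made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class PendingEdit:
    department: str  # department code of the editing session
    annotation: str  # modification annotation content
    priority: int    # assumed permission priority; larger values run first


@dataclass
class QueueEntry:
    order: int                                      # execution sequence identifier
    department: str
    depends_on: list = field(default_factory=list)  # dependency marks


def word_vector(text):
    """Bag-of-words term counts, standing in for the unspecified word vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_order_queue(edits, overlap_threshold=0.5):
    """Order edits by priority and mark dependencies between overlapping ones."""
    ranked = sorted(edits, key=lambda e: -e.priority)  # stable for equal priorities
    vectors = {e.department: word_vector(e.annotation) for e in ranked}
    queue = []
    for position, edit in enumerate(ranked):
        # An edit depends on every earlier edit whose annotation overlaps with it.
        deps = [
            earlier.department
            for earlier in ranked[:position]
            if cosine_similarity(vectors[edit.department], vectors[earlier.department])
            >= overlap_threshold
        ]
        queue.append(QueueEntry(position + 1, edit.department, deps))
    return queue
```

With this scheme, two departments whose annotations describe the same change are serialized by priority, while unrelated edits carry no dependency mark and could be applied independently.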

Description

Visual data stream configuration method, system, storage medium and program product

Technical Field

The present application relates to the field of big data resource services, and in particular to a visual data stream configuration method, system, storage medium and program product.

Background

In the field of enterprise-level data processing and analysis, visual, low-code data platforms have emerged to lower the technical threshold and improve development efficiency. These platforms aim to let users such as business analysts build data extraction, transformation and loading (ETL) flows by dragging components and connecting data streams in a graphical interface, so that business logic is quickly converted into executable data tasks and the realization of data value is accelerated.

The related art includes a metadata-based code generation method. The system in this scheme provides a visual canvas on which a user builds a data flow graph by dragging graphical nodes that represent different data processing operations (such as reading, filtering and aggregation) and connecting them. Every user operation, including node selection, parameter configuration and the connection relations between nodes, is parsed by the system in real time and stored as structured metadata. When the user completes the design and triggers execution, the system's code generation engine reads the metadata, translates the whole data flow graph into complete executable script code according to a preset template (such as a template for Spark or SQL), and submits the script code to the back-end computing engine for execution.
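
The metadata-based code generation described above can be illustrated with a minimal sketch. The metadata layout, the three SQL templates and the function name `generate_sql` are assumptions made for this example, and the sketch further assumes a linear flow in which each node has at most one input and there is a single final node.

```python
from graphlib import TopologicalSorter

# Hypothetical templates, one per node type; {src} is the upstream step name.
TEMPLATES = {
    "read": "SELECT * FROM {table}",
    "filter": "SELECT * FROM {src} WHERE {condition}",
    "aggregate": "SELECT {group_by}, {expr} FROM {src} GROUP BY {group_by}",
}


def generate_sql(nodes, edges):
    """Translate flow metadata into one SQL statement built from chained CTEs.

    nodes: {node_id: {"type": ..., "params": {...}}}
    edges: list of (upstream_id, downstream_id) pairs
    """
    upstream = {dst: src for src, dst in edges}
    # TopologicalSorter expects each node mapped to the set of its predecessors.
    graph = {n: ({upstream[n]} if n in upstream else set()) for n in nodes}
    order = list(TopologicalSorter(graph).static_order())
    ctes = []
    for node_id in order:
        node = nodes[node_id]
        body = TEMPLATES[node["type"]].format(
            src=upstream.get(node_id, ""), **node["params"]
        )
        ctes.append(f"{node_id} AS ({body})")
    return "WITH " + ",\n     ".join(ctes) + f"\nSELECT * FROM {order[-1]}"
```

Sorting the nodes topologically guarantees that every step is defined before the step that reads from it, which is what lets the templates refer to upstream steps by name.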
However, in medical service data processing, rules change frequently and are highly individualized, and the data carry constraints in every dimension, such as admission and discharge times and the sex-specific nature of gynecological diseases. To keep the data accurate, a user who needs to modify or change a flow must repeatedly compare the data variables before and after the modification, so the flow modification process is inefficient.

Disclosure of Invention

The application provides a visual data stream configuration method, system, storage medium and program product for improving flow modification efficiency.

In a first aspect, the method comprises: receiving a modification instruction triggered after a user selects a target flow on a visual data flow graph interface, and obtaining a modification annotation corresponding to the target flow; retrieving graphical node configuration and connection relation metadata corresponding to the target flow from a metadata base, and converting the target flow into initial source code; loading a domain constraint knowledge base containing compliance rules for processing medical data, and modifying the initial source code in combination with the domain constraint knowledge base and the modification annotation to generate modified source code; performing syntax structure analysis on the modified source code, determining the positions at which the modified source code differs from the initial source code, and generating a graphical instruction set and a modification annotation representing the changes to graphical nodes; modifying, based on the graphical instruction set, the original graphical metadata corresponding to the target flow on the visual data flow graph interface to obtain modified graphical metadata; and annotating the modified graphical metadata according to the modification annotation to obtain a target data flow graph.

In the above embodiment, the data processing system receives the user's natural language modification annotation, modifies the source code representing the data flow in combination with the domain constraint knowledge base, and then parses the code-level changes back into a graphical instruction set for updating the visual interface. The user's business intent is converted directly into executable code that meets the specification while the graphical interface is kept consistent, avoiding the delays, omissions and errors caused by manually synchronizing code and graphical interface in the traditional approach. In medical scenarios with complex rules, this effectively addresses the repeated comparison, manual coding and low efficiency of flow modification, and improves the responsiveness, accuracy and overall efficiency of data flow modification.

In combination with some embodiments of the first aspect, in some embodiments, after receiving a modification instruction triggered by a user after selecting a target flow on a visual data flow graph interface and obtaining a modification annotation corresponding to the target flo