CN-117009443-B - Hidden workflow construction method and device, electronic equipment and storage medium

Abstract

The invention discloses a hidden (implicit) workflow construction method and device, electronic equipment, and a storage medium. The method comprises: constructing an implicit workflow based on an existing workflow, the constructed implicit workflow being formed by a plurality of workflow modules (flowlets) connected in series; generating training data sets for the plurality of flowlets; performing implicit workflow learning based on the training data sets, during which prompt information of the virtual roles corresponding to the plurality of flowlets is constructed; and performing combined reasoning on the prompt information of the plurality of virtual roles according to the arrangement order of the flowlets and the original logical relationship, thereby realizing the function of the implicit workflow. The implicit workflow constructed by the method makes the process by which a large model handles complex problems measurable and controllable, allows errors to be corrected promptly based on the result of each step, and offers stable, controllable performance and high implementation efficiency.
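As a rough illustration of flowlets connected in series with a check after each step, the following minimal Python sketch may help. It is not taken from the patent: the `Flowlet` class, the `run_chain` function, and the two toy steps are hypothetical, and plain Python functions stand in for the large-model calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Flowlet:
    """Hypothetical stand-in for one flowlet: [input data] -> flowlet -> [output data]."""
    name: str
    run: Callable[[Any], Any]      # assumption: a plain function replaces the large-model call
    check: Callable[[Any], bool]   # assumption: a per-step check on the flowlet's output


def run_chain(flowlets: List[Flowlet], data: Any) -> Tuple[Any, List[str]]:
    """Run the flowlets in series and stop at the first step whose output fails its check."""
    trace: List[str] = []
    for flowlet in flowlets:
        data = flowlet.run(data)
        if not flowlet.check(data):
            trace.append(f"{flowlet.name}: failed")
            raise ValueError(f"step {flowlet.name!r} produced invalid output; trace={trace}")
        trace.append(f"{flowlet.name}: ok")
    return data, trace


# Two toy steps; the names and logic are illustrative only.
chain = [
    Flowlet("normalize", lambda q: q.strip().lower(), lambda out: len(out) > 0),
    Flowlet("tokenize", lambda q: q.split(), lambda out: all(out)),
]

result, trace = run_chain(chain, "  List All Users ")
print(result)  # ['list', 'all', 'users']
print(trace)   # ['normalize: ok', 'tokenize: ok']
```

Because every step reports its own status, a failure can be traced to a specific flowlet rather than to the pipeline as a whole, which is the kind of step-level controllability the abstract describes.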

Inventors

  • XIA ZHENGXUN
  • YANG YIFAN
  • FAN HAOJUN

Assignees

  • 南京星环智能科技有限公司

Dates

Publication Date
20260508
Application Date
20230803

Claims (13)

  1. A method of implicit workflow construction, the method comprising:
     an implicit workflow construction process comprising segmenting the existing workflow based on the input data and the output data to obtain a plurality of operators, and mapping the operators according to mapping rules in an implicit workflow mapping rule base to obtain a plurality of workflow modules (flowlets), wherein a flowlet is a virtual module, and the functions and the input and output requirements of the operators are defined through the formal definition [input data] -> flowlet -> [output data];
     generating training data sets of the plurality of flowlets;
     performing implicit workflow learning based on the training data set, wherein prompt information of the virtual roles corresponding to the plurality of flowlets is constructed during the learning process; and
     performing combined reasoning on the prompt information of the plurality of virtual roles according to the arrangement order of the plurality of flowlets and the original logical relationship, so as to realize the function of the implicit workflow;
     wherein the training data set includes an unlabeled background description training data set and a labeled indication training data set, and correspondingly, generating the training data sets of the plurality of flowlets includes:
     outputting the background description training data set according to a preset template based on background description information;
     taking a prompt template as input to a large model, so that the large model outputs labeled data of the plurality of flowlets, wherein the prompt template comprises input data and a prompt description, the diversity of the input data is achieved through a data template of the input data, and the data template is in key-value pair form;
     verifying the labeled data through a data verifier, and outputting qualified labeled data; and
     generating the indication training data set according to the qualified labeled data and the instruct-sample requirement, wherein the content of the indication training data set is the set of [input data] - [output data] pairs in [input data] - flowlet - [output data].
  2. The method of claim 1, wherein mapping the plurality of operators comprises: mapping some of the plurality of operators one-to-one, performing one-to-many decomposition mapping on some of the plurality of operators, and performing many-to-one merging mapping on some of the plurality of operators.
  3. The method of claim 1, wherein, if the check of the labeled data fails, an error code fed back by the data verifier is added to the template, and the labeled data output for the flowlets continues to be optimized through the large model until the output labeled data passes the check.
  4. The method of claim 1, wherein the implicit workflow learning comprises flowlet background learning and flowlet function learning, and correspondingly, performing implicit workflow learning based on the training data set comprises: taking the background description training data set in the training data set as input to a large model, and performing flowlet background learning on the large model in a MASK learning mode; and performing flowlet function learning on the large model based on the indication training data set in the training data set.
  5. The method of claim 4, wherein performing flowlet function learning on the large model based on the indication training data set in the training data set comprises: constructing function training data of the virtual roles corresponding to the plurality of flowlets based on the indication training data set in the training data set; combining the function training data of the virtual roles corresponding to all flowlets into an instruction fine-tuning data set; and performing instruction learning on the large model using the instruction fine-tuning data set.
  6. The method of claim 5, wherein constructing the function training data of the virtual roles corresponding to the plurality of flowlets based on the indication training data set in the training data set comprises: processing the [input data] - [output data] pairs of the plurality of flowlets included in the indication training data set to obtain the function training data of the virtual roles corresponding to the plurality of flowlets; wherein a virtual role is set for each flowlet, the name and function logic of the virtual role corresponding to the flowlet are described to obtain the prompt information of the virtual role, the prompt information of the virtual role is combined with the [input data] of an [input data] - [output data] pair to form a virtual role template, and the [input data] of the pair is replaced with the virtual role template to obtain the function training data of the virtual role corresponding to the flowlet.
  7. The method of claim 1, wherein constructing the implicit workflow based on the existing workflow comprises constructing, according to the input and output categories of the data, an implicit workflow for the full workflow of NL2SQL database adaptation.
  8. The method of claim 1, wherein the template is formed by combining a plurality of tables with field information through an execution script of the flowlet.
  9. The method of claim 1, wherein verifying the labeled data through the data verifier and outputting qualified labeled data comprises: extracting SQL statements from the labeled data through an execution script of the flowlet, and sending the SQL statements to an SQL engine for execution; screening the labeled data through the data verifier according to the error codes returned after execution by the SQL engine; and outputting the screened labeled data as qualified labeled data.
  10. The method of claim 4, wherein taking the background description training data set in the training data set as input to a large model and performing flowlet background learning on the large model in a MASK learning mode comprises: performing flowlet background learning on an NL2SQL large model in a MASK learning mode, based on background description data in NL2SQL database tables.
  11. An implicit workflow construction apparatus, the apparatus comprising:
     a construction module for constructing an implicit workflow based on an existing workflow, wherein the existing workflow is composed of a plurality of operators, each operator has corresponding input data and output data, and the output data of an upstream operator is used as the input data of a downstream operator; the implicit workflow construction process comprises segmenting the existing workflow based on the input data and the output data to obtain a plurality of operators, and mapping the operators according to mapping rules in an implicit workflow mapping rule base to obtain a plurality of workflow modules (flowlets), wherein a flowlet is a virtual module whose functions and input and output requirements are defined through the formal definition [input data] -> flowlet -> [output data];
     a generation module for generating training data sets of the plurality of flowlets;
     a learning module for performing implicit workflow learning based on the training data set, wherein prompt information of the virtual roles corresponding to the plurality of flowlets is constructed during the learning process; and
     a realization module for performing combined reasoning on the prompt information of the plurality of virtual roles according to the arrangement order of the plurality of flowlets and the original logical relationship, so as to realize the function of the implicit workflow;
     wherein the training data set comprises an unlabeled background description training data set and a labeled indication training data set, and the generation module specifically comprises:
     a first output unit for outputting the background description training data set according to a preset template based on background description information;
     a second output unit for taking a prompt template as input to a large model, so that the large model outputs labeled data of the plurality of flowlets, wherein the prompt template comprises input data and a prompt description, the diversity of the input data is achieved through a data template of the input data, and the data template is in key-value pair form;
     a verification unit for verifying the labeled data through a data verifier and outputting qualified labeled data; and
     a generating unit for generating the indication training data set according to the qualified labeled data and the instruct-sample requirement, wherein the content of the indication training data set is the set of [input data] - [output data] pairs in [input data] - flowlet - [output data].
  12. An electronic device, the electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the implicit workflow construction method of any of claims 1-10.
  13. A computer readable storage medium storing computer instructions which, when executed, cause a processor to implement the implicit workflow construction method of any of claims 1-10.
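The labeled-data generation and verification loop described in claims 1, 3 and 9 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: Python's built-in `sqlite3` stands in for the SQL engine, `stub_model` is a hypothetical placeholder for the large model, the helper names and prompt wording are invented for the example, and the retry loop is capped at `max_rounds` (the claims describe retrying until the check passes).

```python
import re
import sqlite3
from typing import Callable, Dict, Optional, Tuple


def build_prompt(description: str, data: Dict[str, str], error: Optional[str] = None) -> str:
    """Prompt template: prompt description plus a key-value data template (plus verifier feedback)."""
    lines = [description] + [f"{key}: {value}" for key, value in data.items()]
    if error:
        lines.append(f"previous_error: {error}")  # error fed back by the data verifier
    return "\n".join(lines)


def extract_sql(text: str) -> str:
    """Extract the first SELECT statement from the model output (empty string if none)."""
    match = re.search(r"SELECT\b.*?;", text, flags=re.IGNORECASE | re.DOTALL)
    return match.group(0) if match else ""


def validate(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if the SQL engine executes the statement, otherwise the error message."""
    if not sql:
        return "no SQL statement found"
    try:
        conn.execute(sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)


def generate_labeled(model: Callable[[str], str], conn: sqlite3.Connection,
                     description: str, data: Dict[str, str],
                     max_rounds: int = 3) -> Optional[Tuple[str, str]]:
    """Ask the model for labeled data, retrying with verifier feedback until the check passes."""
    error = None
    for _ in range(max_rounds):
        sql = extract_sql(model(build_prompt(description, data, error)))
        error = validate(conn, sql)
        if error is None:
            return data["question"], sql  # one [input data] - [output data] pair
    return None


def stub_model(prompt: str) -> str:
    """Hypothetical large-model stand-in: wrong column at first, corrected after feedback."""
    if "previous_error" in prompt:
        return "Answer: SELECT name FROM users;"
    return "Answer: SELECT username FROM users;"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
pair = generate_labeled(
    stub_model, conn, "Translate the question into SQL.",
    {"question": "List all user names", "table": "users(id, name)"},
)
print(pair)  # ('List all user names', 'SELECT name FROM users;')
```

In this sketch the first model output fails because the column `username` does not exist, the engine's error message is appended to the prompt, and the second output passes the check and becomes a qualified pair.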

Description

Hidden workflow construction method and device, electronic equipment and storage medium

Technical Field

The embodiments of the invention relate to the technical field of artificial intelligence, and in particular to an implicit workflow construction method and device, electronic equipment, and a storage medium.

Background

With the rapid development of artificial intelligence, machine learning technology continues to advance. From machine learning to deep learning to large-model learning, large models have performed well on generalization and logical-reasoning problems that had long gone unsolved in the field of machine learning, opening up the possibility of general artificial intelligence. However, there is still no effective way to improve the chain-of-thought (CoT) capability of large models, which poses a significant challenge to the application of large-model technology.

Current methods for improving the CoT capability of large models include: 1. training on code to improve the model's CoT capability; 2. providing few-shot prompts that guide the large model to reason along the example logic and give an answer; 3. training the large model on corpora that contain specific reasoning logic, such as the solution processes and results of arithmetic problems; and 4. using the large model as a control module and training it to use various tools so that it can solve complex problems, as in AutoGPT.

The disadvantage of methods 1 and 2 above is that the process of solving a complex logic problem is neither intuitive nor controllable. How to make the process by which a large model solves complex logic problems controllable is therefore a technical problem to be solved.
Disclosure of Invention

The invention provides an implicit workflow construction method and device, electronic equipment, and a storage medium, which address the problem that the process by which a large model solves complex logic problems is neither intuitive nor controllable.

According to an aspect of the present invention, there is provided an implicit workflow construction method, including:
constructing an implicit workflow based on an existing workflow, wherein the constructed implicit workflow is formed by a plurality of workflow modules (flowlets) connected in series;
generating training data sets of the plurality of flowlets;
performing implicit workflow learning based on the training data set, wherein prompt information of the virtual roles corresponding to the plurality of flowlets is constructed during the learning process; and
performing combined reasoning on the prompt information of the plurality of virtual roles according to the arrangement order of the plurality of flowlets and the original logical relationship, so as to realize the function of the implicit workflow.

According to another aspect of the present invention, there is provided an implicit workflow construction apparatus, including:
a construction module for constructing an implicit workflow based on an existing workflow, wherein the constructed implicit workflow is formed by a plurality of workflow modules (flowlets) connected in series;
a generation module for generating training data sets of the plurality of flowlets;
a learning module for performing implicit workflow learning based on the training data set, wherein prompt information of the virtual roles corresponding to the plurality of flowlets is constructed during the learning process; and
a realization module for performing combined reasoning on the prompt information of the plurality of virtual roles according to the arrangement order of the plurality of flowlets and the original logical relationship, so as to realize the function of the implicit workflow.
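The construction of virtual-role prompt information and its combination in flowlet order (spelled out in more detail in claim 6) might look roughly like the sketch below. All names here are hypothetical: the role names `TableSelector` and `SqlWriter`, the prompt wording, and the dictionary layout of the samples are assumptions made for illustration, not the format used in the patent.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # one [input data] - [output data] pair


def role_prompt(name: str, function_logic: str) -> str:
    """Prompt information of a virtual role: its name plus a description of its function logic."""
    return f"You are {name}. {function_logic}"


def to_function_samples(name: str, function_logic: str, pairs: List[Pair]) -> List[Dict[str, str]]:
    """Replace the [input data] of each pair with a virtual-role template wrapping that input."""
    prompt = role_prompt(name, function_logic)
    return [{"input": f"{prompt}\nInput: {inp}", "output": out} for inp, out in pairs]


def build_finetune_dataset(flowlets: Dict[str, Tuple[str, List[Pair]]]) -> List[Dict[str, str]]:
    """Merge the function training data of all virtual roles into one instruction fine-tuning set."""
    dataset: List[Dict[str, str]] = []
    for name, (logic, pairs) in flowlets.items():
        dataset.extend(to_function_samples(name, logic, pairs))
    return dataset


def combined_prompt(flowlets: Dict[str, Tuple[str, List[Pair]]], question: str) -> str:
    """Assumed form of combined reasoning: role prompts concatenated in flowlet order."""
    roles = [role_prompt(name, logic) for name, (logic, _) in flowlets.items()]
    return "\n".join(roles + [f"Input: {question}"])


# Toy flowlets in their arrangement order (dicts keep insertion order in Python 3.7+).
flowlets = {
    "TableSelector": (
        "Pick the tables needed to answer the question.",
        [("List all user names", "users"), ("Count the orders", "orders")],
    ),
    "SqlWriter": (
        "Write one SQL query over the selected tables.",
        [("List all user names | users", "SELECT name FROM users;")],
    ),
}

dataset = build_finetune_dataset(flowlets)
prompt = combined_prompt(flowlets, "List all user names")
```

Here `dataset` holds three samples, two for the first role and one for the second, and `prompt` lists the two role descriptions in flowlet order followed by the question.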
According to another aspect of the present invention, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the implicit workflow construction method of any of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed, cause a processor to implement the implicit workflow construction method according to any of the embodiments of the present invention.

According to the technical scheme, an implicit workflow is constructed based on an existing workflow, the constructed implicit workflow being formed by a plurality of workflow modules (flowlets) connected in series; training data sets of the plurality of flowlets are generated; implicit workflow learning is performed based on the training data set; and prompt information of the virtual roles corresponding to the plurality of flowlets is constructed in t