CN-122029533-A - Text processing method and computing device
Abstract
A text processing method includes: receiving text describing a flow of a task; obtaining a first clause group from the text, the first clause group including a plurality of clauses arranged in the order in which the flow occurs, each of the plurality of clauses including a workflow component; determining a workflow tag corresponding to each clause in the first clause group; generating a workflow of the task from the workflow tags corresponding to the plurality of clauses in the first clause group; and generating a visual representation of the generated workflow for display by a workflow generation application.
Inventors
- Pewande Tamuri
- Zhong Wen Abelard week
- LUO JINRONG
- CHEN MING
- WEI QIMENG
Assignees
- 华为云计算技术有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20240620
- Priority Date
- 20231003
Claims (20)
- 1. A text processing method performed by a computing device, comprising: receiving text describing a flow of a task for which a workflow is to be generated, as input to a workflow generation application executing on the computing device; obtaining a first clause group from the text describing the flow of the task, wherein the first clause group comprises a plurality of clauses arranged according to the sequence of the flow, and each clause in the first clause group comprises a workflow component for indicating information of a corresponding functional unit in a plurality of functional units for executing the task; determining a workflow tag corresponding to each clause in the first clause group, wherein the workflow tags corresponding to the plurality of clauses in the first clause group include a first workflow tag for indicating a workflow component or a second workflow tag for indicating a workflow pattern; generating the workflow of the task according to the workflow tags corresponding to the plurality of clauses in the first clause group; and generating a visual representation of the generated workflow for display by the workflow generation application.
- 2. The text processing method of claim 1, wherein obtaining the first clause group from the text describing the flow of the task comprises: decomposing the text describing the flow of the task to obtain a third clause group, wherein each clause of a plurality of clauses included in the third clause group comprises at most one workflow component; filtering the plurality of clauses in the third clause group to obtain a fourth clause group, wherein each clause of the plurality of clauses in the fourth clause group comprises a workflow component; and ordering the plurality of clauses in the fourth clause group to obtain the first clause group; or, alternatively, obtaining the first clause group from the text describing the flow of the task comprises: decomposing the text describing the flow of the task to obtain the third clause group, wherein each clause of the plurality of clauses included in the third clause group comprises at most one workflow component; filtering the plurality of clauses in the third clause group to obtain the fourth clause group, wherein each clause of the plurality of clauses included in the fourth clause group comprises a workflow component; ordering the plurality of clauses in the fourth clause group to obtain a second clause group; and inserting at least one identifier of at least one workflow pattern into the second clause group to obtain the first clause group, wherein the at least one workflow pattern is used for indicating the structure of the flow of the task and the relation among the plurality of functional units.
- 3. The text processing method of claim 2, wherein decomposing the text describing the flow of the task to obtain the third clause group comprises: performing sentence boundary detection on the text to obtain a plurality of sentences of the text; and splitting each sentence in the plurality of sentences of the text into one or more clauses through a first encoder-based model to obtain the third clause group.
- 4. The text processing method of claim 2, wherein filtering the plurality of clauses in the third clause group to obtain the fourth clause group comprises: determining, by a second encoder-based model, whether each clause of the plurality of clauses in the third clause group includes a workflow component; and deleting a clause from the third clause group if that clause includes zero workflow components, to obtain the fourth clause group.
- 5. The text processing method of claim 2, wherein ordering the plurality of clauses in the fourth clause group comprises: ordering the plurality of clauses in the fourth clause group by a third encoder-based model, wherein an input of the third encoder-based model is two clauses in the fourth clause group and an output of the third encoder-based model is information indicating whether the order of the two clauses is correct.
- 6. The text processing method of claim 2, wherein the identifier of each of the at least one workflow pattern comprises a workflow pattern boundary indication sentence and/or a workflow pattern indication, and wherein inserting the at least one identifier of the at least one workflow pattern into the second clause group to obtain the first clause group comprises: determining the at least one workflow pattern of the second clause group; and, for each of the at least one workflow pattern, obtaining the first clause group by performing at least one of the following on the second clause group: inserting the workflow pattern boundary indication sentence of the workflow pattern into the second clause group if no clause indicating the boundary of the workflow pattern exists in the second clause group; or inserting the workflow pattern indication of the workflow pattern into the second clause group if no clause serving as the workflow pattern indication exists in the second clause group.
- 7. The text processing method of claim 6, wherein determining the at least one workflow pattern of the second clause group comprises: performing pattern keyword detection on a plurality of clauses in the second clause group; and determining the at least one workflow pattern according to at least one detected pattern keyword, wherein each pattern keyword of the at least one pattern keyword is used for indicating a workflow pattern of the at least one workflow pattern.
- 8. The text processing method of any one of claims 1 to 7, wherein determining the workflow tag corresponding to each clause in the first clause group comprises: matching each clause of the first clause group with a plurality of workflow tags included in a tag dictionary to determine, from the tag dictionary, the workflow tag corresponding to each clause of the first clause group.
- 9. The text processing method of claim 8, wherein the tag dictionary further includes predefined values corresponding to the plurality of workflow tags, and wherein matching each clause of the first clause group with the plurality of workflow tags included in the tag dictionary to determine the workflow tag corresponding to each clause of the first clause group from the tag dictionary comprises: performing semantic similarity processing to match each clause in the first clause group with the predefined values in the tag dictionary; and determining, as the workflow tag corresponding to the clause, the workflow tag corresponding to the predefined value that has the highest similarity score with respect to the clause.
- 10. A computing device, comprising: a memory; and at least one processor coupled to the memory; wherein the memory is to store computer instructions that, when executed by the at least one processor, cause the computing device to implement: receiving text describing a flow of a task for which a workflow is to be generated, as input to a workflow generation application executing on the computing device; obtaining a first clause group from the text describing the flow of the task, wherein the first clause group comprises a plurality of clauses arranged according to the sequence of the flow, and each clause in the first clause group comprises a workflow component for indicating information of a corresponding functional unit in a plurality of functional units for executing the task; determining a workflow tag corresponding to each clause in the first clause group, wherein the workflow tags corresponding to the plurality of clauses in the first clause group include a first workflow tag for indicating a workflow component or a second workflow tag for indicating a workflow pattern; generating the workflow of the task according to the workflow tags corresponding to the plurality of clauses in the first clause group; and generating a visual representation of the generated workflow for display by the workflow generation application.
- 11. The computing device of claim 10, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: decomposing the text describing the flow of the task to obtain a third clause group, wherein each clause of a plurality of clauses included in the third clause group comprises at most one workflow component; filtering the plurality of clauses in the third clause group to obtain a fourth clause group, wherein each clause of the plurality of clauses in the fourth clause group comprises a workflow component; and ordering the plurality of clauses in the fourth clause group to obtain the first clause group, or ordering the plurality of clauses in the fourth clause group to obtain a second clause group and inserting at least one identifier of at least one workflow pattern into the second clause group to obtain the first clause group, wherein the at least one workflow pattern is used for indicating the structure of the flow of the task and the relation among the plurality of functional units.
- 12. The computing device of claim 11, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: performing sentence boundary detection on the text to obtain a plurality of sentences of the text; and splitting each sentence in the plurality of sentences of the text into one or more clauses through a first encoder-based model to obtain the third clause group.
- 13. The computing device of claim 11, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: determining, by a second encoder-based model, whether each clause of the plurality of clauses in the third clause group includes a workflow component; and deleting a clause from the third clause group if that clause includes zero workflow components, to obtain the fourth clause group.
- 14. The computing device of claim 11, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: ordering the plurality of clauses in the fourth clause group by a third encoder-based model, wherein an input of the third encoder-based model is two clauses in the fourth clause group and an output of the third encoder-based model is information indicating whether the order of the two clauses is correct.
- 15. The computing device of claim 11, wherein the identifier of each of the at least one workflow pattern comprises a workflow pattern boundary indication sentence and/or a workflow pattern indication, and the computer instructions, when executed by the at least one processor, cause the computing device to implement: determining the at least one workflow pattern of the second clause group; and, for each of the at least one workflow pattern, obtaining the first clause group by performing at least one of the following on the second clause group: inserting the workflow pattern boundary indication sentence of the workflow pattern into the second clause group if no clause indicating the boundary of the workflow pattern exists in the second clause group; or inserting the workflow pattern indication of the workflow pattern into the second clause group if no clause serving as the workflow pattern indication exists in the second clause group.
- 16. The computing device of claim 15, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: performing pattern keyword detection on a plurality of clauses in the second clause group; and determining the at least one workflow pattern according to at least one detected pattern keyword, wherein each pattern keyword of the at least one pattern keyword is used for indicating a workflow pattern of the at least one workflow pattern.
- 17. The computing device of any of claims 10 to 16, wherein the computer instructions, when executed by the at least one processor, cause the computing device to implement: matching each clause of the first clause group with a plurality of workflow tags included in a tag dictionary to determine, from the tag dictionary, the workflow tag corresponding to each clause of the first clause group.
- 18. The computing device of claim 10, wherein the tag dictionary further comprises predefined values corresponding to the plurality of workflow tags, and the computer instructions, when executed by the at least one processor, cause the computing device to implement: performing semantic similarity processing to match each clause in the first clause group with the predefined values in the tag dictionary; and determining, as the workflow tag corresponding to the clause, the workflow tag corresponding to the predefined value that has the highest similarity score with respect to the clause.
- 19. A non-transitory computer-readable storage medium having stored thereon computer instructions that, when executed by a computer, cause the computer to implement: receiving text describing a flow of a task for which a workflow is to be generated, as input to a workflow generation application executing on the computing device; obtaining a first clause group from the text describing the flow of the task, wherein the first clause group comprises a plurality of clauses arranged according to the sequence of the flow, and each clause in the first clause group comprises a workflow component for indicating information of a corresponding functional unit in a plurality of functional units for executing the task; determining a workflow tag corresponding to each clause in the first clause group, wherein the workflow tags corresponding to the plurality of clauses in the first clause group include a first workflow tag for indicating a workflow component or a second workflow tag for indicating a workflow pattern; generating the workflow of the task according to the workflow tags corresponding to the plurality of clauses in the first clause group; and generating a visual representation of the generated workflow for display by the workflow generation application.
- 20. The non-transitory computer-readable storage medium of claim 19, wherein the computer instructions, when executed by a computer, cause the computer to implement: decomposing the text describing the flow of the task to obtain a third clause group, wherein each clause of a plurality of clauses included in the third clause group comprises at most one workflow component; filtering the plurality of clauses in the third clause group to obtain a fourth clause group, wherein each clause of the plurality of clauses in the fourth clause group comprises a workflow component; and ordering the plurality of clauses in the fourth clause group to obtain the first clause group, or ordering the plurality of clauses in the fourth clause group to obtain a second clause group and inserting at least one identifier of at least one workflow pattern into the second clause group to obtain the first clause group, wherein the at least one workflow pattern is used for indicating the structure of the flow of the task and the relation among the plurality of functional units.
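The tag dictionary matching recited in the claims (matching each clause against predefined values and taking the tag with the highest similarity score) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the dictionary contents, the `match_tag` helper, and in particular the similarity function, which uses simple token-overlap (Jaccard) scoring as a stand-in for whatever semantic similarity processing an actual embodiment would use.

```python
import re

# Hypothetical tag dictionary: workflow tag -> predefined values (example phrases).
TAG_DICTIONARY = {
    "send_email": ["send an email", "email the customer"],
    "create_ticket": ["create a support ticket", "open a ticket"],
    "if_condition": ["if the condition holds", "when the condition is met"],
}


def tokens(text):
    """Lowercase word tokens of a clause or predefined value."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard token overlap; a stand-in for a real semantic similarity model."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_tag(clause, tag_dictionary=TAG_DICTIONARY):
    """Return (tag, score) for the predefined value most similar to the clause."""
    best_tag, best_score = None, -1.0
    for tag, values in tag_dictionary.items():
        for value in values:
            score = similarity(clause, value)
            if score > best_score:  # first best match wins on ties
                best_tag, best_score = tag, score
    return best_tag, best_score
```

For example, `match_tag("Send an email to the customer")` selects the hypothetical `send_email` tag. Replacing `similarity` with an embedding-based score would leave the selection logic unchanged.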
Description
Text processing method and computing device
Cross Reference to Related Applications
This application claims the benefit of U.S. patent application Ser. No. 18/480,342, filed on October 3, 2023, the disclosure of which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of natural language processing (NLP) technology, and in particular to a text processing method and a computing device.
Background
With the development of artificial intelligence (AI) technology, the use of AI models for natural language processing has become a major focus in the industry. At present, an AI model widely used in the field of natural language processing is the Transformer model. In a Transformer model comprising an encoder and a decoder, the role of the encoder is to process the input text and convert it into numerical vectors (called hidden state vectors) that a computer can interpret but a human cannot. These hidden state vectors contain context information related to the input text. This process is called natural language understanding (NLU). The encoder then sends these hidden state vectors to the decoder, which uses them to generate a response that resembles human language, in other words a human-like response. This process is known as natural language generation (NLG).
Natural language processing is increasingly focused on making its output easier for humans to understand. Finding ways to generate human-understandable descriptions remains an open area of investigation.
Disclosure of Invention
In a first aspect, a text processing method is provided.
When applied to a computing device, the method includes: receiving text describing a flow of a task for which a workflow is to be generated, as input to a workflow generation application executing on the computing device; obtaining a first clause group from the text describing the flow of the task, the first clause group including a plurality of clauses arranged in the order in which the flow occurs, each of the plurality of clauses in the first clause group including a workflow component for indicating information of a corresponding functional unit of a plurality of functional units for executing the task; determining a workflow tag corresponding to each clause in the first clause group, the workflow tags corresponding to the plurality of clauses in the first clause group including a first workflow tag for indicating a workflow component or a second workflow tag for indicating a workflow pattern; generating the workflow of the task from the workflow tags corresponding to the plurality of clauses in the first clause group; and generating a visual representation of the generated workflow for display by the workflow generation application.
In one possible implementation, obtaining the first clause group from the text describing the flow of the task includes: decomposing the text describing the flow of the task to obtain a third clause group, wherein each clause of a plurality of clauses included in the third clause group includes at most one workflow component; filtering the plurality of clauses in the third clause group to obtain a fourth clause group, wherein each clause of the plurality of clauses included in the fourth clause group includes one workflow component; and ordering the plurality of clauses in the fourth clause group to obtain the first clause group.
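The decompose, filter, and order steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the source performs these steps with encoder-based models, whereas the sketch substitutes simple rules. The component vocabulary, the ordering cue words, and all function names are hypothetical.

```python
import re
from functools import cmp_to_key

# Hypothetical stand-ins: the source uses encoder-based models for each step.
COMPONENT_WORDS = {"create", "send", "notify"}      # assumed component vocabulary
ORDER_CUES = {"first": 0, "then": 1, "finally": 2}  # assumed ordering cues


def words(clause):
    return re.findall(r"[a-z]+", clause.lower())


def decompose(text):
    """Sentence boundary detection, then clause splitting (rule-based stand-in)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    clauses = []
    for sentence in sentences:
        for part in re.split(r"\s+and\s+|;\s*", sentence.rstrip(".!?")):
            if part.strip():
                clauses.append(part.strip())
    return clauses  # the "third clause group"


def has_component(clause):
    """Stand-in for the model that decides whether a clause has a component."""
    return any(w in COMPONENT_WORDS for w in words(clause))


def rank(clause):
    for w in words(clause):
        if w in ORDER_CUES:
            return ORDER_CUES[w]
    return 1  # no cue word: treat as mid-flow


def in_correct_order(first, second):
    """Stand-in for the pairwise model: is (first, second) a correct order?"""
    return rank(first) <= rank(second)


def compare(a, b):
    ab, ba = in_correct_order(a, b), in_correct_order(b, a)
    if ab and not ba:
        return -1
    if ba and not ab:
        return 1
    return 0  # the pairwise judgment does not separate the two clauses


def build_first_clause_group(text):
    third_group = decompose(text)
    fourth_group = [c for c in third_group if has_component(c)]
    return sorted(fourth_group, key=cmp_to_key(compare))
```

Given the text "Finally notify the manager. First create a ticket and then send an email. This step is important.", `build_first_clause_group` drops the last sentence (no component word) and reorders the remaining three clauses by their cue words. A learned pairwise classifier may not give transitive judgments, so a real system might need a more robust ordering strategy than a comparison sort.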
In one possible implementation, obtaining the first clause group from the text describing the flow of the task includes: decomposing the text describing the flow of the task to obtain a third clause group, each of a plurality of clauses included in the third clause group including at most one workflow component; filtering the plurality of clauses in the third clause group to obtain a fourth clause group, each of the plurality of clauses included in the fourth clause group including one workflow component; ordering the plurality of clauses in the fourth clause group to obtain a second clause group; and inserting at least one identifier of at least one workflow pattern into the second clause group to obtain the first clause group, the at least one workflow pattern being used to indicate the structure of the flow of the task and the relation among the plurality of functional units.
In one possible implementation, decomposing the text describing the flow of the task to obtain the third clause group includes performing sentence boundary detection on the text to obtain a plurality of sentences of the text, and splitting each sentence of the plurality of sentences of the text into one or more clauses by a first encoder-based model to obtain the third clause group.
In one possible implementation, filtering the plurality of clauses in the third clause group to obtain the fourth clause group includes determining, through a second encoder-based model, whether each of the mult