US-20260127207-A1 - TECHNIQUES FOR DISCOVERING PROCESSES USING NATURAL LANGUAGE INPUT


Abstract

Techniques for using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs. The techniques include: receiving natural language input describing the process; generating a process representation at least in part by using a language model to process the natural language input; identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; selecting, based on user input, at least one of the multiple candidate instances; and storing the selected at least one candidate instance as at least one confirmed instance of the process.

Inventors

George Peter Nychis
Rohan Narayana Murty
Kevin Segundo Bello Medina

Assignees

SOROCO INDIA PRIVATE LIMITED

Dates

Publication Date: 20260507
Application Date: 20251104

Claims (20)

1. A method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: using at least one computer hardware processor to perform: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
2. The method of claim 1, wherein receiving the natural language input comprises receiving the natural language input from a user via a graphical user interface.
3. The method of claim 1, wherein the natural language input describes the process in part by identifying one or more application programs used to perform the process and one or more activities performed using the one or more application programs in furtherance of the process.
4. The method of claim 1, wherein the process representation indicates a set of activities and relationships among activities in the set of activities, the relationships indicating an order in which at least some of the activities in the set of activities are to be performed as part of the process.
5. The method of claim 4, wherein the process representation further indicates, for each particular activity in the set of activities: an identifier, a natural language description of the activity, and a set of one or more application programs used to perform the activity.
6. The method of claim 5, further comprising: generating a workflow graph visualization of the process representation, the workflow graph visualization comprising a graph with nodes representing activities in the set of activities and edges representing the relationships among the activities in the set of activities; and displaying the workflow graph visualization of the process representation in a graphical user interface (GUI).
7. The method of claim 6, wherein the GUI comprises a chatbot interface, the method further comprising: receiving, via the chatbot interface, further natural language input from the user indicating one or more modifications to make to the process representation; modifying the process representation in accordance with the further natural language input from the user to obtain an updated process representation; generating an updated workflow graph visualization of the updated process representation; and displaying the updated workflow graph visualization in the GUI.
8. The method of claim 4, wherein identifying, using the process representation and from among the multiple streams of event data, the multiple candidate instances of the process, comprises: generating a weighted finite-state automaton (WFSA) from the process representation, the WFSA comprising states, edges between pairs of states, and weights associated with the edges, the states comprising a respective state for each of the activities in the process representation; and identifying the multiple candidate instances of the process using the WFSA.
9. The method of claim 8, wherein each particular stream of the multiple streams of event data comprises a respective sequence of interaction steps performed by a respective particular user, wherein identifying the multiple candidate instances of the process using the WFSA, comprises: determining step-activity scores, the determining comprising, for each particular sequence of interaction steps among at least some of the sequences of interaction steps in the multiple streams of event data: determining a step-activity score for each pair of an interaction step from the particular sequence of interaction steps and an activity represented by a state in the WFSA; and identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA.
10. The method of claim 9, wherein the at least some sequences of interaction steps comprises a first sequence of interaction steps, the first sequence of interaction steps comprising a first interaction step, wherein the WFSA comprises a first state associated with a first activity, and wherein determining the step-activity scores comprises determining a first step-activity score for the first interaction step and the first activity at least in part by: determining a semantic similarity score for the first interaction step and the first activity; determining a symbolic score for the first interaction step and the first activity; determining a cross-encoder similarity score for the first interaction step and the first activity; and determining the first step-activity score as a weighted combination of the semantic similarity score, the symbolic score, and the cross-encoder similarity score.
11. The method of claim 10, wherein determining the semantic similarity score comprises: generating a textual description for the first interaction step by: generating interaction text data by aggregating textual labels and metadata associated with: (i) the first interaction step, and (ii) interaction steps related to the first interaction step; and providing the interaction text data as input to an LLM to obtain the textual description for the first interaction step; embedding the textual description for the first interaction step using a trained text embedding model to obtain a first embedded vector; embedding a textual description of the first activity using the trained text embedding model to obtain a second embedded vector; and determining the semantic similarity score using the first embedded vector and the second embedded vector.
12. The method of claim 10, wherein determining the symbolic score comprises determining the symbolic score using a measure of similarity between an application associated with the first interaction step and one or more applications associated with the first activity.
13. The method of claim 9, wherein identifying the multiple candidate instances of the process using the WFSA, further comprises: after identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA, ranking the multiple candidate instances based on their respective average step-activity scores; and selecting a number of candidate instances based on their ranking.
14. The method of claim 9, wherein identifying the multiple candidate instances of the process using the WFSA, further comprises: after identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA, generating a measure of confidence and textual workflow summary for at least some of the multiple candidate instances.
15. The method of claim 1, wherein the language model is a large language model (LLM), and wherein generating the process representation comprises prompting the LLM with the natural language input to obtain an output indicating a sequence of interaction steps, the output indicating for each interaction step in the sequence: a description of an interaction, an application used to perform the interaction, a screen name, an element name, and/or an indication of time spent during the interaction.
16. The method of claim 15, wherein prompting the LLM with the natural language input comprises: generating a prompt using the natural language input and a schema specifying a format of output to be generated by the LLM; and providing the prompt as input to the LLM, wherein the method further comprises training the LLM at least in part by: accessing a baseline LLM model; generating training data comprising pairs of natural language input and corresponding outputs, the generating comprising: selecting, at random, interaction sequences that are part of the multiple streams of event data; using the baseline LLM model to generate, as inputs, natural language prompts from the selected interaction sequences; and using the selected interaction sequences as outputs in the training data corresponding to the natural language prompts; and fine-tuning the baseline LLM model using the generated training data to obtain the LLM model.
17. The method of claim 1, further comprising generating a visualization of the at least one confirmed instance of the process.
18. The method of claim 1, further comprising: identifying, using the at least one confirmed instance of the process and from among the multiple streams of event data, multiple further candidate instances of the process.
19. A system, comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform a method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
20. At least one non-transitory computer-readable storage medium storing instructions that, when executed by at least one computer hardware processor, cause the at least one computer hardware processor to perform a method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
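
Illustrative sketch (not part of the claims): the scoring and dynamic-programming steps recited in claims 8 to 10 can be pictured with a small Viterbi-style alignment. Everything below is an assumption made for illustration only: the function names `combine_scores` and `best_alignment`, the default weights, the additive scoring, and the simplification that one whole sequence of interaction steps is aligned from a start state to a final state. The claims do not specify any of these details.

```python
import math
from typing import Dict, List, Tuple


def combine_scores(semantic: float, symbolic: float, cross_encoder: float,
                   weights: Tuple[float, float, float] = (0.4, 0.2, 0.4)) -> float:
    """Weighted combination of the three per-pair scores.

    The default weights are hypothetical; the source gives no values.
    """
    w_sem, w_sym, w_ce = weights
    return w_sem * semantic + w_sym * symbolic + w_ce * cross_encoder


def best_alignment(scores: List[List[float]],
                   edges: Dict[Tuple[int, int], float],
                   start_state: int,
                   final_state: int) -> Tuple[float, List[int]]:
    """Assign each interaction step to one WFSA state by dynamic programming.

    scores[t][s]  : step-activity score for step t and the activity of state s.
    edges[(a, b)] : weight of the edge from state a to state b; self-loops
                    must be listed explicitly if a state may span many steps.
    Returns (total_score, state_path), or (-inf, []) when no path exists.
    """
    n_steps, n_states = len(scores), len(scores[0])
    neg_inf = -math.inf
    best = [[neg_inf] * n_states for _ in range(n_steps)]
    back = [[-1] * n_states for _ in range(n_steps)]
    best[0][start_state] = scores[0][start_state]

    for t in range(1, n_steps):
        for (a, b), weight in edges.items():
            if best[t - 1][a] == neg_inf:
                continue  # state a is unreachable at step t - 1
            candidate = best[t - 1][a] + weight + scores[t][b]
            if candidate > best[t][b]:
                best[t][b] = candidate
                back[t][b] = a

    if best[-1][final_state] == neg_inf:
        return neg_inf, []

    # Follow back-pointers from the final state to recover the path.
    path = [final_state]
    for t in range(n_steps - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][final_state], path


# Hypothetical three-activity process: 0 -> 1 -> 2, each state may repeat.
example_edges = {(0, 0): 0.0, (0, 1): 0.0, (1, 1): 0.0, (1, 2): 0.0, (2, 2): 0.0}
example_scores = [
    [0.9, 0.1, 0.1],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [0.1, 0.2, 0.7],
]
total, path = best_alignment(example_scores, example_edges, 0, 2)
```

In this toy input the first two steps score highest against the first activity, so the recovered path keeps them in state 0 before moving on. A fuller system would also need to locate where an instance starts and ends inside a longer stream, which this sketch leaves out.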

Description

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/716,163, filed Nov. 4, 2024, titled “DISCOVERY TECHNIQUES, SEGMENTATION, AND COLLABORATION FROM INTERACTION DATA,” and of U.S. Provisional Patent Application Ser. No. 63/782,478, filed Apr. 2, 2025, titled “ASSISTING PROCESS DISCOVERY WITH ENCODER AND DECODER MODELS,” each of which is incorporated by reference herein in its entirety.

BACKGROUND

Employees at many companies spend much of their time working on computers. An employer may monitor an employee's computer activity by installing a monitoring application program on the employee's work computer to monitor the employee's actions. For example, an employer may install a keystroke logger application on the employee's work computer. The keystroke logger application may be used to capture the employee's keystrokes and store the captured keystrokes in a text file for subsequent analysis.

SUMMARY

Some embodiments provide for a method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising using at least one computer hardware processor to perform: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
In some embodiments, receiving the natural language input comprises receiving the natural language input from a user via a graphical user interface. In some embodiments, the natural language input describes the process in part by identifying one or more application programs used to perform the process and one or more activities performed using the one or more application programs in furtherance of the process. In some embodiments, the process representation is an activity-level process representation. In some embodiments, the process representation indicates a set of activities and relationships among activities in the set of activities, the relationships indicating an order in which at least some of the activities in the set of activities are to be performed as part of the process. In some embodiments, the process representation further indicates, for each particular activity in the set of activities: an identifier, a natural language description of the activity, and a set of one or more application programs used to perform the activity. In some embodiments, the method further comprises: generating a workflow graph visualization of the process representation, the workflow graph visualization comprising a graph with nodes representing activities in the set of activities and edges representing the relationships among the activities in the set of activities; and displaying the workflow graph visualization of the process representation in a graphical user interface (GUI).
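
As a minimal illustration of the activity-level process representation described above, the following Python sketch stores, for each activity, an identifier, a natural language description, and a set of application programs, along with ordering relationships that can be exported as the nodes and edges of a workflow graph. The class names, field names, the `to_graph` helper, and the example activities and application names are all hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Activity:
    identifier: str          # unique id of the activity
    description: str         # natural language description
    applications: List[str]  # application programs used to perform it


@dataclass
class ProcessRepresentation:
    activities: List[Activity]
    # Ordering relationships as (earlier_id, later_id) pairs.
    order: List[Tuple[str, str]] = field(default_factory=list)

    def to_graph(self) -> Dict[str, list]:
        """Return node ids and directed edges for a workflow graph view."""
        nodes = [a.identifier for a in self.activities]
        known = set(nodes)
        for src, dst in self.order:
            if src not in known or dst not in known:
                raise ValueError(f"edge ({src}, {dst}) uses an unknown activity")
        return {"nodes": nodes, "edges": list(self.order)}


# Hypothetical two-activity process used only as an example.
representation = ProcessRepresentation(
    activities=[
        Activity("a1", "Open the invoice email", ["Email client"]),
        Activity("a2", "Enter the invoice into the ERP system", ["ERP application"]),
    ],
    order=[("a1", "a2")],
)
graph = representation.to_graph()
```

A plain dictionary of nodes and edges is used here so that any graph drawing library could render it; the source does not say how the visualization is produced.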
In some embodiments, the GUI comprises a chatbot interface, the method further comprising: receiving, via the chatbot interface, further natural language input from the user indicating one or more modifications to make to the process representation; modifying the process representation in accordance with the further natural language input from the user to obtain an updated process representation; generating an updated workflow graph visualization of the updated process representation; and displaying the updated workflow graph visualization in the GUI. In some embodiments, the language model is a large language model. In some embodiments, identifying, using the process representation and from among the multiple streams of event data, the multiple candidate instances of the process, comprises: generating a weighted finite-state automaton (WFSA) from the process representation, the WFSA comprising states, edges between pairs of states, and weights associated with the edges, the states comprising a respective state for each of the activities in the process representation; and identifying the multiple candidate instances of the process using the WFSA. In some embodiments, each particular stream of the multiple streams of event data comprises a respective sequence of interaction steps performed by a respective particular user, and identifying the multiple candidate instances of the process using the WFSA, comprises: determining step-activity scores, the determining comprising, for each particular sequen