
WO-2026097079-A1 - TECHNIQUES FOR PROCESS DISCOVERY AND USER GUIDANCE

Abstract

This application describes novel techniques for process discovery and for guiding users using discovered process instances. Process discovery techniques involve receiving natural language input describing a process, generating a process representation from the natural language input using a language model, and identifying at least one candidate instance of the process using the natural language input. User guidance techniques involve obtaining a stream of event data corresponding to a series of interactions between at least one application program and a user; identifying, within historical digital interaction data and using the stream of event data, at least one instance of the process previously performed by at least one user; generating guidance for the user performing the process using the at least one instance of the process, the guidance indicating one or more suggested acts for the user in furtherance of performing the process; and providing the user with generated guidance.

Inventors

  • NYCHIS, GEORGE PETER
  • MURTY, ROHAN NARAYANA
  • BELLO MEDINA, Kevin Segundo

Assignees

  • SOROCO INDIA PRIVATE LIMITED
  • SOROCO AMERICAS PRIVATE LIMITED
  • SOROCO PRIVATE LIMITED

Dates

Publication Date
2026-05-07
Application Date
2025-11-04
Priority Date
2024-11-04

Claims (20)

  1. A method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: using at least one computer hardware processor to perform: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
  2. The method of claim 1, wherein receiving the natural language input comprises receiving the natural language input from a user via a graphical user interface.
  3. The method of claim 1 or any other preceding claim, wherein the natural language input describes the process in part by identifying one or more application programs used to perform the process and one or more activities performed using the one or more application programs in furtherance of the process.
  4. The method of claim 1 or any other preceding claim, wherein the process representation is an activity-level process representation.
  5. The method of claim 1 or any other preceding claim, wherein the process representation indicates a set of activities and relationships among activities in the set of activities, the relationships indicating an order in which at least some of the activities in the set of activities are to be performed as part of the process. 14537339 #14565713v1
  6. The method of claim 5, wherein the process representation further indicates, for each particular activity in the set of activities: an identifier, a natural language description of the activity, and a set of one or more application programs used to perform the activity.
  7. The method of claim 5, further comprising: generating a workflow graph visualization of the process representation, the workflow graph visualization comprising a graph with nodes representing activities in the set of activities and edges representing the relationships among the activities in the set of activities; and displaying the workflow graph visualization of the process representation in a graphical user interface (GUI).
  8. The method of claim 7, wherein the GUI comprises a chatbot interface, the method further comprising: receiving, via the chatbot interface, further natural language input from the user indicating one or more modifications to make to the process representation; modifying the process representation in accordance with the further natural language input from the user to obtain an updated process representation; generating an updated workflow graph visualization of the updated process representation; and displaying the updated workflow graph visualization in the GUI.
  9. The method of claim 1 or any other preceding claim, wherein the language model is a large language model.
  10. The method of claim 5, wherein identifying, using the process representation and from among the multiple streams of event data, the multiple candidate instances of the process, comprises: generating a weighted finite-state automaton (WFSA) from the process representation, the WFSA comprising states, edges between pairs of states, and weights associated with the edges, the states comprising a respective state for each of the activities in the process representation; and identifying the multiple candidate instances of the process using the WFSA.
  11. The method of claim 10, wherein each particular stream of the multiple streams of event data comprises a respective sequence of interaction steps performed by a respective particular user, wherein identifying the multiple candidate instances of the process using the WFSA, comprises: determining step-activity scores, the determining comprising, for each particular sequence of interaction steps among at least some of the sequences of interaction steps in the multiple streams of event data: determining a step-activity score for each pair of an interaction step from the particular sequence of interaction steps and an activity represented by a state in the WFSA; and identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA.
  12. The method of claim 11, wherein the at least some sequences of interaction steps comprise a first sequence of interaction steps, the first sequence of interaction steps comprising a first interaction step, wherein the WFSA comprises a first state associated with a first activity, and wherein determining the step-activity scores comprises determining a first step-activity score for the first interaction step and the first activity at least in part by: determining a semantic similarity score for the first interaction step and the first activity; determining a symbolic score for the first interaction step and the first activity; optionally, determining a cross-encoder similarity score for the first interaction step and the first activity; and determining the first step-activity score as a weighted combination of the semantic similarity score, the symbolic score, and, optionally, the cross-encoder similarity score.
  13. The method of claim 12, wherein determining the semantic similarity score comprises: generating a textual description for the first interaction step by: generating interaction text data by aggregating textual labels and metadata associated with: (i) the first interaction step, and (ii) interaction steps related to the first interaction step; and providing the interaction text data as input to an LLM to obtain the textual description for the first interaction step; embedding the textual description for the first interaction step using a trained text embedding model to obtain a first embedded vector; embedding a textual description of the first activity using the trained text embedding model to obtain a second embedded vector; and determining the semantic similarity score using the first embedded vector and the second embedded vector.
  14. The method of claim 12, wherein determining the symbolic score comprises determining the symbolic score using a measure of similarity between an application associated with the first interaction step and one or more applications associated with the first activity.
  15. The method of claim 11, wherein identifying the multiple candidate instances of the process using the WFSA, further comprises: after identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA, ranking the multiple candidate instances based on their respective average step-activity scores; and selecting a number of candidate instances based on their ranking.
  16. The method of claim 11, wherein identifying the multiple candidate instances of the process using the WFSA, further comprises: after identifying, using dynamic programming, the multiple candidate instances using the step-activity scores and the weights associated with the edges of the WFSA, generating a measure of confidence and a textual workflow summary for at least some of the multiple candidate instances.
  17. The method of claim 1 or any other preceding claim, wherein the process representation is an interaction step-level process representation.
  18. The method of claim 1 or any other preceding claim, wherein the language model is a large language model (LLM), and wherein generating the process representation comprises prompting the LLM with the natural language input to obtain an output indicating a sequence of interaction steps, the output indicating for each interaction step in the sequence: a description of an interaction, an application used to perform the interaction, a screen name, an element name, and/or an indication of time spent during the interaction.
  19. The method of claim 18, wherein prompting the LLM with the natural language input comprises: generating a prompt using the natural language input and a schema specifying a format of output to be generated by the LLM; and providing the prompt as input to the LLM.
  20. The method of claim 19, further comprising training the LLM at least in part by: accessing a baseline LLM model; generating training data comprising pairs of natural language input and corresponding outputs, the generating comprising: selecting, at random, interaction sequences that are part of the multiple streams of event data; using the baseline LLM model to generate, as inputs, natural language prompts from the selected interaction sequences; and using the selected interaction sequences as outputs in the training data corresponding to the natural language prompts; and fine-tuning the baseline LLM model using the generated training data to obtain the LLM model.
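
As an illustration only, the step-activity scoring and dynamic-programming identification recited in claims 11, 12, and 14 might be sketched as follows. The claims do not fix the form of either score or of the path objective, so several things here are assumptions made for the sketch: cosine similarity stands in for the semantic similarity score, a binary application match stands in for the symbolic score, the optional cross-encoder score is omitted, the 0.7/0.3 weights are arbitrary, and a path is scored by summing step-activity scores and edge weights. The dictionary keys (`embedding`, `app`, `apps`) and function names are hypothetical, and the toy vectors replace the output of a trained text embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def step_activity_score(step, activity, w_semantic=0.7, w_symbolic=0.3):
    """Weighted combination of a semantic score and a symbolic score."""
    semantic = cosine(step["embedding"], activity["embedding"])
    # Assumed symbolic score: 1 if the step's application is one of the
    # activity's applications, otherwise 0.
    symbolic = 1.0 if step["app"] in activity["apps"] else 0.0
    return w_semantic * semantic + w_symbolic * symbolic


def best_alignment(steps, activities, edge_weight):
    """Viterbi-style dynamic program assigning each step to one activity.

    edge_weight maps (from_activity, to_activity) to a weight; pairs that are
    absent are treated as disallowed transitions. Returns the average
    step-activity score along the best path and the activity chosen per step.
    """
    if not steps:
        raise ValueError("steps must not be empty")
    names = list(activities)
    score = {(i, a): step_activity_score(s, activities[a])
             for i, s in enumerate(steps) for a in names}
    best = [{a: score[(0, a)] for a in names}]
    back = [{}]
    for i in range(1, len(steps)):
        best.append({})
        back.append({})
        for a in names:
            # Best predecessor state for activity a at step i.
            prev, total = max(
                ((p, best[i - 1][p] + edge_weight.get((p, a), -math.inf))
                 for p in names),
                key=lambda item: item[1],
            )
            best[i][a] = total + score[(i, a)]
            back[i][a] = prev
    # Trace the best path backwards from the best final state.
    path = [max(names, key=lambda a: best[-1][a])]
    for i in range(len(steps) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    avg = sum(score[(i, a)] for i, a in enumerate(path)) / len(path)
    return avg, path


# Hypothetical two-activity process and a three-step interaction sequence.
activities = {
    "open invoice": {"embedding": [1.0, 0.0], "apps": {"Outlook"}},
    "enter invoice": {"embedding": [0.0, 1.0], "apps": {"SAP"}},
}
edge_weight = {
    ("open invoice", "open invoice"): -0.1,    # stay in the same activity
    ("enter invoice", "enter invoice"): -0.1,
    ("open invoice", "enter invoice"): 0.0,    # declared order
}
steps = [
    {"embedding": [0.9, 0.1], "app": "Outlook"},
    {"embedding": [0.8, 0.2], "app": "Outlook"},
    {"embedding": [0.1, 0.9], "app": "SAP"},
]
avg, path = best_alignment(steps, activities, edge_weight)
print(path, round(avg, 3))
```

The returned average step-activity score is the kind of quantity that claim 15 uses to rank candidate instances; running the same routine over many sequences and sorting by that value would be one way to select the top candidates.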

Description

TECHNIQUES FOR PROCESS DISCOVERY AND USER GUIDANCE

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Serial No. 63/716,163, filed November 4, 2024, titled "DISCOVERY TECHNIQUES, SEGMENTATION, AND COLLABORATION FROM INTERACTION DATA," and of U.S. Provisional Patent Application Serial No. 63/782,478, filed April 2, 2025, titled “ASSISTING PROCESS DISCOVERY WITH ENCODER AND DECODER MODELS,” each of which is incorporated by reference herein in its entirety.

BACKGROUND

Employees at many companies spend much of their time working on computers. An employer may monitor an employee’s computer activity by installing a monitoring application program on the employee’s work computer to monitor the employee’s actions. For example, an employer may install a keystroke logger application on the employee’s work computer. The keystroke logger application may be used to capture the employee’s keystrokes and store the captured keystrokes in a text file for subsequent analysis.

SUMMARY

Some embodiments provide for a method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising using at least one computer hardware processor to perform: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.

In some embodiments, receiving the natural language input comprises receiving the natural language input from a user via a graphical user interface.

In some embodiments, the natural language input describes the process in part by identifying one or more application programs used to perform the process and one or more activities performed using the one or more application programs in furtherance of the process.

In some embodiments, the process representation is an activity-level process representation.

In some embodiments, the process representation indicates a set of activities and relationships among activities in the set of activities, the relationships indicating an order in which at least some of the activities in the set of activities are to be performed as part of the process.

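
By way of a non-limiting illustration, a process representation that indicates a set of activities and ordering relationships among them might be held in a structure such as the one below. This is a minimal sketch under stated assumptions: the class names `Activity` and `ProcessRepresentation`, the method names, and the example activities and applications are all hypothetical, and ordering relationships are assumed to be simple (before, after) pairs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Activity:
    name: str                 # short label for the activity
    applications: list[str]   # application programs used to perform it


@dataclass
class ProcessRepresentation:
    activities: dict[str, Activity] = field(default_factory=dict)
    # (before, after) pairs: "before" is performed ahead of "after".
    order: list[tuple[str, str]] = field(default_factory=list)

    def add_activity(self, name: str, applications: list[str]) -> None:
        self.activities[name] = Activity(name, applications)

    def add_order(self, before: str, after: str) -> None:
        if before not in self.activities or after not in self.activities:
            raise KeyError("both activities must be added before relating them")
        self.order.append((before, after))

    def one_valid_order(self) -> list[str]:
        """Return one ordering of the activities that respects every relationship."""
        sorter = TopologicalSorter({name: set() for name in self.activities})
        for before, after in self.order:
            sorter.add(after, before)   # "after" depends on "before"
        return list(sorter.static_order())


# Hypothetical invoice-handling process.
rep = ProcessRepresentation()
rep.add_activity("open invoice", ["Outlook"])
rep.add_activity("enter invoice", ["SAP"])
rep.add_activity("confirm payment", ["SAP", "Excel"])
rep.add_order("open invoice", "enter invoice")
rep.add_order("enter invoice", "confirm payment")
print(rep.one_valid_order())
```

Storing the relationships as explicit pairs keeps the structure close to the wording above, since a relationship need only constrain the order of some of the activities; a topological sort then yields one ordering consistent with whatever constraints are present.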
In some embodiments, the process representation further indicates, for each particular activity in the set of activities: an identifier, a natural language description of the activity, and a set of one or more application programs used to perform the activity.

In some embodiments, the method further comprises: generating a workflow graph visualization of the process representation, the workflow graph visualization comprising a graph with nodes representing activities in the set of activities and edges representing the relationships among the activities in the set of activities; and displaying the workflow graph visualization of the process representation in a graphical user interface (GUI).

In some embodiments, the GUI comprises a chatbot interface, the method further comprising: receiving, via the chatbot interface, further natural language input from the user indicating one or more modifications to make to the process representation; modifying the process representation in accordance with the further natural language input from the user to obtain an updated process representation; generating an updated workflow graph visualization of the updated process representation; and displaying the updated workflow graph visualization in the GUI.

In some embodiments, the language model is a large language model.

In some embodiments, identifying, using the process representation and from among the multiple streams of event data, the multiple candidate instances of the process, comprises: generating a weighted finite-state automaton (WFSA) from the process representation, the WFSA comprising states, edges between pairs of states, and weights associated with the edges, the states comprising a respective state for each of the activities in the process representation; and identifying the multiple candidate instances of the process using the WFSA.

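
As a non-limiting illustration, generating a WFSA with a respective state for each activity might look like the sketch below. How the edge weights are chosen is not stated in the text above, so the weighting is an assumption made for the example: each declared ordering relationship becomes an edge with weight 0.0, and each state gets a self-loop with a small penalty so that several consecutive interaction steps can map to the same activity. The names `WFSA`, `build_wfsa`, `follow_weight`, and `stay_weight` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WFSA:
    states: list[str]   # one state per activity
    weights: dict[tuple[str, str], float] = field(default_factory=dict)

    def weight(self, src: str, dst: str) -> float:
        """Edge weight, or -inf when no edge exists (transition not allowed)."""
        return self.weights.get((src, dst), float("-inf"))


def build_wfsa(activities, order, follow_weight=0.0, stay_weight=-0.1):
    """Build a WFSA with one state per activity.

    Assumed weighting (not taken from the source): an edge for each declared
    (before, after) relationship and a self-loop on every state.
    """
    wfsa = WFSA(states=list(activities))
    for name in activities:
        wfsa.weights[(name, name)] = stay_weight
    for before, after in order:
        if before not in activities or after not in activities:
            raise ValueError(f"unknown activity in relationship: {before!r} -> {after!r}")
        wfsa.weights[(before, after)] = follow_weight
    return wfsa


# Hypothetical three-activity process.
wfsa = build_wfsa(
    activities=["open invoice", "enter invoice", "confirm payment"],
    order=[("open invoice", "enter invoice"), ("enter invoice", "confirm payment")],
)
print(wfsa.weight("open invoice", "enter invoice"))     # declared edge
print(wfsa.weight("confirm payment", "open invoice"))   # no such edge
```

Returning negative infinity for a missing edge lets a downstream search treat weights additively while still ruling out transitions that the process representation does not permit.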
In some embodiments, each particular stream of the multiple streams of event data comprises a respective sequence of interaction steps performed by a respective particular user, and identifying the multiple candidate instances of the process using the WFSA, comprises: dete