US-20260127525-A1 - USER GUIDANCE THROUGH PROCESS DISCOVERY

Abstract

Techniques for guiding a user in performing a process based on historical digital interaction data of one or more users performing the process, the historical digital interaction data comprising multiple streams of event data, the method comprising: obtaining a stream of event data corresponding to a series of interactions between at least one application program executing on the user's computing device and the user performing the process using the at least one application program; identifying, within the historical digital interaction data and using the stream of event data, at least one instance of the process previously performed by at least one user; generating guidance for the user performing the process using the at least one instance of the process, the guidance indicating one or more suggested acts for the user in furtherance of performing the process; and providing the generated guidance to the user.

Inventors

  • George Peter Nychis
  • Rohan Narayana Murty
  • Kevin Segundo Bello Medina

Assignees

  • SOROCO INDIA PRIVATE LIMITED

Dates

Publication Date
20260507
Application Date
20251104

Claims (20)

  1. A method of guiding a user in performing a process based on historical digital interaction data of one or more users performing the process, the historical digital interaction data comprising multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: using at least one computer hardware processor to perform: (A) obtaining a stream of event data corresponding to a series of interactions between at least one application program executing on the user's computing device and the user performing the process using the at least one application program; (B) identifying, within the historical digital interaction data and using the stream of event data, at least one instance of the process previously performed by at least one user; (C) generating guidance for the user performing the process using the at least one instance of the process, the guidance indicating one or more suggested acts for the user in furtherance of performing the process; and (D) providing the generated guidance to the user.
  2. The method of claim 1, wherein the at least one instance of the process previously performed by at least one user is performed by at least one user different from the user.
  3. The method of claim 1, further comprising: determining that guidance is to be generated for the user performing the process.
  4. The method of claim 3, wherein determining that the guidance is to be generated for the user performing the process comprises: determining that the guidance is to be generated in response to the user requesting assistance in performing the process, or automatically determining that the guidance is to be generated in response to detecting that at least one guidance generation criterion is met.
  5. The method of claim 3, further comprising: performing (B) and (C), in response to determining that the guidance is to be generated for the user performing the process, or performing (C), in response to determining that the guidance is to be generated for the user performing the process.
  6. The method of claim 3, further comprising: after identifying the at least one instance of the process at act (B), determining that the guidance is to be generated for the user, wherein the determining is based on a measure of confidence that the at least one instance of the process is an instance of the process being performed by the user.
  7. The method of claim 1, further comprising: continuously capturing event data while the user is interacting with the user's computing device, wherein (A) comprises obtaining event data captured within a threshold amount of time.
  8. The method of claim 1, wherein the stream of event data contains event data for each event in a stream of events, wherein (B) comprises: organizing events in the stream of events into at least one window of events, each of the at least one window of events comprising one or multiple events in the stream of events; generating, using at least one trained embedding ML model, at least one numeric representation corresponding to the at least one window of events; determining a measure of similarity between the at least one numeric representation and each of multiple stored and previously-determined numeric representations of respective windows of events in the multiple streams of event data in the historical digital interaction data to obtain a plurality of measures of similarity; and identifying, using the determined plurality of measures of similarity, the at least one instance of the process in the stream of events.
  9. The method of claim 8, wherein the at least one window of events comprises a first window comprising a first plurality of events, wherein generating the at least one numeric representation corresponding to the at least one window of events comprises generating a first numeric representation of the first window, wherein generating the first numeric representation of the first window comprises: for each particular event in the first plurality of events, processing event data for the particular event using the trained embedding ML model to obtain a numeric representation for the particular event, thereby generating numeric representations of events in the first plurality of events; and combining the numeric representations of the events in the first plurality of events to obtain the first numeric representation of the first window.
  10. The method of claim 9, wherein the combining comprises: normalizing each of the numeric representations to obtain normalized numeric representations; and generating the first numeric representation of the first window as a weighted average of the normalized numeric representations, wherein generating the first numeric representation of the first window as a weighted average comprises weighting the normalized numeric representations based on durations and/or recency of events from which the normalized numeric representations were derived.
  11. The method of claim 9, wherein the first plurality of events comprises a first event corresponding to an interaction between a user and an application program, wherein the event data for the first event comprises attribute-value pairs derived from information about the interaction between the user and a GUI of the application program, wherein processing the event data for the first event comprises: generating a textual event representation of the first event using the attribute-value pairs in the event data for the first event; tokenizing the textual event representation to obtain a tokenized event representation; determining an initial numeric encoding of the tokenized event representation; and processing the initial numeric encoding with the trained embedding ML model to obtain a numeric representation of the first event.
  12. The method of claim 11, wherein the attribute-value pairs comprise values for one or more attributes selected from the group consisting of: a name of the application program, a title of an application program screen of the application program with which the user interacted during the first event, an identifier of a user interface element of the application program screen with which the user interacted, a type of the user interface element of the application program screen with which the user interacted, one or more identifiers for one or more user interface elements of the application program screen with which the user did not interact, a duration of the interaction, and one or more textual phrases and/or sentences appearing on the application program screen.
  13. The method of claim 11, wherein the trained embedding ML model comprises a trained neural network having a transformer-based architecture, wherein the trained neural network has a BERT model architecture or a RoBERTa model architecture.
  14. The method of claim 1, wherein generating the guidance for the user performing the process comprises presenting the user with a textual or graphical description of the at least one instance of the process.
  15. A method of guiding a user in performing a process based on historical digital interaction data of one or more users performing the process, the historical digital interaction data comprising multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: using at least one computer hardware processor to perform: (A) obtaining a stream of event data corresponding to a series of interactions between at least one application program executing on the user's computing device and the user performing the process using the at least one application program; (B) identifying, using the historical digital interaction data, the stream of event data, and a trained large language model (LLM), one or more suggested acts for the user to perform in furtherance of performing the process; and (C) generating guidance for the user performing the process using the identified one or more suggested acts.
  16. The method of claim 15, further comprising: generating a prompt from the stream of event data; prompting the trained large language model with the prompt generated from the stream of event data to obtain an output indicating one or more acts that the user could perform as part of performing the process, wherein the trained LLM was trained by fine-tuning a baseline LLM with the historical digital interaction data.
  17. The method of claim 16, further comprising: accessing the baseline LLM; and fine-tuning the baseline LLM with the historical digital interaction data using low-rank adaptors (LoRA).
  18. The method of claim 17, wherein (C) further comprises presenting the user with the one or more acts that the user could perform as part of performing the process, wherein the presenting comprises providing the user with a textual or graphical description of the one or more acts that the user could perform.
  19. A system, comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform a method of guiding a user in performing a process based on historical digital interaction data of one or more users performing the process, the historical digital interaction data comprising multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising: (A) obtaining a stream of event data corresponding to a series of interactions between at least one application program executing on the user's computing device and the user performing the process using the at least one application program; (B) identifying, within the historical digital interaction data and using the stream of event data, at least one instance of the process previously performed by at least one user; (C) generating guidance for the user performing the process using the at least one instance of the process, the guidance indicating one or more suggested acts for the user in furtherance of performing the process; and (D) providing the generated guidance to the user.
  20. The system of claim 19, wherein the stream of event data contains event data for each event in a stream of events, wherein (B) comprises: organizing events in the stream of events into at least one window of events, each of the at least one window of events comprising one or multiple events in the stream of events; generating, using at least one trained embedding ML model, at least one numeric representation corresponding to the at least one window of events; determining a measure of similarity between the at least one numeric representation and each of multiple stored and previously-determined numeric representations of respective windows of events in the multiple streams of event data in the historical digital interaction data to obtain a plurality of measures of similarity; and identifying, using the determined plurality of measures of similarity, the at least one instance of the process in the stream of events.
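
The window-matching steps recited in claims 8 through 11 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: `toy_embed` is a hypothetical hashed-token stand-in for the trained embedding ML model (the claims name BERT or RoBERTa architectures), cosine similarity is assumed as the measure of similarity, weighting uses event durations only, and the vector size, function names, and sample events are all invented for the example.

```python
import hashlib
import math

DIM = 16  # toy vector size; an assumption, not a value from the claims


def event_to_text(event):
    """Render an event's attribute-value pairs as a textual representation."""
    return " ".join(f"{key}={event[key]}" for key in sorted(event) if key != "duration")


def toy_embed(text):
    """Hypothetical stand-in for the trained embedding model (hashed token counts)."""
    vector = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vector[digest[0] % DIM] += 1.0
    return vector


def normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def window_vector(events):
    """Combine per-event vectors into one window vector, weighted by duration."""
    total = sum(event["duration"] for event in events)
    combined = [0.0] * DIM
    for event in events:
        # Fall back to equal weights if no durations were recorded.
        weight = event["duration"] / total if total > 0 else 1.0 / len(events)
        unit = normalize(toy_embed(event_to_text(event)))
        for i, x in enumerate(unit):
            combined[i] += weight * x
    return combined


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def rank_windows(query_events, stored_vectors):
    """Return (name, similarity) pairs sorted from most to least similar."""
    query = window_vector(query_events)
    scores = [(name, cosine(query, vec)) for name, vec in stored_vectors.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: one current window compared with two stored windows.
invoice = [
    {"app": "ERP", "screen": "Invoice", "element": "amount_field", "duration": 4.0},
    {"app": "ERP", "screen": "Invoice", "element": "submit_button", "duration": 1.0},
]
email = [{"app": "Mail", "screen": "Compose", "element": "send_button", "duration": 2.0}]
stored = {"invoice_entry": window_vector(invoice), "send_email": window_vector(email)}
ranking = rank_windows(invoice, stored)
```

In this sketch a stored window built from the same events as the query scores approximately 1.0. A real system would replace `toy_embed` with a trained model and would likely use an approximate nearest-neighbor index rather than a linear scan.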

Description

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/716,163, filed Nov. 4, 2024, titled “DISCOVERY TECHNIQUES, SEGMENTATION, AND COLLABORATION FROM INTERACTION DATA,” and of U.S. Provisional Patent Application Ser. No. 63/782,478, filed Apr. 2, 2025, titled “ASSISTING PROCESS DISCOVERY WITH ENCODER AND DECODER MODELS,” each of which is incorporated by reference herein in its entirety.

BACKGROUND

Employees at many companies spend much of their time working on computers. An employer may monitor an employee's computer activity by installing a monitoring application program on the employee's work computer to monitor the employee's actions. For example, an employer may install a keystroke logger application on the employee's work computer. The keystroke logger application may be used to capture the employee's keystrokes and store the captured keystrokes in a text file for subsequent analysis.

SUMMARY

Some embodiments provide for a method of using natural language to identify instances of a process in multiple streams of event data, each particular stream of event data, from among the multiple streams, corresponding to a series of interactions between one or more application programs executing on a particular computing device and a particular user performing the process using the one or more application programs, the method comprising using at least one computer hardware processor to perform: (A) receiving natural language input describing the process; (B) generating a process representation at least in part by using a language model to process the natural language input; (C) identifying, using the process representation and from among the multiple streams of event data, multiple candidate instances of the process; (D) selecting, based on user input, at least one of the multiple candidate instances; and (E) storing the selected at least one candidate instance as at least one confirmed instance of the process.
In some embodiments, receiving the natural language input comprises receiving the natural language input from a user via a graphical user interface. In some embodiments, the natural language input describes the process in part by identifying one or more application programs used to perform the process and one or more activities performed using the one or more application programs in furtherance of the process. In some embodiments, the process representation is an activity-level process representation. In some embodiments, the process representation indicates a set of activities and relationships among activities in the set of activities, the relationships indicating an order in which at least some of the activities in the set of activities are to be performed as part of the process. In some embodiments, the process representation further indicates, for each particular activity in the set of activities: an identifier, a natural language description of the activity, and a set of one or more application programs used to perform the activity. In some embodiments, the method further comprises: generating a workflow graph visualization of the process representation, the workflow graph visualization comprising a graph with nodes representing activities in the set of activities and edges representing the relationships among the activities in the set of activities; and displaying the workflow graph visualization of the process representation in a graphical user interface (GUI).
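
As an illustration only, the activity-level process representation and its workflow graph described above could be modeled with a small data structure such as the following Python sketch. The class names, field names, and the node and edge dictionary layout are assumptions made for this example and do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Activity:
    """One activity: an identifier, a description, and the applications used."""
    identifier: str
    description: str
    applications: tuple


@dataclass
class ProcessRepresentation:
    """A set of activities plus ordering relationships among them."""
    activities: dict = field(default_factory=dict)
    order: list = field(default_factory=list)  # (before_id, after_id) pairs

    def add_activity(self, activity):
        self.activities[activity.identifier] = activity

    def add_order(self, before_id, after_id):
        # Relationships may only reference activities that are already known.
        if before_id not in self.activities or after_id not in self.activities:
            raise KeyError("both activities must be added before ordering them")
        self.order.append((before_id, after_id))

    def workflow_graph(self):
        """Return nodes (activities) and edges (relationships) for display."""
        nodes = [
            {"id": a.identifier, "label": a.description, "apps": list(a.applications)}
            for a in self.activities.values()
        ]
        edges = [{"source": before, "target": after} for before, after in self.order]
        return {"nodes": nodes, "edges": edges}


# Hypothetical two-activity process.
process = ProcessRepresentation()
process.add_activity(Activity("a1", "Open the invoice email", ("Mail",)))
process.add_activity(Activity("a2", "Enter the invoice in the ERP system", ("ERP",)))
process.add_order("a1", "a2")
graph = process.workflow_graph()
```

The returned dictionary is deliberately renderer-agnostic, so any graph visualization component could consume the nodes and edges to draw the workflow.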
In some embodiments, the GUI comprises a chatbot interface, the method further comprising: receiving, via the chatbot interface, further natural language input from the user indicating one or more modifications to make to the process representation; modifying the process representation in accordance with the further natural language input from the user to obtain an updated process representation; generating an updated workflow graph visualization of the updated process representation; and displaying the updated workflow graph visualization in the GUI. In some embodiments, the language model is a large language model. In some embodiments, identifying, using the process representation and from among the multiple streams of event data, the multiple candidate instances of the process comprises: generating a weighted finite-state automaton (WFSA) from the process representation, the WFSA comprising states, edges between pairs of states, and weights associated with the edges, the states comprising a respective state for each of the activities in the process representation; and identifying the multiple candidate instances of the process using the WFSA. In some embodiments, each particular stream of the multiple streams of event data comprises a respective sequence of interaction steps performed by a respective particular user, and identifying the multiple candidate instances of the process using the WFSA comprises: determining step-activity scores, the determining comprising, for each particular sequen