US-12619936-B1 - Automatic root cause and corrective action identification system and method based on incident records

Abstract

Disclosed herein is a machine learning based system and method for root cause identification and corrective action recommendation to address various workplace incidents. An example system may include a computing device configured to formulate a root cause superset to represent a plurality of root causes, identify fields included in a received incident record in connection with a set of predetermined field descriptors to generate a textual description of the incident record, store data relating to the root cause superset and the textual description of the incident record as a trained model, instruct one or more large language models (LLMs) to determine at least one root cause and corrective action based at least upon the textual description of the incident record and the root cause superset, and generate data for a display device to indicate the at least one root cause and corrective action.

Inventors

Julia Penfield
Pulkit Trushantkumar Parikh
Marc L. Juaire
Dana C. Garber

Assignees

VelocityEHS Holdings, Inc.

Dates

Publication Date: 2026-05-05
Application Date: 2025-02-03

Claims (20)

1. A system deployed within a communication network for root cause identification and corrective action recommendation, the system comprising: a computing device, comprising: a non-transitory computer-readable storage medium storing instructions; and a processor coupled to the non-transitory computer-readable storage medium and configured to execute the instructions to: formulate a first root cause superset to represent a first plurality of root causes of various workplace incidents, obtain a second root cause superset to represent a second plurality of root causes of various workplace incidents within specific organizational or domain contexts, obtain an incident record, identify fields included in the incident record in connection with a set of predetermined field descriptors, generate a textual description of the incident record based on selected fields of the incident record corresponding to the set of predetermined field descriptors, store data relating to the first root cause superset, the second root cause superset, and the textual description of the incident record on the non-transitory computer-readable storage medium as a trained model, instruct a first large language model (LLM) to determine at least one root cause of the incident record based at least upon the first root cause superset, the second root cause superset, and the textual description of the incident record by scanning the textual description of the incident record for keywords, phrases, and patterns that correspond to first predefined root cause categories of the second root cause superset representing the second plurality of root causes of various workplace incidents within specific organizational or domain contexts, instructing the first LLM to perform contextual inference or understanding to map the keywords, phrases, and patterns to the first predefined root cause categories of the second root cause superset in accordance with a selected parameter to reduce a randomness of outputs of the first LLM, and in response to determining that there is no match between the keywords, phrases, and patterns and the first predefined root cause categories of the second root cause superset, instructing the first LLM to perform the contextual inference or understanding to map the keywords, phrases, and patterns to second predefined root cause categories of the first root cause superset representing the first plurality of root causes of various workplace incidents, instruct a second LLM to determine at least one corrective action for the at least one root cause based at least upon the textual description of the incident record, and generate data for a display device to indicate the at least one root cause and the at least one corrective action.
2. The system of claim 1, wherein the processor of the computing device is further configured to execute the instructions to: obtain at least one hazard responsible for the incident record; and instruct the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, and the at least one hazard.
3. The system of claim 2, wherein the processor of the computing device is further configured to execute the instructions to: instruct the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, the at least one hazard, and the second root cause superset.
4. The system of claim 2, wherein the processor of the computing device is further configured to execute the instructions to: instruct the second LLM to determine the at least one corrective action for the at least one root cause based at least upon the textual description of the incident record and the at least one hazard responsible for the incident record.
5. The system of claim 1, wherein the processor of the computing device is configured to execute the instructions to formulate the first root cause superset by building a dataset of records of the workplace incidents, each record including a plurality of heterogeneous fields, wherein each of the plurality of heterogeneous fields includes at least one of: an unstructured text input with a number of unique values, a categorical input with a number of unique textual values, a quantity-based input including numeric values and associated textual information, and date and time information relating to a workplace incident.
6. The system of claim 1, wherein the processor of the computing device is further configured to execute the instructions to instruct the first LLM to produce a structured response to represent the at least one root cause of the incident record using extensible markup language (XML) tags.
7. The system of claim 6, wherein the processor of the computing device is further configured to execute the instructions to set a tunable parameter of the first LLM to make the structured response maximally deterministic.
8. The system of claim 1, wherein the processor of the computing device is further configured to execute the instructions to instruct the second LLM to produce a response to represent the at least one corrective action as a JavaScript Object Notation (JSON) dictionary where the at least one root cause serves as a key.
9. The system of claim 1, wherein the processor of the computing device is further configured to execute the instructions to modify the at least one root cause of the incident record on the display device based on a user input.
10. The system of claim 1, wherein the processor of the computing device is further configured to execute the instructions to modify the at least one corrective action on the display device based on a user input.
11. A computer-implemented method, comprising: formulating, by a processor of a computing device, a first root cause superset to represent a first plurality of root causes of various workplace incidents; obtaining, by the processor of the computing device, a second root cause superset to represent a second plurality of root causes of various workplace incidents within specific organizational or domain contexts; obtaining, by the processor of the computing device, an incident record; identifying, by the processor of the computing device, fields included in the incident record in connection with a set of predetermined field descriptors; generating, by the processor of the computing device, a textual description of the incident record based on selected fields of the incident record corresponding to the set of predetermined field descriptors; storing, by the processor of the computing device, data relating to the first root cause superset, the second root cause superset, and the textual description of the incident record on the non-transitory computer-readable storage medium as a trained model; instructing, by the processor of the computing device, a first large language model (LLM) to determine at least one root cause of the incident record based at least upon the first root cause superset, the second root cause superset, and the textual description of the incident record by scanning the textual description of the incident record for keywords, phrases, and patterns that correspond to first predefined root cause categories of the second root cause superset representing the second plurality of root causes of various workplace incidents within specific organizational or domain contexts, instructing the first LLM to perform contextual inference or understanding to map the keywords, phrases, and patterns to the first predefined root cause categories of the second root cause superset in accordance with a selected parameter to reduce a randomness of outputs of the first LLM, and in response to determining that there is no match between the keywords, phrases, and patterns and the first predefined root cause categories of the second root cause superset, instructing the first LLM to perform the contextual inference or understanding to map the keywords, phrases, and patterns to second predefined root cause categories of the first root cause superset representing the first plurality of root causes of various workplace incidents; instructing, by the processor of the computing device, a second LLM to determine at least one corrective action for the at least one root cause based at least upon the textual description of the incident record; and generating, by the processor of the computing device, data for a display device to indicate the at least one root cause and the at least one corrective action.
12. The computer-implemented method of claim 11, further comprising: obtaining at least one hazard responsible for the incident record; and instructing the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, and the at least one hazard.
13. The computer-implemented method of claim 12, further comprising: instructing the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, the at least one hazard, and the second root cause superset.
14. The computer-implemented method of claim 12, further comprising: instructing the second LLM to determine the at least one corrective action for the at least one root cause based at least upon the textual description of the incident record and the at least one hazard responsible for the incident record.
15. The computer-implemented method of claim 11, wherein the formulating the first root cause superset includes building a dataset of records of the workplace incidents, each record including a plurality of heterogeneous fields, wherein each of the plurality of heterogeneous fields includes at least one of: an unstructured text input with a number of unique values, a categorical input with a number of unique textual values, a quantity-based input including numeric values and associated textual information, and date and time information relating to a workplace incident.
16. The computer-implemented method of claim 11, further comprising instructing the first LLM to produce a structured response to represent the at least one root cause of the incident record using extensible markup language (XML) tags.
17. The computer-implemented method of claim 16, further comprising setting a tunable parameter of the first LLM to make the structured response maximally deterministic.
18. The computer-implemented method of claim 11, further comprising instructing the second LLM to produce a response to represent the at least one corrective action as a JavaScript Object Notation (JSON) dictionary where the at least one root cause serves as a key.
19. The computer-implemented method of claim 11, further comprising modifying the at least one root cause of the incident record on the display device based on a user input.
20. The computer-implemented method of claim 11, further comprising modifying the at least one corrective action on the display device based on a user input.
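
The two-tier category lookup recited in claims 1 and 11, together with the output formats of claims 6, 8, 16 and 18, might be sketched as follows. This is an illustrative sketch only, not the patented implementation: the `LLM` callable signature, the `<root_cause>` tag name, the prompt wording, and every function name are assumptions introduced here. The model is first offered the organization-specific (second) superset at a low temperature; only if no category comes back is it offered the general (first) superset.

```python
import json
import re
from typing import Callable, Dict, List, Sequence

# Hypothetical LLM interface: takes (prompt, temperature), returns raw text.
LLM = Callable[[str, float], str]

# Assumed tag name; the claims only say "XML tags".
ROOT_CAUSE_TAG = re.compile(r"<root_cause>(.*?)</root_cause>", re.DOTALL)


def ask_root_causes(llm: LLM, description: str, categories: Sequence[str],
                    temperature: float = 0.0) -> List[str]:
    """Ask the model to pick categories; keep only those on the allowed list."""
    prompt = (
        "Incident description:\n" + description + "\n\n"
        "Allowed root cause categories:\n"
        + "\n".join("- " + c for c in categories)
        + "\n\nWrap each applicable category in <root_cause></root_cause> tags. "
        "If none applies, output no tags."
    )
    reply = llm(prompt, temperature)  # low temperature reduces output randomness
    allowed = set(categories)
    found = [m.strip() for m in ROOT_CAUSE_TAG.findall(reply)]
    return [c for c in found if c in allowed]


def identify_root_causes(llm: LLM, description: str,
                         org_superset: Sequence[str],
                         general_superset: Sequence[str]) -> List[str]:
    """Try the organization-specific superset first, then the general one."""
    causes = ask_root_causes(llm, description, org_superset)
    if not causes:  # no match in the second superset: fall back to the first
        causes = ask_root_causes(llm, description, general_superset)
    return causes


def parse_corrective_actions(reply: str,
                             root_causes: Sequence[str]) -> Dict[str, List[str]]:
    """Read a JSON dictionary keyed by root cause into {cause: [actions]}."""
    data = json.loads(reply)
    return {rc: list(data.get(rc, [])) for rc in root_causes}
```

Filtering the parsed tags against the allowed list is a design choice of this sketch: it keeps a model reply from introducing a category that appears in neither superset.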

Description

FIELD OF TECHNOLOGY

The present disclosure generally relates to a system and method for automatically managing hazards at workplaces, and more particularly relates to a machine learning (ML) based computing system and method for automatically identifying root causes of safety incidents based at least upon obtained incident reports or records and recommending corrective action(s) for each identified root cause.

BACKGROUND

Effective risk management is critical for workplace safety and operational continuity. It can also enhance competitive value and prevent environmental damage. An essential component of risk management is root cause analysis, which involves identifying the underlying causes of incidents. It is typically followed by corrective action recommendation, which involves identifying one or more corrective actions for every root cause identified to prevent the future occurrence of similar incidents. Traditional approaches to root cause analysis often rely on manual investigation, which can be time-consuming, subjective, and inconsistent across different facilities. Furthermore, the identification of the correct root causes and the recommendation of effective corrective actions typically require highly skilled safety professionals, who are in short supply. Accordingly, there is a need for an ML-based approach for automatically identifying the root causes of safety incidents at workplaces and recommending corrective action(s) for each identified root cause.

SUMMARY

Among other features, in one aspect, the present disclosure relates to a system deployed within a communication network for root cause identification and corrective action recommendation.
In one embodiment, the system comprises: a computing device, comprising: a non-transitory computer-readable storage medium storing instructions; and a processor coupled to the non-transitory computer-readable storage medium and configured to execute the instructions to: formulate a first root cause superset to represent a first plurality of root causes of various workplace incidents, obtain an incident record, identify fields included in the incident record in connection with a set of predetermined field descriptors, generate a textual description of the incident record based on selected fields of the incident record corresponding to the set of predetermined field descriptors, store data relating to the first root cause superset and the textual description of the incident record on the non-transitory computer-readable storage medium as a trained model, instruct a first large language model (LLM) to determine at least one root cause of the incident record based at least upon the first root cause superset and the textual description of the incident record, instruct a second LLM to determine at least one corrective action for the at least one root cause based at least upon the textual description of the incident record, and generate data for a display device to indicate the at least one root cause and the at least one corrective action. In some embodiments, the processor of the computing device may be further configured to execute the instructions to: obtain at least one hazard responsible for the incident record; and instruct the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, and the at least one hazard.
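
As an illustration of the step that generates a textual description from selected fields, the following minimal sketch assumes that the predetermined field descriptors can be modeled as a mapping from record keys to human-readable labels. The descriptor set, the field names, and the `describe_incident` function are hypothetical and do not come from the disclosure.

```python
from typing import Any, Dict, Mapping

# Hypothetical descriptor set: record key -> label used in the description.
FIELD_DESCRIPTORS: Dict[str, str] = {
    "incident_date": "Date of incident",
    "location": "Location",
    "description": "What happened",
    "injury_type": "Injury type",
}


def describe_incident(record: Mapping[str, Any],
                      descriptors: Mapping[str, str] = FIELD_DESCRIPTORS) -> str:
    """Build a plain-text description from the fields named by the descriptors.

    Fields absent from the descriptor set are ignored, and empty values are
    skipped, so only the selected fields reach the language model prompt.
    """
    lines = []
    for key, label in descriptors.items():
        value = record.get(key)
        if value is None or str(value).strip() == "":
            continue
        lines.append(f"{label}: {str(value).strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "location": "Warehouse B",
        "description": " Slipped on oil near the loading dock ",
        "internal_id": 17,  # not in the descriptor set, so it is dropped
        "injury_type": "",  # empty, so it is skipped
    }
    print(describe_incident(example))
```

The resulting text could then be passed, together with a root cause superset, to whatever model interface the system uses.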
In yet another embodiment, the processor of the computing device may be further configured to execute the instructions to: obtain a second root cause superset to represent a second plurality of root causes of various workplace incidents within specific organizational or domain contexts; and instruct the first LLM to determine the at least one root cause of the incident record based at least upon the first root cause superset, the textual description of the incident record, the at least one hazard, and the second root cause superset. In additional embodiments, the processor of the computing device may be further configured to execute the instructions to instruct the second LLM to determine the at least one corrective action for the at least one root cause based at least upon the textual description of the incident record and the at least one hazard responsible for the incident record. According to certain embodiments, the processor of the computing device may be configured to execute the instructions to formulate the first root cause superset by building a dataset of records of the workplace incidents, each record including a plurality of heterogeneous fields, wherein each of the plurality of heterogeneous fields includes at least one of: an unstructured text input with a number of unique values, a categorical input with a number of unique textual values, a quantity-based input including numeric values and associated textual information, and date and time information relating to a workplace incident. In addition, the processor of the compu