US-20260127063-A1 - LANGUAGE MODEL ASSISTED ERROR ANALYSIS SYSTEM

Abstract

Computer-implemented systems and methods including language models for explaining and resolving code errors. A computer-implemented method may include: receiving or accessing a log comprising an error message, the error message indicating an error in code; determining the error message from the log; determining a context associated with the error; generating a prompt for a large language model (“LLM”), the prompt comprising at least: the error message, and the context associated with the error; transmitting the prompt to the LLM; and receiving an output from the LLM in response to the prompt, the output comprising at least: an explanation of the error message, and a suggested fix for the error.
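The steps in the abstract can be sketched as a small pipeline. This is a minimal illustration, not the applicant's implementation. The regex pattern, the choice of adjacent log lines as context, the prompt wording, and the names `find_error_index`, `adjacent_context`, `build_prompt`, `analyze_log`, and `call_llm` are all assumptions introduced here. The LLM is represented by a caller-supplied function so that no particular model API is implied.

```python
import re
from typing import Callable, List, Optional

# Assumed pattern: a log line is treated as the error message if it
# contains one of these markers. Real logs may need a different pattern.
ERROR_RE = re.compile(r"\b(ERROR|Exception|Traceback)\b")


def find_error_index(lines: List[str]) -> Optional[int]:
    """Return the index of the first log line that looks like an error."""
    for i, line in enumerate(lines):
        if ERROR_RE.search(line):
            return i
    return None


def adjacent_context(lines: List[str], index: int, radius: int = 2) -> List[str]:
    """Return up to `radius` log lines on each side of the error line."""
    start = max(0, index - radius)
    return lines[start:index] + lines[index + 1:index + 1 + radius]


def build_prompt(error_message: str, context: List[str]) -> str:
    """Combine instructions, the error message, and its context."""
    return "\n".join([
        "Explain the error below and suggest a fix.",
        "Error message:",
        error_message,
        "Context:",
        *context,
    ])


def analyze_log(log_text: str, call_llm: Callable[[str], str]) -> Optional[str]:
    """Find the error, build a prompt, and return the LLM output (or None)."""
    lines = log_text.splitlines()
    index = find_error_index(lines)
    if index is None:
        return None
    prompt = build_prompt(lines[index], adjacent_context(lines, index))
    # `call_llm` is a hypothetical stand-in for transmitting the prompt.
    return call_llm(prompt)
```

Passing the model call in as a function keeps the sketch testable: substituting `lambda prompt: prompt` returns the generated prompt itself, which makes it easy to inspect what would be sent.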

Inventors

  • Akshay Agrawal
  • Sudarshan Sanjay Ruikar
  • Ndeye Fatou Diop
  • Frauke Hein
  • Christopher Jeganathan
  • Oleh Igorovych Busko
  • Claudia Rafaela Rogoz
  • Dauren Abdykaparov
  • Philipp Shchekin
  • Ryan Norris

Assignees

  • Palantir Technologies Inc.

Dates

Publication Date
2026-05-07
Application Date
2025-10-13

Claims (20)

  1. A computerized method, performed by a computing system having one or more hardware computer processors and one or more non-transitory computer-readable storage devices storing software instructions executable by the computing system, the computerized method comprising: receiving or accessing an error message, the error message indicating an error in code; determining a context associated with the error; generating a prompt for a large language model (“LLM”), the prompt comprising at least: the error message, the context associated with the error, and one or more instructions that instruct the LLM to generate a suggested fix for the error based on the error message and the context associated with the error; transmitting the prompt to the LLM; receiving an output from the LLM in response to the prompt, the output comprising at least: the suggested fix for the error; and implementing the suggested fix in response to a user input accepting the suggested fix, or automatically implementing the suggested fix.
  2. The computerized method of claim 1, wherein the one or more instructions instruct the LLM to indicate one or more lines of the code that the LLM determines to be likely to cause the error.
  3. The computerized method of claim 1, wherein the one or more instructions instruct the LLM to refrain from generating the suggested fix if the LLM determines that a cause of the error is unclear.
  4. The computerized method of claim 1, wherein the one or more instructions instruct the LLM to generate an explanation of the error message.
  5. The computerized method of claim 4, wherein the output comprises the explanation of the error message.
  6. The computerized method of claim 1, wherein receiving or accessing the error message comprises: receiving or accessing a log comprising the error message; and determining the error message from the log.
  7. The computerized method of claim 6, wherein determining the error message from the log comprises: executing a semantic search or a regular expression (“regex”) search on the log to identify the error message, wherein the error message comprises one or more text strings.
  8. The computerized method of claim 1, wherein the context associated with the error comprises portions of one or more documents associated with the code.
  9. The computerized method of claim 8, wherein determining the context associated with the error comprises: generating, based at least in part on the error message, one or more search criteria; and executing, using at least the one or more search criteria, a similarity search in a set of documents to identify the portions of the one or more documents associated with the code.
  10. The computerized method of claim 9, wherein the similarity search comprises execution of a document search model, and wherein the computerized method further comprising: generating the document search model, wherein generating the document search model comprises: chunking the set of documents into a plurality of portions of the set of documents; and vectorizing the plurality of portions of the set of documents to generate a plurality of vectors.
  11. The computerized method of claim 8, wherein the portions of the one or more documents associated with the code comprise document portions having a threshold similarity with the error message.
  12. The computerized method of claim 8, wherein the context associated with the error comprises one or more citations to the one or more documents.
  13. The computerized method of claim 6, wherein the context associated with the error comprises extended portions of the log that are adjacent to the error message in the log.
  14. The computerized method of claim 1, wherein the context associated with the error comprises a portion of the code associated with the error.
  15. The computerized method of claim 14, wherein determining the context associated with the error comprises: accessing the code from a repository that stores the code; and identifying, based on the error message, the portion of the code associated with the error.
  16. The computerized method of claim 14, wherein the portion of the code associated with the error comprises a difference between multiple versions of at least a section of the code.
  17. The computerized method of claim 1 further comprising: providing, via a user interface, the output from the LLM.
  18. The computerized method of claim 1, wherein the suggested fix comprises a modification to at least a section of the code.
  19. A system comprising: one or more computer-readable storage mediums having program instructions embodied therewith; and one or more processors configured to execute the program instructions to cause the system to perform the computerized method of claim 1.
  20. A computer program product comprising one or more computer-readable storage mediums having program instructions embodied therewith, the program instructions executable by one or more processors to cause the one or more processors to perform the computerized method of claim 1.
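The document search recited in claims 9 through 12 (chunking, vectorizing, and a similarity threshold, with citations back to the source documents) can be illustrated with a toy example. This is only a sketch under stated assumptions: word counts stand in for whatever vectorization the system actually uses, cosine similarity is one possible similarity measure, and the names `chunk`, `vectorize`, `cosine`, `build_index`, and `search`, the chunk size, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def chunk(text: str, size: int = 40) -> List[str]:
    """Split a document into portions of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def vectorize(text: str) -> Counter:
    """Toy vectorizer: lowercase word counts (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: Dict[str, str], size: int = 40) -> List[Tuple[str, str, Counter]]:
    """Chunk every document and vectorize each portion."""
    return [
        (name, portion, vectorize(portion))
        for name, text in documents.items()
        for portion in chunk(text, size)
    ]


def search(index: List[Tuple[str, str, Counter]], error_message: str,
           threshold: float = 0.2) -> List[Tuple[str, str, float]]:
    """Return (citation, portion, score) for portions meeting the threshold."""
    query = vectorize(error_message)  # the error message acts as the search criteria
    hits = [(name, portion, cosine(query, vec)) for name, portion, vec in index]
    hits = [h for h in hits if h[2] >= threshold]
    return sorted(hits, key=lambda h: h[2], reverse=True)
```

Each hit carries the document name alongside the matching portion, so the name can serve as the citation that is passed to the LLM with the context.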

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 18/735057, filed Jun. 5, 2024, and titled “LANGUAGE MODEL ASSISTED ERROR ANALYSIS SYSTEM,” which claims benefit of U.S. Provisional Patent Application No. 63/596491, filed Nov. 6, 2023, and titled “LLM-POWERED REMOTE WORKSPACE ERROR-ENHANCER,” and U.S. Provisional Patent Application No. 63/559421, filed Feb. 29, 2024, and titled “LANGUAGE MODEL ASSISTED ERROR ANALYSIS SYSTEM.” The entire disclosure of each of the above items is hereby made part of this specification as if set forth fully herein and incorporated by reference for all purposes, for all that it contains.

Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57 for all purposes and for all that they contain.

TECHNICAL FIELD

The present disclosure relates to systems and techniques for utilizing computer-based models. More specifically, the present disclosure relates to computerized systems and techniques including large language models for analysis and resolution of software program code errors.

BACKGROUND

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

Computers can be programmed to perform calculations and operations utilizing one or more computer-based models. For example, language models can be utilized to provide and/or predict a probability distribution over sequences of words.

SUMMARY

The systems, methods, and devices described herein each have several aspects, no single one of which is solely responsible for its desirable attributes.
Without limiting the scope of this disclosure, several non-limiting features will now be described briefly.

Computer-based platforms may provide various applications by executing software instructions and/or other executable code written in any combination of one or more programming languages. However, errors may be encountered while compiling or executing code, and it may be difficult to analyze errors when code becomes more complex or powerful. For example, the complexity of modern code bases may require review, analysis, and understanding of large amounts of data and information (e.g., large volumes of documentation, many code files, analysis on changes to code files, or the like) to effectively analyze code errors.

Although a Large Language Model (“LLM”) can be utilized to analyze an error, some LLMs may only handle prompts within a limited size and may be inefficient in analyzing a large corpus of code or error logs that include both information related and unrelated to the error. Further, some LLMs may hallucinate (e.g., generate factually incorrect or nonsensical information) or be ineffective in analyzing code errors when operating on prompts that are generic or include insufficient context to the error.

The present disclosure implements systems and methods (generally collectively referred to herein as “an error analysis system” or simply a “system”) that can advantageously overcome various of the technical challenges mentioned above, among other technical challenges. For example, various implementations of the systems and methods of the present disclosure can advantageously employ one or more LLMs for explaining, based on prompt generation including context relevant or specific to a code error, the code error recorded in a log that is generated while utilizing code to implement a service. The one or more LLMs may further suggest a code fix based on the prompt.
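One way to keep a prompt within a limited size while still including the context most relevant to the error is to rank candidate snippets and keep only those that fit a budget. The following is a minimal sketch of that idea, not a method described in this disclosure. The function name `pack_context`, the greedy strategy, and the use of a character count are assumptions made for illustration; a real system would more likely count tokens using the model's own tokenizer.

```python
from typing import List, Tuple


def pack_context(snippets: List[Tuple[float, str]], budget_chars: int) -> List[str]:
    """Greedily keep the highest-scoring snippets that fit the budget.

    `snippets` holds (relevance score, text) pairs. The character budget
    is an assumed stand-in for a model-specific token limit.
    """
    chosen: List[str] = []
    used = 0
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        if used + len(text) <= budget_chars:
            chosen.append(text)
            used += len(text)
    return chosen
```

With a budget of 100 characters and three snippets of 50, 60, and 40 characters scored 0.9, 0.5, and 0.8, the sketch keeps the 50- and 40-character snippets and skips the lower-scoring 60-character one because it would exceed the budget.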
Advantageously, the system can enable effective code error analysis and/or fixes by providing the context most associated with the code errors to one or more LLMs. Thus, prompts for the LLMs can stay within a size limit and may enable LLMs to effectively analyze code errors. Additionally, LLM(s) may generate outputs that more accurately explain code errors and/or pinpoint associated issues based on prompts tailored to the code errors.

Various embodiments of the present disclosure provide improvements to various technologies and technological fields. For example, as described above, the system may advantageously generate a prompt for an LLM based on context most associated with a code error for enabling one or more LLMs to accurately explain the code error and/or suggest a code fix based on the prompt. Other technical benefits provided by various embodiments of the present disclosure include, for example, enabling LLM(s) to more effectively pinpoint associated issues based on prompts tailored to the code errors, and automatically fixing code errors. Additionally, various implementations of the p