
JP-2026076131-A - Managing long-term memory recall for large language models via self-reflection protocols

JP 2026076131 A

Abstract

[Problem] To provide a method for developing and managing a long-term memory solution for a large language model (LLM) agent offered as a service. [Solution] Following task-related communication between the LLM agent and a service user, a reflection agent of the service extracts domain knowledge, user preferences, and information on the success or failure of the requested task into a data sample. The LLM agent then stores the data sample in a long-term memory database that remains accessible in the future, so that the agent can recall information from previous interactions to perform new tasks for the user more efficiently. [Selection Diagram] Figure 3
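As a rough illustration of the pipeline the abstract describes, the reflection step distills a finished task conversation into a structured data sample covering domain knowledge, user preferences, and task outcome. The sketch below is not taken from the patent: the class name, the keyword cues, and the in-memory list standing in for the long-term memory database are all hypothetical, and a real implementation would pose "self-reflection questions" to an LLM rather than scan for fixed phrases.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class DataSample:
    """One long-term-memory record distilled from a finished task conversation."""
    user_id: str
    task: str
    domain_knowledge: str            # e.g. an enterprise-specific procedure
    user_preferences: list = field(default_factory=list)
    outcome: str = "unknown"         # "success" or "failure"

def reflect(user_id, task, transcript):
    """Toy reflection protocol: scan the transcript for preference and outcome cues.
    A real reflection agent would ask an LLM these questions instead of matching strings."""
    prefs = [line for line in transcript if line.lower().startswith("i prefer")]
    succeeded = any("thanks, that works" in line.lower() for line in transcript)
    return DataSample(
        user_id=user_id,
        task=task,
        domain_knowledge=transcript[0],   # first turn as a crude domain-knowledge stand-in
        user_preferences=prefs,
        outcome="success" if succeeded else "failure",
    )

memory_db = []  # stand-in for the long-term memory database
sample = reflect(
    "user-1",
    "generate quarterly report script",
    ["Use the internal reporting template first.",
     "I prefer Python over Bash.",
     "Thanks, that works."],
)
memory_db.append(asdict(sample))
print(json.dumps(memory_db[0], indent=2))
```

The stored record keeps the three extraction targets named in the abstract as separate fields, so later recall can filter on any of them independently.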

Inventors

  • Jorge Enrique Piazentin Ono
  • Jiajing Guo
  • Vikram Mohanty
  • Wenbin Hu
  • Liu Ren

Assignees

  • Robert Bosch Gesellschaft mit beschränkter Haftung (Robert Bosch GmbH)

Dates

Publication Date
2026-05-11
Application Date
2025-10-22
Priority Date
2024-10-23

Claims (20)

  1. A computer-implemented method for managing long-term memory of a large language model (LLM), the method comprising: receiving, at an LLM agent of an LLM service, a first request from a user of the LLM service to perform a first task; executing, by the LLM agent, the first task and outputting a result of the first task to the user; executing, by a reflection agent of the LLM service, a reflection protocol to generate a data sample based on an analysis of communication between the user and the LLM agent during execution of the first task; storing the data sample in a long-term memory database of the LLM service; in response to receiving a second request to the LLM agent to perform a second task, performing, by the LLM agent, a recall using the long-term memory database to determine one or more data samples related to performance of the second task; and performing, by the LLM agent, the second task based on the recall of the one or more data samples stored in the long-term memory database.
  2. The computer-implemented method according to claim 1, wherein executing the reflection protocol comprises: analyzing the communication to determine a domain knowledge category to which the first task corresponds; generating a text-based data sample from the domain knowledge category and the communication corresponding to the output result of the first task; searching the long-term memory database for related tasks based on the domain knowledge category; and supplying the text-based data sample and links to the related tasks to the long-term memory database for storage.
  3. The computer-implemented method according to claim 2, wherein the domain knowledge categories include at least enterprise-specific procedures and professional workplace skill sets.
  4. The computer-implemented method according to claim 1, wherein executing the reflection protocol comprises: analyzing the communication over its entire duration to determine user preference indicators of the user; generating a text-based data sample from the communication corresponding to the user preferences of the user; labeling the text-based data sample as corresponding to the user; and supplying the text-based data sample and the label to the long-term memory database for storage.
  5. The computer-implemented method according to claim 4, wherein the user preference indicators include at least an indicator of a computer operating system for a particular user and an indicator of a preferred programming language for a particular user.
  6. The computer-implemented method according to claim 1, wherein executing the reflection protocol comprises: analyzing the communication to determine an indicator that the LLM agent successfully completed the first task; generating a text-based data sample from the communication corresponding to the deliverable of the first task requested by the user and the output result of the first task by the LLM agent; labeling the text-based data sample as a success of the LLM agent; and supplying the text-based data sample and the label to the long-term memory database for storage.
  7. The computer-implemented method according to claim 6, wherein the indicator of successful completion includes explicit communication from the user confirming that the output result corresponds to the deliverable of the requested first task.
  8. The computer-implemented method according to claim 1, wherein executing the reflection protocol comprises: analyzing the communication to determine an indicator that the LLM agent failed to execute the first task; generating a text-based data sample from the communication corresponding to the deliverable of the first task requested by the user and the failed result of the first task by the LLM agent; labeling the text-based data sample as a failure of the LLM agent; and supplying the text-based data sample and the label to the long-term memory database for storage.
  9. The computer-implemented method according to claim 8, wherein the indicator of failure includes explicit communication from the user that the output result does not correspond to the deliverable of the requested first task.
  10. The computer-implemented method according to claim 1, further comprising: providing, by the reflection agent via a user interface of the LLM service, the data samples generated during execution of the reflection protocol to the user; receiving, from the user, an indicator editing one of the data samples or an indicator adding an additional data sample; and storing the edited data sample or the added data sample in the long-term memory database of the LLM service.
  11. A computer-implemented method for managing long-term memory of a large language model (LLM), the method comprising: receiving, at a reflection agent of an LLM service, a log of communication between an LLM agent and a user of the LLM service, the log including a request from the user to perform a task and responses from the LLM agent performing the task; generating, by the reflection agent, data samples by executing a reflection protocol based on an analysis of the request and the responses in the communication log, wherein the reflection protocol extracts the data samples based on domain knowledge categories, user preferences, and indicators of success or failure of the task execution by the LLM agent; and storing, by the LLM agent, the data samples in a long-term memory database of the LLM service for future recall when performing other tasks for other users of the LLM service.
  12. The computer-implemented method according to claim 11, further comprising generating the reflection protocol for the LLM service, wherein self-reflection questions to be executed by the reflection agent are generated using a prompt engineering method.
  13. The computer-implemented method according to claim 11, further comprising: receiving, at the LLM agent, another request to perform another task; performing a recall using the long-term memory database to determine one or more of the data samples related to execution of the other task; and performing, by the LLM agent, the other task based on the recall of the one or more data samples stored in the long-term memory database.
  14. A system comprising: a database configured to store a plurality of data samples made accessible to an LLM agent and a reflection agent of a large language model (LLM) service; and a computing device configured to implement the LLM service, wherein the LLM service is configured to: upon receiving a first request from a user of the LLM service to the LLM agent to perform a first task, execute, by the LLM agent, the first task and output a result of the first task to the user; execute, by the reflection agent, a reflection protocol to generate an additional data sample based on an analysis of communication between the user and the LLM agent during execution of the first task; supply the additional data sample to long-term storage within the database; in response to receiving a second request to the LLM agent to perform a second task, access, by the LLM agent, the long-term storage in the database to determine one or more of the data samples related to execution of the second task; and perform, by the LLM agent, the second task based on the one or more data samples.
  15. The system according to claim 14, wherein the computing device is further configured to: implement a user interface for the LLM service; in response to execution of the reflection protocol, supply the additional data samples to the user via the user interface; receive, from the user, an indicator editing one of the additional data samples or an indicator adding another data sample; and supply one or more edited data samples of the additional data samples to the long-term storage in the database.
  16. The system according to claim 15, wherein the database is further configured to store user permissions corresponding to access to the plurality of data samples, and wherein the computing device is further configured to: in response to receiving another request from the user to access one or more of the data samples, verify that the user has permission based on the user permissions stored in the database; and supply the one or more of the plurality of data samples to the user via the user interface.
  17. The system according to claim 14, wherein, to execute the reflection protocol, the computing device is further configured to: analyze the communication to determine a domain knowledge category to which the first task corresponds; generate a text-based data sample from the domain knowledge category and the communication corresponding to the output result of the first task; search the long-term storage within the database for related tasks based on the domain knowledge category; and supply the text-based data sample and links to the related tasks to the long-term storage in the database for storage.
  18. The system according to claim 14, wherein, to execute the reflection protocol, the computing device is further configured to: analyze the communication over its entire duration to determine user preference indicators of the user; generate a text-based data sample from the communication corresponding to the user preferences of the user; label the text-based data sample as corresponding to the user; and supply the text-based data sample and the label to the long-term storage in the database for storage.
  19. The system according to claim 14, wherein, to execute the reflection protocol, the computing device is further configured to: analyze the communication to determine an indicator that the LLM agent successfully completed the first task; generate a text-based data sample from the communication corresponding to the deliverable of the first task requested by the user and the output result of the first task by the LLM agent; label the text-based data sample as a success of the LLM agent; and supply the text-based data sample and the label to the long-term storage in the database for storage.
  20. The system according to claim 14, wherein, to execute the reflection protocol, the computing device is further configured to: analyze the communication to determine an indicator that the LLM agent failed to execute the first task; generate a text-based data sample from the communication corresponding to the deliverable of the first task requested by the user and the failed result of the first task by the LLM agent; label the text-based data sample as a failure of the LLM agent; and supply the text-based data sample and the label to the long-term storage in the database for storage.
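The recall step recited in claims 1, 13, and 14 can be sketched as a relevance lookup over stored data samples. The word-overlap scoring below is purely an illustrative placeholder (the claims do not specify a retrieval mechanism; a real service would more likely use embedding similarity), and the sample records are invented for the example.

```python
# Minimal sketch of long-term-memory recall: given a new task, return the
# stored data samples most relevant to it. Word-overlap scoring is a
# stand-in for whatever retrieval the service actually uses.
def recall(memory_db, new_task, k=2):
    query_words = set(new_task.lower().split())

    def score(sample):
        return len(set(sample["task"].lower().split()) & query_words)

    ranked = sorted(memory_db, key=score, reverse=True)
    return [s for s in ranked[:k] if score(s) > 0]

memory_db = [
    {"task": "generate quarterly report script", "outcome": "success"},
    {"task": "summarize meeting notes", "outcome": "success"},
    {"task": "generate annual report script", "outcome": "failure"},
]
hits = recall(memory_db, "generate a monthly report script")
print([h["task"] for h in hits])
```

Returning failure-labeled samples alongside successes matches claims 6 and 8: a recalled failure tells the agent which prior approach to avoid, which is as useful as knowing what worked.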

Description

This disclosure relates to enabling long-term memory solutions for large language models.

Background

Large language models (LLMs) have demonstrated powerful performance across a variety of tasks, leading to their increased deployment in large-scale systems. For example, LLMs are often used to provide supervision or as tools in decision-making processes. Large open-source datasets that can serve as training data make it possible to apply LLMs to generalized tasks. However, prior implementations of LLMs have not succeeded in specific enterprise environments where additional, domain-specific context is required. Furthermore, prior implementations often require white-box access to the models, involving analysis and subsequent tuning of weights, hidden states, or other internal parameters.

Brief Description of the Drawings

Figure 1 shows a system for training and utilizing machine learning models, such as large language models, according to several embodiments. Figure 2 shows computer-implemented methods for training and utilizing machine learning models, such as large language models, according to several embodiments. Figure 3 shows a service provider network configured to implement an LLM service and manage LLM long-term memory storage, according to several embodiments. Figure 4 is a flowchart of a process in which a task is performed for a user of the LLM service, followed by execution of a reflection protocol that allows the LLM service to store results in a long-term memory database and later recall them during execution of a future task, according to several embodiments. Figure 5 is a flowchart of a first subprocess of the reflection protocol of Figure 4, relating to domain knowledge categories, according to several embodiments. Figure 6 is a flowchart of a second subprocess of the reflection protocol of Figure 4, relating to user preferences, according to several embodiments. Figure 7 is a flowchart of a third subprocess of the reflection protocol of Figure 4, relating to successful task execution by the LLM agent, according to several embodiments. Figure 8 is a flowchart of a fourth subprocess of the reflection protocol of Figure 4, relating to LLM agent failures in task execution, according to several embodiments. Figure 9 shows an example user interface that allows users of the LLM service to chat with the service's LLM agent in order to perform tasks, according to several embodiments. Figure 10 shows another part of the user interface that can present data samples generated during the reflection protocol for viewing and editing by users of the LLM service, according to several embodiments. Figure 11 shows an example in which a user adds additional data samples to those generated during the reflection protocol via the user interface, according to several embodiments. Figure 12 shows another part of the user interface that allows users of the LLM service to browse and explore data samples already stored in the long-term memory database, according to several embodiments. Figure 13 shows examples of data samples that a user can view in long-term memory storage using the user interface, according to several embodiments. Figure 14 shows another example of a data sample that a user is viewing in long-term memory storage using the user interface, according to several embodiments.

Detailed Description

While embodiments of this disclosure are described herein, it should be understood that these embodiments are merely examples and that various alternative forms can be adopted as other embodiments. The figures are not necessarily drawn to scale, and some features are exaggerated or reduced to illustrate details of certain elements. Therefore, specific structural and functional details disclosed herein should not be construed as limitations, but only as a representative basis for teaching those skilled in the art various ways of practicing the embodiments.

As those skilled in the art will understand, various features illustrated and described with reference to any one of the figures can be combined with features shown in one or more other figures to form embodiments that are not expressly illustrated or described. The combinations of illustrated features provide representative embodiments for typical uses. Various combinations and modifications of features consistent with the teachings of this disclosure may be desirable for a particular use or implementation. In this specification, "a," "an," and "the" refer to both the singular and the plural unless the context clearly indicates otherwise. For example, "a processor programmed to perform various functions" refers to a single processor programmed to perform each function, or collectively to two or more processors programmed to perform each of the various functions. LLMs have demonstrated remarkable capabilities in keyword generation and reasoning, and these generation and reasonin