CN-121998081-A - Zero-shot time-series architecture design method based on a multi-agent large language model
Abstract
The invention belongs to the technical field of time-series forecasting and discloses a zero-shot time-series architecture design method based on a multi-agent large language model. The method exploits the knowledge-reasoning capability of a large language model to design the structure of a time-series forecasting model in a zero-shot manner, without any additional training. A multi-agent cooperation mechanism combining an operator agent and a topology agent improves design efficiency, while a literature agent and an optimization agent extract structural knowledge from the research literature to enhance the generalization and adaptation capability of the designed model. The invention achieves high-precision automatic design of time-series forecasting architectures while reducing the consumption of computing resources. The method has marked advantages in computational efficiency, generalizes better to unseen datasets, improves the quality of reasoning about complex structural designs, and enhances the robustness of the resulting architectures.
Inventors
- ZHANG XINYUAN
- YANG BIN
- GUO CHENJUAN
- WU XINGJIAN
Assignees
- 华东师范大学 (East China Normal University)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2025-12-25
Claims (7)
- 1. A zero-shot time-series architecture design method based on a multi-agent large language model, characterized by comprising the following steps: Step 1, using a task-aware dataset portrayal module to find the most similar dataset among the existing reference datasets for related time-series tasks, and characterizing the target dataset; Step 2, pruning the complete search space with a similarity-based search-space pruning module according to the characterization of the target dataset, to obtain a high-performance subspace; Step 3, searching for and optimizing the time-series model architecture with an evolution enhancement module inside the high-performance subspace, to obtain the final time-series model architecture.
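The three-step pipeline of claim 1 can be sketched as a toy, runnable Python function. Everything here is a hypothetical stand-in, not the patented modules: the "portrayal" is just a (mean, std) fingerprint, similarity is negative Euclidean distance, and the search space is a hand-written dictionary of operators with per-dataset suitability and a score.

```python
import math

def portray(series):
    """Toy dataset portrayal (Step 1): a (mean, std) statistical fingerprint."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return (mean, std)

def similarity(a, b):
    """Similarity of two fingerprints: negative Euclidean distance."""
    return -math.dist(a, b)

def design_architecture(target, references, search_space):
    # Step 1: characterize the target and find the most similar reference dataset.
    fp = portray(target)
    best_ref = max(references,
                   key=lambda name: similarity(fp, portray(references[name])))
    # Step 2: prune the full space to operators known to work on best_ref.
    subspace = [op for op in search_space
                if best_ref in search_space[op]["good_on"]]
    # Step 3: pick the highest-scoring candidate inside the pruned subspace.
    return max(subspace, key=lambda op: search_space[op]["score"])

refs = {"traffic": [1.0, 2.0, 3.0, 4.0], "power": [10.0, 20.0, 30.0, 40.0]}
space = {"gru": {"good_on": {"traffic"}, "score": 0.8},
         "tcn": {"good_on": {"traffic", "power"}, "score": 0.9},
         "gcn": {"good_on": {"power"}, "score": 0.95}}
print(design_architecture([2.0, 3.0, 4.0, 5.0], refs, space))  # -> tcn
```

Note how pruning matters in the toy run: `gcn` has the highest raw score, but it is excluded because it is not marked as effective on the most similar dataset.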
- 2. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 1, wherein the task-aware dataset portrayal module of step 1 is specified as follows: the target dataset is characterized by two kinds of information, a language description and structural attributes; the language description uses a short natural-language text to describe the domain, feature semantics and scale of the target time-series dataset; the structural attributes are statistical and signal features of the target time-series dataset, extracted automatically by the following feature-engineering procedure: 1) smooth and recombine the original target spatio-temporal series to remove local noise, obtaining a denoised series; 2) extract a high-dimensional statistical feature vector from the denoised series using a set of predefined operators; 3) on the high-dimensional feature vector, screen the features most relevant to the forecasting task by a significance test; 4) select the top K most relevant features, ordered by significance, as the structural-attribute characterization of the target time-series dataset.
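The four-step feature-engineering procedure of claim 2 can be illustrated with a minimal NumPy sketch. All choices here are toy stand-ins: moving-average smoothing for step 1, five simple statistical operators for step 2, and absolute Pearson correlation with the next value as a crude proxy for the claim's significance test in steps 3 and 4.

```python
import numpy as np

def denoise(series, w=3):
    """Step 1: moving-average smoothing to remove local noise."""
    kernel = np.ones(w) / w
    return np.convolve(series, kernel, mode="valid")

# Step 2: a toy set of predefined statistical operators.
OPERATORS = {"mean": np.mean, "std": np.std, "min": np.min,
             "max": np.max, "last": lambda x: x[-1]}

def window_features(series, win=8):
    """Apply every operator to sliding windows; the target is the next value."""
    X, y = [], []
    for i in range(len(series) - win):
        w = series[i:i + win]
        X.append([f(w) for f in OPERATORS.values()])
        y.append(series[i + win])
    return np.array(X), np.array(y)

def top_k_features(X, y, k=2):
    """Steps 3-4: rank features by |Pearson r| with the target, keep top k."""
    names = list(OPERATORS)
    scores = []
    for j in range(X.shape[1]):
        r = np.corrcoef(X[:, j], y)[0, 1]
        scores.append(0.0 if np.isnan(r) else abs(r))  # guard constant features
    order = np.argsort(scores)[::-1]
    return [names[j] for j in order[:k]]

series = np.sin(np.linspace(0, 4 * np.pi, 100))
X, y = window_features(denoise(series))
print(top_k_features(X, y, k=2))
```

On a smooth sine, the `last` operator dominates, since the next sample is almost perfectly correlated with the previous one; noisier data would shift the ranking toward aggregate statistics.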
- 3. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 1, wherein the similarity-based search-space pruning module prunes the complete search space to obtain a high-performance subspace. The complete search space comprises the target dataset and the subset of existing reference datasets for related time-series tasks that are similar to it, where each dataset is associated with effective model architectures decomposed into operators and topologies. An operator agent is designed to be responsible for screening a high-performance operator set, and a topology agent for screening high-quality topology structures. The two agents analyze the complete search space from three aspects, namely domain description, macroscopic architecture, and operator and topology candidates; for each dataset subset they decompose the existing model architectures into operators and topologies, and cluster and rank the operator and topology types by their average performance. The top-ranked operator sets and topology structures extracted from the dataset subset serve as candidate operators and candidate topologies; the operator agent and the topology agent evaluate the candidates for task suitability and generalization performance on the dataset subset, exchange candidate results, and select cooperatively and adaptively: the operator agent refines the operators according to feedback from the topology agent, the topology agent refines the topology structures according to feedback from the operator agent, and their final selection yields the pruned high-performance subspace.
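The ranking-and-exchange step of claim 3 can be approximated without any LLM, as a toy sketch: past architectures are decomposed into (operator set, topology, score) records, operators and topologies are ranked by average performance, and the "cooperative exchange" is reduced to taking the cross product of each agent's shortlist. The record values and cut-offs below are invented for illustration.

```python
from collections import defaultdict

# Each record: (operators used, topology, observed score) on a similar dataset.
RECORDS = [
    ({"gru", "gcn"}, "sequential", 0.91),
    ({"tcn", "gcn"}, "parallel",   0.88),
    ({"gru", "attn"}, "sequential", 0.84),
    ({"tcn", "attn"}, "parallel",   0.79),
]

def rank_by_mean(records, key):
    """Rank operator or topology types by their average observed score."""
    scores = defaultdict(list)
    for ops, topo, s in records:
        for item in (ops if key == "op" else [topo]):
            scores[item].append(s)
    return sorted(scores, key=lambda k: -sum(scores[k]) / len(scores[k]))

def prune(records, n_ops=2, n_topos=1):
    """Operator agent keeps the top operators, topology agent the top
    topologies; the pruned subspace is their cross product."""
    ops = rank_by_mean(records, "op")[:n_ops]
    topos = rank_by_mean(records, "topo")[:n_topos]
    return [(set(ops), t) for t in topos]

print(prune(RECORDS))  # -> [({'gcn', 'gru'}, 'sequential')]
```

In the patent, the shortlist exchange is mediated by LLM reasoning over task suitability rather than a fixed cross product; this sketch only shows the bookkeeping around average-performance ranking.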
- 4. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 3, wherein the operator agent is a functional agent based on a large language model that, with zero training, screens out the operator subset matching the current related time-series forecasting task through a constrained reasoning process, according to the statistical characteristics of the dataset and the predefined candidate operator set, thereby effectively pruning the original search space.
- 5. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 3, wherein the operator agent refines the operators according to the feedback of the topology agent, and the topology agent is a functional agent based on a large language model that, given the candidate operator subspace, generates or screens topology structures for the related time-series forecasting task through a constrained reasoning process according to the structural features of the dataset, thereby further reducing the search space.
- 6. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 1, wherein the evolution enhancement module optimizes the design of the time-series model within the high-performance subspace, realized in the following steps: 1) generating an initial architecture: a decision-making agent generates an initial spatio-temporal-block architecture within the high-performance subspace for the target task; starting from the definition of the complete search space, the decision-making agent takes the description of the high-performance subspace, the specific description of the target task, and the design suggestions from the operator agent and the topology agent, and comprehensively analyzes them to recommend candidate structures; 2) knowledge-enhanced optimization: an improvement agent and a literature agent are established and work cooperatively; the improvement agent analyzes the design of the initial spatio-temporal-block architecture and, after identifying potential defects, requests relevant knowledge from the literature agent; 3) iterative optimization and fusion: the decision-making agent receives the optimization suggestions and confidence scores output by the improvement agent and incorporates them into the generation of the next round's spatio-temporal-block architecture; the whole process iterates round by round in a generate-analyze-improve cycle, the spatio-temporal-block architecture evolving continuously with each round of feedback, until a high-performance, highly interpretable time-series model structure is formed.
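The generate-analyze-improve loop of claim 6 can be sketched as a generic iteration with a confidence-gated stopping rule. The `analyze` and `improve` callables below are trivial stand-ins for the improvement agent and decision-making agent; the defect label, threshold, and `layers` field are all hypothetical.

```python
def evolve(initial, analyze, improve, rounds=5, conf_threshold=0.5):
    """Toy generate-analyze-improve loop (claim 6, step 3).
    `analyze(arch)` returns (defect, confidence); `improve(arch, defect)`
    returns a revised architecture. Stops when no credible defect remains."""
    arch = initial
    for _ in range(rounds):
        defect, conf = analyze(arch)
        if defect is None or conf < conf_threshold:
            break  # no defect found with sufficient confidence: converged
        arch = improve(arch, defect)
    return arch

def analyze(arch):
    """Stand-in improvement agent: flag shallow architectures."""
    if arch["layers"] < 3:
        return "too_shallow", 0.9
    return None, 0.0

def improve(arch, defect):
    """Stand-in decision-making agent: apply the suggested fix."""
    return {**arch, "layers": arch["layers"] + 1}

print(evolve({"layers": 1}, analyze, improve))  # -> {'layers': 3}
```

The confidence gate mirrors the claim's use of confidence scores: low-confidence suggestions are dropped instead of being folded into the next round.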
- 7. The zero-shot time-series architecture design method based on a multi-agent large language model according to claim 6, wherein a vectorized literature knowledge base is built by extracting text from time-series research papers: automatically parsing the PDF files, removing redundant information, segmenting the text into fragments according to semantic continuity, and embedding the fragments into a vector space to obtain the vectorized literature knowledge base.
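The chunk-embed-retrieve pattern of claim 7 can be shown with a self-contained toy: paragraph splitting stands in for semantic segmentation, a bag-of-words `Counter` stands in for a real sentence-embedding model, and cosine similarity ranks fragments. The example corpus is invented.

```python
import math
from collections import Counter

def chunk(text):
    """Toy semantic segmentation: split on blank lines (paragraphs)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def embed(fragment):
    """Toy embedding: lower-cased bag-of-words counts (stand-in for a
    real sentence-embedding model)."""
    return Counter(fragment.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, kb):
    """Return the stored fragment most similar to the query."""
    return max(kb, key=lambda frag: cosine(embed(query), embed(frag)))

papers = ("Graph convolution captures spatial correlation between sensor "
          "series.\n\nTemporal convolution models dependence within a "
          "single series.")
kb = chunk(papers)
print(retrieve("spatial correlation graph", kb))
```

A production version would replace `embed` with a learned embedding model and `kb` with an approximate-nearest-neighbor index, but the retrieval contract is the same.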
Description
Zero-shot time-series architecture design method based on a multi-agent large language model

Technical Field

The invention relates to the technical field of time-series forecasting, in particular to a zero-shot time-series architecture design method based on a multi-agent large language model.

Background

With the wide application of sensor and data-acquisition technology, large numbers of sensors are deployed in social infrastructure such as transportation systems, electric power networks and medical monitoring platforms to record various kinds of time-varying data. The multi-dimensional time-series data thus formed generally carry inherent correlations and are referred to as correlated time series. Analyzing and modeling historical correlated time series to predict future values accurately has important application value in practical scenarios such as traffic-flow forecasting, power-grid load scheduling and patient health monitoring. In recent years, deep learning approaches have shown significant advantages in correlated time-series forecasting tasks. The core of such models is the design of a spatio-temporal block (ST-block) consisting of spatial and temporal operators, able to capture both the spatial correlation between sequences and the temporal dependence within a single sequence. However, existing correlated time series (CTS) model architectures rely mainly on manual design, which generally requires researchers with abundant domain knowledge and design experience; the modeling process is time-consuming and costly, making it ill-suited to dynamically changing data environments and task requirements. To overcome the limitations of manual design, automated modeling methods have appeared in recent years; their core idea is to automatically find high-performance ST-block structures in a predefined search space.
The search space is typically composed of spatial/temporal operators extracted from existing models and their typical topological connection patterns. Through search strategies such as reinforcement learning, evolutionary algorithms or gradient-based optimization, the optimal network structure can be found automatically in this space and embedded into the complete model for training and prediction. Such methods realize automated structural design to a certain extent and outperform traditional manual methods. However, existing automated modeling methods still have two main problems. (1) High computation and time cost. Existing methods generally require extensive model training and evaluation on the target dataset to explore the optimal structure in the search space. Because the search space is huge and the optimization objective is complex, the whole process consumes large amounts of computing resources and time; in CTS forecasting tasks in particular, model training is computationally intensive and time-consuming, which makes the search expensive and hard to apply in resource-limited settings. (2) Insufficient generalization. Meta-learning approaches that have emerged in recent years attempt to achieve zero-shot structure search by learning a transferable search strategy over multiple known datasets. However, such methods do not adequately model the structural features of new tasks or unseen datasets, so the models they generate generalize poorly across heterogeneous tasks. When the task distribution or data characteristics change, prediction performance tends to drop significantly, limiting their applicability and stability in complex real-world scenarios.
In summary, existing automated design of correlated time-series forecasting models still faces technical bottlenecks such as low efficiency, large resource overhead and insufficient adaptability in cross-task generalization, and an efficient, low-cost, automated modeling scheme with good generalization capability is needed.

Disclosure of Invention

The invention exploits the rich prior knowledge, strong reasoning capability and good extensibility of a large language model (LLM), decomposes the complex ST-block design task through a multi-agent cooperation mechanism, and combines two main modules, search-space pruning and knowledge-enhanced optimization, to make structural design automated, intelligent and efficient. For correlated time-series data recorded by multiple sensors, the method uses the knowledge-reasoning capability of a large language model to design the structure of a time-series forecasting model in a zero-shot manner, without any additional model training. The method comprises the steps of realizing search space pruning by combining an operat