CN-121979996-A - Cross-domain problem processing method, device, equipment and medium based on large language model

CN121979996A

Abstract

The application discloses a method, a device, equipment and a medium for processing cross-domain questions based on a large language model. The method comprises: receiving a target question in a target domain; performing similarity matching between the target question and source questions of at least one source domain in an example pool, wherein the example pool stores source questions classified by industry or technical field, their corresponding chains of thought, and the source data-source fields on which the reasoning of those chains of thought depends; obtaining or generating a chain of thought associated with the target question based on the matching result; identifying the source data-source fields on which the obtained or generated chain of thought depends, and determining corresponding target data-source fields for the target question based on semantic similarity; constructing an enhancement instruction based on the obtained or generated chain of thought, the source data-source fields and the target data-source fields; and inputting the enhancement instruction into a large language model to generate an answer to the target question. The application enables a large language model to generate logically consistent answers with reliable data support.

Inventors

  • YU YONG
  • CHEN SHUDONG
  • CHEN XIAOLIN
  • YE LIANG
  • FU SHENGQIANG
  • CHEN XIAOXIAO

Assignees

  • Institute of Microelectronics, Chinese Academy of Sciences (中国科学院微电子研究所)

Dates

Publication Date
2026-05-05
Application Date
2026-02-06

Claims (10)

  1. A method for processing cross-domain questions based on a large language model, the method comprising: receiving a target question in a target domain; performing similarity matching between the target question and source questions of at least one source domain in an example pool, wherein the example pool stores source questions classified by industry or technical field, corresponding chains of thought, and the source data-source fields on which the reasoning of the chains of thought depends; obtaining or generating a chain of thought associated with the target question based on the matching result; identifying the source data-source fields on which the obtained or generated chain of thought depends, and determining corresponding target data-source fields for the target question based on semantic similarity; constructing an enhancement instruction based on the obtained or generated chain of thought, the source data-source fields and the target data-source fields; and inputting the enhancement instruction into a large language model to generate an answer to the target question.
  2. The method of claim 1, wherein obtaining or generating a chain of thought associated with the target question based on the matching result comprises: when the target question is successfully matched with a source question in the example pool, generating a chain of thought adapted to the target question based on the chain of thought of the matched source question; and when the matching fails, extracting domain features of the target domain from a feature pool and generating a chain of thought adapted to the target question based on those domain features, wherein the feature pool stores domain feature knowledge in the form of a keyword list and entity-relation triples.
  3. The method of claim 1, wherein the similarity matching of the target question with source questions of at least one source domain in the example pool comprises: identifying domain keywords related to the target domain from the target question and the source questions; during encoding, assigning the vector representations of the identified domain keywords a higher weight than ordinary text, and performing weighted fusion of the keyword vector representations with the vector representation of the whole question text to obtain a final vector; calculating the cosine similarity between the final vector of the target question and the final vector of each source question; comparing the calculated cosine similarity with a preset threshold; and if a source question whose cosine similarity exceeds the preset threshold exists, judging the match successful and selecting the source question with the highest similarity as the matching result, otherwise judging the match failed.
  4. The method of claim 1, wherein determining corresponding target data-source fields for the target question based on semantic similarity comprises: parsing the chains of thought corresponding to the source questions serving as examples in the example pool, and identifying the source data-source fields mentioned or logically implied therein; calculating the semantic similarity between the names of the identified source data-source fields and candidate field names predefined for the target domain; and checking the data types of the source data-source fields and the candidate fields, and screening out target data-source fields whose semantics and types both match.
  5. The method of claim 1, wherein the enhancement instruction comprises: the obtained or generated chain of thought associated with the target question as a reasoning example, the target question as the task to be solved, and the target data-source fields as a specified analysis framework, combined according to a preset template; wherein the analysis framework guides the large language model to focus, during reasoning, on the data categories defined by the target data-source fields.
  6. The method of claim 2, wherein after obtaining or generating the chain of thought associated with the target question based on the matching result, the method further comprises: storing the newly generated target question and its associated chain of thought as a data pair in a cache pool; and transferring the data pair from the cache pool to the example pool when the frequency of its successful calls exceeds a preset threshold and it is confirmed, through quality evaluation, to be a high-quality example.
  7. The method of claim 1, further comprising, prior to constructing the enhancement instruction: performing quality assessment on the obtained or generated chain of thought associated with the target question; wherein the enhancement instruction is constructed based on the chain of thought, the source data-source fields and the target data-source fields only if the chain of thought passes the quality assessment.
  8. A device for processing cross-domain questions based on a large language model, the device comprising: a receiving module for receiving a target question in a target domain; a matching module for performing similarity matching between the target question and source questions of at least one source domain in an example pool, wherein the example pool stores source questions classified by industry or technical field, corresponding chains of thought, and the source data-source fields on which the chain-of-thought reasoning depends; a chain-of-thought processing module for obtaining or generating a chain of thought associated with the target question based on the matching result; a field mapping module for identifying the source data-source fields on which the obtained or generated chain of thought depends and determining corresponding target data-source fields for the target question based on semantic similarity; an instruction construction module for constructing an enhancement instruction based on the obtained or generated chain of thought, the source data-source fields and the target data-source fields; and an answer generation module for inputting the enhancement instruction into a large language model to generate an answer to the target question.
  9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor runs the computer program to implement the method for processing cross-domain questions based on a large language model according to any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for processing cross-domain questions based on a large language model according to any one of claims 1 to 7.
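The obtain-or-generate step of claim 2 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function signature, the shapes of `match` and `domain_features`, and the prompt wording are all hypothetical, and `llm` stands in for any large-language-model call.

```python
# Illustrative sketch of claim 2: reuse a matched example's chain of thought,
# or fall back to domain features (keywords + entity-relation triples) from
# the feature pool when matching fails. All names are hypothetical.

def obtain_chain_of_thought(target_question, match, domain_features, llm):
    """Return a chain of thought adapted to the target question."""
    if match is not None:
        # Match succeeded: adapt the matched source question's chain of thought.
        prompt = (
            f"Example question: {match['question']}\n"
            f"Example reasoning: {match['chain_of_thought']}\n"
            f"Adapt this reasoning, step by step, to: {target_question}"
        )
    else:
        # Match failed: build reasoning guidance from the feature pool, which
        # stores a keyword list and entity-relation triples for the domain.
        keywords = ", ".join(domain_features.get("keywords", []))
        triples = "; ".join(
            "({}, {}, {})".format(*t) for t in domain_features.get("triples", [])
        )
        prompt = (
            f"Domain keywords: {keywords}\n"
            f"Entity relations: {triples}\n"
            f"Derive step-by-step reasoning for: {target_question}"
        )
    return llm(prompt)
```

Either branch ends in a single model call; only the prompt's grounding material differs.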
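The weighted-fusion matching of claim 3 can be sketched as below, assuming a generic `encode` function that maps text to a NumPy vector (for instance a sentence-embedding model). The 0.6 keyword weight and 0.8 threshold are illustrative assumptions; the patent fixes neither value.

```python
# Sketch of claim 3: give domain-keyword vectors extra weight, fuse them with
# the whole-text vector, then match by cosine similarity against a threshold.
import numpy as np

def fuse_vector(text, keywords, encode, kw_weight=0.6):
    """Weighted fusion of keyword vectors with the whole-text vector."""
    text_vec = encode(text)
    present = [k for k in keywords if k in text]
    if not present:
        return text_vec
    kw_vec = np.mean([encode(k) for k in present], axis=0)
    # Keywords receive a higher weight than the ordinary text representation.
    return kw_weight * kw_vec + (1 - kw_weight) * text_vec

def match_source_question(target, sources, keywords, encode, threshold=0.8):
    """Return the most similar source question above the threshold, else None."""
    tv = fuse_vector(target, keywords, encode)
    best, best_sim = None, threshold
    for src in sources:
        sv = fuse_vector(src, keywords, encode)
        sim = float(np.dot(tv, sv) / (np.linalg.norm(tv) * np.linalg.norm(sv)))
        if sim > best_sim:
            best, best_sim = src, sim
    return best  # None signals match failure (claim 2's fallback branch)
```

A `None` return corresponds to the "matching failed" branch that triggers the feature-pool fallback of claim 2.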
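The field mapping of claim 4 pairs name-level semantic similarity with a data-type check. A minimal sketch, in which a token-overlap score stands in for a real semantic-similarity model and the 0.3 cutoff is an illustrative assumption:

```python
# Sketch of claim 4: map each source data-source field to a candidate field of
# the target domain only when both the name semantics and the data type match.

def name_similarity(a, b):
    """Token-overlap stand-in for a semantic similarity model (Jaccard)."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)

def map_fields(source_fields, candidate_fields, min_sim=0.3):
    """source_fields / candidate_fields: dicts of {field_name: data_type}."""
    mapping = {}
    for s_name, s_type in source_fields.items():
        scored = [
            (name_similarity(s_name, c_name), c_name)
            for c_name, c_type in candidate_fields.items()
            if c_type == s_type  # type check: screen out type mismatches early
        ]
        scored = [pair for pair in scored if pair[0] >= min_sim]
        if scored:
            mapping[s_name] = max(scored)[1]  # keep the closest name match
    return mapping
```

Fields that pass neither the type check nor the similarity cutoff are simply left unmapped, mirroring the claim's "screening out" of non-matching candidates.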
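Claim 5 only specifies the three components of the enhancement instruction and that they are combined by a preset template; the template text below is therefore an assumption, shown to make the structure concrete.

```python
# Sketch of claim 5: combine the chain of thought (reasoning example), the
# target data-source fields (analysis framework) and the target question
# (task) according to a preset template. The wording is illustrative.

PROMPT_TEMPLATE = """\
Reasoning example:
{chain_of_thought}

Analysis framework (restrict reasoning to these data fields):
{target_fields}

Task:
{target_question}
"""

def build_enhancement_instruction(chain_of_thought, target_fields, target_question):
    return PROMPT_TEMPLATE.format(
        chain_of_thought=chain_of_thought,
        target_fields=", ".join(target_fields),
        target_question=target_question,
    )
```

The "analysis framework" block is what steers the model toward the data categories defined by the target data-source fields during reasoning.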
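The cache-pool promotion of claim 6 can be sketched as a small class. The hit-count bookkeeping and the `quality_check` callable are hypothetical: the patent states only that promotion requires the successful-call frequency to exceed a preset threshold and a quality evaluation to pass.

```python
# Sketch of claim 6: newly generated (question, chain-of-thought) pairs wait
# in a cache pool and are promoted to the example pool once their successful
# call count exceeds a threshold and a quality check confirms them.

class CachePool:
    def __init__(self, promote_threshold=3):
        self.entries = {}  # question -> {"cot": chain_of_thought, "hits": int}
        self.promote_threshold = promote_threshold

    def add(self, question, chain_of_thought):
        self.entries[question] = {"cot": chain_of_thought, "hits": 0}

    def record_success(self, question, example_pool, quality_check):
        """Count a successful call; promote the pair when it qualifies."""
        entry = self.entries.get(question)
        if entry is None:
            return
        entry["hits"] += 1
        if entry["hits"] > self.promote_threshold and quality_check(entry["cot"]):
            example_pool[question] = entry["cot"]  # promote to example pool
            del self.entries[question]             # remove from cache pool
```

This gives the example pool a self-expanding property: examples accumulate only after proving useful and passing quality evaluation.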

Description

Cross-domain problem processing method, device, equipment and medium based on large language model

Technical Field

The application relates to the technical fields of artificial intelligence and natural language processing, and in particular to a method, a device, equipment and a medium for processing cross-domain questions based on a large language model.

Background

With the remarkable progress of large language models on natural language processing tasks, using them to solve complex cross-domain reasoning problems has become an important research direction. Chain-of-thought techniques effectively improve the performance of a large language model (Large Language Model, LLM) on tasks such as mathematics and commonsense reasoning by simulating a multi-step reasoning process. However, prior-art solutions face inherent limitations in cross-domain applications. Existing schemes such as transfer learning, Sentence-BERT and knowledge-graph alignment cannot achieve collaborative generalization of the chain of thought and the data source: transfer-learning methods focus on adjusting model parameters, so complex reasoning logic is difficult to transfer directly; techniques such as Sentence-BERT can compute semantic similarity but cannot guarantee the consistency and correctness of the reasoning steps; and knowledge-graph alignment focuses on entity-level mapping but lacks awareness of the overall inference logic. The core problem with these approaches is that they either decouple the migration of the chain of thought from the adaptation of the data source, or ignore one of them altogether.
As a result, when a large language model faces a question in a new domain, it either obtains no effective reasoning guidance, or its reasoning logic lacks the support of corresponding domain data, so accurate and reliable answers are difficult to generate. The prior art therefore cannot meet the demands that complex cross-domain questions place on the collaborative adaptation of reasoning logic and data bases.

Disclosure of Invention

To solve the above technical problems, embodiments of the application provide a method, a device, equipment and a medium for processing cross-domain questions based on a large language model. An embodiment of the application provides a method for processing cross-domain questions based on a large language model, the method comprising: receiving a target question in a target domain; performing similarity matching between the target question and source questions of at least one source domain in an example pool, wherein the example pool stores source questions classified by industry or technical field, corresponding chains of thought, and the source data-source fields on which the reasoning of the chains of thought depends; obtaining or generating a chain of thought associated with the target question based on the matching result; identifying the source data-source fields on which the obtained or generated chain of thought depends, and determining corresponding target data-source fields for the target question based on semantic similarity; constructing an enhancement instruction based on the obtained or generated chain of thought, the source data-source fields and the target data-source fields; and inputting the enhancement instruction into a large language model to generate an answer to the target question.
In some embodiments of the application, obtaining or generating a chain of thought associated with the target question based on the matching result includes: when the target question is successfully matched with a source question in the example pool, generating a chain of thought adapted to the target question based on the chain of thought of the matched source question; and when the matching fails, extracting domain features of the target domain from a feature pool and generating a chain of thought adapted to the target question based on those domain features, wherein the feature pool stores domain feature knowledge in the form of a keyword list and entity-relation triples. In some embodiments of the application, matching the target question with source questions of at least one source domain in the example pool includes: identifying domain keywords related to the target domain from the target question and the source questions; during encoding, assigning the vector representations of the identified domain keywords a higher weight than ordinary text, and performing weighted fusion of the keyword vector representations with the vector representation of the whole question text to obtain a final vector; calculating the cosine similarity between the final vector of the target question and the final vector of each source question; and comparing the calculated cosine similarity with a preset threshold