
CN-121998070-A - Method, device and system for processing reasoning problem

CN121998070A

Abstract

The application provides a method, a device, and a system for processing reasoning problems. After a question to be inferred is obtained, the result of the question is determined based on partial intermediate results provided by an intermediate template and on an inference model, which reduces the time spent computing intermediate results during inference and thereby accelerates inference of the question.

Inventors

  • Jin Yibo
  • Huang Jiaqi
  • Yuan Zhengfan
  • Yuan Jianhua
  • Cao Yi

Assignees

  • Huawei Technologies Co., Ltd.

Dates

Publication Date
2026-05-08
Application Date
2024-11-08

Claims (20)

  1. A method for processing an inference problem, applied to a terminal, the method comprising: acquiring a question to be inferred; and determining the result of the question to be inferred based on an intermediate template and an inference model, wherein the intermediate template is used to provide partial intermediate results in the process of the inference model inferring the question to be inferred, the partial intermediate results are obtained by inference on test questions, and the test questions correspond to the same text template as the question to be inferred.
  2. The method of claim 1, wherein the intermediate template is further used to provide position information of intermediate results to be calculated, wherein the intermediate results to be calculated are intermediate results not provided by the intermediate template in the process of the inference model inferring the question to be inferred based on the intermediate template.
  3. The method of claim 2, wherein the intermediate template is further configured to provide a category feature of at least one result of the text template, the category feature being used to indicate a correspondence between the intermediate result to be calculated and the at least one result of the text template.
  4. The method of claim 1, wherein the inference model comprises a plurality of processing layers for determining intermediate results not provided by the intermediate template in the process of the inference model inferring the question to be inferred, and wherein the processing layers are further configured to determine whether the result of the question to be inferred can be determined based on those intermediate results.
  5. The method of claim 1, wherein before the acquiring of the question to be inferred, the method further comprises: receiving a plurality of intermediate templates sent by a cloud; and before the result of the question to be inferred is determined based on the intermediate template and the inference model, the method further comprises: determining whether an intermediate template corresponding to the question to be inferred exists among the plurality of intermediate templates.
  6. The method of claim 5, wherein the determining whether an intermediate template corresponding to the question to be inferred exists among the plurality of intermediate templates comprises: determining the text template; and determining, according to the text template and a first mapping relation, whether an intermediate template matching the question to be inferred exists, wherein the first mapping relation comprises the mapping relation between the text template and the intermediate template.
  7. The method of any one of claims 1 to 6, wherein the inference model is a large language model and the processing layer is a self-attention transformer layer.
  8. A method for processing an inference problem, applied to a cloud, the method comprising: obtaining, based on a plurality of test questions, an intermediate result of each of the plurality of test questions, wherein the plurality of test questions correspond to the same text template; determining an intermediate template corresponding to the text template based on the intermediate results of the plurality of test questions; and sending the intermediate template to a terminal.
  9. The method of claim 8, wherein the determining the intermediate template corresponding to the text template based on the intermediate results of the plurality of test questions comprises: determining, based on the intermediate results of the plurality of test questions, a first intermediate result of the intermediate template and position information of a second intermediate result of the intermediate template, wherein the first intermediate result is an intermediate result that does not need to be calculated in the process of an inference model inferring a question to be inferred based on the intermediate template, the second intermediate result is an intermediate result that does need to be calculated in that process, and the question to be inferred corresponds to the same text template as the test questions; and determining, based on the second intermediate result, a correspondence between the second intermediate result and at least one result of the text template.
  10. An apparatus for processing inference problems, applied to a terminal, the apparatus comprising: an acquisition module, configured to acquire a question to be inferred; and a first determining module, configured to determine the result of the question to be inferred based on an intermediate template and an inference model, wherein the intermediate template is used to provide partial intermediate results in the process of the inference model inferring the question to be inferred, the partial intermediate results are obtained by inference on test questions, and the test questions correspond to the same text template as the question to be inferred.
  11. The apparatus of claim 10, wherein the intermediate template is further configured to provide position information of an intermediate result to be calculated, wherein the intermediate result to be calculated is an intermediate result not provided by the intermediate template in the process of the inference model inferring the question to be inferred based on the intermediate template.
  12. The apparatus of claim 11, wherein the intermediate template is further configured to provide a category feature of at least one result of the text template, the category feature being used to indicate a correspondence between the intermediate result to be calculated and the at least one result of the text template.
  13. The apparatus of claim 10, wherein the inference model comprises a plurality of processing layers for determining intermediate results not provided by the intermediate template in the process of the inference model inferring the question to be inferred, and wherein the processing layers are further configured to determine whether the result of the question to be inferred can be determined based on those intermediate results.
  14. The apparatus of claim 10, further comprising: a receiving module, configured to receive a plurality of intermediate templates sent by a cloud; and a second determining module, configured to determine whether an intermediate template corresponding to the question to be inferred exists among the plurality of intermediate templates.
  15. The apparatus of claim 14, wherein the second determining module is specifically configured to: determine the text template; and determine, according to the text template and a first mapping relation, whether an intermediate template matching the question to be inferred exists, wherein the first mapping relation comprises the mapping relation between the text template and the intermediate template.
  16. The apparatus of any one of claims 10 to 15, wherein the inference model is a large language model and the processing layer is a self-attention transformer layer.
  17. An apparatus for processing inference problems, applied to a cloud, the apparatus comprising: an acquisition module, configured to obtain, based on a plurality of test questions, an intermediate result of each of the plurality of test questions, wherein the plurality of test questions correspond to the same text template; a determining module, configured to determine an intermediate template corresponding to the text template based on the intermediate results of the plurality of test questions; and a sending module, configured to send the intermediate template to a terminal.
  18. The apparatus of claim 17, wherein the determining module is specifically configured to: determine, based on the intermediate results of the plurality of test questions, a first intermediate result of the intermediate template and position information of a second intermediate result of the intermediate template, wherein the first intermediate result is an intermediate result that does not need to be calculated in the process of an inference model inferring a question to be inferred based on the intermediate template, the second intermediate result is an intermediate result that does need to be calculated in that process, and the question to be inferred corresponds to the same text template as the test questions; and determine, based on the second intermediate result, a correspondence between the second intermediate result and at least one result of the text template.
  19. A system for processing inference problems, comprising a terminal and a cloud, wherein the terminal comprises the apparatus for processing inference problems according to any one of claims 10 to 16, and the cloud comprises the apparatus for processing inference problems according to claim 17 or 18.
  20. A device cluster, comprising at least one device, each device comprising a processor and a memory, wherein the processor of the at least one device is configured to execute instructions stored in the memory of the at least one device, so that the device cluster performs the method according to any one of claims 1 to 9.
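The terminal-side lookup described in claims 5 and 6 — receiving intermediate templates from the cloud and matching an incoming question to one of them via a first mapping relation from text templates to intermediate templates — can be sketched as follows. This is an illustrative reading only: the class, method names, and the digit-masking normalization are assumptions, not anything the patent specifies.

```python
import re

class TemplateStore:
    """Hypothetical terminal-side store of cloud-provided intermediate templates."""

    def __init__(self):
        self.mapping = {}    # first mapping relation: text template -> template id
        self.templates = {}  # template id -> intermediate-template payload

    def receive_from_cloud(self, text_template, template_id, payload):
        # Claim 5: the terminal receives intermediate templates sent by a cloud.
        self.mapping[text_template] = template_id
        self.templates[template_id] = payload

    def extract_text_template(self, question):
        # Toy normalization (an assumption): mask concrete values so questions
        # that differ only in those values share one text template.
        return re.sub(r"\d+", "<N>", question)

    def lookup(self, question):
        # Claim 6: determine the text template, then check the first mapping
        # relation for a matching intermediate template; None means no match.
        template_id = self.mapping.get(self.extract_text_template(question))
        return self.templates.get(template_id)

store = TemplateStore()
store.receive_from_cloud("What is <N> plus <N>?", "t1", {"cached": "..."})
match = store.lookup("What is 2 plus 3?")        # matching template found
miss = store.lookup("Name the capital of France.")  # no template -> None
```

When `lookup` returns `None`, the terminal would fall back to ordinary inference without a template, consistent with claim 5's existence check preceding the template-based determination.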

Description

Method, device and system for processing reasoning problem

Technical Field

The embodiments of this application relate to the field of cloud technologies, and in particular to a method, a device, and a system for processing reasoning problems.

Background

A large language model (Large Language Model, LLM) is a core component of intelligent question-answering systems and intelligent assistant systems: a user or a system program inputs questions to the LLM, and the LLM generates and outputs the corresponding answers as text. Typically, LLMs are deployed in the cloud so that questions can be answered quickly and handled efficiently. This deployment leverages the advantages of cloud computing platforms, such as powerful computing capacity and high availability. However, with this deployment the question must be transmitted to the cloud for processing, and during both transmission and cloud-side processing the user's private data may be at risk of disclosure.

Disclosure of Invention

This application provides a method, a device, and a system for processing an inference problem, which are used to determine, after a question to be inferred is acquired, the result of the question based on partial intermediate results provided by an intermediate template and on an inference model, thereby reducing the time spent computing intermediate results during inference and accelerating inference of the question.

In a first aspect, this application provides a method for processing an inference problem, applied to a terminal. The terminal acquires a question to be inferred. The terminal then determines the result of the question to be inferred based on an intermediate template and an inference model.
The intermediate template is used to provide partial intermediate results in the process of the inference model inferring the question to be inferred; the partial intermediate results are obtained by inference on test questions, and the test questions correspond to the same text template as the question to be inferred. Because this implementation is executed by the terminal, information containing user privacy does not need to be uploaded to the cloud, which protects the user's privacy. In addition, in embodiments of this application, the result of the question to be inferred can be determined from the intermediate template and the inference model, and the partial intermediate results provided by the intermediate template can be reused during inference, which effectively reduces the time spent computing intermediate results and improves the efficiency of processing the inference problem.

In one possible implementation, the intermediate template is further used to provide position information of intermediate results to be calculated, where the intermediate results to be calculated are intermediate results not provided by the intermediate template in the process of the inference model inferring the question to be inferred based on the intermediate template. Because the intermediate template provides the position information of the results to be calculated, those positions do not need to be derived at inference time, which further reduces the time spent locating the results to be calculated and speeds up processing of the inference problem.
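The reuse described above — skipping the computation of intermediate results the template already provides, and computing only those at the positions the template marks as "to be calculated" — can be sketched as a minimal loop. This is a simplified illustration under the assumption that intermediate results are per-position values and the template is a position-indexed map; the function and variable names are hypothetical, not from the patent.

```python
def infer_with_template(tokens, template, compute_intermediate):
    """template: dict mapping token position -> precomputed intermediate result.
    compute_intermediate: fallback that computes the result for one position.
    Returns all intermediate results plus the positions actually computed."""
    results = []
    computed_positions = []  # the "intermediate results to be calculated"
    for pos, tok in enumerate(tokens):
        if pos in template:
            results.append(template[pos])  # reuse: no computation needed
        else:
            results.append(compute_intermediate(pos, tok))
            computed_positions.append(pos)
    return results, computed_positions

# Positions 0, 1 and 3 are covered by the template; only position 2 is
# freshly computed, which is where the claimed time saving comes from.
template = {0: "h0", 1: "h1", 3: "h3"}
results, to_calc = infer_with_template(
    ["a", "b", "c", "d"], template,
    compute_intermediate=lambda pos, tok: f"h({tok})")
```

In a real LLM setting the cached values would more plausibly be attention-layer activations (for example key/value states) rather than strings, but the control flow — look up first, compute only on a miss — is the same.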
In one possible implementation, the intermediate template is further configured to provide a category feature of at least one result of the text template, the category feature being used to indicate a correspondence between the intermediate result to be calculated and the at least one result of the text template. In this implementation, based on that correspondence, it can be judged whether the result of the question to be inferred can be obtained directly from the computed intermediate result, which further speeds up processing of the question to be inferred.

In one possible implementation, the inference model includes a plurality of processing layers, where the processing layers are configured to determine the intermediate results not provided by the intermediate template in the process of the inference model inferring the question to be inferred, and the processing layers are further configured to determine whether the result of the question to be inferred can be determined from those intermediate results. In the imple