
CN-122021662-A - Intention recognition method and device

CN122021662A

Abstract

In application scenarios involving intent recognition, recognition can proceed along two paths: a fast-search direct path based on corpus data matching, and a Few-Shot enhanced reasoning path in which a large language model builds on the search results of the fast-search path. For a user's input information, the matching degree between the input information and each piece of candidate information in the corpus is detected, and path selection is performed against a preset first threshold. When the maximum matching degree is greater than or equal to the first threshold, the fast-search direct path is selected to determine the target intent. When the maximum matching degree is smaller than the first threshold, reference intents are selected as Few-Shot examples from the candidate information in descending order of matching degree, and the large language model predicts the target intent from the input information and a task description.
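The dual-path routing described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the character-level matcher, the threshold value, and the prompt format are all placeholder assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Candidate:
    text: str     # corpus utterance
    intent: str   # the single candidate intent this utterance maps to


def match_degree(a: str, b: str) -> float:
    # Placeholder matcher; the patent combines text and semantic similarity.
    return SequenceMatcher(None, a, b).ratio()


def recognize_intent(user_input, corpus, llm, threshold=0.9, few_shot_k=3):
    """Dual-path intent recognition: fast-search path vs. Few-Shot LLM path."""
    scored = sorted(
        ((match_degree(user_input, c.text), c) for c in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    best_score, best_cand = scored[0]
    if best_score >= threshold:
        # Fast-search direct path: the corpus match decides the intent.
        return best_cand.intent
    # Few-Shot enhanced path: top-k intents become in-prompt references.
    references = [c.intent for _, c in scored[:few_shot_k]]
    prompt = (
        "Task: pick the user's intent from the reference intents.\n"
        f"References: {references}\nInput: {user_input}"
    )
    return llm(prompt)
```

An exact corpus hit short-circuits the LLM entirely, which is the point of the fast-search path: the model is only consulted for inputs the corpus cannot resolve confidently.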

Inventors

  • Ao Wenqi

Assignees

  • 支付宝(杭州)数字服务技术有限公司 (Alipay (Hangzhou) Digital Service Technology Co., Ltd.)

Dates

Publication Date
2026-05-12
Application Date
2026-02-12

Claims (14)

  1. A method of intent recognition, the method comprising: acquiring first input information; detecting a matching degree between the first input information and each piece of candidate information in a corpus, wherein each piece of candidate information corresponds to a single candidate intent; in a case where the maximum of the matching degrees is smaller than a first threshold, selecting a first number of pieces of candidate information in descending order of matching degree and taking their corresponding candidate intents as reference intents; and taking the reference intents and the first input information as prompt information for a large language model, and determining a first target intent of the first input information according to an output of the large language model.
  2. The method of claim 1, wherein, in a case where the maximum is greater than or equal to the first threshold, the piece of candidate information corresponding to the maximum is selected, and its corresponding candidate intent is taken as the target intent.
  3. The method of claim 1, wherein the pieces of candidate information comprise first candidate information, and a first matching degree between the first candidate information and the first input information is determined by: determining a text similarity and a semantic vector similarity between the first candidate information and the first input information, respectively; and obtaining the first matching degree as a weighted combination of the text similarity and the semantic vector similarity.
  4. The method of claim 1, wherein determining the first target intent of the first input information according to the output of the large language model comprises: in a case where the large language model outputs one of the reference intents, taking the reference intent output by the large language model as the first target intent; and in a case where the large language model outputs an intent other than the reference intents, or outputs an indication that the target intent cannot be determined from the reference intents, determining the first target intent manually or by a stronger model.
  5. The method of claim 4, wherein, in a case where the first target intent is determined manually or by a stronger model, the method further comprises: adding the first target intent to the candidate intents, and adding at least one of the first input information and expanded information, obtained by expanding the first input information with a large language model, to the candidate information corresponding to the first target intent.
  6. The method of claim 1, wherein the large language model is a pre-trained natural language model, the method further comprising: adding the first input information and the first target intent to the corpus for use in constructing training samples for further training of the large language model during online use, the further training comprising at least one of supervised fine-tuning and reinforcement learning performed at predetermined cycles.
  7. The method of claim 6, wherein a single training sample comprises prompt information, determined based on input information, task description information and several candidate intents, and a target intent as supervision information.
  8. The method of claim 6, wherein the training samples for the further training comprise samples of a first type and samples of a second type, determined by: acquiring a first intent and first candidate information corresponding to the first intent; acquiring, from the pieces of candidate information in the corpus, the top m pieces ranked in descending order of matching degree with the first candidate information, as first reference information; and taking the candidate intents corresponding to the first n1 pieces of candidate information in the first reference information as first reference intents and the candidate intents corresponding to the last n2 pieces as second reference intents, generating a sample of the first type based on the first reference intents, the first candidate information and the first intent, and generating a sample of the second type based on the second reference intents, the first candidate information and the first intent, wherein n1 and n2 are smaller than m.
  9. The method of claim 6, wherein the large language model obtained by a single optimization in the further training comprises a second model, the second model being obtained by optimizing a first model online, the method further comprising: comparing the model performance of the first model and the second model on a test set; and replacing the first model with the second model in a case where the model performance of the second model exceeds that of the first model and the improvement reaches an update threshold.
  10. The method of claim 9, wherein the test set comprises at least one of: a long-tail intent sample; a confusable intent sample, a confusable intent being a close intent whose semantic similarity satisfies a similarity condition; and a sample whose intent is obtained by verifying or correcting an intent generated by a large language model.
  11. The method of claim 9, further comprising: discarding the second model in a case where the model performance of the second model is lower than that of the first model, or is higher than that of the first model but the improvement does not reach the update threshold.
  12. An apparatus for intent recognition, the apparatus comprising: an acquisition unit configured to acquire first input information; a detection unit configured to detect a matching degree between the first input information and each piece of candidate information in a corpus, wherein each piece of candidate information corresponds to a single candidate intent; a selection unit configured to, in a case where the maximum of the matching degrees is smaller than a first threshold, select a first number of pieces of candidate information in descending order of matching degree and take their corresponding candidate intents as reference intents; and a determination unit configured to take the reference intents and the first input information as prompt information for a large language model and determine a first target intent of the first input information according to an output of the large language model.
  13. A computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any one of claims 1-9.
  14. A computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method of any one of claims 1-9.
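The hybrid matching degree of claim 3 (text similarity weighted with semantic-vector similarity) can be sketched as follows. The specification does not give the weights, the text matcher, or the embedding model, so the `SequenceMatcher` stand-in, the `embed` callable, and the 0.4/0.6 weighting are all illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    # Character-level similarity as a stand-in for the text matcher.
    return SequenceMatcher(None, a, b).ratio()


def cosine(u, v):
    # Cosine similarity between two semantic vectors.
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def matching_degree(a: str, b: str, embed, w_text=0.4, w_sem=0.6) -> float:
    """Weighted combination of text and semantic-vector similarity."""
    return w_text * text_similarity(a, b) + w_sem * cosine(embed(a), embed(b))
```

In practice `embed` would be a sentence-embedding model; any callable mapping a string to a vector fits the interface above.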

Description

Intention recognition method and device

Technical Field

One or more embodiments of the present specification relate to the field of computer technology, and more particularly, to a method and apparatus for intent recognition.

Background

With the popularity of artificial intelligence applications, the demand for intent recognition systems is increasing. Typical intent recognition tasks often involve hundreds of intent categories. In large enterprise services, electronic commerce, or complex business scenarios, however, the number of intent categories often scales to thousands or tens of thousands (for example, the intent recognition scenario of a government service platform may contain thousands of intent categories) and keeps growing. Traditional small deep learning models (e.g., those based on BERT, ERNIE, etc.) have difficulty effectively distinguishing such huge sets of semantically similar intent categories, and difficulty adapting as the number of intent categories grows. Although large language models (LLMs) have excellent in-context learning and generalization capability, their huge parameter counts cause high inference latency and heavy computing resource consumption, which cannot meet the requirements of high-concurrency, low-latency production environments. In addition, training corpora for most intent categories are sparse and the intent distribution exhibits a long-tail effect, so models perform poorly on low-frequency intents. When business intents change dynamically, manual labeling and model retraining must be performed frequently, and iteration efficiency is low. Therefore, how to provide an effective and feasible intent recognition scheme is a technical problem to be solved.

Disclosure of Invention

One or more embodiments of the present specification describe a method and apparatus for intent recognition to address one or more of the problems mentioned in the Background.
According to a first aspect, a method for intent recognition is provided, comprising: obtaining first input information; detecting the matching degree between the first input information and each piece of candidate information in a corpus, wherein each piece of candidate information corresponds to a single candidate intent; in a case where the maximum matching degree is smaller than a first threshold, selecting a first number of pieces of candidate information in descending order of matching degree and taking their corresponding candidate intents as reference intents; and taking the reference intents and the first input information as prompt information for a large language model, and determining a first target intent of the first input information according to the output of the large language model. In one embodiment, in a case where the maximum matching degree is greater than or equal to the first threshold, the piece of candidate information corresponding to the maximum is selected, and its corresponding candidate intent is taken as the target intent. In one embodiment, the pieces of candidate information comprise first candidate information, and a first matching degree between the first candidate information and the first input information is determined by determining a text similarity and a semantic vector similarity between the first candidate information and the first input information, respectively, and obtaining the first matching degree as a weighted combination of the two.
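The prompt information described in the first aspect bundles a task description, the reference intents, and the input to classify, optionally with Few-Shot examples. The specification does not give a concrete template, so the format below is purely an assumed illustration.

```python
def build_few_shot_prompt(user_input, reference_intents, examples):
    """Assemble prompt information: task description, reference (candidate)
    intents, Few-Shot examples, and the input to classify."""
    lines = [
        "You are an intent classifier. Choose exactly one intent from the",
        "candidate list, or answer UNKNOWN if none applies.",
        "",
        "Candidate intents: " + ", ".join(reference_intents),
        "",
    ]
    for utterance, intent in examples:
        lines.append(f"Input: {utterance}")
        lines.append(f"Intent: {intent}")
        lines.append("")
    lines.append(f"Input: {user_input}")
    lines.append("Intent:")
    return "\n".join(lines)
```

The explicit UNKNOWN escape hatch mirrors the embodiment in which the model may indicate that no reference intent applies, triggering the manual/stronger-model fallback.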
In one embodiment, determining the first target intent of the first input information according to the output of the large language model comprises: in a case where the large language model outputs one of the reference intents, taking that reference intent as the first target intent; and in a case where the large language model outputs an intent other than the reference intents, or an indication that the target intent cannot be determined from the reference intents, determining the first target intent manually or by a stronger model. In a further embodiment, when the first target intent is determined manually or by a stronger model, the method further comprises adding the first target intent to the candidate intents, and adding at least one of the first input information and expanded information, obtained by expanding the first input information with a large language model, to the candidate information corresponding to the first target intent. In one embodiment, the large language model is a pre-trained natural language model, and the method further comprises adding the first input information and the first target intent to the corpus for use in constructing training samples for further training of the large language model during online use, the further training comprising at least one of supervised fine-tuning and reinforcement learning performed at predetermined cycles. In a further embodiment, a single training sample includes prompt information, determined based on input information, task description information and several candidate intents, and a target intent as supervision information.
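The corpus self-growth step in the fallback embodiment (absorbing a manually resolved intent and its LLM-generated expansions back into the corpus) could be sketched as follows. The dict-of-lists corpus layout and the `expand` callable are assumptions for illustration, not structures given in the specification.

```python
def absorb_fallback_result(corpus, candidate_intents, user_input,
                           target_intent, expand=None):
    """After a manual/stronger-model decision, grow the intent inventory and
    the corpus so the fast-search path can resolve this input next time."""
    if target_intent not in candidate_intents:
        candidate_intents.append(target_intent)  # register a new candidate intent
    # The original input becomes candidate information for the intent.
    corpus.setdefault(target_intent, []).append(user_input)
    if expand is not None:
        # Optional LLM-based expansion: paraphrases of the original input.
        corpus[target_intent].extend(expand(user_input))
    return corpus
```

Each absorbed input widens the fast-search path's coverage, so repeated fallbacks for the same intent become progressively rarer.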