CN-121983089-A - User emotion recognition method, system, equipment and storage medium based on LLM and context modeling


Abstract

The invention discloses a user emotion recognition method, system, device and storage medium based on an LLM and context modeling, relating to the technical field of artificial intelligence and natural language processing. The method comprises: collecting and structuring dialogue data from a customer service system, and using a large language model to perform automatic emotion labeling and screening to construct a weakly supervised training set; training a dedicated model that fuses deep semantic encoding with dialogue graph structure modeling on the weakly supervised training set; applying the dedicated model for inference, analyzing the error samples and summarizing the model's defect patterns; and guiding the large language model to generate targeted augmentation data, merging that data into the training set and retraining the dedicated model, so that a continuously optimized closed loop is formed.

Inventors

QIN KAI
LI HUI
WU WEIWEI
LI JING
Du Ruorong
YU BO
WANG KAI
ZHENG YI
WANG SHENGZHU
WEI GUOHUI
NONG HUIQING
Lin Zugui

Assignees

Guangxi Power Grid Co., Ltd. (广西电网有限责任公司)

Dates

Publication Date: 20260505
Application Date: 20260109

Claims (10)

1. A user emotion recognition method based on LLM and context modeling, comprising: collecting original multi-turn dialogue logs from an intelligent customer service system, and constructing a structured dialogue sequence; based on the structured dialogue sequence, automatically labeling the emotion of each user utterance using a large language model, and constructing a weakly supervised training set through confidence screening; training a dual-branch dedicated neural network model that integrates deep semantic encoding and dialogue graph structure modeling on the weakly supervised training set, to obtain a trained dedicated emotion recognition model; applying the trained dedicated emotion recognition model to new dialogue data or a validation set for inference, identifying mispredicted samples and performing cluster analysis, to obtain a description of the error patterns currently present in the dedicated emotion recognition model; and, based on the error pattern description, guiding the large language model to generate targeted augmented training data, merging it with the weakly supervised training set, and retraining the dedicated emotion recognition model, thereby forming closed-loop optimization.
2. The user emotion recognition method based on LLM and context modeling according to claim 1, wherein collecting original multi-turn dialogue logs from the intelligent customer service system and constructing a structured dialogue sequence comprises: collecting an original multi-turn dialogue log from the intelligent customer service system, wherein the dialogue log comprises utterances made alternately by a user and a customer service agent; cleaning the original dialogue log, organizing the cleaned log in the natural chronological order of the utterances, and constructing the structured dialogue sequence, wherein the structured dialogue sequence is stored as an ordered list, each element of the list corresponds to one utterance, and each utterance element comprises a speaker identity and the utterance text; the speaker identity distinguishes user utterances from customer service utterances, and the order of the utterances in the ordered list represents the temporal evolution of the dialogue, thereby forming a standardized, machine-processable dialogue data format.
3. The user emotion recognition method based on LLM and context modeling according to claim 2, wherein automatically labeling the emotion of each user utterance using a large language model based on the structured dialogue sequence, and constructing a weakly supervised training set through confidence screening, comprises: for each user utterance in the structured dialogue sequence, constructing input information comprising the target utterance and multiple turns of dialogue context; and inputting the input information into the large language model and guiding the large language model to perform the emotion analysis task based on a preset annotation prompt template, wherein the annotation prompt template comprises emotion category definitions, text processing rules, annotation element requirements and a structured output format specification.
4. The user emotion recognition method based on LLM and context modeling according to claim 3, wherein automatically labeling the emotion of each user utterance using a large language model based on the structured dialogue sequence, and constructing a weakly supervised training set through confidence screening, further comprises: setting a confidence screening threshold, and selecting, from all labeling results of the large language model, the samples whose confidence is higher than the confidence screening threshold to form the weakly supervised training set.
5. The user emotion recognition method based on LLM and context modeling according to claim 4, wherein training the dual-branch dedicated neural network model that integrates deep semantic encoding and dialogue graph structure modeling on the weakly supervised training set, to obtain the trained dedicated emotion recognition model, comprises: training a dual-branch neural network model on the weakly supervised training set, wherein the dual-branch neural network model comprises a semantic encoding branch and a dialogue structure modeling branch; in the semantic encoding branch, encoding each utterance in the dialogue with a pre-trained language model and extracting a deep semantic feature vector for each utterance; and in the dialogue structure modeling branch, modeling the whole dialogue as a graph, wherein the nodes of the graph correspond to the semantic features of the utterances, edges between nodes are established based on speaker roles and temporal relations to represent the interaction dependencies between utterances, and message passing and feature aggregation are performed on the graph by a graph neural network to obtain enhanced utterance representations that incorporate the structural information of the dialogue context.
6. The user emotion recognition method based on LLM and context modeling according to claim 5, wherein training the dual-branch dedicated neural network model that integrates deep semantic encoding and dialogue graph structure modeling on the weakly supervised training set, to obtain the trained dedicated emotion recognition model, further comprises: fusing the deep semantic feature vectors with the enhanced utterance representations, and outputting a probability distribution over emotion categories through a classifier; and optimizing the parameters of the dual-branch neural network model by backpropagation, using the weakly supervised training set and a standard loss function, to obtain the trained dedicated emotion recognition model.
7. The user emotion recognition method based on LLM and context modeling according to claim 6, wherein applying the trained dedicated emotion recognition model to new dialogue data or a validation set for inference, identifying mispredicted samples and performing cluster analysis, to obtain a description of the error patterns currently present in the dedicated emotion recognition model, comprises: applying the trained dedicated emotion recognition model to a dialogue data set to be analyzed to perform emotion prediction and obtain prediction results; comparing the prediction results with the corresponding reference labels, identifying the samples whose predictions are inconsistent, and forming an error sample set; performing feature analysis and clustering on the samples in the error sample set, and extracting the recognition defects of the dedicated emotion recognition model; and, based on the results of the cluster analysis, summarizing the systematic error patterns present in the dedicated emotion recognition model.
8. A user emotion recognition system based on LLM and context modeling, applying the method of any one of claims 1 to 7, comprising: a dialogue log structuring module, configured to collect original multi-turn dialogue logs from the intelligent customer service system and construct a structured dialogue sequence; an intelligent emotion labeling and screening module, configured to automatically label the emotion of each user utterance using a large language model based on the structured dialogue sequence, and to construct a weakly supervised training set through confidence screening; a context-aware emotion recognition model training module, configured to train a dual-branch dedicated neural network model that integrates deep semantic encoding and dialogue graph structure modeling on the weakly supervised training set, to obtain a trained dedicated emotion recognition model; a model performance diagnosis and pattern analysis module, configured to apply the trained dedicated emotion recognition model to new dialogue data or a validation set for inference, identify mispredicted samples and perform cluster analysis, to obtain a description of the error patterns currently present in the dedicated emotion recognition model; and a data augmentation and iterative optimization module, configured to guide the large language model to generate targeted augmented training data based on the error pattern description, merge that data with the weakly supervised training set, and retrain the dedicated emotion recognition model to form closed-loop optimization.
9. An electronic device, comprising: a memory and a processor; the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which, when executed by the processor, implement the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores computer-executable instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 7.
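
The data-handling steps recited in claims 2, 4, 5 and 7 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names (`Utterance`, `structure_dialogue`, `screen_by_confidence`, `build_dialogue_edges`, `group_errors`), the role labels "user" and "agent", the whitespace-only cleaning rule and the `window` parameter are assumptions introduced for illustration. The claims do not say how many earlier utterances an edge may reach, and grouping errors by (reference, predicted) label pair is a deliberately simple stand-in for the cluster analysis of claim 7.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Utterance:
    speaker: str  # hypothetical role labels: "user" or "agent"
    text: str


def structure_dialogue(raw_log: List[Tuple[str, str]]) -> List[Utterance]:
    """Clean a raw log and keep the utterances in their original order (claim 2)."""
    turns = []
    for speaker, text in raw_log:
        cleaned = " ".join(text.split())  # assumed cleaning rule: collapse whitespace
        if cleaned:  # drop utterances that are empty after cleaning
            turns.append(Utterance(speaker, cleaned))
    return turns


def screen_by_confidence(labeled: List[Dict], threshold: float) -> List[Dict]:
    """Keep LLM-labeled samples whose confidence is above the threshold (claim 4)."""
    return [s for s in labeled if s["confidence"] > threshold]


def build_dialogue_edges(turns: List[Utterance], window: int = 2) -> List[Tuple[int, int, str]]:
    """Link each utterance to up to `window` earlier ones, typed by speaker relation (claim 5)."""
    edges = []
    for j in range(len(turns)):
        for i in range(max(0, j - window), j):
            same = turns[i].speaker == turns[j].speaker
            edges.append((i, j, "same_speaker" if same else "cross_speaker"))
    return edges


def group_errors(samples: List[Dict]) -> Dict[Tuple[str, str], List[str]]:
    """Group mispredicted samples by (reference, predicted) label pair.

    A simple stand-in for the cluster analysis of claim 7, not a clustering algorithm.
    """
    groups = defaultdict(list)
    for s in samples:
        if s["predicted"] != s["reference"]:
            groups[(s["reference"], s["predicted"])].append(s["text"])
    return dict(groups)
```

In this sketch the edge list would be passed to a graph neural network together with the per-utterance feature vectors, and each error group would be summarized into an error pattern description that is used to prompt the large language model for augmentation data. Both of those steps are outside the scope of the example.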

Description

User emotion recognition method, system, equipment and storage medium based on LLM and context modeling

Technical Field

The invention relates to the technical field of artificial intelligence and natural language processing, and in particular to a user emotion recognition method, system, device and storage medium based on LLM and context modeling.

Background

With the rapid development of artificial intelligence technology, intelligent customer service systems have been widely adopted in industries such as e-commerce, finance, telecommunications and government services. During human-computer interaction, accurately identifying the user's emotional state (such as anger, anxiety, satisfaction or confusion) is important for improving service quality, optimizing the user experience and intervening promptly when negative emotions arise. Traditional emotion recognition methods rely mainly on keyword matching, rule engines or shallow machine learning models (such as SVM and random forest), and their recognition accuracy is limited by feature engineering and by the lack of context information. In recent years, deep learning-based emotion recognition methods have made progress, for example by using models such as LSTM or Transformer to model single-turn dialogue. However, these methods often ignore the complex semantic dependencies and the evolution of emotion across multi-turn dialogues, and have difficulty capturing how the user's emotion changes within the dialogue context. Other methods do model context, but consider only the semantic context and not the relationships between speakers, so they struggle to extract comprehensive, deep context features. In addition, the scarcity of high-quality annotated data severely restricts the training of dedicated emotion recognition models.
Although large language models (such as GPT) perform well at general semantic understanding, using them directly for emotion recognition suffers from high computational cost, high response latency and poor domain adaptability. In the prior art, some research attempts to use an LLM (Large Language Model) for data augmentation or weakly supervised labeling, but lacks a systematic closed-loop mechanism and fails to combine the generalization capability of the LLM with the efficient inference of a dedicated model. Meanwhile, context modeling for customer service dialogue scenarios remains at the level of sequence modeling, and does not fully exploit the influence of the interaction structure among dialogue participants (such as user and customer service turn-taking and topic shifts) on emotion recognition.

Disclosure of Invention

In view of the above problems, the present invention provides a user emotion recognition method, system, device and storage medium based on LLM and context modeling. It thereby solves the technical problems of low recognition accuracy, poor generalization capability and low iteration efficiency that arise when existing intelligent customer service systems perform user emotion recognition, which are caused by the lack of annotated data, insufficient modeling of multi-turn dialogue context and the difficulty of continuously optimizing the model.
In order to solve the above technical problems, the invention provides the following technical solutions: In a first aspect, the present invention provides a user emotion recognition method based on LLM and context modeling, including: collecting original multi-turn dialogue logs from an intelligent customer service system, and constructing a structured dialogue sequence; based on the structured dialogue sequence, automatically labeling the emotion of each user utterance using a large language model, and constructing a weakly supervised training set through confidence screening; training a dual-branch dedicated neural network model that integrates deep semantic encoding and dialogue graph structure modeling on the weakly supervised training set, to obtain a trained dedicated emotion recognition model; applying the trained dedicated emotion recognition model to new dialogue data or a validation set for inference, identifying mispredicted samples and performing cluster analysis, to obtain a description of the error patterns currently present in the dedicated emotion recognition model; and, based on the error pattern description, guiding the large language model to generate targeted augmented training data, merging it with the weakly supervised training set, and retraining the dedicated emotion recognition model, thereby forming closed-loop optimization.

As a preferred embodiment of the user emotion recognition method based on LLM and context modeling, wherein: the method for collecting the original multi-turn dialogue logs from the intelligent customer service system and constructing the struct