
CN-121979978-A - Dialogue content adjusting method and device based on emotion state recognition

CN121979978A

Abstract

The invention relates to the technical field of artificial intelligence and discloses a dialogue content adjustment method and device based on emotional state recognition. The method comprises: collecting user interaction data during a user interaction process and analyzing the user's emotional state information from that data; analyzing, from the text data and voice data, the information coverage of the interaction target and the current dialogue context information of the interaction process; calculating the dialogue load currently corresponding to the user from the current emotional state and the current dialogue context information; generating questioning control parameters for the current dialogue turn from the dialogue load, the information coverage of the interaction target and the emotion development trend; and generating questions according to the questioning control parameters. The invention can thus sense the user's emotional state in real time and adaptively adjust the dialogue strategy, improving the fit between the dialogue flow and the user's emotions, reducing the user's dialogue pressure and resistance, and improving the efficiency of dialogue information acquisition.

Inventors

  • LI WEI

Assignees

  • 广州悦数信息科技有限公司

Dates

Publication Date
2026-05-05
Application Date
2025-12-29

Claims (10)

  1. A dialogue content adjustment method based on emotional state recognition, the method comprising: collecting user interaction data during a user interaction process, and analyzing emotional state information of a user from the user interaction data through an emotion analysis model, wherein the user interaction data comprises text data, voice data and facial expression data from a plurality of dialogue turns, and the emotional state information comprises a current emotional state and an emotion development trend; analyzing, from the text data and the voice data, the information coverage of an interaction target and the current dialogue context information of the user interaction process, and calculating the dialogue load currently corresponding to the user from the current emotional state and the current dialogue context information; generating questioning control parameters for the current dialogue turn from the dialogue load, the information coverage of the interaction target and the emotion development trend; and generating a question for the current dialogue turn according to the questioning control parameters.
  2. The dialogue content adjustment method based on emotional state recognition according to claim 1, wherein analyzing emotional state information of a user from the user interaction data through an emotion analysis model comprises: extracting features from the user interaction data through the emotion analysis model to obtain user interaction features, wherein the user interaction features comprise text features, acoustic features and visual features; performing feature normalization on the user interaction features to obtain normalized user interaction features, and performing time-sequence alignment on the normalized user interaction features based on a global timestamp to obtain target user interaction features; fusing the target user interaction features through the emotion analysis model to obtain a fused feature vector, and calculating a gating weight vector for the fused feature vector; calculating an initial emotional state of the user from the fused feature vector and the gating weight vector; acquiring a preset smoothing coefficient and the previous emotional state of the previous dialogue turn corresponding to the current dialogue turn, and smoothing the initial emotional state according to the smoothing coefficient and the previous emotional state to obtain the current emotional state of the user; and determining an emotion difference between the current emotional state and the previous emotional state, and determining the emotion development trend of the user according to the emotion difference.
  3. The dialogue content adjustment method based on emotional state recognition according to claim 2, wherein fusing the target user interaction features through the emotion analysis model to obtain a fused feature vector comprises: determining a confidence coefficient for each user interaction feature, and calculating an uncertainty coefficient for each user interaction feature according to its confidence coefficient; calculating fusion weights for the user interaction features according to their uncertainty coefficients, wherein the fusion weights comprise a text feature fusion weight, an acoustic feature fusion weight and a visual feature fusion weight, and each fusion weight is inversely related to the corresponding uncertainty coefficient; and weighting each user interaction feature by its fusion weight to obtain weighted features, and concatenating the weighted features to obtain the fused feature vector.
  4. The dialogue content adjustment method based on emotional state recognition according to any one of claims 1-3, wherein the current dialogue context information includes at least one of the user's average answer delay, speech recognition confidence and average answer length; and calculating the dialogue load currently corresponding to the user from the current emotional state and the current dialogue context information comprises: determining an input feature vector from the current emotional state and the current dialogue context information; inputting the input feature vector into a preset load mapping model to obtain a raw dialogue load; clipping the raw dialogue load to obtain an initial dialogue load constrained to a preset interval; acquiring historical dialogue data for a preset number of turns of the user interaction process, and calculating the user's historical load data from each item of historical dialogue data, wherein the historical load data comprises a historical load mean and a historical load variance; and performing individual normalization on the initial dialogue load according to the historical load data to obtain the dialogue load currently corresponding to the user.
  5. The dialogue content adjustment method based on emotional state recognition according to any one of claims 1-3, wherein the questioning control parameters include a dialogue depth control parameter, a question type control parameter, an information density control parameter, a dialogue rhythm control parameter and a load-gain tradeoff coefficient; and generating the questioning control parameters for the current dialogue turn from the dialogue load, the information coverage of the interaction target and the emotion development trend comprises: acquiring a preset dialogue depth grading threshold, and determining an initial dialogue depth control parameter from the dialogue load and the dialogue depth grading threshold; acquiring a preset upward hysteresis threshold and a preset downward hysteresis threshold, recording a first historical turn in which the historical dialogue load exceeded the upward hysteresis threshold, or a second historical turn in which the historical dialogue load fell below the downward hysteresis threshold, during the user interaction process, and adjusting the initial dialogue depth control parameter according to the first or second historical turn to obtain the dialogue depth control parameter controlling the dialogue depth of the current dialogue turn, wherein the dialogue depth comprises a non-selective type, a fact-clarification type, a cause-comparison type or a hypothetical-inference type; determining the question type control parameter controlling the question type of the current dialogue turn from the dialogue load and the information coverage of the interaction target; adjusting, according to the dialogue load, the previous information density control parameter and previous dialogue rhythm control parameter of the previous dialogue turn to obtain the information density control parameter and the dialogue rhythm control parameter; and adjusting, according to the emotion development trend, the previous load-gain tradeoff coefficient of the previous dialogue turn to obtain the load-gain tradeoff coefficient.
  6. The dialogue content adjustment method based on emotional state recognition according to claim 5, wherein generating a question for the current dialogue turn according to the questioning control parameters comprises: determining unfilled information slots corresponding to the interaction target according to the information coverage of the interaction target; generating candidate questions for each unfilled information slot according to the dialogue depth control parameter and the question type control parameter to obtain a candidate question set, wherein each candidate question corresponds to a single unfilled information slot or to at least two mutually associated information slots; estimating an estimated load cost and an expected information coverage gain for each candidate question according to the dialogue depth control parameter, the information density control parameter and the dialogue rhythm control parameter, wherein each expected information coverage gain represents the estimated increment in information coverage after selecting the corresponding candidate question; judging, for each candidate question, whether its estimated load cost is smaller than the dialogue load currently corresponding to the user, and determining the candidate question as a target question when it is; calculating, for each target question, a strategy utility according to the information coverage, the target question's estimated load cost and expected information coverage gain, and the load-gain tradeoff coefficient; and selecting the target question with the highest strategy utility as the question for the current dialogue turn.
  7. The dialogue content adjustment method based on emotional state recognition according to any one of claims 1-3, the method further comprising: collecting the user's current answer information for the question of the current dialogue turn, wherein the current answer information comprises current answer content and user response characteristics, and the user response characteristics comprise at least one of current answer delay, current question refusal rate, current number of answer revisions, current speech rate change and current repeated-answer frequency; analyzing, from the current answer information, the user's question-response emotional state for the question, and judging whether the question-response emotional state matches the current emotional state; and when the question-response emotional state does not match the current emotional state, determining the degree of difference between the question-response emotional state and the current emotional state, and updating the emotion analysis model according to that degree of difference.
  8. A dialogue content adjustment device based on emotional state recognition, the device comprising: an acquisition module for collecting user interaction data during a user interaction process; an analysis module for analyzing emotional state information of a user from the user interaction data through an emotion analysis model, wherein the user interaction data comprises text data, voice data and facial expression data from a plurality of dialogue turns, and the emotional state information comprises a current emotional state and an emotion development trend, the analysis module being further configured to analyze, from the text data and the voice data, the information coverage of an interaction target and the current dialogue context information of the user interaction process; a calculation module for calculating the dialogue load currently corresponding to the user from the current emotional state and the current dialogue context information; and a generation module for generating questioning control parameters for the current dialogue turn from the dialogue load, the information coverage of the interaction target and the emotion development trend, the generation module being further configured to generate a question for the current dialogue turn according to the questioning control parameters.
  9. A dialogue content adjustment device based on emotional state recognition, the device comprising: a memory storing executable program code; and a processor coupled to the memory, the processor invoking the executable program code stored in the memory to perform the dialogue content adjustment method based on emotional state recognition according to any one of claims 1-7.
  10. A computer storage medium storing computer instructions which, when invoked, perform the dialogue content adjustment method based on emotional state recognition according to any one of claims 1-7.
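The question selection of claim 6 can be illustrated with a short sketch. The admissibility test (estimated load cost below the user's current dialogue load) follows the claim's wording, but the linear utility form `gain − λ·cost`, the field names and the `select_question` helper are illustrative assumptions, since the patent does not give a concrete formula:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    slot: str             # unfilled information slot the question targets
    est_load_cost: float  # estimated dialogue-load cost of asking it
    exp_cov_gain: float   # expected information-coverage gain

def select_question(candidates, current_load, tradeoff):
    """Keep candidates whose estimated load cost is below the user's
    current dialogue load, then return the one with the highest strategy
    utility, taken here as coverage gain minus tradeoff-weighted cost."""
    admissible = [c for c in candidates if c.est_load_cost < current_load]
    if not admissible:
        return None  # no question fits the user's current load
    return max(admissible,
               key=lambda c: c.exp_cov_gain - tradeoff * c.est_load_cost)
```

A smaller `tradeoff` coefficient favors coverage gain (probing more aggressively); a larger one favors low-cost questions, which is how claim 5's per-turn adjustment of the coefficient would change questioning behavior as the user's emotion trend shifts.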

Description

Dialogue content adjusting method and device based on emotion state recognition

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a dialogue content adjustment method and device based on emotional state recognition.

Background

In user-oriented conversational interaction scenarios (e.g., intelligent customer service, online interviews, medical consultations, user research), dialogue systems have become the core vehicle for information gathering and service delivery. However, most existing dialogue systems drive the interaction with fixed scripts or preset strategies and do not adequately account for users' real-time and accumulated emotional changes. When the user is tired, resistant or stressed, such a system still pushes deep questioning or information output at the original pace, so the user's engagement drops, answer quality declines and key information is collected incompletely. These problems are especially prominent in emotion-sensitive scenarios such as complaint handling and psychological support, where they seriously affect service effectiveness and user experience. It is therefore important to provide a technical solution that senses the user's emotional state in real time and adaptively adjusts the dialogue strategy, improving the efficiency of dialogue information acquisition while reducing the user's dialogue pressure and resistance.

Disclosure of Invention

The invention provides a dialogue content adjustment method and device based on emotional state recognition, which can sense the user's emotional state in real time and adaptively adjust the dialogue strategy, thereby improving the efficiency of dialogue information acquisition while reducing the user's dialogue pressure and resistance.
To solve the above technical problem, a first aspect of the invention discloses a dialogue content adjustment method based on emotional state recognition, the method comprising: collecting user interaction data during a user interaction process, and analyzing emotional state information of a user from the user interaction data through an emotion analysis model, wherein the user interaction data comprises text data, voice data and facial expression data from a plurality of dialogue turns, and the emotional state information comprises a current emotional state and an emotion development trend; analyzing, from the text data and the voice data, the information coverage of an interaction target and the current dialogue context information of the user interaction process, and calculating the dialogue load currently corresponding to the user from the current emotional state and the current dialogue context information; generating questioning control parameters for the current dialogue turn from the dialogue load, the information coverage of the interaction target and the emotion development trend; and generating a question for the current dialogue turn according to the questioning control parameters.
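The load calculation in the second step above (detailed in claim 4) clips a raw model output to a preset interval and then normalizes it against the user's own recent load history. A minimal sketch, assuming a z-score form for the "individual normalization" (the patent specifies only that a historical mean and variance are used, not the exact formula):

```python
import statistics

def dialogue_load(raw_load, history, lo=0.0, hi=1.0):
    """Clip the load mapping model's raw output to [lo, hi], then
    normalize it against the user's historical load mean and standard
    deviation so the value is relative to this user's own baseline."""
    clipped = min(max(raw_load, lo), hi)
    if len(history) < 2:
        return clipped  # not enough history to normalize against
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)
    return clipped - mean if std == 0 else (clipped - mean) / std
```

Normalizing per user rather than globally means the same raw load reads as high for a user whose history is calm and as ordinary for one who routinely runs a heavy load, which is the point of the claim's per-user treatment.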
As an optional implementation of the first aspect, analyzing emotional state information of a user from the user interaction data through an emotion analysis model includes: extracting features from the user interaction data through the emotion analysis model to obtain user interaction features, wherein the user interaction features comprise text features, acoustic features and visual features; performing feature normalization on the user interaction features, and performing time-sequence alignment on the normalized features based on a global timestamp to obtain target user interaction features; fusing the target user interaction features through the emotion analysis model to obtain a fused feature vector, and calculating a gating weight vector for the fused feature vector; calculating the user's initial emotional state from the fused feature vector and the gating weight vector; acquiring a preset smoothing coefficient and the previous emotional state of the previous dialogue turn corresponding to the current dialogue turn, and smoothing the initial emotional state according to the smoothing coefficient and the previous emotional state to obtain the user's current emotional state; and determining an emotion difference between the current and previous emotional states, and determining the user's emotion development trend from that difference. As an optional implementation of the first aspect, fusing the target user interaction features through the emotion analysis model to obtain a fused feature vector includes: Determining th
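The smoothing and trend steps above can be sketched as exponential smoothing over a small emotion vector (e.g., valence and arousal dimensions). The vector representation, the default coefficient value and the scalar trend read-out are illustrative assumptions rather than the patent's exact formulation:

```python
def smooth_emotion(initial, previous, alpha=0.7):
    """Blend the current turn's raw emotion estimate with the previous
    turn's smoothed state using preset coefficient alpha, then derive a
    scalar development trend from the per-dimension emotion difference."""
    current = [alpha * x + (1 - alpha) * p for x, p in zip(initial, previous)]
    diff = [c - p for c, p in zip(current, previous)]
    trend = sum(diff) / len(diff)  # >0 improving, <0 deteriorating (assumed convention)
    return current, trend
```

The smoothing keeps a single noisy turn (a misread facial expression, a clipped utterance) from flipping the perceived emotional state, while the trend feeds claim 5's adjustment of the load-gain tradeoff coefficient.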