CN-122018680-A - Interactive processing method and device based on robot, computer equipment and medium

CN 122018680 A

Abstract

The application belongs to the technical field of artificial intelligence and relates to a robot-based interaction processing method, which comprises: collecting multimodal raw data of a user based on a multimodal sensing array; preprocessing the multimodal raw data to obtain an emotion data set; performing cross-modal feature fusion on the emotion data set to obtain an emotion feature vector; performing emotion analysis on the emotion feature vector based on a preset base emotion model to generate emotion prediction data; determining the current interaction scene based on a scene recognition module and acquiring the scene constraint conditions corresponding to the interaction scene; performing strategy generation on the interaction scene, the scene constraint conditions, and the emotion prediction data based on a base interaction strategy library to obtain an interaction strategy; and controlling the robot to interact with the user based on the interaction strategy. The application can be applied to interaction processing scenarios in the financial technology and digital healthcare fields, and improves both the accuracy of emotion recognition and the intelligence of robot interaction.

Inventors

  • WANG JIANZONG
  • SUN AOLAN

Assignees

  • 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)

Dates

Publication Date
2026-05-12
Application Date
2026-01-09

Claims (10)

  1. A robot-based interaction processing method, characterized by comprising the following steps: acquiring multimodal raw data of a user based on a preset multimodal sensing array; preprocessing the multimodal raw data to obtain a corresponding emotion data set; performing cross-modal feature fusion processing on the emotion data set to obtain a corresponding emotion feature vector; performing emotion analysis processing on the emotion feature vector based on a preset base emotion model to generate corresponding emotion prediction data; determining a current interaction scene based on a preset scene recognition module, and acquiring scene constraint conditions corresponding to the interaction scene; performing strategy generation processing on the interaction scene, the scene constraint conditions, and the emotion prediction data based on a preset base interaction strategy library to obtain a corresponding interaction strategy; and controlling the robot to perform corresponding interaction processing on the user based on the interaction strategy.
  2. The robot-based interaction processing method according to claim 1, wherein the multimodal raw data comprise visual data, audio data, haptic data, and physiological data, and the step of preprocessing the multimodal raw data to obtain a corresponding emotion data set specifically comprises: preprocessing the visual data based on a preset first preprocessing strategy to obtain corresponding target visual data; preprocessing the audio data based on a preset second preprocessing strategy to obtain corresponding target audio data; preprocessing the haptic data based on a preset third preprocessing strategy to obtain corresponding target haptic data; preprocessing the physiological data based on a preset fourth preprocessing strategy to obtain corresponding target physiological data; integrating the target visual data, the target audio data, the target haptic data, and the target physiological data based on a preset integration strategy to obtain corresponding integrated data; and taking the integrated data as the emotion data set.
  3. The robot-based interaction processing method according to claim 1, wherein the step of performing cross-modal feature fusion processing on the emotion data set to obtain a corresponding emotion feature vector specifically comprises: invoking a preset target feature extractor, wherein the target feature extractor comprises a plurality of feature extractors respectively corresponding to the modality types of the emotion data set; performing feature extraction on the emotion data set based on the target feature extractor to obtain corresponding multimodal features; performing fusion processing on the multimodal features based on a preset cross-modal fusion engine to obtain corresponding fusion features; optimizing the fusion features based on a preset optimization strategy to obtain corresponding target features; and taking the target features as the emotion feature vector.
  4. The robot-based interaction processing method according to claim 3, wherein the step of optimizing the fusion features based on a preset optimization strategy to obtain the corresponding target features specifically comprises: adjusting the fusion features based on a preset Bayesian inference model to obtain corresponding first processed features; acquiring a base profile of the user; performing personalized calibration processing on the first processed features based on the base profile to obtain corresponding second processed features; and taking the second processed features as the target features.
  5. The robot-based interaction processing method according to claim 1, wherein the step of performing emotion analysis processing on the emotion feature vector based on a preset base emotion model to generate corresponding emotion prediction data specifically comprises: invoking a pre-constructed base emotion model; performing emotion mapping processing on the emotion feature vector based on the base emotion model to obtain corresponding emotion state data; performing trend analysis on the emotion state data based on a preset emotion state transition module to obtain corresponding emotion trend prediction data; integrating the emotion state data and the emotion trend prediction data to obtain corresponding emotion integration data; and taking the emotion integration data as the emotion prediction data.
  6. The robot-based interaction processing method according to claim 1, wherein the step of performing strategy generation processing on the interaction scene, the scene constraint conditions, and the emotion prediction data based on the preset base interaction strategy library to obtain a corresponding interaction strategy specifically comprises: invoking a preset strategy generation engine; performing strategy matching against the base interaction strategy library according to the emotion prediction data based on the strategy generation engine to obtain a corresponding base interaction strategy; performing strategy adjustment on the base interaction strategy based on the interaction scene and the scene constraint conditions to obtain a corresponding designated interaction strategy; and taking the designated interaction strategy as the interaction strategy.
  7. The robot-based interaction processing method according to claim 1, wherein the step of controlling the robot to perform corresponding interaction processing on the user based on the interaction strategy further comprises: collecting emotion feedback data corresponding to the user; constructing an emotion preference library corresponding to the user based on the emotion feedback data; performing personalized optimization on the interaction strategy based on the emotion preference library to obtain an optimized target interaction strategy; and controlling the robot to perform corresponding interaction processing on the user based on the target interaction strategy.
  8. A robot-based interactive processing device, comprising: a collection module, used for acquiring multimodal raw data of a user based on a preset multimodal sensing array; a preprocessing module, used for preprocessing the multimodal raw data to obtain a corresponding emotion data set; a processing module, used for performing cross-modal feature fusion processing on the emotion data set to obtain a corresponding emotion feature vector; an analysis module, used for performing emotion analysis processing on the emotion feature vector based on a preset base emotion model to generate corresponding emotion prediction data; an acquisition module, used for determining a current interaction scene based on a preset scene recognition module and acquiring scene constraint conditions corresponding to the interaction scene; a generation module, used for performing strategy generation processing on the interaction scene, the scene constraint conditions, and the emotion prediction data based on a preset base interaction strategy library to obtain a corresponding interaction strategy; and an execution module, used for controlling the robot to perform corresponding interaction processing on the user based on the interaction strategy.
  9. A computer device, comprising a memory and a processor, wherein the memory stores computer-readable instructions which, when executed by the processor, implement the steps of the robot-based interaction processing method of any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that computer-readable instructions are stored thereon which, when executed by a processor, implement the steps of the robot-based interaction processing method of any one of claims 1 to 7.
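The end-to-end pipeline of claims 1 to 7 can be illustrated with a minimal Python sketch. All function names, emotion labels, threshold values, and the rule-based strategy library below are hypothetical illustrations, not the patent's actual models or implementation:

```python
# Hypothetical sketch of the claimed pipeline; names, labels, and
# rules are illustrative stand-ins, not the patent's implementation.

def preprocess(raw):
    # Claim 2: per-modality preprocessing, then integration into one
    # emotion data set (here: naive normalization to [0, 1]).
    return {m: [x / 100.0 for x in samples] for m, samples in raw.items()}

def fuse_features(dataset):
    # Claim 3: per-modality feature extraction followed by cross-modal
    # fusion (here: each modality reduced to its mean, then concatenated).
    return [sum(v) / len(v) for v in dataset.values()]

def predict_emotion(vector):
    # Claim 5: map the fused feature vector to an emotion state plus a
    # trend prediction (here: a toy threshold on the mean score).
    score = sum(vector) / len(vector)
    state = "negative" if score < 0.5 else "positive"
    return {"state": state, "trend": "stable", "score": score}

def generate_strategy(scene, constraints, emotion):
    # Claim 6: match a base strategy from the library by emotion, then
    # adjust it under the scene constraint conditions.
    base = "soothe" if emotion["state"] == "negative" else "inform"
    if "no_proactive_push" in constraints and base == "inform":
        base = "wait_for_query"
    return {"scene": scene, "action": base}

raw = {"visual": [40, 60], "audio": [70, 90],
       "haptic": [20, 30], "physiological": [55, 65]}
emotion = predict_emotion(fuse_features(preprocess(raw)))
strategy = generate_strategy("insurance_claim", {"no_proactive_push"}, emotion)
print(strategy)
```

The sketch keeps each claimed step as a separate function, mirroring the module decomposition of claim 8 (collection, preprocessing, processing, analysis, generation, execution).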

Description

Interactive processing method and device based on robot, computer equipment and medium

Technical Field

The application relates to the technical field of artificial intelligence and can be applied to fields such as financial technology and digital healthcare; it relates in particular to a robot-based interactive processing method, an interactive processing device, computer equipment, and a storage medium.

Background

In the field of robot interaction, emotion modeling for embodied robots has seen preliminary application in real scenarios, but it still has obvious technical limitations. Prior-art approaches rely on single-modality data (such as voice intonation, facial expression, or text semantics alone) for emotion judgment, so the dimensionality of emotion recognition is limited and the complex, changeable emotional states of a user are difficult to capture comprehensively. For example, in an intelligent customer-service scenario in the financial insurance field, a user may show both anxiety (hurried speech) and dissatisfaction (frowning) because of a complicated claims process, but a conventional single-modality model can only recognize one of these emotions, which easily leads to misjudgment. In addition, the mapping between existing emotion modeling results and robot interaction behavior mostly consists of fixed rules (such as "when anger is detected, switch to a pacifying routine") and lacks dynamic adaptability, making it difficult to cope with the rapid emotional changes and personalized needs of users in real scenarios. This problem is also prominent in the medical field.
For example, in a mental-health consultation scenario in digital healthcare, a patient may describe emotion only through words (such as "I have been in a bad state recently") due to privacy concerns or difficulty expressing themselves; because a conventional single-modality model cannot combine multidimensional information such as voice tremor and body movements, emotion recognition accuracy can fall below 30%. Meanwhile, fixed-rule interaction strategies (such as forcibly pushing a psychological assessment questionnaire) may aggravate the patient's resistance and reduce service compliance. These technical defects result in low accuracy and insufficient intelligence of embodied robots during interaction, making it difficult to meet the personalized service requirements of the financial insurance field and the interaction requirements of highly sensitive medical scenarios. An intelligent robot interaction technology is therefore needed to improve the comprehensiveness of emotion recognition and the adaptability of interaction behavior, so as to optimize user experience and service efficiency.

Disclosure of Invention

The embodiments of the application aim to provide a robot-based interaction processing method, device, computer equipment, and storage medium, so as to solve the technical problems of low accuracy and insufficient intelligence in the interaction process of existing robots.
In a first aspect, a robot-based interaction processing method is provided, comprising: acquiring multimodal raw data of a user based on a preset multimodal sensing array; preprocessing the multimodal raw data to obtain a corresponding emotion data set; performing cross-modal feature fusion processing on the emotion data set to obtain a corresponding emotion feature vector; performing emotion analysis processing on the emotion feature vector based on a preset base emotion model to generate corresponding emotion prediction data; determining a current interaction scene based on a preset scene recognition module, and acquiring scene constraint conditions corresponding to the interaction scene; performing strategy generation processing on the interaction scene, the scene constraint conditions, and the emotion prediction data based on a preset base interaction strategy library to obtain a corresponding interaction strategy; and controlling the robot to perform corresponding interaction processing on the user based on the interaction strategy.

In a second aspect, a robot-based interaction processing apparatus is provided, comprising: a collection module, used for acquiring multimodal raw data of a user based on a preset multimodal sensing array; a preprocessing module, used for preprocessing the multimodal raw data to obtain a corresponding emotion data set; a processing module, used for performing cross-modal feature fusion processing on the emotion data set to obtain a corresponding emotion feature vector; an analysis module, used for performing emotion analysis processing on the emotion feature vector based on a preset base emotion model to generate corresponding emotion prediction data; an acquisition module, used for determining a current interaction scene based on a preset scene recognition module and acquiring scene constraint conditions corresponding to the interaction scene; a generation module, used for performing strategy generation processing on the interaction scene, the scene constraint conditions, and the emotion prediction data based on a preset base interaction strategy library to obtain a corresponding interaction strategy; and an execution module, used for controlling the robot to perform corresponding interaction processing on the user based on the interaction strategy.
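The personalized calibration of claim 4 and the feedback-based optimization of claim 7 can be sketched as follows. The blending weight, baseline profile, and running-average update rule are hypothetical stand-ins: the patent only names a Bayesian inference model, a user base profile, and a per-user emotion preference library, without specifying their form:

```python
# Hypothetical sketch of claim 4 (calibration) and claim 7 (feedback);
# the prior weight and update rule are illustrative assumptions.

def calibrate(fused, profile):
    # Blend fused features toward the user's prior baseline from the
    # base profile: a simple stand-in for Bayesian posterior adjustment.
    w = profile.get("prior_weight", 0.3)
    baseline = profile.get("baseline", [0.5] * len(fused))
    return [(1 - w) * f + w * b for f, b in zip(fused, baseline)]

def update_preferences(prefs, feedback):
    # Claim 7: accumulate emotion feedback into the user's emotion
    # preference library via a running average per interaction action.
    for action, rating in feedback.items():
        old, n = prefs.get(action, (0.0, 0))
        prefs[action] = ((old * n + rating) / (n + 1), n + 1)
    return prefs

profile = {"prior_weight": 0.5, "baseline": [0.2, 0.2]}
calibrated = calibrate([0.8, 0.6], profile)
prefs = update_preferences({}, {"soothe": 0.9})
prefs = update_preferences(prefs, {"soothe": 0.7})
print(calibrated, prefs)
```

In this sketch, the preference library would then bias strategy selection toward actions with higher average feedback, which is one plausible reading of the "personalized optimization" step.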