CN-122008253-A - LLVM core-based emotion perception robot control method and device
Abstract
Embodiments of this application provide a control method and a control device for an emotion perception robot based on an LLVM core, which realize effective emotion perception through multi-modal features and a neural network. A language processing mechanism is constructed that combines emotion adjustment with dialogue understanding, establishing a reliable interaction strategy. Expression control is introduced, and the coordination of emotion expression is ensured through action planning and voice synthesis. The method effectively addresses the shortcomings of traditional techniques in emotion recognition, language processing, emotion expression, and the like, and provides technical support for emotion perception robots.
Inventors
- Adwan Kabrazil
- Ding Jianyu
- Ding Xiaoduan
- Yin Yan
- Huang Kun
Assignees
- 深圳智慧林网络科技有限公司
Dates
- Publication Date
- 2026-05-12
- Application Date
- 2026-04-13
Claims (10)
- 1. An emotion perception robot control method based on an LLVM core, characterized by comprising the following steps: Carrying out feature extraction on an image data stream acquired by a camera and a voice data stream acquired by a microphone to obtain a multi-modal feature vector, carrying out time sequence alignment processing on the multi-modal feature vector to generate a standardized feature matrix, constructing an emotion feature mapper based on the standardized feature matrix to obtain a low-dimensional characterization vector, inputting the low-dimensional characterization vector into an emotion recognition neural network for training to obtain an emotion assessment model, calculating an emotion state vector according to the emotion assessment model, and establishing a memory buffer pool based on the emotion state vector to obtain an emotion evolution predictor; Inputting the emotion state vector and the output result of the emotion evolution predictor into a language model regulator to adjust attention parameters to obtain an emotion-enhanced language model, carrying out feature fusion on user input information and the emotion state vector to obtain an emotion perception input vector, inputting the emotion perception input vector into a dialogue intention understanding module to generate a dialogue history vector, generating an interaction strategy vector according to the dialogue history vector and the emotion state vector, and carrying out emotion intensity regulation on the interaction strategy vector to obtain an emotion expression control vector; Analyzing the emotion expression control vector into actuator driving parameters and voice synthesis parameters, performing collision detection on the actuator driving parameters to obtain a safety trajectory, controlling the output action of the robot actuator according to the safety trajectory, inputting the voice synthesis parameters into a voice prosody regulator to generate a voice waveform, and realizing the collaborative emotion expression of the robot based on the voice waveform and the output action.
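Claim 1 above defines a three-stage control loop: multi-modal perception, emotion-aware strategy planning, and coordinated expression. The sketch below illustrates only that data flow; every function name, dimension, and the fixed random projection are illustrative assumptions, not implementations disclosed in the patent.

```python
import numpy as np

def perceive(image_feats: np.ndarray, audio_feats: np.ndarray) -> np.ndarray:
    """Stage 1 (sketch): fuse multi-modal features into an emotion state vector.
    'Fusion' here is plain concatenation plus a softmax, so the result reads as
    a probability distribution over 4 hypothetical emotion classes."""
    fused = np.concatenate([image_feats, audio_feats])
    rng = np.random.default_rng(0)  # fixed random projection, stand-in only
    logits = rng.standard_normal((4, fused.size)) @ fused
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def plan_strategy(emotion_state: np.ndarray, intensity_gain: float = 1.0) -> np.ndarray:
    """Stage 2 (sketch): turn the emotion state into an expression control vector."""
    return intensity_gain * emotion_state

def express(control: np.ndarray) -> dict:
    """Stage 3 (sketch): split the control vector into actuator and voice channels."""
    half = control.size // 2
    return {"actuator_params": control[:half], "voice_params": control[half:]}

state = perceive(np.ones(8), np.ones(6))
out = express(plan_strategy(state, intensity_gain=0.8))
```

The real pipeline would replace each stand-in with the trained networks named in the claim; the point is only that the three stages hand a single vector down the chain.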
- 2. The LLVM core-based emotion perception robot control method of claim 1, wherein the feature extraction of the image data stream collected by the camera and the voice data stream collected by the microphone to obtain the multi-modal feature vector, the time sequence alignment processing of the multi-modal feature vector to generate the standardized feature matrix, and the construction of the emotion feature mapper based on the standardized feature matrix to obtain the low-dimensional characterization vector comprise: Performing edge detection and region segmentation on the image data stream to obtain an image feature region, extracting a local description operator from the image feature region to generate an image feature description set, converting the voice data stream into a spectrogram sequence, performing time-frequency analysis to obtain an acoustic feature description set, performing feature vector quantization on the image feature description set and the acoustic feature description set to generate a multi-modal feature coding matrix, and constructing a cross-modal alignment network based on the multi-modal feature coding matrix to obtain an alignment feature mapper; Inputting the multi-modal feature coding matrix into the alignment feature mapper for time sequence alignment processing to obtain a standardized feature sequence, carrying out principal component analysis on the standardized feature sequence according to a preset dimension reduction rule to generate a feature projection matrix, training a feature dimension reduction network based on the feature projection matrix to obtain an emotion feature mapper, and inputting the real-time standardized feature sequence into the emotion feature mapper to generate a low-dimensional characterization vector.
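The time sequence alignment and principal-component dimension reduction in claim 2 can be sketched with timestamped frames; nearest-timestamp pairing and the SVD-based projection are my own choices, since the patent does not specify the alignment network or dimension reduction rule.

```python
import numpy as np

def align_streams(img_t, img_f, aud_t, aud_f):
    """Nearest-timestamp alignment: for every image frame, pick the closest
    audio frame and concatenate the two feature vectors."""
    idx = np.abs(aud_t[None, :] - img_t[:, None]).argmin(axis=1)
    return np.hstack([img_f, aud_f[idx]])

def fit_pca_projection(X, k):
    """Feature projection matrix via SVD on the centred feature matrix
    (the top-k right singular vectors span the principal subspace)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T  # shape: (n_features, k)

img_t = np.array([0.00, 0.04, 0.08, 0.12])                 # 25 Hz image frames
aud_t = np.array([0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12])  # 50 Hz audio frames
rng = np.random.default_rng(1)
img_f = rng.standard_normal((4, 5))   # hypothetical image descriptors
aud_f = rng.standard_normal((7, 3))   # hypothetical acoustic descriptors

aligned = align_streams(img_t, img_f, aud_t, aud_f)  # standardized feature rows
W = fit_pca_projection(aligned, k=2)                 # feature projection matrix
low_dim = (aligned - aligned.mean(axis=0)) @ W       # low-dimensional characterization
```

Nearest-timestamp pairing is the simplest alignment rule; a learned cross-modal alignment network, as the claim describes, would replace `align_streams` while keeping the same input/output shapes.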
- 3. The LLVM core-based emotion perception robot control method of claim 1, wherein the inputting of the low-dimensional characterization vector into an emotion recognition neural network for training to obtain an emotion assessment model, the calculating of an emotion state vector according to the emotion assessment model, and the establishing of a memory buffer pool based on the emotion state vector to obtain an emotion evolution predictor comprise: Dividing the low-dimensional characterization vector into a training sequence and a verification sequence according to a time sequence segmentation rule, performing data enhancement processing on the training sequence to generate a training sample set, constructing a multi-layer perceptron network structure based on the training sample set to obtain an emotion recognition model prototype, performing iterative optimization training on the emotion recognition model prototype according to a preset loss function to obtain an emotion assessment model, and inputting the verification sequence into the emotion assessment model for cross validation to obtain a model assessment index; Inputting the real-time low-dimensional characterization vector into the emotion assessment model for forward calculation to obtain an emotion state vector, performing time sequence sliding sampling on the emotion state vector to generate an emotion state sequence, constructing a recurrent neural network based on the emotion state sequence to obtain an emotion evolution predictor, and deploying the emotion evolution predictor into a memory buffer pool for real-time state maintenance and prediction update.
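Claim 3's time-ordered train/validation split and sliding-window sampling can be sketched directly; the exponentially weighted predictor below is a deliberate stand-in for the recurrent evolution predictor, whose architecture the claim leaves open.

```python
import numpy as np

def temporal_split(seq, train_ratio=0.8):
    """Time sequence segmentation: earlier part for training, later part for
    validation (no shuffling, so validation data is strictly in the future)."""
    cut = int(len(seq) * train_ratio)
    return seq[:cut], seq[cut:]

def sliding_windows(states, width):
    """Time sequence sliding sampling over emotion state vectors."""
    return np.stack([states[i:i + width] for i in range(len(states) - width + 1)])

def predict_next(window):
    """Stand-in for the recurrent evolution predictor: exponentially weighted
    mean of the window, with recent states weighted more heavily."""
    w = 0.5 ** np.arange(len(window))[::-1]
    return (w[:, None] * window).sum(axis=0) / w.sum()

states = np.linspace(0, 1, 10)[:, None] * np.ones((1, 3))  # 10 steps of 3-dim states
train, val = temporal_split(states)
wins = sliding_windows(train, width=4)
pred = predict_next(wins[-1])
```

Because the toy states rise monotonically, the recency-weighted prediction sits above the plain window mean, which is the qualitative behaviour one would want from an evolution predictor tracking a trend.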
- 4. The LLVM core-based emotion perception robot control method of claim 1, wherein the inputting of the emotion state vector and the output result of the emotion evolution predictor into a language model regulator to adjust attention parameters to obtain an emotion-enhanced language model, the feature fusion of user input information and the emotion state vector to obtain an emotion perception input vector, and the inputting of the emotion perception input vector into a dialogue intention understanding module to generate a dialogue history vector comprise: Concatenating the emotion state vector and the output result of the emotion evolution predictor to obtain an emotion regulation vector, carrying out normalization processing on the emotion regulation vector to generate a weight distribution matrix, constructing an attention regulation network based on the weight distribution matrix to obtain a parameter regulation model, and carrying out parameter remapping on the output result of the parameter regulation model and a pre-trained language model to obtain an emotion-enhanced language model; Converting the user input information into a text sequence, performing word segmentation processing to obtain a token sequence, performing multi-head attention calculation on the token sequence and the emotion state vector to generate a fusion feature matrix, inputting the fusion feature matrix into a bi-directional encoder to obtain an emotion perception input vector, and constructing an intention recognition classifier based on the emotion perception input vector to generate a dialogue history vector.
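One concrete way to read claim 4's fusion of token features with the emotion state is emotion-conditioned attention. The sketch below uses single-head scaled dot-product attention with the emotion state added as a query bias; this is an assumption on my part — the patent does not specify how the attention parameters are adjusted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def emotion_biased_attention(tokens, emotion_state):
    """Scaled dot-product self-attention over token embeddings, with the
    emotion state vector added to every query as a conditioning bias."""
    d = tokens.shape[-1]
    q = tokens + emotion_state              # emotion-conditioned queries
    scores = q @ tokens.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)      # normalized weight distribution matrix
    return weights @ tokens, weights

rng = np.random.default_rng(2)
tokens = rng.standard_normal((5, 8))        # 5 tokens, 8-dim embeddings
emotion = 0.1 * rng.standard_normal(8)      # emotion state vector
fused, weights = emotion_biased_attention(tokens, emotion)
```

A multi-head version would repeat this with separate learned projections per head and concatenate the results; the single head keeps the conditioning mechanism visible.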
- 5. The LLVM core-based emotion perception robot control method of claim 1, wherein the generating of an interaction strategy vector according to the dialogue history vector and the emotion state vector, and the emotion intensity adjustment of the interaction strategy vector to obtain an emotion expression control vector comprise: Splicing the dialogue history vector and the emotion state vector to obtain a multi-modal state vector, carrying out hierarchical coding on the multi-modal state vector to generate a state characterization matrix, constructing a strategy generation network based on the state characterization matrix to obtain an interaction decision model, and carrying out Monte Carlo tree search optimization on the interaction decision model to obtain an interaction strategy vector; Decoupling and decomposing the interaction strategy vector according to the emotion type to obtain an intensity parameter set, constructing a self-adaptive regulator based on the intensity parameter set to generate a regulating coefficient matrix, and carrying out component cascade on the regulating coefficient matrix and the interaction strategy vector to obtain an emotion expression control vector.
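The intensity-adjustment half of claim 5 — decomposing the strategy vector by emotion type and cascading gain coefficients onto it — can be sketched as per-slice scaling. The emotion types, slice boundaries, and clipping rule below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical decomposition of the interaction strategy vector by emotion type.
EMOTION_SLICES = {"joy": slice(0, 2), "sadness": slice(2, 4), "anger": slice(4, 6)}

def adjust_intensity(strategy, gains, limit=1.5):
    """Apply a per-emotion gain, clipped to [0, limit], component-wise:
    the 'component cascade' of the coefficient matrix with the strategy
    vector, with unlisted emotions left at unit gain."""
    out = strategy.copy()
    for emotion, sl in EMOTION_SLICES.items():
        g = np.clip(gains.get(emotion, 1.0), 0.0, limit)
        out[sl] = g * strategy[sl]
    return out

strategy = np.array([0.5, 0.5, 0.2, 0.2, 0.9, 0.9])
control = adjust_intensity(strategy, {"joy": 1.2, "anger": 0.5})
```

The clip keeps any single emotion channel from saturating the expression, which is the practical purpose of an intensity regulator between strategy and actuation.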
- 6. The LLVM core-based emotion perception robot control method of claim 1, wherein the analyzing of the emotion expression control vector into actuator driving parameters and voice synthesis parameters, the collision detection on the actuator driving parameters to obtain a safety trajectory, and the controlling of the robot actuator output action according to the safety trajectory comprise: Performing dimensional decomposition on the emotion expression control vector according to a preset analysis rule to obtain an action parameter matrix, constructing a kinematic mapping network based on the action parameter matrix to generate a joint space mapper, performing constraint optimization on an output result of the joint space mapper to obtain the actuator driving parameters, and performing interpolation smoothing processing on the actuator driving parameters to generate an initial trajectory sequence; Inputting the initial trajectory sequence into a collision detection module for space interference analysis to obtain an obstacle avoidance path set, constructing a trajectory optimizer based on the obstacle avoidance path set to generate a safety trajectory, performing inverse kinematics solution on the safety trajectory according to a robot kinematics model to obtain a joint driving instruction, and controlling the actuator to complete the action output according to the joint driving instruction.
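Claim 6's interpolation smoothing and collision check can be sketched minimally: linear interpolation between waypoints and a point-versus-sphere clearance test. Both are simplifications of my own — real planners use splines and mesh or swept-volume collision checks — but they show where the safety trajectory gate sits in the pipeline.

```python
import numpy as np

def interpolate_track(waypoints, steps_per_segment=4):
    """Linear interpolation between joint-space waypoints (a simple stand-in
    for the patent's unspecified interpolation smoothing)."""
    pts = []
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        for t in np.linspace(0.0, 1.0, steps_per_segment, endpoint=False):
            pts.append((1 - t) * a + t * b)
    pts.append(waypoints[-1])
    return np.array(pts)

def is_safe(track, obstacle_center, obstacle_radius):
    """Naive space interference analysis: every trajectory point must stay
    outside a single spherical obstacle."""
    d = np.linalg.norm(track - obstacle_center, axis=1)
    return bool((d > obstacle_radius).all())

waypoints = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
track = interpolate_track(waypoints)
safe = is_safe(track, obstacle_center=np.array([0.5, 0.5, 0.5]),
               obstacle_radius=0.2)
```

A trajectory that fails the check would be handed to the trajectory optimizer for re-planning rather than sent to the actuator; only a passing trajectory proceeds to inverse kinematics.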
- 7. The LLVM core-based emotion perception robot control method of claim 1, wherein the inputting of the voice synthesis parameters into a voice prosody regulator to generate a voice waveform, and the realizing of the collaborative emotion expression of the robot based on the voice waveform and the output action comprise: Decomposing the voice synthesis parameters according to a phoneme structure to obtain a prosodic feature set, carrying out emotion mapping transformation on the prosodic feature set to generate a prosodic control vector, constructing an acoustic parameter generation network based on the prosodic control vector to obtain a voice synthesis model, and processing an output result of the voice synthesis model with a vocoder to obtain a voice waveform; Carrying out time sequence analysis on the voice waveform to obtain a voice time stamp sequence, synchronously aligning the voice time stamp sequence with an action execution time sequence to generate a collaborative control instruction, and scheduling the robot actuator and a voice player to complete multi-modal emotion expression based on the collaborative control instruction.
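The speech/action synchronization in claim 7 amounts to merging two timelines into one control instruction stream. The sketch below anchors action keyframes to phoneme start times; the phoneme durations and the event structure are illustrative assumptions, and durations are kept in integer milliseconds to avoid float comparison issues.

```python
def phoneme_timestamps(durations_ms):
    """Cumulative start times (milliseconds) from per-phoneme durations."""
    t, out = 0, []
    for d in durations_ms:
        out.append(t)
        t += d
    return out

def collaborative_schedule(phonemes, durations_ms, gestures):
    """Pair each gesture with the start time of its anchor phoneme, producing
    a merged, time-ordered control instruction list for actuator and player."""
    starts = phoneme_timestamps(durations_ms)
    events = [(t, "speak", p) for t, p in zip(starts, phonemes)]
    events += [(starts[i], "move", g) for g, i in gestures]
    return sorted(events)  # tuples sort by time, then channel name

schedule = collaborative_schedule(
    phonemes=["h", "e", "l", "o"],
    durations_ms=[80, 100, 90, 120],
    gestures=[("nod", 0), ("wave", 2)],  # gesture anchored to a phoneme index
)
```

Dispatching the sorted list to the actuator and the voice player gives the coordinated onset the claim describes: the nod starts with the first phoneme, the wave with the third.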
- 8. An emotion perception robot control device based on an LLVM core, characterized in that the device comprises: a model construction module, configured to perform feature extraction on an image data stream acquired by a camera and a voice data stream acquired by a microphone to obtain a multi-modal feature vector, perform time sequence alignment processing on the multi-modal feature vector to generate a standardized feature matrix, construct an emotion feature mapper based on the standardized feature matrix to obtain a low-dimensional characterization vector, input the low-dimensional characterization vector into an emotion recognition neural network for training to obtain an emotion assessment model, calculate an emotion state vector according to the emotion assessment model, and establish a memory buffer pool based on the emotion state vector to obtain an emotion evolution predictor; an emotion perception module, configured to input the emotion state vector and the output result of the emotion evolution predictor into a language model regulator to adjust attention parameters to obtain an emotion-enhanced language model, perform feature fusion on user input information and the emotion state vector to obtain an emotion perception input vector, input the emotion perception input vector into a dialogue intention understanding module to generate a dialogue history vector, generate an interaction strategy vector according to the dialogue history vector and the emotion state vector, and perform emotion intensity regulation on the interaction strategy vector to obtain an emotion expression control vector; and a robot control module, configured to analyze the emotion expression control vector into actuator driving parameters and voice synthesis parameters, perform collision detection on the actuator driving parameters to obtain a safety trajectory, control the output action of the robot actuator according to the safety trajectory, input the voice synthesis parameters into a voice prosody regulator to generate a voice waveform, and realize the collaborative emotion expression of the robot based on the voice waveform and the output action.
- 9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the LLVM core-based emotion perception robot control method of any one of claims 1 to 7.
- 10. A computer-readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the LLVM core-based emotion perception robot control method of any one of claims 1 to 7.
Description
LLVM core-based emotion perception robot control method and device
Technical Field
The application relates to the field of embodied robots, and in particular to a method and a device for controlling an emotion perception robot based on an LLVM core.
Background
Existing emotion perception robot control methods have obvious shortcomings. Traditional systems perform poorly in multi-modal perception and feature fusion and cannot accurately identify emotional states, which degrades the interaction effect. Furthermore, the prior art has bottlenecks in emotion modeling and language processing: most systems lack well-developed emotion evolution mechanisms and attention regulation strategies, resulting in inadequate conversational understanding. Existing systems also have technical shortcomings in emotion expression; the lack of deep coordination between actions and voice makes it difficult to achieve effective emotion communication through multi-modal output, affecting the interactive experience. Solving these problems is of great significance for improving the capability of emotion perception robots.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides a method and a device for controlling an emotion perception robot based on an LLVM core, which can effectively address the shortcomings of traditional techniques in emotion recognition, language processing, emotion expression, and the like, and provide technical support for emotion perception robots.
In order to solve at least one of the above problems, the application provides the following technical scheme: In a first aspect, the application provides a method for controlling an emotion perception robot based on an LLVM core, including: Carrying out feature extraction on an image data stream acquired by a camera and a voice data stream acquired by a microphone to obtain a multi-modal feature vector, carrying out time sequence alignment processing on the multi-modal feature vector to generate a standardized feature matrix, constructing an emotion feature mapper based on the standardized feature matrix to obtain a low-dimensional characterization vector, inputting the low-dimensional characterization vector into an emotion recognition neural network for training to obtain an emotion assessment model, calculating an emotion state vector according to the emotion assessment model, and establishing a memory buffer pool based on the emotion state vector to obtain an emotion evolution predictor; Inputting the emotion state vector and the output result of the emotion evolution predictor into a language model regulator to adjust attention parameters to obtain an emotion-enhanced language model, carrying out feature fusion on user input information and the emotion state vector to obtain an emotion perception input vector, inputting the emotion perception input vector into a dialogue intention understanding module to generate a dialogue history vector, generating an interaction strategy vector according to the dialogue history vector and the emotion state vector, and carrying out emotion intensity regulation on the interaction strategy vector to obtain an emotion expression control vector; Analyzing the emotion expression control vector into actuator driving parameters and voice synthesis parameters, performing collision detection on the actuator driving parameters to obtain a safety trajectory, controlling the output action of the robot actuator according to the safety trajectory, inputting the voice synthesis parameters into a voice prosody regulator to generate a voice waveform, and realizing the collaborative emotion expression of the robot based on the voice waveform and the output action. Further, the method further comprises: carrying out edge detection and region segmentation on an image data stream to obtain an image feature region, extracting a local description operator from the image feature region to generate an image feature description set, converting a voice data stream into a spectrogram sequence, carrying out time-frequency analysis to obtain an acoustic feature description set, carrying out feature vector quantization on the image feature description set and the acoustic feature description set to generate a multi-modal feature coding matrix, and constructing a cross-modal alignment network based on the multi-modal feature coding matrix to obtain an alignment feature mapper; Inputting the multi-modal feature coding matrix into the alignment feature mapper for time sequence alignment processing to obtain a standardized feature sequence, carrying out principal component analysis on the standardized feature sequence according to a preset dimension reduction rule to generate a feature projection matrix, training a feature dimension reduction network based on the feature projection matrix to obtain an emotion feature mapper, and inputting the standardized feature seq