CN-121973192-A - Robot control method, device, equipment and medium based on emotion enhancement
Abstract
The invention discloses a robot control method, device, equipment and medium based on emotion enhancement, relating to artificial intelligence in fields such as finance, medical treatment, insurance and banking. The method comprises: obtaining visual information and a natural language instruction from a user and generating an emotion embedded vector; inputting the visual information, the natural language instruction and the emotion embedded vector into a preset large language model, which generates a comprehensive instruction containing both task semantics and an emotion strategy; generating an action token from the task semantics and the emotion strategy, the action token comprising action parameters for controlling the robot's movement and language response content; and driving the robot to execute physical actions based on the action parameters while synchronously outputting the language response content. The invention achieves integrated control of the robot's physical assistance and emotional support, so that elderly users feel understood and cared for while receiving practical help, effectively improving service trust and the humanized experience.
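As a rough illustration of the control flow summarized above, the sketch below wires the four steps together in Python. Every helper name (analyze_emotion, llm_generate, robot, tts) and the ActionToken fields are hypothetical placeholders introduced here, not an implementation disclosed by the patent.

```python
from dataclasses import dataclass

@dataclass
class ActionToken:
    trajectory: list       # way-points for the arm (from the task semantics)
    speed: float           # execution speed chosen by the emotion strategy
    acceleration: float    # acceleration chosen by the emotion strategy
    response_text: str     # language response spoken alongside the motion

def control_step(visual_frame, utterance, analyze_emotion, llm_generate, robot, tts):
    # 1. Emotion analysis on vision + language -> emotion embedded vector
    emotion_vec = analyze_emotion(visual_frame, utterance)
    # 2. The large language model turns the three inputs into a comprehensive
    #    instruction carrying both task semantics and an emotion strategy
    instruction = llm_generate(visual_frame, utterance, emotion_vec)
    # 3. Comprehensive instruction -> action token (motion parameters + reply)
    token = ActionToken(
        trajectory=instruction["trajectory"],
        speed=instruction["speed"],
        acceleration=instruction["acceleration"],
        response_text=instruction["response_text"],
    )
    # 4. Drive the robot and output the language response synchronously
    robot.execute(token.trajectory, token.speed, token.acceleration)
    tts.speak(token.response_text)
    return token
```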
Inventors
- QU XIAOYANG
- MO WENTAO
Assignees
- 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-20
Claims (10)
- 1. A robot control method based on emotion enhancement, characterized by comprising the following steps: acquiring visual information and a natural language instruction of a user, and performing emotion analysis on the visual information and the natural language instruction to generate an emotion embedded vector; inputting the visual information, the natural language instruction and the emotion embedded vector into a preset large language model, and generating, by the large language model, a comprehensive instruction that simultaneously contains task semantics and an emotion strategy; generating an action token according to the task semantics and the emotion strategy, wherein the action token comprises action parameters for controlling the robot to move and language response content; and driving the robot to execute physical actions based on the action parameters, and synchronously outputting the language response content.
- 2. The emotion enhancement based robot control method of claim 1, wherein the performing emotion analysis on the visual information and the natural language instruction to generate an emotion embedded vector comprises: performing behavioral emotion analysis on the visual information to obtain a first emotion feature; performing intonation analysis on the natural language instruction to generate a second emotion feature; performing semantic analysis on the natural language instruction to generate a third emotion feature; and generating the emotion embedded vector based on the first emotion feature, the second emotion feature and the third emotion feature.
- 3. The emotion enhancement based robot control method of claim 1, wherein the generating an action token according to the task semantics and the emotion strategy comprises: determining an action trajectory in the action parameters based on the task semantics; determining an execution speed and an acceleration in the action parameters based on the emotion strategy; and generating the language response content based on the emotion strategy and the task semantics.
- 4. The emotion enhancement based robot control method of claim 3, wherein the determining an execution speed and an acceleration in the action parameters based on the emotion strategy comprises: acquiring a speed threshold and an acceleration threshold preset to correspond to the emotion strategy; and limiting the execution speed to be within the speed threshold and the acceleration to be within the acceleration threshold.
- 5. The emotion enhancement based robot control method of claim 3, wherein the generating the language response content based on the emotion strategy and the task semantics comprises: determining a voice emotion type based on the emotion strategy; and generating the language response content based on the voice emotion type and the task semantics.
- 6. The emotion enhancement based robot control method of claim 1, wherein after driving the robot to execute a physical action based on the action parameters and synchronously outputting the language response content, the method further comprises: collecting user feedback information in real time during execution, and dynamically adjusting the action or voice of the mechanical arm based on the user feedback information.
- 7. The emotion enhancement based robot control method of claim 6, wherein the dynamically adjusting the action or voice of the mechanical arm based on the user feedback information comprises: identifying an emotional state of the user based on the user feedback information; and if the emotional state is a preset negative state, controlling the mechanical arm to slow down its action speed and/or outputting a pacifying voice.
- 8. A robot control device based on emotion enhancement, comprising means for performing the method of any one of claims 1-7.
- 9. A computer device, characterized in that it comprises a memory on which a computer program is stored, and a processor which, when executing the computer program, implements the method according to any one of claims 1-7.
- 10. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
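Claim 2 fuses three emotion features into a single embedding: a behavioral feature from vision, an intonation feature and a semantic feature from the instruction. The following is a minimal sketch of one plausible way to do that; the extractor callables and the concatenate-and-normalize fusion are assumptions for illustration only, not the patent's specified implementation.

```python
import numpy as np

def build_emotion_embedding(visual_frame, utterance_audio, utterance_text,
                            visual_model, prosody_model, text_model):
    f1 = visual_model(visual_frame)       # first emotion feature: behavior/expression
    f2 = prosody_model(utterance_audio)   # second emotion feature: intonation
    f3 = text_model(utterance_text)       # third emotion feature: semantics
    # Simple fusion: concatenate and L2-normalize so downstream modules
    # receive a fixed-scale emotion embedded vector.
    vec = np.concatenate([f1, f2, f3])
    return vec / (np.linalg.norm(vec) + 1e-8)
```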
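Claims 3 and 4 let the task semantics fix the trajectory while the emotion strategy bounds how fast the arm may move. One possible reading is a per-strategy lookup followed by clamping, as in the sketch below; the strategy names and numeric thresholds are invented for illustration.

```python
# Preset speed/acceleration thresholds per emotion strategy (illustrative values).
LIMITS = {
    "calm":    {"speed": 0.25, "acceleration": 0.5},   # relaxed user: normal pace
    "anxious": {"speed": 0.10, "acceleration": 0.2},   # tense user: slow, gentle motion
}

def bound_motion(emotion_strategy, requested_speed, requested_acceleration):
    limit = LIMITS[emotion_strategy]
    # Limit the execution parameters to stay within the preset thresholds.
    speed = min(requested_speed, limit["speed"])
    acceleration = min(requested_acceleration, limit["acceleration"])
    return speed, acceleration
```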
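Claims 6 and 7 describe a runtime feedback loop: if the user's emotional state turns negative while the arm is moving, the arm slows down and/or a pacifying voice is output. A minimal sketch, assuming hypothetical arm, TTS and emotion-classification interfaces and an invented 0.5 slow-down factor:

```python
def feedback_loop(arm, tts, read_feedback, classify_emotion):
    while arm.is_moving():
        feedback = read_feedback()           # e.g. facial expression, speech
        state = classify_emotion(feedback)   # identify the user's emotional state
        if state in ("fear", "anxiety"):     # preset negative states
            arm.scale_speed(0.5)             # slow the motion down
            tts.speak("Don't worry, I will move slowly.")  # pacifying voice
```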
Description
Robot control method, device, equipment and medium based on emotion enhancement

Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a robot control method, device, equipment and medium based on emotion enhancement.

Background
With the continuous development of artificial intelligence and robot technology, elderly-care robots play an important role in assistive services in fields such as finance, medical treatment, insurance and banking. Current research and development of elderly-care robots shows a trend of functional differentiation. One type of robot focuses on executing physical tasks, such as delivering medicine, fetching drinking water or assisting walking through a mechanical arm; its technical core is improving the accuracy and reliability of motion control. Another type focuses on emotional companionship, such as comfort robots with bionic designs or dialogue robots with voice interaction capability, and aims to provide psychological comfort to the elderly through social interaction. Although both types have made progress in their respective fields, in practical applications they are generally independent of each other and lack functional synergy and architectural unification.

In real elderly-care scenarios, elderly users not only need robots to assist with daily tasks but also hope to receive emotional care and respect during the service. For example, when a robot delivers a cup, whether it speaks gently and whether its movements are gentle directly affect the elderly person's sense of psychological safety and the service experience. However, prior-art systems have difficulty combining emotional interaction with motion control in an organic way. Although affective computing has progressed in expression recognition and speech emotion analysis, these capabilities remain at the perception level and cannot be deeply coupled with the action generation process of the mechanical arm. Specifically, when an elderly person shows tension or fear, the mechanical arm executing the task may still move rapidly along its preset trajectory, which not only fails to relieve the user's anxiety but may even introduce secondary risks through abrupt movements. Conversely, when an elderly person expresses anxiety or withdrawal through language, the robot may recognize the emotion but cannot adjust its action rhythm or provide soothing speech while executing the physical operation.

The core requirements of elderly care include not only the efficiency and accuracy of task execution but also multidimensional goals such as building trust, psychological safety and emotional support. Existing robots focus only on "doing things right" while neglecting "doing things warmly", and often struggle to gain genuine acceptance from the elderly. Relying solely on a mechanical arm to execute tasks easily makes the service process feel impersonal, while a companion robot with only dialogue capability cannot meet elderly users' needs for actual nursing functions. Therefore, the current technical architecture has obvious limitations in elderly-care scenarios, and an integrated solution that deeply fuses motion control with emotional interaction is needed, so that the elderly can feel understood, cared for and respected while receiving practical help.
Disclosure of Invention
Embodiments of the invention provide a robot control method, device, equipment and medium based on emotion enhancement, aiming to solve the problem of how to enable a robot to provide effective emotional support while providing physical assistance.

In a first aspect, an embodiment of the present invention provides a robot control method based on emotion enhancement, comprising: acquiring visual information and a natural language instruction of a user, and performing emotion analysis on the visual information and the natural language instruction to generate an emotion embedded vector; inputting the visual information, the natural language instruction and the emotion embedded vector into a preset large language model, and generating, by the large language model, a comprehensive instruction that simultaneously contains task semantics and an emotion strategy; generating an action token according to the task semantics and the emotion strategy, wherein the action token comprises action parameters for controlling the robot to move and language response content; and driving the robot to execute physical actions based on the action parameters, and synchronously outputting the language response content.

In a further technical scheme, the performing emotion analysis on the visual information and the natural language instruction to generate an emotion embedded vector includes: performing behavioral emotion analysis on the visual information to obtain a first emotion feature
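To make the first aspect concrete, the following is a minimal sketch of how the comprehensive instruction could be obtained from a large language model. The prompt wording, the JSON schema and the llm_complete callable are assumptions introduced here for illustration and are not specified by the patent.

```python
import json

def comprehensive_instruction(llm_complete, scene_description, user_instruction, emotion_vec):
    # Pack the visual description, the natural language instruction and the
    # emotion embedding into one prompt, and ask the model for both the task
    # semantics and the emotion strategy.
    prompt = (
        "Scene: " + scene_description + "\n"
        "User request: " + user_instruction + "\n"
        "Emotion embedding: " + ",".join(f"{x:.3f}" for x in emotion_vec) + "\n"
        "Return a JSON object with fields 'task_semantics' (what to do) and "
        "'emotion_strategy' (how gently to act and how to reply)."
    )
    return json.loads(llm_complete(prompt))   # {'task_semantics': ..., 'emotion_strategy': ...}
```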