CN-121997973-A - Facial expression control method, device, equipment, medium and product of anthropomorphic robot
Abstract
The invention discloses a method, a device, equipment, a medium and a product for controlling the facial expression of an anthropomorphic robot, and relates to the technical field of bionic humans. The method first receives an external expression trigger instruction and then plans a corresponding facial expression action sequence. Specifically, when planning each facial expression action in the sequence, a silence delay is actively inserted before the action is executed, where the silence duration is the time from receipt of the instruction to the start of the action's execution, and the ratio of the silence duration to the execution duration of the corresponding action is controlled within a preset interval. Finally, the robot's facial actuators are driven to complete the expression based on the sequence. By actively introducing and quantitatively controlling the ratio of the 'stimulus-response' delay to the action duration, the method overcomes the mechanical and abrupt impression caused by the instantaneous response of the prior art, simulates the natural cognitive processing rhythm of humans, makes the motion of the robot's facial organs better conform to human motion patterns and response laws, and markedly improves the naturalness and lifelike quality of the expression.
Inventors
- HU CHENXU
- JIANG ZHEYUAN
- WANG CHENG
Assignees
- 北京松延动力科技集团股份有限公司
Dates
- Publication Date
- 20260508
- Application Date
- 20260108
Claims (10)
- 1. A method for controlling the facial expression of an anthropomorphic robot, comprising: receiving a facial expression trigger instruction from the outside; determining, according to the facial expression trigger instruction, at least one facial expression action to be independently executed and the execution duration of each of the at least one facial expression action, and planning a facial expression action sequence formed by the at least one facial expression action as follows: for each facial expression action, a silence delay with a corresponding silence duration is inserted before the corresponding action is executed, wherein the silence duration is the duration from the moment the facial expression trigger instruction is received to the moment the corresponding action starts to execute, and the ratio of the silence duration to the execution duration of the corresponding action falls within a preset interval; and driving the anthropomorphic robot based on the planned facial expression action sequence, with each facial actuator corresponding one-to-one to each facial expression action executing the corresponding facial expression action.
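The planning step of claim 1 can be sketched in a few lines; this is a minimal illustration only, and the function names, data layout and the sample ratio interval are assumptions for demonstration, not values taken from the patent:

```python
import random

# Assumed preset interval for silence/execution ratio (claim 1 only
# requires that the ratio fall within *some* preset interval).
RATIO_INTERVAL = (0.2, 0.6)

def plan_expression_sequence(actions):
    """Plan a facial expression action sequence.

    Before each action, a silence delay is inserted, sized so that
    silence / execution_time lies within RATIO_INTERVAL.

    `actions` is a list of (name, execution_time_s) pairs; the return
    value is a list of (name, silence_s, execution_time_s) triples.
    """
    lo, hi = RATIO_INTERVAL
    sequence = []
    for name, exec_time in actions:
        ratio = random.uniform(lo, hi)   # pick a ratio inside the preset interval
        silence = ratio * exec_time      # silence delay preceding this action
        sequence.append((name, silence, exec_time))
    return sequence
```

A driver layer would then wait `silence` seconds per entry before commanding the corresponding facial actuator for `exec_time` seconds.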
- 2. The method for controlling the facial expression of an anthropomorphic robot according to claim 1, wherein the ratio corresponding to a first facial expression action producing eye movement is greater than the ratio corresponding to a second facial expression action producing eyelid movement.
- 3. The method according to claim 1, wherein the ratio is dynamically adjusted by determining an emotion type according to the facial expression trigger instruction, and then decreasing the ratio when the emotion type is a fast-response emotion type, and increasing the ratio when the emotion type is a slow-response emotion type.
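Claim 3's emotion-dependent adjustment can be illustrated as follows; the emotion categories and scaling factors here are illustrative assumptions, since the patent does not enumerate them:

```python
# Assumed emotion classes: which emotions count as "fast-response" or
# "slow-response" is an illustrative guess, not specified in the claim.
FAST_EMOTIONS = {"surprise", "fear"}
SLOW_EMOTIONS = {"sadness", "contentment"}

def adjust_ratio(base_ratio, emotion_type):
    """Decrease the silence/execution ratio for fast-response emotions
    (react sooner) and increase it for slow-response emotions (linger
    before reacting), per claim 3."""
    if emotion_type in FAST_EMOTIONS:
        return base_ratio * 0.5   # assumed scaling factor
    if emotion_type in SLOW_EMOTIONS:
        return base_ratio * 1.5   # assumed scaling factor
    return base_ratio
```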
- 4. The method for controlling the facial expression of an anthropomorphic robot according to claim 1, wherein the silence duration is dynamically generated as follows: acquiring the execution duration of the corresponding facial expression action, the minimum silence duration, the maximum silence duration, and the value interval of the ratio; randomly generating a pure decimal belonging to the corresponding interval; and calculating the silence duration therefrom.
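The mathematical symbols in the translated text of claim 4 were lost, so the exact formula is not reproduced here. One plausible reading, offered purely as a sketch, is to interpolate between the minimum and maximum silence durations with a random pure decimal and then clamp the result so the silence/execution ratio stays inside the preset interval:

```python
import random

def generate_silence_duration(exec_time, t_min, t_max, ratio_interval):
    """Assumed reconstruction of claim 4's generation step:
    draw a pure decimal alpha in (0, 1), interpolate between the
    minimum and maximum silence durations, then clamp so that
    silence / exec_time remains within ratio_interval."""
    alpha = random.random()                   # pure decimal in (0, 1)
    silence = t_min + alpha * (t_max - t_min)
    lo, hi = ratio_interval
    return min(max(silence, lo * exec_time), hi * exec_time)
```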
- 5. The method for controlling the facial expression of an anthropomorphic robot according to claim 4, wherein the minimum silence duration and the maximum silence duration are obtained as follows: acquiring a plurality of pieces of prestored real facial expression video data corresponding to the facial expression trigger instruction; for each piece of the real facial expression video data, first analyzing the video data to obtain at least one piece of action intensity time-series data corresponding one-to-one to the at least one facial expression action, then determining the starting moment of the corresponding facial expression action from the action intensity time-series data and calculating the relative time difference between that starting moment and a reference starting moment, thereby obtaining at least one relative time difference corresponding one-to-one to the at least one facial expression action; and, for each of the at least one facial expression action, determining the corresponding minimum silence duration and maximum silence duration from the plurality of relative time differences corresponding one-to-one to the plurality of pieces of real facial expression video data, wherein the minimum silence duration is determined based on the minimum value or a lower percentile of the plurality of relative time differences, and the maximum silence duration is determined based on the maximum value or an upper percentile of the plurality of relative time differences.
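The statistics step of claim 5 reduces to taking a lower and an upper percentile (or the minimum and maximum) of the per-video onset offsets for one action. A small sketch, where the nearest-rank percentile method and the 5th/95th percentile choices are assumptions:

```python
def percentile(xs, p):
    """Simplified nearest-rank percentile of a list of numbers."""
    xs = sorted(xs)
    k = max(0, min(len(xs) - 1, round(p / 100 * (len(xs) - 1))))
    return xs[k]

def silence_bounds(onset_offsets, lower_pct=5, upper_pct=95):
    """Claim 5: derive (t_min, t_max) for one facial expression action
    from the onset offsets measured across many real-expression videos.
    The percentile values are assumptions; the claim also allows using
    the plain minimum and maximum."""
    return (percentile(onset_offsets, lower_pct),
            percentile(onset_offsets, upper_pct))
```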
- 6. The method for controlling the facial expression of an anthropomorphic robot according to claim 1, wherein the preset interval of the ratio is obtained as follows: acquiring a plurality of pieces of prestored real facial expression video data corresponding to the facial expression trigger instruction; for each piece of the real facial expression video data, first analyzing the video data to obtain at least one piece of action intensity time-series data corresponding one-to-one to the at least one facial expression action, then determining the starting moment and the finishing moment of the corresponding facial expression action from the action intensity time-series data, calculating a first relative time difference between the starting moment and a reference starting moment and a second relative time difference between the finishing moment and the starting moment, thereby obtaining at least one first relative time difference and at least one second relative time difference corresponding one-to-one to the at least one facial expression action; and, for each of the at least one facial expression action, determining a corresponding minimum silence duration and maximum silence duration from the plurality of first relative time differences corresponding one-to-one to the plurality of pieces of real facial expression video data, averaging the plurality of second relative time differences corresponding one-to-one to the plurality of pieces of real facial expression video data to obtain a corresponding average execution duration, and taking the minimum silence duration divided by the average execution duration as the lower limit of the corresponding ratio and the maximum silence duration divided by the average execution duration as the upper limit of the corresponding ratio, wherein the minimum silence duration is determined based on the minimum value or a lower percentile of the plurality of first relative time differences, and the maximum silence duration is determined based on the maximum value or an upper percentile of the plurality of first relative time differences.
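The ratio-interval derivation of claim 6 can be condensed into a few arithmetic steps. The sketch below uses the plain minimum and maximum of the first relative time differences (the claim also permits percentiles); function and variable names are illustrative:

```python
def ratio_interval(first_diffs, second_diffs):
    """Claim 6: derive the preset ratio interval for one action.

    first_diffs:  per-video onset offsets (silence-duration candidates).
    second_diffs: per-video execution durations (finish - start).

    Lower limit = min silence duration / average execution duration;
    upper limit = max silence duration / average execution duration.
    """
    t_min, t_max = min(first_diffs), max(first_diffs)
    avg_exec = sum(second_diffs) / len(second_diffs)
    return t_min / avg_exec, t_max / avg_exec
```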
- 7. A facial expression control device for an anthropomorphic robot, characterized by comprising a trigger instruction receiving unit, an action sequence planning unit and an actuator driving unit which are communicatively connected in sequence; the trigger instruction receiving unit is used for receiving a facial expression trigger instruction from the outside; the action sequence planning unit is used for determining, according to the facial expression trigger instruction, at least one facial expression action to be independently executed and the execution duration of each of the at least one facial expression action, and planning a facial expression action sequence formed by the at least one facial expression action as follows: for each facial expression action, a silence delay with a corresponding silence duration is inserted before the corresponding action is executed, wherein the silence duration is the duration from the moment the facial expression trigger instruction is received to the moment the corresponding action starts to execute, and the ratio of the silence duration to the execution duration of the corresponding action falls within a preset interval; and the actuator driving unit is used for driving the anthropomorphic robot based on the planned facial expression action sequence, with each facial actuator corresponding one-to-one to each facial expression action executing the corresponding facial expression action.
- 8. A computer device, comprising a storage module, a processing module and a transceiver module which are communicatively connected in sequence, wherein the storage module is used for storing a computer program, the transceiver module is used for receiving and transmitting messages, and the processing module is used for reading the computer program and executing the anthropomorphic robot facial expression control method according to any one of claims 1-6.
- 9. A computer-readable storage medium having instructions stored thereon which, when run on a computer, cause the computer to perform the anthropomorphic robot facial expression control method of any one of claims 1-6.
- 10. A computer program product comprising a computer program or instructions, characterized in that the computer program or instructions, when executed by a computer, implement the anthropomorphic robot facial expression control method according to any one of claims 1-6.
Description
Facial expression control method, device, equipment, medium and product of anthropomorphic robot

Technical Field

The invention belongs to the technical field of bionic humans, and particularly relates to a method, a device, equipment, a medium and a product for controlling the facial expression of an anthropomorphic robot.

Background

In recent years, with the progress of artificial intelligence and robotics, anthropomorphic robots (particularly social robots, entertainment robots, service robots, and the like) have developed rapidly. To achieve natural, harmonious and affective human-machine interaction, it has become important to give robots the ability to simulate human expression of emotion and intent. Facial expression generation and control for robots has therefore become an important area of current research and patent activity.

Prior-art schemes for improving the naturalness of robot facial expressions mainly follow two optimization directions.

(1) Structure and material optimization. Part of the prior art improves naturalness at the visual and tactile level by improving the robot's physical structure and surface materials, for example by adopting a larger number of more precise miniature servo motors or shape memory alloys to increase the number of facial degrees of freedom, or by manufacturing the robot's facial skin from materials closer to the texture and elasticity of human skin, such as flexible silicone or elastomers. Such hardware-level improvements can relieve the static stiffness of the expression's appearance to a certain extent, but they belong to the category of mechanical design and are generally accompanied by high cost, complex system design and difficult maintenance. More importantly, they cannot fundamentally solve the 'mechanical' and 'lifeless' character of the expression's motion pattern itself.

(2) Optimization of driving and control algorithms. Another main scheme focuses on improving the driving and execution algorithms at the software and control level, for example by adopting more sophisticated PID (proportional-integral-derivative) control, fuzzy control or neural network models to plan and control motor trajectories precisely, so that single or multiple expression actions execute more smoothly and accurately. However, these methods generally neglect a key factor affecting naturalness: the timing and rhythm with which human expressions arise and change.

Through intensive research and analysis, the common deficiency and bottleneck of the prior art is that the robot's facial execution system responds almost instantaneously, executing immediately after receiving the expression instruction. This 'zero-delay' response pattern, which pursues technical efficiency, runs counter to the true physiological and cognitive response laws of humans. In natural interpersonal interaction, after an individual receives an external stimulus (such as hearing a sentence or seeing a scene), the brain needs a short process of information processing, emotion generation and intent cognition before it drives the facial muscle groups, via the nerves, to display the corresponding expression. This subtle but essential cognitive-processing and physiological-preparation time, which exists between 'stimulus' and 'overt response', is the core rhythmic feature characterizing the 'realism' and 'naturalness' of a living being's response.
What the existing robot expression control technology lacks is precisely this 'humanized' response rhythm: it performs instant, lag-free expression switching and execution. Although this achieves high efficiency at the control-system level, it appears abrupt, uncanny and even uncomfortable to human observers, and is considered one of the important sources of the 'uncanny valley' effect and of a strong 'mechanical' impression. Through extensive searching, no patent or prior art has yet been disclosed that fundamentally solves the problem of robot expression naturalness from the perspective of actively introducing and systematically controlling the 'stimulus-response' delay and its ratio to the total action duration. Therefore, there is an urgent need in the art for a robot expression control scheme capable of simulating the natural reaction rhythm of humans, so as to overcome the above drawbacks of the prior art and move the human-computer interaction experience of anthropomorphic robots in a more natural and more approachable direction.

Disclosure of Invention

The invention aims to provide a facial exp