CN-115272927-B - Real-time human-computer interaction intention strength identification method
Abstract
The invention relates to the technical field of robot interaction, and in particular to a real-time human-computer interaction intention strength recognition method comprising the following steps. S1: collect video of pedestrians through an acquisition unit arranged on the robot body, perform frame extraction on the video, and extract the skeleton information of each pedestrian in each frame of image, in time order, using a 3D skeleton extraction model. S2: based on the skeleton information of each pedestrian in each frame of image, recognize each pedestrian's interaction reference information with a preset lightweight motion behavior recognition model. S3: analyze the three types of interaction reference information, namely the pedestrian's straight-line distance, orientation information, and motion behavior, to obtain the pedestrian's real-time interaction intention strength relative to the robot. The method can accurately identify a pedestrian's interaction intention, thereby improving the initiative and naturalness of a social robot when it interacts with people.
Inventors
- HE MIAO
- BI JIAN
- WANG YUJIN
- HU FANGCHAO
- PAN YONGKANG
Assignees
- Chongqing University of Technology (重庆理工大学)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2022-07-26
Claims (7)
- 1. A real-time human-computer interaction intention strength identification method, characterized by comprising the following steps: S1, acquiring video of pedestrians through an acquisition unit arranged on the robot body, performing frame extraction on the video, and extracting the skeleton information of each pedestrian in each frame of image, in time order, using a 3D skeleton extraction model; S2, based on the skeleton information of each pedestrian in each frame of image, identifying the interaction reference information of each pedestrian using a preset lightweight motion behavior recognition model, wherein the interaction reference information comprises the pedestrian's straight-line distance from the robot, orientation information, and motion behavior; S3, analyzing the three types of interaction reference information, namely the pedestrian's straight-line distance, orientation information, and motion behavior, to obtain the pedestrian's real-time interaction intention strength relative to the robot; S4, controlling the robot to execute a corresponding interaction decision according to the obtained interaction intention strength, wherein the interaction decisions comprise no interaction, heuristic interaction, and active interaction. In S2, the lightweight motion behavior recognition model is an improved ConvLSTM network with only one ConvLSTM layer; a high-order feature extraction layer is arranged between the input layer and the ConvLSTM layer, and consists of three high-order feature extractors and a gate function; after each frame of skeleton information is processed, the three extractors respectively extract the straight-line distance, the orientation information, and the inter-frame change features, and the gate function selects the feature information fed into the ConvLSTM layer. In S3, a preset DOM model is used to analyze the three types of interaction reference information, namely the pedestrian's straight-line distance, orientation information, and motion behavior, to obtain the pedestrian's real-time interaction intention strength relative to the robot. The DOM model comprises an S axis and a D axis perpendicular to each other: the S axis represents static information, comprising the pedestrian's distance intention value and orientation intention value, and the D axis represents dynamic information, comprising the pedestrian's motion intention value. The interaction intention is inferred with a Mamdani fuzzy reasoning system, whose working process comprises: fuzzifying the input straight-line distance, orientation information, and motion behavior according to preset fuzzy rules; inferring the pedestrian's interaction intention according to preset fuzzy inference rules to obtain a fuzzy value of the interaction intention; and defuzzifying that fuzzy value with a defuzzification algorithm to obtain an exact value of the pedestrian's interaction intention. The fuzzy inference rules comprise static inference rules and dynamic inference rules: the static rules judge the strength level of the interaction intention of a pedestrian in a standing state from the distance and orientation information, and the dynamic rules judge the strength level of the interaction intention of a pedestrian in an approaching, passing, or moving-away state from the straight-line distance and motion behavior. S3 comprises the following steps: S31, calculating the corresponding distance intention value from the pedestrian's straight-line distance, where the maximum amplitude is 10 and a preset scaling parameter adjusts the horizontal-axis span; S32, calculating the corresponding orientation intention value from the pedestrian's orientation information, where the maximum amplitude is 10 and a preset scaling parameter adjusts the horizontal-axis span; S33, calculating the corresponding motion intention value from the pedestrian's motion behavior, where the amplitude scaling factor is 5; S34, calculating the pedestrian's interaction intention strength value with the preset DOM model as a weighted combination of the static and dynamic intention values, where the two preset proportion parameters take the values 1, 0, or 0.5 according to the pedestrian's motion state, and determining the strength of the pedestrian's interaction intention according to this intention strength value.
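The S31-S34 fusion above can be sketched as follows. The patent's exact mapping formulas appear only as images in the original, so the sigmoid-style mappings, the reference distances, and the motion-intent lookup table below are illustrative assumptions; only the amplitudes (10, 10, 5) and the proportion-parameter values (1, 0, 0.5) come from claim 1.

```python
import math

A_DIST, A_ORIENT, A_MOTION = 10.0, 10.0, 5.0  # amplitudes stated in claim 1

def distance_intent(d, scale=1.0):
    # Assumed monotone-decreasing mapping: nearer pedestrians -> higher intent.
    return A_DIST / (1.0 + math.exp(scale * (d - 2.0)))

def orientation_intent(o, scale=1.0):
    # Assumed mapping: smaller orientation offset -> higher intent.
    return A_ORIENT / (1.0 + math.exp(scale * (o - 1.0)))

def motion_intent(behavior):
    # Illustrative lookup keyed by the four motion behaviors of claim 5.
    table = {"approaching": 1.0, "stationary": 0.5,
             "passing": 0.2, "moving away": 0.0}
    return A_MOTION * table[behavior]

def interaction_strength(d, o, behavior, moving=True):
    s_axis = distance_intent(d) + orientation_intent(o)  # static (S) axis
    d_axis = motion_intent(behavior)                     # dynamic (D) axis
    # Claim 1 switches the proportion parameters among 1, 0 and 0.5 by
    # motion state; the exact switching rule here is an assumption.
    alpha, beta = (0.5, 0.5) if moving else (1.0, 0.0)
    return alpha * s_axis + beta * d_axis
```

When the pedestrian is standing, the dynamic axis is weighted out entirely, matching the static-rule branch of the fuzzy inference rules.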
- 2. The real-time human-computer interaction intention strength identification method according to claim 1, wherein the working process of the high-order feature extractor for the straight-line distance comprises: extracting the three-dimensional coordinates of the pedestrian skeleton in a preset camera coordinate system, wherein the coordinate system takes the position of the acquisition unit as the origin, the y axis represents position in the depth direction of the acquisition unit, the x axis represents position in the horizontal direction perpendicular to the y axis, and the z axis represents position in the vertical direction; the height of the acquisition unit and the heads of all interaction objects are treated as the same height, and the pedestrian's straight-line distance is then calculated as d = √((x_n − x_c)² + (y_n − y_c)²), where d is the straight-line distance, x_n and y_n are the coordinates of the pedestrian's neck key point on the x and y axes of the three-dimensional coordinate system, and x_c and y_c are the coordinates of the acquisition unit on the x and y axes of the three-dimensional coordinate system.
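Because heights are treated as equal, claim 2's straight-line distance reduces to a planar Euclidean distance between the neck key point and the acquisition unit; a minimal sketch:

```python
import math

def straight_line_distance(neck_xy, unit_xy):
    """Horizontal straight-line distance between the pedestrian's neck
    key point and the acquisition unit (claim 2): only the x (horizontal)
    and y (depth) axes matter, since heights are unified."""
    (xn, yn), (xc, yc) = neck_xy, unit_xy
    return math.hypot(xn - xc, yn - yc)
```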
- 3. The real-time human-computer interaction intention strength identification method according to claim 2, wherein the working process of the high-order feature extractor for the orientation information comprises: calculating the deviation angle θ of the pedestrian's body orientation relative to the robot from the three-dimensional coordinates of the pedestrian's left and right shoulders; calculating the coordinates of the intersection point of the ray of the pedestrian's body orientation with the vertical plane of the robot; calculating the horizontal distance between that intersection point and the robot's position; and representing the pedestrian's orientation information as o = λ·|x_p − x_c|, where λ is a scaling factor, x_p is the x-axis value of the intersection point of the pedestrian's body orientation with the vertical plane where the acquisition unit is located, and x_c is the x-axis coordinate of the acquisition unit in the three-dimensional coordinate system.
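The geometry of claims 3 and 4 can be sketched as follows. The ray construction and the shoulder-order convention are assumptions (the patent's formulas are given only as images); the 100 cap and the 104 facing-away value follow claim 4.

```python
import math

def orientation_info(left_shoulder, right_shoulder, unit_x=0.0, scale=1.0):
    """Hedged sketch of claims 3-4. Shoulders are (x, y) points in the
    camera frame, y being the depth axis (robot plane at y = 0). Returns
    the body-orientation deviation angle in degrees and the orientation value."""
    (xl, yl), (xr, yr) = left_shoulder, right_shoulder
    mx, my = (xl + xr) / 2.0, (yl + yr) / 2.0
    # Body-facing direction: perpendicular to the shoulder line, under the
    # convention that the pedestrian's own left shoulder is passed first.
    fx, fy = yl - yr, xr - xl
    angle = math.degrees(math.atan2(abs(fx), -fy))
    if angle >= 90.0:
        return angle, 104.0            # claim 4: facing away -> fixed value 104
    t = my / -fy                       # ray parameter at the robot plane y = 0
    x_i = mx + t * fx                  # intersection x-coordinate
    return angle, min(scale * abs(x_i - unit_x), 100.0)  # claim 4: cap at 100
```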
- 4. The real-time human-computer interaction intention strength identification method according to claim 3, wherein when the deviation angle is less than 90°, the maximum value of the orientation information is set to 100, and when the deviation angle is greater than or equal to 90°, the value of the orientation information is set to 104.
- 5. The real-time human-computer interaction intention strength identification method according to claim 4, wherein the working process of the high-order feature extractor for inter-frame changes comprises: calculating the pedestrian's distance change Δx in the x-axis direction and distance change Δy in the y-axis direction as Δx = λ_x(x_k − x_{k−1}) and Δy = λ_y(y_k − y_{k−1}), where k is the current frame number, x_k is the x-axis coordinate of the pedestrian's neck key point, y_k is the y-axis coordinate of the pedestrian's neck key point, and λ_x and λ_y are scaling factors with λ_x = λ_y = 10; if Δx = Δy = 0, the motion behavior is stationary; if Δx = 0 and Δy < 0, the motion behavior is approaching; if Δx = 0 and Δy > 0, the motion behavior is moving away; and if |Δx| > 0, the motion behavior is passing.
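Claim 5's classification of inter-frame neck-keypoint changes can be sketched directly; the small tolerance `eps` is an assumption added to make the zero tests robust to floating-point noise.

```python
SCALE = 10.0  # λx = λy = 10 from claim 5

def classify_motion(prev_xy, cur_xy, eps=1e-6):
    """Map the scaled inter-frame change of the neck key point to one of
    the four motion behaviors of claim 5 (y is the depth axis)."""
    dx = SCALE * (cur_xy[0] - prev_xy[0])
    dy = SCALE * (cur_xy[1] - prev_xy[1])
    if abs(dx) > eps:
        return "passing"            # any lateral motion counts as passing
    if abs(dy) <= eps:
        return "stationary"
    return "approaching" if dy < 0 else "moving away"
```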
- 6. The real-time human-computer interaction intention strength identification method according to claim 5, wherein the fuzzy value of the interaction intention is defuzzified by the area gravity-center method: y = Σᵢ yᵢ·μ_Bi(yᵢ) / Σᵢ μ_Bi(yᵢ), where y is the exact value of the pedestrian's interaction intention after defuzzification, yᵢ is the point at which the membership function takes its maximum value, and μ_Bi is the membership function of the fuzzy set Bᵢ of the output quantity of a fuzzy rule.
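The discrete area gravity-center (centroid) computation is the membership-weighted mean over sampled output points; a minimal sketch, with the sample points and memberships passed in as parallel lists:

```python
def centroid_defuzzify(ys, mus):
    """Area gravity-center defuzzification: the crisp output is the
    membership-weighted mean of the sampled output points."""
    num = sum(y * m for y, m in zip(ys, mus))
    den = sum(mus)
    return num / den if den else 0.0  # zero total membership -> degenerate case
```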
- 7. The real-time human-computer interaction intention strength identification method according to claim 6, wherein in S3, when the fuzzy value of the interaction intention is defuzzified, the range of the resulting exact value of the pedestrian's interaction intention is normalized to [0, 1]; in S4, if the exact value of the pedestrian's interaction intention lies in [0, 0.4], the interaction decision is no interaction; if it lies in (0.4, 0.7], the interaction decision is heuristic interaction; and if it lies in (0.7, 1.0], the interaction decision is active interaction.
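Claim 7's threshold mapping from the normalized intent value to an interaction decision is a direct three-way branch:

```python
def interaction_decision(intent):
    """Map a normalized interaction-intention value in [0, 1] to one of
    the three decisions of claim 7, using its half-open intervals."""
    if intent <= 0.4:
        return "no interaction"
    if intent <= 0.7:
        return "heuristic interaction"
    return "active interaction"
```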
Description
Real-time human-computer interaction intention strength identification method
Technical Field
The invention relates to the technical field of robot interaction, and in particular to a real-time human-computer interaction intention strength identification method.
Background
The rise of social robots brings new opportunities to many industries, as people can obtain information or help through communication with them. At present, interaction between a social robot and a pedestrian is mostly triggered by the pedestrian's verbal command; although the robot can recognize human intention by understanding the command, in this mode the human must always explicitly command the robot, which defeats the original purpose of social robot design. Stated another way, most existing social robots can only passively respond to a user's request. Meanwhile, most users do not know the specific functions of a social robot and may not know whether it is working normally, so many users choose not to transact with or consult the robot, and the social robot rarely fulfills its practical function. In addition, for social robots with a realistic human appearance, such as Sophia and Nadine, unnatural interactions can make people fear them. To improve the naturalness of such social robot interactions, not only must the fluency of their actions be improved, but the robot's social environment perception capability should also be improved so as to increase its interaction initiative.
Therefore, a real-time human-computer interaction intention strength recognition method is needed that can accurately recognize the interaction intention of pedestrians, so that the initiative and naturalness of a social robot in interaction with humans are improved, the robot can better respond to users' needs or psychological states, and the naturalness and harmony of human-computer interaction are improved.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a real-time human-computer interaction intention strength identification method that can accurately identify the interaction intention of pedestrians, thereby improving the initiative and naturalness of a social robot in interaction with humans. To solve the technical problem, the invention adopts the following technical scheme. A real-time human-computer interaction intention strength identification method comprises the following steps: S1, acquiring video of pedestrians through an acquisition unit arranged on the robot body, performing frame extraction on the video, and extracting the skeleton information of each pedestrian in each frame of image, in time order, using a 3D skeleton extraction model; S2, based on the skeleton information of each pedestrian in each frame of image, identifying the interaction reference information of each pedestrian using a preset lightweight motion behavior recognition model, wherein the interaction reference information comprises the pedestrian's straight-line distance from the robot, orientation information, and motion behavior; S3, analyzing the three types of interaction reference information, namely the pedestrian's straight-line distance, orientation information, and motion behavior, to obtain the pedestrian's real-time interaction intention strength relative to the robot; and S4, controlling the robot to execute a corresponding interaction decision according to the obtained interaction intention strength, wherein the interaction decisions comprise no interaction, heuristic interaction, and active interaction.
Working principle and beneficial effects of the basic scheme: the invention provides a method capable of identifying pedestrian interaction intention strength. After dynamic video of a pedestrian is collected and frames are extracted, interaction reference information, comprising the pedestrian's straight-line distance from the robot, orientation information, and motion behavior, is identified from each frame of image in time order. The three types of interaction reference information are then fused to obtain the pedestrian's real-time interaction intention strength relative to the robot. Through comprehensive analysis of distance, orientation information, and motion behavior, the robot can accurately know the interaction intention strength of each pedestrian in its field of view, and can decide whether to interact actively and which interaction decision to adopt (such as heuristic interaction or direct active interaction) according to that strength. On the one hand, the dyn