CN-121982781-A - Human body key point movement habit extraction method and system


Abstract

The application discloses a method and a system for extracting movement habits of human body key points. The method comprises: constructing a three-dimensional coordinate system and a multi-camera synchronous acquisition system to record synchronous video data of human body movement; performing action time sequence segmentation on the synchronous video data to obtain a plurality of video sections with action names; extracting three-dimensional coordinates of the human body key points in each frame of each video section; calculating motion information of the human body key points in each video section from the three-dimensional coordinates; and constructing an action-motion information mapping relation table representing the movement habits of the human body key points. The system is used for executing the method. The beneficial effect is that synchronous multi-camera acquisition and the constructed three-dimensional coordinate system overcome the loss of spatial dimension information inherent in a single camera, enable accurate extraction of the three-dimensional position coordinates of human body key points, and provide real spatial motion information for robot learning.
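As an illustration of the motion-information step, the following Python sketch derives displacement, speed and acceleration for one key point from its per-frame three-dimensional coordinates and collects the results in a table keyed by action name. The finite-difference scheme, the scalar (speed-based) acceleration, and the names `motion_info`, `build_mapping_table` and `fps` are assumptions made for illustration; the application does not specify them.

```python
import math


def motion_info(coords, fps):
    """Compute motion information for one key point.

    coords: list of (x, y, z) tuples, one per frame (assumed layout).
    fps:    frame rate of the video section, used as the time base.
    """
    dt = 1.0 / fps
    # Displacement vector between each pair of consecutive frames.
    disp = [
        tuple(b[i] - a[i] for i in range(3))
        for a, b in zip(coords, coords[1:])
    ]
    # Speed: magnitude of each displacement divided by the frame interval.
    speed = [math.sqrt(sum(c * c for c in d)) / dt for d in disp]
    # Acceleration: change in speed between consecutive intervals.
    accel = [(s2 - s1) / dt for s1, s2 in zip(speed, speed[1:])]
    return {"displacement": disp, "speed": speed, "acceleration": accel}


def build_mapping_table(segments, fps):
    """Build an action -> key point -> motion information table.

    segments: {action_name: {keypoint_name: [(x, y, z), ...]}} (assumed layout).
    """
    return {
        action: {kp: motion_info(track, fps) for kp, track in keypoints.items()}
        for action, keypoints in segments.items()
    }


if __name__ == "__main__":
    # Hypothetical wrist track for a section named "wave".
    demo = {"wave": {"wrist": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]}}
    print(build_mapping_table(demo, fps=2))
```

A nested dictionary is used here only because it mirrors the "action name to motion information" relation directly; a database table would serve equally well.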

Inventors

YANG CHAO
LI XIANGYUN
YING GUOGANG

Assignees

宁波朗达科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260408

Claims (10)

1. A method for extracting movement habits of human body key points, characterized by comprising the following steps: S100, constructing a three-dimensional coordinate system and a multi-camera synchronous acquisition system, and acquiring data of the same behavior of a human body multiple times to obtain synchronous video data of human body movement; S200, performing action time sequence segmentation on the synchronous video data and assigning action semantics to obtain a plurality of video sections with action names, and constructing an optimal action time sequence set, specifically: constructing motion feature vectors from the mean and variance of the displacement, speed and acceleration of the human body key points, combined with the duration of the video section and the length of the motion trajectory of the human body key points; clustering the motion feature vectors of the same action name to obtain a plurality of motion pattern clusters; and taking the cluster with the highest representativeness score as the typical motion pattern cluster of the action name; S300, extracting three-dimensional coordinates of the human body key points in each frame of each video section, and normalizing them; S400, calculating motion information of the human body key points in each video section from the normalized three-dimensional coordinates, and constructing an action-motion information mapping relation table representing the movement habits of the human body key points; wherein the calculation formula of the representativeness score T_m is as follows: [formula omitted in source]; where N_m represents the number of samples in the cluster, N_total represents the total number of samples corresponding to the action name, σ_m represents the dispersion within the cluster, σ_total represents the total dispersion corresponding to the action name, Q_avg,m represents the intra-cluster average quality score, and λ_1, λ_2 and λ_3 represent the weight coefficients of the respective terms.
2. The method for extracting movement habits of human body key points according to claim 1, wherein in step S100, the constructed multi-camera synchronous acquisition system comprises at least one first camera arranged parallel to the front of the human body and at least one second camera arranged parallel to the side of the human body; the frame width of the first camera is used as the ground x-axis coordinate range of the three-dimensional coordinate system, the frame width of the second camera is used as the ground y-axis coordinate range of the three-dimensional coordinate system, and the common frame height of the two cameras is used as the z-axis coordinate range of the three-dimensional coordinate system.
3. The method for extracting movement habits of human body key points according to claim 1, wherein in step S100, for the multiple data acquisitions of the same behavior of the human body, a comprehensive quality score is calculated for each acquisition, and the acquisitions whose comprehensive quality score is higher than a set score threshold are used as the synchronous video data; wherein the calculation formula of the comprehensive quality score Q is as follows: Q=α·(1-Occ)+β·Conf+γ·Comp+δ·(1-Jerk); where Occ represents the occlusion rate of the human body key points, Conf represents the average confidence with which the human body key points are detected by the human body key point algorithm, Comp represents the action integrity score, Jerk represents the acceleration fluctuation rate of the human body key points, and α, β, γ and δ represent the weight coefficients of the respective terms.
4. The method for extracting movement habits of human body key points according to claim 1, wherein in step S200, the action time sequence segmentation is performed on the synchronous video data by the SlowFast video action algorithm, specifically comprising: inputting the synchronous video data into a SlowFast network; sampling video frames in the slow path at 1/α of the original frame rate to extract global action features; sampling video frames in the fast path at the original frame rate to extract detailed action features; and obtaining the action time sequence segmentation result through feature fusion, wherein α represents the frame-rate downsampling ratio of the slow path.
5. The method for extracting movement habits of human body key points according to claim 1, wherein in step S200, the acquired data corresponding to the typical motion pattern cluster is used as the typical sample data set of the action name, the cluster center of the typical motion pattern cluster is calculated, and sample data in the typical sample data set whose distance from the cluster center exceeds a set distance threshold are removed.
6. The method for extracting movement habits of human body key points according to claim 1, wherein in step S300, the three-dimensional coordinates of the human body key points in each frame of each video section are extracted by the OpenPose algorithm, and the obtained three-dimensional coordinates are divided by the corresponding coordinate range values to obtain normalized three-dimensional coordinates of the human body key points.
7. The method for extracting movement habits of human body key points according to any one of claims 1-6, wherein in step S100, synchronous video data of the human body interacting with a target object is recorded by the multi-camera synchronous acquisition system, and the movement habits of the human body key points are then refined and supplemented by performing motion analysis on the target object, comprising: calculating three-dimensional relative motion information of the human body key points relative to the target object and to each sub-module of the target object, thereby completing comprehensive extraction of the movement habits of the human body key points.
8. The method for extracting movement habits of human body key points according to claim 7, wherein obtaining the three-dimensional frame coordinates of the target object and the coordinates of its segmented sub-modules comprises: detecting the target object in the synchronous video data by using a YOLO three-dimensional target detection algorithm, and taking the diagonal corner coordinates of the resulting three-dimensional frame as the three-dimensional frame coordinates of the target object; and segmenting the target object into a plurality of sub-modules by using a Mask R-CNN three-dimensional segmentation algorithm, and taking the diagonal corner coordinates of the three-dimensional frame corresponding to each sub-module as the corresponding sub-module coordinates.
9. The method for extracting movement habits of human body key points according to claim 1, wherein feedback adaptive optimization is performed by a robot according to the obtained movement habit data of the human body key points, specifically comprising: deploying the obtained action-motion information mapping relation table to a target robot, the target robot executing action reproduction; during the movement of the target robot, acquiring the actual motion trajectory of the target robot and calculating the deviation between the actual motion trajectory and the given action-motion information mapping relation table; when the deviation exceeds a set deviation threshold, analyzing the cause of the deviation and triggering a movement habit optimization mechanism, namely: if the deviation is caused by insufficient execution capability of the target robot, performing action continuity optimization on the action-motion information mapping relation table to adapt the movement habits to the physical constraints of the robot; and if the deviation cannot converge after multiple optimizations, triggering an incremental acquisition mechanism, namely re-acquiring synchronous video data of human body movement and adding it to the typical sample data set after quality evaluation.
10. A system for extracting movement habits of human body key points, for performing the method for extracting movement habits of human body key points according to any one of claims 1 to 9, comprising: a multi-camera synchronous acquisition module, used for constructing a two-camera three-dimensional acquisition system and recording synchronous video data of human body movement; an action time sequence segmentation module, used for performing action segmentation on the synchronous video data by a video action algorithm to obtain video sections with action names; a key point extraction module, used for extracting three-dimensional position coordinates of a plurality of human body key points by a human body key point algorithm and normalizing the three-dimensional position coordinates; a multi-dimensional motion information calculation module, used for calculating multi-dimensional motion information comprising the displacement, speed and acceleration of the human body key points; and a data fusion storage module, used for integrating all motion information and constructing a human body key point movement habit database.
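The comprehensive quality score of claim 3 and the range normalization of claim 6 can be sketched in Python as follows. The equal weights, the threshold value, the assumption that Occ, Conf, Comp and Jerk are all scaled to [0, 1], and the names `quality_score`, `select_recordings` and `normalize` are illustrative assumptions, not values or identifiers given in the application.

```python
def quality_score(occ, conf, comp, jerk,
                  alpha=0.25, beta=0.25, gamma=0.25, delta=0.25):
    """Q = alpha*(1-Occ) + beta*Conf + gamma*Comp + delta*(1-Jerk).

    The default weights are placeholders; the claim leaves them open.
    """
    return alpha * (1 - occ) + beta * conf + gamma * comp + delta * (1 - jerk)


def select_recordings(recordings, threshold):
    """Keep the recordings whose quality score is above the threshold.

    recordings: list of dicts with keys "occ", "conf", "comp", "jerk"
    (an assumed layout for this sketch).
    """
    return [
        r for r in recordings
        if quality_score(r["occ"], r["conf"], r["comp"], r["jerk"]) > threshold
    ]


def normalize(point, ranges):
    """Divide each coordinate by the range of its axis (x, y, z)."""
    return tuple(c / r for c, r in zip(point, ranges))


if __name__ == "__main__":
    takes = [
        {"id": "take-1", "occ": 0.0, "conf": 1.0, "comp": 1.0, "jerk": 0.0},
        {"id": "take-2", "occ": 1.0, "conf": 0.0, "comp": 0.0, "jerk": 1.0},
    ]
    # Hypothetical threshold of 0.5; the claim only says "a set score threshold".
    print([t["id"] for t in select_recordings(takes, threshold=0.5)])
    # Hypothetical axis ranges taken from the frame widths and height.
    print(normalize((960.0, 540.0, 270.0), (1920.0, 1080.0, 1080.0)))
```

Dividing by the axis range maps coordinates into [0, 1] only if the raw coordinates are non-negative and measured from the frame origin, which this sketch assumes.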

Description

Human body key point movement habit extraction method and system

Technical Field

The application relates to the technical field of computer vision, and in particular to a method and a system for extracting movement habits of human body key points.

Background

With the development of service robots and industrial collaborative robots, accurate robot learning of human movement habits has become a core requirement for human-machine collaboration and robot action reproduction. Extracting human movement habits requires accurately acquiring the three-dimensional spatial motion information of human body key points and the relative motion relationship between the human body and surrounding objects. In the prior art, a single camera is used to detect human body key points, which leads to missing spatial dimension information, extraction of only a single motion parameter, and no consideration of the relative motion between the human body and objects. Meanwhile, existing methods do not segment human actions in time, and the extracted motion information contains only position coordinates that are not converted into multi-dimensional motion features such as displacement, speed and acceleration, so the requirement of robots for accurate learning of human movement habits cannot be met. In addition, existing human motion extraction methods do not finely segment and detect surrounding objects; they obtain only the overall position of an object and cannot extract the relative motion information between human body key points and the individual sub-modules of the object. As a result, a robot cannot accurately reproduce the detailed actions of human-object interaction after learning, and the fidelity and accuracy of the reproduced actions are low.

Disclosure of Invention

One objective of the present application is to provide a method for extracting movement habits of human body key points, which can solve at least one of the above-mentioned drawbacks of the related art. Another objective of the present application is to provide a system for extracting movement habits of human body key points, which can solve at least one of the above-mentioned drawbacks of the related art.

In order to achieve at least one of the above objectives, one aspect of the present application provides a method for extracting movement habits of human body key points, comprising the following steps: S100, constructing a three-dimensional coordinate system and a multi-camera synchronous acquisition system, and acquiring data of the same behavior of a human body multiple times to obtain synchronous video data of human body movement; S200, performing action time sequence segmentation on the synchronous video data and assigning action semantics to obtain a plurality of video sections with action names, and constructing an optimal action time sequence set, namely, for the plurality of video sections with the same action name, constructing motion feature vectors from the mean and variance of the displacement, speed and acceleration of the human body key points, combined with the duration of the video section and the length of the motion trajectory of the human body key points; S300, extracting three-dimensional coordinates of the human body key points in each frame of each video section, and normalizing them; S400, calculating motion information of the human body key points in each video section from the normalized three-dimensional coordinates, and constructing an action-motion information mapping relation table representing the movement habits of the human body key points; wherein the calculation formula of the representativeness score T_m is as follows: [formula omitted in source]; where N_m represents the number of samples in the cluster, N_total represents the total number of samples corresponding to the action name, σ_m represents the dispersion within the cluster, σ_total represents the total dispersion corresponding to the action name, Q_avg,m represents the intra-cluster average quality score, and λ_1, λ_2 and λ_3 represent the weight coefficients of the respective terms.

Preferably, in step S100, the constructed multi-camera synchronous acquisition system includes at least one first camera arranged parallel to the front of the human body and at least one second camera arranged parallel to the side of the human body, where the frame width of the first camera is used as the ground x-axis coordinate range of the three-dimensional coordinate system, the frame width of the second camera is used as the ground y-axis coordinate range of the three-dimensional coordinate system, and the common frame height of the two cameras is used as the z-axis coordinate range of the three-dimensional coordinate system.

Preferably, in step S100, the same behavior of the human body is subjected to multiple data collecti