
CN-122024331-A - 3D human body posture estimation and multi-key-point time sequence analysis method

CN122024331A

Abstract

The invention relates to the technical field of human motion analysis, in particular to a 3D human body posture estimation and multi-key-point time sequence analysis method. The method comprises the following steps: multi-modal video access; spatio-temporal perception and extraction of human key points; 3D posture reconstruction and reliability modeling; key-point time-series feature quantization; time-series phase perception and logic judgment; quantitative evaluation and intelligent diagnosis; and result output, in which the analysis result is output as a score, text, graphic or voice and is used to guide the user in correcting the action. The method addresses the insufficient time-series semantic understanding, the poor adaptability to individual differences, and the lack of interpretable, targeted guidance in the analysis results of the prior art.

Inventors

  • HUANG XINTING
  • ZHAO YUE
  • HAN JIAQI
  • LIU YUNFEI
  • WU LIHAO
  • XU MIN
  • ZHONG JIANDAN

Assignees

  • Chengdu University of Information Technology (成都信息工程大学)

Dates

Publication Date
2026-05-12
Application Date
2026-04-10

Claims (4)

  1. A 3D human body posture estimation and multi-key-point time sequence analysis method, characterized by comprising the following steps: S1, multi-modal video access: acquiring video data containing human motion through an image acquisition device; S2, spatio-temporal perception and extraction of human key points: extracting human key-point information from video frames using a deep-learning-based human detection model and posture estimation model, wherein the key-point information comprises at least the two-dimensional or three-dimensional coordinates of a plurality of human joints and the corresponding confidence values; S3, 3D posture reconstruction and reliability modeling: constructing a 3D human posture model from the key-point information, and assigning a reliability index to each key point or bone segment to characterize the reliability of the key-point posture estimate; S4, key-point time-series feature quantization: organizing the 3D key-point information into a continuous time series to generate multi-key-point time-series features; S5, time-series phase perception and logic judgment: performing time-series analysis of the human action based on the multi-key-point time-series features, identifying the different phases of the action, and judging whether the action satisfies a preset temporal logic relation; S6, quantitative evaluation and intelligent diagnosis: quantitatively evaluating the human action according to the time-series analysis result, outputting action completion, stability and standardization indices, and identifying the types of action deviation and the phases in which they occur; and S7, result output: outputting the analysis result as a score, text, graphic or voice to guide the user in correcting the action.
  2. The method for 3D human body posture estimation and multi-key-point time sequence analysis according to claim 1, wherein the reliability modeling in step S3 is constructed from kinematic-rationality reliability information, specifically: constructing a bone-length-ratio check score function, wherein for a bone segment formed by key points p and q, the current length is $L$, the preset expected length of the bone segment is $L_0$, and the length tolerance parameter is $\tau$; the length rationality score is: [formula missing in source text], wherein $\exp$ is the exponential function with the natural constant e as its base, $L$ is the actually measured length of the current bone segment, and $L_0$ is the expected length of the bone segment; constructing a joint-angle limit check score function, wherein for the joint angle $\theta$ formed by adjacent key points a and b with key point i as the vertex, the normal range of motion of the joint is $[\theta_{\min}, \theta_{\max}]$, and the angle rationality score is defined as: [formula missing in source text], wherein $\theta_{\min}$ is the lower limit of the normal range of motion of the joint, $\theta_{\max}$ is the upper limit of the normal range of motion of the joint, and $\theta_b$ is the nearest boundary value, with $\theta_b = \theta_{\min}$ when $\theta < \theta_{\min}$ and $\theta_b = \theta_{\max}$ when $\theta > \theta_{\max}$; $k$ is a preset constant that controls how quickly the score drops once the angle exceeds the limit; and the squared deviation is scale-normalized by a term whose symbol is missing in the source text.
  3. The method for 3D human body posture estimation and multi-key-point time sequence analysis according to claim 1, wherein the multi-key-point time-series features in step S4 comprise key-point spatial coordinate change features, joint angles and their rates of change, key-point trajectory features, and posture stability features. For the key-point spatial coordinate change feature, let the two-dimensional image coordinate of the i-th key point at frame t be $p_i(t) = (x_i(t), y_i(t))$, let the system frame rate be FPS, and let the time interval between adjacent frames be $\Delta t = 1/\mathrm{FPS}$ seconds; the corresponding n-th order kinematic quantity is uniformly expressed as $v_i^{(n)}(t) = \Delta^n p_i / (\Delta t)^n$, wherein $v_i^{(n)}(t)$ is the n-th order kinematic quantity vector of the i-th key point at frame t, n is a non-negative integer (n = 0 is position, n = 1 is velocity, n = 2 is acceleration), $\Delta^n p_i$ is the n-th order forward difference starting at frame t-n+1, and $(\Delta t)^n$ is the n-th power of the time interval. The joint angle and its rate of change adopt a confidence-weighted time-series fusion strategy, given by: [formula missing in source text], wherein $\theta_{t+k}$ is the original joint angle at frame t+k, $c_{t+k}$ is the confidence of the central key point of the joint at that frame, k is the time offset relative to the current frame t and is an integer in the range [-K, K], $w_k$ is the time-weighting factor, and the summation traverses all frames in the window from t-K to t+K. The key-point trajectory feature quantifies how economical and regular the key-point trajectory path is; a trajectory efficiency index $\eta_i$ is introduced, which relates the geometric characteristics of the trajectory closely to the energy efficiency of the signal and is specifically defined as $\eta_i = \lVert p_i(T) - p_i(1) \rVert / \left( \sum_{t=2}^{T} \lVert p_i(t) - p_i(t-1) \rVert + \varepsilon \right)$, wherein $p_i(t)$ is the two-dimensional coordinate vector of the i-th key point at frame t, $\lVert p_i(t) - p_i(t-1) \rVert$ is the Euclidean distance of the key point between adjacent frames, the sum is the total distance traveled by the key point from frame 2 to frame T, $\lVert p_i(T) - p_i(1) \rVert$ is the straight-line displacement of the key point between the first and last frames of the window, and $\varepsilon$ is a very small positive constant. The posture stability feature evaluates how stable the posture of the body core region is during movement or static holding; the multi-key-point cooperative stability index is defined as: [formula missing in source text], wherein $|J|$ is the number of key points in the set $J$, $\sigma_{x,j}^2$ is the sample variance of the x-coordinate of the j-th key point over the time window, $\sigma_{y,j}^2$ is the sample variance of the y-coordinate of the j-th key point over the time window, $L_j$ is the expected length of the bone segment corresponding to key point j, and $\sum_{j \in J}$ sums the corresponding expression over all key points in the set $J$.
  4. The method for 3D human body posture estimation and multi-key-point time sequence analysis according to claim 1, wherein the action phase identification based on the time-series features in step S5 takes the real-time joint angles and coordinate change rates as input and drives the state transitions of a state machine according to preset thresholds.
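The two kinematic-rationality checks of claim 2 can be illustrated with a minimal Python sketch. The formulas in this text are missing, so the functional forms below are assumptions chosen only to match the surviving definitions (an exponential of the length deviation with a tolerance parameter; an angle score that decays with the squared, range-normalized distance to the nearest boundary). The function names, the default k, and the choice of a score of 1 inside the normal range are all hypothetical and are not taken from the patent.

```python
import math


def bone_length_score(length, expected, tolerance):
    """Length rationality score in (0, 1].

    ASSUMED form: exp(-((L - L0) / (tau * L0))**2). The source only states
    that the score is an exponential of the measured length L, the expected
    length L0 and a tolerance parameter.
    """
    rel_dev = (length - expected) / (tolerance * expected)
    return math.exp(-rel_dev ** 2)


def joint_angle_score(theta, theta_min, theta_max, k=4.0):
    """Angle rationality score in (0, 1].

    ASSUMED form: 1 inside [theta_min, theta_max]; outside the range,
    exp(-k * ((theta - theta_b) / (theta_max - theta_min))**2), where
    theta_b is the nearest boundary and k controls the drop-off speed.
    """
    if theta_min <= theta <= theta_max:
        return 1.0
    boundary = theta_min if theta < theta_min else theta_max
    deviation = (theta - boundary) / (theta_max - theta_min)
    return math.exp(-k * deviation ** 2)


if __name__ == "__main__":
    # A forearm measured 10% longer than expected, with a 20% tolerance.
    print(bone_length_score(0.33, 0.30, 0.2))
    # A knee angle 15 degrees beyond an assumed 0-150 degree range.
    print(joint_angle_score(165.0, 0.0, 150.0))
```

Under these assumptions both scores equal 1 for a perfectly plausible skeleton and fall smoothly toward 0 as the bone length or joint angle becomes less plausible, which is what allows them to serve as per-segment reliability indices.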
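Three of the time-series features named in claim 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: the n-th order difference is taken causally over frames t-n to t (the claim text mentions a forward difference starting at frame t-n+1, so the exact indexing may differ by one frame), the confidence-weighted fusion is assumed to be a normalized weighted average with uniform time weights by default, and all function names are hypothetical.

```python
import math
from math import comb


def kinematic_quantity(track, t, n, fps):
    """n-th order kinematic quantity of one key point at frame t.

    track: list of (x, y) image coordinates, one entry per frame.
    n = 0 gives position, n = 1 velocity, n = 2 acceleration.
    ASSUMPTION: the n-th difference uses frames t-n .. t.
    """
    if n < 0 or t - n < 0 or t >= len(track):
        raise ValueError("frames t-n .. t must lie inside the track")
    dt = 1.0 / fps
    dims = len(track[t])
    diff = [0.0] * dims
    for j in range(n + 1):
        coeff = (-1) ** j * comb(n, j)  # binomial weights of the n-th difference
        for d in range(dims):
            diff[d] += coeff * track[t - j][d]
    return tuple(v / dt ** n for v in diff)


def fused_joint_angle(angles, confidences, t, K, weights=None):
    """Confidence-weighted average of the joint angle over frames t-K .. t+K.

    ASSUMPTION: normalized weighted average; weights[k] is the time weight
    for offset k and defaults to 1.0 for every offset.
    """
    if weights is None:
        weights = {k: 1.0 for k in range(-K, K + 1)}
    num = den = 0.0
    for k in range(-K, K + 1):
        idx = t + k
        if 0 <= idx < len(angles):  # skip offsets that fall outside the clip
            w = weights[k] * confidences[idx]
            num += w * angles[idx]
            den += w
    return num / den if den > 0 else angles[t]


def trajectory_efficiency(track, eps=1e-6):
    """Straight-line displacement divided by (total path length + eps)."""
    path = sum(math.dist(track[i], track[i - 1]) for i in range(1, len(track)))
    straight = math.dist(track[-1], track[0])
    return straight / (path + eps)


if __name__ == "__main__":
    wrist = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
    print(kinematic_quantity(wrist, 3, 1, fps=10))  # velocity in pixels/second
    print(trajectory_efficiency([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]))
```

A trajectory that moves in a straight line yields an efficiency close to 1, while a zigzag or jittery path yields a smaller value; a frame with zero confidence contributes nothing to the fused angle.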
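The threshold-driven state machine of claim 4 can be sketched for one concrete action. The example below assumes a squat tracked through the knee angle (degrees) and its rate of change (degrees per second); the phase names, the thresholds, and the class itself are hypothetical illustrations, since the claim only states that real-time joint angles and coordinate change rates drive state transitions according to preset thresholds.

```python
from enum import Enum


class Phase(Enum):
    STANDING = "standing"
    DESCENDING = "descending"
    BOTTOM = "bottom"
    ASCENDING = "ascending"


class SquatPhaseMachine:
    """Hypothetical four-phase squat state machine with fixed thresholds."""

    def __init__(self, stand_angle=160.0, bottom_angle=100.0, rate_eps=20.0):
        self.stand_angle = stand_angle    # knee angle regarded as upright
        self.bottom_angle = bottom_angle  # knee angle regarded as full depth
        self.rate_eps = rate_eps          # minimum |rate| that counts as moving
        self.phase = Phase.STANDING
        self.reps = 0

    def update(self, knee_angle, angle_rate):
        """Consume one frame and return the phase after any transition."""
        p = self.phase
        if (p is Phase.STANDING and knee_angle < self.stand_angle
                and angle_rate < -self.rate_eps):
            self.phase = Phase.DESCENDING
        elif p is Phase.DESCENDING and knee_angle <= self.bottom_angle:
            self.phase = Phase.BOTTOM
        elif p is Phase.BOTTOM and angle_rate > self.rate_eps:
            self.phase = Phase.ASCENDING
        elif p is Phase.ASCENDING and knee_angle >= self.stand_angle:
            self.phase = Phase.STANDING
            self.reps += 1  # one full cycle completed in the expected order
        return self.phase


def run_machine(angles, fps):
    """Feed a sequence of knee angles, deriving the rate by frame differencing."""
    machine = SquatPhaseMachine()
    phases = []
    prev = angles[0]
    for angle in angles:
        rate = (angle - prev) * fps
        phases.append(machine.update(angle, rate))
        prev = angle
    return phases, machine.reps


if __name__ == "__main__":
    phases, reps = run_machine([170, 150, 120, 95, 95, 120, 150, 170], fps=10)
    print([p.value for p in phases], reps)
```

Because a repetition is counted only when the phases occur in the expected order, a partial squat that never reaches the bottom threshold is not counted, which is one way to express the temporal logic check of step S5.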

Description

3D human body posture estimation and multi-key-point time sequence analysis method

Technical Field

The invention relates to the technical field of human motion analysis, in particular to a 3D human body posture estimation and multi-key-point time sequence analysis method.

Background

Human motion analysis technology takes posture estimation as its core and can extract the spatial position information of the general key points of the human body. It has been applied in physical education, physical training, action evaluation and similar fields. The prior art, however, has not built a systematic time-series analysis framework around multiple key points, so it has obvious deficiencies in the fine-grained, intelligent analysis of continuous human actions. The core problems are as follows:

1. Multi-key-point time-series semantic analysis capability is missing. The prior art mostly focuses on spatial feature extraction from single-frame key points or on simple temporal statistics. It does not mine the temporal associations and coordinated motion patterns among multiple key points, so it cannot precisely divide a continuous action into phases or identify the temporal logic of its execution. Miscounted and missed repetitions and blurred phase boundaries occur easily, and fine-grained analysis over the full cycle of a continuous action is difficult to achieve.

2. The temporal stability of the 3D posture is poor and there is no individual-adaptive mechanism. Under illumination changes, limb occlusion, jitter during rapid movement and similar factors, the 3D key points are prone to coordinate drift and jumps. No time-series noise reduction scheme aimed at multiple key points is available, so single-frame estimation errors are amplified along the time dimension. In addition, judgment relies on fixed thresholds or rigid comparison: the ranges of motion of the key points are not matched to individual physiological characteristics, and no dynamic adjustment is made for the temporal rate of change of the key points. Individual differences and changes in action rhythm therefore cannot be accommodated, and the objectivity and generality of the analysis results are insufficient.

3. Quantitative analysis of multi-key-point deviations is missing, and the analysis results have low interpretability and practicability. The prior art can only output a simple action classification, a count or an overall score. It does not accurately quantify the type and degree of deviation of each key point in the different time-series phases, cannot locate the specific key points and phases in which an action error occurs, and cannot generate targeted correction suggestions from the multi-key-point deviation characteristics. It is therefore difficult to meet the practical need for fine-grained guidance in physical education and professional training.
In summary, because the prior art does not fuse 3D posture estimation with multi-key-point time-series analysis, it has significant shortcomings in the time-series semantic analysis of continuous actions, posture robustness to interference, individual adaptation, and fine-grained diagnosis of action errors. An innovative technical scheme is needed to solve these problems and to improve the accuracy, generality and practicability of human motion analysis.

Disclosure of Invention

The invention aims to provide a 3D human body posture estimation and multi-key-point time sequence analysis method that solves the problems of insufficient time-series semantic understanding, poor adaptability to individual differences, and lack of interpretable, targeted guidance in the analysis results of human posture estimation and action analysis in the prior art.

In order to solve the technical problems, the invention adopts the following technical scheme:

A 3D human body posture estimation and multi-key-point time sequence analysis method comprises the following steps: S1, multi-modal video access, namely acquiring video data containing human motion through an image acquisition device; S2, spatio-temporal perception and extraction of human key points, namely extracting human key-point information from video frames using a deep-learning-based human detection model and posture estimation model, wherein the key-point information comprises at least the two-dimensional or three-dimensional coordinates of a plurality of human joints and the corresponding confidence values; S3, 3D posture reconstruction and reliability modeling, namely constructing a 3D human posture model from the key-point information, and assigning a reliability index to each key point or bone segment to characterize the reliability of the key-point posture estimate; S4, quantifying key point time sequence characteristics, namel