CN-121999524-A - Human body posture assessment method, device, equipment, storage medium and program product


Abstract

The embodiments of the present application provide a human body posture assessment method, apparatus, device, storage medium, and program product, and relate to the technical field of data processing. The method comprises: acquiring two-dimensional human body posture sequence data extracted from video data; performing three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and a pre-trained spatio-temporal enhanced Transformer model to obtain initial three-dimensional human body posture sequence data; and rotating the joints formed by the initial three-dimensional human body posture sequence data to a target position based on a coronal plane determined from the initial three-dimensional human body posture sequence data, to obtain target three-dimensional human body posture sequence data, which is used for action inspection. The method improves the accuracy of human body posture assessment.

Inventors

YANG DELIANG
GAI YANRONG
NIU XIAOTIE
WANG BIN

Assignees

北京工业职业技术学院
河北师范大学

Dates

Publication Date: 20260508
Application Date: 20241101

Claims (10)

1. A human body posture assessment method, characterized by comprising: acquiring two-dimensional human body posture sequence data extracted from video data; performing three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and a pre-trained spatio-temporal enhanced Transformer model to obtain initial three-dimensional human body posture sequence data; and rotating a joint formed by the initial three-dimensional human body posture sequence data to a target position based on a coronal plane determined from the initial three-dimensional human body posture sequence data to obtain target three-dimensional human body posture sequence data, wherein the target three-dimensional human body posture sequence data is used for action inspection.
2. The method of claim 1, wherein performing three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and the pre-trained spatio-temporal enhanced Transformer model to obtain the initial three-dimensional human body posture sequence data comprises: linearly mapping the coordinates of each joint in the two-dimensional human body posture sequence data to a high-dimensional space to obtain initial high-dimensional features; embedding the spatial information of each joint into the initial high-dimensional features, and embedding the time information of the video frames of the video data into the initial high-dimensional features, to construct embedded high-dimensional features; inputting the embedded high-dimensional features into the spatio-temporal enhanced Transformer model to alternately learn the spatial correlation of motion between joints and the temporal correlation of the motion of each joint, and constructing target high-dimensional features, wherein the spatio-temporal enhanced Transformer model is an enhanced model that alternates between the spatial and temporal dimensions; and inputting the target high-dimensional features into a regression head to obtain the initial three-dimensional human body posture sequence data.
3. The method of claim 2, wherein the spatio-temporal enhanced Transformer model comprises several layers of spatial enhanced Transformer blocks (S-ETB) and temporal enhanced Transformer blocks (T-ETB), each S-ETB or T-ETB comprising layer normalization (LN), an enhanced multi-head self-attention mechanism (EMSA), and a multi-layer perceptron (MLP), and wherein inputting the embedded high-dimensional features into the spatio-temporal enhanced Transformer model to alternately learn the spatial correlation of motion between joints and the temporal correlation of the motion of each joint, and constructing the target high-dimensional features, comprises: performing layer normalization on the embedded high-dimensional features based on the layer normalization LN; and performing attention computation on the layer-normalized embedded high-dimensional features based on the enhanced multi-head self-attention mechanism EMSA to construct the target high-dimensional features, wherein the computation of the EMSA comprises performing a linear transformation on the layer-normalized embedded high-dimensional features, performing attention computation on the query matrix, key matrix, and value matrix obtained by the linear transformation, and performing convolution processing on the value matrix, wherein the dimensions of the query matrix, the key matrix, and the value matrix are different.
4. The method of any one of claims 1-3, wherein rotating the joint formed by the initial three-dimensional human body posture sequence data to a target position based on the coronal plane determined from the initial three-dimensional human body posture sequence data to obtain the target three-dimensional human body posture sequence data comprises: determining a coronal plane based on the initial three-dimensional human body posture sequence data and determining a rotation angle based on the coronal plane; and rotating the joint formed by the initial three-dimensional human body posture sequence data to the target position based on the rotation angle to obtain the target three-dimensional human body posture sequence data.
5. The method of claim 4, wherein determining the coronal plane based on the initial three-dimensional human body posture sequence data and determining the rotation angle based on the coronal plane comprises: acquiring the coordinates of a target joint from the initial three-dimensional human body posture sequence data; substituting the coordinates of the target joint into a plane equation of the coronal plane and calculating the plane equation coefficients; and determining, based on the plane equation coefficients, the rotation angle by which the joint formed by the initial three-dimensional human body posture sequence data is to be rotated.
6. The method of claim 4, wherein rotating the joint formed by the initial three-dimensional human body posture sequence data to the target position based on the rotation angle to obtain the target three-dimensional human body posture sequence data comprises: rotating the joint formed by the initial three-dimensional human body posture sequence data by the rotation angle around the z-axis of the coordinate system of the initial three-dimensional human body posture sequence data, with the hip joint as the rotation center, so that the joint is rotated to the target position, thereby obtaining the target three-dimensional human body posture sequence data.
7. A human body posture assessment apparatus, characterized by comprising: an acquisition module, configured to acquire two-dimensional human body posture sequence data extracted from video data; an estimation module, configured to perform three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and a pre-trained spatio-temporal enhanced Transformer model to obtain initial three-dimensional human body posture sequence data; and a rotation module, configured to rotate a joint formed by the initial three-dimensional human body posture sequence data to a target position based on a coronal plane determined from the initial three-dimensional human body posture sequence data to obtain target three-dimensional human body posture sequence data, wherein the target three-dimensional human body posture sequence data is used for action inspection.
8. A human body posture assessment device, characterized by comprising a memory and a processor, wherein the memory stores computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method of any one of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein computer-executable instructions which, when executed by a processor, implement the method of any one of claims 1-6.
10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any one of claims 1-6.
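
Illustrative sketch (not part of the claims). The feature construction recited in claim 2 can be pictured with the minimal Python sketch below. It rests on several assumptions that the claims do not state: the spatial and temporal information is embedded by simple addition, the learned parameters are passed in as plain lists, and the names embed_sequence, spatial_groups and temporal_groups are hypothetical. The enhanced multi-head self-attention of claim 3 is not implemented here; the two grouping helpers only show which tokens a spatial block and a temporal block would each attend over.

```python
def embed_sequence(pose_2d, weight, bias, spatial_emb, temporal_emb):
    """Map 2D joints to D-dimensional tokens and add joint and frame embeddings.

    pose_2d[t][j]   -> (x, y) of joint j in frame t
    weight          -> 2 x D matrix of the linear map (assumed learned)
    bias            -> length-D vector (assumed learned)
    spatial_emb[j]  -> length-D embedding identifying joint j
    temporal_emb[t] -> length-D embedding identifying frame t
    Returns features[t][j] as a list of D floats.
    """
    dim = len(bias)
    features = []
    for t, frame in enumerate(pose_2d):
        tokens = []
        for j, (x, y) in enumerate(frame):
            tokens.append([
                weight[0][k] * x + weight[1][k] * y + bias[k]  # linear map 2 -> D
                + spatial_emb[j][k]                            # which joint
                + temporal_emb[t][k]                           # which frame
                for k in range(dim)
            ])
        features.append(tokens)
    return features


def spatial_groups(features):
    """One group per frame: attention inside a group relates joints to each other."""
    return [list(frame) for frame in features]


def temporal_groups(features):
    """One group per joint: attention inside a group relates frames to each other."""
    n_joints = len(features[0])
    return [[frame[j] for frame in features] for j in range(n_joints)]
```

With T frames and J joints, spatial_groups yields T groups of J tokens and temporal_groups yields J groups of T tokens. Alternating between the two groupings is one way to read the phrase "alternately learn the spatial correlation of motion between joints and the temporal correlation of the motion of each joint".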

Description

Human body posture assessment method, device, equipment, storage medium and program product

Technical Field

The present application relates to the field of data processing technologies, and in particular to a human body posture assessment method, apparatus, device, storage medium, and program product.

Background

On aircraft carriers, voice instructions are often difficult to convey clearly to personnel on deck because of the intense noise generated when carrier-based aircraft take off and the interference from various equipment. Body movements are therefore an important means of communication for improving command efficiency and accuracy. For example, the common "aircraft carrier style" take-off signal has the following key points: the right leg is bent forward; the left leg is bent deeply with the lower leg flat against the deck; the left arm is lowered against the body; the right arm is raised horizontally; and the right hand forms a fist with the index and middle fingers held together, pointing toward the bow of the ship.

When a carrier-based aircraft takes off, the commander stands very close to its wings. After the aircraft takes off, the commander must crouch to avoid being burned by the hot jet exhaust or blown into the sea. The commander's actions must conform to the specifications and meet safety standards, to avoid safety risks caused by the pilot misinterpreting them.

To further improve safety and operational standardization, cameras may be mounted at suitable positions on the aircraft carrier deck. These cameras capture video of the commander's actions, and the commander's posture is analyzed and estimated through video processing techniques.

At present, video or images are mainly captured with a monocular camera. A monocular camera can only capture two-dimensional images, so direct information about scene depth is missing. Without depth information, it is difficult to accurately judge the position and distance of the human body in three-dimensional space. This makes it harder to accurately estimate the 3D human pose from 2D images, and the lack of depth information may lead to incorrect estimates of joint positions.

Disclosure of Invention

The embodiments of the present application provide a human body posture assessment method, apparatus, device, storage medium, and program product, which are used to improve the accuracy of human body posture assessment.

In a first aspect, an embodiment of the present application provides a human body posture assessment method, including: acquiring two-dimensional human body posture sequence data extracted from video data; performing three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and a pre-trained spatio-temporal enhanced Transformer model to obtain initial three-dimensional human body posture sequence data; and rotating a joint formed by the initial three-dimensional human body posture sequence data to a target position based on a coronal plane determined from the initial three-dimensional human body posture sequence data to obtain target three-dimensional human body posture sequence data, wherein the target three-dimensional human body posture sequence data is used for action inspection.
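
As an illustration only, the rotation step of the first aspect might be sketched in Python as follows. The sketch assumes details the text does not fix: the coronal plane is taken through the hip and the two shoulders, the z-axis is vertical, and the target position is the one in which the horizontal part of the plane normal points along the +y axis. The function names and the choice of target joints are hypothetical, and the sign of the normal depends on the joint order and on the handedness of the coordinate system.

```python
import math


def plane_coefficients(p1, p2, p3):
    """Coefficients (a, b, c, d) of the plane a*x + b*y + c*z + d = 0 through three points."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The normal (a, b, c) is the cross product of the two in-plane edge vectors.
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d


def rotation_angle(a, b):
    """Angle about the z-axis that turns the horizontal normal component (a, b) onto +y."""
    return math.atan2(a, b)


def rotate_about_z(joints, center, theta):
    """Rotate every joint by theta around a vertical axis passing through center."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center[0], center[1]
    rotated = []
    for x, y, z in joints:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t,
                        z))  # height is unchanged by a rotation about z
    return rotated


def align_frame(joints, hip, l_shoulder, r_shoulder):
    """Rotate one frame so its coronal plane reaches the assumed target orientation."""
    a, b, _, _ = plane_coefficients(joints[hip], joints[l_shoulder], joints[r_shoulder])
    theta = rotation_angle(a, b)
    return rotate_about_z(joints, joints[hip], theta), theta
```

Applying align_frame to each frame of the initial three-dimensional sequence would yield poses with a common facing direction, so that later action inspection compares joint positions independently of how the person was oriented relative to the camera.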
In one possible implementation, performing three-dimensional human body posture estimation based on the two-dimensional human body posture sequence data and the pre-trained spatio-temporal enhanced Transformer model to obtain the initial three-dimensional human body posture sequence data includes: linearly mapping the coordinates of each joint in the two-dimensional human body posture sequence data to a high-dimensional space to obtain initial high-dimensional features; embedding the spatial information of each joint into the initial high-dimensional features, and embedding the time information of the video frames of the video data into the initial high-dimensional features, to construct embedded high-dimensional features; inputting the embedded high-dimensional features into the spatio-temporal enhanced Transformer model to alternately learn the spatial correlation of motion between joints and the temporal correlation of the motion of each joint, and constructing target high-dimensional features, wherein the spatio-temporal enhanced Transformer model is an enhanced model that alternates between the spatial and temporal dimensions; and inputting the target high-dimensional features into a regression head to obtain the initial three-dimensional human body posture sequence data.

In one possible implementation, the spatio-temporal enhanced Transformer model includes several layers of spatial enhanced Transformer blocks (S-ETB) and temporal enhanced Transformer blocks (T-ETB); each S-ETB or T-ETB contains layer normalization (LN), enhanced mu