CN-121999526-A - Monocular vision snakelike running action detection method, device, equipment and medium

Abstract

The invention provides a monocular vision serpentine running action detection method, device, equipment and medium. The method comprises: collecting a real-time video stream of a serpentine running test field through a single camera; cropping the video frames into regions based on site calibration points to obtain a human body detection region and a pole detection region; loading and running deep learning models and performing asynchronous inference on the video frames to obtain human body detection boxes, human pose key points, pole detection boxes and pole state classification results; continuously tracking the detected human targets with a human tracking algorithm to obtain the movement track of the valid tester; judging the movement stage in real time from the spatial relationship between the human body position, the pose key points and the pole array; detecting at least one type of violation in real time during the movement; and timing automatically according to the movement stage and the violation detection results and generating a structured report containing the movement score and violation information. Full-process automatic detection can thus be achieved with only a single camera.

Inventors

ZHANG LONG
ZHANG DENGPAN
ZHANG ZHUMING

Assignees

恒鸿达(福建)体育科技有限公司

Dates

Publication Date: 20260508
Application Date: 20251229

Claims (10)

1. A monocular vision serpentine running action detection method, characterized by comprising the following steps: Step 1, acquiring a real-time video stream of a serpentine running test field through a single camera; Step 2, cropping the video frames into regions based on site calibration points to obtain a human body detection region and a pole detection region; Step 3, loading and running deep learning models, and performing asynchronous inference on the video frames to obtain human body detection boxes, human pose key points, pole detection boxes and pole state classification results; Step 4, continuously tracking the detected human targets with a human tracking algorithm to obtain the movement track of the valid tester; Step 5, judging the movement stage in real time according to the spatial relationship between the human body position, the pose key points and the pole array, wherein the movement stages comprise a preparation state, a state of waiting for the start voice to end, a pole-weaving detection state, an end-point-reached state and an end state; Step 6, detecting at least one type of violation in real time during the movement, based on the human body detection boxes, the human pose key points, the pole detection boxes and the pole state classification results; and Step 7, timing automatically according to the movement stage and the violation detection results, and generating a structured report containing the movement score and violation information.
2. The monocular vision serpentine running action detection method according to claim 1, wherein loading and running the deep learning models comprises: loading a YOLOv s model for detecting human bodies and poles; loading an RTMPose model for recognizing human pose key points; loading a ResNet model for binary classification of pole state, to judge whether a pole is upright or toppled; loading an ArcFace model for face recognition, to perform secondary verification against proxy test-taking; and loading the BYTETracker algorithm for continuous tracking of human bodies.
3. The monocular vision serpentine running action detection method according to claim 1, wherein cropping the video frames into regions based on the site calibration points comprises: calculating and cropping a human body detection region covering the start area, the pole-weaving area and the end area according to the boundary points obtained from site calibration; and calculating and cropping a detection region for the distant poles according to the pole array calibration points.
4. The monocular vision serpentine running action detection method according to claim 1, wherein Step 5 is specifically: judging the movement stage in real time according to the spatial relationship between the human body position, the pose key points and the pole array, wherein the movement stages comprise a preparation state, a state of waiting for the start voice to end, a pole-weaving detection state, an end-point-reached state and an end state; in the preparation state, according to the configured movement mode, detecting whether a person has entered the preparation area or waiting for an external start instruction, and detecting whether the person is stepping on the line; in the state of waiting for the start voice to end, judging whether the start voice has finished playing, and detecting whether a false start occurs; in the pole-weaving detection state, detecting in real time the pole-weaving action, pole toppling, the human body going out of bounds and intrusion by other persons, and judging whether the end point has been reached; in the end-point-reached state, making the final missed-pole judgment and performing secondary verification against proxy test-taking; and in the end state, stopping service processing and preparing to release resources.
5. The monocular vision serpentine running action detection method according to claim 1, wherein detecting at least one type of violation in real time comprises at least one of: false start detection, namely judging whether a person starts running before the start voice has finished; knocked-down pole detection, namely accumulating and judging the toppled state of a pole through the pole classification model; missed pole detection, namely judging, after the movement ends, whether any pole is not marked as having been rounded; out-of-bounds detection, namely judging whether the bottom center point of the human body exceeds the boundary formed by the site calibration points; invalid start area detection, namely judging whether an ankle key point at the starting moment exceeds the start area boundary marked by the starting pole; invalid end area detection, namely judging whether an ankle key point has failed to enter the end area boundary marked by the starting pole on arrival; proxy test-taking detection, namely verifying the identity of the tester through a face recognition model during pole weaving; and intrusion detection, namely judging whether the number of human targets in the field is greater than one.
6. The monocular vision serpentine running action detection method according to claim 1, wherein the knocked-down pole violation detection comprises: cropping the image of each detected pole region and submitting it to the pole classification model for asynchronous inference; obtaining the classification result and, if the result is toppled, incrementing the topple count of that pole; and if the accumulated topple count of the same pole exceeds a preset threshold, judging a knocked-down pole violation.
7. The monocular vision serpentine running action detection method according to claim 1, wherein generating the structured report comprises: recording the start time and end time of the movement, and calculating the total elapsed time as the movement score; recording the violation status for false start, knocked-down pole, missed pole, out-of-bounds, invalid start and invalid arrival; calculating the fastest pole-weaving time, the slowest pole-weaving time, the average pole-weaving time and the average speed; and storing the start moment picture, the end moment picture, and the pictures and videos corresponding to each violation.
8. A monocular vision serpentine running action detection device, characterized by comprising: a video acquisition module, configured to acquire a real-time video stream of a serpentine running test field through a single camera; a region cropping module, configured to crop the video frames into regions based on site calibration points to obtain a human body detection region and a pole detection region; a model loading and data acquisition module, configured to load and run deep learning models and perform asynchronous inference on the video frames to obtain human body detection boxes, human pose key points, pole detection boxes and pole state classification results; a movement track acquisition module, configured to continuously track the detected human targets with a human tracking algorithm to obtain the movement track of the valid tester; a state acquisition module, configured to judge the movement stage in real time according to the spatial relationship between the human body position, the pose key points and the pole array, wherein the movement stages comprise a preparation state, a state of waiting for the start voice to end, a pole-weaving detection state, an end-point-reached state and an end state; a violation detection module, configured to detect at least one type of violation in real time during the movement, based on the human body detection boxes, the human pose key points, the pole detection boxes and the pole state classification results; and a motion detection report module, configured to time automatically according to the movement stage and the violation detection results and to generate a structured report containing the movement score and violation information.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1 to 7.
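
As an illustration only, the stage logic of claims 4 and 6 can be sketched in Python as a small state machine. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Stage`, `FrameObservation` and `RunMonitor`, the boolean observation fields, the default threshold of 3 and the choice to start timing when the pole-weaving stage begins are all hypothetical. The claims specify only the five stages, the accumulate-and-threshold rule for toppled poles, and that timing follows the movement stage.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class Stage(Enum):
    """The five movement stages listed in claims 1 and 4."""
    PREPARATION = auto()
    WAIT_START_VOICE_END = auto()
    POLE_WEAVING = auto()
    REACHED_END = auto()
    FINISHED = auto()


@dataclass
class FrameObservation:
    """Hypothetical per-frame summary of the detection and classification outputs."""
    t: float                              # frame timestamp in seconds
    runner_in_ready_area: bool = False
    start_voice_finished: bool = False
    runner_left_start_area: bool = False
    runner_in_end_area: bool = False
    toppled_pole_ids: tuple = ()          # poles classified as toppled in this frame


@dataclass
class RunMonitor:
    fall_threshold: int = 3               # assumed value; the claims only say "preset threshold"
    stage: Stage = Stage.PREPARATION
    topple_counts: dict = field(default_factory=dict)
    violations: set = field(default_factory=set)
    start_time: Optional[float] = None
    end_time: Optional[float] = None

    def update(self, obs: FrameObservation) -> Stage:
        """Advance the stage by one frame and record any violations seen."""
        if self.stage is Stage.PREPARATION:
            if obs.runner_in_ready_area:
                self.stage = Stage.WAIT_START_VOICE_END
        elif self.stage is Stage.WAIT_START_VOICE_END:
            # False start: the runner leaves before the start voice has finished.
            if obs.runner_left_start_area and not obs.start_voice_finished:
                self.violations.add("false_start")
            if obs.start_voice_finished or obs.runner_left_start_area:
                self.start_time = obs.t   # assumption: timing starts on this transition
                self.stage = Stage.POLE_WEAVING
        elif self.stage is Stage.POLE_WEAVING:
            # Claim 6: accumulate toppled classifications per pole, then threshold.
            for pole_id in obs.toppled_pole_ids:
                count = self.topple_counts.get(pole_id, 0) + 1
                self.topple_counts[pole_id] = count
                if count > self.fall_threshold:
                    self.violations.add("pole_knocked_down")
            if obs.runner_in_end_area:
                self.end_time = obs.t
                self.stage = Stage.REACHED_END
        elif self.stage is Stage.REACHED_END:
            # The final missed-pole and identity checks would run here (omitted).
            self.stage = Stage.FINISHED
        return self.stage

    @property
    def elapsed(self) -> Optional[float]:
        """Total time between the assumed start and end moments, if both are known."""
        if self.start_time is None or self.end_time is None:
            return None
        return self.end_time - self.start_time
```

Feeding one `FrameObservation` per video frame to `RunMonitor.update` advances the stage; `elapsed` stays `None` until the end area has been reached. Counting toppled classifications over several frames before flagging a violation, as claim 6 describes, keeps a single misclassified frame from penalizing the tester.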

Description

Monocular vision snakelike running action detection method, device, equipment and medium

Technical Field

The invention relates to the technical field of computer vision, and in particular to a monocular vision serpentine running action detection method, device, equipment and medium.

Background

Serpentine running is a speed and agility test widely used in physical fitness testing, sports training, student physique assessment and similar settings. The test typically requires the tester to run back and forth along an "S"-shaped path between set poles. During the run the tester must round the poles in sequence, must not touch or miss a pole, and must complete the action within the valid start and arrival areas. To ensure objective and fair results, the tester's running path, weaving order, weaving direction, pole contact or knock-down, missed poles, boundary crossing, false start and overall continuity of action all need to be judged accurately.

At present, detection and judgment of serpentine running actions mostly rely on the following technologies:

(1) Manual observation and manual timing. In traditional fitness testing, judges visually assess whether the tester completes the action along the prescribed route and record the result with a manual timer. This approach is clearly subjective and is easily affected by viewing angle, reaction time, site environment and similar factors, which leads to misjudgment, missed judgment or timing error. It is difficult to meet the objectivity requirements of large-scale tests or strict examinations.

(2) Motion determination based on wearable sensors (e.g., IMUs). In some prior art, an inertial sensor is attached to the tester's body or shoes, and pole-weaving behavior or action continuity is judged from data such as acceleration and angular velocity. However, wearing equipment can affect how naturally the tester runs and adds complexity to the examination. There are also problems of equipment loss, improper wearing and difficulty managing multi-person tests, so practical application is limited.

(3) Motion path restoration systems based on binocular vision or multi-camera spatial localization. Some research systems and commercial products perform three-dimensional spatial reconstruction using binocular cameras, multi-camera arrays or laser measuring devices, achieving higher-precision path judgment. However, such systems usually have high hardware cost, are complex to deploy and require accurate site calibration, so they are not suited to flexible deployment in ordinary schools, sports grounds or mobile test scenarios.

(4) Motion detection based on ordinary monocular vision. Some existing literature proposes analyzing running with a monocular camera combined with human detection, target tracking and similar techniques. However, because monocular vision cannot directly provide depth information, existing schemes struggle to judge pole array position, weaving order, weaving direction, action time, knocked-down pole state and so on with high precision. Most schemes also cannot simultaneously meet fine-grained evaluation needs such as missed-pole judgment, out-of-bounds judgment, false start detection and action area validity judgment, so overall functional coverage is insufficient.

In summary, the following disadvantages still exist in the prior art:

1. Reliance on manual judgment, which has large error and low efficiency and is not suitable for standardized examinations or large-scale tests.
2. Wearable devices affect natural movement, device management is complex, and test fairness cannot be guaranteed.
3. Binocular or multi-camera schemes are costly and difficult to deploy, which limits their adoption.
4. Existing monocular vision methods have insufficient ability to judge multiple key actions such as pole array detection, weaving direction, missed poles, knocked-down poles, out-of-bounds and false start.
5. Unified logic management of the overall course of the movement is lacking, including capabilities such as state machine control, real-time tracking and multi-model collaborative inference.
6. Timing, violation determination and action recognition are not integrated in a single system, so a complete score and analysis report cannot be output automatically.

Therefore, a technical solution is needed that can be deployed with an ordinary single camera, has multi-model collaborative inference capability, and can detect the details of serpentine running in real time and judge multiple types of violations, so