CN-121982629-A - Student behavior data set labeling method
Abstract
An embodiment of the application provides a student behavior data set labeling method. The method comprises: performing seat area detection on a first image of a target classroom using a trained target detection model to obtain a set of seat detection frames, wherein the first image contains the seats arranged in the target classroom; assigning each seat detection frame in the set to a grid position formed from a preset number of rows R and a preset number of columns C, based on the shooting view angle of the first image, to generate a structured seat template; matching a second image of the target classroom against the structured seat template to identify the student behavior corresponding to each seat, wherein the second image contains the students in the target classroom; labeling the second image based on the student behaviors corresponding to each seat; and adding the labeled second image to a student behavior recognition data set.
Inventors
- XIA LIANGCHU
- MA WENLONG
- LI JIE
- WANG HONGQIN
- FAN YEFENG
- SUN YE
- ZHOU WENTING
- SUN RUOHAN
- XI XIAOGANG
- WEN YAOFEI
- LI CHEN
- ZHANG HAIBO
- LI LI
- YANG MIAO
Assignees
- 新疆思极信息技术有限公司
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2025-12-22
Claims (10)
- 1. A method for labeling student behavior data sets, the method comprising: performing seat area detection on a first image of a target classroom using a trained target detection model to obtain a set of seat detection frames, wherein the first image contains the seats arranged in the target classroom; assigning each seat detection frame in the set to a grid position formed from a preset number of rows R and a preset number of columns C, based on the shooting view angle of the first image, to generate a structured seat template, wherein R and C are integers greater than or equal to 1; matching a second image of the target classroom against the structured seat template to identify the student behavior corresponding to each seat, wherein the second image contains the students in the target classroom; and labeling the second image based on the student behaviors corresponding to each seat, and adding the labeled second image to a student behavior recognition data set.
- 2. The method of claim 1, wherein assigning each seat detection frame of the set to a grid position formed from the preset number of rows R and columns C, based on the shooting view angle of the first image, to generate the structured seat template comprises: labeling each seat detection frame using the trained target detection model, and determining the center point coordinates of each seat detection frame; determining, based on the shooting view angle of the first image, an r-th center coordinate of the seat detection frame located in the r-th row and first column, wherein r is an integer greater than or equal to 1 and less than or equal to R; determining each seat detection frame whose center deviates from the r-th center coordinate in the row direction by less than a row deviation threshold as a seat detection frame of the r-th row; determining that identification of the r-th row is complete when the deviation of a seat detection frame's center coordinates from the r-th center coordinate in the row direction exceeds the row deviation threshold, or when the number of seat detection frames assigned to the r-th row exceeds the column number C; and assigning each seat detection frame to its grid position based on the recognition results of the R rows to generate the structured seat template.
- 3. The method of claim 2, further comprising: determining that the r-th row contains k seat detection frames, wherein k is an integer smaller than the column number C; and assigning column numbers to the k seat detection frames sequentially outward from the middle column.
- 4. The method of claim 1, wherein the structured seat template comprises the preset number of rows R, the preset number of columns C, and the coordinate information of each seat detection frame; and matching the second image of the target classroom against the structured seat template to identify the student behavior corresponding to each seat comprises: determining the shooting view angle of the second image; determining, based on the shooting view angle of the second image, an identification bounding box corresponding to the seat detection frame in the r-th row and c-th column; determining that a target student is present in the identification bounding box; and identifying the behavior of the target student.
- 5. The method of claim 4, wherein determining the identification bounding box corresponding to the seat detection frame in the r-th row and c-th column based on the shooting view angle of the second image comprises: determining the lower boundary of the identification bounding box based on the row coordinate of row r-1; determining the width of the identification bounding box based on a first expansion coefficient and the width of the seat detection frame; and determining the upper boundary of the identification bounding box based on a second expansion coefficient and the height of the seat detection frame.
- 6. The method of claim 5, further comprising: determining that no student is present in the identification bounding box; expanding the upper boundary of the identification bounding box based on a third expansion coefficient to obtain an expanded identification bounding box; and stopping the expansion of the upper boundary when a student is present in the expanded identification bounding box or the upper boundary exceeds a height threshold.
- 7. The method of any one of claims 1 to 6, wherein matching the second image of the target classroom against the structured seat template to identify the student behavior of each seat comprises: acquiring a plurality of second images captured from different shooting view angles at the same moment; matching each second image against the structured seat template based on its corresponding shooting view angle, and determining the student behavior confidence corresponding to a target seat in each second image; and labeling the student behavior of the target seat as the behavior with the highest confidence.
- 8. A student behavior data set labeling apparatus, the apparatus comprising: a detection module configured to perform seat area detection on a first image of a target classroom using a trained target detection model to obtain a set of seat detection frames, wherein the first image contains the seats arranged in the target classroom; an allocation module configured to assign each seat detection frame in the set to a grid position formed from a preset number of rows R and a preset number of columns C, based on the shooting view angle of the first image, to generate a structured seat template, wherein R and C are integers greater than or equal to 1; a matching module configured to match a second image of the target classroom against the structured seat template and identify the student behavior corresponding to each seat, wherein the second image contains the students in the target classroom; and a labeling module configured to label the second image based on the student behaviors corresponding to each seat, and to add the labeled second image to a student behavior recognition data set.
- 9. An electronic device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the student behavior data set labeling method of any one of claims 1 to 7.
- 10. A computer program product comprising computer program code which, when run on an electronic device, implements the steps of the student behavior data set labeling method according to any one of claims 1 to 7.
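The row-grouping and column-numbering scheme of claims 2 and 3 can be sketched roughly as follows. This is an illustrative reconstruction under assumptions, not the patented implementation: the function name `assign_to_grid`, the `(x1, y1, x2, y2)` box format, and the use of vertical center-point deviation to group rows are all choices made here for illustration.

```python
def assign_to_grid(boxes, R, C, row_dev_threshold):
    """Assign seat detection boxes (x1, y1, x2, y2) to an R x C grid.

    Boxes are clustered into rows by the deviation of their center
    points from the first box of the row (claim 2); a short row of
    k < C boxes is numbered outward from the middle columns (claim 3).
    """
    centers = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]
    # Assumption: in a front-facing view, rows stack vertically in the image.
    order = sorted(range(len(boxes)), key=lambda i: centers[i][1])

    rows, current = [], []
    for i in order:
        if not current:
            current = [i]
        elif (abs(centers[i][1] - centers[current[0]][1]) <= row_dev_threshold
              and len(current) < C):
            current.append(i)
        else:
            # Deviation exceeds the row threshold, or the row already
            # holds C boxes: the current row is complete (claim 2).
            rows.append(current)
            current = [i]
    if current:
        rows.append(current)

    grid = {}
    for r, row in enumerate(rows[:R]):
        row.sort(key=lambda i: centers[i][0])  # left-to-right within the row
        k = len(row)
        # A full row occupies columns 0..C-1; a short row of k boxes is
        # centered, i.e. numbered outward from the middle columns.
        start = (C - k) // 2
        for offset, i in enumerate(row):
            grid[(r, start + offset)] = boxes[i]
    return grid
```

For example, three boxes on one image row plus one box on a lower row, with R=2 and C=3, yields a full first row and a single second-row box placed in the middle column.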
Description
Student behavior data set labeling method

Technical Field

The application relates to the technical field of computer vision, and in particular to a student behavior data set labeling method.

Background

With the rapid development of intelligent education and smart classrooms, video-based teaching behavior recognition has become an important research direction for classroom teaching quality analysis. Existing teaching behavior recognition models (such as YOLO, OpenPose and the like) all require a large amount of high-quality annotated data before training in order to meet the accuracy requirements of model training. However, purely manual labeling suffers from the following problems: classroom scenes are usually complex multi-person environments with occlusion, illumination changes and similar-looking behaviors, so frame-by-frame manual labeling is extremely labor-intensive and inefficient; the bounding boxes of the same student tend to drift across frames, which harms subsequent model training and yields low consistency; and it is difficult to guarantee row-column alignment with the seat layout manually, so the spatial semantic structure is lost.

On the other hand, most existing automatic labeling techniques are designed for general scenes, make no use of a fixed spatial structure such as a classroom, and therefore cannot exploit seat layout information to improve automatic labeling accuracy or generate behavior data efficiently from a spatial template. Meanwhile, purely automatic labeling handles erroneous or low-quality labels poorly. Because the classroom environment is complex, a single camera source can hardly resolve the information at every position, so multiple view angles are needed as input to improve the labeling result.
Therefore, a human-machine collaborative labeling method is needed that combines a spatial template with multi-view vision, requires no pre-trained behavior model, relies only on the structural prior of an empty classroom, and supports efficient manual correction, so as to reduce the cost of data labeling and to improve the accuracy and consistency of the training data supplied to an automatic behavior recognition model.

Disclosure of Invention

The application aims to provide a student behavior data set labeling method, which adopts the following technical scheme.

In a first aspect, an embodiment of the present application provides a student behavior data set labeling method, including: performing seat area detection on a first image of a target classroom using a trained target detection model to obtain a set of seat detection frames, wherein the first image contains the seats arranged in the target classroom; assigning each seat detection frame in the set to a grid position formed from a preset number of rows R and a preset number of columns C, based on the shooting view angle of the first image, to generate a structured seat template, wherein R and C are integers greater than or equal to 1; matching a second image of the target classroom against the structured seat template to identify the student behavior corresponding to each seat, wherein the second image contains the students in the target classroom; and labeling the second image based on the student behaviors corresponding to each seat, and adding the labeled second image to a student behavior recognition data set.
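The identification-bounding-box steps of claims 4 to 6 can be sketched as follows. This is a minimal illustration under assumptions, not the disclosed implementation: the names `identification_bbox` and `expand_until_student`, the coefficients `alpha`, `beta` and `gamma` (the first, second and third expansion coefficients), and an image coordinate system with y increasing downward are all assumed here; the mapping of "row r-1" to the adjacent seat's edge depends on the camera placement and is likewise an assumption.

```python
def identification_bbox(seat_boxes, r, c, alpha, beta):
    """Derive the identification bounding box for the seat at row r, column c.

    Per the claimed scheme: the lower boundary comes from the row
    coordinate of row r-1, the width is the seat box width scaled by a
    first expansion coefficient alpha, and the upper boundary is set
    from the seat box height scaled by a second coefficient beta.
    seat_boxes maps (row, col) -> (x1, y1, x2, y2), y increasing downward.
    """
    x1, y1, x2, y2 = seat_boxes[(r, c)]
    w, h = x2 - x1, y2 - y1
    cx = (x1 + x2) / 2

    # Lower boundary from the neighboring row r-1 if present (assumed to
    # sit lower in the image here), otherwise the seat's own bottom edge.
    prev = seat_boxes.get((r - 1, c))
    lower = prev[1] if prev is not None else y2

    new_w = alpha * w        # first expansion coefficient widens the box
    upper = y1 - beta * h    # second expansion coefficient raises the top
    return (cx - new_w / 2, upper, cx + new_w / 2, lower)


def expand_until_student(bbox, has_student, gamma, height_threshold):
    """Iteratively raise the upper boundary by a third coefficient gamma
    until a student is detected or the top passes a height threshold
    (claim 6). `has_student` is any detector callback returning bool."""
    x1, top, x2, bottom = bbox
    while not has_student((x1, top, x2, bottom)) and top > height_threshold:
        top -= gamma * (bottom - top)
    return (x1, top, x2, bottom)
```

The expansion loop mirrors the claim's stopping condition: it halts as soon as a student appears in the enlarged box or the top edge reaches the height limit.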
In a second aspect, there is provided a student behavior data set labeling apparatus, the apparatus comprising: a detection module configured to perform seat area detection on a first image of a target classroom using a trained target detection model to obtain a set of seat detection frames, wherein the first image contains the seats arranged in the target classroom; an allocation module configured to assign each seat detection frame in the set to a grid position formed from a preset number of rows R and a preset number of columns C, based on the shooting view angle of the first image, to generate a structured seat template, wherein R and C are integers greater than or equal to 1; a matching module configured to match a second image of the target classroom against the structured seat template and identify the student behavior corresponding to each seat, wherein the second image contains the students in the target classroom; and a labeling module configured to label the second image based on the student behaviors corresponding to each seat, and to add the labeled second image to a student behavior recognition data set.

In a third aspect, there is provided an electronic device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, causes the electronic device to perform the above method.

In a fourth aspect,
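The multi-view fusion of claim 7 reduces to picking, for each target seat, the behavior prediction with the highest confidence across the synchronized views. A minimal sketch, assuming each view contributes a `(behavior_label, confidence)` pair (the function name and tuple format are illustrative, not from the disclosure):

```python
def fuse_multiview_behavior(per_view_results):
    """Fuse behavior predictions for one target seat across several
    second images captured at the same moment from different view
    angles (claim 7): the label with the highest confidence wins.

    per_view_results: list of (behavior_label, confidence) tuples,
    one per view. Returns None when no view produced a prediction.
    """
    if not per_view_results:
        return None
    label, _conf = max(per_view_results, key=lambda pair: pair[1])
    return label
```

A seat occluded in one camera's view can thus still be labeled from another view where the behavior is seen with higher confidence.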