CN-117274301-B - Multi-target tracking matching method in sports video based on target nomination replacement
Abstract
The invention relates to a multi-target tracking matching method in sports video based on target nomination replacement, belongs to the technical field of target detection, and solves the problems of repeated detection and target loss in sports video in the prior art. The invention constructs a matching loss matrix and applies the Hungarian algorithm repeatedly, continuously updating the matching quality of targets in adjacent frames until an optimal matching scheme is obtained. This reduces repeated detection: even when the athletes in a video look alike, a single target that is neither occluded nor overlapped is not judged to be occluded or overlapped, and multiple bounding boxes are not output for a single target. It also reduces the loss of targets caused by mutual occlusion and rapid movement of athletes in sports video.
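The repeated match-and-replace loop summarized in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patented implementation: the function names, the one-replacement-per-round policy, and the rule of stopping when the replacement pool is empty are all hypothetical. An exhaustive search stands in for the Hungarian algorithm; it returns the same minimum-cost assignment but only scales to a handful of targets.

```python
from itertools import permutations


def best_assignment(cost):
    """Exhaustive minimum-cost assignment for a small square cost matrix.

    Stand-in for the Hungarian algorithm: same optimum, but O(n!) time,
    so it is only suitable for tiny illustrative inputs.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[r][p[r]] for r in range(n)),
    )
    return list(best)  # best[r] = index of the candidate matched to track r


def match_with_replacement(tracks, candidates, reserves, cost_fn, bad_threshold):
    """Rematch repeatedly, swapping badly matched candidates for reserves.

    ASSUMPTIONS: len(candidates) == len(tracks); a match is "bad" when its
    cost exceeds bad_threshold; one candidate is replaced per round.
    """
    candidates = list(candidates)
    reserves = list(reserves)
    while True:
        # Loss matrix: rows are previous-frame tracks, columns are candidates.
        cost = [[cost_fn(t, c) for c in candidates] for t in tracks]
        assign = best_assignment(cost)
        bad = [assign[r] for r in range(len(tracks))
               if cost[r][assign[r]] > bad_threshold]
        if not bad or not reserves:
            return [(r, candidates[assign[r]]) for r in range(len(tracks))]
        # Replace one low-quality candidate with the next reserve, then rematch.
        candidates[bad[0]] = reserves.pop(0)


# Toy example with 1-D positions instead of bounding boxes.
result = match_with_replacement(
    tracks=[0.0, 10.0],
    candidates=[0.5, 30.0],   # 30.0 is a poor match for any track
    reserves=[10.2],
    cost_fn=lambda t, c: abs(t - c),
    bad_threshold=2.0,
)
```

In the toy run, the candidate at 30.0 is matched badly in the first round, is swapped for the reserve at 10.2, and the second round yields only good matches. Each round consumes one reserve, so the loop always terminates.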
Inventors
- WANG YUNHONG
- HE RUI
- YANG HAO
- CHEN XUNXUN
- LIU QINGJIE
Assignees
- Beihang University (北京航空航天大学)
- National Computer Network and Information Security Management Center (国家计算机网络与信息安全管理中心)
Dates
- Publication Date
- 20260512
- Application Date
- 20230824
Claims (9)
- 1. A multi-target tracking matching method in sports video based on target nomination replacement, characterized by comprising the following specific steps:
  S1, performing target detection on the images of two adjacent frames in a sports video to obtain a previous frame tracking result and a next frame detection result, wherein the next frame detection result comprises a first matching target nomination set and a nomination replacement set;
  S2, constructing a first matching loss matrix based on the bounding box coordinates of the target nominations in the first matching target nomination set and in the previous frame tracking result, the first matching loss matrix being expressed in terms of the generalized intersection over union (GIoU), which is expressed as $\mathrm{GIoU}(L, T) = \mathrm{IoU}(L, T) - \frac{|C \setminus (L \cup T)|}{|C|}$, wherein $L$ represents the rectangular area of a nomination in the first matching target nomination set of the next frame; $T$ represents the rectangular area of the previous frame tracking result; $C$ represents the minimum rectangular area that covers both the rectangular area of the first matching target nomination of the next frame and the rectangular area of the previous frame tracking result; $\setminus$ represents the set difference; and $\mathrm{IoU}$ represents the intersection over union;
  S3, acquiring, based on the first matching loss matrix, a plurality of matching values between the target nominations of the two adjacent frames in the sports video for the first matching target nomination set;
  S4, judging the matching values obtained in step S3: if a matching value is a bad match, the target nomination in the first matching target nomination set corresponding to the bad match is a low-matching-quality target nomination; a replacement nomination is selected from the nomination replacement set and used to replace the low-matching-quality target nomination corresponding to the bad match, giving a 1st-generation matching result of the next frame, and the method proceeds to step S5; if all the matching values obtained in step S3 are good matches, the target nomination in the first matching target nomination set corresponding to each good match is a high-matching-quality target nomination, giving a high-matching-quality target nomination set, and the multi-target tracking matching of the images of the two adjacent frames in the sports video ends, yielding the next frame tracking matching result;
  S5, constructing an nth-generation matching loss matrix from the bounding box coordinates of the target nominations in the nth-generation matching result of the next frame and in the previous frame tracking result, wherein n = 1 initially;
  S6, acquiring, based on the nth-generation matching loss matrix, a plurality of matching values between the target nominations of the two adjacent frames in the sports video for the nth-generation matching result of the next frame;
  S7, judging the matching values obtained in step S6: if a matching value is a bad match, a replacement nomination is selected from the nomination replacement set and used to replace the low-matching-quality target nomination corresponding to the bad match, giving an (n+1)th-generation matching result of the next frame, n is set to n + 1, and the method returns to step S5; if all the matching values are good matches, the target nomination in the nth-generation matching result of the next frame corresponding to each good match is a high-matching-quality target nomination, giving a high-matching-quality target nomination set, and the multi-target tracking matching of the images of the two adjacent frames in the sports video ends, yielding the next frame tracking matching result.
- 2. The multi-target tracking matching method in sports video according to claim 1, wherein in step S1 a multi-target tracking network model is used to obtain the previous frame tracking result and the next frame detection result; the multi-target tracking network model is obtained by annotating the multiple targets in historical sports videos from a training set of a historical sports video database to obtain a tracking label for each target, and training the multi-target tracking model with a deep learning method based on all the tracking labels.
- 3. The multi-target tracking matching method in sports video according to claim 2, wherein the multi-target tracking model comprises a multi-layer perceptron, and the learning of the multi-layer perceptron comprises the following steps: expressing the multi-layer perceptron as a function that maps an input single-frame picture to an output target bounding box Y through learnable parameters; and using the back propagation mechanism to pass the deviation between the target bounding box Y output by the multi-layer perceptron and the tracking label of the target back to the learnable parameters, thereby training the learnable parameters so that the output target bounding box Y approximates the tracking label of the target.
- 4. The multi-target tracking matching method in sports video according to any one of claims 1-3, wherein the previous frame tracking result obtained in step S1 is expressed as the set of target nominations of the previous frame, whose m-th element represents the m-th target nomination of the previous frame in the sports video, m = 1, 2, ..., M, and M represents the total number of target nominations of the previous frame in the sports video.
- 5. The multi-target tracking matching method in sports video according to any one of claims 1-3, wherein the next frame detection results obtained in step S1 are arranged in descending order of target confidence, and the first N next-frame target nominations whose confidence exceeds a target confidence threshold are extracted; each next-frame target nomination is expressed by the coordinates of its target bounding box, namely the upper-left corner coordinates and the lower-right corner coordinates of the target bounding box of the i-th next-frame target nomination, i = 1, 2, ..., N, where N is the total number of extracted next-frame target nominations; the leading nominations among the N next-frame target nominations are taken as first matching target nominations to obtain the first matching target nomination set, and the remaining next-frame target nominations are taken as replacement nominations to obtain the nomination replacement set.
- 6. The method according to claim 1, wherein in step S3 and/or step S6, the Hungarian matching algorithm is used to obtain the plurality of matching values of the target bounding box coordinates in the two adjacent frames of the sports video.
- 7. The method according to claim 1 or 6, wherein in step S4 and/or step S7, a matching value is an outlier if it is too large or if two of the matching values are very close to each other.
- 8. The method of claim 1 or 7, wherein in step S4 and/or step S7, the matching value is an outlier if the matching value exceeds a matching value threshold.
- 9. The multi-target tracking matching method in sports video according to claim 1, further comprising the steps of: SS1, acquiring a w-th frame tracking matching result by using the method, wherein w = 2; SS2, obtaining a (w+1)-th frame detection result; SS3, matching the w-th frame tracking matching result with the (w+1)-th frame detection result by using the Hungarian algorithm to obtain a (w+1)-th frame tracking matching result; SS4, if w < W, where W is the total number of frames of the sports video and W ≥ 3, setting w = w + 1 and returning to step SS2; if w ≥ W, taking the (w+1)-th frame tracking matching result as the overall target tracking matching result of the sports video.
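The quantities named in claims 1 and 5 can be illustrated with a short Python sketch. It is hedged throughout: boxes are assumed to be `(x1, y1, x2, y2)` tuples with positive area, the detection dictionary layout and the parameter `first_count` are hypothetical, and taking the loss as `1 - GIoU` is an assumption (the claims only state that the loss matrix is built from GIoU).

```python
def giou(a, b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    ASSUMPTION: both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Overlap rectangle; width or height is clamped to 0 when boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned rectangle C covering both boxes.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    # GIoU = IoU - |C \ (L ∪ T)| / |C|
    return inter / union - (c_area - union) / c_area


def split_nominations(detections, conf_threshold, first_count):
    """Sort detections by confidence and split them into two sets.

    Returns (first matching nominations, replacement nominations).
    """
    kept = sorted(
        (d for d in detections if d["score"] > conf_threshold),
        key=lambda d: d["score"],
        reverse=True,
    )
    return kept[:first_count], kept[first_count:]


def loss_matrix(track_boxes, nomination_boxes):
    """Rows: previous-frame tracks; columns: next-frame nominations.

    ASSUMPTION: loss = 1 - GIoU, so 0 is a perfect match and 2 is the worst.
    """
    return [[1.0 - giou(t, n) for n in nomination_boxes] for t in track_boxes]


detections = [
    {"box": (0, 0, 2, 2), "score": 0.9},
    {"box": (5, 5, 6, 6), "score": 0.4},   # below threshold, dropped
    {"box": (3, 0, 5, 2), "score": 0.8},
    {"box": (0, 1, 2, 3), "score": 0.6},   # kept as a replacement nomination
]
first, replacements = split_nominations(detections, conf_threshold=0.5, first_count=2)
costs = loss_matrix([(0, 0, 2, 2)], [d["box"] for d in first])
```

Because GIoU stays informative for disjoint boxes (it goes negative as they move apart), a loss built from it can still rank candidates for a fast-moving athlete whose boxes in adjacent frames do not overlap.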
Description
Multi-target tracking matching method in sports video based on target nomination replacement
Technical Field
The invention belongs to the technical field of target detection, and particularly relates to a multi-target tracking matching method in sports video based on target nomination replacement.
Background
Compared with pedestrian tracking in general surveillance video, multi-target (e.g., athlete) tracking in sports video has some particular characteristics, as shown in figs. 1-2. First, athletes wear the same uniform, so multiple targets look extremely similar and the visual features of different targets are similarly distributed, which can cause repeated detection. Second, many athletes run constantly, so occlusion occurs frequently, and athletes move faster than pedestrians, so target loss is more severe. When both situations occur together, existing multi-target tracking methods face a very difficult challenge.
Multi-target tracking of sports video yields the motion information of athletes, which can help players analyze the playing habits of athletes of interest and formulate a targeted plan, and can present game dynamics, athlete technical statistics and the like to audiences from multiple angles.
Existing multi-target tracking methods mostly address the tracking of multiple pedestrians in surveillance video. In that scenario the number of targets is large, but the camera viewpoint is fixed, the targets differ somewhat in appearance, and the targets move slowly, so such methods achieve good tracking results. They still have limitations, however, and few multi-target tracking studies currently address sports video specifically.
Given the challenges that sports video presents, existing methods, even if they can train a model on sports video, are not designed with the sports video scene in mind, so existing multi-target tracking methods generally cannot be applied to the sports video scene directly.
Disclosure of Invention
In view of the above problems, the invention provides a multi-target tracking matching method in sports video based on target nomination replacement, which solves the problems of repeated detection and target loss in sports video in the multi-target tracking methods of the prior art.
The invention provides a multi-target tracking matching method in sports video based on target nomination replacement, comprising the following specific steps:
S1, performing target detection on the images of two adjacent frames in a sports video to obtain a previous frame tracking result and a next frame detection result, wherein the next frame detection result comprises a first matching target nomination set and a nomination replacement set;
S2, constructing a first matching loss matrix based on the bounding box coordinates of the target nominations in the first matching target nomination set and in the previous frame tracking result;
S3, acquiring, based on the first matching loss matrix, a plurality of matching values between the target nominations of the two adjacent frames in the sports video for the first matching target nomination set;
S4, judging the matching values obtained in step S3: if a matching value is a bad match, the target nomination in the first matching target nomination set corresponding to the bad match is a low-matching-quality target nomination; a replacement nomination is selected from the nomination replacement set and used to replace the low-matching-quality target nomination corresponding to the bad match, giving a 1st-generation matching result of the next frame, and the method proceeds to step S5; if all the matching values obtained in step S3 are good matches, the target nomination in the first matching target nomination set corresponding to each good match is a high-matching-quality target nomination, giving a high-matching-quality target nomination set, and the multi-target tracking matching of the images of the two adjacent frames in the sports video ends, yielding the next frame tracking matching result;
S5, constructing an nth-generation matching loss matrix from the bounding box coordinates of the target nominations in the nth-generation matching result of the next frame and in the previous frame tracking result, wherein n = 1 initially;
S6, acquiring, based on the nth-generation matching loss matrix, a plurality of matching values between the target nominations of the two adjacent frames in the sports video for the nth-generation matching result of the next frame;
S7, judging the matching values obtained in step S6: if a matching value is a bad match, selecting a replacement nomination from a rep