CN-121999012-A - Real-time multi-target tracking method based on occlusion information
Abstract
The invention discloses a real-time multi-target tracking method based on occlusion information, comprising the following steps: (1) initializing a tracker; (2) obtaining the current frame targets and their appearance features, and sending the targets and the corresponding features to the tracker; (3) constructing an association matrix and performing two-stage global matching to obtain a matching result M; (4) correcting the affected targets based on the track categories defined in step 1 and the target categories defined in step 3, then updating the track information; (5) increasing the occlusion duration of unmatched tracks, deleting tracks whose occlusion duration is too long, converting track categories, and initializing tracks for unmatched targets; and (6) after the track category conversion is completed, outputting the current frame tracking result and returning to step 2 to continue tracking until tracking ends. The method improves tracking of low-confidence targets in scenes with a low camera angle and improves the accuracy of multi-target tracking.
Inventors
- MA LILI
- SHEN XUEQIAN
- CHEN JINGUANG
- ZHANG KAIBING
- FAN WEI
Assignees
- Xi'an Polytechnic University (西安工程大学)
Dates
- Publication Date
- 20260508
- Application Date
- 20251230
Claims (8)
- 1. A real-time multi-target tracking method based on occlusion information, characterized by comprising the following steps: step 1, initializing a tracker; step 2, obtaining the current frame targets and their appearance features, and sending the targets and the corresponding features to the tracker; step 3, constructing an association matrix and performing two-stage global matching to obtain a matching result M; step 4, correcting the affected targets based on the track categories defined in step 1 and the target categories defined in step 3; step 5, increasing the occlusion duration of unmatched tracks, deleting tracks whose occlusion duration is too long, converting track categories, and initializing tracks for unmatched targets; step 6, after the track category conversion is completed, outputting the current frame tracking result and returning to step 2 to continue tracking until tracking ends.
- 2. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein in step 1 the tracker is the tracking part of a method based on the tracking-by-detection (TBD) paradigm, and the tracker maintains three track sets: determined tracks, occluded tracks, and uncertain tracks.
- 3. The method for real-time multi-target tracking based on occlusion information according to claim 2, wherein in step 1 the disappearance threshold of the occluded tracks is set to 90 and the disappearance threshold of the uncertain tracks is set to 35, which completes the initialization of the tracker.
- 4. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein in step 2 the current frame image is extracted from the video to be tracked or acquired in real time; YOLOX is used as the detector on the current frame image to obtain the current frame targets, each i-th target having a confidence; Fast-ReID is used as the appearance feature extractor to obtain the appearance feature of each current frame target; a Kalman filter is used to predict the current frame position of every track, and a linear prediction gives the advance predicted position of every track, so that for the j-th track both its current frame position and its advance predicted position are recorded; if the current frame image is the first frame, the tracker is the one obtained in step 1, otherwise it is the one obtained in step 6.
- 5. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein step 3 specifically comprises:
  Step 3.1: dividing the current frame targets D into three categories according to confidence, namely reliable targets, uncertain targets, and unreliable targets, using a target determination threshold and a target uncertainty threshold.
  Step 3.2: for each track among all tracks, deriving the occlusion value of the track from its track category, as shown in formula (1): the occlusion value of every track in the occluded state is set to 1 and that of every other track is set to 0. Based on the advance predicted position of the track and the advance predicted positions of the other tracks, the pre-occlusion value of the track is computed through the IOA, as shown in formula (2): the IOA between the advance predicted position of the track and the advance predicted position of each other track is computed; if, among all other tracks with a greater depth than the track, any has an IOA with the track that exceeds the area occlusion threshold, the pre-occlusion value of the track is set to 0.5, which indicates that the track is about to be occluded or is in the occluded state; for the remaining normal tracks and tracks at the end of occlusion it is set to 0. Finally, as shown in formula (3), the occlusion value and the pre-occlusion value are added to obtain the occlusion information of the track. Formulas (1) to (3) use the area occlusion threshold between tracks and the bottom-edge coordinate of each advance predicted position. After the occlusion information is obtained, two flags are derived from it for subsequent matching. The first flag indicates a track that is still occluded but is at the end of occlusion: as shown in formula (4), it is set to 1 when the occlusion information is greater than 0.5 and less than 1.5, and to 0 otherwise. The second flag indicates a track that is in occlusion: as shown in formula (5), it is set to 1 when the occlusion information is greater than 0.5, and to 0 otherwise.
  Step 3.3: performing two-stage global matching based on the track occlusion information obtained in step 3.2. As shown in formula (6), the confidence matrix of the targets is multiplied element by element by the height-modulated IoU matrix computed from the current frame target positions and the current frame positions of all tracks. As shown in formula (7), an appearance matrix that accounts for the different target confidence levels is computed from the appearance feature of the i-th target and the appearance feature of the j-th track. In formulas (6) and (7), the confidence matrix has the same dimensions as the height-modulated IoU matrix and holds the confidence of the target at the corresponding position, and the appearance term is the cosine distance between the target appearance feature and the track appearance feature. After these two matrices are obtained, the first-stage global matching is performed, in which a gating restriction screens out reasonable matching pairs and a cost body determines the matching priority. The gating restriction is computed as shown in formulas (8) to (11), and the cost body as shown in formula (12). These formulas use: a position association matrix incorporating occlusion information; a track occlusion information correction weight; two occlusion information matrices of the same dimensions as the position association matrix, in each of which all values of the j-th column are set to the corresponding occlusion value of the j-th track; an IoU threshold; an appearance feature threshold; a center-point distance matrix between the current frame targets and the current frame positions of the tracks; a scale matrix between the current frame targets and the current frame positions of the tracks, whose (i, j) value is the sum of the widths of the position box of the i-th target and the position box of the j-th track; and the area ratio of the target and the track at the corresponding position. The traditional matching condition serves as the initial gating restriction, and the subsequent condition supplements it based on occlusion information. In the priority-determining part, the appearance feature has its own weight, while the target confidence and the area ratio share the same weight. After the gating restriction and the cost body are obtained, the first-stage matching result is obtained under the gating restriction by an optimal matching strategy based on the cost body. The tracks and targets left unmatched in the first stage are the input of the second-stage global matching, which relies on position information. First, the sub-matrices corresponding to the unmatched tracks are extracted from the first-stage matrices. The second-stage gating restriction is computed as shown in formulas (13) to (17), and the second-stage cost body as shown in formula (18), which uses an appearance feature association matrix incorporating occlusion information. The second-stage matching result is then obtained under the second-stage gating restriction by an optimal matching strategy based on the second-stage cost body. Finally, the matching results of the two stages are combined into the final matching result, and the set of unmatched tracks and the set of unmatched targets are recorded.
- 6. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein step 4 specifically comprises:
  Step 4.1: for any matching pair in the matching result obtained in step 3, if the pre-occlusion information of the track indicates that the matching pair may be affected, and the pair further satisfies either of two conditions (the second of which carries an additional constraint), the motion information of the track in the matching pair is the more reliable cue; in that case the target is corrected first, and otherwise no correction is needed. The correction follows formula (19), which selects the maximum occluder of the target using the area of the target covered by each candidate occluder and the bottom-edge coordinate of the target box. After the maximum occluder is obtained, the reliable point of the target is determined from the relative position of the target and the maximum occluder. Finally, the reliable width and height maintained by the track are used as the corrected width and height, and the position of the corrected target is determined from the reliable point, which completes the affected target and yields the corrected target. When the track information is updated, the current position is computed from the reliable speed over the most recent 5 frames of the track.
  Step 4.2: for any matching pair in the matching result, if the target is judged to need correction, then once the correction is completed the corrected target is used to update the current position, confidence, and Kalman filter parameters of the track; if the target is judged not to need correction, the target is used directly to update the track position, confidence, and Kalman filter parameters. If the matching pair satisfies either of two conditions (the second of which carries an additional constraint), the appearance feature of the target is used to update the appearance feature of the track with an occlusion-based appearance update: if the confidence of the target is greater than the confidence of the track, the EMA weight takes one value; otherwise it takes another.
- 7. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein step 5 specifically comprises:
  Step 5.1: an occlusion duration is maintained for every track. Each time a track is unmatched, its occlusion duration is increased by one. If the occlusion duration of a track exceeds the disappearance threshold of its category, the track is deleted. Whenever a track is successfully matched and its information updated, its occlusion duration is reset to zero.
  Step 5.2: after the occlusion duration is updated, judging whether each unmatched track is at the picture boundary. Specifically, the position of an unmatched track is expressed by its upper-left vertex and its lower-right vertex, and the current frame image is described by the coordinates of its upper-left vertex and its lower-right vertex, where w is the width and h is the height of the current frame image. If any one of the four boundary inequalities holds, the track is at the picture boundary and is deleted.
  Step 5.3: for an unmatched determined track obtained in step 3.3 that satisfies the stated condition, an accident delay flag is set first. Under one condition on the unmatched determined track the flag is updated; under the other, the determined track is converted into an occluded track, that is, it is moved into the occluded track set. For an unmatched occluded track and an unmatched uncertain track obtained in step 3.3 that satisfy their stated conditions, the track position is updated linearly based on the current Kalman filter state of the track. For a matched track that satisfies one of the stated conditions, the track is converted into a determined track after the update; other matched pairs do not change the current track category.
  Step 5.4: for an unmatched target obtained in step 3.3, if its confidence meets the initialization condition, a track is initialized for it; depending on its confidence, it is initialized as a determined track or as an uncertain track. Each newly initialized track is given a unique ID, increasing from 1.
- 8. The method for real-time multi-target tracking based on occlusion information according to claim 1, wherein step 6 specifically comprises: the current frame tracking result consists of the positions and IDs of the tracks whose occlusion duration is 0 among all tracks after step 5 is completed, as shown in formula (20); the method then returns to step 2 to track the next frame image, and repeats until tracking ends.
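The occlusion information of step 3.2 in claim 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the formulas (1) to (5) are not reproduced in this text, so the sketch follows only the prose. The names `Track`, `ioa`, `occlusion_info`, and `occlusion_flags` are hypothetical, the IOA is assumed to be the intersection area divided by the area of the track's own box, depth is assumed to be the y-coordinate of the bottom edge, and the threshold value 0.3 is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), y grows downward


@dataclass
class Track:
    box: Box        # advance predicted position of the track
    occluded: bool  # True if the track belongs to the occluded track set


def ioa(a: Box, b: Box) -> float:
    """Intersection of a and b divided by the area of a (assumed IOA definition)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    return iw * ih / area_a if area_a > 0 else 0.0


def occlusion_info(tracks: List[Track], ioa_threshold: float = 0.3) -> List[float]:
    """Occlusion value (1 or 0) plus pre-occlusion value (0.5 or 0) per track."""
    infos = []
    for j, track in enumerate(tracks):
        occ = 1.0 if track.occluded else 0.0
        pre = 0.0
        for k, other in enumerate(tracks):
            if k == j:
                continue
            # Assumption: a larger bottom-edge y means the other track is in front.
            in_front = other.box[3] > track.box[3]
            if in_front and ioa(track.box, other.box) > ioa_threshold:
                pre = 0.5
                break
        infos.append(occ + pre)
    return infos


def occlusion_flags(info: float) -> Tuple[int, int]:
    """Return (end-of-occlusion flag, in-occlusion flag) for one track."""
    ending = 1 if 0.5 < info < 1.5 else 0
    during = 1 if info > 0.5 else 0
    return ending, during


if __name__ == "__main__":
    demo = [
        Track((0, 0, 10, 20), False),         # overlapped by a track in front
        Track((5, 5, 15, 30), False),         # in front, nothing covers it
        Track((100, 100, 110, 120), True),    # already in the occluded set
    ]
    print(occlusion_info(demo))  # [0.5, 0.0, 1.0]
```

With these definitions the occlusion information takes one of four values (0, 0.5, 1, 1.5), and only the value 1 sets the end-of-occlusion flag, which matches the "greater than 0.5 and less than 1.5" condition stated in the claim.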
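The element-wise product described for formula (6) in claim 5 can be sketched as follows. Formula (6) itself is not reproduced in this text, so this is an illustration under assumptions: the height-modulated IoU is assumed to follow the definition used by Hybrid-SORT (the ordinary IoU multiplied by the IoU of the vertical extents of the two boxes), and the function names `iou`, `height_iou`, and `confidence_weighted_hmiou` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two boxes."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def height_iou(a: Box, b: Box) -> float:
    """IoU of the vertical extents only (assumed height term)."""
    overlap = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    span = max(a[3], b[3]) - min(a[1], b[1])
    return overlap / span if span > 0 else 0.0


def confidence_weighted_hmiou(
    det_boxes: Sequence[Box],
    det_scores: Sequence[float],
    track_boxes: Sequence[Box],
) -> List[List[float]]:
    """Rows are targets, columns are tracks: confidence * IoU * height IoU."""
    return [
        [score * iou(det, trk) * height_iou(det, trk) for trk in track_boxes]
        for det, score in zip(det_boxes, det_scores)
    ]
```

Broadcasting each target's confidence across its row is what makes the confidence matrix "the same dimensions" as the IoU matrix, as the claim states; a low-confidence target therefore gets a uniformly weaker position score against every track.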
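The track bookkeeping of steps 5.1, 5.2, and 6 (claims 7 and 8) can be sketched in the same way. The disappearance thresholds 90 and 35 come from claim 3; everything else is an assumption made for illustration: the names `Track`, `at_image_boundary`, `age_and_prune`, and `frame_output` are hypothetical, the image is assumed to span (0, 0) to (w, h), the boundary test is assumed to use non-strict inequalities, and the accident delay flag and category conversions of step 5.3 are left out.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

# Disappearance thresholds per track category, as stated in claim 3.
MAX_OCCLUSION_AGE = {"occluded": 90, "uncertain": 35}


@dataclass
class Track:
    track_id: int
    kind: str  # "determined", "occluded", or "uncertain"
    box: Box
    occlusion_age: int = 0


def at_image_boundary(box: Box, width: float, height: float) -> bool:
    """Assumed boundary test: the box touches or crosses any image edge."""
    x1, y1, x2, y2 = box
    return x1 <= 0 or y1 <= 0 or x2 >= width or y2 >= height


def age_and_prune(
    tracks: List[Track], matched_ids: Set[int], width: float, height: float
) -> List[Track]:
    """Update occlusion durations and drop expired or boundary tracks."""
    kept = []
    for track in tracks:
        if track.track_id in matched_ids:
            track.occlusion_age = 0  # matched: reset the occlusion duration
            kept.append(track)
            continue
        track.occlusion_age += 1  # unmatched: one more occluded frame
        limit = MAX_OCCLUSION_AGE.get(track.kind)
        if limit is not None and track.occlusion_age > limit:
            continue  # occluded for too long
        if at_image_boundary(track.box, width, height):
            continue  # unmatched at the picture boundary
        kept.append(track)
    return kept


def frame_output(tracks: List[Track]) -> List[Tuple[int, Box]]:
    """Step 6: report only tracks whose occlusion duration is 0."""
    return [(t.track_id, t.box) for t in tracks if t.occlusion_age == 0]
```

Reporting only tracks with an occlusion duration of 0 means that occluded tracks are kept alive internally for re-association but never appear in the per-frame result, which keeps the output online and free of predicted-only boxes.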
Description
Real-time multi-target tracking method based on occlusion information
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a real-time multi-target tracking method based on occlusion information.
Background
Visual multi-target tracking is a key task in computer vision and is widely applied in fields such as video surveillance, autonomous driving, and human-computer interaction. Currently, mainstream multi-target tracking methods mostly adopt the tracking-by-detection (TBD) paradigm: a detector produces the target bounding boxes in each frame, and the boxes are then matched with the existing tracks through data association. In practical applications, however, multi-target tracking still faces challenges such as target occlusion, similar appearance between targets, camera motion blur, and detection errors, which limit tracking performance.
As research on multi-target tracking algorithms has progressed, many methods under the tracking-by-detection paradigm have been proposed. Related works include ByteTrack (Zhang Y, Sun P, Jiang Y, et al. ByteTrack: Multi-Object Tracking by Associating Every Detection Box[EB/OL]. 2021. DOI:10.48550/arXiv.2110.06864.), Deep-SORT (Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric[C]//2017 IEEE International Conference on Image Processing. Beijing, China: IEEE, 2017: 3645-3649. DOI:10.1109/ICIP.2017.8296962.), and OC-SORT (Cao J, Pang J, Weng X, et al. Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 9686-9696.). ByteTrack is a key work in the tracking-by-detection paradigm. It not only demonstrates the value of low-confidence targets in multi-target tracking but also provides a detector, based on the advanced YOLO series of detection models, that is widely used in the field. Deep-SORT introduces Re-ID appearance features into the association matrix between targets and tracks and is now a popular method under the tracking-by-detection paradigm. OC-SORT uses object observations to correct the error that accumulates in the filter parameters during occlusion, so that the Kalman filter parameters are not excessively affected by occlusion.
Under the TBD framework, the data association stage has received wide attention. Methods that implicitly learn the target motion state through a neural network still lag behind the most advanced TBD methods in accuracy and efficiency, and their complex network structures make it difficult to meet the real-time requirements of terminal devices. A number of training-free association mechanisms have therefore been proposed, such as TrackTrack (Shim K, Ko K, Yang Y, et al. Focusing on Tracks for Online Multi-Object Tracking[C]//2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). DOI:10.1109/CVPR52734.2025.01091.), Hybrid-SORT (Yang M, Han G, Yan B, et al. Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking[C]. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(7): 6504-6512.), and DeconfuseTrack (Huang C, Han S, He M, et al. DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking[J]. IEEE, 2024. DOI:10.1109/CVPR52733.2024.01825.). TrackTrack proposes an association strategy from the track perspective to replace traditional Hungarian matching and introduces a track-aware initialization method, which makes better use of the tracks during matching. Hybrid-SORT introduces confidence and height states in addition to velocity direction cues and uses them as weak cues that supplement strong cues, such as spatial and appearance information, when objects are occluded or clustered. DeconfuseTrack proposes a decomposed data association that uses a series of non-learning-based modules to decompose the traditional association problem into several sub-problems and selectively resolves the confusion in each sub-problem by purposefully using new cues.
However, although the above algorithms have achieved significant results in multi-target tracking, the problems of ID switching caused by occlusion and of using low-confidence targets remain. In addition, many recent methods apply track interpolation as post-processing to splice track segments in the tracking results of the current video and thereby connect tracks that suffered ID switches, but such methods cannot meet the requirement of real-time tracking. Existing methods are therefore insufficient in handling occlusion, and more valuable cues still need to be found to improve occlusion handling in multi-target tracking.
Disclosure of Invention
The inve