
CN-115496958-B - Method and device for algorithm closed-loop assisted bidirectional multi-model data labeling

CN115496958B

Abstract

The invention provides a method and a device for algorithm closed-loop assisted bidirectional multi-model data labeling, comprising the following steps: initializing a bidirectional multi-model labeling workflow; labeling image data to be labeled based on the workflow and generating labeling data; and generating a data set according to the labeling data and the workflow. The labeling includes one or more of the following: bidirectional single-model labeling, bidirectional multi-model simultaneous correction labeling, and multi-target bidirectional multi-model data correction labeling. By applying these labeling modes to the image data to be labeled, the method accurately locates target pixel coordinates, greatly reduces manual labeling cost, and improves data labeling speed.

Inventors

  • WANG HAO
  • YANG YANTAI
  • YU ZETING
  • ZHANG TIANHAO
  • YIN GUIXIN

Assignees

  • Tianjin (Binhai) Artificial Intelligence Military-Civil Fusion Innovation Center (天津(滨海)人工智能军民融合创新中心)

Dates

Publication Date
2026-05-08
Application Date
2022-08-12

Claims (10)

  1. A method for algorithm closed-loop assisted bidirectional multi-model data labeling, characterized by comprising the following steps: initializing a bidirectional multi-model labeling workflow; labeling image data to be labeled based on the workflow and generating labeling data; and generating a data set according to the labeling data and the workflow; wherein the labeling comprises one or more of: bidirectional single-model labeling, bidirectional multi-model simultaneous correction labeling, and multi-target bidirectional multi-model data correction labeling; wherein initializing the bidirectional multi-model labeling workflow comprises: selecting auxiliary algorithms for labeling from a preset algorithm model library; setting a confidence threshold for the auxiliary algorithms and setting initial target position coordinates; and setting the data set type and format and the data set generation configuration parameters; the data set type comprises at least one of a target detection data set and a target tracking data set, the auxiliary algorithms comprise at least one of an initial target box recommendation algorithm, a target detection algorithm, an automatic target box correction labeling algorithm, and a single-target tracking algorithm, and the data set generation configuration parameters comprise at least one of similarity, time interval, overlap rate, center point offset, and total frame count; wherein performing bidirectional multi-model simultaneous correction labeling on the image data to be labeled based on the workflow and generating labeling data comprises: performing forward continuous labeling on the image data with each selected auxiliary algorithm to obtain, for each auxiliary algorithm, a visual positioning result for the target position; if the errors of the visual positioning results of all auxiliary algorithms are smaller than the confidence threshold, generating labeling data and ending; if some auxiliary algorithms have errors smaller than the confidence threshold while others have errors greater than or equal to it, stopping forward continuous labeling, correcting the auxiliary algorithms whose errors are greater than or equal to the threshold using the results of the auxiliary algorithms whose errors are smaller than the threshold, and resuming forward continuous labeling with each auxiliary algorithm until the errors of all visual positioning results are smaller than the confidence threshold, then generating labeling data and ending; and if the errors of the visual positioning results of all auxiliary algorithms are greater than or equal to the confidence threshold, stopping forward continuous labeling, reinitializing the target position coordinates, and performing reverse continuous labeling correction of the target position with each selected auxiliary algorithm based on the reinitialized coordinates until the visual positioning results satisfy a preset termination condition, then generating labeling data and ending.
  2. The method of claim 1, wherein performing bidirectional single-model labeling on the image data to be labeled based on the workflow and generating labeling data comprises: performing forward continuous labeling on the image data with the selected auxiliary algorithm to obtain a visual positioning result for the target position; if the error of the visual positioning result is smaller than the confidence threshold, generating labeling data and ending; otherwise, stopping forward continuous labeling and reinitializing the target position coordinates; and performing reverse-order continuous labeling correction of the target position based on the reinitialized coordinates until the visual positioning result satisfies a preset termination condition, then generating labeling data.
  3. The method of claim 2, wherein performing multi-target bidirectional multi-model data correction labeling on the image data to be labeled based on the workflow and generating labeling data comprises: performing bidirectional single-model labeling on each target in the image data using the auxiliary algorithm corresponding to that target, and generating labeling data.
  4. The method of claim 1, wherein generating a data set according to the labeling data and the workflow comprises: selecting the target detection data set and/or the target tracking data set to be generated according to the data set type set in the workflow; when a target detection data set is to be generated, extracting, based on a similarity algorithm, the labeling data with the lowest similarity for the same target to build the target detection data set; and when a target tracking data set is to be generated, extracting continuous images from the labeling data according to the set time interval, overlap rate, center point offset, and total frame count to form the target tracking data set.
  5. The method of claim 1, further comprising, after generating the data set: based on the target detection data set, configuring the training script and/or hyperparameters of a target detection algorithm, training on the detection data set, and generating a target detection algorithm model; and based on the target tracking data set, configuring the training script and/or hyperparameters of a single-target tracking algorithm, training on the tracking data set, and generating a single-target tracking algorithm model.
  6. A device for algorithm closed-loop assisted bidirectional multi-model data labeling, characterized by comprising a human-machine interaction labeling interface, a labeling module, and a labeling data generation module; the human-machine interaction labeling interface is used for initializing a bidirectional multi-model labeling workflow; the labeling module is used for labeling image data to be labeled based on the workflow and generating labeling data; the labeling data generation module is used for generating a data set according to the labeling data and the workflow; wherein the labeling comprises one or more of: bidirectional single-model labeling, bidirectional multi-model simultaneous correction labeling, and multi-target bidirectional multi-model data correction labeling; the human-machine interaction labeling interface is specifically used for: selecting auxiliary algorithms for labeling from a preset algorithm model library; setting a confidence threshold for the auxiliary algorithms and setting initial target position coordinates; and setting the data set type and format and the data set generation configuration parameters; the data set type comprises at least one of a target detection data set and a target tracking data set, the auxiliary algorithms comprise at least one of an initial target box recommendation algorithm, a target detection algorithm, an automatic target box correction labeling algorithm, and a single-target tracking algorithm, and the data set generation configuration parameters comprise at least one of similarity, time interval, overlap rate, center point offset, and total frame count; the labeling module performs bidirectional multi-model simultaneous correction labeling on the image data to be labeled based on the workflow and generates labeling data by: performing forward continuous labeling on the image data with each selected auxiliary algorithm to obtain, for each auxiliary algorithm, a visual positioning result for the target position; if the errors of the visual positioning results of all auxiliary algorithms are smaller than the confidence threshold, generating labeling data and ending; if some auxiliary algorithms have errors smaller than the confidence threshold while others have errors greater than or equal to it, stopping forward continuous labeling, correcting the auxiliary algorithms whose errors are greater than or equal to the threshold using the results of the auxiliary algorithms whose errors are smaller than the threshold, and resuming forward continuous labeling with each auxiliary algorithm until the errors of all visual positioning results are smaller than the confidence threshold, then generating labeling data and ending; and if the errors of the visual positioning results of all auxiliary algorithms are greater than or equal to the confidence threshold, stopping forward continuous labeling, reinitializing the target position coordinates, and performing reverse continuous labeling correction of the target position with each selected auxiliary algorithm based on the reinitialized coordinates until the visual positioning results satisfy a preset termination condition, then generating labeling data and ending.
  7. The apparatus of claim 6, wherein the labeling module performs bidirectional single-model labeling on the image data to be labeled based on the workflow and generates labeling data by: performing forward continuous labeling on the image data with the selected auxiliary algorithm to obtain a visual positioning result for the target position; if the error of the visual positioning result is smaller than the confidence threshold, generating labeling data and ending; otherwise, stopping forward continuous labeling and reinitializing the target position coordinates; and performing reverse-order continuous labeling correction of the target position based on the reinitialized coordinates until the visual positioning result satisfies a preset termination condition, then generating labeling data.
  8. The apparatus of claim 6, wherein the labeling module performs multi-target bidirectional multi-model data correction labeling on the image data to be labeled based on the workflow and generates labeling data by: performing bidirectional single-model labeling on each target in the image data using the auxiliary algorithm corresponding to that target, and generating labeling data.
  9. The apparatus of claim 6, wherein the labeling data generation module is specifically configured to: select the target detection data set and/or the target tracking data set to be generated according to the data set type set in the workflow; when a target detection data set is to be generated, extract, based on a similarity algorithm, the labeling data with the lowest similarity for the same target to build the target detection data set; and when a target tracking data set is to be generated, extract continuous images from the labeling data according to the set time interval, overlap rate, center point offset, and total frame count to form the target tracking data set.
  10. The apparatus of claim 6, further comprising a model online training module configured to: based on the target detection data set, configure the training script and/or hyperparameters of a target detection algorithm, train on the detection data set, and generate a target detection algorithm model; and based on the target tracking data set, configure the training script and/or hyperparameters of a single-target tracking algorithm, train on the tracking data set, and generate a single-target tracking algorithm model.
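The bidirectional labeling procedure recited in claims 1-2 — forward continuous labeling until the positioning error exceeds the confidence threshold, then reinitialization and reverse-order correction — can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `track`, `positioning_error`, and `reinit_box_fn` are hypothetical stand-ins for the patent's auxiliary algorithm, its error measure, and the re-initialization step.

```python
# Hypothetical sketch of the bidirectional single-model labeling loop from
# claims 1-2. `track` and `positioning_error` stand in for the patent's
# auxiliary algorithm and its error measure; both are placeholders here.

def track(frame, prev_box):
    """Stand-in single-target tracker: predicts a box for the next frame."""
    return prev_box  # placeholder prediction

def positioning_error(box, frame):
    """Stand-in error estimate for a predicted box (lower is better)."""
    return 0.0  # placeholder

def bidirectional_label(frames, init_box, conf_threshold, reinit_box_fn):
    """Label frames forward; on failure, reinitialize and correct in reverse."""
    labels = {0: init_box}
    box = init_box
    failed_at = None
    # Forward pass: label frame by frame until the error reaches the threshold.
    for i in range(1, len(frames)):
        box = track(frames[i], box)
        if positioning_error(box, frames[i]) >= conf_threshold:
            failed_at = i
            break
        labels[i] = box
    if failed_at is None:
        return labels  # every forward label met the confidence threshold
    # Reverse pass: reinitialize the target box at the failure point (e.g. by
    # manual re-annotation), then correct earlier labels in reverse order.
    box = reinit_box_fn(frames[failed_at])
    labels[failed_at] = box
    for i in range(failed_at - 1, 0, -1):
        box = track(frames[i], box)
        if positioning_error(box, frames[i]) < conf_threshold:
            break  # termination condition: result rejoins the forward labels
        labels[i] = box
    return labels
```

With the placeholder tracker, the forward pass always succeeds; a real auxiliary algorithm would trigger the reverse pass whenever tracking drifts, which is what saves re-labeling the whole sequence by hand.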

Description

Method and device for algorithm closed-loop assisted bidirectional multi-model data labeling

Technical Field

The invention belongs to the technical field of image data processing, and particularly relates to a method and a device for algorithm closed-loop assisted bidirectional multi-model data labeling.

Background

Target tracking is one of the basic tasks in the field of computer vision: it uses the context information of a video or image sequence to model the appearance of a target, predict its motion state, and calibrate its position, and it is widely applied in video surveillance, visual navigation, intelligent human-machine interaction, and other areas. Existing tracking algorithms cope well with simple scenes, but designing a tracker with high precision and good robustness remains very difficult in the face of uncertain target motion, illumination changes, occlusion, and other complex conditions, so localizing the target is comparatively hard. Deep learning has the advantage that, given a large amount of data, it can train a tracking network to stably track a target and accurately calibrate its position. At present, high-performing models in the deep learning field require large amounts of training data, almost all of which is labeled manually; this consumes considerable labor cost, and labeling quality varies from person to person. In the field of visual target tracking, images must be labeled frame by frame and adjacent frames are highly similar, so purely manual labeling is inefficient; moreover, some deep learning application scenarios place high real-time requirements on data labeling that manual labeling can hardly meet.
Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides a method for algorithm closed-loop assisted bidirectional multi-model data labeling, comprising the following steps: initializing a bidirectional multi-model labeling workflow; labeling image data to be labeled based on the workflow and generating labeling data; and generating a data set according to the labeling data and the workflow; wherein the labeling comprises one or more of: bidirectional single-model labeling, bidirectional multi-model simultaneous correction labeling, and multi-target bidirectional multi-model data correction labeling. Preferably, initializing the bidirectional multi-model labeling workflow includes: selecting auxiliary algorithms for labeling from a preset algorithm model library; setting a confidence threshold for the auxiliary algorithms and setting initial target position coordinates; and setting the data set type and format and the data set generation configuration parameters. The data set type comprises at least one of a target detection data set and a target tracking data set; the auxiliary algorithms comprise at least one of an initial target box recommendation algorithm, a target detection algorithm, an automatic target box correction labeling algorithm, and a single-target tracking algorithm; and the data set generation configuration parameters comprise at least one of similarity, time interval, overlap rate, center point offset, and total frame count.
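The workflow initialization described above — choosing auxiliary algorithms from the model library and fixing the threshold, initial box, and data set generation parameters — can be sketched as a configuration object. All field names and default values below are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the labeling-workflow initialization described in
# the disclosure. Every field name and default here is illustrative.
from dataclasses import dataclass, field

@dataclass
class LabelingWorkflow:
    # Auxiliary algorithms selected from the preset algorithm model library.
    auxiliary_algorithms: list = field(default_factory=lambda: [
        "initial_box_recommendation",
        "object_detection",
        "auto_box_correction",
        "single_object_tracking",
    ])
    confidence_threshold: float = 0.8   # error cutoff for accepting a label
    initial_box: tuple = (0, 0, 0, 0)   # (x, y, w, h) of the first target
    dataset_types: tuple = ("detection", "tracking")
    dataset_format: str = "coco"        # illustrative output format
    # Data set generation configuration parameters named in the disclosure.
    similarity: float = 0.9             # dedup threshold for detection sets
    time_interval: int = 5              # frame spacing for tracking clips
    overlap_rate: float = 0.3           # min box overlap between frames
    center_offset: float = 20.0         # max center drift in pixels
    total_frames: int = 100             # clip length for tracking sets

# Example: an annotator tightens the confidence threshold before labeling.
wf = LabelingWorkflow(confidence_threshold=0.75)
```

Grouping these settings in one object mirrors the claim structure: the same workflow drives both the labeling loop (via the threshold and initial box) and the downstream data set generation (via the similarity and clip parameters).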
Preferably, performing bidirectional single-model labeling on the image data to be labeled based on the workflow and generating labeling data includes: performing forward continuous labeling on the image data with the selected auxiliary algorithm to obtain a visual positioning result for the target position; if the error of the visual positioning result is smaller than the confidence threshold, generating labeling data and ending; otherwise, stopping forward continuous labeling and reinitializing the target position coordinates; and performing reverse-order continuous labeling correction of the target position based on the reinitialized coordinates until the visual positioning result satisfies a preset termination condition, then generating labeling data. Preferably, performing multi-target bidirectional multi-model data correction labeling on the image data to be labeled based on the workflow and generating labeling data includes: performing bidirectional single-model labeling on each target in the image data using the auxiliary algorithm corresponding to that target, and generating labeling data. Preferably, based on the workflow, bidirectional multi-model simultaneous correction labeling is performed on the image data to be labeled, and labeling data is generated