CN-115409076-B - Data labeling method, device, equipment and storage medium
Abstract
The application discloses a data labeling method, device, equipment, and storage medium. The method comprises: when a labeling instruction is detected, determining an object to be labeled and performing format alignment on the object to be labeled to obtain an aligned object; and performing label propagation correction on the aligned object according to a target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object. In the data preparation stage, the object to be labeled is labeled rapidly and the aligned object also undergoes label propagation correction, so the consistency of the rapid labeling can be ensured, the labor cost of annotators is reduced, and labeling efficiency is improved.
Inventors
- CHEN YUANZHENG
Assignees
- 中国移动通信集团浙江有限公司
- 中国移动通信集团有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20210526
Claims (8)
- 1. A data labeling method, characterized by comprising the following steps: when a labeling instruction is detected, determining an object to be labeled, and performing format alignment on the object to be labeled to obtain an aligned object, wherein the object to be labeled comprises an image to be processed, sound data to be processed, and text data to be processed; determining a target labeling model of the aligned object; and performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object; wherein the step of performing label propagation correction on the aligned object according to the target labeling model and the preset label propagation correction mode to obtain the target label of the aligned object comprises: determining a weakly augmented version of the aligned object and a strongly augmented version of the aligned object; labeling the weakly augmented version based on the target labeling model to obtain a pseudo-label of the aligned object; comparing the pseudo-label with a preset threshold to determine a one-hot pseudo-label of the aligned object; performing object prediction on the strongly augmented version according to a preset prediction model to obtain an object prediction value; and calculating the target label of the aligned object from the one-hot pseudo-label and the object prediction value through a preset standard cross-entropy loss matching calculation mode.
- 2. The data labeling method according to claim 1, wherein the step of determining a target labeling model of the aligned object comprises: determining a target type of the aligned object; determining whether a target object whose type is consistent with the target type exists in a preset object set; and if a target object whose type is consistent with the target type exists, determining a labeling model of the target object, and taking the labeling model of the target object as the target labeling model.
- 3. The data labeling method according to claim 2, wherein after the step of determining whether a target object whose type is consistent with the target type exists in the preset object set, the method comprises: if no target object whose type is consistent with the target type exists, displaying a label set in a preset label form; receiving an initial labeling result obtained by labeling part of the aligned object based on the label set; and determining the target labeling model of the aligned object based on the initial labeling result and a preset initial labeling model.
- 4. The data labeling method according to claim 1, wherein the step of performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain the target label of the aligned object comprises: performing label propagation correction on the aligned object according to the target labeling model and the preset label propagation correction mode to obtain a corrected label; determining whether an audit-pass instruction for the corrected label is received; and if the audit-pass instruction for the corrected label is received, taking the corrected label as the target label of the aligned object.
- 5. The data labeling method according to claim 1, wherein after the step of performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain the target label of the aligned object, the method comprises: iteratively training a preset basic model to be trained based on the labeled aligned objects to obtain a training result model; predicting data to be predicted based on the training result model to obtain a prediction label of the data to be predicted; and iteratively updating the target labeling model according to the prediction label of the data to be predicted.
- 6. A data labeling device, characterized in that the data labeling device comprises: a first determining module, used for determining an object to be labeled when a labeling instruction is detected, and performing format alignment on the object to be labeled to obtain an aligned object, wherein the object to be labeled comprises an image to be processed, sound data to be processed, and text data to be processed; a second determining module, used for determining a target labeling model of the aligned object; and a correction module, used for performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object; wherein the correction module comprises: a first determining unit, used for determining a weakly augmented version of the aligned object and a strongly augmented version of the aligned object; a first acquisition unit, used for labeling the weakly augmented version based on the target labeling model to obtain a pseudo-label of the aligned object; a second determining unit, used for comparing the pseudo-label with a preset threshold to determine a one-hot pseudo-label of the aligned object; an object prediction unit, used for performing object prediction on the strongly augmented version according to a preset prediction model to obtain an object prediction value; and a calculating unit, used for calculating the target label of the aligned object from the one-hot pseudo-label and the object prediction value through a preset standard cross-entropy loss matching calculation mode.
- 7. Data labeling equipment, characterized by comprising a memory, a processor, and a program stored on the memory for implementing the data labeling method, wherein the memory is used for storing the program for implementing the data labeling method, and the processor is configured to execute the program for implementing the data labeling method to implement the steps of the data labeling method according to any one of claims 1 to 5.
- 8. A storage medium having stored thereon a program for implementing a data labeling method, the program being executed by a processor to implement the steps of the data labeling method according to any one of claims 1 to 5.
Description
Data labeling method, device, equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular to a data labeling method, device, equipment, and storage medium.
Background
The widespread application of artificial intelligence products requires a large amount of labeled sample data such as pictures, texts, and sounds. A machine learning model is trained on the sample data, and the trained model is then used to predict, for example, the type of the data to be processed. However, existing sample data usually needs to be labeled manually, and the labor cost is high.
Disclosure of Invention
The main purpose of the application is to provide a data labeling method, device, equipment, and storage medium that solve the technical problem in the prior art that sample data must be labeled manually at high cost.
To achieve the above purpose, the application provides a data labeling method, including: when a labeling instruction is detected, determining an object to be labeled, and performing format alignment on the object to be labeled to obtain an aligned object; determining a target labeling model of the aligned object; and performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object.
Optionally, the step of performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object includes: determining a weakly augmented version of the aligned object and a strongly augmented version of the aligned object; labeling the weakly augmented version based on the target labeling model to obtain a pseudo-label of the aligned object; comparing the pseudo-label with a preset threshold to determine a one-hot pseudo-label of the aligned object; performing object prediction on the strongly augmented version according to a preset prediction model to obtain an object prediction value; and calculating the target label of the aligned object from the one-hot pseudo-label and the object prediction value through a preset standard cross-entropy loss matching calculation mode.
Optionally, the step of determining the target labeling model of the aligned object includes: determining a target type of the aligned object; determining whether a target object whose type is consistent with the target type exists in a preset object set; and if such a target object exists, determining a labeling model of the target object, and taking the labeling model of the target object as the target labeling model.
Optionally, after the step of determining whether a target object whose type is consistent with the target type exists in the preset object set, the method includes: if no target object whose type is consistent with the target type exists, displaying a label set in a preset label form; receiving an initial labeling result obtained by labeling part of the aligned object based on the label set; and determining the target labeling model of the aligned object based on the initial labeling result and a preset initial labeling model.
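The label propagation correction steps described above (a pseudo-label from the weakly augmented version, a threshold comparison that yields a one-hot pseudo-label, rendered in some translations as a "thermal" pseudo tag, and a cross-entropy comparison against the prediction on the strongly augmented version) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not specified by the application: the function names, the use of softmax over raw model scores, the threshold value 0.95, the rule that the pseudo-label is kept when its confidence is at or above the threshold, and the choice to return the cross-entropy value as an agreement score alongside the label.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot_pseudo_label(weak_probs, threshold=0.95):
    """Return a one-hot list for the most confident class, or None when the
    confidence is below the threshold (threshold value is an assumption)."""
    confidence = max(weak_probs)
    if confidence < threshold:
        return None
    best = weak_probs.index(confidence)
    return [1.0 if i == best else 0.0 for i in range(len(weak_probs))]


def cross_entropy(one_hot, strong_probs, eps=1e-12):
    """Standard cross-entropy between a one-hot target and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, strong_probs))


def propagate_label(weak_logits, strong_logits, threshold=0.95):
    """Hypothetical helper: derive a label from the weak view and measure how
    well the strong-view prediction agrees with it.

    Returns (class_index, loss), or (None, None) if the pseudo-label is
    not confident enough to be kept.
    """
    pseudo = one_hot_pseudo_label(softmax(weak_logits), threshold)
    if pseudo is None:
        return None, None
    loss = cross_entropy(pseudo, softmax(strong_logits))
    return pseudo.index(1.0), loss


if __name__ == "__main__":
    # Scores that the labeling model and the prediction model might produce
    # for the weak and strong views of one aligned object (made-up values).
    label, loss = propagate_label([10.0, 0.0, 0.0], [5.0, 0.0, 0.0])
    print(label, loss)
```

In this sketch a low cross-entropy value means the prediction on the strongly augmented version agrees with the thresholded pseudo-label; how the application turns that value into the final target label is not detailed in the text above, so the sketch simply reports both.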
Optionally, the step of performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain a target label of the aligned object includes: performing label propagation correction on the aligned object according to the target labeling model and the preset label propagation correction mode to obtain a corrected label; determining whether an audit-pass instruction for the corrected label is received; and if the audit-pass instruction for the corrected label is received, taking the corrected label as the target label of the aligned object.
Optionally, after the step of performing label propagation correction on the aligned object according to the target labeling model and a preset label propagation correction mode to obtain the target label of the aligned object, the method includes: iteratively training a preset basic model to be trained based on the labeled aligned objects to obtain a training result model; predicting data to be predicted based on the training result model to obtain a prediction label of the data to be predicted; and iteratively updating the target labeling model according to the prediction label of the data to be predicted.
Optionally, the object to be labeled includes an image to be processed, sound data to be processed, and text data to be processed.
The application also provides a data labeling device, which comprises: The first determining module is used for det