CN-115294336-B - Data labeling method, device and storage medium

Abstract

The application provides a data labeling method, a device and a storage medium. The method comprises: determining a semantic segmentation model according to street view data to be labeled; determining a self-supervised learning model according to the semantic segmentation model; training the self-supervised learning model using unlabeled street view data; transplanting the feature extractor of the trained self-supervised learning model into the semantic segmentation model; and labeling the street view data to be labeled using the transplanted semantic segmentation model. Because the self-supervised learning model is trained on unlabeled data and the semantic segmentation model obtains its feature extractor by transplantation, the amount of labeled data required to construct and train the semantic segmentation model is reduced while the accuracy of labeling with the semantic segmentation model is ensured, which improves labeling efficiency and reduces labeling cost.

Inventors

HUANG QIN

Assignees

零束科技有限公司

Dates

Publication Date: 20260512
Application Date: 20220812

Claims (6)

1. A method for labeling data, comprising: determining a selected semantic segmentation model according to the quantity, complexity, processing precision or type of street view data to be labeled, wherein the semantic segmentation model comprises a second front-end backbone network and a predictor, and the predictor is used for processing the corresponding feature map to generate a semantic segmentation prediction map; determining the structure of a first front-end backbone network according to the structure of the second front-end backbone network of the semantic segmentation model, and determining the structure of a self-supervised learning model according to the structure of the first front-end backbone network, so as to determine the self-supervised learning model, wherein the self-supervised learning model comprises the first front-end backbone network and a pre-task module, the first front-end backbone network comprises a feature extractor and is used for extracting features of an input image and generating a corresponding feature map, the pre-task module is used for performing unsupervised semantic extraction to supervise the training of the first front-end backbone network, the second front-end backbone network also comprises a feature extractor, and the feature extractors in the first front-end backbone network and the second front-end backbone network have the same encoding structure; training the self-supervised learning model using unlabeled street view data to obtain a trained self-supervised learning model; determining the weights of the feature extractor of the trained self-supervised learning model, replacing the second front-end backbone network of the semantic segmentation model by weight replacement according to the weights of the feature extractor, and taking the semantic segmentation model with the replaced second front-end backbone network as a transplanted semantic segmentation model; and labeling the street view data to be labeled using the transplanted semantic segmentation model.
2. The data labeling method according to claim 1, wherein the semantic segmentation model is the semantic segmentation model SETR; the second front-end backbone network of the semantic segmentation model SETR is a Transformer Layer feature extractor.
3. The data labeling method according to claim 1, wherein the self-supervised learning model is the self-supervised learning model MOCO; the first front-end backbone network of the self-supervised learning model MOCO comprises a feature extractor consisting of an encoder fq and an encoder fk.
4. The data labeling method according to claim 1, wherein before labeling the street view data to be labeled using the transplanted semantic segmentation model, the method further comprises: fine-tuning the transplanted semantic segmentation model.
5. A data labeling device, comprising: a first determining module, used for determining a selected semantic segmentation model according to the quantity, complexity, processing precision or type of street view data to be labeled, wherein the semantic segmentation model comprises a second front-end backbone network and a predictor, and the predictor is used for processing the corresponding feature map to generate a semantic segmentation prediction map; a second determining module, used for determining the structure of a first front-end backbone network according to the structure of the second front-end backbone network of the semantic segmentation model, and determining the structure of a self-supervised learning model according to the structure of the first front-end backbone network, so as to determine the self-supervised learning model, wherein the self-supervised learning model comprises the first front-end backbone network and a pre-task module, the first front-end backbone network comprises a feature extractor and is used for extracting features of an input image and generating a corresponding feature map, the pre-task module is used for performing unsupervised semantic extraction to supervise the training of the first front-end backbone network, the second front-end backbone network also comprises a feature extractor, and the feature extractors in the first front-end backbone network and the second front-end backbone network have the same encoding structure; a training module, used for training the self-supervised learning model using unlabeled street view data to obtain a trained self-supervised learning model; a transplanting module, used for determining the weights of the feature extractor of the trained self-supervised learning model, replacing the second front-end backbone network of the semantic segmentation model by weight replacement according to the weights of the feature extractor, and taking the semantic segmentation model with the replaced second front-end backbone network as a transplanted semantic segmentation model; and a labeling module, used for labeling the street view data to be labeled using the transplanted semantic segmentation model.
6. A storage medium having stored thereon a computer program which, when executed by a processor, implements the data labeling method according to any one of claims 1 to 4.
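
For illustration only, and not as part of the claims, the weight-replacement step recited in claim 1 can be sketched in Python as follows. The parameter-name prefixes `encoder_q.` and `backbone.`, the function names `shape_of` and `transplant_backbone`, and the use of plain dictionaries of nested lists in place of a deep learning framework's parameter store are all assumptions made for this sketch; the claims do not specify any of them.

```python
from copy import deepcopy


def shape_of(value):
    """Return the shape of a nested list of numbers as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def transplant_backbone(ssl_state, seg_state,
                        src_prefix="encoder_q.", dst_prefix="backbone."):
    """Copy feature-extractor weights of a trained self-supervised model
    into the backbone of a segmentation model.

    Both states are plain dicts mapping parameter names to nested lists
    (a hypothetical stand-in for a framework's state dictionary).
    Predictor weights in seg_state are left untouched.
    """
    new_state = deepcopy(seg_state)
    replaced = []
    for name, weight in ssl_state.items():
        if not name.startswith(src_prefix):
            continue  # skip pre-task module parameters
        target = dst_prefix + name[len(src_prefix):]
        if target not in new_state:
            raise KeyError(f"segmentation model has no parameter {target!r}")
        # The two backbones must share the same structure for the copy to be valid.
        if shape_of(new_state[target]) != shape_of(weight):
            raise ValueError(f"shape mismatch for {target!r}")
        new_state[target] = deepcopy(weight)
        replaced.append(target)
    return new_state, replaced


# Toy states with made-up values, for demonstration only.
ssl_state = {
    "encoder_q.layer0.weight": [[0.1, 0.2], [0.3, 0.4]],
    "encoder_q.layer0.bias": [0.5, 0.6],
    "pretext_head.weight": [[1.0, 1.0]],
}
seg_state = {
    "backbone.layer0.weight": [[0.0, 0.0], [0.0, 0.0]],
    "backbone.layer0.bias": [0.0, 0.0],
    "predictor.weight": [[9.0, 9.0]],
}
transplanted, replaced = transplant_backbone(ssl_state, seg_state)
```

The shape check reflects the requirement that the feature extractors of the two backbone networks have the same encoding structure; the returned state would then be the starting point for the optional fine-tuning of claim 4.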

Description

Data labeling method, device and storage medium

Technical Field

The present application relates to the field of data processing technologies, and in particular to a data labeling method, apparatus, and related devices.

Background

With the development of autonomous driving technology, the requirements placed on street view perception algorithms keep increasing. Accordingly, designing and optimizing such algorithms often requires collecting and labeling a large amount of measured street view data.

Traditional data labeling either relies on a large amount of manual labeling, or uses a neural network model to perform semantic segmentation on the collected street view data and then label it. The former offers better labeling accuracy, but it is time-consuming, inefficient and costly, and because of the manpower it consumes it can hardly meet the need to segment and label massive data. The latter is fast, but to ensure the precision of semantic segmentation and labeling it relies heavily on the labeled sample data or training data, covering the various label types, used to construct and train the neural network model; when the amount of labeled sample data or training data is insufficient, the precision of labeling through semantic segmentation is difficult to guarantee.

Disclosure of Invention

In view of the above, embodiments of the present application provide a data labeling method, device and storage medium, so as to at least partially solve the above-mentioned problems.
In a first aspect, an embodiment of the present application provides a data labeling method, including: determining a semantic segmentation model according to street view data to be labeled; determining a self-supervised learning model according to the determined semantic segmentation model; training the self-supervised learning model using unlabeled street view data to obtain a trained self-supervised learning model; transplanting the feature extractor of the trained self-supervised learning model into the semantic segmentation model to obtain a transplanted semantic segmentation model; and labeling the street view data to be labeled using the transplanted semantic segmentation model.

Optionally, in one embodiment of the present application, the self-supervised learning model includes a first front-end backbone network and a pre-task module. The feature extractor is contained in the first front-end backbone network, and the first front-end backbone network is used for extracting picture information and generating a corresponding feature map; the pre-task module is used for unsupervised semantic extraction to supervise the training of the first front-end backbone network.

Optionally, in one embodiment of the present application, the semantic segmentation model includes a second front-end backbone network and a predictor. The predictor is used for processing the corresponding feature map and generating a semantic segmentation prediction map.

Optionally, in an embodiment of the present application, determining the self-supervised learning model according to the determined semantic segmentation model includes: determining the structure of the first front-end backbone network according to the structure of the second front-end backbone network of the semantic segmentation model, and determining the structure of the self-supervised learning model according to the structure of the first front-end backbone network.
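
As a minimal sketch of the structure-determination step described above, the following Python fragment derives a self-supervised model specification whose backbone is identical to that of a chosen segmentation model. The class names, field names, the `"contrastive"` pre-task label and the numeric values are hypothetical and chosen only for illustration; the application does not prescribe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackboneSpec:
    kind: str        # e.g. "transformer"
    num_layers: int
    hidden_dim: int


@dataclass(frozen=True)
class SegmentationSpec:
    backbone: BackboneSpec   # second front-end backbone network
    predictor: str           # turns the feature map into a prediction map


@dataclass(frozen=True)
class SelfSupervisedSpec:
    backbone: BackboneSpec   # first front-end backbone network
    pre_task: str            # pre-task module that supervises the backbone


def derive_self_supervised_spec(seg, pre_task="contrastive"):
    """Reuse the segmentation backbone structure for the self-supervised model."""
    return SelfSupervisedSpec(backbone=seg.backbone, pre_task=pre_task)


# Hypothetical configuration values, for demonstration only.
seg_spec = SegmentationSpec(BackboneSpec("transformer", 24, 1024), predictor="decoder")
ssl_spec = derive_self_supervised_spec(seg_spec)
```

Sharing one backbone specification between the two models is what makes a later weight transplant well defined, since every parameter of the first backbone then has a counterpart of the same shape in the second.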
Optionally, in an embodiment of the present application, transplanting the feature extractor of the trained self-supervised learning model into the semantic segmentation model to obtain a transplanted semantic segmentation model includes: determining the weights of the feature extractor in the first front-end backbone network of the trained self-supervised learning model; and replacing the second front-end backbone network of the semantic segmentation model according to the weights of the feature extractor in the first front-end backbone network, and taking the semantic segmentation model with the replaced second front-end backbone network as the transplanted semantic segmentation model.

Optionally, in an embodiment of the present application, the semantic segmentation model is the semantic segmentation model SETR; the second front-end backbone network of the semantic segmentation model SETR is a Transformer Layer feature extractor.

Optionally, in an embodiment of the present application, the self-supervised learning model is the self-supervised learning model MOCO; the first front-end backbone network of the self-supervised learning model MOCO comprises a feature extractor consisting of an encoder fq and an encoder fk.

Optionally, in on