CN-116310707-B - Sample selection network construction method, sample selection method, training method, electronic device, and storage medium


Abstract

A sample selection network that fuses target feature information and prediction information comprises a projection layer, a merging unit, and a score calculation unit. The sample selection network estimates a sample score for each target from the feature information extracted by a detection network together with the predicted regression information and classification information. The sample selection method is based on the sample score: the number n_i of positive samples needed by each target is calculated from the sample scores of its candidate samples; the candidate samples of each target are then sorted by score from high to low, the n_i highest-scoring samples are selected as positive samples, and the rest are taken as negative samples. The training method first obtains the sample scores corresponding to each target through the sample selection network, then selects positive and negative samples through the sample selection method, calculates the sample selection loss, the classification loss, and the regression loss, and uses the sample scores to calculate sample weights that serve as weight factors of the classification loss and the regression loss. The sample selection network is applied to a single-stage target detector and helps improve the performance of the target detection network.
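
The ranking step described above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `select_samples` and the example scores are hypothetical, and the positive count n_i is simply passed in as `n_pos`, because the rule that derives n_i from the candidate scores is not reproduced here.

```python
from typing import List, Tuple


def select_samples(scores: List[float], n_pos: int) -> Tuple[List[int], List[int]]:
    """Split one target's candidate samples into positives and negatives.

    scores: sample scores of the candidate samples of a single target.
    n_pos:  number of positives n_i for this target (assumed to be given).
    Returns (positive_indices, negative_indices) into `scores`.
    """
    # Clamp n_pos so it never exceeds the number of candidates.
    n_pos = max(0, min(n_pos, len(scores)))
    # Rank candidate indices by score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # The top n_pos candidates are positives; everything else is negative.
    return order[:n_pos], order[n_pos:]


# Hypothetical scores for five candidate samples of one target.
candidate_scores = [0.12, 0.87, 0.45, 0.91, 0.30]
positives, negatives = select_samples(candidate_scores, n_pos=2)
print(positives, negatives)
```

Returning indices rather than scores keeps the result usable for looking up the matching anchors or feature locations in the detector's outputs.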

Inventors

YANG MIN
ZHANG YI
CHANG JIANG
GU XIAOLIN
SHI GAOCHAO
LIU KE

Assignees

北京轩宇空间科技有限公司

Dates

Publication Date: 20260508
Application Date: 20230103

Claims (8)

1. A sample selection network construction method, characterized in that the sample selection network is applied to a single-stage target detector, wherein a target detection network of the single-stage target detector comprises a backbone network, a feature pyramid, a head network, and the sample selection network, and the head network comprises a classification branch and a regression branch; the backbone network is used for extracting features of an input original image and outputting multi-layer features; the feature pyramid is used for processing the multi-layer features layer by layer and outputting feature information f; the classification branch is used for predicting classification features according to the feature information f and outputting classification scores of targets; the regression branch is used for predicting regression features according to the feature information f and outputting regression information of the targets; the sample selection network is used for calculating the sample score corresponding to each target according to the feature information f output by the feature pyramid, the classification score output by the classification branch, and the regression information output by the regression branch; the sample selection network construction method comprises constructing: a projection layer for processing the feature information f output by the target detection network to obtain a feature map carrying sample information; a merging unit for merging the regression information and classification scores predicted by the target detection network with the feature map output by the projection layer into a new feature; and a score calculation unit for processing the new feature and calculating the sample score corresponding to each target.
2. The method according to claim 1, wherein the projection layer includes, arranged in sequence, a first convolution layer using a 3×3 convolution kernel, a first activation layer, and a second convolution layer using a 1×1 convolution kernel; the feature information f output by the target detection network is denoted (f_1, f_2, …, f_5), and the feature information carrying sample information is obtained after f passes sequentially through the first convolution layer, the first activation layer, and the second convolution layer.
3. The sample selection network construction method according to claim 2, wherein the merging unit is configured to splice the features into an N × K × C vector, wherein L represents a feature layer, H represents the height of a feature map, W represents the width of the feature map, N represents the number of targets, and C represents the number of feature channels; to calculate regression scores from the regression information predicted by the target detection network and the real bounding box of the target; and to combine the regression scores and the classification information with the vector, by addition or by combining arrays, to obtain the merged feature.
4. The method according to claim 1, wherein the score calculation unit includes a first connection layer, a second activation layer, and a second connection layer, and the feature obtained by the merging unit passes sequentially through the first connection layer, the second activation layer, and the second connection layer to yield the sample scores used for subsequent sample selection and network training.
5. A sample selection method based on a sample score, applied to a single-stage target detector, comprising the steps of: S11, obtaining the sample scores of the samples corresponding to each target by using a sample selection network constructed by the sample selection network construction method according to any one of claims 1-4; S12, screening candidate samples of each target according to the real bounding box of the target; S13, calculating the number n_i of positive samples needed by each target according to the sample scores of the candidate samples; S14, sorting the candidate samples of each target by sample score from high to low, and selecting the n_i samples with the highest sample scores as positive samples and the rest as negative samples.
6. A training method for simultaneously training the sample selection network constructed by the sample selection network construction method according to any one of claims 1 to 4 and a single-stage target detection network including the sample selection network, the training method comprising the steps of: S21, inputting the feature information extracted by the target detection network, the predicted regression information, and the classification information into the sample selection network to obtain the sample score corresponding to each target; S22, according to the sample scores obtained in step S21, selecting suitable positive samples and negative samples for each target by using the sample score-based sample selection method according to claim 5; S23, randomly extracting n_neg negative samples from the negative samples obtained in step S22, and calculating the sample selection loss function from them together with all the positive samples obtained in step S22; S24, calculating the training weight of each positive sample according to the sample scores of the positive samples from step S22, and using the training weight as a weight factor of the classification loss and the regression loss during training; S25, training the sample selection network and the single-stage target detection network according to the above steps.
7. An electronic device comprising at least one processor and a memory, wherein the memory stores computer-executable instructions, and execution of the computer-executable instructions stored in the memory by the at least one processor causes the at least one processor to perform the sample score-based sample selection method of claim 5 or the training method of claim 6.
8. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, controls a device in which the storage medium is located to perform the sample score-based sample selection method of claim 5 or the training method of claim 6.
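
The merging unit and score calculation unit of claims 3 and 4 can be illustrated for a single candidate sample with a minimal sketch. Several points are assumptions made only for illustration and are not stated in the claims: the regression score is taken to be the IoU between the predicted box and the real bounding box, merging is done by concatenation (the "combining arrays" option), the connection layers are read as fully connected layers, the activation is ReLU, and all names and weight values are hypothetical.

```python
from typing import Dict, List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2), assumed corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def linear(x: List[float], weight: List[List[float]], bias: List[float]) -> List[float]:
    """Connection layer, assumed fully connected: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x: List[float]) -> List[float]:
    """Activation layer (ReLU is an assumption; the claims name no function)."""
    return [max(0.0, v) for v in x]


def sample_score(feature: List[float], cls_score: float,
                 pred_box: Box, gt_box: Box, params: Dict[str, list]) -> float:
    # Merging unit: regression score (assumed IoU) and classification score
    # are appended to the projected feature vector of this sample.
    merged = list(feature) + [iou(pred_box, gt_box), cls_score]
    # Score unit: connection layer -> activation -> connection layer.
    hidden = relu(linear(merged, params["w1"], params["b1"]))
    return linear(hidden, params["w2"], params["b2"])[0]


# Hypothetical weights for a 2-channel feature (merged length 4, hidden size 3).
params = {
    "w1": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],
    "b1": [0.0, 0.0, 0.0],
    "w2": [[1.0, 1.0, 1.0]],
    "b2": [0.0],
}
score = sample_score([0.5, -1.0], 0.8, (0, 0, 2, 2), (0, 0, 2, 2), params)
print(score)
```

In a trained network the weights would be learned jointly with the detector, as in the training method of claim 6; the fixed values here only make the data flow easy to follow.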

Description

Sample selection network construction method, sample selection method, training method, electronic device, and storage medium

Technical Field

The application belongs to the technical field of computer vision, and particularly relates to a sample selection network that fuses target feature information and prediction information.

Background

Sample selection is an important component of a target detection algorithm: it selects the appropriate positive and negative samples for network training. Currently, sample selection methods for target detection algorithms fall mainly into two classes: methods based on prior information and methods based on prediction information.

Sample selection methods based on prior information mainly comprise methods based on an IoU threshold and methods based on the center point distance. The IoU-based method first calculates the IoU score between each manually set anchor and the real bounding box of the target, selects samples whose score is larger than a threshold as positive samples, and treats the rest as negative samples. The method based on the center point distance takes the samples falling inside the real bounding box of the target as positive samples, and the rest as negative samples. Both of these prior-information-based methods consider only the influence of regression information on network training. Target detection, however, is a multi-task mechanism of classification and regression, so samples selected while ignoring classification information are often not optimal.

Sample selection methods based on prediction information include the minimum-cost sample selection method proposed by OneNet and the sample selection method based on optimal transport theory proposed by OTA.
The minimum-cost sample selection method first calculates a classification cost and a localization cost from the prediction information of the network, then uses the weighted combination of the two costs as the final cost matrix, selects for each target the sample with the minimum cost as the positive sample, and treats the rest as negative samples. The sample selection method based on optimal transport theory performs global label assignment by introducing optimal transport theory, and calculates the cost matrix from the classification information and regression information predicted by the network. Both of these prediction-information-based methods consider classification information and regression information when constructing the sample cost matrix, so the selected samples are more representative. However, the rule for calculating the sample cost is still set manually, so these methods lack adaptability to changes in the network, and network training is prone to overfitting.

Disclosure of Invention

In order to overcome the above shortcomings of the prior art, an object of the present application is to provide a sample selection network that fuses target feature information and prediction information and is applied to a single-stage target detection network. The sample selection network takes as input the feature information learned by the detection network combined with the predicted regression information and classification information, and perceives in real time the changes of the target features extracted by the detection network during training, so as to provide suitable samples for network training and improve detection performance.

It is another object of the present application to provide a sample selection method based on sample scores, which calculates a training weight for each sample; the weights act on the calculation of both the classification loss and the regression loss, giving samples different priorities in training.
It is still another object of the present application to provide a training method that trains the sample selection network and the target detection network simultaneously, establishes an evaluation criterion for the selected samples that takes into account the distinction between positive and negative samples, the relative importance among positive samples, and the like, and filters out noisy samples to improve the stability of sample selection.

In order to achieve the above objects, the present invention adopts the following technical solution: a sample selection network that fuses target feature information and prediction information, for use with a single-stage target detector, the sample selection network comprising: a projection layer for processing the feature information f output by the detection network to obtain a feature map carrying sample information; a merging unit for merging the regression information and classification scores predicted by the detection network with the feature map output by the projection layer into a new feature; a score calculation unit for the feature An