CN-121999199-A - Training method of target detection model, traffic target detection method and device, storage medium, program product and computer equipment


Abstract

The application discloses a training method of a target detection model, a traffic target detection method and device, a storage medium, a program product and computer equipment. The training method comprises: obtaining a standard data set; inputting a sample image into a model to be trained that comprises a classification network and a regression network, to obtain a sample target detection result output by the model to be trained; and iteratively training the model to be trained based on the difference between a category label and a classification prediction result, the difference between a regression task label and a regression task prediction result, and the difference between an angle label and a target angle prediction result, the model to be trained being used as the target detection model after the iterative training is completed. Using a rotated target frame can thereby improve the accuracy of target detection, and splitting regression and classification in the model into two independent subtasks that are predicted separately can improve the overall target detection performance of the model.

Inventors

SONG KUN
LIU JINING
HUANG XINGWEI
HUANG CHAOPING
WEN SHANGDONG
WANG YANG
LIN JIAO
CHEN JINXUAN
XU HAO

Assignees

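China Mobile Communications Group Guangdong Co., Ltd. (中国移动通信集团广东有限公司)
China Mobile Bay Area (Guangdong) Innovation Research Institute Co., Ltd. (中移湾区(广东)创新研究院有限公司)
China Mobile Communications Group Co., Ltd. (中国移动通信集团有限公司)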

Dates

Publication Date: 2026-05-08
Application Date: 2026-01-06

Claims (10)

1. A method of training a target detection model, comprising: obtaining a standard data set, wherein the standard data set comprises a sample image and a target label, and the target label comprises a category label, a regression task label and an angle label corresponding to a sample target in the sample image; inputting the sample image into a model to be trained comprising a classification network and a regression network, and obtaining a sample target detection result output by the model to be trained, wherein the sample target detection result comprises a target angle prediction result corresponding to the sample target, a classification prediction result corresponding to the sample target generated by the classification network, and a regression task prediction result corresponding to the sample target generated by the regression network, and wherein the target angle prediction result is used to indicate the rotation angle of a target detection frame of the sample target; and performing iterative training on the model to be trained based on the difference between the category label and the classification prediction result, the difference between the regression task label and the regression task prediction result, and the difference between the angle label and the target angle prediction result, wherein the model to be trained is used as the target detection model after the iterative training is completed.
2. The training method of claim 1, wherein the classification network comprises a first convolution layer for extracting features from the sample image; the regression network comprises a second convolution layer for extracting features from the sample image; and the first convolution layer and the second convolution layer are the same convolution layer.
3. The training method of claim 1, wherein the classification network comprises a first convolution layer for extracting features from the sample image; the regression network comprises a second convolution layer for extracting features from the sample image; and the first convolution layer and the second convolution layer are different convolution layers, a convolution kernel size of the first convolution layer being greater than a convolution kernel size of the second convolution layer.
4. The training method according to claim 1, wherein the iteratively training the model to be trained based on the difference between the category label and the classification prediction result, the difference between the regression task label and the regression task prediction result, and the difference between the angle label and the target angle prediction result comprises: determining a first loss based on the difference between the category label and the classification prediction result; determining a second loss based on the difference between the regression task label and the regression task prediction result; determining a third loss based on the difference between the angle label and the target angle prediction result; performing a weighted calculation based on the first loss, the second loss and the third loss to obtain a comprehensive loss; and iteratively training the model to be trained based on the comprehensive loss.
5. The training method of claim 4, wherein the determining a third loss based on the difference between the angle label and the target angle prediction result comprises: calculating the third loss by using an angle regression loss function, based on the difference between the calibrated rotation angle indicated by the angle label and the rotation angle indicated by the target angle prediction result.
6. A traffic target detection method, comprising: acquiring a traffic image to be detected; and inputting the traffic image into a target detection model to obtain a target detection result which is output by the target detection model and corresponds to the traffic image, wherein the target detection result comprises a traffic target angle prediction result for indicating the heading angle of a traffic target in the traffic image, and the target detection model is trained in advance according to the training method of any one of claims 1-5.
7. A traffic target detection device, comprising: a traffic image acquisition module, configured to acquire a traffic image to be detected; and a detection module, configured to input the traffic image into a target detection model to obtain a target detection result which is output by the target detection model and corresponds to the traffic image, wherein the target detection result comprises a traffic target angle prediction result for indicating the heading angle of a traffic target in the traffic image, and the target detection model is trained in advance according to the training method of any one of claims 1-5.
8. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the method of any of claims 1-6.
9. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any of claims 1-6.
10. A computer device comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the method of any of claims 1-6 when the computer program is executed.
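
The weighted combination of three losses recited in claims 4 and 5 can be illustrated with a minimal sketch. The claims do not name the individual loss functions, the weights, or the angle convention, so everything concrete here is an assumption made for illustration: cross-entropy for the first loss, smooth L1 for the second loss, a smooth L1 on the wrapped angle difference for the third loss, an angle period of pi radians, and the function names themselves.

```python
import math

def cross_entropy(class_probs, category):
    """First loss (assumed form): negative log-probability of the labelled category."""
    return -math.log(max(class_probs[category], 1e-12))

def smooth_l1(pred, label, beta=1.0):
    """Second loss (assumed form): mean smooth L1 distance between two sequences."""
    total = 0.0
    for p, t in zip(pred, label):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

def angle_loss(angle_pred, angle_label, period=math.pi):
    """Third loss (assumed form): smooth L1 on the angle difference after
    wrapping it into [-period/2, period/2), so that angles on either side of
    the wrap-around point are treated as close."""
    d = (angle_pred - angle_label + period / 2) % period - period / 2
    return smooth_l1([d], [0.0])

def composite_loss(class_probs, category, box_pred, box_label,
                   angle_pred, angle_label, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three losses; the weights are hypothetical."""
    w_cls, w_reg, w_ang = weights
    return (w_cls * cross_entropy(class_probs, category)
            + w_reg * smooth_l1(box_pred, box_label)
            + w_ang * angle_loss(angle_pred, angle_label))
```

With a period of pi, a predicted angle of pi - 0.01 against a labelled angle of 0 is penalised as a 0.01 rad error rather than an error of almost pi. A heading angle that distinguishes front from rear would instead call for `period=2 * math.pi`.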

Description

Training method of target detection model, traffic target detection method and device, storage medium, program product and computer equipment

Technical Field

The present application relates to the field of artificial intelligence, and in particular to a training method for a target detection model, a traffic target detection method and apparatus, a storage medium, a program product, and a computer device.

Background

In the related art, the detection frame obtained by target detection is generally a horizontal target frame. For example, referring to FIG. 1, each vehicle in FIG. 1 is a target, and the detection frame corresponding to each vehicle is a horizontal target frame.

However, some targets are directional, that is, they may be inclined at an angle relative to the horizontal. The corresponding horizontal target frame then has to be enlarged so that it can completely enclose the inclined target. Because a horizontal target frame also encloses background information other than the target itself, the enlarged frame encloses even more background, and that enclosed background is also treated as part of the detected target, which reduces the accuracy of target detection.

Disclosure of Invention

In order to solve the above technical problem, the embodiments of the present application provide a training method of a target detection model, a traffic target detection method and device, a storage medium, a program product and computer equipment, which can improve the accuracy of target detection.
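
As a worked illustration of the enlargement problem described in the Background, the following sketch computes how much of a horizontal target frame is background when it has to enclose a tilted rectangular target. It assumes the target is an ideal rectangle rotated about its centre. The function names and the 4.5 x 1.8 vehicle footprint are hypothetical and do not come from the application.

```python
import math

def enclosing_horizontal_box(width, height, angle_deg):
    """Return (W, H) of the smallest horizontal frame that encloses a
    width x height rectangle rotated by angle_deg about its centre."""
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return width * c + height * s, width * s + height * c

def background_fraction(width, height, angle_deg):
    """Fraction of the enclosing horizontal frame not covered by the target."""
    box_w, box_h = enclosing_horizontal_box(width, height, angle_deg)
    return 1.0 - (width * height) / (box_w * box_h)

if __name__ == "__main__":
    # Hypothetical 4.5 x 1.8 vehicle footprint at several tilt angles.
    for angle in (0, 15, 30, 45):
        print(angle, round(background_fraction(4.5, 1.8, angle), 3))
```

Under these assumptions the background share is zero for an untilted target and rises to roughly 59% at a 45 degree tilt, which is the effect a rotated target frame is meant to avoid.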
In a first aspect, an embodiment of the present application provides a training method for a target detection model, including:

obtaining a standard data set, wherein the standard data set comprises a sample image and a target label, and the target label comprises a category label, a regression task label and an angle label corresponding to a sample target in the sample image;

inputting the sample image into a model to be trained comprising a classification network and a regression network, and obtaining a sample target detection result output by the model to be trained, wherein the sample target detection result comprises a target angle prediction result corresponding to the sample target, a classification prediction result corresponding to the sample target generated by the classification network, and a regression task prediction result corresponding to the sample target generated by the regression network, and wherein the target angle prediction result is used to indicate the rotation angle of a target detection frame of the sample target; and

performing iterative training on the model to be trained based on the difference between the category label and the classification prediction result, the difference between the regression task label and the regression task prediction result, and the difference between the angle label and the target angle prediction result, wherein the model to be trained is used as the target detection model after the iterative training is completed.

Optionally, the classification network includes a first convolution layer for extracting features from the sample image; the regression network includes a second convolution layer for extracting features from the sample image; and the first convolution layer and the second convolution layer are the same convolution layer.
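
The label structure of the first aspect, and the way an angle turns a regressed box into a rotated target detection frame, can be sketched as follows. The text does not specify how the regression task label is laid out, so the `(cx, cy, w, h)` layout, the radian unit, the counter-clockwise rotation convention, and all class and function names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TargetLabel:
    category: int                            # category label
    box: Tuple[float, float, float, float]   # regression task label, assumed (cx, cy, w, h)
    angle: float                             # angle label, assumed to be in radians

@dataclass
class Sample:
    image: List[List[float]]                 # sample image as a 2D grid of pixel values
    targets: List[TargetLabel]               # one target label per sample target

def rotated_corners(box, angle):
    """Return the four corners of the frame obtained by rotating the box
    (cx, cy, w, h) by `angle` radians about its centre."""
    cx, cy, w, h = box
    c, s = math.cos(angle), math.sin(angle)
    offsets = ((w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]

if __name__ == "__main__":
    label = TargetLabel(category=0, box=(10.0, 5.0, 4.0, 2.0), angle=math.radians(30))
    sample = Sample(image=[[0.0] * 4 for _ in range(4)], targets=[label])
    print(rotated_corners(sample.targets[0].box, sample.targets[0].angle))
```

With an angle of zero the corners reduce to those of an ordinary horizontal target frame, so the horizontal case is a special case of this representation.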
Optionally, the classification network includes a first convolution layer for extracting features from the sample image; the regression network includes a second convolution layer for extracting features from the sample image; and the first convolution layer and the second convolution layer are different convolution layers, a convolution kernel size of the first convolution layer being greater than a convolution kernel size of the second convolution layer.

Optionally, the iteratively training the model to be trained based on the difference between the category label and the classification prediction result, the difference between the regression task label and the regression task prediction result, and the difference between the angle label and the target angle prediction result includes: determining a first loss based on the difference between the category label and the classification prediction result; determining a second loss based on the difference between the regression task label and the regression task prediction result; determining a third loss based on the difference between the angle label and the target angle prediction result; performing a weighted calculation based on the first loss, the second loss and the third loss to obtain a comprehensive loss; and iteratively training the model to be trained based on the comprehensive loss.

Optionally, the determining a third loss based on a difference between