
CN-122024183-A - Deep learning-based water surface ship target re-identification method

CN 122024183 A

Abstract

The invention discloses a deep-learning-based water-surface ship target re-identification method comprising the following steps. S1: obtain a training data set of water-surface ship images and apply fog synthesis to a proportion of the images in the training data set. S2: train a YOLO target detection model on the fog-augmented training data set. S3: train a view-assisted re-identification network with a defogging module on the fog-augmented training data set. S4: detect with the trained YOLO model to obtain prediction boxes, and de-duplicate the prediction boxes with a non-maximum suppression method based on pseudo-overlap-ratio joint confidence. S5: crop each input frame according to the prediction boxes, input the cropped images into the view-assisted re-identification network to extract ship features and view features, and identify ship identities with a view-assisted adaptive query expansion method to obtain the re-identification result. The method markedly improves the accuracy of ship target re-identification and is more robust in foggy environments.

Inventors

  • LIU JUN
  • MO QIANQIAN
  • GUAN JIAN
  • YANG QILIN
  • PENG DONGLIANG
  • GU YU
  • CHEN HUAJIE

Assignees

  • Hangzhou Dianzi University (杭州电子科技大学)

Dates

Publication Date
2026-05-12
Application Date
2025-09-22

Claims (9)

  1. A deep-learning-based water-surface ship target re-identification method, characterized by comprising the following steps: S1, acquiring a training data set of water-surface ship images, and applying fog synthesis to a proportion of the images in the training data set; S2, training a YOLO target detection model on the fog-augmented training data set; S3, training a view-assisted re-identification network with a defogging module on the fog-augmented training data set; S4, detecting an input image to be identified with the trained YOLO target detection model to obtain prediction boxes, and de-duplicating the prediction boxes with a non-maximum suppression method based on pseudo-overlap-ratio joint confidence; S5, cropping each input frame according to the prediction boxes, inputting the cropped images into the view-assisted re-identification network to extract ship features and view features, and identifying the ship identity with a view-assisted adaptive query expansion method to obtain a re-identification result.
  2. The deep-learning-based water-surface ship target re-identification method according to claim 1, wherein step S3 comprises the following sub-steps: S3-1, labeling the fogged images with the view information and ship ID information of each image, and grouping the labeled ship training data set images by ID category and view category; S3-2, extracting features of the labeled training data set by ID category and by view category through a feature extraction module in the view-assisted re-identification network; S3-3, up-sampling the extracted features to obtain a reconstructed intermediate image; S3-4, inputting the intermediate image together with the original image into a subsequent pyramid enhancement module in the view-assisted re-identification network to obtain a defogged image; S3-5, comparing the defogged image with the original clear image to obtain a defogging loss, the defogging loss being calculated from the 2-norm of the difference between the original clear image and the defogged image: L_defog = (1/Q) Σ_{i=1}^{Q} ||J_i* − J_i||_2, where Q is the number of images per training batch, J_i is the defogged image of the i-th image in the batch, and J_i* is the original clear image of the i-th image in the batch.
  3. The deep-learning-based water-surface ship target re-identification method according to claim 2, wherein the feature extraction module comprises two network branches, each employing a ResNet-101 residual network.
  4. The deep-learning-based water-surface ship target re-identification method according to claim 3, wherein the two ResNet-101 residual networks perform feature extraction on the labeled training data set by ID category and by view category, respectively.
  5. The deep-learning-based water-surface ship target re-identification method according to claim 2, wherein in step S3-2 the two network branches obtain re-identification losses through category classification, the re-identification losses being calculated with a cross-entropy loss function and a triplet loss function.
  6. The deep-learning-based water-surface ship target re-identification method according to claim 4, wherein the features up-sampled in step S3-3 are obtained from the first residual block of the ResNet-101 residual network.
  7. The deep-learning-based water-surface ship target re-identification method according to claim 4, wherein in step S4 the prediction boxes are de-duplicated by the non-maximum suppression method based on pseudo-overlap-ratio joint confidence as follows. The normalized area of prediction box i is defined as: s_i = a_i / Σ_{j=1}^{n} a_j, where n is the number of boxes and a_j is the area of box j. The pseudo-overlap ratio r_i of each box with respect to the real box is defined as: r_i = p_i s_i. The pseudo-overlap ratio r and the confidence p are weighted to give the corrected confidence of the box, i.e. the pseudo-overlap-ratio joint confidence P: P_i = ω_1 p_i + ω_2 r_i, where p_i is the confidence and ω_1 and ω_2 are weight coefficients representing the relative importance of the original confidence and the pseudo-overlap ratio, respectively. Non-maximum suppression is then performed with the corrected confidence to de-duplicate the prediction boxes.
  8. The deep-learning-based water-surface ship target re-identification method according to claim 7, wherein in step S5 the view-assisted adaptive query expansion method is specifically as follows: S5-1, the image passes through the two branch networks of the view-assisted re-identification network to obtain a view feature f_O and a ship feature f_I, respectively; S5-2, on the basis of AQE, querying in the direction assisted by the view feature, performing feature fusion by a feature-similarity-based method, and performing image retrieval on the fused data to obtain a re-identification result.
  9. The deep-learning-based water-surface ship target re-identification method according to claim 8, wherein the specific method of step S5-2 is as follows. Let X = {x_1, x_2, ..., x_n} denote the corpus formed by the gallery set and the query set, where n is the total number of images in the gallery set and the query set. Cosine distance is used to represent the similarity between image features: S^I is the similarity matrix between the ship features of the images and S^O is the similarity matrix between the view features of the images, where S^I_{ij} = f_I(x_i)·f_I(x_j) / (||f_I(x_i)|| ||f_I(x_j)||) and S^O_{ij} = f_O(x_i)·f_O(x_j) / (||f_O(x_i)|| ||f_O(x_j)||) are the cosine similarities of the ship features and of the view features of image i and image j in X. According to the similarity matrices S^I and S^O, for an image x, the image y (y ≠ x) with the highest similarity to the ship feature f_I(x) of image x but a dissimilar view is searched for. View similarity is indicated by Δ_ij, where Δ_ij = 1 means that image i is similar to image j in view and Δ_ij = 0 otherwise. The query for image y can then be expressed as: y = argmax_{y' ∈ X_x} S^I_{x y'}, where X_x is the set of images whose views are dissimilar to that of image x, defined as: X_x = { y' ∈ X | Δ_{x y'} = 0, y' ≠ x }. After the pair of images y and x to be fused is found, the original features f_I(x) and f_I(y) are weighted and fused according to the ship-feature similarity to obtain a new feature f_I'(x): f_I'(x) = λ f_I(x) + (1 − λ) f_I(y), where λ is a preset fusion coefficient. The corpus is traversed to ensure the uniformity of the fused ship features f_I', and image retrieval is performed on the fused data to obtain the re-identification result.
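The defogging loss of claim 2 can be sketched in a few lines of numpy. This is a minimal illustration, not the patent's implementation: the batch layout (Q x H x W x C) and the averaging over the batch are assumptions, since the claim states the formula only in words.

```python
import numpy as np

def defog_loss(defogged: np.ndarray, clear: np.ndarray) -> float:
    """Defogging loss: (1/Q) * sum over the batch of the 2-norm of the
    difference between the original clear image and the defogged image.
    Assumed batch shape: Q x H x W x C."""
    q = defogged.shape[0]
    diffs = (clear - defogged).reshape(q, -1)           # flatten each image
    return float(np.linalg.norm(diffs, axis=1).mean())  # mean per-image 2-norm
```

For example, a batch of two all-ones "clear" images against all-zeros "defogged" output gives a per-image norm of 2 for 2x2x1 images, hence a loss of 2.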
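Claim 5 computes the re-identification loss from a cross-entropy term and a triplet term. A minimal sketch of the triplet part, on single feature vectors with Euclidean distance, might look as follows; the margin value 0.3 is a common default, not a value stated in the patent.

```python
import numpy as np

def triplet_loss(anchor: np.ndarray, positive: np.ndarray,
                 negative: np.ndarray, margin: float = 0.3) -> float:
    """Triplet loss: pull the anchor toward the positive (same ship ID)
    and push it away from the negative (different ID) by at least `margin`."""
    d_ap = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_an = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(0.0, d_ap - d_an + margin)
```

When the negative is already farther from the anchor than the positive by more than the margin, the loss is zero.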
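The corrected-confidence non-maximum suppression of claim 7 can be sketched as follows. This is a hedged reconstruction, not the patent's code: the normalized-area formula s_i = a_i / Σ a_j, the weights ω1 = 0.7 and ω2 = 0.3, and the IoU threshold 0.5 are illustrative assumptions.

```python
import numpy as np

def corrected_confidences(areas, confs, w1=0.7, w2=0.3):
    """Pseudo-overlap-ratio joint confidence:
      s_i = a_i / sum_j a_j   (normalized box area)
      r_i = p_i * s_i         (pseudo-overlap ratio)
      P_i = w1 * p_i + w2 * r_i
    w1, w2 are illustrative weight coefficients."""
    areas = np.asarray(areas, dtype=float)
    confs = np.asarray(confs, dtype=float)
    s = areas / areas.sum()
    r = confs * s
    return w1 * confs + w2 * r

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression on [x1, y1, x2, y2] boxes, run with
    the corrected confidences as scores; returns kept box indices."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]  # highest corrected confidence first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]  # drop boxes overlapping the kept one
    return keep
```

With two identical boxes and one distant box, the duplicate is suppressed and the distant box survives regardless of its lower corrected confidence.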
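The view-assisted adaptive query expansion of claims 8 and 9 can be sketched as below. The source's formulas are partly garbled, so this is one plausible reading under stated assumptions: view dissimilarity is decided by thresholding the view-feature cosine similarity (the threshold `view_thr` stands in for the Δ_ij indicator), and the fusion is the convex combination f'(x) = λ f(x) + (1 − λ) f(y); both parameters are illustrative.

```python
import numpy as np

def cosine_sim(F: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity matrix for row-vector features F (n x d)."""
    Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
    return Fn @ Fn.T

def oaqe_fuse(F_ship: np.ndarray, F_view: np.ndarray,
              lam: float = 0.7, view_thr: float = 0.8) -> np.ndarray:
    """View-assisted query expansion sketch: for each image x, find the
    image y with the highest ship-feature similarity among images whose
    view is dissimilar to x (S_view < view_thr), then fuse
    f'(x) = lam * f(x) + (1 - lam) * f(y)."""
    S_I = cosine_sim(F_ship)   # ship-feature similarity matrix
    S_O = cosine_sim(F_view)   # view-feature similarity matrix
    fused = F_ship.copy()
    n = F_ship.shape[0]
    for x in range(n):
        cand = [y for y in range(n) if y != x and S_O[x, y] < view_thr]
        if not cand:
            continue  # no view-dissimilar partner: keep the original feature
        y = max(cand, key=lambda j: S_I[x, j])
        fused[x] = lam * F_ship[x] + (1 - lam) * F_ship[y]
    return fused
```

Retrieval would then rank gallery images by cosine similarity against the fused query features instead of the raw ones.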

Description

Deep learning-based water surface ship target re-identification method

Technical Field

The invention relates to the technical field of computer-vision re-identification, and in particular to a deep-learning-based water-surface ship target re-identification method.

Background

In computer vision, re-identification is a target recognition and classification technology based on artificial intelligence that can detect and accurately recognize targets of interest such as people, vehicles, articles, and faces in real time. Civilian applications include security video monitoring, traffic monitoring, retail and warehouse management, and security and authentication. Re-identification technology is also widely applied in the autonomous driving and medical fields.

Research on re-identification has mainly focused on pedestrians and vehicles, while research on marine vessels has not been fully developed. The current lack of data sets usable for water-surface ship re-identification, especially of images of severe weather such as foggy scenes, limits research and application of re-identification in foggy scenes. In addition, the influence of view differences on a re-identification network is not negligible: features such as the bow, stern, and sides of a ship often differ greatly and tend to degrade the recognition accuracy of the re-identification network.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a deep-learning-based water-surface ship target re-identification method that markedly improves the accuracy of ship target re-identification and is more robust in foggy environments.
In order to solve the above technical problems, the technical scheme of the invention is as follows. A deep-learning-based water-surface ship target re-identification method comprises the following steps. (1) Apply fog synthesis to a proportion of the training data set images. (2) Train a You Only Look Once (YOLO) target detection model on the fog-augmented training data set images. (3) Train a view-assisted re-identification network with a defogging module on the fog-augmented training data set images. (4) Detect each frame of the input video stream with the trained YOLO model to obtain prediction boxes, and de-duplicate the prediction boxes with a non-maximum suppression method based on pseudo-overlap-ratio joint confidence. (5) Crop each input frame according to the prediction boxes, input the cropped images into the re-identification network to extract ship features and view features, and identify the ship identity with a view-assisted adaptive query expansion (OAQE) method to obtain a re-identification result.

Preferably, the specific method of step (3) is as follows. The fogged images are labeled with the view information and ship ID information of each image, and the labeled ship training data set images are input, by ID category and by view category respectively, into the feature extraction network with a defogging branch shown in figure 2 of the specification for training; that is, training of the view-category network is independent of, and may be parallel to, training of the ID-category network. The feature extraction (ResNet) branch uses a ResNet-101 residual network for feature extraction.
The dimension reconstruction (Dimension Reconstruction) module up-samples the features obtained by the first residual block of ResNet-101 to obtain a reconstructed intermediate image. The intermediate image is then input, together with the original image, into a subsequent pyramid enhancement (Pyramid Enhancement) module to obtain a defogged image. The defogged image is compared with the original clear image to obtain the defogging loss (Defog Loss), and the feature extraction branch is classified by category to obtain the re-identification loss (ReID Loss). The re-identification loss is calculated with a cross-entropy loss function and a triplet loss function. The defogging loss is calculated from the 2-norm of the difference between the original clear image and the defogged image, defined as: L_defog = (1/Q) Σ_{i=1}^{Q} ||J_i* − J_i||_2, where Q is the number of images per training batch, J_i is the defogged image of the i-th image in the batch, and J_i* is the original clear image of the i-th image in the batch. Preferably, the specific method of step (4) is as follows. Ship target detection is performed on each frame of the input video stream with the trained YOLO model to obtain detection boxes, rejecting the prediction frame