CN-122023468-A - Ocean video multi-target tracking and behavior recognition method and equipment
Abstract
The application relates to the technical field of ocean monitoring, and in particular to a method and device for multi-target tracking and behavior recognition in ocean video. The method first constructs a coordinate transformation matrix from the parameters of the camera devices; applies frequency-domain denoising, illumination compensation and pseudo-target filtering to the original video; and performs super-resolution reconstruction enhancement on small-scale targets. A deep learning model then detects targets in the optimized video frames, outputting target bounding boxes and identity features, and Kalman filtering combined with the Hungarian algorithm generates local trajectories under a single view. When a target crosses fields of view, re-identification through coordinate projection and cross-view feature invariants fuses the trajectories into a globally unique cross-view trajectory. Finally, trajectory motion features are extracted, logically compared against quantified ocean-management rule thresholds, and an abnormal-behavior judgment result is output. The method effectively improves small-target detection accuracy under complex sea conditions, resolves identity breaks when targets cross viewing angles, and achieves continuous global tracking and intelligent early warning of illegal behavior.
Inventors
- ZHOU HONGFENG
- LI SHIPING
- LING YUN
- TANG BENXI
- ZENG HUIMEI
Assignees
- 深圳微品致远信息科技有限公司
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-03
Claims (10)
- 1. A marine video multi-target tracking and behavior recognition method, characterized by comprising the following steps: performing coordinate mapping calculation based on the intrinsic parameters and installation positions of the camera devices to obtain a coordinate transformation matrix from each view coordinate system to a global geographic coordinate system; performing frequency-domain filtering denoising and illumination compensation on an original video frame to obtain a preprocessed video frame with an improved signal-to-noise ratio and balanced illumination; performing local feature matching between the preprocessed video frame and a preset marine pseudo-target feature library to filter out pseudo-target interference in the marine environment, and performing super-resolution reconstruction enhancement on the small-scale target areas retained after filtering to obtain an optimized video frame with pseudo-target interference removed and small-target detail enhanced; performing target detection on the optimized video frame with a deep learning detection model based on a preset marine target prior box library, and outputting a set of target bounding box coordinates and corresponding identity feature vectors; performing motion state prediction and data association on the bounding box coordinate set by combining Kalman filtering with the Hungarian matching algorithm to obtain time-series target trajectories with unique local identifiers under a single view; when a target crosses the fields of view of different camera devices, projecting the end coordinates of its time-series trajectory into the global geographic coordinate system based on the coordinate transformation matrix, performing target re-identification using cross-view feature invariants, assigning trajectories that belong to the same physical target a globally unique identifier, and generating a cross-view globally unique target trajectory; and extracting motion feature data from the globally unique target trajectory, logically comparing it with preset quantified ocean-management rule thresholds, and outputting an abnormal behavior judgment result.
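As a reading aid, the steps of claim 1 can be sketched as a sequence of pipeline stages. Every function name below is an illustrative placeholder (none appear in the patent), and each stage is kept trivial so the end-to-end control flow is runnable:

```python
# Placeholder stages (illustrative only): each returns its input or a
# trivial value so the end-to-end control flow can execute.
def build_homography(calib): return calib.get("H")          # step 1
def preprocess(frame): return frame                          # step 2
def filter_and_enhance(frame): return frame                  # step 3
def detect(frame): return [{"box": (0, 0, 1, 1), "feat": [1.0]}]  # step 4
def associate(tracks, dets):                                 # step 5
    for i, d in enumerate(dets):
        tracks.setdefault(i, []).append(d["box"])
    return tracks
def cross_view_fuse(tracks, H): return tracks                # step 6
def judge_behavior(tracks, rules):                           # step 7
    return {tid: "normal" for tid in tracks}

def track_and_recognize(frames, calib, rules):
    """Skeleton of the claimed pipeline: calibrate once, then process
    each frame through preprocessing, detection and association, and
    finally fuse across views and compare against the rule base."""
    H = build_homography(calib)
    tracks = {}
    for frame in frames:
        pre = preprocess(frame)
        opt = filter_and_enhance(pre)
        dets = detect(opt)
        tracks = associate(tracks, dets)
    fused = cross_view_fuse(tracks, H)
    return judge_behavior(fused, rules)
```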
- 2. The marine video multi-target tracking and behavior recognition method according to claim 1, wherein the step of performing coordinate mapping calculation based on the intrinsic parameters and installation positions of the camera devices to obtain a coordinate transformation matrix from the view coordinate system to the global geographic coordinate system comprises: acquiring the focal length, principal point coordinates and distortion coefficients of each camera device as intrinsic parameters, and acquiring the longitude, latitude, mounting height and attitude angles of each camera device in the geographic coordinate system as extrinsic parameters; constructing a joint mapping equation from the imaging model and the geographic projection model, and computing a homography matrix with a calibration algorithm; and decomposing and normalizing the homography matrix to generate a coordinate transformation matrix that converts any pixel coordinate into a global geographic coordinate.
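The homography computation in this claim can be sketched with a plain Direct Linear Transform over point correspondences. The function names and the numpy-only implementation are illustrative assumptions, not the patent's specific calibration algorithm:

```python
import numpy as np

def homography_dlt(pixel_pts, geo_pts):
    """Estimate the 3x3 homography mapping pixel coordinates to planar
    geographic coordinates via the Direct Linear Transform (DLT).
    Requires at least 4 non-collinear point correspondences."""
    A = []
    for (x, y), (X, Y) in zip(pixel_pts, geo_pts):
        A.append([-x, -y, -1, 0, 0, 0, X * x, X * y, X])
        A.append([0, 0, 0, -x, -y, -1, Y * x, Y * y, Y])
    # The homography is the right null vector of A (last row of V^T).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]                # normalize so H[2,2] == 1

def pixel_to_geo(H, x, y):
    """Project one pixel coordinate through H (homogeneous divide)."""
    v = H @ np.array([x, y, 1.0])
    return v[0] / v[2], v[1] / v[2]
```

In production this estimation is usually done with a robust solver (e.g. OpenCV's `cv2.findHomography` with RANSAC) rather than a bare 4-point DLT.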
- 3. The method for multi-target tracking and behavior recognition of marine video according to claim 1, wherein the step of performing frequency-domain filtering denoising and illumination compensation on the original video frame to obtain a preprocessed video frame with an improved signal-to-noise ratio and balanced illumination comprises the steps of: applying multi-scale Gaussian filtering and wavelet transformation to the original video frame, filtering out the high-frequency noise components produced by sea-surface wave reflection, and performing the inverse transformation to complete frequency-domain denoising; dividing the denoised video frame into blocks and applying pixel-level brightness adjustment to backlit and low-illumination regions to achieve illumination balance compensation; and outputting the denoised, illumination-compensated video frame as the preprocessed video frame.
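A minimal sketch of the two preprocessing operations, assuming single-channel frames normalized to [0, 1]. The FFT low-pass filter stands in for the claimed multi-scale Gaussian/wavelet denoising, and the block-mean shift stands in for the claimed pixel-level brightness adjustment; the cutoff and block size are illustrative:

```python
import numpy as np

def lowpass_denoise(frame, cutoff=0.25):
    """Frequency-domain denoising: zero out spatial frequencies above
    `cutoff` (as a fraction of Nyquist), then inverse-transform."""
    F = np.fft.fftshift(np.fft.fft2(frame))
    h, w = frame.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot((yy - h / 2) / (h / 2), (xx - w / 2) / (w / 2))
    F[r > cutoff] = 0                  # discard high-frequency components
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))

def block_illumination_balance(frame, block=8, target=0.5):
    """Split the frame into blocks and shift each block's mean brightness
    toward `target`, compensating backlit and low-light regions."""
    out = frame.astype(float).copy()
    h, w = frame.shape
    for i in range(0, h, block):
        for j in range(0, w, block):
            patch = out[i:i + block, j:j + block]
            patch += target - patch.mean()     # in-place per-block shift
    return np.clip(out, 0.0, 1.0)
```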
- 4. The method for multi-target tracking and behavior recognition of marine video according to claim 1, wherein the step of performing local feature matching between the preprocessed video frame and a preset marine pseudo-target feature library, filtering out pseudo-target interference in the marine environment, and performing super-resolution reconstruction enhancement on the small-scale target areas retained after filtering to obtain an optimized video frame with pseudo-target interference removed and small-target detail enhanced comprises the following steps: constructing and loading a marine pseudo-target feature library, the library comprising visual feature samples of at least one pseudo-target type among waves, driftwood and spoons; comparing local features extracted from the preprocessed video frame against the marine pseudo-target feature library, filtering out successfully matched pseudo-target regions, and retaining real target candidate regions; identifying, within the real target candidate regions, connected domains whose area is smaller than a preset threshold as small-scale target areas, and locating the corresponding image blocks in the preprocessed video frame; performing feature amplification and texture recovery on those image blocks with a super-resolution reconstruction network to obtain enhanced image blocks; and backfilling the enhanced image blocks into the corresponding small-scale target areas of the preprocessed video frame to obtain the optimized video frame.
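The small-scale target identification step (connected domains below an area threshold) can be sketched with plain BFS component labeling; the binary-mask input, 4-connectivity, and function name are illustrative assumptions:

```python
from collections import deque

def small_target_regions(mask, area_thresh):
    """Label 4-connected foreground components in a binary mask and
    return (x1, y1, x2, y2) bounding boxes of components whose area is
    below `area_thresh` (candidates for super-resolution enhancement)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            q, comp = deque([(sy, sx)]), []
            seen[sy][sx] = True
            while q:                       # flood-fill one component
                y, x = q.popleft()
                comp.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        q.append((ny, nx))
            if len(comp) < area_thresh:    # keep only small components
                ys = [p[0] for p in comp]
                xs = [p[1] for p in comp]
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes
```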
- 5. The method for multi-target tracking and behavior recognition of ocean video according to claim 1, wherein the step of performing target detection on the optimized video frame with a deep learning detection model based on a preset marine target prior box library and outputting a set of target bounding box coordinates and corresponding identity feature vectors comprises the following steps: loading a convolutional neural network detection model customized for marine scenes on the basis of the YOLO framework, the model comprising a multi-scale feature pyramid structure; configuring the size and aspect-ratio information of marine-specific targets such as ships, oyster rafts and navigation marks from the preset marine target prior box library into the detection head, and performing feature extraction and target localization on the optimized video frame; and removing overlapping detection boxes with a non-maximum suppression algorithm to obtain a set of target bounding box coordinates containing target position information, and extracting the feature vector output by the detection model that characterizes the uniqueness of each target individual as its identity feature vector.
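The non-maximum suppression referred to in this claim is the standard greedy procedure; a minimal sketch over (x1, y1, x2, y2) boxes:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and drop any remaining box whose IoU with it exceeds `iou_thresh`.
    Returns the indices of the kept boxes."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter + 1e-9)
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best, order = order[0], order[1:]
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```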
- 6. The method for multi-target tracking and behavior recognition of marine video according to claim 1, wherein the step of performing motion state prediction and data association on the target bounding box coordinate set by combining Kalman filtering with the Hungarian matching algorithm to obtain time-series target trajectories with unique local identifiers under a single view comprises the following steps: initializing the Kalman filter state vector and updating the state equation with the target position information of the previous frame to obtain a prior estimate of each target position in the current frame; computing the cosine similarity between the identity feature vectors detected in the current frame and the feature vectors stored in the historical trajectory library, and constructing an association cost matrix combined with the Mahalanobis distance between the prior position estimates and the actually detected positions; optimally assigning the association cost matrix with the Hungarian algorithm to complete data association between current-frame targets and historical trajectories; for successfully associated targets, updating the trajectory state and retaining the original local trajectory identifier; and for newly appearing unassociated targets, assigning new local trajectory identifiers and initializing new trajectories, thereby obtaining, after association over consecutive frames, time-series target trajectories with local trajectory identifiers under a single view.
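A minimal sketch of the predict/update/assign loop, assuming a 1-D constant-velocity state for brevity. The brute-force assignment stands in for the Hungarian algorithm and is only viable for a handful of targets (in production, use `scipy.optimize.linear_sum_assignment`):

```python
import itertools
import numpy as np

def kalman_predict(x, P, F, Q):
    """Prediction step: prior state and covariance under dynamics F."""
    return F @ x, F @ P @ F.T + Q

def kalman_update(x, P, z, H, R):
    """Correct the prior with measurement z (standard Kalman update)."""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

def best_assignment(cost):
    """Minimum-cost one-to-one assignment by brute force over a square
    cost matrix; fine for the few targets per frame in a sketch."""
    n = len(cost)
    best = min(itertools.permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return list(enumerate(best))
```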
- 7. The method for marine video multi-target tracking and behavior recognition according to claim 1, wherein, when targets cross the fields of view of different camera devices, the step of projecting the end coordinates of the time-series target trajectory into the global geographic coordinate system based on the coordinate transformation matrix, performing target re-identification in combination with cross-view feature invariants, and assigning trajectories belonging to the same physical target a globally unique identifier to generate a cross-view globally unique target trajectory comprises the following steps: monitoring in real time whether the bounding box of a time-series target trajectory with a local trajectory identifier under a single view touches the edge of the current camera device's field of view; if so, triggering a cross-view handover mechanism and projecting the end coordinates of the current trajectory, which is about to leave the current field of view, into the global geographic coordinate system using the coordinate transformation matrix to obtain a global position index; screening, within the coverage of camera devices other than the current one, candidate trajectories whose spatial positions match the global position index; extracting the cross-view feature invariants of the current trajectory and the candidate trajectories, the invariants comprising hull contour topological features or aquaculture facility structural features unaffected by viewing-angle changes; computing the cosine similarity between the cross-view feature invariants of the current trajectory and each candidate trajectory as a structural similarity; and if the structural similarity exceeds a preset re-identification threshold, judging that the current trajectory and the candidate trajectory belong to the same physical target, fusing the two trajectories by association, and assigning them a uniform globally unique identifier to generate a cross-view globally unique target trajectory.
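The structural-similarity test of this claim reduces to a cosine comparison of view-invariant feature vectors against a threshold; a sketch with an assumed re-identification threshold of 0.8 (the patent does not fix a numeric value):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def reidentify(track_feat, candidates, threshold=0.8):
    """Match an exiting track against candidate tracks in neighbouring
    views; return the candidate id whose view-invariant feature is most
    similar, or None if no similarity clears the re-ID threshold."""
    best_id, best_sim = None, threshold
    for cand_id, feat in candidates.items():
        s = cosine_sim(track_feat, feat)
        if s > best_sim:
            best_id, best_sim = cand_id, s
    return best_id
```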
- 8. The marine video multi-target tracking and behavior recognition method according to claim 1, wherein the step of extracting motion feature data from the globally unique target trajectory, logically comparing it with preset quantified ocean-management rule thresholds, and outputting an abnormal behavior judgment result comprises: smoothing the cross-view globally unique target trajectory carrying the globally unique identifier, obtaining instantaneous speed and heading angle by differential operation and cumulative displacement by accumulation, so as to obtain motion feature data comprising position coordinates, moving speed, moving direction and dwell time; loading geofence data and an ocean-management rule base, the rule base comprising the minimum safe distance, dwell time and maximum allowed speed thresholds corresponding to different types of ocean-management areas; and performing space-time logical comparison between the motion feature data and the ocean-management rule base, namely marking an abnormal event if the target enters a restricted area, its distance to the restricted area is smaller than the minimum safe distance, its dwell time in the restricted area exceeds the time threshold, or its instantaneous speed exceeds the maximum allowed speed, and outputting an abnormal behavior judgment result containing the globally unique identifier and the corresponding abnormality type tag.
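The differential extraction of speed and heading, and the rule-threshold comparison, can be sketched as follows; the rule keys (`max_dwell_s`, `max_speed`) and tag strings are illustrative, not the patent's rule-base schema:

```python
import math

def motion_features(track, dt=1.0):
    """Differentiate a smoothed (x, y) track: instantaneous speed and
    heading from consecutive points sampled `dt` seconds apart."""
    feats = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        feats.append({
            "speed": math.hypot(dx, dy) / dt,
            "heading_deg": math.degrees(math.atan2(dy, dx)) % 360,
        })
    return feats

def check_rules(in_restricted_zone, dwell_s, speed, rules):
    """Space-time logical comparison against quantified rule thresholds;
    returns the list of violated rule tags (empty if behaviour is normal)."""
    tags = []
    if in_restricted_zone and dwell_s > rules["max_dwell_s"]:
        tags.append("overstay_in_restricted_zone")
    if speed > rules["max_speed"]:
        tags.append("overspeed")
    return tags
```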
- 9. The marine video multi-target tracking and behavior recognition method according to claim 6, wherein the step of performing motion state prediction and data association by combining Kalman filtering with the Hungarian matching algorithm based on the target bounding box coordinate set further comprises: monitoring in real time whether the current frame contains a target bounding box successfully matched to the local trajectory identifier; if no matching target bounding box is detected for N consecutive frames, judging that the corresponding target has entered a state of short-term complete occlusion; in that state, continuously computing the target's predicted position and predicted identity features in the current frame with a historical-trajectory trend prediction algorithm based on the target's historical motion state vectors, and generating a virtual trajectory segment; setting a maximum occlusion tolerance time threshold and, within that threshold, continuously performing secondary matching of spatio-temporal position and identity features between candidate targets detected in new frames and the virtual trajectory segment; if the matching succeeds, fusing the virtual trajectory segment with the actually detected target bounding box and restoring the normal update state of the local trajectory identifier; and if the maximum occlusion tolerance time threshold is exceeded without a successful match, ending the prediction and marking the trajectory with that local trajectory identifier as terminated.
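The occlusion-bridging logic of this claim (virtual trajectory segment plus secondary matching) can be sketched with a constant-velocity extrapolation; the gating distance, the Manhattan metric, and the function names are illustrative assumptions:

```python
def predict_during_occlusion(history, max_occluded_frames):
    """Extrapolate a fully occluded target with its last observed
    velocity, producing a virtual trajectory segment of at most
    `max_occluded_frames` predicted positions."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0              # last observed velocity
    return [(x1 + vx * k, y1 + vy * k)
            for k in range(1, max_occluded_frames + 1)]

def rematch(virtual_segment, detection, gate=2.0):
    """Secondary spatio-temporal match: accept a reappearing detection
    if it falls within `gate` (Manhattan distance) of any predicted
    position on the virtual segment."""
    return any(abs(px - detection[0]) + abs(py - detection[1]) <= gate
               for px, py in virtual_segment)
```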
- 10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 9 when the computer program is executed.
Description
Ocean video multi-target tracking and behavior recognition method and equipment Technical Field The application relates to the field of ocean monitoring, in particular to a method and device for ocean video multi-target tracking and behavior recognition. Background With the rapidly growing demand for intelligent ocean management, video multi-target tracking and anomaly recognition in complex marine environments are increasingly common scenarios. In recent years, large vision models (such as ViT and SAM) and large video models have shown strong feature-extraction ability in general image recognition, but their multi-target tracking applications in complex marine environments still face challenges such as marine pseudo-target interference and cross-view identity breaks. The prior art generally adopts general-purpose terrestrial visual tracking algorithms (for example YOLO and SORT, which realize tracking through single-frame detection and simple feature association), but these methods are not deeply adapted to the highly dynamic, strongly interfering marine environment and lack a spatio-temporal consistency constraint mechanism and multi-view global coordinate fusion capability. As a result, IDs of sea targets are frequently lost due to wave occlusion and trajectory crossings, cross-view tracking is interrupted, and real-time early-warning requirements are difficult to meet under the computing-power limits of edge devices. The prior art therefore leaves unsolved the technical problem that multiple targets cannot be tracked continuously over long periods in complex marine environments.
Disclosure of Invention The application mainly aims to provide a method and device for multi-target tracking and behavior recognition of ocean video, so as to solve the technical problem that long-term continuous multi-target tracking cannot be performed in complex marine environments in the prior art. To achieve the above object, the present application provides a method for multi-target tracking and behavior recognition of marine video, comprising: performing coordinate mapping calculation based on the intrinsic parameters and installation positions of the camera devices to obtain a coordinate transformation matrix from each view coordinate system to a global geographic coordinate system; performing frequency-domain filtering denoising and illumination compensation on an original video frame to obtain a preprocessed video frame with an improved signal-to-noise ratio and balanced illumination; performing local feature matching between the preprocessed video frame and a preset marine pseudo-target feature library to filter out pseudo-target interference in the marine environment, and performing super-resolution reconstruction enhancement on the small-scale target areas retained after filtering to obtain an optimized video frame with pseudo-target interference removed and small-target detail enhanced; performing target detection on the optimized video frame with a deep learning detection model based on a preset marine target prior box library, and outputting a set of target bounding box coordinates and corresponding identity feature vectors; performing motion state prediction and data association on the bounding box coordinate set by combining Kalman filtering with the Hungarian matching algorithm to obtain time-series target trajectories with unique local identifiers under a single view; when a target crosses the fields of view of different camera devices, projecting the end coordinates of its time-series trajectory into the global geographic coordinate system based on the coordinate transformation matrix, performing target re-identification using cross-view feature invariants, assigning trajectories that belong to the same physical target a globally unique identifier, and generating a cross-view globally unique target trajectory; and extracting motion feature data from the globally unique target trajectory, logically comparing it with preset quantified ocean-management rule thresholds, and outputting an abnormal behavior judgment result. Further, the step of performing coordinate mapping calculation based on the intrinsic parameters and installation positions of the camera devices to obtain a coordinate transformation matrix from the view coordinate system to the global geographic coordinate system includes: acquiring the focal length, principal point coordinates and distortion coefficients of each camera device as intrinsic parameters, and acquiring the longitude, latitude, mounting height and attitude angles of each camera device in the geographic coordinate system as extrinsic parameters; constructing a joint mapping equation of an imaging model and a geographic projec