CN-121834243-B - Marine ecology monitoring method based on ecology representation and target perception matching

Abstract

The invention relates to the technical field of marine ecological monitoring, and in particular to a marine ecological monitoring method based on ecological representation and target perception matching. The method comprises: acquiring multi-source ocean monitoring data; performing multi-modal data preprocessing on the acquired data; generating distinguishable ecological feature vectors from the preprocessed data; performing target perception using the distinguishable ecological feature vectors to generate a fusion confidence; and generating a monitoring report based on the fusion confidence result and feeding it back. Through collaborative optimization of multiple modules, the invention improves the precision, efficiency, and practicability of marine ecological monitoring relative to traditional techniques.

Inventors

SUN SHAN
CHENG LING
ZHANG CHAO
NIE JIE
SU BO
JIN XIAOJIE
ZUO ZIJIE
LI HONGKUN
ZHAO YUTING

Assignees

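Shandong Marine Resource and Environment Research Institute (Shandong Marine Environment Monitoring Center, Shandong Aquatic Product Quality Inspection Center) [山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心)]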

Dates

Publication Date: 2026-05-12
Application Date: 2026-03-13

Claims (7)

1. A marine ecology monitoring method based on ecological representation and target perception matching, comprising:
acquiring multi-source ocean monitoring data;
performing multi-modal data preprocessing based on the acquired multi-source ocean monitoring data;
generating distinguishable ecological feature vectors from the preprocessed data;
performing target perception using the distinguishable ecological feature vectors to generate a fusion confidence; and
generating a monitoring report based on the fusion confidence result and feeding back the monitoring report;
wherein generating the distinguishable ecological feature vectors from the preprocessed data comprises: receiving a standardized multi-modal dataset from the preprocessing layer, and generating a decoupled target feature tensor and a decoupled background feature tensor using a task-adaptive feature decoupling network (TA-FDN); the TA-FDN employs a parallel dual-branch encoder architecture, each branch consisting of a convolutional layer followed by normalization and a ReLU activation function, and the target branch and the background branch process the input in parallel; to adapt dynamically to different monitoring tasks, a lightweight task-adaptive unit (TAU) is introduced, which generates a pair of dynamic modulation vectors conditioned on meta-information of the input data, where C is the number of feature channels, and the preliminarily extracted features are weighted channel by channel through channel-wise broadcast multiplication; to decouple the target features from the background features, a decoupling loss function based on mutual information minimization is defined, the mutual information measuring the degree of dependence between two random variables and the optimization target being to minimize it; finally, a variational estimation method based on adversarial learning is adopted, and an auxiliary discriminator network D is introduced to distinguish feature pairs;
wherein generating the distinguishable ecological feature vectors from the preprocessed data further comprises purifying the decoupled features using a self-supervised masked real-feature enhancement network (AIM-Marine): first, the decoupled dual-stream features are concatenated along the channel dimension and input to a lightweight multi-scale feature encoder E to construct a four-level feature pyramid, where the l-th stage of the encoder outputs the feature map of the l-th level, so as to capture information ranging from fine-grained detail to global context; then, for each level of the pyramid, a lightweight mask generation network G_l generates a spatial binary mask; so that gradients can still be back-propagated through the discrete mask decisions, the Gumbel-Softmax reparameterization technique is employed, in which G_l outputs a two-dimensional logit for each spatial position and the hard mask value for that position is obtained by Gumbel-Softmax sampling; the mask generation network G is trained through a self-supervised feature reconstruction task, the training objective being that, after the generated masks are multiplied element-wise with the feature pyramid, the features remaining after masking can still be used by a multi-layer perceptron (MLP) classification head to predict a predefined pseudo-label of the sample, which forces G to automatically identify and retain the feature regions that are most informative for sample discrimination; a purified multi-scale feature set is thereby obtained, with an ecological target directivity score (EPG) reaching 0.89;
wherein generating the distinguishable ecological feature vectors from the preprocessed data further comprises performing deep fusion and high-order encoding of the purified features from different scales using a hierarchical ecological feature fusion representation network to generate a final unified ecological feature representation: first, the feature maps of different scales are upsampled to the same intermediate size by bilinear interpolation and concatenated along the channel dimension to obtain an aggregated feature; subsequently, a channel attention module (SE-Block) is applied to compute a weight for each channel so as to emphasize the feature channels that are more important to the current task, where GAP denotes global average pooling, the inner activation is the ReLU function, the two weight matrices are the weights of fully connected layers, the outer activation is the Sigmoid function, and the result is a learned channel attention vector; the enhanced features are then input to a Transformer encoder composed of L layers to model long-range dependencies within the features and fuse the multi-modal information, the self-attention mechanism of the l-th layer of the Transformer encoder using query, key, and value matrices that are each derived from the input features by linear projection, with scaling by the dimension of the key vectors; the deeply encoded feature is obtained after encoding by the L layers; finally, the deeply encoded feature is compressed into a one-dimensional feature vector by global average pooling, projected into a preset 512-dimensional space through a fully connected layer, and normalized to output the final ecological feature vector; the original, mixed multi-modal data are thereby converted, through the deep transformation of feature decoupling, purification, and fusion, into a feature vector that is highly discriminative, highly robust, and rich in semantic information.
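The final encoding stage of claim 1 (self-attention, global average pooling, normalization) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the standard scaled dot-product form softmax(QK^T / sqrt(d_k)) V, uses identity projections in place of the learned query, key, and value projections, assumes the final normalization is an L2 norm, and omits the 512-dimensional fully connected projection. All function names and the toy input are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    the claim derives them from the input by learned linear projections.
    """
    d_k = len(tokens[0])
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in tokens]
        weights.append(softmax(scores))
    out = [[sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d_k)]
           for row in weights]
    return out, weights

def pool_and_normalize(tokens):
    """Global average pooling over tokens, then L2 normalization (assumed norm)."""
    n = len(tokens)
    pooled = [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]

# Toy input: three feature tokens with three channels each.
tokens = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [2.0, 1.0, 1.0]]
encoded, attn = self_attention(tokens)
feature = pool_and_normalize(encoded)
```

Each row of `attn` sums to 1, and `feature` has unit length, which is what makes cosine similarity against templates in the later matching stage well behaved.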
2. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 1, wherein performing multi-modal data preprocessing based on the acquired multi-source ocean monitoring data comprises: time-synchronizing an original multi-modal dataset using a space-time synchronous transformation operator, the dataset comprising four sensor data tuples, namely an optical remote sensing image, an infrared remote sensing image, an underwater acoustic spectrogram, and buoy sensing time-series data; first, for each data tuple, a time alignment function is defined and fitted to historical synchronization point data by least squares to model the fixed offset and linear drift of the sensor clock, the calibrated timestamp being obtained by applying the time alignment function to the original timestamp, and the output of the function ensuring that a synchronization constraint is met for all data, thereby achieving time alignment of all modal data with sub-second precision; next, spatial synchronization is performed to convert the original coordinates attached to all data to the universal WGS-84 geodetic coordinate system by a parameterized mapping function whose parameters are determined by the extrinsic calibration data of each sensor; finally, based on the unified time and space coordinates, the original data content is spatially resampled and temporally interpolated by a resampling operator so as to fill or align it onto a standard spatio-temporal grid, bilinear interpolation being used for image data and linear interpolation for time-series data, whereby a space-time synchronized dataset is obtained.
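The least-squares clock calibration described in claim 2 can be illustrated with a short sketch. It assumes the time alignment function is affine, that is, calibrated = drift * raw + offset, which matches the stated "fixed offset and linear drift" model; the function names and the synthetic synchronization points are hypothetical.

```python
def fit_clock(raw, ref):
    """Ordinary least-squares fit of ref ~ drift * raw + offset."""
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(ref) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, ref))
    drift = sxy / sxx
    offset = mean_y - drift * mean_x
    return drift, offset

def calibrate(t, drift, offset):
    """Map a raw sensor timestamp onto the reference time base."""
    return drift * t + offset

# Hypothetical historical sync points: the sensor clock runs 0.1% fast
# and is 2.5 s ahead of the reference clock.
ref_times = [0.0, 100.0, 200.0, 300.0, 400.0]
raw_times = [1.001 * t + 2.5 for t in ref_times]

drift, offset = fit_clock(raw_times, ref_times)
residuals = [abs(calibrate(r, drift, offset) - t)
             for r, t in zip(raw_times, ref_times)]
```

On this noise-free data the residuals are at floating-point level; with real synchronization points, the largest residual indicates whether the sub-second alignment target is being met.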
3. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 2, wherein performing multi-modal data preprocessing based on the acquired multi-source ocean monitoring data further comprises: extracting a noise-sensitive feature map from an image using a convolution kernel, the convolution kernel being optimized to enhance the local contrast between noise regions and signal regions and the feature map being obtained by a two-dimensional convolution of the image with the kernel; based on the feature map, computing at each pixel position the statistical characteristics within its local neighborhood, and generating a soft noise probability mask by comparing the deviation of the central pixel's feature value from the local mean, where the mean and standard deviation are those of the feature values within the neighborhood, a very small positive number prevents division-by-zero errors, and a Sigmoid function maps the normalized score to the interval (0, 1); then, using the noise mask, applying Gaussian smoothing according to the noise probability while retaining the details of the original image in low-noise-probability regions to protect the signal, the operation being expressed with element-wise multiplication, an all-ones matrix of the same size as the mask, and a convolution with a two-dimensional Gaussian smoothing kernel of a given standard deviation; after this operation has been applied to all image and spectrum data, a denoised dataset is obtained.
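The mask-guided denoising of claim 3 can be sketched as below. This is a hedged illustration under several assumptions that are not in the source: the image itself stands in for the learned noise-sensitive feature map, the neighborhood is 3x3, the Sigmoid input is shifted by a hypothetical threshold `tau`, the blend is taken to be (1 - M) * X + M * (G * X), and a fixed 3x3 binomial kernel approximates the Gaussian smoothing kernel.

```python
import math

GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16

def neighborhood(img, r, c, radius=1):
    """Values inside the (clipped) square neighborhood around (r, c)."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]

def noise_mask(feat, eps=1e-6, tau=2.0):
    """Soft noise probability: sigmoid(|f - mean| / (std + eps) - tau).
    tau is an assumed shift; the claim does not state one."""
    mask = []
    for r, row in enumerate(feat):
        out = []
        for c, f in enumerate(row):
            vals = neighborhood(feat, r, c)
            mu = sum(vals) / len(vals)
            sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
            z = abs(f - mu) / (sd + eps) - tau
            out.append(1.0 / (1.0 + math.exp(-z)))
        mask.append(out)
    return mask

def gaussian_smooth(img):
    """3x3 binomial smoothing with replicated edges."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += GAUSS_3X3[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc / 16.0
    return out

def denoise(img):
    """Blend original and smoothed image: (1 - M) * X + M * smooth(X)."""
    mask = noise_mask(img)  # assumption: image used directly as feature map
    smooth = gaussian_smooth(img)
    return [[(1 - m) * x + m * s for x, m, s in zip(ri, rm, rs)]
            for ri, rm, rs in zip(img, mask, smooth)]

# Flat 5x5 image with one impulsive outlier in the center.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 9.0
mask = noise_mask(image)
cleaned = denoise(image)
```

The outlier pixel receives the highest noise probability and is pulled toward its smoothed value, while flat regions pass through unchanged.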
4. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 3, wherein performing multi-modal data preprocessing based on the acquired multi-source ocean monitoring data further comprises a data augmentation transformation, wherein: for a denoised acoustic spectrogram, a time-domain stretching transformation and a frequency-domain translation transformation are defined to simulate the influence of changes in sound propagation speed and of the sound-source Doppler effect, the augmented acoustic spectrogram being the result of applying this transformation sequence, with the time-domain stretching parameter sampled uniformly at random from the interval [0.8, 1.2] and the frequency-domain translation parameter sampled uniformly at random from the interval [-2.0, +2.0] kHz; for a denoised image block, spatial and photometric transformations are applied by performing a random rotation, whose angle is sampled within a preset range to simulate different viewing angles, followed by random cropping and scaling, in which a region cut from the original image is resampled to a standard size, and finally random color jitter, the augmented image block being generated by the composition of these transformations; and the augmentation transformation is applied to the denoised dataset to obtain the final standardized preprocessed dataset.
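The acoustic augmentation of claim 4 can be sketched as follows, using the sampling intervals stated in the claim. Everything else is an assumption made for illustration: the spectrogram is a list of frequency rows over time columns, stretching is done by linear interpolation along time, the frequency shift is rounded to whole bins with zero fill, stretching is applied before shifting, and the bin width `bin_khz` is a hypothetical parameter.

```python
import random

def time_stretch(spec, alpha):
    """Resample each frequency row along time by linear interpolation."""
    n_in = len(spec[0])
    n_out = max(2, round(n_in * alpha))
    out = []
    for row in spec:
        new_row = []
        for k in range(n_out):
            pos = k * (n_in - 1) / (n_out - 1)
            i = min(int(pos), n_in - 2)
            frac = pos - i
            new_row.append((1 - frac) * row[i] + frac * row[i + 1])
        out.append(new_row)
    return out

def freq_shift(spec, delta_khz, bin_khz):
    """Shift frequency bins by delta_khz, zero-filling vacated bins."""
    shift = round(delta_khz / bin_khz)
    n_bins, n_t = len(spec), len(spec[0])
    out = [[0.0] * n_t for _ in range(n_bins)]
    for b in range(n_bins):
        src = b - shift
        if 0 <= src < n_bins:
            out[b] = list(spec[src])
    return out

def augment(spec, rng, bin_khz=0.5):
    """Sample parameters from the claimed intervals and apply both transforms."""
    alpha = rng.uniform(0.8, 1.2)   # time-domain stretching parameter
    delta = rng.uniform(-2.0, 2.0)  # frequency-domain translation in kHz
    return freq_shift(time_stretch(spec, alpha), delta, bin_khz), alpha, delta

rng = random.Random(0)
spec = [[float(b + t) for t in range(10)] for b in range(8)]
aug, alpha, delta = augment(spec, rng)
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which is convenient when comparing preprocessing variants.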
5. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 4, wherein performing target perception using the distinguishable ecological feature vectors to generate the fusion confidence comprises: using a target perception matching layer to accurately associate the distinguishable ecological feature vector with known ecological target templates and to localize it, the input being the feature vector and its corresponding geographic location and the output being a series of high-confidence ecological target recognition results, implemented through three steps of dynamic anchor box generation, hierarchical diffusion matching, and confidence fusion; wherein the dynamic anchor box generation comprises dynamically generating initial spatial hypotheses from the input features: a number of basic anchor boxes of preset sizes are provided, and a lightweight regression network predicts, from the input features, the adjustment parameters of each basic anchor box, including a center point offset and a logarithmic scale offset; each dynamic anchor box is computed by combining these adjustment parameters with the input position; and a dynamic anchor box set is finally obtained.
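The dynamic anchor box computation of claim 5 can be sketched as below. The exact update rule is not recoverable from the text, so this assumes the common parameterization in which the center offset is added to the input position and each base size is multiplied by the exponential of its logarithmic scale offset. The base sizes and the regression outputs are hypothetical values, not taken from the claim.

```python
import math

def dynamic_anchors(position, base_sizes, offsets):
    """Build dynamic anchor boxes as (cx, cy, w, h).

    position:   (x, y) input location
    base_sizes: list of (w, h) preset basic anchor sizes
    offsets:    list of (dx, dy, dlog_w, dlog_h), one per basic anchor
    Assumed rule: cx = x + dx, cy = y + dy, w = w0 * exp(dlog_w), h = h0 * exp(dlog_h).
    """
    x, y = position
    anchors = []
    for (w, h), (dx, dy, dlw, dlh) in zip(base_sizes, offsets):
        anchors.append((x + dx, y + dy, w * math.exp(dlw), h * math.exp(dlh)))
    return anchors

base_sizes = [(16.0, 16.0), (64.0, 64.0), (256.0, 256.0)]
# Hypothetical outputs of the lightweight regression network.
offsets = [
    (0.0, 0.0, 0.0, 0.0),
    (1.5, -2.0, math.log(2.0), 0.0),
    (-4.0, 3.0, -0.1, 0.2),
]
anchors = dynamic_anchors((100.0, 200.0), base_sizes, offsets)
```

Predicting scale in log space keeps widths and heights positive for any network output, which is the usual reason for a logarithmic scale offset.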
6. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 5, wherein the hierarchical diffusion matching comprises using an SD-Match network that takes the dynamic anchor box set and the features as input and completes matching in two stages; first, coarse matching is performed: for each dynamic anchor box, its corresponding local feature vector is extracted and its cosine similarity with the templates in a template library is calculated; meanwhile, to evaluate geometric fit, a normalized Wasserstein distance (NWD) is calculated, in which the anchor box and the standard box of the template are each modeled as a two-dimensional Gaussian distribution whose mean is the box center and whose covariance matrix is proportional to the box width and height, the squared 2nd-order Wasserstein distance between the two distributions is calculated analytically, and the NWD is defined from that distance using a normalization constant; the coarse matching composite score is a weighted combination of the cosine similarity and the NWD; finally, for each anchor box, the template with the highest score is selected as the coarse matching category and the score is recorded, and only anchor boxes whose score satisfies the coarse screening condition proceed further; in the fine matching stage, the anchor boxes that passed the coarse screening are iteratively optimized using a hierarchical diffusion model: with the anchor box and its coarse matching category as initial conditions, the anchor box parameters are encoded as a vector, and the diffusion model generates an optimized parameter vector through a reverse process; specifically, at each time step, a noise prediction network predicts the noise from the current parameters, the step number, the local features, and the category, and a better parameter estimate is calculated from the predicted noise using the reverse update rule of the diffusion model, in which three quantities are hyperparameters of the diffusion model and a further term is standard Gaussian noise; by iterating over the time steps of the reverse process, the optimized parameter vector is obtained and decoded into a fine anchor box, the decoding operation converting the vector into the coordinates and dimensions of the box, so that the intermediate variable of the diffusion model finally converges and is directly output as the fine anchor box; after the fine anchor box is obtained, features are re-extracted based on its location, the cosine similarity between these features and the template and the NWD between the fine anchor box and the standard box are calculated, and the fine matching score is computed from these two values.
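The coarse matching score of claim 6 can be sketched as follows. The sketch assumes the commonly used NWD form: each box (cx, cy, w, h) is modeled as a Gaussian with mean (cx, cy) and covariance diag(w^2/4, h^2/4), for which the squared 2nd-order Wasserstein distance has a closed form, and NWD = exp(-sqrt(W2^2) / C). The normalization constant `c`, the mixing weight `lam`, and the template library are hypothetical.

```python
import math

def wasserstein2_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussian models of two boxes."""
    (ax, ay, aw, ah), (bx, by, bw, bh) = box_a, box_b
    return ((ax - bx) ** 2 + (ay - by) ** 2
            + ((aw - bw) / 2) ** 2 + ((ah - bh) / 2) ** 2)

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance in (0, 1]; c is an assumed constant."""
    return math.exp(-math.sqrt(wasserstein2_sq(box_a, box_b)) / c)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def coarse_match(anchor, feat, templates, lam=0.5):
    """Return (category, score) for the best-scoring template.
    Score = lam * cosine similarity + (1 - lam) * NWD (assumed weighting)."""
    score, name = max(
        (lam * cosine(feat, t["feat"]) + (1 - lam) * nwd(anchor, t["box"]), name)
        for name, t in templates.items()
    )
    return name, score

# Hypothetical template library with one feature vector and standard box each.
templates = {
    "red_tide": {"feat": [1.0, 0.0], "box": (0.0, 0.0, 256.0, 256.0)},
    "algae": {"feat": [0.0, 1.0], "box": (0.0, 0.0, 16.0, 16.0)},
}
category, score = coarse_match((1.0, 1.0, 18.0, 18.0), [0.1, 0.9], templates)
```

Unlike intersection-over-union, the NWD stays smooth and non-zero for small boxes that barely overlap, which suits the tiny-target end of the scale range described in the background.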
7. The marine ecology monitoring method based on ecological representation and target perception matching according to claim 6, wherein the confidence fusion comprises fusing data-driven and knowledge-driven confidences: the feature matching confidence is obtained by normalizing the fine matching score; the ecological logic confidence is obtained by querying a rule base and is calculated from an environmental data vector extracted at the location of the fine anchor box; the final confidence is the weighted sum of the two; a result is adopted only when the final confidence satisfies the acceptance condition; and the target perception matching layer finally outputs a structured monitoring result set, in which each result contains a target type, an optimized geographic bounding box, namely the fine anchor box, and the fusion confidence.
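The confidence fusion of claim 7 can be sketched as a weighted sum followed by a threshold test. The weight, the threshold, the rule base contents, and the field names below are all hypothetical; the claim specifies only that the final confidence is a weighted sum of the feature matching confidence and the ecological logic confidence, and that results are kept only when an acceptance condition holds.

```python
# Hypothetical rule base: plausible water temperature range (deg C) per target type.
RULE_BASE = {"red_tide": (20.0, 30.0), "algae": (5.0, 30.0)}

def eco_confidence(target_type, env):
    """Knowledge-driven confidence from an environmental data vector (assumed rule)."""
    low, high = RULE_BASE[target_type]
    return 1.0 if low <= env["temperature"] <= high else 0.2

def fuse(detections, weight=0.6, threshold=0.5):
    """Keep detections whose fused confidence reaches the threshold."""
    results = []
    for det in detections:
        eco = eco_confidence(det["type"], det["env"])
        conf = weight * det["feature_conf"] + (1 - weight) * eco
        if conf >= threshold:
            results.append({"type": det["type"], "box": det["box"], "confidence": conf})
    return results

detections = [
    {"type": "red_tide", "box": (120.1, 36.2, 0.02, 0.02),
     "feature_conf": 0.9, "env": {"temperature": 24.0}},
    {"type": "red_tide", "box": (121.4, 35.8, 0.01, 0.01),
     "feature_conf": 0.6, "env": {"temperature": 8.0}},
]
report = fuse(detections)
```

In this example the second detection is rejected: its feature match is moderate and the rule base considers the water too cold for a red tide, so the fused confidence of 0.44 falls below the threshold.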

Description

Marine ecology monitoring method based on ecological representation and target perception matching

Technical Field

The invention relates to the technical field of marine ecological monitoring, and in particular to a marine ecological monitoring method based on ecological representation and target perception matching.

Background

The marine ecosystem is a core carrier of fishery resource supply and climate regulation. Traditional monitoring relies mainly on manual sampling and single devices, and is deficient in both efficiency and precision. Manual sampling depends on fixed-point operation of a research vessel: a single voyage can cover only 60 to 90 square kilometers of sea area, the response to a sudden red tide is delayed by more than 48 hours, and the miss rate for micro-plankton reaches 58%. Single-sensor monitoring is also clearly limited: a hydrological buoy can collect only basic parameters such as temperature and salinity and cannot identify biological targets directly; the effective monitoring distance of an underwater camera is limited by seawater transparency and is generally less than 5 meters, and the camera fails completely in turbid waters; and the misjudgment rate of optical sensors for red tides is as high as 37%, with ordinary algae aggregation often misjudged as harmful red tide. Traditional techniques are also costly: monitoring costs about 800 yuan per square kilometer, 9 times the cost of an intelligent monitoring scheme, which makes wide-area routine coverage difficult.

Existing marine ecological monitoring technology also suffers from low synchronization precision and weak anti-interference capability in the multi-modal data preprocessing stage. Conventional methods that integrate satellite remote sensing, underwater acoustic, and buoy sensing data rely only on simple timestamp alignment, so the space-time synchronization error often exceeds 5 seconds, and the spatial coordinates are not unified to the WGS-84 coordinate system, which breaks the correlation between data. Against interference such as cloud cover and ship noise in the marine environment, common filtering algorithms raise the signal-to-noise ratio by only about 15%, which cannot meet the needs of subsequent feature extraction.

In ecological feature extraction, the prior art faces the core problems of blurred distinction between targets and background and low utilization of real features. Conventional feature extraction models have no separation mechanism designed for marine scenes and mix ecological target features with background features such as ocean current texture and temperature gradients, so the target-background feature discrimination is below 50%. Even when masking techniques are introduced, key features such as red tide spectra and fish acoustic pulses are lost for lack of adaptive capability, leaving a real-feature utilization rate of only about 60%.

In target matching and model iteration, the prior art has the shortcomings of poor multi-scale adaptation, weak generalization, and low update efficiency. Traditional matching algorithms use a fixed anchor box design and cannot cover multi-scale targets ranging from 16×16-pixel micron-scale algae to 256×256-pixel meter-scale red tides, so the miss rate for small targets exceeds 45%. When applied across sea areas, recognition accuracy drops by more than 50% for lack of a dynamic update mechanism, and updating the model parameters requires retraining all weights, which takes more than 15 days.
Disclosure of Invention

To solve the above problems, the invention provides a marine ecological monitoring method based on ecological representation and target perception matching.

In a first aspect, the marine ecological monitoring method based on ecological representation and target perception matching provided by the invention adopts the following technical scheme:

A marine ecology monitoring method based on ecological representation and target perception matching, comprising:
acquiring multi-source ocean monitoring data;
performing multi-modal data preprocessing based on the acquired multi-source ocean monitoring data;
generating distinguishable ecological feature vectors from the preprocessed data;
performing target perception using the distinguishable ecological feature vectors to generate a fusion confidence; and
generating a monitoring report based on the fusion confidence result and feeding back the monitoring report.

In a second aspect, a marine ecology monitoring system based on ecological representation and target perception matching comprises:
a data acquisition module configured to acquire multi-source ocean monitoring data;
a preprocessing module configured to perform multi-modal data preprocessing based