CN-122024166-A - Oil stain identification method and system based on Mamba and frequency domain cooperation under U-shaped structure
Abstract
An oil stain identification method and system based on cooperation between Mamba and the frequency domain under a U-shaped structure, relating to the technical field of oil stain leakage identification. The invention addresses the problem that conventional oil stain recognition models are deficient in, or struggle to balance, global context awareness, multi-scale adaptability, computational efficiency, and detail retention. The key technical points are as follows: training samples and test samples are constructed; an oil stain identification network model based on cooperation between Mamba and the frequency domain under a U-shaped structure is constructed, in which the model is designed as a U-shaped architecture based on the lightweight backbone network SMT and feature redundancy is filtered step by step following a top-down design concept, so that the model remains lightweight while identifying with high precision, balancing model size against precision; a frequency-guided Mamba module is designed that processes high-frequency and low-frequency components separately and improves feature computation efficiency by incorporating Mamba components; and a low-scale convolution sensor is designed. The network model is then trained, and the learned model performs pixel-level judgment on oil stain data, outputting a binary black-and-white segmentation map as the prediction result.
Inventors
- ZHANG QIANG
- WU DAN
- WANG YING
- LIU MIAOMIAO
- LIU YANJUN
- ZHANG JINHAO
Assignees
- 东北石油大学
Dates
- Publication Date
- 20260512
- Application Date
- 20260211
Claims (8)
- 1. An oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure, characterized by comprising the following steps: Step one, constructing training samples and test samples: collecting oil stain images generated in oil fields or industrial production processes; where the amount of data is insufficient, expanding the data volume by data augmentation; ensuring the diversity and normalization of the data samples through data preprocessing; and dividing the data set into a training set and a test set, with the training set taking the larger proportion; Step two, constructing an oil stain identification network model based on cooperation between Mamba and the frequency domain under a U-shaped structure: (1) the model is designed as a U-shaped architecture based on the lightweight backbone network SMT, and feature redundancy is filtered step by step following a top-down design concept, so that the model remains lightweight while identifying with high precision, balancing model size and precision; (2) from the frequency-domain perspective, a frequency-guided Mamba module is designed that processes high-frequency and low-frequency components separately and improves feature computation efficiency by incorporating Mamba components, so that the model can mine the dependency relationships between frequency domains, capture useful frequency-domain information, and guide efficient training of the model; (3) a low-scale convolution sensor is designed to enhance the expressive power of low-level features during model training, focusing on the integrity of the feature information so that the model learns discriminative capability; Step three, inputting the training set and the test set divided from the image data in step one into the oil stain identification network model based on cooperation between Mamba and the frequency domain under a U-shaped structure, training the model on the data set, and finding the parameter combination that optimizes model performance; Step four, performing pixel-level judgment on the oil stain data using the learned network model and outputting a binary black-and-white segmentation map as the prediction result, in which the target is the white area and the remaining background is black.
- 2. The oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure according to claim 1, wherein step one specifically comprises: a. constructing a high-quality, normalized data set for training and evaluating the subsequent model: oil stain images are collected from real scenes including oilfield sites, industrial production pipelines, and surveillance video, covering a variety of scene conditions and viewing angles as well as different illumination, weather, oil stain types, and degrees of background complexity, so as to ensure the diversity and representativeness of the data; b. where the amount of oil stain image data is insufficient or the samples are unbalanced, expanding the original image library with multiple data augmentation techniques to improve the generalization capability and robustness of the model, the augmentation methods including but not limited to: geometric transformation, namely random horizontal and vertical flipping, random rotation, and random scaling and cropping, to simulate different shooting angles; color and brightness perturbation, namely slight random adjustment of the hue, saturation, and brightness of the image, to simulate illumination changes and device differences; noise injection, namely adding Gaussian noise or salt-and-pepper noise, to enhance the robustness of the model to image quality degradation; and occlusion simulation, namely randomly adding rectangular occlusion blocks, to simulate partial occlusion that may occur on site; by combining these methods, the original data set is expanded to 3 times its original size to ensure sufficient training samples; c. randomly dividing the normalized data set obtained through the above steps into a training set and a test set at a ratio of 7:3, the training set being used for learning and optimizing model parameters and the test set serving as independent unseen data for objective evaluation of the final performance and generalization capability of the model, with the sample proportions in the training set and the test set kept consistent with those of the total data set.
- 3. The oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure according to claim 1 or 2, wherein step two specifically comprises: Step 1, adopting an architecture based on the lightweight backbone network SMT; constructing a U-shaped architecture from convolution layers with kernel sizes of 1×1 and 3×3, together with batch normalization and a ReLU activation function; combining the Cascade Decoder features by concatenating them element by element; and filtering feature redundancy step by step along the top-down and bottom-up paths, so that high segmentation precision is maintained while model size and computation are compressed, making the model easy to deploy on industrial embedded devices and achieving a balance of high precision and light weight: (1) where EB(·) denotes an encoding block, I denotes the image data input to the model, and f_i denotes the hierarchical features output by the backbone network; Step 2, introducing a frequency-guided Mamba module, Fre-Mamba, which converts the features into the frequency domain and separates and processes the high-frequency and low-frequency components, so that the model can deeply mine the dependency relationships between frequency domains and better capture high-frequency detail information and low-frequency structural information: a. frequency division stage: using the discrete wavelet transform (DWT) and its multi-level decomposition property, the image structure is divided into two different frequency components, low-frequency f_l and high-frequency f_h; (2) b. Mamba embedding stage: Mamba, a selective variant of the state space model (SSM), enhances spatial luminance features by selectively modeling low-frequency-domain information, drawing on its long-range dependency capture and analysis capability and its low computational resource consumption; (3) (4) (5) where LN(·) denotes the layer normalization (LayerNorm) operation, DWT(·) denotes the DWT function, Mamba(·) denotes the Mamba function, Split(·) denotes the feature splitting operation, Conv_n(·) denotes a convolution operation with a kernel size of n×n, Att(·) denotes an attention mechanism, for which a spatial attention mechanism is adopted in this structure, and Cat(·) concatenates the high-frequency and low-frequency component features element by element; Step 3, enhancing the perception and utilization of low-level features, which contain texture and detail information, through the designed low-scale convolution sensor L-ConvA; this module ensures that such key information is not lost during training, so that the model learns more discriminative feature representations for identifying fine and diffuse oil stains, thereby achieving detail retention and discriminative capability: a. front branch structure: preliminary processing is performed by depthwise convolution and ordinary convolution, after which global average pooling and an activation function are used to multiply the features element by element; (6) b. back branch structure: it consists of two identical branches whose features are concatenated element by element, after which a convolution layer is applied to the stacked channels; (7) where GAP(·) denotes the global average pooling function, DW_n(·) denotes a depthwise convolution operation with a kernel size of n, σ(·) denotes the Sigmoid function, and ⊗ denotes element-by-element multiplication.
- 4. The oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure according to claim 1, wherein step three specifically comprises: a. dividing the data set from step one at a ratio of 7:3, inputting the resulting training set and test set into the model for training, and adopting a composite loss function that combines binary cross-entropy (BCE) loss and intersection-over-union (IoU) loss, the total loss L_t being formulated as follows: (8) where the first term denotes the BCE loss computed on the feature map, the second term denotes the IoU loss computed on the feature map, P denotes the predicted feature map, and G_t denotes the ground-truth map (GT); b. the model as a whole is trained with the Adam optimizer, using mean square error as the loss function, with the learning rate set to 0.0001 and the number of epochs set to 150, so that convergence can be achieved.
- 5. The oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure according to claim 1, wherein step four specifically comprises: randomly inputting image data to be predicted, testing with the trained model parameter weights, and finally displaying the predicted image visually.
- 6. An oil stain identification system based on cooperation between Mamba and the frequency domain under a U-shaped structure, characterized by comprising program modules corresponding to the steps of the method of any one of claims 1-5, which, when run, execute the steps of said oil stain identification method.
- 7. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure according to any one of claims 1 to 5.
- 8. An oil stain identification device based on cooperation between Mamba and the frequency domain under a U-shaped structure, characterized by comprising at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to perform the oil stain identification method based on cooperation between Mamba and the frequency domain under a U-shaped structure as set forth in any one of claims 1-5, thereby effecting oil stain identification.
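Illustrative sketch (not part of the claims): the data preparation recited in claim 2, expansion of the data set to 3 times its size followed by a random 7:3 split, might look roughly as follows. As a simplifying assumption, only horizontal and vertical flips are used to reach the threefold expansion, whereas the claim lists many more augmentation methods. Images are represented as small nested lists, and the names `hflip`, `vflip`, `expand_threefold`, and `split_dataset` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Image = List[List[int]]  # toy grayscale image: a list of pixel rows


def hflip(img: Image) -> Image:
    """Mirror every row (horizontal flip)."""
    return [row[::-1] for row in img]


def vflip(img: Image) -> Image:
    """Reverse the row order (vertical flip)."""
    return [row[:] for row in img[::-1]]


def expand_threefold(images: Sequence[Image]) -> List[Image]:
    """Keep each original and add two flipped copies (assumed augmentations)."""
    expanded: List[Image] = []
    for img in images:
        expanded.extend([img, hflip(img), vflip(img)])
    return expanded


def split_dataset(
    items: Sequence[Image], train_ratio: float = 0.7, seed: int = 0
) -> Tuple[List[Image], List[Image]]:
    """Shuffle with a fixed seed, then cut at the requested ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# 100 distinct toy 2x2 images stand in for the collected oil stain images.
originals = [[[i, i + 1], [i + 2, i + 3]] for i in range(100)]
dataset = expand_threefold(originals)
train_set, test_set = split_dataset(dataset)
```

In a real segmentation pipeline the same geometric transform would also have to be applied to each image's label mask, which this sketch omits.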
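Illustrative sketch (not part of the claims): the frequency division stage of claim 3 splits features into a low-frequency component f_l and a high-frequency component f_h using the DWT. The claims do not name the wavelet or the number of levels, so the following assumes a single-level orthonormal Haar transform on a plain 2-D matrix; `haar_dwt2` and `energy` are hypothetical helper names.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(x: Matrix) -> Tuple[Matrix, Tuple[Matrix, Matrix, Matrix]]:
    """One-level orthonormal 2-D Haar DWT of a matrix with even height and width.

    Returns (f_l, f_h): f_l is the low-frequency approximation band and
    f_h holds the three high-frequency detail bands.
    """
    rows, cols = len(x), len(x[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    f_l: Matrix = []
    d_col: Matrix = []
    d_row: Matrix = []
    d_diag: Matrix = []
    for r in range(0, rows, 2):
        low, dc, dr, dd = [], [], [], []
        for c in range(0, cols, 2):
            a, b = x[r][c], x[r][c + 1]
            p, q = x[r + 1][c], x[r + 1][c + 1]
            low.append((a + b + p + q) / 2)  # local average (structure)
            dc.append((a - b + p - q) / 2)   # difference between columns
            dr.append((a + b - p - q) / 2)   # difference between rows
            dd.append((a - b - p + q) / 2)   # diagonal difference
        f_l.append(low)
        d_col.append(dc)
        d_row.append(dr)
        d_diag.append(dd)
    return f_l, (d_col, d_row, d_diag)


def energy(m: Matrix) -> float:
    """Sum of squared entries, preserved by an orthonormal transform."""
    return sum(v * v for row in m for v in row)


image = [[1.0, 2.0], [3.0, 4.0]]
f_l, f_h = haar_dwt2(image)
```

Applying `haar_dwt2` again to `f_l` would give the next decomposition level, which is one way to read the multi-level decomposition the claim refers to.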
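Illustrative sketch (not part of the claims): claim 4 describes a composite loss combining BCE loss and IoU loss, and step four of claim 1 outputs a black-and-white map. Equation (8) is not reproduced in this record, so the following assumes an unweighted sum of the two terms, a soft (differentiable) form of the IoU loss, and a 0.5 threshold for binarization. All three choices and all function names are assumptions, and predictions are flattened to plain lists of probabilities.

```python
import math
from typing import List, Sequence

EPS = 1e-7  # numerical guard against log(0) and division by zero


def bce_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, EPS), 1.0 - EPS)
        total -= g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total / len(pred)


def iou_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Soft IoU loss: 1 minus intersection over union of the soft masks."""
    inter = sum(p * g for p, g in zip(pred, target))
    union = sum(p + g - p * g for p, g in zip(pred, target))
    return 1.0 - (inter + EPS) / (union + EPS)


def total_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Assumed composite loss: unweighted sum of BCE and IoU terms."""
    return bce_loss(pred, target) + iou_loss(pred, target)


def binarize(pred: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map probabilities to white (255, oil stain) or black (0, background)."""
    return [255 if p >= threshold else 0 for p in pred]


truth = [1.0, 0.0, 1.0, 0.0]
good = [0.9, 0.1, 0.8, 0.2]
bad = [0.1, 0.9, 0.2, 0.8]
mask = binarize(good)
```

A prediction close to the ground truth yields a lower `total_loss` than one that inverts it, and `binarize` produces the white-target, black-background map described in step four.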
Description
Oil stain identification method and system based on Mamba and frequency domain cooperation under U-shaped structure
Technical Field
The invention relates to the technical field of oil stain leakage identification, and in particular to an oil stain identification method and system based on cooperation between Mamba and the frequency domain under a U-shaped structure.
Background
Oil pollution events such as offshore oil spills and industrial oil leakage pose serious threats to marine ecosystems, coastal economies, and human health. Developing fast and accurate remote sensing monitoring technology for oil stains, so that oil stain areas can be accurately identified and their area measured, is therefore of great significance for environmental management and disaster emergency response. In recent years, deep learning-based techniques, particularly convolutional neural networks (CNNs) and the Vision Transformer (ViT), have become a research hotspot in this area.
For example, prior art document CN112802006B discloses a deep learning-based edge computing method for identifying motor oil stains. The method separates the training, pruning, and inference of the deep learning model: the model is trained and pruned in the cloud, the pruned model fits the resources and computing power of edge devices, and oil stain detection is performed on local edge devices using that model. This meets the requirement for real-time processing of oil stain identification data, and the deep learning model is small and highly accurate.
Prior art document CN114943895B discloses a method, device, electronic equipment, and storage medium for identifying ground oil stains in transformer substations, which aims to improve the accuracy and speed of ground oil stain identification.
CNN-based models have achieved significant success in image segmentation tasks thanks to local connectivity and weight sharing. However, the inherently local receptive field of a CNN limits its ability to capture long-distance spatial dependencies. In remote sensing images, oil stain areas are often irregular in shape, scattered in distribution, and blurred at their boundaries with the background, so the lack of long-distance contextual information easily leads to misjudgment and omission for large-area oil stains or those with complex edges. To overcome the limitations of CNNs, the Vision Transformer (ViT) model was introduced; it has shown excellent performance in a variety of visual tasks by virtue of the strong global modeling capability of the self-attention mechanism. However, ViT has clear drawbacks. First, the computational complexity of the self-attention mechanism grows with the square of the number of image tokens, and hence of the image size, so processing high-resolution remote sensing images incurs huge computation and memory overhead and makes lightweight deployment difficult. Second, the standard ViT lacks effective perception of multi-scale information during feature extraction, whereas oil stain regions vary enormously in scale, from small oil films to large oil belts; this insensitivity to scale changes directly degrades identification precision, and the detection of small-area oil stains is especially poor.
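To make the quadratic cost of self-attention concrete, the following sketch counts tokens and token pairs for two image sizes. It assumes non-overlapping 16×16 patches, a common ViT configuration that this document does not specify, and the function name `attention_pairs` is hypothetical.

```python
from typing import Tuple


def attention_pairs(height: int, width: int, patch: int = 16) -> Tuple[int, int]:
    """Return (token count, token-pair count) for one self-attention layer.

    Each image is cut into non-overlapping patch x patch tokens, and
    self-attention compares every token with every token.
    """
    tokens = (height // patch) * (width // patch)
    return tokens, tokens * tokens


small = attention_pairs(224, 224)    # 14 x 14 patches
large = attention_pairs(1024, 1024)  # 64 x 64 patches
growth = large[1] / small[1]         # how much the pairwise cost grows
```

Under these assumptions, enlarging the image from 224×224 to 1024×1024 multiplies the token count by about 21 but the number of attended token pairs by more than 400.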
Recently, state space models (SSMs), and the Mamba architecture in particular, have offered new ideas for visual tasks thanks to their linear computational complexity and global receptive field in long-sequence modeling. However, directly applying Mamba to complex oil stain identification tasks remains challenging. Simple sequence modeling may not adequately capture high-frequency details in the image, such as the sharp boundaries between oil stains and water or shorelines, which are critical to accurate segmentation. Frequency-domain analysis has been shown to effectively enhance a model's ability to extract image texture and edge information, but existing methods rarely fuse the global context modeling capability of Mamba deeply with frequency-domain analysis to jointly address long-range dependence and fine boundary segmentation in oil stain identification. In summary, prior-art oil stain recognition models are deficient in, or struggle to balance, global context awareness, multi-scale adaptability, computational efficiency, and detail retention. Therefore, a novel technical scheme capable of efficiently and accurately identifying oil stain areas with different scales and simultaneously considering light weight of a model a