CN-122024158-A - Intelligent water surface floating garbage identification system and method based on improved YOLO model


Abstract

The invention belongs to the technical field of computer vision and relates to an intelligent recognition system and method for water surface floating garbage based on an improved YOLO model. The system comprises a water surface floating garbage recognition module, which collects a water surface floating garbage image and inputs it into a water surface floating garbage recognition model to obtain a recognition result. The recognition model is obtained by training an improved YOLOv8 model on a training set. The recognition module comprises an initial model construction submodule, which improves the YOLOv8 model in two ways: a frequency domain-time domain fusion feature extraction unit replaces a preset C2f layer of the Backbone network, and a context feature fusion unit is introduced into the Neck network to perform feature fusion on feature maps of different levels.

Inventors

YU YUANHUI
Hou Yazheng

Assignees

集美大学

Dates

Publication Date: 20260512
Application Date: 20260202

Claims (10)

1. An intelligent recognition system for water surface floating garbage based on an improved YOLO model, characterized by comprising: a water surface floating garbage recognition module, configured to acquire a water surface floating garbage image and input it into a water surface floating garbage recognition model to obtain a water surface floating garbage recognition result, wherein the recognition model is obtained by training an improved YOLOv8 model on a training set, and the training set comprises original water surface floating garbage images and corresponding labels; wherein the water surface floating garbage recognition module comprises an initial model construction submodule configured to improve the YOLOv8 model, in which a preset C2f layer of the Backbone network of the YOLOv8 model is replaced by a frequency domain-time domain fusion feature extraction unit configured to extract frequency domain-time domain fusion features of the water surface floating garbage image, and a context feature fusion unit is introduced into the Neck network of the YOLOv8 model to perform feature fusion on feature maps of different levels.
2. The intelligent recognition system for water surface floating garbage based on an improved YOLO model according to claim 1, wherein the water surface floating garbage recognition module further comprises: a data labeling submodule, configured to annotate the various kinds of water surface floating garbage in the original water surface floating garbage images with bounding boxes to obtain labeled water surface floating garbage images; and a data processing submodule, configured to apply flipping, cropping, and exposure adjustment to the labeled water surface floating garbage images to obtain a labeled data set, the labeled data set being used as the training set.
3. The intelligent recognition system for water surface floating garbage based on an improved YOLO model according to claim 1, wherein the frequency domain-time domain fusion feature extraction unit comprises: a frequency domain feature extraction branch, configured to convert a first time domain image into the frequency domain by fast Fourier transform, extract frequency domain features by convolution, and convert the result back into a second time domain image by inverse fast Fourier transform; a time domain feature extraction branch, configured to enhance the target edge features of the first time domain image with a Scharr operator and then complete time domain feature extraction by convolution; and a frequency domain-time domain feature fusion branch, configured to receive the second time domain image from the frequency domain branch and the time domain feature map from the time domain branch, perform preliminary feature fusion by adaptively allocating the fusion proportion of the two feature paths through a set of learnable weights, then apply a Coordinate Attention mechanism to further highlight important feature regions in the spatial dimension, and output a fused feature map with global perception capability.
4. The intelligent recognition system for water surface floating garbage based on an improved YOLO model according to claim 1, wherein the context feature fusion unit comprises: a feature extraction branch, configured to extract features from a high-resolution feature map and a low-resolution feature map using independent mappings, respectively, to obtain target features; a channel-spatial gating branch, configured to adjust the channel and spatial information of the target features to obtain target enhanced features; and a feature fusion branch, configured to dynamically weight the channel attention and the spatial attention of the target enhanced features through learnable parameters, output a final gating value, and adaptively fuse the channel and spatial information of the features based on the final gating value; wherein, in the adaptive fusion process, the low-resolution feature weight and the high-resolution feature weight are used to dynamically fuse cross-scale features through a residual connection, semantic interaction is performed on the context feature information based on grouped convolution and channel shuffling, and a feature map containing the context feature information is generated.
5. The intelligent recognition system for water surface floating garbage based on an improved YOLO model according to claim 4, wherein the channel-spatial gating branch comprises: a mixed channel attention sub-branch, configured to extract channel global information through average pooling and max pooling, and to generate a channel attention gating value through Sigmoid activation after element-wise addition and convolution; and a direction-sensitive spatial attention sub-branch, configured to capture spatial context information in the horizontal and vertical directions using a combination of direction-sensitive convolutions and to obtain a spatial attention gating value.
6. An intelligent recognition method for water surface floating garbage based on an improved YOLO model, implemented by the system according to any one of claims 1 to 5, characterized by comprising the following steps: collecting a water surface floating garbage image and inputting it into a water surface floating garbage recognition model to obtain a water surface floating garbage recognition result, wherein the recognition model is obtained by training an improved YOLOv8 model on a training set, and the training set comprises original water surface floating garbage images and corresponding labels; and improving the YOLOv8 model, namely replacing a preset C2f layer of the Backbone network of the YOLOv8 model with a frequency domain-time domain fusion feature extraction unit for extracting frequency domain-time domain fusion features of the water surface floating garbage image, and introducing a context feature fusion unit into the Neck network of the YOLOv8 model for performing feature fusion on feature maps of different levels.
7. The intelligent recognition method for water surface floating garbage based on an improved YOLO model according to claim 6, wherein obtaining the training set comprises: annotating the various kinds of water surface floating garbage in the original water surface floating garbage images with bounding boxes to obtain labeled water surface floating garbage images; and applying flipping, cropping, and exposure adjustment to the labeled water surface floating garbage images to obtain a labeled data set, the labeled data set being used as the training set.
8. The intelligent recognition method for water surface floating garbage based on an improved YOLO model according to claim 6, wherein the frequency domain-time domain fusion feature extraction unit comprises: a frequency domain feature extraction branch, configured to convert a first time domain image into the frequency domain by fast Fourier transform, extract frequency domain features by convolution, and convert the result back into a second time domain image by inverse fast Fourier transform; a time domain feature extraction branch, configured to enhance the target edge features of the first time domain image with a Scharr operator and then complete time domain feature extraction by convolution; and a frequency domain-time domain feature fusion branch, configured to receive as inputs the second time domain image from the frequency domain branch and the time domain feature map from the time domain branch, perform preliminary feature fusion by adaptively allocating the fusion proportion of the two feature paths through a set of learnable weights, then apply a Coordinate Attention mechanism to further highlight important feature regions in the spatial dimension, and finally output a fused feature map with global perception capability.
9. The intelligent recognition method for water surface floating garbage based on an improved YOLO model according to claim 6, wherein the context feature fusion unit comprises: a feature extraction branch, configured to extract features from a high-resolution feature map and a low-resolution feature map using independent mappings, respectively, to obtain target features; a channel-spatial gating branch, configured to adjust the channel and spatial information of the target features to obtain target enhanced features; and a feature fusion branch, configured to dynamically weight the channel attention and the spatial attention of the target enhanced features through learnable parameters, output a final gating value, and adaptively fuse the channel and spatial information of the features based on the final gating value; wherein, in the adaptive fusion process, the low-resolution feature weight and the high-resolution feature weight are used to dynamically fuse cross-scale features through a residual connection, semantic interaction is performed on the context feature information based on grouped convolution and channel shuffling, and a feature map containing the context feature information is generated.
10. The intelligent recognition method for water surface floating garbage based on an improved YOLO model according to claim 9, wherein the channel-spatial gating branch comprises: a mixed channel attention sub-branch, configured to extract channel global information through average pooling and max pooling, and to generate a channel attention gating value through Sigmoid activation after element-wise addition and convolution; and a direction-sensitive spatial attention sub-branch, configured to capture spatial context information in the horizontal and vertical directions using a combination of direction-sensitive convolutions and to obtain a spatial attention gating value.
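
The frequency domain-time domain fusion feature extraction unit of claims 3 and 8 can be illustrated with a minimal single-channel NumPy sketch. It is not the patented implementation: the function names, the pointwise spectral weight standing in for the frequency-domain convolution, the way the Scharr gradient magnitude is added back to the image, and the softmax normalization of the two learnable weights are all assumptions made for illustration. The Coordinate Attention step is omitted.

```python
import numpy as np

# Standard 3x3 Scharr kernels for horizontal and vertical gradients.
SCHARR_X = np.array([[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]], dtype=float)
SCHARR_Y = SCHARR_X.T


def filter3x3(img, kernel):
    """Apply a 3x3 kernel (cross-correlation) with edge-replicated padding."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def frequency_branch(img, spectral_weight):
    """FFT -> pointwise spectral weighting -> inverse FFT.

    Assumption: the pointwise weight is a stand-in for the convolution
    that the claims apply in the frequency domain.
    """
    spectrum = np.fft.fft2(img)
    return np.real(np.fft.ifft2(spectrum * spectral_weight))


def time_branch(img, conv_kernel):
    """Scharr edge enhancement followed by a 3x3 convolution.

    Assumption: edges are enhanced by adding the gradient magnitude
    to the input image.
    """
    gx = filter3x3(img, SCHARR_X)
    gy = filter3x3(img, SCHARR_Y)
    edge_enhanced = img + np.hypot(gx, gy)
    return filter3x3(edge_enhanced, conv_kernel)


def fuse(freq_feat, time_feat, logits):
    """Blend the two branches with softmax-normalized learnable weights (assumed)."""
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    return w[0] * freq_feat + w[1] * time_feat


# Toy input: an 8x8 image with a vertical step edge between columns 3 and 4.
img = np.zeros((8, 8))
img[:, 4:] = 1.0

identity = np.zeros((3, 3))
identity[1, 1] = 1.0  # identity kernel, so the convolution leaves features unchanged

freq_feat = frequency_branch(img, np.ones((8, 8)))  # all-ones weight: round trip
time_feat = time_branch(img, identity)
fused = fuse(freq_feat, time_feat, np.zeros(2))     # equal weights of 0.5 each
```

With an all-ones spectral weight the frequency branch simply reproduces the input, while the time domain branch responds strongly at the step edge. In a trained model both the spectral weight and the fusion logits would be learned parameters.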

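The mixed channel attention sub-branch of claims 5 and 10 and the channel shuffling mentioned in claims 4 and 9 can be sketched in plain Python as follows. This is an illustrative sketch only: the helper names are hypothetical, a scalar affine map stands in for the convolution (whose shape the claims do not specify), and the shuffle follows the common ShuffleNet-style interleaving, which the claims do not confirm.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feature, weight=1.0, bias=0.0):
    """Return one gating value per channel.

    feature: list of channels, each channel a 2D list of floats.
    Average-pooled and max-pooled descriptors are added element-wise,
    passed through a scalar affine map (assumed stand-in for the
    convolution), and squashed with Sigmoid.
    """
    gates = []
    for channel in feature:
        values = [v for row in channel for v in row]
        avg_pooled = sum(values) / len(values)
        max_pooled = max(values)
        gates.append(sigmoid(weight * (avg_pooled + max_pooled) + bias))
    return gates


def apply_gate(feature, gates):
    """Scale every value of each channel by that channel's gate."""
    return [[[g * v for v in row] for row in channel]
            for channel, g in zip(feature, gates)]


def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape, transpose, flatten)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Two toy 2x2 channels: one empty, one containing a bright pixel.
feature = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 3.0], [0.0, 0.0]],
]
gates = channel_gate(feature)        # gate 0 is sigmoid(0) = 0.5
gated = apply_gate(feature, gates)

shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)  # [0, 3, 1, 4, 2, 5]
```

The channel with the bright pixel receives a gate close to 1 and the empty channel a gate of 0.5, so informative channels are suppressed less. The shuffle mixes channels from different groups so that a following grouped convolution can exchange information across groups.
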
Description

Intelligent water surface floating garbage identification system and method based on improved YOLO model

Technical Field

The invention relates to the technical field of computer vision, in particular to an intelligent recognition system and method for water surface floating garbage based on an improved YOLO model.

Background

With the rapid development of artificial intelligence and computer vision, intelligent recognition systems for water surface floating garbage are gradually being applied in industry, environmental monitoring, and other fields. Traditional identification of water surface floating garbage generally relies on manual observation and classification, which is time-consuming, labor-intensive, and error-prone. With the continuing progress of deep learning, intelligent recognition systems based on the YOLO algorithm are becoming an efficient and accurate solution.

Conventional water surface floating garbage recognition systems generally use CNN-based and R-CNN-based methods for image classification and recognition. However, these methods have drawbacks in practice that limit their performance in efficient real-time recognition. R-CNN is a deep learning method based on region extraction: it partitions the image by generating candidate regions and classifies each candidate region with a convolutional neural network, thereby recognizing targets. Although R-CNN can identify water surface floating garbage accurately, its computational cost is large and its processing speed is low, because CNN features must be extracted and classified independently for each candidate region. As a result, it performs poorly in real-time recognition tasks.

CNN is a foundational network architecture in deep learning and is widely applied to tasks such as image classification and feature extraction. Conventional CNN methods typically extract image features first and then feed them into a fully connected layer for classification. Although CNNs achieve high recognition accuracy, they cannot accurately predict the position and number of targets in complex scenes or in multi-target detection. CNNs therefore usually need additional algorithms to handle multi-target detection tasks, which increases computation and complexity.

Compared with R-CNN and CNN, YOLO has clear advantages for the intelligent recognition of water surface floating garbage. Its efficiency and real-time performance allow it to identify water surface floating garbage quickly without sacrificing accuracy, and it is suitable for various dynamic environments. While R-CNN and CNN perform well in still-image recognition, their high computational complexity often makes it difficult to meet real-time recognition requirements.

Disclosure of Invention

To solve the problems in the prior art, the invention aims to provide an intelligent recognition system and method for water surface floating garbage based on an improved YOLO model, in which a YOLOv8 model is adopted as the reference model. YOLOv8 is the most widely used of the existing YOLO models and offers high stability and rich deployment experience, and the improved model achieves accurate recognition of water surface floating garbage.
In order to achieve the above object, the present invention provides the following solutions: An intelligent recognition system for water surface floating garbage based on an improved YOLO model, comprising: a water surface floating garbage recognition module, configured to acquire a water surface floating garbage image and input it into a water surface floating garbage recognition model to obtain a water surface floating garbage recognition result, wherein the recognition model is obtained by training an improved YOLOv8 model on a training set, and the training set comprises original water surface floating garbage images and corresponding labels; wherein the water surface floating garbage recognition module comprises an initial model construction submodule configured to improve the YOLOv8 model, in which a preset C2f layer of the Backbone network of the YOLOv8 model is replaced by a frequency domain-time domain fusion feature extraction unit configured to extract frequency domain-time domain fusion features of the water surface floating garbage image, and a context feature fusion unit is introduced into the Neck network of the YOLOv8 model and used for performing feature fusion on fe