CN-121982706-A - Litchi fruit detection method and device based on improved YOLOv n

Abstract

The invention discloses a litchi fruit detection method and device based on an improved YOLOv n model, belonging to the technical field of computer vision. The method comprises: acquiring a litchi fruit image and preprocessing it to generate an image tensor; performing feature mapping on the image tensor through a CBS module of a backbone network, performing feature extraction on the feature mapping result through a C2f-DWRR module, and outputting first feature maps at multiple scales; performing feature enhancement on the first feature map with the largest scale through an SPPF-LSKA module of a neck network, performing feature integration on the feature enhancement result and the first feature maps of the other scales through a C2f module, and outputting second feature maps at multiple scales; performing a classification task and a regression task on each second feature map in parallel through an LSCD detection head, and outputting a plurality of prediction result tensors; and applying confidence screening and non-maximum suppression to each prediction result tensor to generate a litchi fruit detection result. Implementing the invention can therefore improve the accuracy and efficiency of litchi fruit detection.

Inventors

LI XINCHAO
YANG MINGYAN
SUN GUOXI
ZHANG KUNTAO
LI ZHENYING

Assignees

Guangdong University of Petrochemical Technology (广东石油化工学院)

Dates

Publication Date: 20260505
Application Date: 20260409

Claims (10)

1. A litchi fruit detection method based on an improved YOLOv n model, characterized by comprising the following steps: acquiring a litchi fruit image to be detected, and performing data preprocessing on the litchi fruit image to generate an image tensor; inputting the image tensor into a pre-trained backbone network of an improved YOLOv n model, so that the backbone network performs feature mapping on the image tensor through a CBS module, performs feature extraction on the feature mapping result through a C2f-DWRR module, and outputs first feature maps at multiple scales; inputting each first feature map into a neck network of the improved YOLOv n model, so that the neck network performs feature enhancement on the first feature map with the largest scale through an SPPF-LSKA module, performs feature integration on the feature enhancement result and the first feature maps of the other scales through a C2f module, and outputs second feature maps at multiple scales; inputting each second feature map into an LSCD detection head of the improved YOLOv n model, so that the LSCD detection head performs a classification task and a regression task on each second feature map in parallel based on shared convolution, and outputs a plurality of prediction result tensors, wherein each prediction result tensor comprises coordinates, confidences and class probabilities of a plurality of candidate boxes; and screening the candidate boxes in each prediction result tensor according to a preset confidence threshold, and performing non-maximum suppression on the candidate boxes remaining after screening to generate a litchi fruit detection result.
2. The litchi fruit detection method based on an improved YOLOv n model according to claim 1, wherein before acquiring the litchi fruit image to be detected and performing data preprocessing on the litchi fruit image, the method further comprises: acquiring a litchi fruit dataset, wherein the litchi fruit dataset is annotated with oriented bounding boxes (OBB); and iteratively training an initial improved YOLOv n model on the litchi fruit dataset until the improved YOLOv n model meets a preset convergence condition, to obtain an optimal improved YOLOv n model, wherein the improved YOLOv n model is iteratively trained with a PIoUv2 loss function.
3. The litchi fruit detection method based on an improved YOLOv n model according to claim 1, wherein inputting the image tensor into a pre-trained backbone network of an improved YOLOv n model, so that the backbone network performs feature mapping on the image tensor through a CBS module, performs feature extraction on the feature mapping result through a C2f-DWRR module, and outputs first feature maps at multiple scales, comprises: performing feature mapping on the image tensor through a first CBS module to generate an initial feature map; and based on the initial feature map, sequentially performing feature extraction through first module groups of multiple scales in descending order of scale, and outputting the first feature maps at multiple scales, wherein each first module group comprises a second CBS module and a C2f-DWRR module, the second CBS module is used for performing feature mapping on the input features, the C2f-DWRR module is used for performing feature extraction on the feature mapping result of the second CBS module, the input feature of the first module group corresponding to the largest scale is the initial feature map, and the input feature of each other first module group is the output feature of the preceding first module group.
4. The litchi fruit detection method based on an improved YOLOv n model according to claim 3, wherein the C2f-DWRR module being used for performing feature extraction on the feature mapping result of the second CBS module comprises: performing feature mapping on the feature mapping result of the second CBS module through a third CBS module to generate a first sub-feature map; splitting the first sub-feature map into a second sub-feature map and a third sub-feature map, and inputting the third sub-feature map into a DWRR module, so that the DWRR module performs feature enhancement on the third sub-feature map based on a dilated residual mechanism and a DRB module, and outputs a fourth sub-feature map; and concatenating the second sub-feature map and the fourth sub-feature map, and performing feature mapping on the concatenation result through a fourth CBS module to generate a first feature map.
5. The litchi fruit detection method based on an improved YOLOv n model according to claim 1, wherein inputting each first feature map into the neck network of the improved YOLOv n model, so that the neck network performs feature enhancement on the first feature map with the largest scale through an SPPF-LSKA module, performs feature integration on the feature enhancement result and the first feature maps of the other scales through a C2f module, and outputs second feature maps at multiple scales, comprises: performing feature enhancement on the first feature map with the largest scale through the SPPF-LSKA module to generate a first enhanced feature map; based on the first enhanced feature map, sequentially performing feature integration through second module groups of multiple scales in ascending order of scale, and outputting second enhanced feature maps at multiple scales, wherein each second module group comprises an Upsample module, a first Concat module and a first C2f module, the Upsample module is used for upsampling the input features, the first Concat module is used for concatenating the upsampling result with the first feature map of the corresponding scale, the first C2f module is used for performing feature integration on the concatenation result of the first Concat module, the input feature of the second module group corresponding to the smallest scale is the first enhanced feature map, the input feature of each other second module group is the output feature of the preceding second module group, and the first enhanced feature map serves as the second enhanced feature map of the smallest scale; and based on the second enhanced feature maps, sequentially performing feature integration through third module groups of multiple scales in descending order of scale, and outputting the second feature maps at multiple scales, wherein each third module group comprises a fifth CBS module, a second Concat module and a second C2f module, the fifth CBS module is used for performing feature mapping on the input features, the second Concat module is used for concatenating the feature mapping result of the fifth CBS module with the second enhanced feature map of the corresponding scale, the second C2f module is used for performing feature integration on the concatenation result of the second Concat module, the input feature of the third module group corresponding to the largest scale is the second enhanced feature map of the largest scale, the input feature of each other third module group is the output feature of the preceding third module group, and the second enhanced feature map of the largest scale serves as the second feature map of the largest scale.
6. The litchi fruit detection method based on an improved YOLOv n model according to claim 5, wherein performing feature enhancement on the first feature map with the largest scale through the SPPF-LSKA module to generate a first enhanced feature map comprises: performing feature mapping on the first feature map with the largest scale through a sixth CBS module to generate a fifth sub-feature map; performing feature pooling on the fifth sub-feature map through a plurality of MaxPool2d modules to generate a sixth sub-feature map; inputting the sixth sub-feature map into an LSKA module, so that the LSKA module performs feature enhancement on the sixth sub-feature map based on a large separable kernel attention mechanism, and outputs a seventh sub-feature map; and concatenating the seventh sub-feature map and the fifth sub-feature map, and performing feature mapping on the concatenation result through a seventh CBS module to generate the first enhanced feature map.
7. The litchi fruit detection method based on an improved YOLOv n model according to claim 1, wherein inputting each second feature map into the LSCD detection head of the improved YOLOv n model, so that the LSCD detection head performs a classification task and a regression task on each second feature map in parallel based on shared convolution, and outputs a plurality of prediction result tensors, comprises: performing feature compression on each second feature map through a plurality of 1×1 Conv-GN modules respectively, to generate compressed feature maps at multiple scales; performing feature fusion on each compressed feature map through a plurality of 3×3 Conv-GN modules, to generate fused feature maps at multiple scales; and based on the fused feature maps, performing the classification task and the regression task in parallel through fourth module groups of multiple scales, and outputting the plurality of prediction result tensors, wherein each fourth module group comprises a classification branch and a regression branch, the classification branch is used for outputting class probabilities, the regression branch is used for outputting coordinates and confidences, and the regression branch comprises a Scale layer.
8. A litchi fruit detection device based on an improved YOLOv n model, characterized by comprising a data preprocessing module, a backbone network operation module, a neck network operation module, a detection head operation module and a result generation module; the data preprocessing module is used for acquiring a litchi fruit image to be detected, and performing data preprocessing on the litchi fruit image to generate an image tensor; the backbone network operation module is used for inputting the image tensor into a pre-trained backbone network of an improved YOLOv n model, so that the backbone network performs feature mapping on the image tensor through a CBS module, performs feature extraction on the feature mapping result through a C2f-DWRR module, and outputs first feature maps at multiple scales; the neck network operation module is used for inputting each first feature map into a neck network of the improved YOLOv n model, so that the neck network performs feature enhancement on the first feature map with the largest scale through an SPPF-LSKA module, performs feature integration on the feature enhancement result and the first feature maps of the other scales through a C2f module, and outputs second feature maps at multiple scales; the detection head operation module is used for inputting each second feature map into an LSCD detection head of the improved YOLOv n model, so that the LSCD detection head performs a classification task and a regression task on each second feature map in parallel based on shared convolution, and outputs a plurality of prediction result tensors, wherein each prediction result tensor comprises coordinates, confidences and class probabilities of a plurality of candidate boxes; and the result generation module is used for screening the candidate boxes in each prediction result tensor according to a preset confidence threshold, and performing non-maximum suppression on the candidate boxes remaining after screening to generate a litchi fruit detection result.
9. A terminal device, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the litchi fruit detection method based on an improved YOLOv n model according to any one of claims 1-7.
10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when run, controls a device in which the computer-readable storage medium is located to perform the litchi fruit detection method based on an improved YOLOv n model according to any one of claims 1-7.
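
The neck wiring recited in claim 5 can be illustrated with a small Python sketch that only tracks spatial sizes. Everything numeric here is an assumption for illustration and is not stated in the claims: three square first feature maps at strides 8, 16 and 32, 2x upsampling, stride-2 downsampling in the fifth CBS module, and size-preserving SPPF-LSKA and C2f modules. The sketch also assumes the SPPF-LSKA module receives the lowest-resolution map, consistent with claim 5 treating the first enhanced feature map as the smallest-scale one. The function name `trace_neck` is hypothetical.

```python
from typing import Dict, List, Tuple


def trace_neck(first_maps: Dict[int, int]) -> Tuple[Dict[int, int], List[str]]:
    """Trace square feature-map side lengths through the neck of claim 5.

    first_maps maps an assumed backbone stride (e.g. 8, 16, 32) to the side
    length of the first feature map at that stride.
    Returns the side length of each second feature map plus a wiring log.
    """
    strides = sorted(first_maps)      # high resolution first, e.g. [8, 16, 32]
    log: List[str] = []

    # SPPF-LSKA is assumed to keep the spatial size of the map it enhances.
    deepest = strides[-1]
    enhanced = {deepest: first_maps[deepest]}
    log.append(f"SPPF-LSKA(first@{deepest}) -> enhanced@{deepest}")

    # Second module groups: Upsample -> Concat -> C2f, toward higher resolution.
    for prev, cur in zip(reversed(strides), reversed(strides[:-1])):
        side = enhanced[prev] * 2     # assumed 2x upsampling
        if side != first_maps[cur]:
            raise ValueError(f"Concat size mismatch at stride {cur}")
        enhanced[cur] = side          # C2f is assumed to keep the spatial size
        log.append(f"Upsample(enhanced@{prev}) + first@{cur} -> C2f -> enhanced@{cur}")

    # Third module groups: CBS -> Concat -> C2f, back toward lower resolution.
    shallowest = strides[0]
    second = {shallowest: enhanced[shallowest]}
    log.append(f"enhanced@{shallowest} -> second@{shallowest}")
    for prev, cur in zip(strides, strides[1:]):
        side = second[prev] // 2      # assumed stride-2 CBS downsampling
        if side != enhanced[cur]:
            raise ValueError(f"Concat size mismatch at stride {cur}")
        second[cur] = side
        log.append(f"CBS(second@{prev}) + enhanced@{cur} -> C2f -> second@{cur}")
    return second, log


second, log = trace_neck({8: 80, 16: 40, 32: 20})
print(second)
for line in log:
    print(line)
```

The log makes the two passes visible: the enhanced map is upsampled and merged with progressively higher-resolution first feature maps, and the result is then downsampled and merged back, so every second feature map combines information from all scales.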

Description

Litchi fruit detection method and device based on improved YOLOv n

Technical Field

The invention relates to the technical field of computer vision, and in particular to a litchi fruit detection method and device based on an improved YOLOv n model.

Background

Existing litchi fruit detection methods fall into two main types. The first uses traditional image processing, which identifies litchi fruits mainly by the color, texture or shape of the fruit; with complex natural backgrounds, fruit occlusion and similar factors, this approach struggles to reach high precision. The second uses a deep learning network model: a deep-learning-based object detection algorithm can learn representative features autonomously and can detect directly on an input image, enabling efficient litchi fruit detection. However, because litchi fruits grow in clusters and are occluded by dense leaves, and the orchard adds complex background interference, deep learning network models still produce missed detections and false detections when detecting litchi fruits. Meanwhile, because the detection model must be deployed on edge devices with limited memory and computing capacity, the litchi fruit detection model needs to be lightweight, and lightweight detection models often sacrifice part of the detection precision, so detection efficiency and detection precision are difficult to balance.

Disclosure of Invention

The invention provides a litchi fruit detection method and device based on an improved YOLOv n model, which can improve the accuracy and efficiency of litchi fruit detection.
An embodiment of the invention provides a litchi fruit detection method based on an improved YOLOv n model, comprising the following steps: acquiring a litchi fruit image to be detected, and performing data preprocessing on the litchi fruit image to generate an image tensor; inputting the image tensor into a pre-trained backbone network of an improved YOLOv n model, so that the backbone network performs feature mapping on the image tensor through a CBS module, performs feature extraction on the feature mapping result through a C2f-DWRR module, and outputs first feature maps at multiple scales; inputting each first feature map into a neck network of the improved YOLOv n model, so that the neck network performs feature enhancement on the first feature map with the largest scale through an SPPF-LSKA module, performs feature integration on the feature enhancement result and the first feature maps of the other scales through a C2f module, and outputs second feature maps at multiple scales; inputting each second feature map into an LSCD detection head of the improved YOLOv n model, so that the LSCD detection head performs a classification task and a regression task on each second feature map in parallel based on shared convolution, and outputs a plurality of prediction result tensors, wherein each prediction result tensor comprises coordinates, confidences and class probabilities of a plurality of candidate boxes; and screening the candidate boxes in each prediction result tensor according to a preset confidence threshold, and performing non-maximum suppression on the candidate boxes remaining after screening to generate a litchi fruit detection result.
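
The final step, confidence screening followed by non-maximum suppression, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: boxes are simplified to axis-aligned `(x1, y1, x2, y2)` tuples (oriented boxes would need a rotated IoU instead), the threshold values `0.25` and `0.5` are placeholders, and the names `iou` and `screen_and_nms` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned (assumption)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def screen_and_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    conf_thres: float = 0.25,   # placeholder for the preset confidence threshold
    iou_thres: float = 0.5,     # placeholder overlap threshold for suppression
) -> List[int]:
    """Return indices of the boxes that survive, highest confidence first."""
    # Step 1: confidence screening drops low-confidence candidates.
    candidates = [i for i, s in enumerate(scores) if s >= conf_thres]
    # Step 2: greedy NMS keeps a box only if it does not overlap a kept box
    # with higher confidence by more than iou_thres.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
print(screen_and_nms(boxes, scores))  # prints [0, 2]
```

In the usage example, the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the fourth box is removed by the confidence threshold before suppression runs.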
According to the embodiments of the application, data preprocessing converts the original litchi fruit image into a standard format that the model can recognize; feature extraction with a C2f-DWRR module, based on a dilated residual mechanism and a dilated re-parameterization module, enhances the model's perception of the detailed features and edges of litchi fruits; feature enhancement with an SPPF-LSKA module, based on a large separable kernel attention mechanism, enhances the model's ability to recognize litchi fruits against complex backgrounds; an LSCD detection head based on shared convolution processes the classification task and the regression task in parallel and avoids the redundant computation of repeated convolutions in the detection head; and confidence screening and non-maximum suppression on the model output remove redundant and low-quality candidate boxes. Compared with the prior art, in which detection efficiency and detection precision are difficult to balance, the embodiments of the application can improve the accuracy and efficiency of litchi fruit detection. Further, before the litchi fruit image to be detected is obtained and data preprocessing is performed on the litchi fr