CN-121982012-A - Method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network
Abstract
A method for phenotypic analysis and severity quantification of potato tuber diseases using a dual-head CNN-Transformer network, belonging to the technical field of potato disease detection. The aim is to achieve accurate classification of potato tuber diseases. The method comprises: constructing a potato disease dataset and annotating it with the AnyLabeling tool; applying data augmentation to obtain an augmented potato disease dataset, which is divided into a training set and a test set; constructing a dual-head hybrid CNN-Transformer architecture in which a CNN encoder for local feature extraction, a Transformer bottleneck layer for global context modeling, and a dual-head output structure coupled by a novel SAP mechanism are connected in sequence; constructing a composite multi-task loss function to train the dual-head hybrid CNN-Transformer architecture; and using the test set to perform automatic diagnosis and detection of potato diseases.
Inventors
- WANG WENZHONG
- Zhao Jiangsan
Assignees
- 黑龙江省农业科学院经济作物研究所 (Industrial Crops Institute, Heilongjiang Academy of Agricultural Sciences)
Dates
- Publication Date
- 20260505
- Application Date
- 20260205
Claims (5)
- 1. A method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network, comprising the steps of: S1, combining field-acquired potato disease images with potato disease images from public data resources to construct a potato disease dataset, and annotating all images in the potato disease dataset using the AnyLabeling tool; S2, performing data augmentation on the images in the potato disease dataset obtained in step S1 using the image augmentation library albumentations to obtain an augmented potato disease dataset, and dividing the augmented dataset into a training set and a test set; S3, constructing a dual-head hybrid CNN-Transformer architecture in which a CNN encoder for local feature extraction, a Transformer bottleneck layer for global context modeling, and a dual-head output structure coupled through a novel SAP mechanism are connected in sequence; S4, constructing a composite multi-task loss function, training the dual-head hybrid CNN-Transformer architecture constructed in step S3 with the training set obtained in step S2 to obtain a trained dual-head hybrid CNN-Transformer architecture, and performing automatic diagnosis and detection of potato diseases using the test set.
- 2. The method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network according to claim 1, wherein the specific implementation of step S1 comprises the following steps: S1.1, independently collecting potato disease images in the field, including images of black scurf, common scab, pink rot, powdery scab, and silver scurf; S1.2, selecting potato disease images from public data resources, comprising a public benchmark dataset, a newly built expert-annotated dataset from Heilongjiang Province, China, and an independent dataset from Norway, establishing samples of different severity stages of dry rot, and dividing the internal type, the surface type, and the severe type of dry rot into three independent categories; S1.3, constructing the potato disease dataset from the images of step S1.1 and step S1.2, annotating it using the AnyLabeling tool, and generating a pixel-level disease mask for every image by drawing polygonal lesion boundaries.
- 3. The method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network according to claim 2, wherein the specific implementation of step S2 comprises the following steps: S2.1, performing rigid geometric transformation on the images in the potato disease dataset obtained in step S1, with horizontal/vertical flip probabilities of 0.5/0.3, a shift-scale-rotate probability of 0.7, and a rotation angle limit of ±25 degrees; S2.2, performing non-rigid geometric transformation on the images processed in step S2.1, with an elastic/grid distortion probability of 0.4, a pixel displacement amplitude α = 40, and a deformation spatial smoothness σ = 8; S2.3, performing photometric transformation on the images processed in step S2.2, with a brightness/contrast adjustment probability of 0.6 and a limit range of ±0.2, and a contrast-limited adaptive histogram equalization (CLAHE) probability of 0.3 with a clip limit of 3.0; S2.4, performing environment simulation on the images processed in step S2.3, with a shadow probability of 0.5, followed by blurring with a Gaussian/motion/median blur probability of 0.4 and a convolution kernel size of 3 to 5, to obtain an augmented potato disease dataset with an image size of 256 × 256; S2.5, dividing the augmented potato disease dataset into a training set and a test set by stratified sampling at a ratio of 9:1.
- 4. The method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network according to claim 3, wherein the specific implementation of step S3 comprises the following steps: S3.1, configuring the CNN encoder for local feature extraction by taking a ResNet-34 backbone pre-trained on ImageNet as the feature encoder and removing its final classification layer and pooling layer, and inputting a potato disease image into the CNN encoder to obtain the feature map output by the encoder; S3.2, configuring the Transformer bottleneck layer for global context modeling as 4 Transformer encoder modules, and inputting the feature map output by the encoder into the Transformer bottleneck layer to obtain the feature map output by the Transformer bottleneck; S3.3, configuring the dual-head output structure coupled by the novel SAP mechanism to comprise a softmax-activated classification head for identifying diseases and a segmentation decoder with a deconvolution structure and sigmoid activation for generating lesion masks, wherein the feature map is selectively gated by applying a segmentation-aware pooling (SAP) hard attention constraint, establishing an explicit structural dependency between the segmentation and classification tasks; letting F denote the feature map output by the Transformer bottleneck and M the predicted soft segmentation mask, the classification feature vector v_cls is calculated as: v_cls = GAP(F ⊙ Norm(M)), where ⊙ denotes element-wise multiplication, Norm(M) denotes normalization of the soft mask to form a spatial attention map derived from the segmentation, and GAP is global average pooling, which compresses the spatial feature map into the classification feature vector.
- 5. The method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network according to claim 4, wherein the specific implementation of step S4 comprises the following steps: S4.1, optimizing the network with a composite multi-task loss function L_total, expressed as: L_total = λ_cls · L_CE + λ_seg · (L_Dice + L_BCE), where L_CE denotes the cross-entropy loss used by the classification task, L_BCE denotes the binary cross-entropy loss, and L_Dice denotes the Dice loss, with the classification weight coefficient λ_cls = 1.0 and the segmentation weight coefficient λ_seg = 1.0; S4.2, training the network with an Adam optimizer, with an initial learning rate of 1e-4, a weight decay of 1e-2, a training batch size of 4, and a validation batch size of 1; adopting a strategy that decouples learning rate adjustment from model selection, wherein a ReduceLROnPlateau scheduler dynamically adjusts the learning rate based on the validation set loss, and if the training set loss does not decrease for 5 consecutive epochs, the learning rate is decayed by a factor of 0.9; S4.3, setting the selection criterion for the optimal model checkpoint to maximizing the training set Dice similarity coefficient, and triggering an early stopping mechanism, by monitoring the Dice score, if the training set Dice score does not improve for 40 consecutive epochs.
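The two formulas in claims 4 and 5, v_cls = GAP(F ⊙ Norm(M)) and L_total = λ_cls · L_CE + λ_seg · (L_Dice + L_BCE), can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names, the array shapes, the smoothing constants, and in particular the choice of Norm (rescaling the mask to unit mean, which turns the pooling into a mask-weighted average) are assumptions made for illustration; the claims do not define Norm.

```python
import numpy as np

def sap_pool(F, M, eps=1e-6):
    """v_cls = GAP(F * Norm(M)). F: (C, H, W) features, M: (H, W) soft mask in [0, 1]."""
    # ASSUMPTION: Norm rescales the mask to unit mean; the source does not define it.
    norm_m = M / (M.mean() + eps)
    # Gate every channel by the mask, then global-average-pool over H and W.
    return (F * norm_m[None, :, :]).mean(axis=(1, 2))

def cross_entropy(logits, label):
    """L_CE for one sample: negative log-softmax of the true class."""
    z = logits - logits.max()              # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def dice_loss(pred, target, eps=1e-6):
    """L_Dice = 1 - Dice coefficient (eps is an assumed smoothing term)."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def bce_loss(pred, target, eps=1e-7):
    """L_BCE averaged over pixels; predictions are clipped to avoid log(0)."""
    p = np.clip(pred, eps, 1.0 - eps)
    return -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)).mean()

def total_loss(logits, label, pred_mask, true_mask, lam_cls=1.0, lam_seg=1.0):
    """L_total = lam_cls * L_CE + lam_seg * (L_Dice + L_BCE), weights as in step S4.1."""
    seg = dice_loss(pred_mask, true_mask) + bce_loss(pred_mask, true_mask)
    return lam_cls * cross_entropy(logits, label) + lam_seg * seg

# Toy usage: 8 feature channels on a 16 x 16 grid with a square "lesion" mask.
rng = np.random.default_rng(0)
F = rng.normal(size=(8, 16, 16))
M = np.zeros((16, 16))
M[4:8, 4:8] = 1.0
v_cls = sap_pool(F, M)                     # one value per channel
loss = total_loss(np.zeros(3), 1, M, M)    # uniform logits, perfect mask
```

Under the unit-mean assumption, background pixels with M = 0 contribute nothing to v_cls, which matches the stated intent that classification depends only on the segmented lesion region.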
Description
Method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network
Technical Field
The invention belongs to the technical field of potato disease detection, and particularly relates to a method for phenotypic analysis and severity quantification of potato tuber diseases using a dual-head CNN-Transformer network.
Background
Accurate and interpretable potato tuber disease detection is critical to postharvest quality assessment and disease management. However, existing models often exploit background correlations ("shortcut learning") rather than true pathological features. Furthermore, existing methods typically treat disease classification and lesion segmentation as independent tasks, failing to model their inherent spatial dependencies. Shortcut learning remains a common but often neglected challenge in deploying deep learning models in agriculture. Standard classifiers often rely on spurious correlations (such as background color, tabletop texture, or lighting conditions) to infer disease categories, rather than focusing on the pathology itself. The resulting model is a "black box" that performs well on a particular test set but generalizes poorly to real environments with varying backgrounds. Moreover, most existing multi-task frameworks treat lesion classification and lesion segmentation as parallel, independent data streams. This design ignores the biological dependency that disease identification should be determined entirely by the visual characteristics of the affected tissue, not by the surrounding healthy epidermis or the background environment. Failure to model this dependency limits the interpretability and biological validity of the system.
Disclosure of Invention
The invention aims to achieve accurate classification, lesion segmentation, and interpretable quantitative lesion analysis of potato tuber diseases, and provides a method for phenotypic analysis and severity quantification of potato tuber diseases using a dual-head CNN-Transformer network. To achieve the above purpose, the invention is realized by the following technical scheme: A method for phenotyping and quantifying the severity of potato tuber diseases using a dual-head CNN-Transformer network, comprising the steps of: S1, combining field-acquired potato disease images with potato disease images from public data resources to construct a potato disease dataset, and annotating all images in the potato disease dataset using the AnyLabeling tool; S2, performing data augmentation on the images in the potato disease dataset obtained in step S1 using the image augmentation library albumentations to obtain an augmented potato disease dataset, and dividing the augmented dataset into a training set and a test set; S3, constructing a dual-head hybrid CNN-Transformer architecture in which a CNN encoder for local feature extraction, a Transformer bottleneck layer for global context modeling, and a dual-head output structure coupled through a novel SAP mechanism are connected in sequence; S4, constructing a composite multi-task loss function, training the dual-head hybrid CNN-Transformer architecture constructed in step S3 with the training set obtained in step S2 to obtain a trained dual-head hybrid CNN-Transformer architecture, and performing automatic diagnosis and detection of potato diseases using the test set.
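Step S2 ends by dividing the augmented dataset into a training set and a test set, which the claims specify as a stratified 9:1 split. The following pure-Python sketch shows one way such a split could be done. The function name `stratified_split`, the fixed seed, the rounding rule, and the toy class labels are hypothetical choices for illustration, not details taken from the patent.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.1, seed=0):
    """Return (train_idx, test_idx) so each class is split at about 9:1."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # ASSUMPTION: fixed seed for reproducibility
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # ASSUMPTION: every class sends at least one sample to the test set,
        # so a class with a single sample would have no training samples.
        n_test = max(1, round(len(idxs) * test_fraction))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Toy usage with made-up class counts (not the patent's data).
labels = ["common_scab"] * 50 + ["dry_rot"] * 30 + ["pink_rot"] * 20
train_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the pooled dataset keeps rare disease categories represented in both subsets.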
Further, the specific implementation of step S1 comprises the following steps: S1.1, independently collecting potato disease images in the field, including images of black scurf, common scab, pink rot, powdery scab, and silver scurf; S1.2, selecting potato disease images from public data resources, comprising a public benchmark dataset, a newly built expert-annotated dataset from Heilongjiang Province, China, and an independent dataset from Norway, establishing samples of different severity stages of dry rot, and dividing the internal type, the surface type, and the severe type of dry rot into three independent categories; S1.3, constructing the potato disease dataset from the images of step S1.1 and step S1.2, annotating it using the AnyLabeling tool, and generating a pixel-level disease mask for every image by drawing polygonal lesion boundaries.
Further, the specific implementation of step S2 comprises the following steps: S2.1, performing rigid geometric transformation on the images in the potato disease dataset obtained in step S1, with horizontal/vertical flip probabilities of 0.5/0.3, a shift-scale-rotate probability of 0.7, and a rotation angle limit of ±25 degrees; S2.2, performing non-rigid geo