CN-122023903-A - Crop disease identification method and device based on semantic cone


Abstract

The invention relates to a semantic cone-based crop disease identification method, comprising: collecting crop disease images covering different disease types to form a crop disease image data set; improving a visual language model to obtain mapped image features and mapped text features; fine-tuning a projection layer to obtain a trained model; and inputting the crop disease image to be identified into the trained model and outputting the corresponding crop disease identification result. By introducing a semantic cone consistency constraint, the invention structurally models the semantic relations among crop disease categories, so that the model takes the semantic relevance among disease categories into account during identification. This improves the plausibility of the identification results, reduces confusion among the features of different disease categories, and improves the stability and reliability of the model when identifying crop diseases in complex scenes.

Inventors

CHEN HAO
ZHAO JINLING
RUAN CHAO
GUO HENGLI
WANG JINCHENG
WANG YULONG
HUANG LINSHENG

Assignees

Anhui University (安徽大学)

Dates

Publication Date: 2026-05-12
Application Date: 2026-01-29

Claims (6)

1. A semantic cone-based crop disease identification method, characterized by comprising the following steps in sequence: (1) collecting crop disease images covering different crop disease types, together with their corresponding disease types, and supplementing the crop disease images with text descriptions, the crop disease images and the text descriptions forming a crop disease image data set; (2) improving a visual language model to obtain an improved visual language model, wherein the visual language model comprises an image encoder and a text encoder, and a projection layer is added after the image encoder and the text encoder to obtain the improved visual language model; a crop disease image is input into the image encoder to extract the corresponding image features while a text description is input into the text encoder to extract the corresponding text features, and the image features and the text features are respectively input into the projection layer for mapping to obtain mapped image features and mapped text features; (3) fine-tuning the projection layer while introducing a semantic cone consistency constraint that constrains the hierarchical semantic relations between the mapped image features and the mapped text features, so that features of the same crop disease category maintain a consistent semantic direction in semantic space, thereby obtaining a trained model; (4) inputting the crop disease image to be identified into the trained model and outputting the corresponding crop disease identification result.
2. The semantic cone-based crop disease identification method of claim 1, wherein in step (1) the crop disease image data set comprises the PlantVillage, PlantaeK, PlantLeaf, PlantDoc, and PlantWild data sets.
3. The semantic cone-based crop disease identification method of claim 1, wherein step (2) specifically comprises: the visual language model is a pre-trained CLIP model, which is used to extract image features $f_I$ and text features $f_T$: $f_I = E_I(I)$, $f_T = E_T(T)$; wherein $I$ denotes a crop disease image, $T$ denotes a text description, $E_I$ is the image encoder, and $E_T$ is the text encoder; the image features $f_I$ and the text features $f_T$ are respectively input into the projection layer so that the mapped image features $v$ and the mapped text features $t$ lie in a unified feature representation space, the parameters of the image encoder and the text encoder remaining frozen during the mapping process; the mapping formula is: [formula omitted in the source text]; wherein $\|v\|$ denotes the modulus of the mapped image feature and $v$ denotes the mapped image feature vector; $\|t\|$ denotes the modulus of the mapped text feature and $t$ denotes the mapped text feature vector.
4. The semantic cone-based crop disease identification method of claim 1, wherein step (3) specifically comprises: constructing a total loss function $\mathcal{L}_{total}$ for fine-tuning as follows: $\mathcal{L}_{total} = \mathcal{L}_{con} + \lambda \mathcal{L}_{cone}$; wherein $\mathcal{L}_{con}$ denotes the contrastive loss that constrains image features to text features, $\mathcal{L}_{cone}$ is the semantic cone loss that constrains the hierarchical semantic structure of the disease categories, and $\lambda$ is the weight coefficient of the semantic cone loss; for the $B$ image-text pairs in a batch of samples, the image features and the text features are denoted $\{v_i\}_{i=1}^{B}$ and $\{t_i\}_{i=1}^{B}$ respectively, and the degree of matching between image and text is calculated using cosine similarity: $p^{I2T}_{i \to j} = \frac{\exp(\mathrm{sim}(v_i, t_j)/\tau)}{\sum_{k=1}^{B} \exp(\mathrm{sim}(v_i, t_k)/\tau)}$; wherein $p^{I2T}_{i \to j}$ denotes the probability that image $i$ matches text $j$, $\exp(\cdot)$ denotes the exponential function, $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity, $i$, $j$, and $k$ each denote an index between 1 and $B$, $t_k$ denotes the $k$-th text feature, $\tau$ is a temperature parameter for adjusting the confidence distribution of the prediction, and $t_j$ denotes the $j$-th text feature; the corresponding image-to-text contrastive loss $\mathcal{L}_{I2T}$ is: $\mathcal{L}_{I2T} = -\frac{1}{B}\sum_{i=1}^{B} \log p^{I2T}_{i \to i}$; correspondingly, the degree of matching between text and image is calculated as: $p^{T2I}_{i \to j} = \frac{\exp(\mathrm{sim}(t_i, v_j)/\tau)}{\sum_{k=1}^{B} \exp(\mathrm{sim}(t_i, v_k)/\tau)}$; wherein $p^{T2I}_{i \to j}$ denotes the probability that text $i$ matches image $j$, and $v_k$ denotes the $k$-th image feature; the text-to-image contrastive loss $\mathcal{L}_{T2I}$ is obtained as: $\mathcal{L}_{T2I} = -\frac{1}{B}\sum_{i=1}^{B} \log p^{T2I}_{i \to i}$; the contrastive loss $\mathcal{L}_{con}$ is: [formula omitted in the source text]; the mapped image features $v$ and the mapped text features $t$ are mapped into the Lorentz model by exponential mapping to obtain the image feature $v^{\mathbb{L}}$ and the text feature $t^{\mathbb{L}}$ in Lorentz space: [formula omitted in the source text]; taking the text feature $t^{\mathbb{L}}$ in Lorentz space as the semantic center, a semantic cone is constructed for the corresponding crop disease category, and the aperture angle $\varphi$ of the semantic cone is defined as follows: [formula omitted in the source text]; wherein two parameters control the shape of the semantic cone, one controlling the width of the semantic cone and the other determining its depth, and the aperture angle $\varphi$ represents the semantic range, in semantic space, of the disease category corresponding to the text feature; subsequently, the deviation angle $\theta$ between the image feature $v^{\mathbb{L}}$ and the text feature $t^{\mathbb{L}}$ in Lorentz space is calculated based on the Lorentz inner product: [formula omitted in the source text]; wherein $\langle \cdot, \cdot \rangle_{\mathbb{L}}$ denotes the Lorentz inner product; when the deviation angle $\theta$ is greater than the aperture angle $\varphi$, the semantic cone loss $\mathcal{L}_{cone}$ is expressed as follows: [formula omitted in the source text]; wherein $v_i^{\mathbb{L}}$ and $t_i^{\mathbb{L}}$ denote the image feature and the text feature of the $i$-th sample in Lorentz space, respectively; the semantic cone loss $\mathcal{L}_{cone}$ constrains the image features to converge toward the interior of the corresponding text semantic cone; based on the total loss function $\mathcal{L}_{total}$, and with the backbone parameters of the visual language model kept frozen, only the parameters of the projection layer are fine-tuned and optimized to obtain the trained model.
5. An electronic device, comprising: a processor; and a memory having stored therein computer program instructions that, when executed by the processor, cause the processor to perform the semantic cone-based crop disease identification method of any one of claims 1-4.
6. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the semantic cone-based crop disease identification method of any one of claims 1-4.
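The contrastive part of the loss described in claim 4 can be illustrated with a small, dependency-free sketch. It is not the applicant's implementation: the function names, the temperature `tau=0.07`, the weight `lam=0.1`, the batch-mean normalization, and the averaging of the two directions are all assumptions chosen to mirror the usual CLIP-style symmetric loss, since the source text does not preserve the exact formulas.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def match_probs(queries, keys, tau):
    """Row i holds the softmax over j of cosine(queries[i], keys[j]) / tau."""
    rows = []
    for q in queries:
        logits = [cosine(q, k) / tau for k in keys]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows

def contrastive_loss(img, txt, tau=0.07):
    """Symmetric image-text contrastive loss over B matched pairs."""
    B = len(img)
    p_i2t = match_probs(img, txt, tau)  # image i -> text j
    p_t2i = match_probs(txt, img, tau)  # text i -> image j
    loss_i2t = -sum(math.log(p_i2t[i][i]) for i in range(B)) / B
    loss_t2i = -sum(math.log(p_t2i[i][i]) for i in range(B)) / B
    # Assumption: the two directions are averaged (the source formula is lost).
    return 0.5 * (loss_i2t + loss_t2i)

def total_loss(loss_con, loss_cone, lam=0.1):
    """Contrastive term plus the weighted semantic cone term."""
    return loss_con + lam * loss_cone

# Toy batch of two pairs: matched pairs should score a lower loss than swapped ones.
img = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(img, [[0.9, 0.1], [0.1, 0.9]])
swapped = contrastive_loss(img, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < swapped)
```

In a real training loop the features would come from the projection layer on top of the frozen encoders, and only the projection parameters would receive gradients.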

Description

Crop disease identification method and device based on semantic cone

Technical Field

The invention relates to the technical field of crop disease identification, and in particular to a semantic cone-based crop disease identification method and device.

Background

According to statistics, agricultural diseases are responsible for 20% to 40% of global crop yield loss each year. With the global population continuing to grow, increasing crop yield has become an important challenge for humanity. Identifying and preventing crop diseases in a timely and accurate manner is therefore of great practical significance for ensuring food security and improving the sustainability of agriculture. With the continuous development of agricultural informatization, mainstream research on crop disease identification has gradually shifted from traditional image processing methods such as spectrogram analysis to deep learning-based models, in order to identify crop diseases more efficiently. However, these supervised learning methods generally rely on a large number of manually labeled images, which are difficult to acquire in the agricultural field; the category labels of existing data sets cannot express the semantic relationships among categories; and the generalization ability of these methods is markedly weakened by the new disease categories, distribution shifts, and diverse shooting conditions that are common in real agricultural environments.
In recent years, visual language pre-training models such as CLIP (Contrastive Language-Image Pre-training) have used contrastive learning on images and texts to align visual features with natural language semantics, which has markedly improved zero-shot and cross-domain recognition in general vision tasks and offers a new approach to agricultural disease recognition. Meanwhile, prompt learning methods further improve the performance of CLIP on specific agricultural tasks by automatically optimizing learnable text prompts, and existing research shows that they have good application potential in agricultural scenes. However, prompt optimization based on gradient descent is prone to overfitting. In addition, crop diseases have a natural biological hierarchy, but current mainstream prompt learning methods generally treat all diseases as concepts at a single level and can hardly represent or exploit such fine-grained hierarchical relationships. Therefore, there is a need for a method that can distinguish category structures and semantic boundaries, so as to enhance the separability of text representations and the semantic consistency of images.

Disclosure of Invention

The invention aims to solve the problems of complex semantic structure and insufficient category discrimination of crop diseases in agricultural scenes. The primary aim of the invention is to provide a semantic cone-based crop disease identification method which, without changing the main structure of the visual language model, expresses the hierarchical semantics of diseases by introducing a Lorentz space constraint in the feature mapping stage, has a small computational cost, and improves the stability and reliability of the model when identifying crop diseases in complex scenes.
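To make the Lorentz space constraint more concrete, the following sketch lifts Euclidean features onto a hyperboloid and penalizes image features that fall outside the cone anchored at the text feature. The document's own aperture and deviation-angle formulas are not recoverable from this text, so the sketch borrows the hyperbolic entailment cone formulation used in MERU-style models as a stand-in. The curvature `c`, the constant `K`, and every function name here are assumptions for illustration only.

```python
import math

def exp_map_origin(v, c=1.0):
    """Lift Euclidean vector v onto the hyperboloid of curvature -c.

    Returns (space, time) such that the Lorentz inner product <x, x> = -1/c.
    """
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [0.0] * len(v), 1.0 / math.sqrt(c)
    scale = math.sinh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    space = [scale * a for a in v]
    time = math.sqrt(1.0 / c + sum(s * s for s in space))
    return space, time

def lorentz_inner(x, y):
    """Lorentz inner product: <x_space, y_space> - x_time * y_time."""
    (xs, xt), (ys, yt) = x, y
    return sum(a * b for a, b in zip(xs, ys)) - xt * yt

def half_aperture(x, c=1.0, K=0.1):
    """Half-aperture of the cone at x; narrower for points far from the origin."""
    xs_norm = math.sqrt(sum(a * a for a in x[0]))
    ratio = 2.0 * K / (math.sqrt(c) * max(xs_norm, 1e-12))
    return math.asin(min(1.0, ratio))

def exterior_angle(x, y, c=1.0):
    """Angle at x between the cone axis and the geodesic from x to y."""
    c_xy = c * lorentz_inner(x, y)
    xs_norm = math.sqrt(sum(a * a for a in x[0]))
    denom = xs_norm * math.sqrt(max(c_xy * c_xy - 1.0, 1e-12))
    cos_val = (y[1] + x[1] * c_xy) / max(denom, 1e-12)
    return math.acos(max(-1.0, min(1.0, cos_val)))

def cone_loss(text_feats, image_feats, c=1.0, K=0.1):
    """Mean hinge penalty for images lying outside their text's cone."""
    total = 0.0
    for t, v in zip(text_feats, image_feats):
        x, y = exp_map_origin(t, c), exp_map_origin(v, c)
        total += max(0.0, exterior_angle(x, y, c) - half_aperture(x, c, K))
    return total / len(text_feats)

# An image in the same direction as its text but farther out lies inside the cone;
# an image pointing the opposite way is penalized.
print(cone_loss([[0.5, 0.0]], [[2.0, 0.0]]))
print(cone_loss([[0.5, 0.0]], [[-2.0, 0.0]]) > 0.0)
```

The hinge form means the loss is exactly zero whenever the image already sits inside the cone, so the constraint only acts on violating pairs and leaves well-aligned features untouched.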
To achieve this purpose, the invention adopts the following technical scheme: a semantic cone-based crop disease identification method, comprising the following steps in sequence:
(1) collecting crop disease images covering different crop disease types, together with their corresponding disease types, and supplementing the crop disease images with text descriptions, the crop disease images and the text descriptions forming a crop disease image data set;
(2) improving a visual language model to obtain an improved visual language model, wherein the visual language model comprises an image encoder and a text encoder, and a projection layer is added after the image encoder and the text encoder to obtain the improved visual language model; a crop disease image is input into the image encoder to extract the corresponding image features while a text description is input into the text encoder to extract the corresponding text features, and the image features and the text features are respectively input into the projection layer for mapping to obtain mapped image features and mapped text features;
(3) fine-tuning the projection layer while introducing a semantic cone consistency constraint that constrains the hierarchical semantic relations between the mapped image features and the mapped text features, so that features of the same crop disease category maintain a consistent semantic direction in semantic space, thereby obtaining a trained model;
(4) inputting the crop disease i