CN-122029575-A - Cascading method for few-sample semantic segmentation


Abstract

A computer-implemented method for semantic segmentation includes building a co-occurrence table that includes co-occurrences of predictions of a pre-trained model for a base class and labels for a novel class, drawn from the pre-trained model for the base class and from training data with the novel class. A classifier associated with the base class is trained to classify an input, according to the co-occurrence table, as either the base class or a novel class that co-occurs with the base class. Predictions from the pre-trained model and the trained classifier are fused to obtain a final prediction result as a fully annotated image.
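The table-building step can be illustrated with a minimal sketch. It assumes the co-occurrence table is a simple count of (base-class prediction, novel-class label) pairs over aligned data points such as pixels. The function name `build_cooccurrence_table`, the class names, and the use of `None` for data points without a novel label are hypothetical choices made for this example, not details taken from the patent.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def build_cooccurrence_table(
    base_predictions: Iterable[Hashable],
    novel_labels: Iterable[Optional[Hashable]],
) -> Counter:
    """Count how often each base-class prediction coincides with a novel label.

    Assumption: both inputs are flattened and aligned, one entry per data
    point, and None marks a data point that carries no novel-class label.
    """
    table: Counter = Counter()
    for base_cls, novel_cls in zip(base_predictions, novel_labels):
        if novel_cls is None:
            continue  # no novel label here, so nothing co-occurs
        table[(base_cls, novel_cls)] += 1
    return table


# Toy data: five flattened pixels (illustrative values only).
base_pred = ["road", "road", "wall", "road", "wall"]
novel_gt = ["pothole", None, "crack", "pothole", None]
table = build_cooccurrence_table(base_pred, novel_gt)
print(table[("road", "pothole")])  # 2
print(table[("wall", "crack")])    # 1
```

In this toy run the novel class "pothole" only ever appears where the pre-trained model predicted "road", which is the kind of pairing a classifier associated with the base class "road" would then be trained on.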

Inventors

Sakai Zhizai
YAGI TAKAYUKI
QIU HAOXIANG
RUDY RAYMOND HARRY
INOUE TADANOBU

Assignees

International Business Machines Corporation

Dates

Publication Date: 20260512
Application Date: 20240815
Priority Date: 20231017

Claims (20)

1. A computer-implemented method for semantic segmentation, the method comprising: constructing a co-occurrence table comprising predictions of a pre-trained model for a base class and co-occurrences of labels for a novel class, the labels from the pre-trained model for the base class and from training data with the novel class; training one or more classifiers associated with the base class, the one or more classifiers classifying an input according to the co-occurrence table as one of the base class and a novel class having co-occurrence with the base class; and fusing predictions from the pre-trained model and the one or more classifiers to obtain a final prediction result as fully labeled data.
2. The computer-implemented method of claim 1, wherein fusing predictions includes predicting base classes and novel classes simultaneously.
3. The computer-implemented method of claim 1 or claim 2, wherein building the co-occurrence table includes counting a number of data points having predictions for a pre-trained model of a base class and labels for a novel class.
4. The computer-implemented method of any preceding claim, wherein training one or more classifiers comprises selecting a classifier to train using multiple top co-occurrence pairs for each novel class.
5. The computer-implemented method of any preceding claim, wherein training one or more classifiers comprises training a classifier if a value of co-occurrence in a co-occurrence table exceeds a threshold.
6. The computer-implemented method of any preceding claim, wherein building the co-occurrence table comprises considering co-occurrence if a softmax probability of a pre-trained model for a base class exceeds a threshold.
7. The computer-implemented method of any preceding claim, wherein training one or more classifiers comprises: decomposing training data into subsets; and employing at least one ensemble method to train a classifier using the subsets.
8. A computer program product for semantic segmentation, the computer program product comprising a computer-readable storage medium embodying program instructions executable by a computer to cause the computer to: construct a co-occurrence table comprising predictions of a pre-trained model for a base class and co-occurrences of labels for a novel class, the labels from the pre-trained model for the base class and from training data with the novel class; train one or more classifiers associated with the base class, the one or more classifiers classifying an input according to the co-occurrence table as one of the base class and a novel class having co-occurrence with the base class; and fuse predictions from the pre-trained model and the one or more classifiers to obtain a final prediction result as fully labeled data.
9. The computer program product of claim 8, wherein the final prediction result comprises a base class and a novel class predicted simultaneously.
10. The computer program product of claim 8 or claim 9, wherein the co-occurrence table comprises a plurality of data points with predictions for a pre-trained model of a base class and labels for a novel class.
11. The computer program product of any of claims 8 to 10, wherein one or more classifiers are trained by selecting a classifier to train using multiple top co-occurrence pairs for each novel class.
12. The computer program product of any of claims 8 to 11, wherein one or more classifiers are trained by training the classifier if a value of co-occurrence in a co-occurrence table exceeds a threshold.
13. The computer program product of any of claims 8 to 12, wherein the co-occurrence table is constructed by including co-occurrences of pixels if a softmax probability of a pre-trained model of a base class exceeds a threshold.
14. The computer program product of any of claims 8 to 13, wherein the instructions cause the computer to train the one or more classifiers by: decomposing training data into subsets; and employing at least one ensemble method to train a classifier using the subsets.
15. A system for semantic segmentation, comprising: a hardware processor; and a memory coupled to the hardware processor, the memory storing: a co-occurrence table comprising predictions of a pre-trained model for a base class and co-occurrences of labels for a novel class, the labels from the pre-trained model for the base class and from training data with the novel class; and one or more classifiers associated with the base class for classifying an input as one of the base class and a novel class having co-occurrence with the base class according to the co-occurrence table; wherein the hardware processor generates a final prediction result by fusing pixel labels from the pre-trained model with new prediction results from the one or more classifiers to achieve fully labeled data.
16. The system of claim 15, wherein the co-occurrence table includes a count of data points having predictions for a pre-trained model of a base class and labels for a novel class.
17. The system of claim 15 or claim 16, wherein the one or more classifiers are selected for training based on a number of top co-occurrence pairs for each novel class.
18. The system of any of claims 15 to 17, wherein the one or more classifiers are selected for training based on whether a value of co-occurrence in a co-occurrence table exceeds a threshold.
19. The system of any of claims 15 to 18, wherein the co-occurrence table includes co-occurrences if a softmax probability of a pre-trained model of a base class exceeds a threshold.
20. The system of any one of claims 15 to 19, comprising an ensemble method for training a classifier.
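The selection criteria recited in the dependent claims (a softmax-probability threshold when counting, top co-occurrence pairs per novel class, and a minimum co-occurrence value) can be sketched together as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims present these criteria as separate options, and combining them in one pipeline, along with the names `count_confident_cooccurrences`, `select_pairs`, `prob_threshold`, `top_k`, and `min_count`, is a choice made only for this example.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Optional, Tuple

# One data point: (base class predicted by the pre-trained model,
#                  softmax probability of that prediction,
#                  novel-class label, or None if the point has none).
Point = Tuple[str, float, Optional[str]]


def count_confident_cooccurrences(points: Iterable[Point],
                                  prob_threshold: float) -> Counter:
    """Count (base, novel) pairs only where the base prediction is confident."""
    table: Counter = Counter()
    for base_cls, prob, novel_cls in points:
        if novel_cls is not None and prob > prob_threshold:
            table[(base_cls, novel_cls)] += 1
    return table


def select_pairs(table: Counter, top_k: int,
                 min_count: int) -> List[Tuple[str, str]]:
    """Pick, per novel class, the top_k base classes whose count exceeds min_count."""
    by_novel = defaultdict(list)
    for (base_cls, novel_cls), count in table.items():
        by_novel[novel_cls].append((count, base_cls))
    selected = []
    for novel_cls, entries in by_novel.items():
        entries.sort(reverse=True)  # highest co-occurrence count first
        for count, base_cls in entries[:top_k]:
            if count > min_count:
                selected.append((base_cls, novel_cls))
    return sorted(selected)


# Toy data (illustrative values only).
points = [
    ("road", 0.90, "pothole"), ("road", 0.80, "pothole"),
    ("road", 0.40, "pothole"),  # dropped: low softmax probability
    ("wall", 0.95, "pothole"),
    ("wall", 0.90, "crack"), ("wall", 0.85, "crack"),
    ("road", 0.70, None),       # dropped: no novel label
]
table = count_confident_cooccurrences(points, prob_threshold=0.5)
pairs = select_pairs(table, top_k=1, min_count=1)
print(pairs)  # [('road', 'pothole'), ('wall', 'crack')]
```

Each selected pair names a base class whose classifier would be trained to separate that base class from the co-occurring novel class. The rarer pairing ("wall", "pothole") is filtered out, so no classifier is trained for it.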

Description

Cascading method for few-sample semantic segmentation

Technical Field

The present invention relates generally to image classification using artificial intelligence, and more particularly to a system and method for classifying or segmenting objects in an image into base categories and novel categories simultaneously.

Background

Deep learning models require training on a large number of examples of each class in order to identify samples of that class. In many real-world settings, a large number of annotated examples is not available for every class. In these cases, a standard deep learning model performs poorly on classes that have only a few training examples.

Visual inspection plays a vital role in many industries. Semantic segmentation is useful, for example, in manufacturing processes, autonomous driving, and other machine learning tasks. Semantic segmentation associates a label or category with each pixel in an image. This is used to identify sets of pixels that form objects, which can then be assigned to different categories. Standard learning methods require a large amount of annotated data to train the model. The burden grows because the data must be annotated accurately to obtain the best results.

Thus, there is a need for semantic segmentation that allows novel features in images to be identified from a small number of training samples. There is a further need for few-shot semantic segmentation that classifies novel features more accurately while using fewer computer resources. Furthermore, there is a need for few-shot semantic segmentation that classifies both novel features and base features.
Disclosure of Invention

According to an embodiment of the invention, a computer-implemented method for semantic segmentation includes building a co-occurrence table that includes predictions of a pre-trained model for a base class and co-occurrences of labels for a novel class from the pre-trained model for the base class and from training data with the novel class. A classifier associated with the base class is trained to classify an input, according to the co-occurrence table, as one of the base class and a novel class that has co-occurrence with the base class. Predictions from the pre-trained model and the trained classifier are fused to obtain a final prediction result as fully labeled data.

The fully labeled data may include an image, and final prediction results may be generated for both the novel class and the base class simultaneously. This reduces computer processing time. By including additional trained classifiers, novel features can be more accurately identified with less labeled training data.

According to an embodiment of the present invention, a computer program product for semantic segmentation comprises a computer-readable storage medium embodying program instructions. The program instructions are executable by a computer to cause the computer to construct a co-occurrence table comprising predictions of a pre-trained model for a base class and co-occurrences of labels for a novel class from the pre-trained model for the base class and from training data with the novel class; train one or more classifiers associated with the base class and classifying an input as one of the base class and a novel class having co-occurrence with the base class according to the co-occurrence table; and fuse the predictions from the pre-trained model and the one or more classifiers to obtain a final prediction result as a fully labeled image.
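The fusion step can be sketched as a cascade: the pre-trained model's base-class prediction picks which trained classifier, if any, refines each data point, and the refined output becomes the final label. The sketch below is illustrative only. The function `fuse_predictions`, the per-pixel feature vectors, and the toy rule inside `road_classifier` are hypothetical stand-ins, and the patent text in this section does not specify how the classifiers compute their outputs.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def fuse_predictions(base_predictions: Sequence[str],
                     features: Sequence[Features],
                     classifiers: Dict[str, Classifier]) -> List[str]:
    """Cascade: refine each base-class prediction with its classifier, if any.

    A classifier keyed by base class B returns either B itself or a novel
    class that co-occurs with B. Base classes without a classifier keep the
    pre-trained model's label, so every data point ends up labeled.
    """
    fused = []
    for base_cls, feat in zip(base_predictions, features):
        clf = classifiers.get(base_cls)
        fused.append(clf(feat) if clf is not None else base_cls)
    return fused


def road_classifier(feat: Features) -> str:
    """Hypothetical stand-in for a trained classifier tied to base class 'road'."""
    return "pothole" if feat[0] > 0.5 else "road"


# Toy data: three pixels with one-dimensional features (illustrative only).
base_pred = ["road", "road", "sky"]
feats = [[0.9], [0.1], [0.7]]
fused = fuse_predictions(base_pred, feats, {"road": road_classifier})
print(fused)  # ['pothole', 'road', 'sky']
```

Because each data point passes through at most one additional classifier, base and novel classes are labeled in a single pass over the pre-trained model's output.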
Fully annotated images are generated for the novel class and the base class simultaneously, which reduces computer processing time. By including additional trained classifiers, novel features can be more accurately identified with less labeled training data.

According to an embodiment of the present invention, a system for semantic segmentation includes a hardware processor and a memory coupled to the processor. The memory stores a co-occurrence table that includes predictions of the pre-trained model for the base class and co-occurrences of labels for the novel class from the pre-trained model for the base class and from training data with the novel class. One or more classifiers are associated with the base class and classify an input as one of the base class and a novel class having co-occurrence with the base class according to the co-occurrence table. The hardware processor generates final prediction results by fusing pixel labels from the pre-trained model with new prediction results from the one or more classifiers to achieve a fully annotated image. Fully annotated images are generated for the novel class and the base class simultaneously. This reduces computer processing time. By including additional trained classifiers, novel features can be more accurately identified with less labeled training d