Tags: Contrastive Learning, Data Augmentation, Few-shot Classification
Abstract:
Few-shot classification categorizes objects from minimal training data, making it valuable when large datasets are impractical. Models are trained on a base set with many samples per class and tested on a novel set, where they classify new samples using only a few examples per class. Since base and novel classes are distinct, models must generalize to unseen classes while training on the base set, making meta-learning more effective than traditional classification methods. State-of-the-art techniques improve generalization by pretraining on large datasets and then applying meta-learning to further enhance performance. However, we argue that although meta-learning is effective for few-shot tasks, models often overfit to the base classes, reducing performance on novel classes, even with pretraining. To address this issue, we propose two techniques in the meta-learning phase to reduce overfitting and improve generalization. First, we mask important parts of each sample to prevent the model from over-relying on specific features; masking is guided by attention scores in ViT-like backbones or class activation maps in CNN-based backbones. Second, using the masked samples, we apply a contrastive loss to prototypical network training, reducing the distance between a sample and its class prototype while increasing the distance to prototypes of other classes. The proposed method is applicable regardless of the backbone, whether a pretrained model is used, or whether the approach is inductive or transductive. We conduct experiments on various benchmark datasets and configurations to demonstrate the effectiveness of our method.
Supervised Contrastive Learning with Importance-based CutOut for Few-shot Image Classification
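As a rough illustration of the idea described in the abstract, the sketch below combines importance-based cutout with a prototype-level supervised contrastive loss in PyTorch. It is not the authors' implementation: the function names (`importance_cutout`, `proto_contrastive_loss`), the drop ratio, the temperature, and the toy episode built from random tensors are all illustrative assumptions; in practice the importance maps would come from ViT attention scores or CNN class activation maps, and the backbone would be a real feature extractor.

```python
# Minimal sketch (assumed names and shapes, not the paper's code): a prototypical
# episode where the most "important" patches are cut out and a supervised
# contrastive loss pulls masked queries toward their class prototype.
import torch
import torch.nn.functional as F


def importance_cutout(images, importance, drop_ratio=0.25):
    """Zero out the most important patches of each image.

    images:     (B, C, H, W) batch of images.
    importance: (B, h, w) per-patch importance map, e.g. ViT attention scores
                or CNN class activation maps pooled to a patch grid.
    drop_ratio: fraction of patches to mask per image (assumed hyperparameter).
    """
    B, C, H, W = images.shape
    h, w = importance.shape[1:]
    k = max(1, int(drop_ratio * h * w))            # number of patches to drop

    flat = importance.reshape(B, -1)
    topk = flat.topk(k, dim=1).indices             # indices of most important patches
    keep = torch.ones_like(flat)
    keep.scatter_(1, topk, 0.0)                    # 0 where a patch is cut out
    keep = keep.view(B, 1, h, w)
    keep = F.interpolate(keep, size=(H, W), mode="nearest")  # patch mask -> pixel mask
    return images * keep


def proto_contrastive_loss(query_emb, query_labels, prototypes, temperature=0.1):
    """Pull each query toward its class prototype, push it away from the others."""
    query_emb = F.normalize(query_emb, dim=-1)
    prototypes = F.normalize(prototypes, dim=-1)
    logits = query_emb @ prototypes.t() / temperature  # cosine-similarity logits
    return F.cross_entropy(logits, query_labels)


if __name__ == "__main__":
    # Toy 5-way 1-shot episode with random tensors and a stand-in backbone.
    n_way, n_query, dim = 5, 3, 64
    backbone = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, dim))

    support = torch.randn(n_way, 3, 32, 32)                  # one support image per class
    queries = torch.randn(n_way * n_query, 3, 32, 32)
    query_labels = torch.arange(n_way).repeat_interleave(n_query)

    # Stand-in importance maps; in practice these come from attention or CAMs.
    importance = torch.rand(queries.size(0), 8, 8)
    masked_queries = importance_cutout(queries, importance, drop_ratio=0.25)

    prototypes = backbone(support)                           # (n_way, dim); 1-shot prototypes
    loss = proto_contrastive_loss(backbone(masked_queries), query_labels, prototypes)
    loss.backward()
    print(f"contrastive loss on masked queries: {loss.item():.4f}")
```

Using the prototypes themselves as the positive and negative anchors mirrors the abstract's description of shrinking sample-to-own-prototype distances while enlarging distances to other prototypes; in a full training loop this term would be added to the standard prototypical-network objective during the meta-learning phase.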