Few-shot Food Recognition with Pre-trained Model

Yanqi Wu, Xue Song, Jingjing Chen
DOI: https://doi.org/10.1145/3552485.3554939
2022-01-01
Abstract: Food recognition is a challenging task due to the diversity of food. Moreover, conventional training of food recognition networks demands large amounts of labeled images, which are laborious and expensive to collect. In this work, we tackle the few-shot food recognition problem by leveraging the knowledge learned by pre-trained models, e.g., CLIP. Although CLIP has shown remarkable zero-shot capability on a wide range of vision tasks, it performs poorly on the domain-specific food recognition task. To transfer CLIP's rich prior knowledge, we explore an adapter-based approach that fine-tunes CLIP with only a few samples, effectively combining CLIP's prior knowledge with the new knowledge extracted from the few-shot training set to achieve good performance. We also design appropriate prompts to facilitate more accurate identification of foods from different cuisines. Experiments demonstrate that our approach achieves promising performance on two public food datasets, VIREO Food-172 and UECFood-256.
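The abstract does not give the adapter's architecture, so the following is only a minimal sketch of one common way such an approach can be set up: a small bottleneck adapter on top of frozen image features, blended residually with the original features and scored against text features by cosine similarity. The function names, the blend weight `alpha`, the ReLU bottleneck, and the cuisine-aware prompt template are all illustrative assumptions, not the authors' method. Plain Python lists stand in for CLIP embeddings so the example runs without any model.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def adapter(features, w_down, w_up):
    """Hypothetical bottleneck adapter: two linear layers, each followed by ReLU."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features))) for row in w_down]
    return [max(0.0, sum(w * h for w, h in zip(row, hidden))) for row in w_up]

def blend(features, adapted, alpha):
    """Mix new few-shot knowledge (adapted) with the pre-trained prior (features)."""
    return [alpha * a + (1 - alpha) * f for a, f in zip(adapted, features)]

def classify(image_feat, text_feats, w_down, w_up, alpha=0.2):
    """Return (best class index, cosine scores) for one image feature vector."""
    mixed = l2_normalize(blend(image_feat, adapter(image_feat, w_down, w_up), alpha))
    scores = [sum(m * t for m, t in zip(mixed, l2_normalize(tf))) for tf in text_feats]
    return scores.index(max(scores)), scores

def build_prompt(dish, cuisine):
    """Hypothetical cuisine-aware prompt template (wording is an assumption)."""
    return f"a photo of {dish}, a type of {cuisine} food."

# Toy usage: 3-dimensional stand-ins for image and text embeddings.
w_down = [[0.5, 0.1, 0.0], [0.0, 0.3, 0.2]]      # 3 -> 2 bottleneck
w_up = [[0.4, 0.0], [0.0, 0.6], [0.1, 0.1]]      # 2 -> 3 expansion
image_feat = [1.0, 0.0, 0.0]
text_feats = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]  # one vector per class prompt
best, scores = classify(image_feat, text_feats, w_down, w_up, alpha=0.2)
print(best, build_prompt("mapo tofu", "Chinese"))
```

In this sketch, setting `alpha` to 0 reduces the classifier to plain zero-shot matching on the frozen features, while larger values give more weight to what the adapter learns from the few labeled samples; in practice only the adapter weights would be trained.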