Article
Volume: 35 | Article ID: IMAGE-269
Self-supervised visual representation learning on food images
  DOI: 10.2352/EI.2023.35.7.IMAGE-269  Published Online: January 2023
Abstract

Food image classification is the groundwork for image-based dietary assessment, which is the process of monitoring what kinds of food are eaten and how much energy is consumed, using captured images of food or eating scenes. Existing deep learning-based methods learn the visual representation for food classification from human annotations of each food image. However, most food images captured in real life come without labels, so human annotation is required before deep learning-based methods can be trained on them. This approach is not feasible for real-world deployment because of its high cost. To make use of the vast amount of unlabeled images, many existing works focus on unsupervised or self-supervised learning, which learns the visual representation directly from unlabeled data. However, none of these existing works focuses on food images, which are more challenging than images of general objects because of their high inter-class similarity and intra-class variance. In this paper, we focus on two items: the comparison of existing models and the development of an effective self-supervised learning model for food image classification. Specifically, we first compare the performance of existing state-of-the-art self-supervised learning models, including SimSiam, SimCLR, SwAV, BYOL, MoCo, and Rotation Pretext Task, on food images. The experiments are conducted on the Food-101 dataset, which contains 101 different classes of food with 1,000 images in each class. Next, we analyze the unique features of each model and compare their performance on food images to identify the key factors in each model that can help improve accuracy. Finally, we propose a new model for unsupervised visual representation learning on food images for the classification task.
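
As an illustration of how a self-supervised method can obtain training labels from unlabeled images, the following is a minimal sketch of the rotation pretext task named in the abstract: each image is rotated by 0, 90, 180, and 270 degrees, and the rotation index serves as the label a network would be trained to predict. This is a generic sketch, not the authors' implementation. The function name `make_rotation_batch`, the NumPy array layout `(N, H, W, C)`, and the restriction to square images are assumptions made here for illustration.

```python
import numpy as np


def make_rotation_batch(images):
    """Build a rotation-pretext batch from unlabeled images.

    Hypothetical helper for illustration only.
    images: array of shape (N, H, W, C) with H == W (square images assumed).
    Returns (rotated, labels), where rotated has shape (4N, H, W, C) and
    labels[i] in {0, 1, 2, 3} is the number of 90-degree turns applied.
    """
    images = np.asarray(images)
    if images.ndim != 4 or images.shape[1] != images.shape[2]:
        raise ValueError("expected square images with shape (N, H, W, C)")

    rotated, labels = [], []
    for k in range(4):
        # Rotate every image in the batch by k * 90 degrees in the H-W plane.
        rotated.append(np.rot90(images, k=k, axes=(1, 2)))
        # The pseudo-label is the rotation index; no human annotation is used.
        labels.append(np.full(len(images), k, dtype=np.int64))
    return np.concatenate(rotated), np.concatenate(labels)


# Usage: two dummy 4x4 RGB "images" yield eight rotated samples.
dummy = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3)
x, y = make_rotation_batch(dummy)
```

A classifier trained to predict `y` from `x` would then supply the learned representation; the choice of network and training procedure is left open here because the abstract does not specify it.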

  Cite this article 

Andrew W. Peng, Jiangpeng He, Fengqing Zhu, "Self-supervised visual representation learning on food images," in Electronic Imaging, 2023, pp. 269-1 - 269-6, https://doi.org/10.2352/EI.2023.35.7.IMAGE-269

  Copyright statement 
Copyright © 2023, Society for Imaging Science and Technology
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA