Proceedings
Volume: 36 | Article ID: IMAGE-238
Adapt to Distill or Distill to Adapt
DOI: 10.2352/EI.2024.36.8.IMAGE-238 | Published Online: January 2024
Abstract

Domain Adaptation (DA) techniques aim to overcome the domain shift between a source domain used for training and a target domain used for testing. In recent years, vision transformers have emerged as a preferred alternative to Convolutional Neural Networks (CNNs) for various computer vision tasks. When used as backbones for DA, these attention-based architectures have proven more powerful than standard ResNet backbones. However, vision transformers incur greater computational overhead due to their larger model size. In this paper, we demonstrate the superiority of attention-based architectures for domain generalization and source-free unsupervised domain adaptation. We further improve the performance of ResNet-based unsupervised DA models using knowledge distillation from a larger teacher model to the student ResNet model. We explore the efficacy of two frameworks and answer the question: is it better to distill and then adapt, or to adapt and then distill? Our experiments on two popular datasets show that adapt-to-distill is the preferred approach.
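The teacher-to-student distillation step the abstract refers to is commonly implemented with a softened-logit loss. The sketch below is a minimal illustration of that generic formulation (in the style of Hinton et al.'s KD loss), assuming PyTorch; the temperature, the weighting `alpha`, and the function name are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with softened teacher guidance."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy on the available labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    # Toy check with random logits for a 4-sample, 10-class batch.
    student = torch.randn(4, 10)
    teacher = torch.randn(4, 10)
    labels = torch.randint(0, 10, (4,))
    print(distillation_loss(student, teacher, labels).item())
```

In the adapt-to-distill ordering the abstract favors, the larger teacher (e.g., a vision transformer) would first be adapted to the target domain and then frozen, with only the ResNet student updated by this loss; the reverse ordering would distill on the source domain first and adapt the student afterward.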

  Cite this article 

Georgi Thomas, Andreas Savakis, "Adapt to Distill or Distill to Adapt," in Electronic Imaging, 2024, pp. 238-1 - 238-6, https://doi.org/10.2352/EI.2024.36.8.IMAGE-238

  Copyright statement 
Copyright © 2024, Society for Imaging Science and Technology
Electronic Imaging
ISSN: 2470-1173
Society for Imaging Science and Technology
IS&T, 7003 Kilworth Lane, Springfield, VA 22151, USA