Domain Adaptation (DA) techniques aim to overcome the domain shift between a source domain used for training and a target domain used for testing. In recent years, vision transformers have emerged as a preferred alternative to Convolutional Neural Networks (CNNs) for various computer vision tasks. When used as backbones for DA, these attention-based architectures have proven more powerful than standard ResNet backbones. However, vision transformers incur a larger computational overhead owing to their model size. In this paper, we demonstrate the superiority of attention-based architectures for domain generalization and source-free unsupervised domain adaptation. We further improve the performance of ResNet-based unsupervised DA models using knowledge distillation from a larger teacher model to a student ResNet model. We explore the efficacy of two frameworks and answer the question: is it better to distill and then adapt, or to adapt and then distill? Our experiments on two popular datasets show that adapt-to-distill is the preferred approach.
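The teacher-to-student transfer mentioned above is typically trained with the standard Hinton-style distillation objective: a temperature-softened KL term that matches the student's output distribution to the teacher's, plus a cross-entropy term on the ground-truth label. A minimal sketch (the function names and the weighting scheme are illustrative, not the paper's exact formulation):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.5):
    """Hinton-style KD loss: alpha * T^2 * KL(teacher || student) on the
    temperature-softened distributions, plus (1 - alpha) * cross-entropy
    on the true label. T and alpha are hyperparameters."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))
    ce = -math.log(softmax(student_logits)[label])
    return alpha * T * T * kl + (1 - alpha) * ce
```

The `T * T` factor keeps the soft-target gradients on the same scale as the hard-target gradients as the temperature changes; in the adapt-then-distill setting, the teacher would be the already-adapted transformer and the student the ResNet.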
Reference-based image quality assessment techniques use information from an undistorted reference image of the same scene to estimate the quality of a distorted target image. The main challenge in designing quality assessment algorithms is incorporating the behavior of the human visual system. The advent of deep learning (DL) techniques has garnered significant interest among researchers in the field of image quality assessment. A common limitation of applying deep learning to image quality assessment is its dependence on a large amount of subjective training data. Recent advances in patch-based self-supervised vision transformers have achieved remarkable results on tasks such as object segmentation, copy detection, and other downstream computer vision tasks. In this paper, we study how the distance between pretrained self-supervised vision transformer features extracted from pristine and distorted images relates to human visual perception. Experiments carried out on three publicly available image quality databases (namely KADID-10K, TID2013, and MDID2016) have yielded promising results that can be further exploited to design perceptual reference-based image quality assessment methods.
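The core measurement described above reduces to a distance between two feature vectors: one extracted from the pristine image and one from its distorted counterpart, both by the same frozen self-supervised ViT (e.g., the CLS token of a DINO-pretrained model). A minimal sketch using cosine distance as the comparison (the choice of distance and the function name are illustrative assumptions, not necessarily the paper's exact metric):

```python
import math

def cosine_distance(f_ref, f_dist):
    """1 - cosine similarity between the reference-image and
    distorted-image feature vectors; a larger distance is expected
    to indicate lower perceptual quality."""
    dot = sum(a * b for a, b in zip(f_ref, f_dist))
    n_ref = math.sqrt(sum(a * a for a in f_ref))
    n_dist = math.sqrt(sum(b * b for b in f_dist))
    return 1.0 - dot / (n_ref * n_dist + 1e-12)  # epsilon guards against zero norm

# In practice f_ref / f_dist would come from a frozen pretrained ViT;
# short lists stand in for those feature vectors here.
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Correlating such distances against the subjective scores in KADID-10K, TID2013, and MDID2016 is then what links the feature space to the human visual system.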