This paper proposes a novel frame selection technique based on embedding similarity to optimize video quality assessment (VQA). Using high-dimensional feature embeddings extracted from deep neural networks (ResNet-50, VGG-16, and CLIP), we introduce a similarity-preserving approach that prioritizes perceptually relevant frames while reducing redundancy. The method is evaluated on two datasets, CVD2014 and KoNViD-1k, and performs robustly across synthetic and real-world distortions. Results show that it outperforms state-of-the-art methods, particularly on diverse, in-the-wild video content such as that in KoNViD-1k. This work highlights the importance of embedding-driven frame selection in improving the accuracy and efficiency of VQA methods.
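As a rough illustration of the general idea of similarity-based frame selection, the sketch below greedily keeps a frame only when its embedding is sufficiently dissimilar from every frame already kept. It is not the selection rule used in the paper, which the abstract does not specify: the function names `cosine_similarity` and `select_frames`, the cosine measure, the greedy strategy, and the `threshold` value are all assumptions made for illustration, and the toy two-dimensional vectors stand in for real ResNet-50, VGG-16, or CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_a * norm_b)


def select_frames(embeddings, threshold=0.95):
    """Greedy redundancy reduction (hypothetical rule, not the paper's).

    A frame is kept only if its similarity to every previously kept
    frame is below `threshold`. Returns the indices of kept frames.
    """
    selected = []
    for idx, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in selected):
            selected.append(idx)
    return selected


if __name__ == "__main__":
    # Toy embeddings: frames 0 and 1 are near-duplicates, as are frames 2 and 3.
    toy = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]]
    print(select_frames(toy))  # → [0, 2]
```

In this sketch, a lower `threshold` keeps fewer frames. Only the retained frames would then be passed to a downstream quality model.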
Abderrezzaq Sendjasni, Mohamed-Chaker Larabi, Seif-Eddine Benkabou, "Optimizing Frame Selection for Improved Video Quality Assessment Through Embedding Similarity" in Electronic Imaging, 2025, pp 251-1 - 251-7, https://doi.org/10.2352/EI.2025.37.9.IQSP-251