Proceedings Paper
Volume: 38 | Article ID: CVAA-172
Leveraging Vision–language Models for Semantic Interpretation of Historical Paintings: A Case Study on Religiousness
DOI: 10.2352/EI.2026.38.14.CVAA-172 | Published Online: March 2026
Abstract

Historical paintings reflect the social, cultural, and religious contexts of their time. With the emergence of vision–language models (VLMs), it has become possible to generate textual interpretations from images; however, it remains unclear what information these models rely on and how their outputs should be evaluated. This study examines the characteristics and validity of VLM-generated interpretations through two experiments. First, an art style classification task using the Pandora dataset shows that VLMs tend to group paintings into historically related styles, although strict distinctions are not always achieved. Second, focusing on religious paintings, we evaluate the agreement between generated interpretations and museum descriptions using BERTScore under three conditions: image only, image with metadata, and metadata only. Results indicate that metadata improves scores, while visual input has limited impact. Moreover, evaluation outcomes depend strongly on the content of reference texts. These findings suggest that VLM-based interpretation relies more on linguistic context than visual information and highlight limitations of using museum descriptions as evaluation references.

  Cite this article 

Yuya Kanazawa, Midori Tanaka, Hiroshi Kera, Takahiko Horiuchi, "Leveraging Vision–language Models for Semantic Interpretation of Historical Paintings: A Case Study on Religiousness" in Electronic Imaging, 2026, pp 172-1 - 172-7, https://doi.org/10.2352/EI.2026.38.14.CVAA-172

  Copyright statement 
Copyright © 2026 Society for Imaging Science and Technology
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA