Proceedings Paper
Volume: 38 | Article ID: GENAI-180
A Quantitative Framework for Evaluating Color-name Understanding in Generative AI Models
DOI: 10.2352/EI.2026.38.12.GENAI-180 | Published Online: March 2026
Abstract

With the proliferation of text-to-image generative AI, understanding the fidelity of model output is critical. While these models can generate visually stunning images, their interpretation of nuanced, subjective concepts like color names remains largely unquantified. This paper introduces a systematic framework to evaluate how accurately leading generative AI models (including Flux, Ideogram, Kandinsky, Gemini and Stable Diffusion) understand and reproduce colors from textual prompts. We prompted these models with both one-word (e.g., “blue”) and two-word (e.g., “sky blue”) color names to generate uniform color fields. The resulting images were analyzed by converting them to the perceptually uniform CIE Lab color space. An adaptive k-means clustering algorithm was employed to extract the dominant color, mitigating issues of non-uniformity in the generated images. By calculating the perceptual color difference using CIEDE2000 (ΔE00) and the chromatic distance (Δab) between the AI-generated colors and standardized ground-truth values, we provide a quantitative benchmark of each model’s color accuracy. Our findings reveal that while all models broadly understand the mapping between color names and hue, significant performance variations exist among models, with systematic differences in lightness and chroma reproduction. Per-model analysis reveals a clear hierarchy in chromatic fidelity: Gemini and Flux demonstrate the strongest anchoring, while Kandinsky exhibits striking hue-dependent anisotropy and Stable Diffusion shows the broadest isotropic dispersion. Per-color analysis identifies systematic undersaturation of short-wavelength and high-chroma colors (blue, indigo, magenta) across all models, while warm colors (red, orange, yellow) are generally better grounded. We highlight that results vary significantly across random seeds for the same prompt and model, and that lexical specificity generally, but not universally, improves chromatic grounding.
This work provides a robust methodology for auditing and improving color fidelity in future generative models.
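
The two difference metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes 8-bit sRGB input, a D65 white point, unit CIEDE2000 weighting factors, and Δab taken as the Euclidean distance in the a*b* plane. The function names, the ground-truth value for "blue", and the sample generated color are hypothetical, and the adaptive k-means step that extracts the dominant color is not shown.

```python
import math

# D65 reference white (assumed; the abstract does not state the white point).
WHITE_D65 = (0.95047, 1.0, 1.08883)

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE Lab under D65."""
    # Undo the sRGB transfer curve to get linear-light values in [0, 1].
    lin = []
    for v in rgb:
        c = v / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    xyz = (
        0.4124564 * r + 0.3575761 * g + 0.1804375 * b,
        0.2126729 * r + 0.7151522 * g + 0.0721750 * b,
        0.0193339 * r + 0.1191920 * g + 0.9503041 * b,
    )
    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116
    fx, fy, fz = (f(c / w) for c, w in zip(xyz, WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_ab(lab1, lab2):
    """Chromatic distance: Euclidean distance in the a*b* plane (lightness ignored)."""
    return math.hypot(lab1[1] - lab2[1], lab1[2] - lab2[2])

def ciede2000(lab1, lab2):
    """CIEDE2000 color difference with kL = kC = kH = 1."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25.0**7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0
    d_l, d_c = L2 - L1, c2p - c1p
    # Hue difference and mean hue, with the wrap-around cases from the standard.
    if c1p * c2p == 0:
        d_h_deg, h_bar = 0.0, h1p + h2p
    else:
        d_h_deg = h2p - h1p
        if d_h_deg > 180:
            d_h_deg -= 360
        elif d_h_deg < -180:
            d_h_deg += 360
        if abs(h1p - h2p) <= 180:
            h_bar = (h1p + h2p) / 2
        elif h1p + h2p < 360:
            h_bar = (h1p + h2p + 360) / 2
        else:
            h_bar = (h1p + h2p - 360) / 2
    d_h = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_h_deg / 2))
    l_bar, c_barp = (L1 + L2) / 2, (c1p + c2p) / 2
    t = (1 - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    s_l = 1 + 0.015 * (l_bar - 50) ** 2 / math.sqrt(20 + (l_bar - 50) ** 2)
    s_c = 1 + 0.045 * c_barp
    s_h = 1 + 0.015 * c_barp * t
    # Rotation term that couples chroma and hue differences in the blue region.
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(c_barp**7 / (c_barp**7 + 25.0**7))
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c
    return math.sqrt((d_l / s_l) ** 2 + (d_c / s_c) ** 2 + (d_h / s_h) ** 2
                     + r_t * (d_c / s_c) * (d_h / s_h))

# Hypothetical values: a ground-truth sRGB for "blue" and one generated dominant color.
truth = srgb_to_lab((0, 0, 255))
generated = srgb_to_lab((30, 60, 200))
print("dE00:", ciede2000(truth, generated), "d_ab:", delta_ab(truth, generated))
```

Reporting both numbers separates errors in hue and chroma (captured by Δab) from errors that also involve lightness (captured by ΔE00), which is the distinction the abstract draws when it discusses lightness and chroma reproduction.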

  Cite this article 

Robin Jenkin, Vijayalaxmi M, Shailesh Pawale, Francis Fernandes, Ashutosh Naryagol, Salman Sanadi, "A Quantitative Framework for Evaluating Color-name Understanding in Generative AI Models," in Electronic Imaging, 2026, pp. 180-1 - 180-14, https://doi.org/10.2352/EI.2026.38.12.GENAI-180

  Copyright statement 
Copyright © 2026 Society for Imaging Science and Technology
Electronic Imaging
2470-1173
Society for Imaging Science and Technology
IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA