IS&T | Library

A Quantitative Framework for Evaluating Color-name Understanding in Generative AI Models

Generative AI
Large Language Models
Color Science
CIE Lab
Color Naming
Text-to-Image Generation
Perceptual Color Difference (Delta E)

Robin Jenkin, Vijayalaxmi M, Shailesh Pawale, Francis Fernandes, Ashutosh Naryagol, Salman Sanadi

DOI

10.2352/EI.2026.38.12.GENAI-180

Volume 38

Issue 12

Abstract

With the proliferation of text-to-image generative AI models, understanding the fidelity of their output is critical. While these models can generate visually stunning images, their interpretation of nuanced, subjective concepts like color names remains largely unquantified. This paper introduces a systematic framework to evaluate how accurately leading generative AI models (including Flux, Ideogram, Kandinsky, Gemini and Stable Diffusion) understand and reproduce colors from textual prompts. We prompted these models with both one-word (e.g., “blue”) and two-word (e.g., “sky blue”) color names to generate uniform color fields. The resulting images were analyzed by converting them to the perceptually uniform CIE Lab color space. An adaptive k-means clustering algorithm was employed to extract the dominant color, mitigating issues of non-uniformity in the generated images. By calculating the perceptual color difference using CIEDE2000 (ΔE00) and the chromatic distance (Δab) between the AI-generated colors and standardized ground-truth values, we provide a quantitative benchmark of each model’s color accuracy. Our findings reveal that while all models broadly understand the mapping between color names and hue, significant performance variations exist among models, with systematic differences in lightness and chroma reproduction. Per-model analysis reveals a clear hierarchy in chromatic fidelity: Gemini and Flux demonstrate the strongest anchoring, while Kandinsky exhibits striking hue-dependent anisotropy and Stable Diffusion shows the broadest isotropic dispersion. Per-color analysis identifies systematic undersaturation of short-wavelength and high-chroma colors (blue, indigo, magenta) across all models, while warm colors (red, orange, yellow) are generally better grounded. We highlight that results vary significantly across random seeds for the same prompt and model, and that lexical specificity generally, but not universally, improves chromatic grounding. This work provides a robust methodology for auditing and improving color fidelity in future generative models.
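
The analysis steps named in the abstract (convert to CIE Lab, cluster to find the dominant color, measure chromatic distance to a reference) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 8-bit sRGB input with a D65 white point, uses a fixed-k k-means with a simple farthest-point initialization in place of the paper's adaptive variant, and takes Δab to be the Euclidean distance in the a*b* plane. CIEDE2000 is omitted for brevity. The function names and the synthetic "image" are hypothetical.

```python
import math

# D65 reference white (2-degree observer), Y normalized to 1.
WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (assumes D65)."""
    # Undo the sRGB transfer curve to get linear RGB in [0, 1].
    lin = []
    for c in rgb:
        c = c / 255.0
        lin.append(c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = lin
    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def dominant_lab(pixels_lab, k=2, iters=20):
    """Centroid of the largest cluster from a fixed-k k-means in Lab space.

    Assumption: fixed k and farthest-point initialization stand in for the
    adaptive scheme mentioned in the abstract.
    """
    centers = [pixels_lab[0]]
    while len(centers) < k:
        # Next center: the pixel farthest from all centers chosen so far.
        centers.append(
            max(pixels_lab, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels_lab:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Recompute centroids; keep the old center if a cluster is empty.
        centers = [
            tuple(sum(ch) / len(c) for ch in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    sizes = [len(c) for c in clusters]
    return centers[sizes.index(max(sizes))]


def delta_ab(lab1, lab2):
    """Chromatic distance: Euclidean distance in the a*b* plane (assumed)."""
    return math.hypot(lab1[1] - lab2[1], lab1[2] - lab2[2])


# Hypothetical "generated image": mostly blue, with a few white artifact pixels.
image_rgb = [(0, 0, 255)] * 90 + [(255, 255, 255)] * 10
image_lab = [srgb_to_lab(p) for p in image_rgb]
dominant = dominant_lab(image_lab, k=2)

# Hypothetical ground truth: pure sRGB blue.
reference = srgb_to_lab((0, 0, 255))
print("dominant Lab:", dominant)
print("delta_ab to reference:", delta_ab(dominant, reference))
```

Taking the centroid of the largest cluster, rather than the mean of all pixels, keeps the minority artifact pixels from pulling the estimate toward white. This is the kind of non-uniformity the clustering step is described as mitigating.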

Digital Library: EI

Published Online: March 2026

From Pixels to Worlds: A Survey on the New Wave of High-fidelity Video Generation

Video Generation
Large Multimodal Models
Audio-Visual Synthesis
Generative AI
Controllable Generation

Weijuan Xi

DOI

10.2352/EI.2026.38.7.IMAGE-266

Volume 38

Issue 7

Abstract

The field of computer vision is currently undergoing a pivotal transformation, shifting its focus from discriminative to generative tasks. Over the past two decades, the discipline was primarily defined by the discriminative imperative, which sought to enable machines to perceive, classify, and segment the visual world. However, catalyzed by the development of the Diffusion Transformer (DiT), the years 2024 and 2025 marked a Generative Turn, in which the benchmark of artificial visual intelligence evolved from mere classification to controllable simulation. The ability to generate high-fidelity, physically consistent video has led to the development of advanced generative models capable of representing underlying physical dynamics and environmental causality through large-scale data and computation. This survey provides a comprehensive analysis of the recent emergence of high-fidelity video generation. It traces the evolution from the era of feature engineering to the current era of Diffusion Transformer (DiT)-based generation, summarizes the present state of video generation and the technical advancements driving this period, and offers a guide detailing the architectures, data selection, and training methodologies essential for high-fidelity video generation.

Digital Library: EI

Published Online: March 2026

Ghiblification and Color Richness in Material Appearance: How Human Observers and Image Quality Metrics Perceive Them?

Translucency
Gloss
Image Quality
Subjective study
Generative AI
Style Transfer

Mobina Mobini, Olga Cherepkova, Davit Gigilashvili

Pages 136-141, October 2025. This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

DOI

10.2352/CIC.2025.33.1.26

Volume 33

Issue 1

Abstract

Distortions introduced during the reproduction of digital images can lead to substantial changes in their color composition. The motivations for altering images range from practical purposes, such as image compression and color quantization to reduce file size, to more aesthetic applications like style transfer using generative AI. In this work, we investigate how the reproduction of color images affects material appearance, in particular, the perception of gloss and translucency. We applied different image quality distortions to natural images of glossy and translucent objects. Additionally, we Ghiblified them – a recent viral social media phenomenon of mimicking the Japanese anime style using generative AI style transfer. Afterward, we conducted a series of user studies to evaluate the fidelity of gloss and translucency reproduction. The experimental results represent how the reproductions are perceived by image quality metrics and open up a new direction for material appearance studies.

Digital Library: CIC

Published Online: October 2025

Integration of Protocol-driven Chatbots with Generative AI and a Case Study

chatbot
deep logic
Generative AI

Hasmik Yengibaryan, David Akopian

DOI

10.2352/EI.2025.37.3.MOBMU-319

Volume 37

Issue 3

RiFT - Radiance Field Tomography

Abstract

The integration of deterministic protocol-specified chatbots with generative AI bridges the gap between precise, protocol-driven logic and conversational flexibility. This paper introduces MachineQuizzing, a chatbot designed to enhance learning in machine learning through gamified quizzes and real-time explanations. Leveraging platforms like Dialogflow for structured logic and Gemini for generative capabilities, the chatbot demonstrates how the integration of these technologies can enhance conversational experience.

Digital Library: EI

Published Online: February 2025

Proceedings

Deep Learning
Diffraction Limit
Fourier Ptychography
Generative AI
Neural Representation
Radiance Fields
Single Image Super-Resolution
Transformer Networks

Kevin Chew Figueroa, Zhipeng Dong, Greg Nero, Gordon Hageman, David J. Brady

DOI

10.2352/EI.2024.36.15.COIMG-123

Volume 36

Issue 15