Real or Synthetic? An Evaluation of AI-generated Audio-visual Content

Devi Klein; Anustup Choudhury; Evan Gitterman; Jaclyn Pytlarz; Scott Daly

doi:10.2352/EI.2026.38.10.HVEI-232

Abstract

Generative AI (GenAI) models enable scalable multimedia content creation but can introduce artifacts that undermine perceived realism. We conducted a perceptual study to assess how audio-visual cues affect people’s ability to discriminate real user-generated content (UGC) from synthetic AI-generated content. Observers (N=36) performed a two-interval forced-choice task across conditions that manipulated audio-visual consistency. They reliably identified synthetic content, achieving the highest accuracy when visual cues were available and the lowest when they had to rely solely on audio content or quality issues. Our eye-tracking analysis indicated that biological motion inconsistencies were salient, while lower-level, texture-related distortions received less attention. Our proposed taxonomy of audio content and quality issues did not significantly predict task performance. However, these findings highlight the dominant role of visual artifacts in the decision-making process and the relative robustness of GenAI audio. Our work provides guidance for improving the perceptual quality of future, edge-deployed GenAI models.

Electronic Imaging

2470-1173

Society for Imaging Science and Technology

IS&T 7003 Kilworth Lane, Springfield, VA 22151 USA

HVEI-232

Proceedings Paper

Devi Klein

Dolby Laboratories Inc., US

Anustup Choudhury

Dolby Laboratories Inc., US

Evan Gitterman

Dolby Laboratories Inc., US

Jaclyn Pytlarz

Dolby Laboratories Inc., US

Scott Daly

Dolby Laboratories Inc., US

132026

HVEI

Human Vision and Electronic Imaging 2026

232-1

232-8

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

2026

Audio-visual perception; Generative AI Quality Assessment; Psychophysics; Eye tracking; Visual search and Attention
