Analyzing the performance of autoencoder-based objective quality metrics on audio-visual content

Helard Becerra Martinez; Mylène C.Q. Farias; Andrew Hines

doi:10.2352/ISSN.2470-1173.2020.9.IQSP-167

Abstract

The development of audio-visual quality models faces a number of challenges, including the integration of audio and video sensory channels and the modeling of their interaction characteristics. Commonly, objective quality metrics estimate the quality of a single component (audio or video) of the content. Machine learning techniques, such as autoencoders, offer a very promising alternative for developing objective assessment models. This paper studies the performance of a group of autoencoder-based objective quality metrics on a diverse set of audio-visual content. To perform this test, we use a large dataset of audio-visual content (the UnB-AV database), which contains degradations in both audio and video components. The database has accompanying subjective scores collected in three separate subjective experiments. We compare our autoencoder-based methods, which take into account both audio and video components (multi-modal), against several objective (single-modal) audio and video quality metrics. The main goal of this work is to verify the gain or loss in performance of these single-modal metrics when tested on audio-visual sequences.

Electronic Imaging

ISSN 2470-1173

Society for Imaging Science and Technology

7003 Kilworth Lane, Springfield, VA 22151 USA

Published 26 January 2020

Pages 167-1 to 167-6

Keywords: Audio quality; Video quality; Autoencoder; No-reference quality metric; Audio degradations; Video degradations
