Regular Article
Volume: 66 | Article ID: 030511
Studies on Cross-modal Feature-based Mapping from Voice-source to Texture through Image Association by Listening Speech
DOI: 10.2352/J.ImagingSci.Technol.2022.66.3.030511 | Published Online: May 2022
Abstract

The direct correlations between modality-driven parameters of voice-source and texture were investigated. A perceptual experiment was conducted using vowel sounds with three representative phonation types (modal, creaky, and breathy) and texture images annotated with semantic terms. For quantitative analyses, acoustic features measuring vocal fold vibration, periodicity, spectral noise level, fundamental frequency, and energy were calculated. Computational texture features comprising coarseness, contrast, directionality, busyness, complexity, strength, and brightness were extracted. The results showed that the most important feature is the amplitude difference between the first two harmonics (H1-H2). H1-H2 correlates significantly with coarseness, contrast, busyness, complexity, strength, and brightness. Harmonic-to-Noise Ratios (HNRs) correlate strongly with coarseness, busyness, complexity, and strength. Significant correlations were also observed between Cepstral Peak Prominence (CPP) and coarseness, between fundamental frequency (F0) and complexity and brightness, and between energy and strength. These parametric correlations can serve as basic scientific knowledge for cross-modal mapping.
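
The analysis summarized above pairs per-stimulus acoustic measures with texture measures and tests each pairing for a significant relationship. The following Python snippet is a minimal sketch of that kind of feature-to-feature correlation analysis, not the authors' code: the Pearson measure, the sample size, and the placeholder data are assumptions; only the feature names are taken from the abstract.

```python
# Illustrative sketch: correlating acoustic voice-source features with
# texture features across stimuli. Data here are random placeholders;
# in the study these would be measured per stimulus.
import numpy as np
from scipy.stats import pearsonr

acoustic_names = ["H1-H2", "HNR", "CPP", "F0", "energy"]
texture_names = ["coarseness", "contrast", "directionality",
                 "busyness", "complexity", "strength", "brightness"]

rng = np.random.default_rng(0)
n_stimuli = 30                                   # hypothetical sample size
acoustic = rng.normal(size=(n_stimuli, len(acoustic_names)))
texture = rng.normal(size=(n_stimuli, len(texture_names)))

# Pairwise Pearson correlations with p-values, one acoustic-texture pair per row.
for i, a in enumerate(acoustic_names):
    for j, t in enumerate(texture_names):
        r, p = pearsonr(acoustic[:, i], texture[:, j])
        print(f"{a:>7s} vs {t:<14s} r = {r:+.2f}  p = {p:.3f}")
```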

  Cite this article 

Win Thuzar Kyaw, Yoshinori Sagisaka, "Studies on Cross-modal Feature-based Mapping from Voice-source to Texture through Image Association by Listening Speech," in Journal of Imaging Science and Technology, 2022, pp. 030511-1 - 030511-13, https://doi.org/10.2352/J.ImagingSci.Technol.2022.66.3.030511

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2022
  Article timeline 
  • Received: October 2021
  • Accepted: March 2022
  • Published: May 2022
