
Emotion recognition from physiological signals is often limited by unimodal analysis, which fails to capture interactions across physiological systems. This study proposes a multimodal framework that integrates heart rate (HR) and pupil diameter signals, with a particular focus on modeling cross-modal interactions. We introduce composite features that explicitly represent relationships between HR and pupil dynamics, and combine them with a two-step feature optimization strategy: correlation-based reduction followed by mutual information ranking. Experiments were conducted on an emotion-elicitation dataset with three emotional states (Joy, Neutral, Sad), using multiple classifiers and cross-validation schemes. The proposed method achieved a classification accuracy of 91.1%, significantly outperforming the HR-only (61.1%) and pupil-only (72.2%) approaches. Feature analysis revealed that cross-modal descriptors, particularly an entropy-based interaction feature, contributed most to the performance improvement. These results demonstrate that explicitly modeling cross-modal physiological interactions is an effective strategy for improving emotion recognition accuracy.
Tsukasa Yano, Midori Tanaka, Takahiko Horiuchi, "Enhancing Emotion Estimation Accuracy through Integrated Analysis of Heart Rate and Pupil Signals" in Electronic Imaging, 2026, pp 226-1 - 226-5, https://doi.org/10.2352/EI.2026.38.10.HVEI-226
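
The two-step feature optimization named in the abstract (correlation-based reduction, then mutual information ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names (`hr_mean`, `hr_scaled`, `pupil_mean`, `noise`), the toy values, the 0.9 correlation threshold, the three-bin histogram estimate of mutual information, and the `top_k` cutoff are all assumptions chosen only for illustration.

```python
import math
from collections import Counter


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def drop_correlated(columns, threshold=0.9):
    """Step 1: keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def mutual_information(values, labels, bins=3):
    """Histogram estimate of I(feature; label) in nats, using equal-width bins (assumed estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant features
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def select_features(columns, labels, threshold=0.9, top_k=2):
    """Step 2: rank the surviving features by mutual information with the label."""
    kept = drop_correlated(columns, threshold)
    ranked = sorted(
        kept, key=lambda k: mutual_information(columns[k], labels), reverse=True
    )
    return ranked[:top_k]


# Hypothetical toy data: six trials, two per emotional state.
labels = ["Joy", "Joy", "Neutral", "Neutral", "Sad", "Sad"]
columns = {
    "hr_mean": [80, 82, 70, 71, 60, 62],
    "hr_scaled": [160, 164, 140, 142, 120, 124],  # redundant copy of hr_mean
    "pupil_mean": [4.0, 4.2, 3.0, 3.1, 4.1, 3.9],
    "noise": [1, 5, 2, 6, 1.5, 5.5],
}

selected = select_features(columns, labels)
print(selected)
```

In this toy setup the redundant `hr_scaled` column is removed in the first step because it is perfectly correlated with `hr_mean`, and the uninformative `noise` column falls to the bottom of the ranking in the second step. On real data, a k-nearest-neighbor mutual information estimator would likely be preferable to fixed-width bins.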