We propose Identical and Disparate Feature Decomposition (INDeeD), a method that explicitly learns the characteristics of individual labels from multi-label data.
The last decades have witnessed an increasing number of works proposing objective measures for media quality assessment, i.e., estimating the mean opinion score (MOS) of human observers. In this contribution, we investigate the possibility of modeling and predicting single observers' opinion scores rather than the MOS. More precisely, we attempt to approximate the choices of a single observer by designing a neural network (NN) that is expected to mimic that observer's behavior in terms of visual quality perception. Once such NNs (one for each observer) are trained, they can be regarded as "virtual observers": they take as input information about a sequence and output the score that the related observer would have given after watching that sequence. This new approach makes it possible to automatically obtain different opinions regarding the perceived visual quality of a sequence under investigation, and thus to estimate not only the MOS but also other statistical indices, such as the standard deviation of the opinions. Extensive numerical experiments are performed to provide further insight into the suitability of the approach.
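The "virtual observer" idea can be sketched as follows: fit one model per observer on per-sequence features, then query all models on a new sequence and summarize their scores. This is a minimal illustration with synthetic data and simple least-squares models standing in for the per-observer NNs; all names, dimensions, and data here are assumptions for illustration, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_feat, n_observers = 60, 8, 5

# Synthetic per-sequence quality features and per-observer scores:
# a shared underlying quality plus observer-specific bias and noise.
X = rng.normal(size=(n_train, n_feat))
true_w = rng.normal(size=n_feat)
biases = rng.normal(scale=0.3, size=n_observers)
y = np.stack(
    [X @ true_w + b + rng.normal(scale=0.5, size=n_train) for b in biases],
    axis=1,
)  # shape (n_train, n_observers)

# One "virtual observer" per real observer: here a least-squares fit
# (with an intercept column) stands in for a trained NN.
Xb = np.hstack([X, np.ones((n_train, 1))])
models = [np.linalg.lstsq(Xb, y[:, k], rcond=None)[0] for k in range(n_observers)]

# For a new sequence, collect all virtual observers' opinions,
# then estimate the MOS and the spread of opinions.
x_new = np.append(rng.normal(size=n_feat), 1.0)
scores = np.array([x_new @ w for w in models])
mos = scores.mean()
spread = scores.std(ddof=1)
```

The key point is that the ensemble yields a distribution of opinions per sequence, so statistics beyond the mean (e.g., the standard deviation) come for free.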
For decades, image quality analysis pipelines have relied on filters derived from the human visual system. Although this paradigm captures the basic aspects of human vision, it falls short of characterizing the complex human perception of different visual appearances and image qualities. In this work, we propose a new framework that leverages the image recognition capabilities of convolutional neural networks to distinguish the visual differences between uniform halftone target samples printed on different media using the same printing technology. First, for each scanned target sample, a pre-trained Residual Neural Network is used to generate a 2,048-dimensional vision feature vector. Then, Principal Component Analysis (PCA) is used to reduce the dimensionality to 48 components, which are then used to train a Support Vector Machine (SVM) to classify the target images. Our model has been tested on various classification and regression tasks and shows very good performance. Further analysis shows that our neural-network-based image quality model learns to make decisions based on the frequencies of color variations within the target image and is capable of characterizing the visual differences under different printer settings.
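The feature-extraction pipeline described (2,048-d ResNet features → PCA to 48 components → SVM classifier) can be sketched as below. Random vectors stand in for the ResNet features of scanned targets, and the two synthetic classes represent two print media; all data and parameters here are illustrative assumptions, not the authors' code.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_per_class = 40

# Stand-ins for 2,048-d ResNet feature vectors of scanned halftone
# targets from two different media (synthetic, slightly shifted means).
feats_a = rng.normal(loc=0.0, size=(n_per_class, 2048))
feats_b = rng.normal(loc=0.3, size=(n_per_class, 2048))
X = np.vstack([feats_a, feats_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Reduce 2,048 dimensions to 48 principal components.
pca = PCA(n_components=48).fit(X)
X48 = pca.transform(X)

# Train an SVM on the reduced features to classify the media.
clf = SVC(kernel="rbf").fit(X48, y)
acc = clf.score(X48, y)
```

In practice the 2,048-d vectors would come from the global-pooling layer of a pre-trained ResNet, and accuracy would be measured on a held-out split rather than the training set.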
Many people cannot see depth in stereoscopic displays. These individuals are often highly motivated to recover stereoscopic depth perception, but because binocular vision is complex, the loss of stereo has different causes in different people, so treatment cannot be uniform. We have created a virtual reality (VR) system for assessing and treating anomalies in binocular vision. The system is based on a systematic analysis of subsystems upon which stereoscopic vision depends: the ability to converge properly, appropriate regulation of suppression, extraction of disparity, use of disparity for depth perception and for vergence control, and combination of stereoscopic depth with other depth cues. Deficiency in any of these subsystems can cause stereoblindness or limit performance on tasks that require stereoscopic vision. Our system uses VR games to improve the function of specific, targeted subsystems.