IS&T | Library

Conference Overview and Papers Program

22 4

Fundamental vision
perception
cognition research
Perceptual approaches to image quality
Visual and cognitive issues in imaging and analysis
Art
aesthetics
emotion
Vision
audition
haptics
multisensory

Pages A12-1 - A12-10, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-A12

Volume 31

Issue 12

Digital Library: EI

Published Online: January 2019

Object-based and multi-frame motion information predict human eye movement patterns during video viewing

44 1

Eye Movements
Object Recognition
Dynamic Features

Zheng Ma, Jiaxin Wu, Sheng-hua Zhong, Stephen J Heinen

Pages 205-1 - 205-6, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-205

Volume 31

Issue 12

Compared to low-level saliency, higher-level information better predicts human eye movement in static images. In the current study, we tested how both types of information predict eye movements while observers view videos. We generated multiple eye movement prediction maps based on low-level saliency features, as well as higher-level information that requires cognition, and therefore cannot be interpreted with only bottom-up processes. We investigated eye movement patterns to both static and dynamic features that contained either lowor higher-level information. We found that higher-level object-based and multi-frame motion information better predict human eye movement patterns than static saliency and two-frame motion information, and higher-level static and dynamic features provide equally good predictions. The results suggest that object-based processes and temporal integration of multiple video frames are essential to guide human eye movements during video viewing.

Digital Library: EI

Published Online: January 2019

Discovery of activities via statistical clustering of fixation patterns

29 4

eye movements
machine learning
scan path comparison

Jeffrey B Mulligan

Pages 206-1 - 206-8, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-206

Volume 31

Issue 12

Human behavior often consists of a series of distinct activities, each characterized by a unique pattern of interaction with the visual environment. This is true even in a restricted domain, such as a pilot flying an airplane; in this case, activities with distinct visual signatures might be things like communicating, navigating, monitoring, etc. We propose a novel analysis method for gaze-tracking data, to perform blind discovery of these hypothetical activities. The method is in some respects analogous to recurrence analysis, which has previously been applied to eye movement data. In the present case, however, we compare not individual fixations, but groups of fixations aggregated over a fixed time interval (t). We assume that the environment has been divided into a finite set of discrete areas-of-interest (AOIs). For a given time interval, we compute the proportion of time spent fixating each AOI, resulting in an N-dimensional vector, where N is the number of AOIs. These proportions can be converted to integer counts by multiplying by t divided by the average fixation duration, a parameter that we fix at 283 milliseconds. We compare different intervals by computing the chi-squared statistic. The p-value associated with the statistic is the likelihood of observing the data under the hypothesis that the data in the two intervals were generated by a single process with a single set of probabilities governing the fixation of each AOI. We cluster the intervals, first by merging adjacent intervals that are sufficiently similar, optionally shifting the boundary between non-merged intervals to maximize the difference. Then we compare and cluster non-adjacent intervals. The method is evaluated using synthetic data generated by a hand-crafted set of activities. While the method generally finds more activities than put into the simulation, we have obtained agreement as high as 80% between the inferred activity labels and ground truth.

Digital Library: EI

Published Online: January 2019

Investigation of the effect of pupil diameter on visual acuity using a neuro-physiological model of the human eye

58 6

visual acuity
pupil diameter
human eye model
psychometric function
surrounding illumination
vision

Csilla Timár-Fülep, Gábor Erdei

Pages 207-1 - 207-7, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-207

Volume 31

Issue 12

Human visual acuity strongly depends on environmental conditions. One of the most important physical parameters affecting its value is the pupil diameter, which follows changes in the surrounding illumination by adaptation. Thus, the direct measurement of its influence on visual performance would require either medicaments or inconvenient apertures placed in front of the subjects? eyes to examine different pupil sizes, so it has not been studied in detail yet. In order to analyze this effect directly, without any external intervention, we accomplished simulations by our complex neuro-physiological vision model. It considers subjects as ideal observers limited by optical and neural filtering, as well as neural noise, and represents character recognition by templatematching. Using the model, we reconstructed the monocular visual acuity of real subjects with optical filtering calculated from the measured wavefront aberration of their eyes. According to our simulations, 1 mm alteration in the pupil diameter causes 0.05 logMAR change in the visual acuity value on average. Our result is in good agreement with former clinical experience derived indirectly from measurements that had independently analyzed the effect of background illumination on pupil size and on visual quality.

Digital Library: EI

Published Online: January 2019

On the role of edge orientation in stereo vision

32 2

stereo
orientation
visual cortex

Alfredo Restrepo, Julian Quiroga

Pages 209-1 - 209-6, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-209

Volume 31

Issue 12

As a biologically inspired guess, we consider two stereo information channels. One is the traditional channel that works on the basis of the horizontal disparity between the left and right projections of single points in the 3D scene; this channel carries information regarding the absolute depth of the point. The second channel works on the basis of the projections of pairs of points in the 3D scene and carries information regarding the relative depth of the points; equivalently, for a given azimuth disparity of the points, the channel carries information of the ratio of the orientations of the left and right projections of the line segment between the pair of points.

Digital Library: EI

Published Online: January 2019

A visual model for predicting chromatic banding artifacts

321 36

perception
image quality
quantization
contouring
banding
color spaces

Gyorgy Denes, George Ash, Huameng Fang, Rafał K Mantiuk

Pages 212-1 - 212-8, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-212

Volume 31

Issue 12

Quantization of images containing low texture regions, such as sky, water or skin, can produce banding artifacts. As the bitdepth of each color channel is decreased, smooth image gradients are transformed into perceivable, wide, discrete bands. Commonly used quality metrics cannot reliably measure the visibility of such artifacts. In this paper we introduce a visual model for predicting the visibility of both luminance and chrominance banding artifacts in image gradients spanning between two arbitrary points in a color space. The model analyzes the error introduced by quantization in the Fourier space, and employs a purpose-built spatio-chromatic contrast sensitivity function to predict its visibility. The output of the model is a detection probability, which can be then used to compute the minimum bit-depth for which banding artifacts are just-noticeable. We demonstrate that the model can accurately predict the results of our psychophysical experiments.

Digital Library: EI

Published Online: January 2019

NARVAL: A no-reference video quality tool for real-time communications

200 4

video quality metric
real-time communication
neural network

Augustin Lemesle, Alexis Marion, Ludovic Roux, Alexandre Gouaillard

Pages 213-1 - 213-7, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-213

Volume 31

Issue 12

In this paper we introduce two new no-reference metrics and compare their performance to state-of-the-art metrics on six publicly available datasets having a large variety of distortions and characteristics. Our two metrics, based on neural networks, combine the following features: histogram of oriented gradients, edges detection, fast fourier transform, CPBD, blur and contrast measurement, temporal information, freeze detection, BRISQUE and Video BLIINDS. They perform better than Video BLIINDS and BRISQUE on the six datasets used in this study, including one made up of natural videos that have not been artificially distorted. Our metrics show a good generalization as they achieved high performance on the six datasets.

Digital Library: EI

Published Online: January 2019

An improved objective metric to predict image quality using deep neural networks

62 4

Objective image quality assessment
Full reference image quality assessment
Deep convolutional neural networks
Wavelet transform

Pinar Akyazi, Touradj Ebrahimi

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-214

Volume 31

Issue 12

Objective quality assessment of compressed images is very useful in many applications. In this paper we present an objective quality metric that is better tuned to evaluate the quality of images distorted by compression artifacts. A deep convolutional neural networks is used to extract features from a reference image and its distorted version. Selected features have both spatial and spectral characteristics providing substantial information on perceived quality. These features are extracted from numerous randomly selected patches from images and overall image quality is computed as a weighted sum of patch scores, where weights are learned during training. The model parameters are initialized based on a previous work and further trained using content from a recent JPEG XL call for proposals. The proposed model is then analyzed on both the above JPEG XL test set and images distorted by compression algorithms in the TID2013 database. Test results indicate that the new model outperforms the initial model, as well as other state-of-the-art objective quality metrics.

Digital Library: EI

Published Online: January 2019

Analyze and predict the perceptibility of UHD video contents

115 5

uhd
video perception
machine learning

Steve Göring, Julian Zebelein, Simon Wedel, Dominik Keller, Alexander Raake

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-215

Volume 31

Issue 12

720p, Full-HD, 4K, 8K, …, display resolutions are increasing heavily over the past time. However, many video streaming providers are currently streaming videos with a maximum of 4K/UHD-1 resolution. Considering that normal video viewers are enjoying their videos in typical living rooms, where viewing distances are quite large, the question arises if more resolution is even recognizable. In the following paper we will analyze the problem of UHD perceptibility in comparison with lower resolutions. As a first step, we conducted a subjective video test, that focuses on short uncompressed video sequences and compares two different testing methods for pairwise discrimination of two representations of the same source video in different resolutions. We selected an extended stripe method and a temporal switching method. We found that the temporal switching is more suitable to recognize UHD video content. Furthermore, we developed features, that can be used in a machine learning system to predict whether there is a benefit in showing a given video in UHD or not. Evaluating different models based on these features for predicting perceivable differences shows good performance on the available test data. Our implemented system can be used to verify UHD source video material or to optimize streaming applications.

Digital Library: EI

Published Online: January 2019

Complexity measurement and characterization of 360-degree content

123 7

Content characterisation
Omnidirectional content
Video quality assessment
Content complexity
360-degree content

Francesca De Simone, Jesús Gutiérrez, Patrick Le Callet

DOI

10.2352/ISSN.2470-1173.2019.12.HVEI-216

Volume 31

Issue 12