The conference on Human Vision and Electronic Imaging explores the role of human perception and cognition in the design, analysis, and use of electronic media systems. Over the years, it has brought together researchers, technologists, and artists from all over the world for a rich and lively exchange of ideas. We believe that understanding the human observer is fundamental to the advancement of electronic media systems, and that advances in these systems and applications drive new research into the perception and cognition of the human observer. Every year, we introduce new topics through our Special Sessions, centered on areas driving innovation at the intersection of perception and emerging media technologies.
The method of loci (memory palace technique) is a learning strategy that uses visualizations of spatial environments to enhance memory. One particularly popular use of the method of loci is for language learning, in which the method can strengthen long-term retention of vocabulary by allowing users to associate location and other spatial information with particular words or concepts, thereby recruiting spatial memory to support the memory processes typically associated with language. Augmented reality (AR) and virtual reality (VR) have been suggested to provide even stronger memory enhancement owing to their richer visualization capabilities. However, the two technologies have not yet been directly compared in terms of language-learning enhancement. In this presentation, we report the results of a study designed to compare AR and VR when using the method of loci for learning vocabulary in a second language.
We have developed an assistive technology for people with visual impairments involving central field loss (CFL) and low contrast sensitivity (LCS). Our technology includes a pair of holographic AR glasses that provide enhanced image magnification and contrast, for example by highlighting objects and detecting signs and words. In contrast to prevailing AR technologies, which project either mixed-reality or virtual objects onto the glasses, our solution fuses real-time sensory information and enhances images of reality. The AR glasses have two advantages. First, they are relatively "fail-safe": if the battery dies or the processor crashes, the glasses still function because the lenses are transparent. Second, they can be transformed into a VR or AR simulator by overlaying virtual objects, such as pedestrians or vehicles, onto the glasses for simulation. The real-time visual enhancements and alert information are overlaid on the transparent glasses. The visual enhancement modules include zooming, Fourier filters, contrast enhancement, and contour overlay. Our preliminary tests with low-vision patients show that the AR glasses indeed improved patients' vision and mobility, for example improving acuity from 20/80 to 20/25 or 20/30.
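To make the enhancement pipeline concrete, below is a minimal sketch of two of the named modules (a Fourier filter and contrast enhancement) for a grayscale image normalized to [0, 1]; the function names, parameters, and cut-off values are illustrative assumptions, not the glasses' actual implementation.

```python
import numpy as np

def fourier_bandpass(image, low=0.02, high=0.25):
    """Band-pass filter in the frequency domain to emphasize edges.

    low/high are normalized spatial frequencies in cycles per pixel
    (illustrative defaults, not the device's tuned values).
    """
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    mask = (radius >= low) & (radius <= high)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mask))

def stretch_contrast(image, gain=2.0):
    """Linear contrast stretch about the image mean, clipped to [0, 1]."""
    m = image.mean()
    return np.clip((image - m) * gain + m, 0.0, 1.0)
```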
The critical flicker fusion (CFF) frequency is the rate of temporal modulation at which a periodically changing light begins to appear completely steady to an observer. This value is affected by several visual factors, such as the luminance of the stimulus or its location on the retina. With new high dynamic range (HDR) displays operating at higher luminance levels, and virtual reality (VR) displays presenting wide fields of view, the effective CFF may change significantly from the values expected under traditional viewing conditions. In this work, we use a prototype HDR VR display capable of luminances up to 20,000 cd/m^2 to gather a novel set of CFF measurements at previously unexamined levels of luminance, eccentricity, and size. Our data are useful for studying the temporal behavior of the visual system at high luminance levels, as well as for setting practical thresholds for display engineering.
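As background for why higher luminances matter here, the classic Ferry-Porter law states that CFF rises roughly linearly with log luminance. The sketch below illustrates the law's form only; the constants a and b are illustrative placeholders (in practice they are fit per observer, eccentricity, and stimulus size), not values from this study.

```python
import numpy as np

def ferry_porter_cff(luminance_cd_m2, a=35.0, b=9.6):
    """Ferry-Porter law: CFF (Hz) rises linearly with log luminance.

    a and b are placeholder constants for illustration only.
    """
    return a + b * np.log10(luminance_cd_m2)

# The law predicts substantially higher CFF at HDR luminance levels.
for L in (100, 1_000, 20_000):
    print(f"{L:>6} cd/m^2 -> ~{ferry_porter_cff(L):.0f} Hz")
```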
Spatial and temporal contrast sensitivity are typically measured using different stimuli: Gabor patterns for spatial contrast sensitivity and flickering discs for temporal contrast sensitivity. The data from the two types of studies are difficult to compare, as there is no well-established relationship between sensitivity to discs and sensitivity to Gabor patterns. The goal of this work is to propose a model that can predict the contrast sensitivity of a disc using the more commonly available data and models for Gabors. To that end, we measured the contrast sensitivity for discs of different sizes, shown at different luminance levels, for both achromatic and chromatic (isoluminant) contrast. We used these data to compare six models, each of which tested a different hypothesis about the detection and integration mechanisms underlying disc contrast perception. The results indicate that multiple detectors contribute to the perception of disc stimuli, and that each can be modelled either with an energy model or with the peak spatial frequency of the contrast sensitivity function.
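One simplified reading of the energy-model hypothesis mentioned above is that sensitivity grows with the square root of integrated stimulus area (Piper's law) up to a critical integration area, after which it saturates. The sketch below encodes that reading; the parameter names and values are hypothetical, not the fitted models from the study.

```python
import numpy as np

def disc_sensitivity_energy(area_deg2, s_peak=100.0, area_crit=1.0):
    """Energy-model prediction of disc contrast sensitivity.

    Sensitivity follows sqrt(area) (Piper's law) up to a critical
    integration area, then saturates; s_peak and area_crit are
    hypothetical fitted parameters.
    """
    effective_area = np.minimum(area_deg2, area_crit)
    return s_peak * np.sqrt(effective_area / area_crit)

print(disc_sensitivity_energy(np.array([0.25, 1.0, 4.0])))
```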
Lightness perception is a long-standing topic in research on human vision, but very few image-computable models of lightness have been formulated. Recent work in computer vision has used artificial neural networks and deep learning to estimate surface reflectance and other intrinsic image properties. Here we investigate whether such networks are useful as models of human lightness perception. We train a standard deep learning architecture on a novel image set that consists of simple geometric objects with a few different surface reflectance patterns. We find that the model performs well on this image set, generalizes well across small variations, and outperforms three other computational models. The network has partial lightness constancy, much like human observers, in that illumination changes have a systematic but moderate effect on its reflectance estimates. However, the network generalizes poorly beyond the type of images in its training set: it fails on a lightness matching task with unfamiliar stimuli, and does not account for several lightness illusions experienced by human observers.
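For readers unfamiliar with this setup, the following is a minimal sketch of the training step for a network that regresses per-pixel reflectance from a rendered image. The architecture, sizes, and data here are stand-ins assumed for illustration; the paper itself uses a standard deep learning architecture and its own image set.

```python
import torch
import torch.nn as nn

class ReflectanceNet(nn.Module):
    """Tiny stand-in for a reflectance-estimation network (illustrative)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # reflectance in (0, 1)
        )

    def forward(self, x):
        return self.net(x)

model = ReflectanceNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in batch: rendered RGB images and ground-truth reflectance maps.
images = torch.rand(8, 3, 64, 64)
reflectance = torch.rand(8, 1, 64, 64)

loss = nn.functional.mse_loss(model(images), reflectance)
loss.backward()
opt.step()
```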
Accurate models of the electroretinogram (ERG) are important both for understanding the manifold processes by which the retina transduces light into ecologically useful signals and for their diagnostic use in identifying the array of retinal diseases. The present neuroanalytic model of the human rod ERG is elaborated from the same general principles as that of Hood & Birch (1992), but incorporates the more recent understanding of the early stages of ERG generation from Robson & Frishman (2014). As a result, it provides a significantly better match to six different waveform features of the canonical ERG flash-intensity series than previous models of rod responses.
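For orientation, the Hood & Birch-style description of the rod photoreceptor (P3) response to a flash of intensity $i$ is commonly written as a delayed saturating exponential; the present model elaborates beyond this starting point:

$$
P3(i, t) \;=\; R_{\max}\left\{1 - \exp\!\left[-\,i\,S\,(t - t_d)^{2}\right]\right\}, \qquad t > t_d,
$$

where $S$ is a sensitivity parameter, $t_d$ a brief transduction delay, and $R_{\max}$ the saturated response amplitude.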
Over the past several years, international COVID data have been collected by several reputable organizations and made available to the worldwide community. This has resulted in a wellspring of different visualizations. Many different measures can be selected (e.g., cases, deaths, hospitalizations), and for each measure, designers and policy makers can make a myriad of choices about how to represent the data. Data from individual countries may be presented on linear or log scales; daily, weekly, or cumulatively; alone or in the context of other countries; scaled to a common grid or to their own range; raw or per capita; and so on. It is well known that the representation can influence the interpretation of data. But what visual features in these different representations affect our judgments? To explore this question, we conducted an experiment in which we asked participants to look at time-series data plots and assess how safe they would feel if they were traveling to one of the countries represented, and how confident they were in their judgment. Observers rated 48 visualizations of the same data, rendered differently along six controlled dimensions. Our initial results provide insight into how characteristics of the visual representation affect human judgments of time-series data. We also discuss how these results could influence how public policy and news organizations choose to represent data to the public.
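As a concrete illustration of the kind of rendering choices involved, the sketch below plots one synthetic case series under four combinations of two such dimensions (linear vs. log scale, raw vs. per capita); the data and the choice of dimensions are assumptions for illustration, not the experiment's stimuli.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic daily-case series for one hypothetical country.
days = np.arange(120)
cases = 50 * np.exp(days / 25) / (1 + np.exp((days - 80) / 10))
per_100k = 1e5 * cases / 10_000_000  # assumed population of 10 million

fig, ax = plt.subplots(2, 2, figsize=(8, 6), sharex=True)
ax[0, 0].plot(days, cases);        ax[0, 0].set_title("raw, linear")
ax[0, 1].semilogy(days, cases);    ax[0, 1].set_title("raw, log")
ax[1, 0].plot(days, per_100k);     ax[1, 0].set_title("per 100k, linear")
ax[1, 1].semilogy(days, per_100k); ax[1, 1].set_title("per 100k, log")
fig.suptitle("The same series under four rendering choices")
plt.show()
```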
The population of people with low vision grows continuously as society ages. As reported by the WHO, most of this population is over the age of 50, and 81% had not been affected by any visual problem before. A visual deficiency can dramatically affect quality of life and challenge the preservation of a safe, independent existence. This study presents an LED-based lighting approach to assist people facing age-related visual impairment. The research procedure is based on psychophysical experiments consisting of the ordering of standard color samples. Volunteers wearing low-vision simulation goggles performed this ordering under different illumination conditions produced by a 24-channel multispectral lighting system. A filtering technique using color rendering indices, coupled with color measurements, allowed us to objectively determine the lighting conditions providing the best scores in terms of color discrimination. The experimental results were used to combine three channels to produce a white light that induces stronger color perception in a low-vision context than the white LEDs available today for general lighting. Although further studies are required, these first results give hope for the design of smart lighting devices that adapt to the visual needs of the visually impaired.
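As a sketch of how three channels might be combined to hit a target white point, the code below solves a small least-squares problem over synthetic channel spectra; the Gaussian spectra and the crude color-matching-function stand-ins are assumptions for illustration, not the study's measured 24-channel system.

```python
import numpy as np

wl = np.arange(380, 781, 5, dtype=float)  # wavelength grid, nm

def gauss(mu, sigma):
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Crude Gaussian stand-ins for the CIE 1931 color matching functions.
cmf = np.stack([1.06 * gauss(599, 38) + 0.36 * gauss(446, 19),  # x-bar
                gauss(556, 47),                                  # y-bar
                1.78 * gauss(449, 28)])                          # z-bar

# Three hypothetical LED channel spectra (blue, green, red).
channels = np.stack([gauss(455, 12), gauss(530, 18), gauss(625, 10)])

# Solve for drive weights whose mixture matches a D65-like white (XYZ).
target_xyz = np.array([0.95, 1.00, 1.09])
A = cmf @ channels.T                      # XYZ tristimulus of each channel
weights, *_ = np.linalg.lstsq(A, target_xyz, rcond=None)
print("channel drive weights:", np.round(weights, 3))
```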
The motivation for the use of biosensors in audiovisual media is made by highlighting the problem of signal loss due to wide variability in playback devices. We describe a metadata system that allows creatives to steer signal modifications as a function of audience emotion and cognition, as determined by biosensor analysis.
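A minimal sketch of what one record of such metadata could look like follows; the field names and values are hypothetical, as the abstract does not specify the schema.

```python
from dataclasses import dataclass

@dataclass
class AdaptationCue:
    """One hypothetical cue: a creative-intent rule mapping a
    biosensor-derived audience state to a playback-side adjustment."""
    timestamp_s: float   # position in the programme
    target_state: str    # e.g. "low_attention", as inferred from biosensors
    parameter: str       # playback parameter to modify
    adjustment: float    # offset applied when the state is detected

cue = AdaptationCue(timestamp_s=92.5, target_state="low_attention",
                    parameter="dialogue_gain_db", adjustment=3.0)
```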