Modern production and distribution workflows have allowed high dynamic range (HDR) imagery to become widespread. HDR has had a positive impact on the creative industry and improved image quality on consumer devices. Akin to loudness dynamics in audio, the increased luminance range of HDR ecosystems is predicted to introduce unintended, high-magnitude luminance changes. Such changes could occur at program transitions, advertisement insertions, and channel change operations. In this article, we present findings from a psychophysical experiment conducted to evaluate three components of HDR luminance changes: the magnitude of the change, the direction of the change (darker or brighter), and the adaptation time. Results confirm that all three components exert significant influence. We find that increasing either the magnitude of the luminance change or the adaptation time results in more discomfort at the unintended transition. We find that transitioning from brighter to darker stimuli has a non-linear relationship with adaptation time, with discomfort falling off steeply at very short durations.
Interdisciplinary research in human vision has contributed greatly to the current state of the art in computer vision and machine learning, ranging from low-level topics such as image compression and image quality assessment to complex neural networks for object recognition. Representations similar to those in the primary visual cortex are frequently employed, e.g., linear filters in both image compression and deep neural networks. Here, we first review particular nonlinear visual representations that can be used to better understand human vision and to provide efficient representations for computer vision, including deep neural networks. We then focus on i2D representations, which are related to end-stopped neurons. The resulting E-nets are deep convolutional networks that outperform some state-of-the-art deep networks. Finally, we show that the performance of E-nets can be further improved by using genetic algorithms to optimize the architecture of the network.
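The architecture-optimization step can be illustrated with a minimal genetic algorithm. This is a toy sketch, not the paper's E-net procedure: the genome (three convolutional filter counts) and the fitness function (a hypothetical capacity-versus-parameter-count proxy) are assumptions for illustration only; in practice fitness would be validation accuracy of the trained network.

```python
import random

random.seed(0)

# Each genome encodes filter counts for three conv layers.
CHOICES = [8, 16, 32, 64]

def fitness(genome):
    # Hypothetical proxy objective (NOT the paper's): reward capacity,
    # penalize the number of inter-layer parameters.
    capacity = sum(genome)
    params = sum(a * b for a, b in zip(genome, genome[1:]))
    return capacity - 0.01 * params

def mutate(genome):
    g = list(genome)
    g[random.randrange(len(g))] = random.choice(CHOICES)
    return g

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=20, generations=30):
    pop = [[random.choice(CHOICES) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Because the top half of each generation survives unchanged, the best fitness found is monotonically non-decreasing across generations.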
The psychogenesis of visual awareness is an autonomous process in the sense that you do not “do” it. However, you have some control through your actions in the world. We share this process with many animals. Pictorial awareness, by contrast, appears to be truly human. Here situational awareness splits into an “everyday vision” mode and a “pictorial” mode. We focus mainly on the spatial aspects of pictorial art. You have no control whatsoever over the picture’s structure: pictorial awareness is pure imagery, constrained by the (physical) structure of the picture. Crafting pictures and beholding pictures are distinct, but closely related, acts. We present an account from experimental and formal phenomenology, resulting in a generic model that accounts for the bulk of both formal (rare) and informal (common) observations.
Experiencing art calls for a unique processing mode – this premise has been debated repeatedly over the last 300 years. Despite that, we still lack the theoretical and empirical basis for understanding this mode, which is essential to understanding the experience of art. We begin this position paper by reviewing the literature related to this mode, revealing a wide diversity of hardly commensurable theoretical approaches. This might be an important reason for the thin empirical data on this theme, especially when looking for ecologically valid experimental studies. We propose the Mode of Art eXperience (MAX) concept to establish a coherent theoretical framework. We argue that even well-established works often overlook the essence of the more profound, so to say “true”, art experience. We discuss MAX in relation to evolutionary psychology, art history, and other cognitive modes (play, religion, and the Everyday). We also propose that MAX is not the only extraordinary mode for processing information in a specific way, but that experiencing art evidently requires a frame that enables MAX to unfold the full range of art-related phenomena that make art so culturally particular and essential for humankind.
Light-permeable materials are usually characterized by the perceptual attributes of transparency, translucency, and opacity. Technical definitions and standards leave room for subjective interpretation of how these perceptual attributes relate to optical properties and to one another, which causes miscommunication in industry and academia alike. A recent work hypothesized that a Gaussian function, or a similar bell-shaped curve, describes the relationship between translucency on the one hand, and transparency and opacity on the other. Another work proposed a translucency classification system for computer graphics, in which transparency, translucency, and opacity are modulated by three optical properties: subsurface scattering, subsurface absorption, and surface roughness. In this work, we conducted two psychophysical experiments to scale the magnitude of transparency and translucency of different light-permeable materials, to test the hypothesis that a Gaussian function can model the relationship between transparency and translucency, and to assess how well the aforementioned classification system describes the relationship between optical and perceptual properties. We found that the results vary significantly across shapes. While a bell-shaped relationship between transparency and translucency was observed for spherical objects, it did not generalize to a more complex shape. Furthermore, how optical properties modulate transparency and translucency also depends on the object shape. We conclude that these cross-shape differences are rooted in the different image cues generated by different object scales and surface geometries.
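The hypothesized bell-shaped relationship can be written down concretely. The sketch below is illustrative only: the peak location and width are assumed parameters, not values fitted to the paper's data. Translucency is modeled as maximal at an intermediate transparency level and decaying toward the fully opaque and fully transparent extremes.

```python
import numpy as np

def translucency(transparency, peak=0.5, width=0.2):
    """Gaussian model of the hypothesis: translucency peaks at an
    intermediate transparency level (`peak`, `width` are assumptions)."""
    return np.exp(-0.5 * ((transparency - peak) / width) ** 2)

t = np.linspace(0.0, 1.0, 101)   # 0 = fully opaque, 1 = fully transparent
y = translucency(t)
```

Under this model, translucency is symmetric about the peak, so a fully opaque and a fully transparent material would be rated equally non-translucent — exactly the property the spherical-object data supported and the complex shape did not.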
Both natural scene statistics and ground surfaces have been shown to play important roles in visual perception, in particular in the perception of distance. Yet there have been surprisingly few studies of the natural statistics of distances to the ground, and the studies that do exist used a loose definition of ground. Additionally, perception studies investigating the role of the ground surface typically use artificial scenes containing perfectly flat ground surfaces with relatively few non-ground objects, whereas ground surfaces in natural scenes are typically non-planar and are occluded by a large number of non-ground objects. Our study investigates the distance statistics of many natural scenes across three datasets, with the goal of analyzing the ground surface and non-ground objects separately. We used a recent filtering method to partition LiDAR-acquired 3D point clouds into ground points and non-ground points. We then examined how the distance distributions depend on viewing elevation angle and simulated viewing height. We found, first, that the distance distribution of ground points shares some similarities with that of a perfectly flat plane, namely a sharp peak at a near distance that depends on viewing height, but also some differences. Second, we found that the distribution of non-ground points is flatter and did not vary with viewing height. Third, we found that the proportion of non-ground points increases with viewing elevation angle. Our findings provide further insight into the statistical information available for distance perception in natural scenes, and suggest that studies of distance perception should consider a broader range of ground surfaces and object distributions than has been used in the past, in order to better reflect the statistics of natural scenes.
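The analysis pipeline can be sketched in a few lines. This is a simplified stand-in, not the study's method: the scene is synthetic (a flat plane plus scattered "objects" rather than the three LiDAR datasets), and the height threshold is a crude substitute for the published ground-filtering method the study actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in scene: a flat ground plane at z = 0 plus raised points
# representing non-ground objects (columns are x, y, z in meters).
ground = np.column_stack([rng.uniform(-50, 50, 5000),
                          rng.uniform(1, 100, 5000),
                          np.zeros(5000)])
objects = np.column_stack([rng.uniform(-50, 50, 1000),
                           rng.uniform(1, 100, 1000),
                           rng.uniform(0.5, 5.0, 1000)])
points = np.vstack([ground, objects])

# Crude ground/non-ground partition: a simple height threshold standing in
# for the study's point-cloud filtering method.
is_ground = points[:, 2] < 0.25

# Distances from a simulated eye position at a chosen viewing height.
viewing_height = 1.6
eye = np.array([0.0, 0.0, viewing_height])
dist = np.linalg.norm(points - eye, axis=1)

ground_dist = dist[is_ground]
nonground_dist = dist[~is_ground]
```

Histogramming `ground_dist` for several values of `viewing_height` reproduces the qualitative signature the study describes for a flat plane: the near-distance peak shifts with eye height, while the non-ground distribution is comparatively flat.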
The investigation of aesthetics has primarily been conducted within the visual domain. This is not surprising, as aesthetics has largely been associated with the perception and appreciation of visual media such as traditional artworks, photography, and architecture. However, one does not need to look far to realize that aesthetics extends beyond the visual domain. Media such as film and music introduce a unique and equally rich temporally changing visual and auditory experience. Product design, ranging from furniture to clothing, depends strongly on pleasant tactile evaluations. Studies of the perception of 1/f statistics in vision have been particularly consistent in demonstrating a preference for a 1/f structure resembling that of natural scenes, as well as systematic individual differences across a variety of visual objects. Interestingly, comparable findings have been reached in the auditory and tactile domains. In this review, we discuss some of the current literature on the perception of 1/f statistics across different sensory modalities.
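The 1/f statistic itself is easy to make concrete. The sketch below is a generic illustration (not a stimulus from any reviewed study): it synthesizes a one-dimensional signal whose amplitude spectrum falls off as f^(-alpha) with alpha = 1, then recovers the exponent by linear regression in log-log coordinates — the same slope estimate used to characterize natural-scene spectra.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthesize a 1/f signal: shape a white-noise spectrum so the amplitude
# spectrum falls off as f^(-alpha), then invert back to the signal domain.
n = 4096
freqs = np.fft.rfftfreq(n, d=1.0)
alpha = 1.0
spectrum = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
spectrum[1:] *= freqs[1:] ** (-alpha)   # impose the 1/f amplitude fall-off
spectrum[0] = 0.0                        # zero the DC component
signal = np.fft.irfft(spectrum, n)

# Recover the exponent: slope of log-amplitude vs. log-frequency.
amp = np.abs(np.fft.rfft(signal))[1:]
slope, _ = np.polyfit(np.log(freqs[1:]), np.log(amp), 1)
```

For natural scenes the fitted slope clusters near -1 in amplitude (equivalently, power proportional to 1/f^2), which is why alpha = 1 is the reference point for the preference findings discussed above.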
In this work we study the perception of suprathreshold translucency differences, to expand our knowledge of material appearance perception in imaging, computer graphics, and 3D printing applications. Translucency is one of the most salient appearance attributes and significantly affects the look of objects and materials. However, knowledge about translucency perception remains limited, and even less is known about the perception of translucency differences between materials. We hypothesize that humans are more sensitive to small changes in absorption and scattering coefficients when optically thin materials are examined and when objects have geometrically thin parts. To test these hypotheses, we generated images of objects with different shapes and subsurface scattering properties and conducted psychophysical experiments with these visual stimuli. The analysis of the experimental data supports these hypotheses, and based on observers' post-experiment comments, we argue that the results may demonstrate a fundamental difference between translucency perception mechanisms for see-through and non-see-through objects and materials.
Medical image data is critically important for a range of disciplines, including medical image perception research, clinician training programs, and computer vision algorithms, among many other applications. Authentic medical image data, unfortunately, is relatively scarce for many of these uses. Because of this, researchers often collect their own data in nearby hospitals, which limits the generalizability of the data and findings. Moreover, even when larger datasets become available, they are of limited use because of the necessary data processing procedures, such as de-identification, labeling, and categorizing, which require significant time and effort. Thus, in some applications, including behavioral experiments on medical image perception, researchers have used naive artificial medical images (e.g., shapes or textures that are not realistic). These artificial images are easy to generate and manipulate, but their lack of authenticity inevitably raises questions about the applicability of the research to clinical practice. Recently, with the great progress in Generative Adversarial Networks (GANs), authentic images can be generated with high quality. In this paper, we propose using GANs to generate authentic medical images for medical imaging studies. We also adopt a controllable method to manipulate the generated image attributes so that these images can satisfy arbitrary experimenter goals, tasks, or stimulus settings. We have tested the proposed method on various medical image modalities, including mammogram, MRI, CT, and skin cancer images. The generated medical images demonstrate the success of the proposed method. The model and generated images could be employed in medical image perception research.
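One common way to make GAN outputs controllable is latent-space editing: move a latent code along a direction associated with an attribute, and regenerate. The sketch below is a conceptual toy only — the linear "generator", the attribute direction, and all dimensions are assumptions for illustration, not the paper's model or training procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for a trained, frozen GAN generator: a fixed linear map from
# a 64-d latent code to a 256-d "image" vector.
latent_dim, image_dim = 64, 256
G = rng.standard_normal((image_dim, latent_dim)) / np.sqrt(latent_dim)

def generate(z):
    """Toy generator standing in for a trained GAN generator network."""
    return G @ z

# Hypothetical attribute direction in latent space (e.g., lesion size);
# in practice such directions are estimated from labeled generated samples.
d = rng.standard_normal(latent_dim)
d /= np.linalg.norm(d)

z = rng.standard_normal(latent_dim)
img = generate(z)
img_edited = generate(z + 2.0 * d)   # strengthen the attribute (alpha = 2)
```

Sweeping the scalar step (here 2.0) yields a family of stimuli that vary along one attribute while holding the rest of the image content fixed — the kind of control an experimenter needs for perception studies.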