Individuals with aphantasia report either absent or dramatically reduced mental imagery compared to control participants. The image of an object or scene produced “in the mind’s eye” lacks detail for these individuals or is simply not there. Line drawings made from memory are a straightforward way to assess the contents of visual imagery for aphantasic individuals relative to controls. Prior analyses of the Aphantasia Drawing Database have revealed specific impairments in visual memory for objects but relatively spared scene accuracy, suggesting that the encoding of visual scenes in aphantasia is more complex than an overall reduction in imagery would suggest. Here, we examined the mid-level image statistics of line drawings from this database to determine how simpler visual feature distributions differ as a function of aphantasia and of reliance on image recall rather than direct observation during image reproduction. We find clear differences across several sets of mid-level properties as a function of aphantasia, which further characterizes the nature of visual encoding in this condition.
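As a concrete illustration of the kind of mid-level statistic that can be computed from a line drawing, the sketch below builds a gradient-weighted histogram of local edge orientations. This is a minimal, hypothetical example assuming NumPy; the function name and the choice of feature are ours, not the feature set actually used in the study.

```python
import numpy as np

def orientation_histogram(img, n_bins=8):
    """One simple mid-level statistic (illustrative, not the study's exact
    feature set): a histogram of local edge orientations, computed from
    image gradients and weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))   # gradients along rows, columns
    mag = np.hypot(gx, gy)                    # gradient magnitude per pixel
    ori = np.mod(np.arctan2(gy, gx), np.pi)   # orientation folded into [0, pi)
    bins = np.linspace(0, np.pi, n_bins + 1)
    hist, _ = np.histogram(ori, bins=bins, weights=mag)
    return hist / max(hist.sum(), 1e-12)      # normalize to a distribution
```

For a drawing dominated by vertical strokes, nearly all of the histogram mass falls in the first bin (horizontal gradients, i.e., vertical edges).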
Pictorial research can rely on computational or human annotations. Computational annotations offer scalability, facilitating so-called distant-viewing studies. Human annotations, on the other hand, provide insight into individual differences and judgments of a subjective nature. In this study, we demonstrate the difference between objective and subjective human annotations in two pictorial research studies: one focusing on Avercamp’s perspective choices and the other on Rembrandt’s compositional choices. In the first experiment, we investigated perspective handling by the Dutch painter Hendrick Avercamp. Using visual annotations of human figures and horizons, we could reconstruct the virtual viewpoint from which Avercamp depicted his landscapes. The results revealed an interesting trend: with increasing age, Avercamp lowered his viewpoint. In the second experiment, we studied the compositional choice that Rembrandt van Rijn made in Syndics of the Drapers’ Guild. Imaging studies have shown that Rembrandt doubted where to place the servant, and we let 100 annotators make the same choice. The subjective data were in line with the evidence from the imaging studies. Aside from having their own merit, the two experiments demonstrate two distinct ways of performing pictorial research: one that concerns the picture alone (objective) and one that concerns the relation between the picture and the viewer (subjective).
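The viewpoint reconstruction in the first experiment rests on a classical projective fact: on a flat ground plane, the horizon crosses every upright figure at the viewer’s eye height. A minimal sketch of that inference from a single annotated figure (our own simplification, assuming a flat ground and a known figure height, not the study’s full annotation pipeline):

```python
def eye_height_from_figure(y_feet, y_head, y_horizon, figure_height_m=1.7):
    """Estimate the painter's eye height above the ground plane from one
    annotated standing figure. Image y-coordinates increase upward here.
    Because the horizon crosses every upright figure at the viewer's eye
    height, the fraction of the figure below the horizon, scaled by the
    figure's real-world height, gives the eye height in metres."""
    fraction = (y_horizon - y_feet) / (y_head - y_feet)
    return fraction * figure_height_m
```

With the feet annotated at image height 0, the head at 100, and the horizon at 50, the estimate is 0.5 × 1.7 m = 0.85 m, a markedly low viewpoint.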
Modern production and distribution workflows have allowed high dynamic range (HDR) imagery to become widespread. HDR has made a positive impact in the creative industry and has improved image quality on consumer devices. Akin to the dynamics of loudness in audio, it is predicted that the increased luminance range allowed by HDR ecosystems could introduce unintended, high-magnitude luminance changes. These changes could occur at program transitions, advertisement insertions, and channel-change operations. In this article, we present findings from a psychophysical experiment conducted to evaluate three components of HDR luminance changes: the magnitude of the change, the direction of the change (darker or brighter), and the adaptation time. The results confirm that all three components exert significant influence. We find that increasing either the magnitude of the luminance change or the adaptation time results in more discomfort at the unintended transition. We also find that discomfort at transitions from brighter to darker stimuli has a non-linear relationship with adaptation time, falling off steeply at very short durations.
Interdisciplinary research in human vision has contributed greatly to the current state of the art in computer vision and machine learning, from low-level topics such as image compression and image quality assessment up to complex neural networks for object recognition. Representations similar to those in the primary visual cortex are frequently employed, e.g., linear filters in image compression and in deep neural networks. Here, we first review particular nonlinear visual representations that can be used both to better understand human vision and to provide efficient representations for computer vision, including deep neural networks. We then focus on i2D representations, which are related to end-stopped neurons. The resulting E-nets are deep convolutional networks that outperform some state-of-the-art deep networks. Finally, we show that the performance of E-nets can be further improved by using genetic algorithms to optimize the architecture of the network.
The psychogenesis of visual awareness is an autonomous process in the sense that you do not “do” it. However, you have some control over it through your acting in the world. We share this process with many animals; pictorial awareness, by contrast, appears to be uniquely human. Here situational awareness splits into an “everyday vision” mode and a “pictorial” mode. We focus mainly on the spatial aspects of pictorial art. You have no control whatsoever over the picture’s structure: pictorial awareness is pure imagery, constrained by the (physical) structure of the picture. Crafting pictures and beholding pictures are distinct, but closely related, acts. We present an account from experimental and formal phenomenology. It results in a generic model that accounts for the bulk of formal (rare) and informal (common) observations.
Experiencing art calls for a unique processing mode – this premise has been debated repeatedly over the last 300 years. Despite that, we still lack a theoretical and empirical basis for understanding this mode, which is essential to understanding the experience of art. We begin this position paper by reviewing the literature related to this mode, revealing a wide diversity of hardly commensurable theoretical approaches. This may be an important reason for the thin empirical data on this theme, especially when looking for ecologically valid experimental studies. We propose the Mode of Art eXperience (MAX) concept to establish a coherent theoretical framework. We argue that even well-established works often overlook the essence of the more profound and, so to speak, “true” art experience. We discuss MAX in relation to evolutionary psychology, art history, and other cognitive modes (play, religion, and the Everyday). We also propose that MAX is not the only extraordinary mode of processing information in a specific way, but that for experiencing art we evidently need a frame that enables MAX to unfold the full range of art-related phenomena that make art so culturally particular and essential for humankind.
Light-permeable materials are usually characterized by the perceptual attributes of transparency, translucency, and opacity. Technical definitions and standards leave room for subjective interpretation of how these perceptual attributes relate to optical properties and to one another, which causes miscommunication in industry and academia alike. A recent work hypothesized that a Gaussian function, or a similar bell-shaped curve, describes the relationship between translucency on the one hand and transparency and opacity on the other. Another work proposed a translucency classification system for computer graphics, in which transparency, translucency, and opacity are modulated by three optical properties: subsurface scattering, subsurface absorption, and surface roughness. In this work, we conducted two psychophysical experiments to scale the magnitude of transparency and translucency of different light-permeable materials, both to test the hypothesis that a Gaussian function can model the relationship between transparency and translucency and to assess how well the aforementioned classification system describes the relationship between optical and perceptual properties. We found that the results vary significantly across shapes. While a bell-shaped relationship between transparency and translucency was observed for spherical objects, it did not generalize to a more complex shape. Furthermore, how optical properties modulate transparency and translucency also depends on object shape. We conclude that these cross-shape differences are rooted in the different image cues generated by different object scales and surface geometries.
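To make the hypothesized shape of these relationships concrete, the fragment below generates the qualitative pattern the Gaussian hypothesis predicts: transparency falling monotonically with subsurface scattering, while translucency peaks at an intermediate scattering level. The curves and parameter values are illustrative stand-ins (assuming NumPy), not the study’s data or fitted parameters.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Bell-shaped translucency response as a function of log scattering."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Hypothetical pattern implied by the Gaussian hypothesis (not measured data):
# transparency is high when little light is scattered and falls off as
# scattering increases, while translucency is maximal at mid-range scattering
# (too little scattering reads as transparent, too much as opaque).
log_scatter = np.linspace(-3, 3, 61)                      # log subsurface scattering
transparency = 1.0 / (1.0 + np.exp(2.0 * log_scatter))    # monotonically decreasing
translucency = gaussian(log_scatter, mu=0.0, sigma=1.0)   # peaks at mid-range
```

The cross-shape result reported above amounts to saying that, for non-spherical objects, the translucency curve no longer follows this bell shape.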
Both natural scene statistics and ground surfaces have been shown to play important roles in visual perception, in particular in the perception of distance. Yet there have been surprisingly few studies of the natural statistics of distances to the ground, and those that exist used a loose definition of ground. Additionally, perception studies investigating the role of the ground surface typically use artificial scenes containing perfectly flat ground surfaces with relatively few non-ground objects present, whereas ground surfaces in natural scenes are typically non-planar and are occluded by a large number of non-ground objects. Our study investigates the distance statistics of many natural scenes across three datasets, with the goal of analyzing the ground surface and non-ground objects separately. We used a recent filtering method to partition LiDAR-acquired 3D point clouds into ground points and non-ground points. We then examined how the distributions of distances depend on viewing elevation angle and simulated viewing height. We found, first, that the distance distribution of ground points shares some similarities with that of a perfectly flat plane, namely a sharp peak at a near distance that depends on viewing height, but also some differences. Second, the distribution of non-ground points is flatter and did not vary with viewing height. Third, the proportion of non-ground points increases with viewing elevation angle. Our findings provide further insight into the statistical information available for distance perception in natural scenes, and suggest that studies of distance perception should consider a broader range of ground surfaces and object distributions than used previously, in order to better reflect the statistics of natural scenes.
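The analysis pipeline can be sketched in miniature: partition a point cloud into ground and non-ground points, then compute distances from a simulated viewpoint. The height-threshold filter below is a crude stand-in for the LiDAR ground-filtering method actually used in the study, and the simulated scene is entirely hypothetical; the sketch assumes NumPy.

```python
import numpy as np

def simulate_scene(n_ground=5000, n_obj=1000, extent=50.0, seed=0):
    """A toy scene: a flat ground plane (z = 0) plus scattered 'object'
    points raised above it. Units are metres."""
    rng = np.random.default_rng(seed)
    ground = np.column_stack([rng.uniform(-extent, extent, n_ground),
                              rng.uniform(-extent, extent, n_ground),
                              np.zeros(n_ground)])
    objects = np.column_stack([rng.uniform(-extent, extent, n_obj),
                               rng.uniform(-extent, extent, n_obj),
                               rng.uniform(0.5, 3.0, n_obj)])  # 0.5-3 m tall
    return np.vstack([ground, objects])

def split_ground(points, z_thresh=0.2):
    """Crude ground filter: points near z = 0 count as ground. A stand-in
    for the more sophisticated LiDAR filtering method used in the study."""
    is_ground = points[:, 2] < z_thresh
    return points[is_ground], points[~is_ground]

def distances_from(points, viewing_height):
    """Euclidean distances from an observer at (0, 0, viewing_height)."""
    eye = np.array([0.0, 0.0, viewing_height])
    return np.linalg.norm(points - eye, axis=1)
```

For a flat ground and an eye height of 1.6 m, no ground point can be closer than 1.6 m, which is where the sharp near-distance peak of the ground distribution sits; distances to the raised object points have no such height-locked floor.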
The investigation of aesthetics has primarily been conducted within the visual domain. This is not surprising, as aesthetics has largely been associated with the perception and appreciation of visual media such as traditional artworks, photography, and architecture. However, one does not need to look far to realize that aesthetics extends beyond the visual domain. Media such as film and music offer a unique and equally rich temporally changing visual and auditory experience. Product design, ranging from furniture to clothing, depends strongly on pleasant tactile evaluations. Studies of the perception of 1/f statistics in vision have been particularly consistent in demonstrating a preference for a 1/f structure resembling that of natural scenes, as well as systematic individual differences across a variety of visual objects. Interestingly, comparable findings have been obtained in the auditory and tactile domains. In this review, we discuss some of the current literature on the perception of 1/f statistics across different sensory modalities.
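A 1/f structure refers to an amplitude spectrum that falls off inversely with spatial frequency, as natural scenes do on average. The sketch below (assuming NumPy; the function names and synthesis procedure are ours, for illustration) generates such an image and recovers its spectral slope from the radially averaged amplitude spectrum:

```python
import numpy as np

def make_1f_noise(n=128, alpha=1.0, seed=0):
    """Synthesize an n x n image whose amplitude spectrum falls as 1/f^alpha,
    by pairing a 1/f^alpha magnitude with random phases."""
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n)
    fy = np.fft.fftfreq(n)
    f = np.sqrt(fx[None, :] ** 2 + fy[:, None] ** 2)
    f[0, 0] = 1.0  # avoid division by zero at the DC component
    phase = rng.uniform(0, 2 * np.pi, (n, n))
    spectrum = (1.0 / f ** alpha) * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum))

def spectral_slope(img):
    """Estimate the log-log slope of the radially averaged amplitude
    spectrum; a value near -1 indicates 1/f structure."""
    n = img.shape[0]
    amp = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    y, x = np.indices(img.shape)
    r = np.sqrt((x - n // 2) ** 2 + (y - n // 2) ** 2).astype(int)
    radial = np.bincount(r.ravel(), weights=amp.ravel()) / np.bincount(r.ravel())
    freqs = np.arange(1, n // 2)  # skip DC, stay below Nyquist
    slope, _ = np.polyfit(np.log(freqs), np.log(radial[1:n // 2]), 1)
    return slope
```

For alpha = 1, the recovered slope should come out close to −1, the value around which the preference findings discussed above cluster.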