Regular Articles
Volume: 3 | Article ID: jpi0119
Neurocomputational Lightness Model Explains the Appearance of Real Surfaces Viewed Under Gelb Illumination
DOI: 10.2352/J.Percept.Imaging.2020.3.1.010502 Published Online: January 2020
Abstract

One of the primary functions of visual perception is to represent, estimate, and evaluate the properties of material surfaces in the visual environment. One such property is surface color, which can convey important information about ecologically relevant object characteristics such as the ripeness of fruit and the emotional reactions of humans in social interactions. This paper further develops and applies a neural model (Rudd, 2013, 2017) of how the human visual system represents the light/dark dimension of color—known as lightness—and computes the colors of achromatic material surfaces in real-world spatial contexts. Quantitative lightness judgments conducted with real surfaces viewed under Gelb (i.e., spotlight) illumination are analyzed and simulated using the model. According to the model, luminance ratios form the inputs to ON- and OFF-cells, which encode local luminance increments and decrements, respectively. The response properties of these cells are here characterized by physiologically motivated equations in which different parameters are assumed for the two cell types. Under non-saturating conditions, ON-cells respond in proportion to a compressive power law of the local incremental luminance in the image that causes them to respond, while OFF-cells respond linearly to local decremental luminance. ON- and OFF-cell responses to edges are log-transformed at a later stage of neural processing and then integrated across space to compute lightness via an edge integration process that can be viewed as a neurally elaborated version of Land’s retinex model (Land & McCann, 1971). 
It follows from the model assumptions that the perceptual weights (interpreted as neural gain factors) that the model observer applies to steps in log luminance at edges in the edge integration process are determined by the product of two factors: (1) a polarity-dependent factor, by which incremental steps in log luminance (i.e., edges) are weighted by a value <1.0 and decremental steps are weighted by 1.0, and (2) a distance-dependent factor, whose edge weightings are estimated to fit perceptual data. The model accounts quantitatively (to within experimental error) for the following: lightness constancy failures observed when the illumination level on a simultaneous contrast display is changed (Zavagno, Daneyko, & Liu, 2018); the degree of dynamic range compression in the staircase-Gelb paradigm (Cataliotti & Gilchrist, 1995; Zavagno, Annan, & Caputo, 2004); partial releases from compression that occur when the staircase-Gelb papers are reordered (Zavagno, Annan, & Caputo, 2004); and the larger compression release that occurs when the display is surrounded by a white border (Gilchrist & Cataliotti, 1994).

  Cite this article 

Michael E. Rudd, "Neurocomputational Lightness Model Explains the Appearance of Real Surfaces Viewed Under Gelb Illumination," in Journal of Perceptual Imaging, 2020, pp 010502-1 - 010502-16, https://doi.org/10.2352/J.Percept.Imaging.2020.3.1.010502

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2020
  Article timeline 
  • received March 2019
  • accepted May 2020
  • Published January 2020

Journal of Perceptual Imaging
J. Percept. Imaging
2575-8144
Society for Imaging Science and Technology
1. Introduction and overview
In the natural world, one of the key functions of visual perception is to estimate properties of material surfaces. Surface color is one such property, which in turn conveys important information about many other ecologically relevant properties of both objects and agents such as the ripeness of fruit and the emotional reactions of humans in social interactions. For this reason, and because of the obvious importance of color in technology, attempts to quantify and model human color perception have a longstanding history in the visual perception literature.
Some of the key recent work on this problem has focused on the simplified—and dimensionally reduced—problem of achromatic color perception; that is, the perception of gray levels of achromatic surfaces, also known as “lightness” perception. In the domain of lightness, a host of competing models have been proposed to account for a rapidly growing range of quantifiable visual phenomena. The pros and cons of some of these models will be discussed below.
One of the properties of lightness perception that makes it interesting to study is that the perceived gray level of an achromatic surface can be strongly affected by the spatial layout of other surfaces in which the target surface is embedded. This gives rise to what are often called visual “illusions”: situations in which the apparent “color” of a surface is strongly affected by the context in which it is viewed. Although theorists disagree substantially about the correct explanation for these spatial context effects, a common starting point of many of the proposed explanations is luminance contrast. That is, the perceived lightness of a surface is thought to depend on a spatial comparison of the intensity of the light reflected from the surface with the intensity of the light reflected from other nearby surfaces. Luminance contrast also forms the starting point for the model described here.
Perhaps the simplest and most well-known example of the influence of luminance contrast on lightness is the phenomenon known as simultaneous contrast, in which a gray paper presented against a dark background appears to be a lighter shade of gray than a physically identical paper presented against a white background. Figure 1(a) illustrates a display that is only slightly more complex than the standard simultaneous lightness contrast (SLC) display but is nevertheless able to test competing theories of how luminance contrast is utilized by the visual system to “compute” lightness. The disks and the annuli on the two sides of this figure are identical in both size and luminance, but the disk and the annulus on the left appear lighter than the disk and annulus on the right. The effect is due to luminance contrast with respect to the background field, which is darker on the left than on the right. A simple model of simultaneous contrast would predict that whereas the left annulus may look lighter than its surround, the left disk would look darker than the surround due to perceptual contrast with its lighter surround [24]. Instead, both the left annulus and the disk appear lighter, demonstrating a type of assimilation. Indeed, it appears as though the human visual system adds up the steps in luminance along a path from the background field to compute each disk’s lightness.
Figure 1.
(a) Perceptual demonstration of edge integration in lightness. The two disks and the two annuli are identical, but the disk and the annulus on the left both appear lighter than the disk and the annulus on the right because the background luminance is lower on that side. The lightness of each disk can be modeled as a weighted sum of the steps in log luminance at the inner and outer borders of the annulus surrounding the disk (see Eq. (1)). (b) Diagram of the disk/annulus stimulus used by Rudd and Zemach [44] to test the edge integration model, with the luminances of the disk, annulus, and background field labeled D, A, and B. The symbols ξDA and ξAB denote the locations of the edges (i.e., the luminance steps) between the regions D and A and between A and B, respectively. Note that this stimulus is similar to the disk/annulus configuration on the left side of panel (a), but with a homogeneous dark background rather than a gradient background.
In previous work, Zemach and I experimentally investigated and mathematically modeled this perceptual edge integration phenomenon using disk–annulus stimuli in which the disks were either luminance decrements [44] or luminance increments [43] with respect to their surrounding annuli. In these experiments—unlike in Fig. 1(a)—the disk–annulus pairs were always presented against a homogeneous dark background. For both incremental and decremental disks, the results were well described by a model in which the disk lightness was determined by a weighted sum of the steps in luminance (measured in log units) from the background to the annulus and from the annulus to the disk, with the edge weights decreasing as a function of the distance of the edge from the disk. This idea is expressed mathematically by the equation
$$\Phi_D = w_{DA}(\log D - \log A) + w_{AB}(\log A - \log B), \tag{1}$$
where ΦD is the disk lightness; D, A, and B are the luminances of the disk, annulus, and background field; and wDA and wAB are weights given to the disk/annulus and annulus/background edges ξDA and ξAB, respectively, in the process of computing the disk lightness. Fig. 1(b) illustrates how these concepts can be applied to model the lightness of the disk on the left side of Fig. 1(a). Note that Eq. (1) can be viewed as a modified version of the retinex algorithm of [29], in which the assumption that equal weights are applied to all steps in log luminance across the image has been replaced with the assumption that the weights applied to the disk/annulus and annulus/background edges may be different.
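As a minimal sketch of Eq. (1), the weighted sum of log-luminance steps can be written in a few lines of Python. The function name, the use of base-10 logarithms, and the luminance and weight values are assumptions chosen only for illustration; they are not taken from the experiments.

```python
import math

def disk_lightness(D, A, B, w_DA, w_AB):
    """Eq. (1): weighted sum of log-luminance steps along the path B -> A -> D."""
    step_DA = math.log10(D) - math.log10(A)  # step at the disk/annulus edge
    step_AB = math.log10(A) - math.log10(B)  # step at the annulus/background edge
    return w_DA * step_DA + w_AB * step_AB

# Hypothetical luminances (arbitrary units) and weights, for illustration only.
phi_equal = disk_lightness(D=10.0, A=100.0, B=1.0, w_DA=1.0, w_AB=1.0)
phi_weighted = disk_lightness(D=10.0, A=100.0, B=1.0, w_DA=1.0, w_AB=0.5)
```

With equal unit weights the intermediate annulus term cancels and the sum reduces to log(D/B), which is the retinex-style case. Lowering the weight of the remote edge changes the result, so the annulus luminance then matters.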
A conceptually similar model was proposed earlier by [35, 46], who posited that lightness depends on a sum of weighted steps in Michelson contrast in which the size of the edge weights decays spatially as a function of distance from the target. Rudd and Zemach [44] verified this assumption in their experiments but showed that their model based on sums of steps in log luminance gave a better account of psychophysical data. However, the most important innovation of the work by Rudd and Zemach was to model lightness matches obtained with incremental and decremental disk/annulus stimuli separately. By doing this, they established that the weights applied to edges in the lightness computation differed not only as a function of distance from the target but also for incremental and decremental stimuli.
To account for these findings and other data from the literature regarding lightness–darkness asymmetries (e.g. [5, 13, 17, 21, 22, 50, 51]), Rudd [38] proposed a neural model based on an elaborated version of the edge integration model. One implication of this neural theory of lightness computation is that the weights associated with edges in the perceptual edge summation are determined by two independent factors: the distance of the edge from the target region whose lightness is being computed and the edge contrast polarity. That is,
$$w_i = \omega(d_i) \times n(\rho_i), \tag{2}$$
where wi is the weight associated with an edge i in computing the target lightness, ω(di) is a function that depends only on the distance of edge i from the target and whose value decreases monotonically with di, and n(ρi) is a factor that depends on the contrast polarity ρi of edge i. In the context of the disk/annulus display illustrated in Fig. 1(b), the neural model implies that the weights applied to edges in the process of computing the disk lightness are determined by the distance between ξDA and ξAB and the contrast polarities of ξDA and ξAB.
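The factorization in Eq. (2) can be sketched as follows. The exponential form of `omega` and its `space_constant` parameter are purely hypothetical, since the text only requires that ω decrease monotonically with distance. The polarity values 0.27 and 1.0 are the ON- and OFF-cell exponents that the paper adopts later, in Section 1.1.

```python
import math

# Polarity-dependent factors: values assumed later in the paper (Section 1.1).
N_ON, N_OFF = 0.27, 1.0

def omega(d, space_constant=1.0):
    """Hypothetical distance falloff; any monotonically decreasing function would do."""
    return math.exp(-d / space_constant)

def edge_weight(d, polarity):
    """Eq. (2): w_i = omega(d_i) * n(rho_i), with polarity 'increment' or 'decrement'."""
    n = N_ON if polarity == "increment" else N_OFF
    return omega(d) * n
```

Because the two factors are separable, the ratio of the incremental to the decremental weight at any fixed distance equals N_ON / N_OFF, whatever form ω takes.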
Rudd [38, 41] further argued that (1) the contrast-polarity-dependent factor n(ρi) arises at a neural level from the differing response properties of ON- and OFF-cells, (2) these differing response properties of ON- and OFF-cells provide a general explanation of well-known asymmetries in the magnitudes of lightness and darkness induction, and (3) the edge integration theory developed to explain lightness perception in the context of disk–annulus stimuli can be extended to account for lightness computation in a wide range of visual stimuli.
The current paper takes a significant step in this last direction by showing how the neural edge integration model can be applied to explain lightness judgments made with real-world illuminated material displays. In particular, here I use the neural model to account for recent results in which a failure of lightness constancy was observed when the level of illumination of a simultaneous contrast display was changed [53], and also to account for quantitative Munsell matches made to papers arranged in staircase-Gelb and scrambled-Gelb formations [6, 13, 14, 17, 52]. In these perceptual studies, a display consisting of a small number of surfaces was viewed in a spotlight within an otherwise dimly lit room. Experiments of this type, which Zavagno, Daneyko, and Liu refer to as “Gelb illumination” experiments, have been postulated to reveal the rules governing lightness computation within isolated frameworks of illumination [13, 17]. Application of the neurocomputational model to data from such experiments allows for direct comparisons of the performance of the neural model with that of other theoretical lightness models that have been proposed to account for these and similar data.
1.1 Proposed Neural Origin of the Contrast-Polarity Edge Weighting Factor n(ρi)
Before applying the model to the explanation of these experiments, I first elaborate in this section the part of the neural model that explains how ON- and OFF-cell responses relate to lightness perception in human observers. It is well known that incremental and decremental steps in image luminance are encoded by separate populations of ON-center and OFF-center neurons in the primate early visual system (e.g., retina, lateral geniculate nucleus (LGN), and early visual cortex). Both cell types possess circularly-symmetric center–surround receptive fields. ON-cells respond when the neurally weighted amount of light falling within their receptive field centers exceeds the weighted amount of light falling within their spatial surrounds. OFF-cells respond when the neurally weighted amount of light falling within their receptive field surround exceeds the weighted amount of light falling within their receptive field centers [1, 23, 27].
A physiological model that is often used to model the response of a generic neuron to its input is the Naka–Rushton function [34], which is defined by the equation
$$R(\lambda) = R_{\max} \frac{\lambda^{n}}{\lambda^{n} + \sigma^{n}}, \tag{3}$$
where R(λ) is the neural response to an input λ, Rmax is the value of the neural response to the saturating input, σ is the input level that produces a half-saturating neural response, and n is the exponent of a power law that relates R(λ) to λ for small values of λ.
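The roles of the three parameters in Eq. (3) can be checked numerically with a short sketch. The function name and default parameter values are illustrative assumptions, not values fitted in the paper.

```python
def naka_rushton(lam, r_max=1.0, sigma=1.0, n=1.0):
    """Eq. (3): R = r_max * lam**n / (lam**n + sigma**n) for a non-negative input lam."""
    if lam <= 0.0:
        return 0.0  # no input, no response
    return r_max * lam**n / (lam**n + sigma**n)

# At lam == sigma the response is half of r_max, for any exponent n.
half = naka_rushton(2.0, r_max=10.0, sigma=2.0, n=0.27)
# For inputs far above sigma the response approaches r_max (saturation).
saturated = naka_rushton(1e12, r_max=10.0, sigma=2.0, n=1.0)
```

For inputs far below σ the denominator is dominated by σ^n, so the response is approximately Rmax(λ/σ)^n, which is the power-law regime referred to in the text.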
To model the responses of ON- and OFF-cells, I will assume here that the neural input λ in Eq. (3) is different for each of these two cell types and expressed by the equations
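$$\lambda_{\mathrm{on}} = \begin{cases} \dfrac{\gamma (I_C + I_d)}{I_S + I_d}, & \gamma (I_C + I_d) \ge I_S + I_d \\ 0, & \gamma (I_C + I_d) < I_S + I_d \end{cases} \tag{4a}$$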
and
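$$\lambda_{\mathrm{off}} = \begin{cases} \dfrac{I_S + I_d}{\gamma (I_C + I_d)}, & I_S + I_d \ge \gamma (I_C + I_d) \\ 0, & I_S + I_d < \gamma (I_C + I_d), \end{cases} \tag{4b}$$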
where IC and IS are the intensities of the light falling on the receptive field centers and surrounds, Id is spontaneous neural activity that is indistinguishable from effects of retinal photoisomerizations (i.e., “dark noise”), and γ models the weight given to the receptive field center relative to the surround.
It follows that for light intensities well above the dark noise level, the model ON-cell input is proportional to the luminance ratio IC/IS and the model OFF-cell input is proportional to the luminance ratio IS/IC. The exponent n in Eq. (3) that transforms these inputs according to a power law might be a property either of the neuron itself or of neural circuitry along the visual pathway from the photon absorptions to the LGN. The model is agnostic with respect to the physiological origin of n, which need not be a property of the neuron itself. In what follows, we will assume that n differs for ON- and OFF-cells and argue that this difference has important implications for perceptual asymmetries between lightness and darkness [38, 41]. This assumption of different exponents for ON- and OFF-cells is a key assumption of the neural lightness model.
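The rectified ratio inputs of Eqs. (4a) and (4b) can be sketched as below. The function names and the default values `i_d=0.0` and `gamma=1.0` are assumptions for illustration, and the sketch presumes strictly positive intensities so that the denominators are nonzero.

```python
def on_input(i_c, i_s, i_d=0.0, gamma=1.0):
    """Eq. (4a): center/surround ratio, nonzero only when the weighted center wins."""
    num = gamma * (i_c + i_d)
    den = i_s + i_d
    return num / den if num >= den else 0.0

def off_input(i_c, i_s, i_d=0.0, gamma=1.0):
    """Eq. (4b): surround/center ratio, nonzero only when the surround wins."""
    num = i_s + i_d
    den = gamma * (i_c + i_d)
    return num / den if num >= den else 0.0

# An increment (center brighter than surround) drives only the ON input.
increment = (on_input(200.0, 100.0), off_input(200.0, 100.0))
# A decrement drives only the OFF input.
decrement = (on_input(50.0, 100.0), off_input(50.0, 100.0))
```

When the dark noise term is small relative to the light intensities, the nonzero input is close to the plain luminance ratio, as stated in the text.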
An implication of the system of Eqs. (3), (4a), and (4b) is that the half-saturation constant σ corresponds to the contrast ratio at which the neural response equals Rmax/2 (assuming that the dark noise contribution Id is negligible). For sufficiently large input luminance ratios, the neural response given by Eq. (3) saturates at the value Rmax. For sufficiently small input ratios (small relative to σ), the neural response is related to the input luminance ratio by a power law that is characterized by the exponent n. In the latter case, we say that the cell is operating in its optimal encoding range. For the purpose of modeling the perceptual results discussed in this paper, I will assume that the ON- and OFF-cells were always operating in their optimal encoding range in the experiments modeled and thus that these cells performed a power law transformation of their luminance ratio inputs. However, the full model expressed by Eq. (3) suggests that dark noise plays a significant role in lightness perception when display luminances are sufficiently low and that neural saturation may come into play when the dynamic range of the display is increased beyond the levels encountered in the experiments modeled here.
To apply this model of ON- and OFF-cell responses to lightness perception, we need to make some more specific assumptions about the values of the ON- and OFF-cell exponents non and noff. Billock [2] fit the spiking response of an individual ON-cell in the lateral geniculate nucleus of macaque monkeys recorded by De Valois, Abramov, and Jacobs [9] with a power law regression model and estimated non = 0.27 (see Figure 2). His result indicates that the ON-cell response is highly compressive even in its optimal operating range. In what follows, I will begin by assuming this value for the ON-cell exponent and I will also assume that the exponent for OFF-cells is noff = 1. In other words, in the optimal operating range, the ON-cell response will be assumed to be proportional to (IC/IS)^0.27 and the OFF-cell response to be proportional to IS/IC.
Figure 2.
The quantitative mapping between luminance input and the spike-rate response of an ON-cell recorded by De Valois, Abramov, and Jacobs [9] is a power law with an exponent of 0.27 (r² = 0.85) over the range of inputs tested. On the log–log plot of R(I) versus I, the model equation R(I) = κI^n takes the form of a straight line with intercept log κ and slope n (see the arrow). The value of the ON-cell exponent non = 0.27 is the slope of the least-squares linear model of the neural response. The spike-rate data were taken from Table 10 of [36] and the luminance efficiency data were taken from http://www.cvrl.org (figure and caption adapted from [2]).
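The log–log regression described in the caption can be illustrated with a small, dependency-free sketch. The data below are synthetic, generated from an exact power law with κ = 5 and n = 0.27. They are not the De Valois, Abramov, and Jacobs recordings, and the function name is an assumption.

```python
import math

def fit_power_law(intensities, responses):
    """Least-squares line through (log10 I, log10 R); returns (kappa, n) for R = kappa * I**n."""
    xs = [math.log10(i) for i in intensities]
    ys = [math.log10(r) for r in responses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line is the exponent n
    intercept = mean_y - slope * mean_x    # intercept is log10(kappa)
    return 10.0 ** intercept, slope

# Synthetic, noise-free data: the fit should recover the generating parameters.
synthetic_I = [1.0, 10.0, 100.0, 1000.0]
synthetic_R = [5.0 * i ** 0.27 for i in synthetic_I]
kappa_hat, n_hat = fit_power_law(synthetic_I, synthetic_R)
```

With real spike-rate data the points scatter around the line, and the quality of the power-law description is summarized by r², as in the caption.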
1.2 Assumed Logarithmic Transformation of ON- and OFF-Cell Responses
ON- and OFF-cells will both respond to the presence of an edge when it is in proper spatial alignment with their neural receptive fields. ON-cells will respond on one side of the edge and OFF-cells will respond on the other side. The lightness model assumes that the cortical mechanism that computes the lightness of a target patch, such as the disk in a disk-annulus display, takes as its input only the responses of ON- and OFF-cells that are located on the side that points in the direction of the target patch. The outputs of ON- and OFF-cells that respond to the other side of each edge are somehow filtered out of the lightness computation, even when the edge participates in the neural edge integration process. This should make it clear that the lightness model proposed here is not strictly a low-level physiological model. This idea is illustrated and further elaborated in Figure 3, which shows how the computational model developed earlier for disk/annulus displays can be generalized to account for the lightness of a target paper in a staircase-Gelb display, a display that is discussed in more detail below. For a more complete description of the processing stages assumed by the neural model, see [39, Fig. 1] and the full description of the model presented in [41].
Figure 3.
This figure shows how the edge integration model developed to explain lightness computation for disk/annulus displays can be generalized to compute the lightness of a target paper in a staircase-Gelb display. It also clarifies how the model differs from low-level filtering accounts of lightness. ON- and OFF-cells in the lateral geniculate nucleus will respond on opposite sides of each edge in this display. However, only those cells that respond on the side of an edge that points toward the target (illustrated) are assumed by the model to make a contribution to the edge integration computation. Edge integration occurs along paths that are directed from the background toward the target location, as indicated by the red arrows [39]. This could be accomplished by a simple neural summation across large receptive fields [37, 38, 41]. In the full neurocomputational model [41], edge integration is assumed to occur beyond area V4 in the ventral stream of the visual cortex, at a processing stage later than the stage at which midlevel processes such as boundary completion and border ownership are known to occur (in area V2). These neural image segmentation processes are expected to contribute to the final percept in appropriate contexts by further modifying the gains of the edge-encoding cortical neurons whose outputs are spatially integrated to compute surface lightness (see [41] for further details). However, these additional assumptions are not required to explain the data modeled in the present paper and therefore will be neglected in what follows.
The spatial summation of edge responses that occurs in the process of neural edge integration is assumed to take place in logarithmic coordinates. In other words, it is the quantities non[log IC − log IS]+ and −noff[log IS − log IC]+ (where the mathematical operator [ ]+ models half-wave rectification) that are summed in neural edge integration, not the raw ON- and OFF-cell responses modeled by the system of Eqs. (3), (4a), and (4b). This implies that the neural response to an edge at the level of the edge integration computation will depend on the step in luminance at the edge, measured in log units, as required by the Rudd–Zemach edge integration model for disk–annulus stimuli (Eq. (1)). Another implication is that the neural weight given to an edge in the edge integration process will be proportional to the exponent n of the ON- or OFF-cells that mediate the edge response. As a consequence of the logarithmic transformation of the ON- and OFF-cell responses, the exponent n that characterizes the power law response of these ON- and OFF-cells to luminance ratios is converted to a neural gain factor [41]. Thus, given the choice of parameters non = 0.27 and noff = 1, the gain applied to an incremental edge will be only 0.27 times as large as the gain applied to a decremental edge. This feature of the model, namely that logarithmic transformation of the ON- and OFF-cell responses transforms power law exponents into neural gain factors, will be shown in what follows to be critical to the model’s success in fitting quantitative psychophysical data.
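The conversion of exponents into gain factors can be made concrete with a brief sketch of the two rectified log-domain quantities. The function names and the base-10 logarithm are assumptions; the exponents 0.27 and 1 are the values adopted in the text.

```python
import math

N_ON, N_OFF = 0.27, 1.0  # exponents assumed in the text

def rectify(x):
    """Half-wave rectification: the [ ]+ operator."""
    return max(x, 0.0)

def edge_response(i_c, i_s):
    """Signed log-domain response to an edge: n_on*[log Ic - log Is]+ - n_off*[log Is - log Ic]+."""
    step = math.log10(i_c) - math.log10(i_s)
    return N_ON * rectify(step) - N_OFF * rectify(-step)

# A one-log-unit increment and a one-log-unit decrement of equal physical size.
inc = edge_response(100.0, 10.0)
dec = edge_response(10.0, 100.0)
```

For steps of equal magnitude, the incremental response is 0.27 times as large as the decremental response, which is the gain asymmetry described above.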
1.3 A Simple Example to Illustrate the Behavior of the Model
To help the reader understand how the model works, in this section I briefly review a previously published experiment in which subjects matched the lightness of two square patches, each surrounded by a frame and presented simultaneously on a computer monitor [38]. The squares were equal in width (1.06°) and both were luminance decrements with respect to their surrounding frames, but the frame on the right (target) was narrower (0.19°) than the frame on the left (1.78°). The subject’s task was to adjust the luminance of the left square to match the two squares in lightness. This task was performed at 12 levels of the background luminance, ranging from a background luminance that was well below the luminance of either square to a value that was well above the common luminance of the frames.
According to the neural edge integration model, the lightness of each square is computed from the sum of weighted steps in log luminance at the inner and outer edges of the frames. Each edge weight is determined by the product of two separable functions: the function ω(d) that depends only on the distance of the edge from the square and the contrast-polarity-dependent factor non or noff, which characterizes the neural gain applied to the edge.
The neural model asserts that the step in log luminance at the outer edge of each frame contributes to the lightness of the square that is surrounded by that frame, so we expect that changing the background level will influence the lightness of each square. However, if the influence of the background were the same for the target and match stimuli, any effect of changing the background would cancel out in the process of generating a lightness match. In the experiment, the influence of the outer frame edge was expected to be greater on the target side of the display because that edge was closer to the square on the target side than it was on the match side, and the model asserts that the weights given to edges in the lightness computation tend to decrease with distance. Thus, changing the background luminance was predicted to affect the appearance of the target square more than it did the appearance of the match square.
The first model prediction tested was that changing the background should have any effect at all. A more stringent model prediction that was also tested was that the effect of changing the background should depend on the contrast polarity of the outer frame edge. Specifically, when the background luminance was smaller than the common luminance of the frames surrounding the target squares, it was predicted that the magnitude of the induction effect from the background should be only |non|/|noff| as large as when the background luminance is greater than the frame luminance. Given the values of non and noff assumed here, the ratio of the background induction strengths measured under the two conditions should therefore equal 0.27.
The actual lightness matches made by the two experimental subjects are plotted in Figure 4. The equations on the plots are the least-squares linear regression models of the matches made by each of the two experimental subjects, with separate regression models fit to the data corresponding to the background luminance ranges B < F and B > F. The ratios of the slopes of the least-squares linear models associated with these two luminance ranges were 0.21 and 0.31 for the two observers, and the average slope ratio was 0.26. Therefore, the average data conform closely to the model prediction.
Figure 4.
Dependence of the lightness of a target square surrounded by a higher luminance frame on the luminance of the remote background field (from Rudd [38]). The neural edge integration model simulated here predicts that the slope of the matching plot should be about 0.27 times as large when the luminance of the background field is smaller than the luminance of the surround frame than when it is larger. This prediction is confirmed by the average data from the two observers. In the case of Observer AH, the ratio of the slope when the background luminance is less than the frame luminance to the slope when the background luminance is greater than the frame luminance is (−0.0642/−0.2059) = 0.3118. In the case of Observer JA, the ratio of the slope when the background luminance is less than the frame luminance to the slope when the background luminance is greater than the frame luminance is (−0.0608/−0.2861) = 0.2125. The average of the two slope ratios is (0.3118 + 0.2125)/2 = 0.2622.
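The slope-ratio arithmetic in the caption can be reproduced with a short sketch. The four regression slopes are the values quoted in the caption; the variable names are assumptions.

```python
# Regression slopes quoted in the Figure 4 caption: (slope for B < F, slope for B > F).
slopes = {
    "AH": (-0.0642, -0.2059),
    "JA": (-0.0608, -0.2861),
}

# Ratio of the induction strength below the frame luminance to that above it.
ratios = {observer: low / high for observer, (low, high) in slopes.items()}
mean_ratio = sum(ratios.values()) / len(ratios)

predicted_ratio = 0.27 / 1.0  # |n_on| / |n_off| under the model assumptions
difference = abs(mean_ratio - predicted_ratio)
```

The individual ratios fall on either side of the predicted value, and their mean is about 0.26, within 0.01 of the prediction.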
It should be emphasized that the change in the magnitude of the background change effect occurred not at the background luminance at which the contrast between each square and the background changed sign but instead at the background luminance at which the outer frame edge switched from being an increment to a decrement, as predicted by the model. Thus, the experimental results strongly support a model of the square lightness based on the hypothesis that the outer frame edge sums with the square edge to determine the square lightness, as opposed, say, to an alternative model in which lightness is determined by a long-range comparison of the target luminance to the background luminance. One influential model of lightness perception that is therefore ruled out by these results is Gilchrist’s lightness anchoring theory, which asserts that the lightness of any given region in the image is computed by a direct comparison of that region’s luminance to the highest luminance in the region’s framework of illumination [13, 15–17]. The fact that the background luminance influenced the target lightness in this experiment even when it was the lowest luminance in the display also refutes the idea that lightness is determined only by a comparison with the highest luminance in the display.
It should be noted that Gilchrist has also incorporated a second anchoring rule, called the “area rule,” into his theory, which posits that changes in the size, but not the luminance, of regions in the target’s surround having a luminance lower than that of the target can produce changes in the target appearance, most notably by making the target appear self-luminous [17, 31]. But the area rule should not have been a factor in the background change experiment or in any of the other experiments whose results are modeled in this paper since no changes in the sizes of contextual surfaces occurred in these experiments. Thus, the area rule can be neglected for present purposes. Therefore, when I refer to anchoring theory in the present paper, I mean a simplified version of anchoring theory in which the only anchoring rule is anchoring to the highest luminance within the target’s framework of illumination.
2. Modeling Lightness Judgments Made with Real-World Illuminated Material Surfaces
2.1 Zavagno, Daneyko, and Liu [53]
Zavagno, Daneyko, and Liu [53] (hereafter, ZDL) measured both the magnitude of the overall SLC effect and the perceived lightnesses of the individual incremental and decremental targets in an SLC display as a function of the intensity of a spotlight illuminating the display. The display was located in an otherwise dimly lit room (Figure 5). The experiment was carried out with four types of SLC displays, including a “classic” SLC display and three modified SLC displays in which luminance gradients were added to the target’s surround (Figure 6). Here, I discuss only the Munsell matches made to targets in the classic SLC display.
Figure 5.
Schematic illustration of the lighting setup used by Zavagno, Daneyko, and Liu [53] in their Gelb-illuminated SLC experiment. The SLC displays were directly illuminated by a theatrical spotlight hidden from the observer’s view. The white paper on the walls and ceiling was positioned for another experiment but not illuminated in the experiment modeled here (adapted from Zavagno, Daneyko, and Liu [53]).
Figure 6.
The four types of SLC displays studied by Zavagno, Daneyko, and Liu [53] (figure from Zavagno, Daneyko, and Liu [53]).
The main experimental question that Zavagno and his colleagues were interested in was whether the overall magnitude of the SLC effect and the lightnesses of the incremental and decremental SLC targets would remain constant when the intensity of the spotlight was varied. In other words, would simultaneous lightness contrast exhibit lightness constancy? As shown in Figure 7, lightness constancy did not hold in the experiment. Instead, the lightness of both the incremental and the decremental targets increased monotonically as the level of the illumination increased. In what follows, I show how this failure of lightness constancy can be understood both qualitatively and quantitatively as a consequence of neural edge integration.
Figure 7.
Munsell matches made to the incremental and decremental targets in a classic SLC display plotted against the target luminance on a log–log scale (from Zavagno, Daneyko, and Liu [53]). The horizontal dashed line indicates the target’s actual reflectance (in log units). Error bars denote standard errors of the mean. Target luminance was varied by changing the intensity of the spotlight that illuminated the display. The horizontal red line with the double arrows has been added to indicate that the spotlight intensity, and therefore also the target luminance, was varied over a total range of about 2.5 log units. As the target luminance varied over this range, the target lightness estimates varied by about 0.3–0.4 log units (vertical red line).
In the ZDL experiment, changing the illumination level also changed the luminance ratio at the outer edges of the SLC display; that is, at the edges between the display and the dark background against which the display was presented. According to the neural model, changing the luminance ratio of a remote edge can change the perceived reflectance of an arbitrary target patch as long as the remote edge is within the spatial range of the brain’s edge integration computation.
In the ZDL experiment, the remote edge was always a luminance increment, so the edge integration model predicts that increasing the illumination level should increase the perceived reflectance of the SLC target, regardless of whether the target patch itself is a luminance increment or decrement. This prediction was verified by the lightness matches plotted in Fig. 7. According to the model, the magnitude of the lightness increase should depend on the product of the luminance step in log units at the outer edge of the SLC display and the weight given to this edge in the neural edge integration computation. The edge weight, in turn, is the product of the distance-dependent function ω(d) and the ON-cell exponent n_on (Eq. (2)). We do not know the exact form of ω(d); we only know that its value decreases with distance of the outer display edge from the target, so we cannot predict the exact amount of constancy failure that should be observed. Nevertheless, the neural model places an upper limit on the degree of constancy failure. That limit is n_on log units of lightness change per log unit of illumination change, the value that would be expected if there were no spatially dependent falloff in the weight given to the remote edge.
In the ZDL experiment, the illumination level was varied over a range of about 2.5 log units, as indicated by the change in target luminance on the x-axis in Fig. 7. This follows from the fact that the target luminance variation was achieved by changing the illumination level while keeping the target reflectance fixed. The model therefore predicts that the perceived reflectance of the targets should vary by at most 2.5 × 0.27 = 0.675 log units (assuming n_on = 0.27). In the actual experiment, increasing the illumination level over the 2.5 log unit range increased the lightness of both incremental and decremental targets by roughly 0.3–0.4 log units (the best estimate differs for the two target types, but the difference is within the error bars). Thus, the observed degree of lightness constancy failure is at least consistent with the upper limit set by the model. If it were not, the model would be disproven. If we go a step further and assume that the model holds and that n_on = 0.27, then we can estimate the value of ω(d) at the distance of the remote edge, which was located 1.49° from the SLC target center, from the equation 2.46 log units × ω(1.49°) × 0.27 ≈ 0.37 log units. From these values, we estimate ω(1.49°) ≈ 0.56.
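The arithmetic behind the upper bound and the estimate of ω can be checked with a short sketch. The numbers are those quoted in the text; the variable names are illustrative only and are not taken from the original analysis.

```python
# Upper bound on constancy failure and the implied distance-dependent weight.
# All numeric values are quoted in the text; the names are illustrative.
n_on = 0.27                   # ON-cell exponent (gain on incremental edges)
illum_range_nominal = 2.5     # approximate spotlight range, log units
illum_range_measured = 2.46   # range used in the estimate of omega, log units
observed_shift = 0.37         # observed lightness change, log units

# Largest lightness shift the model allows (remote edge at full weight, omega = 1).
upper_bound = illum_range_nominal * n_on

# Solve  illum_range_measured * omega * n_on = observed_shift  for omega.
omega_est = observed_shift / (illum_range_measured * n_on)

print(f"upper bound on lightness shift: {upper_bound:.3f} log units")
print(f"estimated omega at the remote edge: {omega_est:.2f}")
```

The observed shift falls below the bound, which is the consistency check described above.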
2.2
Cataliotti and Gilchrist [6]
In a pioneering study on the topic now known as lightness anchoring, the Gestalt psychologist Adhémar Gelb [12] illuminated a piece of black paper with a spotlight in an otherwise dimly lit room. Despite the fact that the paper was actually black (i.e., had a low surface reflectance), the paper appeared to observers to be white when presented in the spotlight. Gelb then surrounded the actual black paper by a true white paper and the black paper now appeared darker than the white paper.
Cataliotti and Gilchrist [6] repeated Gelb’s experiment with added variations. Following Gelb, they first presented their observers with a single black paper isolated in a spotlight. Under these conditions, the black paper appeared white (as shown earlier by Gelb). Next, they introduced a dark gray paper into the spotlight abutting the black paper. The dark gray paper then appeared white and the actual black paper appeared relatively darker than the white-appearing dark gray paper. Papers with progressively higher reflectances were then introduced into the spotlight one at a time until the spotlight contained a total of five papers, ordered in reflectance from true black to true white. When the five ordered papers were viewed together in the spotlight, the paper with the highest reflectance appeared white and the lightnesses of the other four papers were positively correlated with their actual physical reflectances. Importantly, however, the perceived reflectances of the five papers—as measured by Munsell matches—did not scale in direct proportion to the physical reflectances of the papers. Instead, the perceived dynamic range was only about 1∕3 as large, in log units, as the actual range of paper reflectances in their experiment, as determined in a subsequent quantitative analysis by Rudd [38].
In a related experiment, Gilchrist and Cataliotti ([14]; see also [17]) surrounded the five papers in the staircase-Gelb series with a white border and discovered that this manipulation had the effect of decompressing the perceived dynamic range of the spatially well-ordered papers. Although they did not report quantitative measurements of the magnitude of this decompression, the authors reported in their abstract that the perceived range of the paper reflectances was more similar to the actual reflectance range of the papers (i.e., ground truth) when the white border was present than when the papers comprising the staircase-Gelb series were presented against the dark background.
To account for this decompression effect, Gilchrist has suggested that the white border serves to perceptually insulate the papers in the Gelb series from the larger framework of illumination defined by the room in which the experiment was conducted [17]. According to this hypothesis, when the white border is absent, the visual system maps the actual range of paper reflectances into the much narrower range of perceived reflectances in order to accommodate the larger range of luminances present in the global visual environment of the room. This interpretation is consistent with the principle of co-determination, originally proposed by Kardos ([25]; see also [13, 17]). The principle asserts that the lightness of a given surface within a scene is a weighted average of two quantities: the lightness computed by comparing the luminance of that surface to the luminances of the other surfaces belonging to the same illumination framework as the target surface (in this case, the framework of Gelb papers), and the lightness computed by comparing the target surface’s luminance to all of the surfaces within the larger visual environment (in this case, the entire room). These principles of grouping by illumination frameworks and co-determination are key tenets of Gilchrist’s anchoring theory. I will return to these principles below.
2.3
Zavagno, Annan, and Caputo [52]
Zavagno, Annan, and Caputo [52] (hereafter, ZAC) replicated the staircase-Gelb experiment of Cataliotti and Gilchrist with two added conditions in which the same five papers were spatially reordered in two different ways. Figure 8—reprinted from ZAC’s article—illustrates the spatial orderings of the papers under their three experimental conditions, denoted as Series A, B, and C. The Munsell values of the papers were 2.0, 4.0, 6.0, 8.0, and 9.5 [33], which correspond to measured log reflectances 0.4942, 1.0792, 1.4778, 1.7716, and 1.9543 (Daniele Zavagno, personal communication, December, 2018).
Figure 8.
The staircase-Gelb effect and lightness compression. The data labeled ‘A’ are the average lightness matches made to the papers in a staircase-Gelb display in which the five squares are arranged in order from darkest to lightest. The curves labeled B and C indicate lightness matches made to the same papers after the spatial arrangement of the papers was altered as shown in the inset in the lower right of the figure. The spatial arrangement of the papers was altered in two different ways to position the white square next to the black square. The theoretical lines on the plot correspond to the lightness assignments expected from the highest luminance anchoring within the local illumination framework of the papers and the global illumination framework of the room, according to anchoring theory (figure and caption adapted from Zavagno, Annan, and Caputo, 2004).
The Munsell matches made to the individual papers in each series are also plotted in Fig. 8. Clearly, the spatial ordering of the papers in the Gelb series matters. For example, the lowest reflectance paper appears darker when it appears next to the white paper in Series C than when it appears next to the second lowest reflectance paper in Series A. This is an important result because it runs contrary to the central claim of anchoring theory that the lightness of each individual paper in any given series is determined by the ratio of that paper’s luminance to the luminance of the highest luminance paper in the series and not by proximity. If anchoring theory were correct, the spatial ordering of the papers should make no difference.
Another noteworthy feature of the results plotted in Fig. 8 is that the matches made in Series A (the staircase-Gelb series in which the papers were ordered from darkest to lightest) replicate the compression effect observed by Cataliotti and Gilchrist in their original staircase-Gelb experiment, both qualitatively and quantitatively. I fit a least-squares linear regression model to the data from Series A and obtained a slope estimate of 0.30, which is close to the cube-root compression exhibited in Cataliotti and Gilchrist’s experiment with the staircase-Gelb series presented against a dark background [38].
According to the neurocomputational model, the lightness of a homogeneous surface, such as one of the papers in these experiments, is determined by a sum over neurally weighted steps in log luminance computed at the surface borders and at other nearby edges that are oriented roughly parallel to the surface borders [41, 54]. In the ZAC experiment, the “other nearby edges” were the edges between papers in the Gelb series that were not the edges of the target paper itself. Edge weights are assumed to decay with distance, so we anticipate that the borders of the target paper itself will make the largest contribution to the edge integration computation for each individual paper.
To discover whether the model can account for the matches made by the observers in Series A, B, and C, I proceeded in a stepwise manner. I first simulated the behavior of a reduced version of the neurocomputational model in which only the immediate borders of each paper contribute to that paper’s lightness. Then I added the effects of other neighboring borders one by one, beginning with the next closest borders. I first report the results for the first-order model in which only the immediate borders of each paper contribute to the paper’s lightness. In the simulation, incremental steps were weighted by the neural gain factor 0.27 and decremental steps by the gain factor 1.0. Since papers in Zavagno et al.’s study were square-shaped, the weights assigned to each of the four paper borders in the simulation were determined solely by the contrast polarity of each border and not by the relative border length, which was equal for all four borders of each paper.
The results for the first-order model are presented in Figure 9(a). The plots at the top of the figure are the actual matches made by observers in Series A, B, and C; the plots at the bottom are the simulated matches. Clearly, the two sets of plots do not align. However, the mismatch is largely remedied by renormalizing the model output so that the largest output of the neural computation in each series appears white (i.e., matches a Munsell 9.5 standard) as shown in Fig. 9(b). This was accomplished by adding a different constant to the output of the simulated data from each series. Note that the normalization rule applied here is not the same as the anchoring rule adopted in Gilchrist’s anchoring theory and in some versions of the retinex color model (e.g., [28]), which is that the highest luminance always appears white.
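The first-order computation and the renormalization step can be sketched as follows for Series A, whose ordering (darkest to lightest) is stated in the text. This is a minimal illustration, not the author's simulation code: it assumes the five papers form a single horizontal row, it treats differences in log reflectance as differences in log luminance (valid under common illumination), and the dark-background level `BACKGROUND` is a hypothetical placeholder because the text does not report it.

```python
# First-order edge model for a row of square papers (illustrative sketch).
LOG_REFLECTANCES_A = [0.4942, 1.0792, 1.4778, 1.7716, 1.9543]  # Series A, from the text
WHITE = 1.9543       # log reflectance of the Munsell 9.5 standard
BACKGROUND = -1.0    # ASSUMPTION: hypothetical dark-background level, not from the paper


def weighted_step(target, neighbor, n_on=0.27, n_off=1.0):
    """Gain-weighted step in log luminance directed from neighbor toward target."""
    step = target - neighbor
    gain = n_on if step > 0 else n_off   # increments compressed, decrements veridical
    return gain * step


def first_order_lightness(papers, background, n_on=0.27, n_off=1.0):
    """Sum the steps at the four borders of each paper, each border weighted by 1/4."""
    outputs = []
    for i, target in enumerate(papers):
        left = papers[i - 1] if i > 0 else background
        right = papers[i + 1] if i < len(papers) - 1 else background
        # Top and bottom borders face the background (row-layout assumption).
        neighbors = [left, right, background, background]
        outputs.append(0.25 * sum(weighted_step(target, nb, n_on, n_off)
                                  for nb in neighbors))
    return outputs


def anchor_to_white(outputs, white=WHITE):
    """Add one constant so that the largest model output matches the white standard."""
    shift = white - max(outputs)
    return [o + shift for o in outputs]


raw = first_order_lightness(LOG_REFLECTANCES_A, BACKGROUND)
anchored = anchor_to_white(raw)
for r, a in zip(LOG_REFLECTANCES_A, anchored):
    print(f"log reflectance {r:.4f} -> simulated match {a:.4f}")
```

Note that `anchor_to_white` acts on the largest model output, not on the highest luminance, which is the distinction drawn in the paragraph above.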
Figure 9.
Comparison of the lightness matches made in Series A, B, and C of ZAC’s study with the predictions of a neural model based on a weighted average of the incremental and decremental steps in log luminance at the four edges of each paper. In the simulations, incremental steps in log luminance were weighted by the experimentally measured ON-cell exponent 0.27, and decremental steps in log luminance were weighted by the assumed OFF-cell exponent 1.0. (a) Simulated matches for the neural model without white anchoring. (b) Simulated matches shifted upward to make the paper producing the highest model output in each series match a Munsell 9.5 standard.
In previously published work, I have explained why I believe that the white anchoring happens at a processing stage beyond that of edge integration, and thus at a stage of neural processing at which information about luminance per se has been lost [38, 39, 41, 43]. For example, the anchoring rule applied here allows two simultaneously presented highest-luminance targets to appear different depending on the spatial compositions of their local contexts, whereas the highest luminance anchoring rule predicts that they should both appear (equally) white regardless of their spatial context [39, 43]. As discussed above, anchoring theory asserts that the appearance of any given target surface should not be affected by changes in the luminance of any lower-luminance contextual elements in the image unless those elements change size [13, 17].
The first-order model with white-level anchoring accounts reasonably well for the matches made by ZAC’s observers. Nevertheless, there is some residual mismatch between the real and simulated lightness matching results even after applying the highest output anchoring rule. To quantify this residual error, I compared the total sums-of-squares error for all 15 simulated versus actual match pairs plotted in Fig. 9(b) to the total sums of squares of the 15 actual matches around their grand mean. This calculation yielded an average error of 5.8%.
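One reading of this error measure is the residual sum of squares expressed as a percentage of the total sum of squares of the observed matches. The sketch below implements that reading; the function name is hypothetical, and the three-point example uses made-up placeholder numbers, not the observers' data.

```python
def percent_error(simulated, actual):
    """Residual sum of squares as a percentage of the total sum of squares.

    The total sum of squares is taken around the grand mean of the actual
    matches, as described in the text.
    """
    if len(simulated) != len(actual):
        raise ValueError("simulated and actual must have the same length")
    grand_mean = sum(actual) / len(actual)
    ss_error = sum((s - a) ** 2 for s, a in zip(simulated, actual))
    ss_total = sum((a - grand_mean) ** 2 for a in actual)
    return 100.0 * ss_error / ss_total


# Placeholder values for illustration only (NOT data from the study).
example_actual = [1.0, 2.0, 3.0]
example_simulated = [1.1, 2.0, 2.9]
print(f"{percent_error(example_simulated, example_actual):.1f}%")
```

Under this definition a value of 0% means a perfect fit, and 100% means the model does no better than predicting the grand mean for every paper.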
I next varied the initial value of n_on = 0.27 in steps of 0.01, while keeping all other elements of the model fixed, to search for the parameterization of the first-order model that minimized the average error. A model with n_on = 0.22 produced the least-squares error of 3.0%. With n_on fixed at this new value, I varied n_off in increments of 0.01 and discovered a new error minimum (3.02% versus 3.03%) at the value n_off = 0.99. Since this value of n_off was extremely close to the originally assumed OFF-cell exponent of 1.0, I continued my stepwise analysis with the assumed parameters n_on = 0.22 and n_off = 1.0. A plot of the first-order model fits obtained with these parameters is presented in Figure 10.
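The stepwise search (first over the ON-cell exponent, then over the OFF-cell exponent) can be sketched as a pair of one-dimensional grid scans. Because the observers' matches are not listed in the text, the targets below are synthetic values generated by the model itself, so the example only demonstrates the procedure. The row layout, the background level, the grid limits, and all function names are assumptions.

```python
PAPERS = [0.4942, 1.0792, 1.4778, 1.7716, 1.9543]  # Series A log reflectances (from the text)
BACKGROUND = -1.0  # ASSUMPTION: hypothetical dark-background level


def simulate(n_on, n_off):
    """First-order model with highest-output anchoring (illustrative sketch)."""
    raw = []
    for i, target in enumerate(PAPERS):
        left = PAPERS[i - 1] if i > 0 else BACKGROUND
        right = PAPERS[i + 1] if i < len(PAPERS) - 1 else BACKGROUND
        steps = [target - nb for nb in (left, right, BACKGROUND, BACKGROUND)]
        raw.append(0.25 * sum((n_on if s > 0 else n_off) * s for s in steps))
    shift = PAPERS[-1] - max(raw)  # largest output is mapped to the white standard
    return [r + shift for r in raw]


def sum_sq_error(simulated, targets):
    return sum((s - t) ** 2 for s, t in zip(simulated, targets))


def scan(loss, start, n_steps, step=0.01):
    """Return the grid value in start, start + step, ... that minimizes loss."""
    grid = [round(start + k * step, 2) for k in range(n_steps + 1)]
    return min(grid, key=loss)


# SYNTHETIC targets produced by the model itself (NOT the observers' matches).
targets = simulate(0.22, 1.0)

# Step 1: scan n_on with n_off held at 1.0. Step 2: scan n_off with n_on fixed.
best_n_on = scan(lambda n: sum_sq_error(simulate(n, 1.0), targets), 0.10, 30)
best_n_off = scan(lambda n: sum_sq_error(simulate(best_n_on, n), targets), 0.90, 20)
print(best_n_on, best_n_off)
```

With real data the two scans would be driven by the measured matches, and the second scan might move the minimum only slightly, as reported above.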
Figure 10.
Same as Fig. 9(b) except with the ON-cell exponent set to 0.22, the value that produced the least-squares error for the model fit to the psychophysical data from all three series combined.
To simulate the full neural model that includes edge integration (that is, the effects of other parallel paper edges in the series in addition to the edges of the target paper), I fixed the ON- and OFF-cell exponents at the values n_on = 0.22 and n_off = 1.0. I then added the influences of nearby edges to the calculation of each paper’s lightness in pairs, based on the number of edges away from the target at which each pair was located. All steps in log luminance that incremented along a path directed toward that paper were weighted by the value 0.22, and all steps in log luminance that decremented in the direction of that paper were weighted by the value 1.0.
The size of the papers used in the study was 2.38° × 2.38° (Daniele Zavagno, personal communication, July 2019). Thus, the second-order model included the four paper borders plus the two parallel borders that were each located 2.38° from the target paper, the third-order model included all of these borders plus the two parallel borders that were located 4.76° from the target paper, and so on. In some cases, only one additional border was present at a given distance. In such cases, one additional border was added to the model at the corresponding stage of the simulation process.
To compute the weight given to any given edge, the contrast-polarity-dependent weighting factor for that edge was multiplied by an independent, distance-dependent weighting factor that depended only on the distance of each pair of edges from the target paper. I also weighted the contribution of each edge—regardless of distance—by the same proportionality factor of 0.25 that I had applied to the four paper edges in the first-order model because any added edges were expected to produce their effects by the principle of parallelism [54] with the target border, which itself received a length-proportional weighting of 0.25. Thus, the total weight assigned to each edge in the lightness computation depended on three independent multiplicative factors: the distance of the edge from the target, the edge contrast polarity, and the proportion of the target edge that was parallel to the influencing edge.
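The three-factor weighting can be written as a single product. The sketch below is illustrative only: the function name is hypothetical, and the distance factor of 0.5 used in the second call is a placeholder, since the fitted values appear only in Fig. 12.

```python
def edge_weight(step, distance_factor, n_on=0.22, n_off=1.0, proportion=0.25):
    """Total weight applied to one step in log luminance.

    step            -- log-luminance step directed toward the target paper
    distance_factor -- distance-dependent weight (1.0 at the target border)
    proportion      -- share of the target border paralleled by this edge
    """
    polarity_factor = n_on if step > 0 else n_off
    return distance_factor * polarity_factor * proportion


# An incremental step at the target's own border.
print(edge_weight(step=0.3, distance_factor=1.0))
# A decremental step at a remote edge; the 0.5 is a placeholder value.
print(edge_weight(step=-0.3, distance_factor=0.5))
```

The contribution of an edge to the target's lightness is then this weight multiplied by the step itself, summed over all edges included at a given model order.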
Figure 11 presents the results of the simulation of the full edge integration model corresponding to the set of distance-dependent weighting factors illustrated in Figure 12(a). This set of weighting factors produced a total squared error of 1.55% under the choice of contrast-polarity-dependent parameters n_on = 0.22 and n_off = 1.0. I found that I could reduce this error to 1.48% by slightly increasing the value of n_off from 1.00 to 1.03, but any further increases in n_off increased the percent error. Changes in the value of n_on did not improve the model fit. I concluded that any deviations from the value n_off = 1.0 were likely to be due to noise. Therefore, the results shown in the plots correspond to the model where n_on = 0.22 and n_off = 1.0.
Figure 11.
Simulated lightness matches for Series A, B, and C corresponding to the least-squares parameterization of the full neural edge integration model. The ON-cell exponent was here set to the value 0.22 and the OFF-cell exponent was set to the value 1.0. The distance-dependent components of the edge weights were set at the values plotted in Fig. 12.
Figure 12.
The distance-dependent component of the edge-weighting function that produced the simulation results presented in Fig. 11 calculated in two ways. (a) Calculation based on weighting the contrast-polarity dependent and spatially dependent components of the weights for each edge by the proportion of the target border that paralleled the edge whose contribution was weighted. The value of the distance-dependent weighting factor at the location of the target border was here normalized to 1.0. (b) Same as (a) except that the proportionality factor was computed from the 2D angle that the contributing edge subtended with respect to the center of the paper whose lightness was being computed. No normalization was applied.
The distance-dependent weighting function plotted in Fig. 12(a) was obtained under the assumption that each remote edge had the same effective length as a target edge (i.e., was assigned a proportionality factor of 0.25), as described above. Fig. 12(b) plots an alternative distance-dependent weighting function that was computed by assuming that the influence of each successive edge decreases in proportion to the inverse of the two-dimensional (2D) spatial angle (in the plane of the display) of the remote edge length with respect to the target paper’s center. Both simulation results are shown here because it is not clear which of these assumptions, if either, is correct. Both correspond equally to the model fits shown in Fig. 11, which produce a total squared error of 1.55%. However, it is worth noting that the distance-dependent weighting function plotted in Fig. 12(a) is consistent with the estimate ω(1.49°) ≈ 0.56 obtained above by applying the edge integration model to the data from Zavagno, Daneyko, and Liu, whereas the weighting function plotted in Fig. 12(b) is not. Thus the results favor the model corresponding to Fig. 12(a).
These conclusions are important, so it is worth explaining them in more detail. The distance from the remote edge to the target edge in ZDL’s SLC display was 1.27°. From Fig. 12(a), we can estimate that the edge weight corresponding to this distance is about 0.54, so the correspondence is almost exact for the first model of the distance-dependent edge weighting factor (and within the error of the estimate of the degree of lightness constancy failure in ZDL’s experiment). The distance from the remote edge to the target center in the display was 1.49°. According to the second model, whose distance-dependent edge function is plotted in Fig. 12(b), the edge weight at that distance should be about 0.90, which is more than 1.5 times larger than the value of 0.56 that we estimated from the degree of lightness constancy failure in ZDL’s experiment. The first model is therefore plausible, while the second model is not.
2.4
Lightness Compression Evaluated for a Single Decremental Edge
The neural model accounts for the overall compressed dynamic range of the perceived reflectances of the papers in the staircase-Gelb and scrambled-Gelb experiments by assuming that only the neural responses to those steps in log luminance that increment in the direction of the target paper (i.e., the neural edge response derived from ON-cell responses) are compressed in the process of computing that paper’s lightness relative to the size of the true physical reflectance steps. Luminance decrements, on the other hand, are assumed to be represented veridically by the human visual system. The idea that decremental steps in log luminance are represented veridically is an important property of the model that deserves an independent check.
To test this prediction of the model, I examined how the darkest (Munsell 2.0) paper was differentially influenced by its neighboring papers in Series A and B of the ZAC study. In Series A, the darkest paper was positioned next to a Munsell 4.0 paper (the second lowest reflectance in the series) while in Series B, it was positioned next to the Munsell 9.5 paper (the highest reflectance paper in the series) (Fig. 8). In the first-order model, the perceived reflectance of the 2.0 paper depends on a weighted average of the luminance steps in log units at the four paper borders. Three of these steps in log luminance were the same in Series A and B, but one was different. The one that was different was the decremental step between the neighboring (either Munsell 4.0 or 9.5) paper and the Munsell 2.0 target paper. According to the first-order model, a change in reflectance of the neighboring paper should therefore alter the perceived lightness of the 2.0 paper by an amount equal to 1∕4 of the step in log reflectance at the border between the 2.0 paper and its neighboring paper. Changing the log reflectance of the immediately neighboring paper from 1.0792 (in Series A) to 1.9543 (in Series B) should therefore decrease the perceived reflectance of the target paper by (1∕4)(1.9543 − 1.0792) = 0.21878 log units.
Taking into account the influence of the other parallel edges, the full neural edge integration model predicts a slightly smaller shift in the perceived reflectance of about 0.1960 log units. Figure 13 illustrates this predicted lightness change graphically. The horizontal red arrow in the figure indicates the distance in log units between the physical reflectances of the Munsell 4.0 and Munsell 9.5 papers that neighbored the target Munsell 2.0 paper in Series A and B, respectively, while the vertical red arrow indicates the predicted shift in the perceived reflectance of the Munsell 2.0 paper in going from Series A to Series B. As can be seen from the figure, this prediction of the neural edge integration model is verified. The significance of this confirmation of the model cannot be overstated because it supports the key hypothesis of the model that the reflectance ratios at decremental borders are represented veridically by the human visual system even while the reflectance-to-lightness mapping exhibits dynamic range compression on the whole. If the visual response to decremental edges were compressed by the same amount as the visual response to incremental edges is here assumed to be, the predicted shift in lightness would be only about 1∕5 to 1∕4 as large as the shift that was actually measured. Hence these results provide strong evidence that only incremental steps in log luminance are compressed in the neural computation of lightness.
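The first-order version of this comparison can be reproduced with a few lines of arithmetic. The sketch uses the log reflectances and gain values quoted in the text; the variable names are illustrative. It contrasts the shift predicted for a veridical decremental gain of 1.0 with the shift that would be predicted if decrements were compressed by the ON-cell exponents considered in the paper (0.22 and 0.27).

```python
# Predicted shift in the Munsell 2.0 paper's lightness between Series A and B
# under the first-order model (one of four borders changes).
neighbor_a = 1.0792   # log reflectance of the Munsell 4.0 neighbor (Series A)
neighbor_b = 1.9543   # log reflectance of the Munsell 9.5 neighbor (Series B)
border_share = 0.25   # each border of the square paper carries 1/4 of the weight

step_change = neighbor_b - neighbor_a
veridical_shift = border_share * 1.0 * step_change
print(f"veridical decrement gain: shift = {veridical_shift:.5f} log units")

for gain in (0.22, 0.27):
    compressed_shift = border_share * gain * step_change
    ratio = compressed_shift / veridical_shift
    print(f"gain {gain:.2f}: shift = {compressed_shift:.5f} log units "
          f"({ratio:.2f} of the veridical prediction)")
```

The ratios equal the assumed gains, which is where the "about 1∕5 to 1∕4" figure in the text comes from.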
Figure 13.
Predicted shift in the log perceived reflectance (y-axis) of the darkest Gelb paper when its neighboring paper in Series A is replaced by the paper with the highest reflectance in Series B: an increase in the neighboring paper of 0.875 log units. The shift is predicted on the basis of the assumptions that the darkness-inducing effects of decremental edges are weighted veridically (gain factor 1.0) and that each edge of the square target paper contributes 1/4 of the total influence of the steps in log luminance at the target’s edges to the target lightness. The influence of other edges in the scene that are oriented parallel to the target paper’s edges has also been taken into account here, but their total influence on the lightness of this particular target paper was small (see text for details).
2.5
Effect of Adding a White Border to the Staircase-Gelb Display
As mentioned above, Gilchrist and Cataliotti [14] reported that surrounding a staircase-Gelb display by a white border brought the observers’ lightness judgments more in line with the physical luminance ratios in the display (see also [17]). To investigate whether the edge integration model mimics this effect, I repeated my simulation of the model’s response to the papers presented in Series A, but with the same arrangement of five papers instead presented against a background consisting of a Munsell 9.5 paper. Figure 14 presents the results of this simulation together with the actual and simulated results for Series A when presented against a dark background. The line in the figure labeled “ground truth” indicates the matches that observers would make if their lightness judgments were veridical. As can be seen from the figure, the model exhibited a strong release from compression in the direction of veridicality when a white background was added to the display.
Figure 14.
Comparison of the simulated output of the neural edge integration model in response to Series A (staircase Gelb) when the display is presented against a white background to the model’s output in response to the same arrangement of papers when they are presented against a dark background. The results from the simulated white background condition are consistent with Gilchrist and Cataliotti’s report [14] that surrounding the staircase-Gelb display by a white border tends to shift the lightness matches closer to the physical ratio scaling of the papers (i.e., ground truth).
To help quantify the magnitude of the observed change in the overall amount of dynamic range compression produced by the model in the presence of the white border, I fit linear regression models to the lightness matches performed by Zavagno et al.’s observers when the staircase-Gelb display was presented against a black background and to the simulated matches produced by the model when the same arrangement of papers was presented against a white background. The estimated slope for the actual Series A data was 0.30, while the slope of the simulated data corresponding to the white border condition was 0.85. For comparison, the log–log slope corresponding to a veridical reflectance match (i.e., ground truth) would be 1.0. By these measures, adding the white border had the effect of relieving about 79% of the dynamic range compression observed when Series A was presented against a dark background in ZAC’s original study (the residual compression is (1.0 − 0.85)∕(1.0 − 0.30) ≈ 21%, that is, 100% − 79%).
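The percentage of compression relieved follows directly from the three slopes. A minimal check, using the slope values reported in the text (the variable names are illustrative):

```python
slope_dark = 0.30       # regression slope, Series A against the dark background
slope_white = 0.85      # regression slope, simulated white-border condition
slope_veridical = 1.0   # slope for a ground-truth reflectance match

# Fraction of the original compression that remains with the white border.
remaining = (slope_veridical - slope_white) / (slope_veridical - slope_dark)
released = 1.0 - remaining

print(f"compression remaining: {100 * remaining:.0f}%")
print(f"compression released:  {100 * released:.0f}%")
```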
The modeling results are thus consistent with Gilchrist and Cataliotti’s report that adding a white border tends to counteract the compression effect observed when their staircase-Gelb stimulus is presented against a dark background. However, the neural model’s explanation of this decompression effect differs fundamentally from the one proposed by Gilchrist and his colleagues. According to their lightness anchoring theory, the white border acts to perceptually insulate the illumination framework of Gelb papers from that of the larger visual environment of the room, with the result that the visual system more accurately scales the reflectance ratios of the papers within the series. The neural edge integration model instead explains the effect of the white border on the basis of the fact that more of the paper borders in the display are luminance decrements with respect to the immediate surround when the white border is present. Since decremental steps in log luminance are represented veridically in the neural model (as verified through post-hoc data analysis above), while incremental edges are not, replacing the incremental steps in log luminance at the borders between the Gelb papers and the dark background with the decremental steps in log luminance at the borders between the same papers and a white background has the overall effect of bringing the lightness matches more in line with ground truth.
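The mechanism can be illustrated with the first-order model alone by counting decremental borders and comparing regression slopes for the two backgrounds. This is a simplified sketch, not the full edge integration simulation reported above, so its slopes need not match the reported values. It assumes a single horizontal row of papers, and the dark-background level `DARK_BG` is a hypothetical placeholder.

```python
PAPERS = [0.4942, 1.0792, 1.4778, 1.7716, 1.9543]  # Series A log reflectances (from the text)
DARK_BG = -1.0      # ASSUMPTION: hypothetical dark-background level
WHITE_BG = 1.9543   # background made of Munsell 9.5 paper


def border_steps(papers, background):
    """Steps in log luminance directed toward each paper at its four borders."""
    all_steps = []
    for i, target in enumerate(papers):
        left = papers[i - 1] if i > 0 else background
        right = papers[i + 1] if i < len(papers) - 1 else background
        all_steps.append([target - nb for nb in (left, right, background, background)])
    return all_steps


def first_order(papers, background, n_on=0.22, n_off=1.0):
    """First-order lightness: polarity-weighted border steps, 1/4 weight per border."""
    return [0.25 * sum((n_on if s > 0 else n_off) * s for s in steps)
            for steps in border_steps(papers, background)]


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def count_decrements(papers, background):
    return sum(s < 0 for steps in border_steps(papers, background) for s in steps)


slope_dark = ols_slope(PAPERS, first_order(PAPERS, DARK_BG))
slope_white = ols_slope(PAPERS, first_order(PAPERS, WHITE_BG))
print("decremental borders:", count_decrements(PAPERS, DARK_BG), "(dark) vs",
      count_decrements(PAPERS, WHITE_BG), "(white)")
print(f"slope, dark background:  {slope_dark:.2f}")
print(f"slope, white background: {slope_white:.2f}")
```

The slope is unaffected by the white-anchoring constant, so the anchoring step is omitted here.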
3.
General Discussion
It has been shown here that a neural model based on the principle of edge integration can account for lightness matching data from experiments performed with material displays illuminated by a spotlight in an otherwise dimly lit room (Gelb illumination) as well as for data from experiments conducted with disk/annulus and square/frame stimuli presented on computer monitors. In the case of the material displays, the model explains real-world failures of lightness constancy as well as key quantitative properties of perceptual dynamic range, including lightness compression and various releases from compression.
The neural model combines edge integration with distance and contrast-polarity edge weights plus a renormalization of the highest output of the neural edge integration process to appear white. Together, these assumptions result in a model in which lightness depends on a spatially extended (beyond local contrast), but also spatially windowed, analysis of the visual scene. The purpose of this analysis is to combine local contrast measures across space to establish a scale of perceived reflectance that applies to a substantial region of the image [37].
The modeling results reported here confirm a key model prediction that the influence of spatial context on lightness declines with distance. The maximum extent of this influence is estimated here to be about 10°. This result is consistent with previous measurements of the spatial extent of the contextual influence on lightness based on experiments conducted with disk-ring stimuli [44] and with other experimental lightness and brightness paradigms (e.g., [8, 26, 30, 47]).
The neural model produces a distorted representation of real-world surface reflectances. Because the edge weights in the model depend on both the distance of the weighted edge from the target and the edge contrast polarity, and because the weighted edges themselves depend on local steps in log luminance, the perceptual rescaling of reflectance that the model produces depends on both the geometric and photometric properties of the target surface and of the other surfaces in the target’s visual environment. Incremental steps in log luminance are perceptually compressed, while decremental steps in log luminance are represented veridically. But even the influences of decremental steps on lightness depend on distance. The overall lightness scaling produced by combining incremental and decremental luminance steps through the process of edge integration accounts for the key behavioral properties of lightness matching observed here, including constancy failures, lightness compression, and various releases from compression. The model’s assumption of distance-dependent and contrast-polarity-dependent edge weights can be usefully compared to the assumptions of the original retinex model of Land and McCann [29], which also (implicitly) assumed that lightness is computed from a sum of steps in log luminance, but with edge weights that were uniformly equal to 1.0. Unlike the present model, retinex predicts veridical ratio scaling and lightness constancy with respect to changes in spatial context [37], contrary to the results of the experiments modeled here.
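The contrast with retinex drawn above can be illustrated with a short sketch that sums weighted steps in log luminance along a one-dimensional path ending at the target. This is a minimal sketch under stated assumptions, not the fitted model: the function name `edge_integrated_lightness`, the increment gain `w_inc = 0.3`, and the `distance_weights` list are hypothetical placeholders chosen only for illustration. Setting every weight to 1.0 gives a retinex-style sum, which telescopes to the log luminance ratio between the target and the start of the path.

```python
import math

def edge_integrated_lightness(path_luminances, w_inc=0.3, distance_weights=None):
    """Weighted sum of log-luminance steps along a path that ends at the target.

    path_luminances: luminances of the regions crossed, ordered from the far
        end of the path to the target (last element).
    w_inc: gain applied to incremental steps (hypothetical value < 1.0);
        decremental steps always receive a gain of 1.0.
    distance_weights: one hypothetical distance-dependent gain per edge;
        defaults to 1.0 for every edge.
    """
    # Step in log luminance at each edge, taken in the direction of the target.
    steps = [math.log10(b / a)
             for a, b in zip(path_luminances, path_luminances[1:])]
    if distance_weights is None:
        distance_weights = [1.0] * len(steps)
    total = 0.0
    for step, w_dist in zip(steps, distance_weights):
        w_polarity = w_inc if step > 0 else 1.0
        total += w_polarity * w_dist * step
    return total

path = [1.0, 10.0, 100.0]                                  # two incremental edges
retinex = edge_integrated_lightness(path, w_inc=1.0)       # all weights 1.0
compressed = edge_integrated_lightness(path, w_inc=0.3)    # increments compressed
decrement = edge_integrated_lightness([100.0, 10.0])       # one decremental edge
```

With these placeholder values, the two incremental steps contribute 0.6 log units in total rather than 2.0, while the single decremental step passes through unchanged at about -1.0. The final rescaling that maps the highest integrator output to white is left out of the sketch.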
3.1
Implications of These Results for Other Models of Lightness Perception
In this section, I critique some competing models of lightness perception in light of the results reported above.
3.1.1
Gilchrist’s Lightness Anchoring Theory
Anchoring theory is a spatially global theory of lightness computation in the sense that the lightness of any given surface in an image is determined by an arbitrarily long-range comparison of the target surface’s luminance with the highest luminance within the target’s illumination framework. This long-range comparison is not based on ratios at edges but instead on the ratio of the luminance of the target surface to that of the highest luminance in the target’s framework of illumination. In anchoring theory, these two surfaces do not have to be contiguous.
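The highest-luminance comparison described above can be sketched in a few lines. This is a simplified illustration, not Gilchrist's full theory: the function name `anchored_lightness` is hypothetical, the value 0.90 for the reflectance assigned to white is an assumption of the sketch, and the area rule and co-determination across frameworks are deliberately omitted.

```python
WHITE_REFLECTANCE = 0.90  # assumed reflectance assigned to the anchor ("white")

def anchored_lightness(target_luminance, framework_luminances):
    """Highest-luminance anchoring within a single illumination framework.

    Returns the reflectance assigned to the target when the highest
    luminance in the framework is taken to be white. Only the ratio of the
    target to the highest luminance enters the computation.
    """
    highest = max(framework_luminances)
    return WHITE_REFLECTANCE * target_luminance / highest

before = anchored_lightness(20.0, [5.0, 20.0, 80.0])
# Lowering the darkest surface leaves the prediction for the target unchanged.
after = anchored_lightness(20.0, [1.0, 20.0, 80.0])
```

Under this rule alone, surfaces darker than the target have no effect on the target's predicted lightness, and neither does the distance between the target and the anchor.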
Several of the empirical findings reported in this paper provide strong evidence against anchoring theory. First, as demonstrated by the background change experiment, the lightness of any given target surface can be influenced by the luminances of surfaces within the image that have a lower luminance than the target, including regions that have the lowest luminance in the image as a whole. This finding alone is enough to refute the claim that lightness is based solely on the comparison of a surface’s luminance with that of the surface having the highest luminance. Furthermore, as mentioned above, this conclusion holds even when there is no change in the area of any surface that has a luminance lower than that of the target surface, so that the area rule of anchoring theory cannot be invoked as an alternative explanation of the results discussed in this paper.
Second, contextual elements in the vicinity of a target tended to influence the target’s lightness more than spatially distant elements in the experiments analyzed here, even when the near and far contextual elements were presented within the same illumination framework as the target (as in the Gelb illumination experiments modeled here). This finding directly refutes the assumption of a global perceptual comparison of the luminance of each surface to that of the highest luminance of the sort assumed by anchoring theory.
Third, anchoring theory treats the well-known asymmetries in the strengths of lightness and darkness induction (e.g. [5, 13, 17, 21, 22, 43, 44, 50, 51]) in an all-or-none fashion. Anchoring theory’s sole mechanism for addressing these asymmetries (again ignoring the area rule, which does not apply to the experiments modeled here) is highest luminance anchoring. Although the neural model proposed here is in agreement with anchoring theory on the need for a white anchor, it assumes that the white anchor corresponds to the highest output of the edge integrator rather than to the highest luminance per se. The assumption that it is the highest edge integrator output, rather than the highest luminance, that appears white has been demonstrated in previous research to provide a better account of the full panoply of results pertaining to white point anchoring [37, 38, 41, 43]. Furthermore, and perhaps more importantly, a mechanism other than anchoring is needed to fully account for the overall pattern of lightness–darkness asymmetries documented in the literature. The neural model provides this additional mechanism in the form of asymmetries in the neural gains applied to incremental and decremental steps in log luminance that result from the different response characteristics of ON- and OFF-neurons in early visual pathways. Critically, anchoring theory provides no account of how lightness is influenced by changes in the luminance or proximity of scene elements other than the surface with the highest luminance, while these consequences emerge naturally from the asymmetric processing of incremental and decremental luminance steps assumed by the neural model.
Importantly, the neural model accounts for both compression and release from compression in the Gelb effect on the basis of the model’s assumptions about asymmetric processing of incremental and decremental luminances without the need to invoke the ancillary assumptions of grouping by the illumination framework and co-determination proposed by anchoring theory to explain these same effects. This is not to say that grouping and segmentation do not play any role in lightness perception, but these concepts are not needed to account for staircase- and scrambled-Gelb illusions. The issues of grouping and segmentation will be addressed in future work on the neural model.
3.1.2
ODOG Model
The oriented difference-of-Gaussians (ODOG) model [3] comprises a bank of 42 oriented difference-of-Gaussians filters tuned to seven spatial scales and six orientations. ODOG computes brightness by convolving the input image with all 42 filters and then summing the filter outputs across spatial scales, with the higher spatial frequency mechanisms assigned higher weights. The outputs of the resulting six orientation channels are independently normalized by their root mean square contrasts, evaluated across the entire image, and then summed to produce the model output.
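The combination stage described above (weighted sum across scales, per-orientation RMS normalization, sum across orientations) can be sketched as follows. This is an illustrative sketch only, not the published ODOG implementation: the filtering itself is assumed to have been done elsewhere, the function names are hypothetical, and the scale weights are placeholder values, with the scale index assumed to run from coarse to fine.

```python
import math

N_ORIENTATIONS, N_SCALES = 6, 7   # 6 x 7 = 42 filters, as described above

def rms(values):
    """Root mean square of a flattened image."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def odog_combine(responses, scale_weights):
    """Combine precomputed filter outputs into a single model output.

    responses[o][s] is the flattened output of the filter at orientation o
    and scale s (the convolution is assumed to be done elsewhere).
    scale_weights[s] is a hypothetical weight for scale s.
    """
    n_pixels = len(responses[0][0])
    output = [0.0] * n_pixels
    for per_scale in responses:                  # one orientation at a time
        # Weighted sum across the spatial scales of this orientation.
        channel = [sum(w * image[i] for w, image in zip(scale_weights, per_scale))
                   for i in range(n_pixels)]
        norm = rms(channel)                      # RMS over the whole image
        if norm > 0.0:
            for i in range(n_pixels):
                output[i] += channel[i] / norm   # normalize, then sum
    return output

# Toy input: 3-pixel "images" standing in for real filter outputs.
scale_weights = [1.0 + 0.1 * s for s in range(N_SCALES)]   # placeholder values
responses = [[[float((o + 1) * (s + 1) * (i - 1)) for i in range(3)]
              for s in range(N_SCALES)] for o in range(N_ORIENTATIONS)]
output = odog_combine(responses, scale_weights)

# Reversing the sign of every filter response reverses the sign of the output.
flipped = odog_combine([[[-v for v in image] for image in per_scale]
                        for per_scale in responses], scale_weights)
```

Because the normalization divides each orientation channel by its own RMS, rescaling one channel's responses does not change its contribution, and reversing the sign of all filter responses simply reverses the sign of the output. In that sense this combination stage treats positive and negative responses symmetrically.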
The ODOG model has been shown to be capable of providing a qualitative explanation of many standard brightness/lightness illusions [32], including the results of a study performed with staircase-Gelb and scrambled-Gelb displays on a computer monitor [4]. Because the scale of its largest filters is about 6°, the ODOG model might also be able to account for the distance-dependent falloff in contextual influence on target lightness discussed and modeled here. However, it seems unlikely that the ODOG model could explain quantitative properties of the dynamic range compression observed in the ZAC study, and even less likely that it could account for the fact that decremental steps in log luminance at individual edges were represented veridically by the visual system in the same experiments. To account for such effects would seem to require a mechanism for asymmetric lightness and darkness encoding, which ODOG lacks because the linear filters on which it is based treat increments and decrements symmetrically.
Relatedly, the use of linear cortical filters to model visual perception—which is often considered as one of ODOG’s strengths because it is thought to reflect the properties of known physiological mechanisms—is brought into question by the physiological evidence cited here for a strong compressive response in macaque LGN ON-cells, which are commonly believed to be part of the visual pathway that mediates primate color vision. It is unclear at present how these seemingly conflicting views of the underlying visual physiology are to be resolved.
The ODOG model also lacks any form of lightness anchoring, which is needed to account for the fact that Cataliotti and Gilchrist’s and ZAC’s subjects always matched the highest lightness paper in each series to a white (Munsell 9.5) standard. On the other hand, in the study that was conducted to test ODOG’s ability to explain results like the ones modeled here, but with the stimuli presented on a computer monitor [4], the lightness of the simulated highest luminance paper depended on that target’s position in the series in which it was embedded. This suggests that the simulated highest luminance paper did not always appear white and, thus, that the computer-generated, monitor-displayed stimuli were not processed in the same way as the real-world illuminated stimuli employed in the experiments modeled here.
Finally, psychophysical evidence supports the conclusion that a full account of lightness perception requires image segmentation mechanisms that are not included in ODOG or other low-level filter models (e.g. [26]). Consistent with the perceptual evidence for image segmentation in lightness is evidence from neurophysiology for neural mechanisms supporting such image segmentation cues as illusory contour completion [49] and border ownership [55] located in area V2 of the visual cortex. I have previously addressed this issue and proposed that such mechanisms could play a role within the overall architecture of my neural model by further modulating the gains of visual neurons at a processing stage that comes before the stage of edge integration in a feedforward circuit in a manner that is consistent with the cortical architecture of known neural image segmentation mechanisms cited above ([39], Fig. 1; [41]). Since image segmentation mechanisms are not required to account for the lightness phenomena modeled in the present paper, I have mostly ignored them here. Nevertheless, the fact that they have a place in the larger neural circuit that I have proposed as a physiological substrate for cortical lightness computation distinguishes my approach from that of low-level filtering models such as the ODOG.
3.1.3
Stiehl, McCann, and Savoy [48]
Stiehl et al. proposed a lightness model in which dynamic range compression is produced by a combination of intraocular scattering and neural processing. In their model, intraocular scattering produces a compressive power law mapping of physical reflectance to retinal illuminance, and the nervous system subsequently performs a logarithmic mapping of retinal illuminance to lightness.
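In symbols, the two stages described above compose as follows. This is a schematic restatement under assumed notation, not the equations of Stiehl et al.: $R$ is physical reflectance, $I$ is retinal illuminance, $\Lambda$ is lightness, and $k$, $\gamma$, $a$, and $b$ are unspecified constants introduced here only for illustration.

```latex
I = k\,R^{\gamma}, \quad 0 < \gamma < 1
\qquad\Longrightarrow\qquad
\Lambda = a \log I + b = a\,\gamma \log R + \left(a \log k + b\right)
```

In this schematic form, every step in log reflectance is scaled by the same factor $\gamma$, whatever its contrast polarity.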
In principle, intraocular scattering might explain the compressive power law response of the LGN ON-cells observed in the data of De Valois et al. [9]. If scattering was the correct explanation for the ON-cell compressive response, however, we would expect OFF-cells to exhibit a similar compressive behavior. In the course of the modeling described in the present paper, multiple checks were performed to verify that decremental luminance ratios at edges were represented veridically by the visual system as a whole, while incremental luminance ratios at edges were subject to compression, including a direct test in which only the luminance ratio at a single decremental edge was altered (see Fig. 13). In each case, the assumption that decremental edges are represented veridically at the perceptual level was verified. Thus, if intraocular scattering indeed produces a compressive response in both ON- and OFF-cells, then an additional neural mechanism located subsequent to the LGN would be needed to explain why this compression does not apply to the perception of decremental luminance steps, while it does apply to the perception of increments. If, on the other hand, the ON-cell compression seen in neural data is not due to scattering, then it is not clear how scattering could influence perception. Therefore, it is difficult to see how the model proposed by Stiehl et al. can be reconciled with the overall pattern of results presented here.
Another argument against the hypothesis that intraocular scattering explains perceptual lightness compression is that spatial filtering in the human nervous system likely filters out the effects of the slow luminance gradients produced by scattering. Although the ON- and OFF-cells in the LGN do transmit information about the DC luminance level, cortical neurons such as simple cells arguably exhibit no such DC response. It is the point of view of the current model that the outputs of such DC-suppressing cortical neurons (including edge detectors) form the basis for the cortical representation of surface lightness. This idea will be further addressed in future work on the model.
3.1.4
Perceptual Filling-in Models of Grossberg and Colleagues
The model proposed here has important points of commonality with the brightness “filling-in” models first proposed by Grossberg and his colleagues [7, 19, 20] in the 1980s, and subsequently incorporated into a complex and sophisticated model of cortical visual processing called FACADE theory [18].
Like the filling-in models proposed by Grossberg et al., the present model proposes that the visual nervous system first encodes information about edges and borders, then uses this information to construct a cortical representation of surfaces and objects. However, the present model differs from Grossberg’s theory in important and testable ways. The mechanism that Grossberg and colleagues have proposed to account for the filling of surface properties is a diffusing neural signal that propagates from the locations of surface borders within one or more topological maps of the visual environment and stops when it reaches the next border (see, for example, [10, p. 58]). This is how surface properties are confined to the surface region itself according to their model.
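The diffuse-to-border idea described above can be illustrated with a toy one-dimensional simulation. This is a simplified stand-in, not the equations of Grossberg and colleagues: it uses plain nearest-neighbor diffusion with flow blocked completely at border locations, and the function name `fill_in`, the diffusion `rate`, and the number of steps are hypothetical choices made for the sketch.

```python
def fill_in(signal, boundaries, rate=0.25, n_steps=200):
    """1-D nearest-neighbor diffusion that cannot cross border locations.

    signal: initial activity at each position (e.g. edge-derived signals).
    boundaries: set of indices i such that a border lies between
        positions i and i + 1, where diffusion is blocked.
    rate, n_steps: hypothetical numerical settings for this sketch.
    """
    activity = list(signal)
    n = len(activity)
    for _ in range(n_steps):
        updated = activity[:]
        for i in range(n - 1):
            if i in boundaries:
                continue                      # no flow across a border
            flow = rate * (activity[i + 1] - activity[i])
            updated[i] += flow
            updated[i + 1] -= flow
        activity = updated
    return activity

# Borders after positions 2 and 4 create three compartments: 0-2, 3-4, 5-7.
signal = [0.0, 0.0, 3.0, 0.0, 8.0, 0.0, 0.0, 0.0]
filled = fill_in(signal, boundaries={2, 4})
```

Each compartment settles to the mean of the activity it started with, and no signal from one compartment ever reaches another. This confinement is the property at issue in the discussion of edge summation in this section.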
The results presented here raise serious obstacles for such diffuse-to-border-type filling-in models by demonstrating that information about multiple incremental and decremental luminance steps is spatially summed by the visual system to compute lightness. This would seem to rule out a mechanism in which information about the luminance step at an edge literally spreads from the location of that edge to fill in only the region between that edge and the next edge in a neural map, because the implied blockage of information flow would make edge summation impossible. For this reason, I have proposed an alternative mechanism to explain how perceptual filling-in of surface properties occurs in the brain. According to this alternative model, edge summation is performed by large-scale receptive fields of color-representing neurons at a subsequent stage of neural processing [37, 38, 41]. Although this alternative account of cortical visual processing has not been developed to the same level of sophistication as that of Grossberg’s group, it is nevertheless worth emphasizing here that the current approach is both well-defined and distinctly different from Grossberg’s model of how edge information is utilized by the brain to construct a cortical representation of surfaces and objects in the visual environment. Furthermore, my model makes predictions that can be tested against the predictions of diffusive filling-in models. One of these predictions is, in fact, tested here by the demonstration that edge integration occurs in a variety of perceptual contexts and is thus a fundamental mechanism contributing to the perceptual computation of lightness.
3.2
Outstanding Issues and Future Extensions of the Model
Previous work on the model has motivated the introduction of several components of the model beyond those discussed here. These include contrast gain control [37–39, 41, 42, 45], top-down gain control of edge weights based on edge classification [37, 41], and individual differences in the spatial extent of edge integration [41]. These model properties were not discussed in the present paper because they were not needed to model the data under investigation.
Because ON- and OFF-cells possess center–surround receptive fields, the neural model proposed here can also spatially integrate luminance gradients in addition to hard edges (see [41]). This model property has previously been used, in conjunction with the model’s contrast gain control property, to simulate the phantom illusion: a lightness illusion in which surrounding targets with luminance gradients can reverse the apparent contrast polarity of the target, thus making increments appear as decrements and vice versa [11, 41]. The fact that the model can integrate both hard edges and gradients suggests that the model might be extended in future work to account for the lightness of several novel SLC displays that Zavagno, Daneyko, and Liu created by adding additional edges and gradients to the classical SLC display (Fig. 6).
Acknowledgments
Michael Rudd is supported by NIH COBRE P20GM103650.
References
1. H. B. Barlow, “Summation and inhibition in the toad’s retina,” J. Physiol. (London) 119, 69–88 (1953).
2. V. A. Billock, “Hue opponency: chromatic valence functions, individual differences, cortical winner-take-all opponent modeling, and the relationship between spikes and sensitivity,” J. Opt. Soc. Am. A 35, B267–B277 (2018). doi:10.1364/JOSAA.35.00B267
3. B. Blakeslee and M. E. McCourt, “A multiscale spatial filtering account of the White effect, simultaneous brightness contrast and grating induction,” Vision Res. 39, 4361–4377 (1999).
4. B. Blakeslee, D. Reetz, and M. E. McCourt, “Spatial filtering versus anchoring accounts of brightness/lightness perception in staircase and simultaneous brightness/lightness contrast stimuli,” J. Vision 9 (2009). doi:10.1167/9.3.22
5. P. Bressan and R. Actis-Grosso, “Simultaneous lightness contrast with double increments,” Perception 30, 889–897 (2001). doi:10.1068/p3103
6. J. Cataliotti and A. L. Gilchrist, “Local and global processes in lightness perception,” Percept. Psychophys. 57, 125–135 (1995). doi:10.3758/BF03206499
7. M. A. Cohen and S. Grossberg, “Neural dynamics of brightness perception: features, boundaries, diffusion, and resonance,” Percept. Psychophys. 36, 428–456 (1984). doi:10.3758/BF03207497
8. R. E. Cole and A. L. Diamond, “Amount of surround and test inducing separation in simultaneous lightness contrast,” Percept. Psychophys. 9, 125–128 (1971). doi:10.3758/BF03213045
9. R. L. De Valois, I. Abramov, and G. H. Jacobs, “Analysis of response patterns of LGN cells,” J. Opt. Soc. Am. 56, 966–977 (1966). doi:10.1364/JOSA.56.000966
10. L. Fang and S. Grossberg, “From stereogram to surface: how the brain sees the world in depth,” Spatial Vis. 22, 45–82 (2009). doi:10.1163/156856809786618484
11. A. Galmonte, A. Soranzo, M. E. Rudd, and T. Agostini, “The phantom illusion,” Vision Res. 117, 49–58 (2015). doi:10.1016/j.visres.2015.10.007
12. A. Gelb, “Die ‘Farbenkonstanz’ der Sehdinge,” in Handbuch der Normal und Pathologische Psychologie, edited by W. A. von Bethe (Springer, Berlin, 1929), Vol. 12, pp. 594–678.
13. A. Gilchrist, Seeing Black and White (Oxford University Press, New York, 2006).
14. A. Gilchrist and J. Cataliotti, “Anchoring of surface lightness with multiple illumination levels,” Investigative Ophthalmology and Visual Science 35, S2165 (1994).
15. A. L. Gilchrist and A. Radonjić, “Anchoring of lightness values by relative luminance and relative area,” J. Vision 9, 13, 1–10 (2009). doi:10.1167/9.9.13
16. A. L. Gilchrist and A. Radonjić, “Functional frameworks of illumination revealed by probe disk technique,” J. Vision 10 (2010). doi:10.1167/10.5.6
17. A. Gilchrist, C. Kossyfidis, F. Bonato, T. Agostini, J. Cataliotti, X. Li, B. Spehar, V. Annan, and E. Economou, “An anchoring theory of lightness perception,” Psychol. Rev. 106, 795–834 (1999). doi:10.1037/0033-295X.106.4.795
18. S. Grossberg, “Filling-in the forms: Surface and boundary interactions in visual cortex,” in Filling-in: From Perceptual Completion to Skill Learning, edited by L. Pessoa and P. De Weerd (Oxford University Press, New York, 2003), pp. 13–37.
19. S. Grossberg and E. Mingolla, “Neural dynamics of form perception: boundary completion, illusory figures, and neon color spreading,” Psychol. Rev. 92, 173–211 (1985).
20. S. Grossberg and D. Todorović, “Neural dynamics of 1-D and 2-D brightness perception: a unified model of classical and recent phenomena,” Percept. Psychophys. 43, 241–277 (1988). doi:10.3758/BF03207869
21. E. G. Heinemann, “Simultaneous brightness induction as a function of inducing- and test-field luminances,” J. Exp. Psychol. 50, 89–96 (1955). doi:10.1037/h0040919
22. E. G. Heinemann, “Simultaneous brightness induction,” in Handbook of Sensory Physiology, edited by D. Jameson and L. Hurvich (Springer, Berlin, 1972), Vol. VII/4, pp. 146–169.
23. D. H. Hubel, Eye, Brain, and Vision (W. H. Freeman, New York, 1988).
24. D. Jameson and L. M. Hurvich, “Complexities of perceived brightness,” Science 133, 174–179 (1961).
25. L. Kardos, “Ding und Schatten [Thing and shadow],” Zeitschrift für Psychologie, Erg bd. 23 (1934).
26. M. Kim, J. M. Gold, and R. F. Murray, “What image features guide lightness perception?,” J. Vision 18, 1–20 (2018). doi:10.1167/18.13.1
27. S. W. Kuffler, “Discharge patterns and functional organization of mammalian retina,” J. Neurophysiol. 16, 37–68 (1953). doi:10.1152/jn.1953.16.1.37
28. E. H. Land, “An alternative technique for the computation of the designator in the retinex theory of color vision,” Proc. Natl. Acad. Sci. USA 83, 3078–3080 (1986). doi:10.1073/pnas.83.10.3078
29. E. H. Land and J. J. McCann, “Lightness and retinex theory,” J. Opt. Soc. Am. 61, 1–11 (1971). doi:10.1364/JOSA.61.000001
30. H. Leibowitz, F. A. Mote, and W. R. Thurlow, “Simultaneous contrast as a function of separation between test and inducing fields,” J. Exp. Psychol. 46, 453–456 (1953). doi:10.1037/h0062595
31. X. Li and A. Gilchrist, “Relative area and relative luminance combine to anchor surface lightness values,” Percept. Psychophys. 61, 771–785 (1999).
32. M. E. McCourt, B. Blakeslee, and D. Cope, “The oriented difference-of-Gaussians model of brightness perception,” J. Electron. Imaging 6, 1–9 (2016).
33. A. H. Munsell, Munsell Book of Color: Defining, Explaining and Illustrating the Fundamental Characteristics of Color (Munsell Color Company, 1929).
34. K. I. Naka and W. A. Rushton, “S-potentials from colour units in the retina of fish (Cyprinidae),” J. Physiol. (London) 185, 536–555 (1966). doi:10.1113/jphysiol.1966.sp008001
35. R. C. Reid and R. Shapley, “Brightness induction by local contrast and the spatial dependence of assimilation,” Vision Res. 28, 115–132 (1988). doi:10.1016/0042-6989(88)90013-2
36. A. K. Romney, R. G. D’Andrade, and T. Indow, “The distribution of response spectra in the lateral geniculate nucleus compared with reflectance spectra of Munsell color chips,” Proc. Natl. Acad. Sci. USA 102, 9720–9725 (2005). doi:10.1073/pnas.0503887102
37. M. E. Rudd, “How attention and contrast gain control interact to regulate lightness contrast and assimilation,” J. Vision 10, 40 (2010).
38. M. E. Rudd, “Edge integration in achromatic color perception and the lightness-darkness asymmetry,” J. Vision 13, 18 (2013). doi:10.1167/13.14.18
39. M. E. Rudd, “A cortical edge-integration model of object-based lightness computation that explains effects of spatial context and individual differences,” Front. Hum. Neurosci. 8, 1–14 (2014). doi:10.3389/fnhum.2014.00640
40. M. E. Rudd, “Retinex-like computations in human lightness perception and their possible realization in visual cortex,” Proc. Imag. Sci. Tech. Intl. Symp. Electr. Imag., IS&T Electronic Imaging: Retinex at 50 2016 (IS&T, Springfield, VA, 2016), pg. Retinex-021.
41. M. E. Rudd, “Lightness computation by the human visual system,” J. Electron. Imaging 26, 031209 (2017). doi:10.1117/1.JEI.26.3.031209
42. M. E. Rudd and D. Popa, “Stevens’ brightness law, contrast gain control, and edge integration in achromatic color perception: A unified model,” J. Opt. Soc. Am. A 24, 2766–2782 (2007). doi:10.1364/JOSAA.24.002766. Errata: J. Opt. Soc. Am. A 24, 3335 (2007).
43. M. E. Rudd and I. K. Zemach, “The highest luminance rule in achromatic color perception: Some counterexamples and an alternative theory,” J. Vision 5, 983–1003 (2005). doi:10.1167/5.11.5
44. M. E. Rudd and I. K. Zemach, “Quantitative studies of achromatic color induction: An edge integration analysis,” Vision Res. 44, 971–981 (2004).
45. M. E. Rudd and I. K. Zemach, “Contrast polarity and edge integration in achromatic color perception,” J. Opt. Soc. Am. A 24, 2134–2156 (2007). doi:10.1364/JOSAA.24.002134
46. R. Shapley and R. C. Reid, “Contrast and assimilation in the perception of brightness,” Proc. Natl. Acad. Sci. USA 82, 5983–5986 (1985). doi:10.1073/pnas.82.17.5983
47. J. C. Stevens, “Brightness inhibition re size of surround,” Percept. Psychophys. 2, 189–192 (1967). doi:10.3758/BF03213048
48. W. A. Stiehl, J. J. McCann, and R. L. Savoy, “Influence of intraocular scattered light on lightness-scaling experiments,” J. Opt. Soc. Am. 73, 1143–1148 (1983).
49. R. von der Heydt, E. Peterhans, and G. Baumgartner, “Illusory contours and cortical neurons,” Science 224, 1260–1262 (1984). doi:10.1126/science.6539501
50. H. Wallach, “Brightness constancy and the nature of achromatic colors,” J. Exp. Psychol. 38, 310–324 (1948). doi:10.1037/h0053804
51. H. Wallach, “The perception of neutral colors,” Scientific American 208, 107–116 (1963). doi:10.1038/scientificamerican0163-107
52. D. Zavagno, V. Annan, and G. Caputo, “The problem of being white: Testing the highest luminance rule,” Vision 16, 149–159 (2004).
53. D. Zavagno, O. Daneyko, and Z. Liu, “The influence of physical illumination on lightness perception in simultaneous contrast displays,” i-Percept. 9, 1–22 (2018). doi:10.1177/2041669518787212
54. I. K. Zemach and M. E. Rudd, “Effects of surround articulation on lightness depend on the spatial arrangement of the articulated region,” J. Opt. Soc. Am. A 24, 1830–1841 (2007). doi:10.1364/JOSAA.24.001830
55. H. Zhou, H. S. Friedman, and R. von der Heydt, “Coding of border ownership in monkey visual cortex,” J. Neurosci. 20, 6594–6611 (2000). doi:10.1523/JNEUROSCI.20-17-06594.2000