A Psychological Study: Importance of Contrast and Luminance in Color to Grayscale Mapping

Grayscale images are essential in image processing and computer vision tasks. They effectively emphasize luminance and contrast, highlighting important visual features, while also being easily compatible with other algorithms. Moreover, their simplified representation makes them efficient for storage and transmission purposes. While preserving contrast is important for maintaining visual quality, other factors such as preserving information relevant to the specific application or task at hand may be more critical for achieving optimal performance. To evaluate and compare different decolorization algorithms, we designed a psychological experiment. During the experiment, participants were instructed to imagine color images in a hypothetical "colorless world" and select the grayscale image that best resembled their mental visualization. We conducted a comparison between two types of algorithms: (i) perceptual-based simple color space conversion algorithms, and (ii) spatial contrast-based algorithms, including iteration-based methods. Our experimental findings indicate that CIELAB exhibited superior performance on average, providing further evidence for the effectiveness of perception-based decolorization algorithms. On the other hand, the spatial contrast-based algorithms showed relatively poorer performance, possibly due to factors such as DC-offset and artificial contrast generation. However, these algorithms demonstrated shorter selection times. Notably, no single algorithm consistently outperformed the others across all test images. In this paper, we delve into a comprehensive discussion on the significance of contrast and luminance in color-to-grayscale mapping based on our experimental results and analysis.


Introduction
Grayscale images are widely used in various applications such as cost-efficient printing, medical displays, and e-ink devices. Many image processing algorithms, originally designed for color, are used with grayscale images to reduce memory and time complexity. However, this transformation leads to a loss of information and, as a result, can reduce the range of the final resulting images compared to the original. Decolorization methods compute grayscale images using luminance values from the source pixels [1, 2]. Figure 1 presents a few decolorization examples from which we can learn their capabilities and limitations. The WCCD algorithm (warm-cool color-based decolorization) [3] is showcased in the middle row of Fig. 1, demonstrating its effectiveness in maintaining perceptual qualities. In contrast, (a) Kim et al. [4] exhibit a loss of chromatic contrast, (b) Nafchi et al. [5] mishandle variations in luminance, (c) CIELAB fails with isoluminant colors, and (d) Lu et al. [6] struggle with specific color combinations in the bottom row. Recent algorithms promise detailed grayscale output [2]. However, they require more resources and time due to iterative processes, making them unsuitable for practical real-time applications such as video and printing. Decolorization algorithms aim to preserve color contrast in grayscale images, but this does not always optimize their performance in other image processing applications. Kanan and Cottrell studied the effect of grayscale conversion on image recognition and concluded that recognition performance is significantly affected by the underlying grayscales [7].
In this paper, we present the results of a comprehensive psychological experiment evaluating six distinct decolorization algorithms. Our evaluation includes three perception-based preprocessing methods (CIELAB, YCbCr, and WCCD [3]) and three spatial contrast-based algorithms [6, 5, 8]. Our study underscores the pivotal roles played by contrast and luminance in the generation of grayscale images. Contrast and luminance preservation are crucial for maintaining visual quality when transitioning from color to grayscale. We chose not to include learning-based color-to-grayscale algorithms due to their black-box nature, which makes it challenging to understand the decision-making process behind color-to-gray mapping. Our findings reveal that perceptual algorithms excel in scenarios with limited color variance, where high color variance indicates a wide range of colors in the source image, whereas low color variance implies a more restricted color palette. Decolorization algorithms need to handle these variations appropriately to generate visually pleasing grayscale representations. Moreover, these perceptual methods offer simplicity, minimal parameter settings, and adaptability, making them valuable for video applications. This paper bridges the gap between perceptual quality and grayscale conversion techniques, enhancing our understanding of image processing applications.

Related Works
Mapping 3D color information onto a 1D grayscale image while preserving the original appearance, contrast, and finest details is a non-trivial task, as depicted in Fig. 1. There are several established methods for converting a color image into a grayscale one, and they can be categorized based on their mapping techniques (global and local methods), color space (RGB, LAB), computational complexity, and optimization methods, as reported by Akbulut et al. [1]. While our primary focus in this work is on non-neural-network-based methods due to concerns about their black-box nature, it is worth noting that there have been notable efforts, such as [10] and Lin et al.'s use of partial differential equations (PDEs) for mapping color to gray [11]. However, they often face challenges related to control, computational complexity, and cost. These learning-based methods aim to generate high-quality grayscale images but come with their own set of limitations.
Luminance is fundamental to the human visual system, and several studies in the past have established that our visual acuity depends on changes in luminance [12]. Luminance also plays a significant role in perceiving a scene as natural [13]. Choi et al. define naturalness as the level of resemblance between the photograph and the memories of real-life scenery [14]. Additionally, Jiang et al. define naturalness as an image attribute based on the differences between normally exposed images and over-exposed or under-exposed images. To calculate their proposed tone-mapped image quality metric, they compute the naturalness value based on luminance and yellow values [13]. The luminance of an image can be measured from the pixel intensity, and in our experiment we use the simple mean of the image, Ī = (1/MN) Σ_{i,j} I_{i,j}, as given by Eq. 1.
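As an illustrative sketch (not the exact implementation used in our experiment), the mean-luminance measure of Eq. 1 can be computed as follows; the function name and the sample patch are our own, and NumPy is assumed:

```python
import numpy as np

def mean_luminance(image):
    """Mean pixel intensity of a grayscale image (Eq. 1):
    the average of I[i, j] over all M x N pixels."""
    image = np.asarray(image, dtype=np.float64)
    return image.mean()

# Hypothetical 2x2 grayscale patch with intensities in 0..255.
patch = np.array([[10, 20],
                  [30, 40]])
print(mean_luminance(patch))  # 25.0
```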
Additionally, it's important to consider video decolorization, a related field with practical applications. Several video decolorization methods have been proposed in the past [15, 16, 17]. Tao et al. [17] introduced three different decolorization strategies (low-proximity, median-proximity, and high-proximity) with varying computational costs and decolorization qualities. These methods address the unique challenges of converting video sequences into grayscale while preserving important visual information and offer insights into the complexities of real-time applications, which can be relevant to the field of image decolorization. This comprehensive overview of related works sets the stage for our discussion on contrast, luminance, and their significance in grayscale image generation.

Psychological Experiment, Analysis and Results
We conducted a thorough psychological experiment and analyzed the results. The experiment provided valuable insights into the performance and effectiveness of different decolorization algorithms. We believe the findings can guide readers in selecting appropriate algorithms for specific applications and contribute to the advancement of color-to-grayscale techniques.

Setup, materials & design
We designed a MATLAB R2020b and Psychtoolbox (3.0.17)-based paired-comparison graphical user interface (GUI) on an Ubuntu platform. The GUI was used to display images and record participants' observations (image selection and timestamp). Our test images were loaded into computer memory, and randomly generated pairs were displayed. The experiments were performed in a darkened room. The display monitor employed in our experiment was the EIZO EV3285 (3840 × 2160), and the viewing distance was fixed at 50 cm (viewing angle ≈ 34.92°). The monitor was calibrated with a gamma setting of 2.2 and a white point of 6500 K. Additionally, the brightness was set to 120. All the images used in our study were encoded in the RGB color space. In order to avoid any artifacts due to head movements, we gently stabilized the participant's head position using a chin and forehead rest, as shown in Fig. 2. The 53 test images for our experiments were selected from references [6], [18] and royalty-free images from the internet. We recruited 54 naive observers (26 females, 28 males) aged between 22 and 50 who had normal or corrected-to-normal 20/20 vision from our university faculty, staff, graduate, and undergraduate student community. The average time required to complete the test was about 21 minutes, and all the participants were remunerated for their time as per the university regulations. The only instruction we gave to the participants was to mentally visualize the presented color image in a "colorless world" and pick the one of the six grayscale images (CIELAB, YCbCr, WCCD [3], Lu et al. [6], Nafchi et al. [5], Liu et al. [8]) that resembles it the most.

Methodology and Hypothesis Testing
Our experiment was conducted in accordance with our university's Code of Ethics regulations. The participants gave informed, written consent to contribute to the experiment. A pretest examination was carried out to check for color blindness among the participants using the Ishihara test, and one participant was excluded from the experiment due to color vision deficiency. During the psychological test, when a participant chose a grayscale image it was scored 1, and the other five images were scored 0. Participants compared a color test image (at the center) with the six decolorized images (surrounding it), each of which was processed by a different algorithm. The participant response data were stored as a MATLAB struct with fields recording the participant number, the observation table (53 × 12), and the date and time of the experiment. The order and location of the grayscales were randomized to avoid any location-based bias, and this information was stored in our observation table along with the user response time.
In our paper, we have chosen each algorithm for its distinct approach to color-to-grayscale conversion. CIELAB employs the CIELAB color space, focusing on perceptual uniformity and color-based conversion. YCbCr (MATLAB: rgb2ycbcr) utilizes a color-space-centric approach and is commonly used in image and video processing. WCCD utilizes a customizable luminance calculation through adjustable bias parameters based on warm-cool colors [3]. Lu et al. [6] delve into contrast-preserving decolorization with an emphasis on global mapping methods. Nafchi et al. [5] introduce a unique correlation-based approach, considering both color contrast and correlation. Lastly, Liu et al. [8] adopt log-Euclidean metrics to preserve contrast during decolorization. These chosen methods collectively provide a comprehensive range of grayscale conversion techniques, facilitating a thorough evaluation across various applications, including perceptual quality assessment and video analysis.
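To make the two simple color-space conversions concrete, the following sketch computes a BT.601 luma (the Y channel underlying a YCbCr-style mapping) and a CIELAB lightness L*, assuming sRGB input with a D65 white point. This is an illustrative approximation, not the exact MATLAB rgb2ycbcr pipeline (which additionally applies limited-range scaling and offsets) and not the WCCD method; the function names are our own:

```python
import numpy as np

def srgb_to_linear(c):
    """Invert the sRGB transfer function (input in [0, 1])."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def gray_ycbcr(rgb):
    """BT.601 luma: the Y channel of a YCbCr-style conversion.
    rgb: array of shape (..., 3) with values in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def gray_cielab(rgb):
    """CIELAB lightness L*, rescaled to [0, 1], computed from the
    relative luminance Y (sRGB primaries, D65 white assumed)."""
    lin = srgb_to_linear(rgb)
    y = 0.2126 * lin[..., 0] + 0.7152 * lin[..., 1] + 0.0722 * lin[..., 2]
    delta = 6.0 / 29.0
    f = np.where(y > delta ** 3, np.cbrt(y), y / (3 * delta ** 2) + 4.0 / 29.0)
    return (116.0 * f - 16.0) / 100.0

# Pure sRGB red maps to different grays under the two schemes.
red = np.array([1.0, 0.0, 0.0])
print(gray_ycbcr(red))   # 0.299
print(gray_cielab(red))  # ~0.532 (L* of sRGB red is about 53.2)
```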
We recorded 17,172 participant responses as 54 observers evaluated 53 images and their corresponding six grayscales. We analyzed the experimental data for all the color images individually and performed a χ² goodness-of-fit test [19] to show that the participants' choices were not random (i.e., all six grayscale images being equally likely to be chosen). The null hypothesis (H0) of this test was that "all grayscales were preferred equally", a scenario in which the participants cannot tell the difference between grayscales and/or the images are all equally good. The alternative hypothesis (H1) was that "all grayscales are not preferred equally", i.e., the choice distribution depends on the perceptual quality of the grayscale image. For example, if the CIELAB grayscale transformation is consistently favored for a given test image, while the YCbCr transformation is never chosen, it provides a hint regarding the characteristics of an "effective" color-to-grayscale conversion.
In Fig. 3, we present two test images and their grayscale conversions along with the computed χ² test statistics. The test statistic is computed using Eq. 2, χ² = Σ_i (O_i − E_i)² / E_i, where the observed frequencies (O_i) represent user choices, and the expected frequencies (E_i) indicate equal preference for all grayscale algorithms.
For instance, in the case of the red flower image (Fig. 3a), we observed χ² = 26.77, rejecting the null hypothesis. This indicates that the grayscales were not equally preferred, with a significant preference for one or more algorithms. The degrees of freedom (df) for this test is 5 (i.e., k − 1, where k is the number of grayscale algorithms). To determine whether the χ² values were statistically significant, we compared them with the critical χ² values at the 95th percentile (χ²_.05) and the 99th percentile (χ²_.01) for df = 5. Since the calculated χ² exceeded both χ²_.05 and χ²_.01, we rejected the null hypothesis at both the p < .05 and p < .01 levels, indicating that participant choices were not random but influenced by grayscale quality. However, one image (Fig. 3f) yielded χ² = 2.33, for which we failed to reject the null hypothesis. We attribute this result to the image's complex scene content and a lack of clear cues, making it challenging for participants to distinguish grayscale quality effectively.
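The goodness-of-fit computation above can be sketched as follows. The observed counts here are hypothetical (not data from our experiment); the critical values 11.070 and 15.086 are the standard χ² quantiles at df = 5:

```python
import numpy as np

# Hypothetical choice counts for one test image: 54 observers
# distributed over the six grayscale algorithms.
observed = np.array([20, 10, 9, 6, 5, 4])
expected = np.full(6, observed.sum() / 6)  # equal preference under H0

# Pearson's chi-squared statistic (Eq. 2).
chi2 = ((observed - expected) ** 2 / expected).sum()

# Critical values for df = k - 1 = 5.
chi2_crit_05 = 11.070   # 95th percentile
chi2_crit_01 = 15.086   # 99th percentile

print(round(chi2, 2), chi2 > chi2_crit_01)  # 19.11 True
```

Here the statistic exceeds both critical values, so H0 (equal preference) would be rejected at p < .05 and p < .01 for this hypothetical image.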

Results & discussion
Several methods have been proposed to measure contrast, beginning with a simple ratio between the maximum and minimum luminance of the scene; the choice among other metrics (such as Weber contrast, Michelson contrast, and RMS contrast) depends on the end application. In this study we use RMS contrast, which measures the standard deviation of pixel intensities and does not depend on the spatial frequency content [20]. For an M × N image, the RMS contrast is computed using Eq. 3,

C_RMS = sqrt( (1/MN) Σ_{i,j} (I_{i,j} − Ī)² ),   (3)

where I_{i,j} is the (i, j)-th pixel intensity and Ī is the average intensity of the image.

We recorded the participants' response (reaction) time, measuring the time it took for the observer to make a selection after the stimulus was presented. Upon arranging their response times in ascending order, we observed an interesting pattern, as shown in Fig. 4. We can clearly see that participants who made quick choices preferred grayscale images generated by spatial contrast-based algorithms. These algorithms employ techniques to enhance contrast, which can be lost during the dimensionality reduction process (color to gray). As a result, the grayscale images produced by these algorithms often exhibit high contrast, well-defined edges, and distinct textures, thereby improving visual salience. This relationship between contrast and salience has been studied by Cheng et al. [21]. The presence of good structures and edges in an image can improve the perceptual organization of visual objects, as studied by Elder and Zucker [22]. Furthermore, studies by Itti et al. [23] have reported that images with enhanced contrast, salient regions, and easily recognizable objects can lead to faster recognition. Our findings align with these studies and suggest that spatial contrast-based decolorization algorithms, emphasizing contrast enhancement, lead to quicker decision-making. Although contrast can be lost in the color-to-gray reduction, these algorithms compensate for this loss by introducing artificial contrast, as demonstrated in Fig. 5. This figure illustrates the relative contrast difference between input and output with respect to user selection. The selection ratio curve represents the average rate across all images at the current contrast level. Since we used six algorithms, the average selection ratio is 1/6.
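The RMS contrast of Eq. 3 can be sketched as follows (NumPy assumed; the helper name and the two test patterns are illustrative):

```python
import numpy as np

def rms_contrast(image):
    """RMS contrast (Eq. 3): the standard deviation of pixel
    intensities around the image mean."""
    image = np.asarray(image, dtype=np.float64)
    return float(np.sqrt(((image - image.mean()) ** 2).mean()))

# A flat image has zero contrast; a 0/1 checkerboard has the
# maximal RMS contrast for its intensity range.
flat = np.full((4, 4), 0.5)
checker = np.indices((4, 4)).sum(axis=0) % 2  # alternating 0/1 pattern
print(rms_contrast(flat))     # 0.0
print(rms_contrast(checker))  # 0.5
```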
Nonetheless, spatial contrast-based algorithms may inadvertently emphasize details to the point of creating an exaggerated effect.In contrast, perceptual algorithms (CIELAB, WCCD, and YCbCr) avoid this by preserving the original contrast information.However, it's important to note that these claims are currently speculative.To investigate this hypothesis further, one potential approach is to filter the image using spatial bands, which would enable us to either confirm or refute this idea easily.
In Fig. 6 we plot the luminance difference between the input and output, together with the user selection rate for reference. From this plot we can observe that the simple color space conversion algorithms retain the original luminance, thereby evoking a natural feel, whereas the spatial contrast-based algorithms perform a DC-offset-removal-like operation. Mapping a light red kite tail on a sky-blue background to darker shades of gray may be considered unnatural or undesirable in terms of preserving the visual appearance of the original scene, as illustrated in Fig. 7(d), (e), and (f). Grayscale images aim to represent the intensity or luminance information of the original colors, and mapping a distinct color combination to a significantly darker shade may result in a loss of information and visual fidelity. This suggests that the decolorization algorithm is not able to accurately capture the relative luminance values of the original colors, and it also indicates a limitation or deficiency in the algorithm's ability to handle specific color combinations or variations in luminance.

Exploring color variance and video decolorization analysis
In the following discussion, we compare the two decolorization approaches, spatial contrast-boost algorithms and perceptual color-space-based decolorization, with a specific focus on color variance and video decolorization. From our analysis, we identify the unique characteristics and implications of each method in addressing these aspects.
Measuring Color Variance: To assess color variance, we compute the standard deviation of color values within an image. This metric evaluates color variation across the image, assuming that changes in color are minimally influenced by shifts in luminance. Our approach involves converting RGB images to the YCbCr color space, which segregates luminance (Y) from chrominance (Cb and Cr). This separation proves especially practical for video applications. In Fig. 8 we plot the color variance versus luminance and observe that the simple color space conversion algorithms are practical candidates for generating grayscales from low-variance images with good luminance information. Perceptual decolorization algorithms, such as CIELAB, WCCD, or YCbCr, offer several benefits for video decolorization: (i) these algorithms are designed to preserve the luminance information from the color image, maintaining the brightness variations and luminance characteristics of the original scene; (ii) additionally, they take into account the color perception characteristics of the human visual system. By considering color spaces that align with human perception, these mappings ensure a consistent and accurate representation of colors in grayscale. This results in visually pleasing, natural-feeling, and coherent videos. Furthermore, characteristics like simplicity, low parameter dependency, and compatibility make them well suited for video applications. On the other hand, spatial contrast-based algorithms prioritize local contrast and edge enhancement, and emphasize objects and details based on their spatial characteristics. These algorithms typically apply a uniform contrast enhancement across the entire image or regions of interest. However, this approach is ineffective in adapting to variations in image content, resulting in undesirable grayscale mappings, as demonstrated in Fig. 9.
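A minimal sketch of this color-variance measure is given below, assuming full-swing BT.601 chroma (MATLAB's rgb2ycbcr additionally applies limited-range offsets and scaling); the function name and sample images are hypothetical:

```python
import numpy as np

def color_variance(rgb):
    """Color variance as the pooled standard deviation of the
    chroma channels after an RGB -> YCbCr-style separation.
    rgb: array of shape (H, W, 3) with values in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Full-swing BT.601 chroma (constant offsets omitted; they do
    # not affect the variance).
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b
    return float(np.sqrt(cb.var() + cr.var()))

# A single-hue image has zero color variance; mixing two distinct
# hues raises it.
flat_red = np.tile([1.0, 0.0, 0.0], (8, 8, 1))
red_blue = flat_red.copy()
red_blue[:, 4:] = [0.0, 0.0, 1.0]
print(color_variance(flat_red))        # 0.0
print(color_variance(red_blue) > 0.1)  # True
```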
Furthermore, this operation may potentially lead to abrupt changes in contrast and introduce flicker artifacts due to localized contrast enhancements resulting in pixel intensity variations, as shown in Fig. 10. However, the actual performance and presence of flickering can vary depending on the specific algorithms, parameters, and characteristics of the input video.

Conclusion
Grayscale images are essential for many image processing applications. Despite the existence of numerous proposed solutions, determining the most suitable one for a specific application remains a challenge, and it often requires a subjective evaluation. In order to address this challenge, we conducted a psychological experiment to compare and evaluate two types of algorithms: (i) simple color space conversion algorithms and (ii) spatial contrast-based algorithms. The results of our experiment demonstrate that, on average, CIELAB performs better, highlighting the need to preserve luminance information to achieve a natural appearance. Interestingly, we observed that individuals with quick response times tend to prefer spatial contrast-based algorithms. However, it is important to note that spatial contrast-boosting algorithms may overemphasize details in the grayscales, which can be perceived as exaggerated. By conducting this psychological experiment and analyzing the data, our findings highlight the significance of preserving luminance information and maintaining appropriate contrast levels in grayscale images to ensure visual quality.

Figure 2. Our psychological experimental setup: (a) Participant preparing for the test. (b) Sample test display from our experimental dataset showing six grayscales surrounding the color image.

Figure 4. Participants with quick response times (in seconds) overwhelmingly preferred spatial contrast-based algorithms. User response times are arranged in ascending order, indicating the speed of decision-making.

Figure 5. This visualization offers insights into user preferences and the distribution of output images based on relative RMS contrast. Spatial contrast-based methods enhance contrast in grayscale images. The left Y1-axis represents the curve plot, while the right Y2-axis represents the bar plot.

Figure 6. This plot illustrates the impact of relative luminance on decolorization methods. Spatial contrast-based methods, as shown here, tend to neglect the significance of luminance and exhibit operations similar to DC-offset removal (see Fig. 7 for more details).

Figure 8. The scatter plot reveals that simple color space conversion algorithms exhibit better luminance preservation in grayscale images with low color variance, compared to spatial contrast-boost algorithms. This finding underscores the importance of considering color variance when selecting the appropriate decolorization algorithm for optimal luminance representation.