Consumer cameras are indispensable tools for communication, content creation, and remote work, but image and video quality can be affected by factors such as lighting, hardware, scene content, face detection, and automatic image processing algorithms. This paper investigates how web and phone camera systems perform in face-present scenes containing diverse skin tones, and how that performance can be objectively measured using standard procedures and analyses. We closely examine image quality factors (IQFs) commonly impacted by scene content, emphasizing automatic white balance (AWB), automatic exposure (AE), and color reproduction according to Valued Camera Experience (VCX) standard procedures. Video tests are conducted for scenes containing standard-compliant mannequin heads and across a novel set of AI-generated faces spanning 10 additional skin tones based on the Monk Skin Tone Scale. Findings indicate that color shifts, exposure errors, and reduced overall image fidelity are common in scenes containing darker skin tones. This reveals a major shortcoming of modern automatic image processing algorithms and highlights the need to test across a more diverse range of skin tones when developing automatic processing pipelines and the standards that evaluate them.
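As a concrete illustration of why AWB is scene-dependent, consider the simple gray-world heuristic. This is not the algorithm used by the cameras tested above, and the patch color below is a hypothetical skin-like RGB value; the sketch only shows how a frame dominated by a strongly colored subject skews the channel-gain estimate:

```python
import numpy as np

def gray_world_gains(img):
    """Gray-world AWB: assume the scene averages to neutral gray and
    return per-channel gains that equalize the channel means."""
    means = img.reshape(-1, 3).mean(axis=0)
    return means.mean() / means

rng = np.random.default_rng(1)
# Neutral scene: estimated gains stay near 1.
neutral = rng.random((32, 32, 3))
# Same scene dominated by a warm, skin-like patch: the estimator
# misreads subject color as a cast and pushes the gains away from 1,
# boosting blue and cutting red.
skin = neutral.copy()
skin[8:24, 8:24] = [0.55, 0.35, 0.25]  # hypothetical skin-tone RGB
print(gray_world_gains(neutral), gray_world_gains(skin))
```

The same mechanism explains why face-present scenes can pull white balance in opposite directions depending on the subject's skin tone.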
High dynamic range (HDR) imaging has greater contrast reproduction capability than standard imaging techniques and can achieve a natural and pleasing appearance in terms of image quality. A tone mapping model (TMOz) is developed for feature enhancement based on the center-surround properties of ganglion cells in the mammalian visual system. The contrast of the HDR image is mapped adaptively to an SDR display range using a global method, followed by contrast enhancement in local regions. A psychophysical experiment was conducted to refine the adaptivity of the model's contrast mapping function. Finally, the performance of TMOz was evaluated using the CIELAB (2:1) color-difference formula together with high-quality reference images. The results showed that TMOz outperformed the other tone mapping operators (TMOs).
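The abstract does not give enough detail to reproduce TMOz itself, but the global-then-local pipeline it describes can be sketched with a generic Reinhard-style global curve followed by a crude center-surround (unsharp-mask) local boost; all parameter values below are illustrative assumptions, not the paper's:

```python
import numpy as np

def global_tone_map(luminance, key=0.18):
    """Generic global tone curve (Reinhard-style), not the paper's TMOz:
    normalize by the log-average luminance, then compress to [0, 1)."""
    eps = 1e-6
    log_avg = np.exp(np.mean(np.log(luminance + eps)))
    scaled = key * luminance / log_avg
    return scaled / (1.0 + scaled)

def local_contrast_boost(mapped, amount=0.3):
    """Crude local enhancement: unsharp masking with a 5x5 box surround,
    loosely echoing a center-surround (DoG-like) operator."""
    k = 5
    pad = np.pad(mapped, k // 2, mode="edge")
    surround = np.zeros_like(mapped)
    for dy in range(k):
        for dx in range(k):
            surround += pad[dy:dy + mapped.shape[0], dx:dx + mapped.shape[1]]
    surround /= k * k
    return np.clip(mapped + amount * (mapped - surround), 0.0, 1.0)

# Toy HDR luminance spanning four orders of magnitude
hdr = np.geomspace(0.01, 100.0, 64).reshape(8, 8)
sdr = local_contrast_boost(global_tone_map(hdr))
```

The global step guarantees the output fits the SDR range; the local step then restores detail contrast that the global compression flattens.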
The edge response of the retinal image is the first step by which human vision recognizes the outside world. A variety of receptive-field models describing the impulse response have been proposed. The question of which model satisfies the uncertainty principle, i.e., minimizes the product (Δx)(Δω) of spatial and spectral spreads, has attracted considerable interest. Among the typical edge-response models, the Gabor function and the second Gaussian derivative (GD2) remain the strongest candidates: D. Marr and R. Young supported GD2, while many vision researchers prefer Gabor. The retinal edge response model is used for image sharpening.

Departing from conventional image sharpening filters, this paper proposes a novel sharpening filter obtained by modifying the Lanczos resampling filter. The Lanczos filter is used in image scaling to resize digital images; ordinarily it interpolates discrete sample points, acting as a kind of smoothing filter rather than a sharpening one. The Lanczos kernel is the product of the sampling sinc function and a scaled sinc function, where the sinc expanded by the scale "s" plays the role of a window function. The author noticed that inverse scaling of the Lanczos window can turn it from a smoothing filter into a sharpening filter.

This paper demonstrates how the proposed model works effectively in comparison with Gabor and GD2.
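The kernel structure described above can be written down directly. The `inverse_scaled_kernel` below is only one plausible reading of "inverse scaling of the Lanczos window", narrowing the window by the scale rather than widening it; it is an illustrative sketch, not the paper's exact filter:

```python
import numpy as np

def lanczos_kernel(x, s=3.0):
    """Standard Lanczos kernel: sinc(x) times the window sinc(x/s),
    supported on |x| < s. np.sinc is the normalized sinc sin(pi x)/(pi x)."""
    x = np.asarray(x, dtype=float)
    out = np.sinc(x) * np.sinc(x / s)
    return np.where(np.abs(x) < s, out, 0.0)

def inverse_scaled_kernel(x, s=3.0):
    """Illustrative variant of the stated idea: use sinc(x * s) instead of
    sinc(x / s), which narrows the main lobe and deepens the negative side
    lobes, giving a sharpening-like response instead of a smoothing one."""
    x = np.asarray(x, dtype=float)
    out = np.sinc(x) * np.sinc(x * s)
    return np.where(np.abs(x) < s, out, 0.0)

xs = np.linspace(-3.0, 3.0, 61)
print(lanczos_kernel(xs)[:5], inverse_scaled_kernel(xs)[:5])
```

Both kernels peak at 1 at the origin; the difference lies entirely in the width of the main lobe and the sign pattern of the side lobes.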
Whiteboards are commonly used as a medium for the instant illustration of ideas during presentations, lectures, meetings, and related activities conducted through videoconferencing systems. However, the acquisition of whiteboard contents is hindered by issues inherent to camera technologies and the whiteboard's glossy surface, along with environmental issues such as room lighting and camera positioning. Whiteboard contents are often barely visible due to low luminance contrast and related color degradation problems. This article presents work aimed at extracting the whiteboard image and enhancing its perceptual quality and legibility. Two methods, based on color balancing and color warping, are introduced to improve the global and local luminance contrast as well as the color saturation of the contents. The methods are implemented on general models of the videoconferencing environment to avoid color shifts and unnatural results. Our evaluations, through psycho-visual experiments, reveal the significance of the proposed methods' improvements over state-of-the-art methods in terms of visual quality and visibility. © 2018 Society for Imaging Science and Technology. [DOI: 10.2352/J.Percept.Imaging.2018.1.1.010504]
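As a toy version of the global color-balancing idea (not the paper's actual method), one can estimate the board's white point from a high per-channel percentile and normalize to it, which lifts the background toward white and boosts stroke contrast; the board colors below are synthetic:

```python
import numpy as np

def whiteboard_balance(img, pct=95):
    """Toy whiteboard color balancing: take a high percentile of each
    channel as the board's white point and normalize to it. Because most
    pixels belong to the board, the percentile lands on the background."""
    white = np.percentile(img.reshape(-1, 3), pct, axis=0)
    return np.clip(img / np.maximum(white, 1e-6), 0.0, 1.0)

# Dim, bluish board with dark marker strokes (synthetic example):
board = np.full((32, 32, 3), [0.60, 0.62, 0.68])
board[10:12, :, :] = 0.1
balanced = whiteboard_balance(board)
```

After balancing, the background sits at white while strokes remain dark, so the luminance contrast between content and board increases without an explicit segmentation step.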
A frequently used method for evaluating camera imaging performance is based on the ISO standard for resolution and spatial frequency response (SFR). This standard, ISO 12233, defines a method based on a straight edge element in a test chart. While the method works as intended, results can be influenced by lens distortion, which introduces curvature into the captured edge feature. We interpret this as a bias (error) introduced into the measurement and describe a method to reduce or eliminate its effect. We use a polynomial edge-fitting method currently being considered for a revised ISO 12233. Evaluation of image distortion is addressed in two more recent standards, ISO 17850 and ISO 19084; applying those distortion measurements complements the SFR analysis discussed here.
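The bias that edge curvature introduces into a straight-line edge fit can be illustrated with synthetic data; the curvature coefficient and noise level below are invented for illustration and are not taken from the standard:

```python
import numpy as np

# Per-row edge positions from a slightly curved (distorted) edge:
rows = np.arange(50, dtype=float)
true_edge = 20.0 + 0.1 * rows + 0.002 * (rows - 25) ** 2  # curvature term
meas = true_edge + 0.05 * np.random.default_rng(0).standard_normal(rows.size)

# Straight-line fit (classic ISO 12233) vs. a polynomial fit
# (the revision idea discussed above):
line = np.polyval(np.polyfit(rows, meas, 1), rows)
poly = np.polyval(np.polyfit(rows, meas, 3), rows)

rms_line = np.sqrt(np.mean((line - true_edge) ** 2))
rms_poly = np.sqrt(np.mean((poly - true_edge) ** 2))
print(rms_line, rms_poly)
```

The straight-line fit cannot follow the curved edge, so its residual feeds directly into the supersampled edge profile and biases the resulting SFR; the polynomial fit absorbs the curvature and leaves only the measurement noise.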
Imaging system performance measures and Image Quality Metrics (IQM) are reviewed from a systems engineering perspective, focusing on the spatial quality of still image capture systems. We classify IQMs broadly as: Computational IQMs (CP-IQM), Multivariate Formalism IQMs (MF-IQM), Image Fidelity Metrics (IF-IQM), and Signal Transfer Visual IQMs (STV-IQM). Comparison of each genre finds STV-IQMs well suited for capture system quality evaluation: they incorporate performance measures relevant to optical systems design, such as Modulation Transfer Function (MTF) and Noise Power Spectrum (NPS), and their bottom-up, modular approach enables system components to be optimized separately. We suggest that correlation between STV-IQMs and observer quality scores is limited by three factors: current MTF and NPS measures do not characterize the scene-dependent performance introduced by imaging system non-linearities; the contrast sensitivity models employed do not account for contextual masking effects; and cognitive factors are not considered. We hypothesize that implementation of scene- and process-dependent MTF (SPD-MTF) and NPS (SPD-NPS) measures should mitigate errors originating from scene-dependent system performance. Further, we propose implementation of contextual contrast detection and discrimination models to better represent low-level visual performance in image quality analysis. Finally, we discuss image quality optimization functions that may potentially close the gap between contrast detection/discrimination and quality.
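As background for the NPS discussion, a minimal periodogram-based noise power spectrum estimate (a textbook estimator, not any specific standard's procedure) can be written as:

```python
import numpy as np

def nps_2d(patch):
    """2-D noise power spectrum of a nominally uniform patch via the
    periodogram. With this normalization, Parseval's theorem ties the
    mean of the spectrum to the variance of the patch."""
    centered = patch - patch.mean()
    return np.abs(np.fft.fft2(centered)) ** 2 / centered.size

rng = np.random.default_rng(0)
white = rng.standard_normal((64, 64))
nps = nps_2d(white)
# For white noise the spectrum is flat on average, and its mean equals
# the patch variance.
print(nps.mean(), white.var())
```

A scene- and process-dependent variant, as hypothesized above, would apply the same estimator to residuals extracted from natural-scene content rather than to a uniform patch.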
Color imaging is such a ubiquitous capability in daily life that a general preference for color over black-and-white images is often simply assumed. However, tactical reconnaissance applications that involve visual detection and identification have historically relied on spatial information alone. In addition, real-time transmission over narrow communication channels often restricts the amount of image data, requiring tradeoffs between spectral and spatial content. For these reasons, an assessment of the discrimination differences between color and monochrome systems is of significant interest for optimizing the visual detection and identification of objects of interest. We demonstrate the amount of visual image "utility" difference provided by color systems through a series of subjective experiments that pair spatially degraded color images with a reference monochrome sample. The quality comparisons show a performance improvement in intelligence value equivalent to that achieved from a spatial improvement of about a factor of two (approximately 1.0 NIIRS). Observers were also asked to perform specific detection tasks with both types of systems, and their performance and confidence results were measured. On average, a 25 percent accuracy improvement and a 30 percent corresponding confidence improvement were measured for the color presentation vs. the same image presented in black-and-white (monochrome).
Image quality is an important aspect of many applications (biometrics, tracking, object detection, and so on), and several methods have been proposed in the literature to estimate it, able to predict subjective judgments according to different image characteristics. The goal of this paper is to present a framework for a full-reference stereoscopic image quality metric based on neural networks (a CNN and an ANN). The proposed CNN model is composed of three convolutional layers and two fully connected (FC) layers and is used to identify the degradation type in the image. The quality is then estimated using an ANN model whose inputs are computed features selected according to the identified degradation type. Results obtained on two common datasets show the relevance of the proposed approach.
When evaluating the noise performance of camera systems, uniform patches in object space are used. This is required because the measurement is based on the assumption that any variation in the digital values can be considered noise. In the presence of adaptive noise removal, this method can lead to misleading results, as it is relatively easy for algorithms to smooth uniform areas of an image. In this paper, we evaluate the possibility of measuring noise on the so-called dead leaves pattern, a random pattern of circles of varying diameter and color. As we measure noise on a non-uniform pattern, we obtain a better description of the true noise performance and a potentially better correlation with the user experience.
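A minimal dead-leaves-style chart generator, assuming a power-law radius distribution (a common choice for such patterns; the exact specification of any standardized chart may differ):

```python
import numpy as np

def dead_leaves(size=128, n_disks=400, rmin=2.0, rmax=20.0, seed=0):
    """Render a simple dead-leaves chart: opaque gray disks with radii
    drawn from a density proportional to r^-3, painted back-to-front so
    later disks occlude earlier ones. A sketch, not a standard's recipe."""
    rng = np.random.default_rng(seed)
    img = np.full((size, size), 0.5)
    yy, xx = np.mgrid[0:size, 0:size]
    for _ in range(n_disks):
        # Inverse-transform sampling of r with density proportional to r^-3
        u = rng.random()
        r = 1.0 / np.sqrt((1.0 - u) / rmin**2 + u / rmax**2)
        cx, cy = rng.random(2) * size
        gray = rng.random()  # disk gray level
        img[(xx - cx) ** 2 + (yy - cy) ** 2 <= r * r] = gray
    return img

chart = dead_leaves()
```

Because the pattern has a known scale-invariant statistical structure rather than uniform content, adaptive noise reduction cannot trivially smooth it away, which is what makes the noise measurement more representative.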