We live in a visual world. The perceived quality of images is of crucial importance in industrial, medical, and entertainment application environments. Developments in camera sensors, image processing, 3D imaging, display technology, and digital printing are enabling new or enhanced possibilities for creating and conveying visual content that informs or entertains. Wireless networks and mobile devices expand the ways to share imagery, and autonomous vehicles bring image processing into new aspects of society. The power of imaging rests directly on the visual quality of the images and the performance of the systems that produce them. As the images are generally intended to be viewed by humans, a deep understanding of human visual perception is key to the effective assessment of image quality.
Smartphones are now equipped with high-resolution mobile camera modules of 100 million pixels or more, and even higher-resolution modules are expected in the future. However, fitting more pixels into a limited space requires reducing the pixel size. Where 1.0 um pixel sensors were once the mainstream, 0.64 um pixel sensors have now been developed, and sensors with even smaller pixels will follow. This trend faces technical limitations. In terms of sensor image quality, as the pixel size shrinks, the amount of light each pixel receives decreases, and image quality degrades in terms of noise. To overcome this limitation, high-sensitivity sensors are being developed in various ways. One approach is the image sensor using CMY color filter technology. A CMY color filter has higher sensitivity than RGB, making it advantageous for developing high-sensitivity sensors. In this paper, we introduce a method to evaluate the image quality of CMOS image sensors equipped with CMY color filters in mobile devices.
Recent advances in capture technologies have increased the production of 3D content in the form of Point Clouds (PCs). The perceived quality of such data can be impacted by typical processing steps including acquisition, compression, transmission, and visualization. In this paper, we propose a learning-based method that efficiently predicts the quality of distorted PCs through a set of features extracted from the reference PC and its degraded version. The quality index is obtained by combining the considered features using a Support Vector Regression (SVR) model. The performance contribution of each considered feature and their combination are compared. We then discuss the experimental results obtained in the context of state-of-the-art methods using two publicly available datasets. We also evaluate the ability of our method to generalize to unknown PCs through a cross-dataset evaluation. The results show the relevance of introducing a learning step to merge features for the quality assessment of such data.
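As an illustration of the general approach described in this abstract (not the authors' implementation), the sketch below fits an SVR model on per-point-cloud features to predict a quality score; the feature values, score range, and data shapes are hypothetical placeholders.

```python
# Minimal sketch: regressing subjective quality from full-reference PC features with SVR.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# X: one row per distorted point cloud, columns = features extracted from the
# reference PC and its degraded version (e.g., geometry/color statistics).
# y: subjective quality scores (MOS) used as regression targets.
X_train = np.random.rand(100, 8)   # placeholder feature matrix
y_train = np.random.rand(100) * 5  # placeholder MOS values in [0, 5]

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
model.fit(X_train, y_train)

X_test = np.random.rand(10, 8)     # features of unseen distorted PCs
predicted_quality = model.predict(X_test)
```

A cross-dataset evaluation, as reported in the abstract, would train the model on one dataset and call predict on features from another.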
In the last few years, the popularity of immersive applications has increased substantially with the introduction of powerful imaging and display devices. The most popular immersive media are 360-degree videos, which provide the sensation of immersion. Naturally, these videos require significantly more data, which is a challenge for streaming applications. In this work, our goal is to design a perceptually efficient streaming protocol based on edited versions of the original content. More specifically, we propose to use visual attention and semantic analysis to implement an automatic perceptual edition of 360-degree videos and design an efficient Adaptive Bit Rate (ABR) streaming scheme. The proposed scheme takes advantage of the fact that movies are made of a sequence of different shots, separated by cuts. Cuts can be used to attract viewers' attention to important events and objects. In this paper, we report the first stage of this scheme: the content analysis used to select temporal and spatial candidate cuts. For this, we manually selected candidate cuts from a set of 360-degree videos and analyzed the users' quality of experience (QoE). Then, we computed their salient areas and analyzed whether these areas are good candidates for the video cuts.
360-degree image quality assessment using deep neural networks is usually designed using a multi-channel paradigm exploiting possible viewports. This is mainly due to the high resolution of such images and the unavailability of ground truth labels (subjective quality scores) for individual viewports. The multi-channel model is hence trained to predict the score of the whole 360-degree image. However, this comes with a high complexity cost, as multiple neural networks run in parallel. In this paper, a patch-based training approach is proposed instead. To account for the non-uniform quality distribution of a scene, a weighted pooling of patch scores is applied. The weighting relies on natural scene statistics in addition to perceptual properties related to immersive environments.
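To make the pooling step concrete, the following sketch shows one simple way to aggregate patch-level predictions with per-patch weights; the paper's NSS- and perception-based weighting is replaced here by hypothetical weight values, so this is only an assumed structure, not the authors' formulation.

```python
# Minimal sketch: weighted pooling of per-patch quality scores into one image score.
import numpy as np

def pooled_quality(patch_scores, patch_weights):
    """Weighted average of per-patch quality predictions."""
    w = np.asarray(patch_weights, dtype=float)
    w = w / (w.sum() + 1e-8)                  # normalize weights to sum to 1
    return float(np.sum(w * np.asarray(patch_scores)))

# Hypothetical example: patches judged more perceptually relevant get larger weights.
scores  = [3.8, 4.1, 2.5, 3.0]                # per-patch predicted quality
weights = [0.9, 1.2, 0.4, 0.7]                # per-patch importance (placeholder values)
print(pooled_quality(scores, weights))
```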
Thus far, the definition of "Shitsukan" has been given differing meanings and interpretations. As a result, there have been many studies on whether "Shitsukan" can be evaluated quantitatively. In one of our past studies, we carried out texture analysis to classify texture types using the Shitsukan Research Database. As a result, we observed characteristic relationships between contrast and correlation for different texture types. In this paper, we statistically analyze the relationship between the texture analysis used for classifying texture types and the luminance information in the Shitsukan Research Database, which is freely available on the Web. We then discuss the resulting findings on the characteristics of texture types.
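Contrast and correlation of the kind mentioned here are commonly computed from gray-level co-occurrence matrices (GLCMs); assuming that interpretation (the abstract does not state the exact method), a minimal sketch with scikit-image could look like the following, with the image file name as a placeholder.

```python
# Minimal sketch (assumed GLCM-based features): contrast and correlation of a texture image.
import numpy as np
from skimage import io, color, img_as_ubyte
from skimage.feature import graycomatrix, graycoprops

image = img_as_ubyte(color.rgb2gray(io.imread("texture_sample.png")))  # placeholder file

# Co-occurrence matrix at distance 1, averaged over four orientations.
glcm = graycomatrix(image, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

contrast = graycoprops(glcm, "contrast").mean()
correlation = graycoprops(glcm, "correlation").mean()
print(f"contrast={contrast:.2f}, correlation={correlation:.3f}")
```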
High Dynamic Range (HDR) videos attract industry and consumer markets thanks to their ability to reproduce wider color gamuts, higher luminance ranges, and higher contrast. While the cinema and broadcast industries traditionally go through a manual mastering step on calibrated color grading hardware, consumer cameras capable of HDR video capture without user intervention are now available. The aim of this article is to review the challenges found in evaluating cameras capturing and encoding videos in an HDR format, and to improve existing measurement protocols to objectively quantify the video quality produced by those systems. These protocols study adaptation to static and dynamic HDR scenes with illuminant changes, as well as the general consistency and readability of the scene's dynamic range. An experimental study was conducted to compare the performance of HDR video capture with Standard Dynamic Range (SDR) video capture, where significant differences are observed, often with scene-specific content adaptation similar to the human visual system.
Cameras, especially cameraphones, use a wide range of technologies, such as multi-frame stacking and local tone mapping, to capture and render scenes with high dynamic range. The ISO-defined charts for OECF estimation and visual noise measurement are not really designed for these specific use cases, especially when no manual control of the camera is available. Moreover, these charts are limited to one measurement. We developed a versatile laboratory setup to evaluate image quality attributes such as autofocus, exposure, and detail preservation. It is tested in various lighting conditions, with dynamic ranges of up to 7 EV difference within the scene, under different illuminants. The latest visual noise measurements proposed by IEEE P1858 or ISO 15739 do not give fully satisfactory results on our laboratory scene, due to differences in the chart, framing, and lighting conditions used. We performed subjective visual experiments to build a quality ruler of noisy grey patches and used it as a dataset to develop and validate an improved version of a visual noise measurement. In the experiments we also studied the impact of different environmental conditions of the grey patches to assess their relevance to our algorithm. Our new visual noise measurement uses a luminance sensitivity function multiplied by the square root of the weighted sum of the variances of the Lab coordinates of the patches. A non-linear JND scaling is applied afterwards to obtain a visual noise measurement in units of JND of noisiness.
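The structure of the measurement described in this abstract can be sketched as below; the channel weights, luminance sensitivity value, and the particular non-linearity used for JND scaling are hypothetical placeholders rather than the values proposed in the paper.

```python
# Minimal sketch: luminance sensitivity x sqrt(weighted sum of Lab variances),
# followed by a placeholder non-linear JND scaling.
import numpy as np

def visual_noise(patch_lab, luminance_sensitivity,
                 weights=(1.0, 0.6, 0.3), jnd_gain=5.0):
    """patch_lab: (N, 3) array of per-pixel L*, a*, b* values for one grey patch."""
    var_L, var_a, var_b = np.var(patch_lab, axis=0)
    weighted_var = weights[0] * var_L + weights[1] * var_a + weights[2] * var_b
    raw = luminance_sensitivity * np.sqrt(weighted_var)
    return jnd_gain * np.log1p(raw)   # placeholder non-linear mapping to JND units

# Example with synthetic noisy patch data (mid-grey patch, L* around 50).
rng = np.random.default_rng(0)
patch = np.column_stack([50 + rng.normal(0, 2, 1000),
                         rng.normal(0, 1, 1000),
                         rng.normal(0, 1, 1000)])
print(visual_noise(patch, luminance_sensitivity=1.2))
```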
Image enhancement is important in different application areas such as medical imaging, computer graphics, and military applications. In this paper, we introduce a dataset of enhanced images. The images were enhanced by five end users, and the enhanced images were evaluated by observers in an online image quality experiment. The enhancement steps taken by the end users and the subjective results are analysed in detail. Furthermore, 38 image quality metrics have been evaluated on the introduced dataset to reveal their suitability for measuring image enhancement. The results show that the image quality metrics have low to average performance on the new dataset.
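Metric performance on such a dataset is typically reported as the correlation between objective scores and subjective judgments; the sketch below shows that computation with placeholder scores (the actual metrics, values, and correlation measures used in the paper are not given in the abstract).

```python
# Minimal sketch: correlating an objective metric with subjective scores (PLCC/SROCC).
import numpy as np
from scipy.stats import pearsonr, spearmanr

metric_scores = np.array([0.71, 0.64, 0.82, 0.55, 0.90, 0.60])      # objective (placeholder)
subjective_scores = np.array([3.2, 2.8, 4.1, 2.5, 4.6, 3.0])        # MOS (placeholder)

plcc, _ = pearsonr(metric_scores, subjective_scores)
srocc, _ = spearmanr(metric_scores, subjective_scores)
print(f"PLCC={plcc:.3f}, SROCC={srocc:.3f}")
```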
Video conferencing has become extremely relevant in recent years. Traditional image and video quality evaluation techniques prove insufficient to properly assess the quality of these systems, since they often include special processing pipelines, for example, to improve face rendering. Our team proposes a suite of equipment, laboratory scenes, and measurements that includes realistic mannequins to simulate a more true-to-life scene, while still being able to reliably measure image quality in terms of exposure, dynamic range, color and skin tone rendering, focus, texture, and noise. These metrics are used to evaluate and compare three categories of video conferencing cameras available on the market: external webcams, laptop-integrated webcams, and the selfie cameras of mobile devices. Our results show that external webcams provide a real image quality advantage over most built-in webcams in laptops but cannot match the superior image quality of tablet and smartphone selfie cameras. Our results are consistent with perceptual evaluation and allow for an objective comparison of very different systems.