IS&T | Library

Abstract

The London Imaging Meeting is a yearly topics-based conference organized by the Society of Imaging Science and technology (IS&T), in collaboration with the Institute of Physics (IOP) and the Royal Photographic Society. This year's topic was "Imaging for Deep Learning". At the heart of our conference were five focal talks given by worldrenowned experts in the field (who then also organised the related sessions). Focal speakers were Dr. Seyed Ali Amirshahi, NTNU, Norway (Image Quality); Prof. Jonas Unger, Linköping University, Sweden (Datasets for Deep Learning); Prof. Simone Bianco, Università degli Studi di Milano-Bicocca, Italy (Color Constancy); Dr. Valentina Donzella, University of Warwick, UK (Imaging Performance); and Dr. Ray Ptucha, Apple Inc. US (Characterization and Optimization). We also had two superb keynote speakers. Thanks to Dr. Robin Jenkin, Nvidia, for his talk on "Camera Metrics for Autonomous Vision" and to Dr. Joyce Farrell, Stanford University, for her talk on "Soft Prototyping Camera Designs for Autonomous Driving". As a new innovation this year—and to support the remit of LIM to reach out to students in the field—we included an invited tutorial research lecture. Given by Prof. Stephen Westland, University of Leeds, the presentation titled "Using Imaging Data for Efficient Colour Design" looked at deep learning techniques in the field of design and demonstrated that simple applications of deep learning can deliver excellent results. There were many strong contenders for the LIM Best Paper Award. Noteworthy, honourable mentions include "Portrait Quality Assessment using Multi-scale CNN", N. Chahine and S. Belkarfa, DXOMARK' "HDR4CV: High dynamic range dataset with adversarial illumination for testing computer vision methods", P. Hanjil et al., University of Cambridge; "Natural Scene Derived Camera Edge Spatial Frequency Response for Autonomous Vision Systems", O. van Zwanenberg et al., University of Westminster; and "Towards a Generic Neural Network Architecture for Approximating Tone Mapping Algorithms", J. McVey and G. Finlayson, University of East Anglia. But, by a unanimous vote, this year's Best Paper was awarded to "Impact of the Windshield's Optical Aberrations on Visual Range Camera-based Classification Tasks Performed by CNNs", C. Krebs, P. Müller, and A. Braun, (Hochschule Düsseldorf) (University of Applied Sciences Düsseldorf), Germany. We thank everyone who helped make LIM a success including the IS&T office, and the LIM presenters, reviewers, focal speakers, and keynotes, as well as the audience, who participated in making the event engaging and vibrant. This year, the conference was run by the IOP and we are extremely grateful for their help in hosting the event. A final special thanks go to the Engineering and Physical Sciences Research Council (EPSRC) who provided funding through the grant EP/S028730/1. Finally, we are pleased to announce that next year's LIM conference will be in the area of "Displays"; the conference chair is Dr. Rafal Mantiuk, University of Cambridge. —Prof. Graham Finlayson, LIM series chair, and Prof. Sophie Triantaphillidou, LIM2021 conference chair

Digital Library: LIM

Published Online: September 2021

London Imaging Meeting 2021: Imaging for Deep Learning

78 29

Page 0, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-A

Volume 2

Issue 1

Digital Library: LIM

Published Online: September 2021

Portrait Quality Assessment using Multi-Scale CNN

172 54

Image quality assessment
CNN
Portraits

Chahine Nicolas, Belkarfa Salim

Pages 5 - 10, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-5

Volume 2

Issue 1

In this paper, we propose a novel and standardized approach to the problem of camera-quality assessment on portrait scenes. Our goal is to evaluate the capacity of smartphone front cameras to preserve texture details on faces. We introduce a new portrait setup and an automated texture measurement. The setup includes two custom-built lifelike mannequin heads, shot in a controlled lab environment. The automated texture measurement includes a Region-of-interest (ROI) detection and a deep neural network. To this aim, we create a realistic mannequins database, which contains images from different cameras, shot in several lighting conditions. The ground-truth is based on a novel pairwise comparison technology where the scores are generated in terms of Just-Noticeable-differences (JND). In terms of methodology, we propose a Multi-Scale CNN architecture with random crop augmentation, to overcome overfitting and to get a low-level feature extraction. We validate our approach by comparing its performance with several baselines inspired by the Image Quality Assessment (IQA) literature.

Digital Library: LIM

Published Online: September 2021

Modeling image aesthetics through aesthetics-related attributes

99 24

Image aesthetics
convolutional neural networks
SVM

Marco Leonardi, Paolo Napoletano, Alessandro Rozza, Raimondo Schettini

Pages 11 - 15, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-11

Volume 2

Issue 1

Automatic assessment of image aesthetics is a challenging task for the computer vision community that has a wide range of applications. The most promising state-of-the-art approaches are based on deep learning methods that jointly predict aesthetics-related attributes and aesthetics score. In this article, we propose a method that learns the aesthetics score on the basis of the prediction of aesthetics-related attributes. To this end, we extract a multi-level spatially pooled (MLSP) features set from a pretrained ImageNet network and then these features are used to train a Multi Layer Perceptron (MLP) to predict image aesthetics-related attributes. A Support Vector Regression machine (SVR) is finally used to estimate the image aesthetics score starting from the aesthetics-related attributes. Experimental results on the ”Aesthetics with Attributes Database” (AADB) demonstrate the effectiveness of our approach that outperforms the state of the art of about 5.5% in terms of Spearman’s Rankorder Correlation Coefficient (SROCC).

Digital Library: LIM

Published Online: September 2021

Generative inter-class transformations for imbalanced data weather classification

50 9

deep learning
image classification
generative models

Apostolia Tsirikoglou, Marcus Gladh, Daniel Sahlin, Gabriel Eilertsen, Jonas Unger

Pages 16 - 20, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-16

Volume 2

Issue 1

This paper presents an evaluation of how data augmentation and inter-class transformations can be used to synthesize training data in low-data scenarios for single-image weather classification. In such scenarios, augmentations is a critical component, but there is a limit to how much improvements can be gained using classical augmentation strategies. Generative adversarial networks (GAN) have been demonstrated to generate impressive results, and have also been successful as a tool for data augmentation, but mostly for images of limited diversity, such as in medical applications. We investigate the possibilities in using generative augmentations for balancing a small weather classification dataset, where one class has a reduced number of images. We compare intra-class augmentations by means of classical transformations as well as noise-to-image GANs, to interclass augmentations where images from another class are transformed to the underrepresented class. The results show that it is possible to take advantage of GANs for inter-class augmentations to balance a small dataset for weather classification. This opens up for future work on GAN-based augmentations in scenarios where data is both diverse and scarce.

Digital Library: LIM

Published Online: September 2021

Visual Scan-Path based Data-Augmentation for CNN-based 360-degree Image Quality Assessment

82 10

CNN
Image Quality Assessment
360-degree
Scan-path

Abderrezzaq Sendjasni, Mohamed-Chaker Larabi, Faouzi Alaya Cheikh

Pages 21 - 26, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-21

Volume 2

Issue 1

360-degree Image quality assessment (IQA) is facing the major challenge of lack of ground-truth databases. This problem is accentuated for deep learning based approaches where the performances are as good as the available data. In this context, only two databases are used to train and validate deep learning-based IQA models. To compensate this lack, a dataaugmentation technique is investigated in this paper. We use visual scan-path to increase the learning examples from existing training data. Multiple scan-paths are predicted to account for the diversity of human observers. These scan-paths are then used to select viewports from the spherical representation. The results of the data-augmentation training scheme showed an improvement over not using it. We also try to answer the question of using the MOS obtained for the 360-degree image as the quality anchor for the whole set of extracted viewports in comparison to 2D blind quality metrics. The comparison showed the superiority of using the MOS when adopting a patch-based learning.

Digital Library: LIM

Published Online: September 2021

Joint Unsupervised Infrared-RGB Video Registration and Fusion

99 21

deep learning
image registration
image fusion

Imad Eddine Marouf, Luca Barras, Hakki Can Karaimer, Sabine Süsstrunk

Pages 38 - 42, September 2021, © Society for Imaging Science and Technology 2021

DOI

10.2352/issn.2694-118X.2021.LIM-38

Volume 2

Issue 1

We present a system to perform joint registration and fusion for RGB and Infrared (IR) video pairs. While RGB is related to human perception, IR is associated with heat. However, IR images often lack contour and texture information. The goal with the fusion of the visible and IR images is to obtain more information from them. This requires two completely matched images. However, classical methods assuming ideal imaging conditions fail to achieve satisfactory performance in actual cases. From the data-dependent modeling point of view, labeling the dataset is costly and impractical.In this context, we present a framework that tackles two challenging tasks. First, a video registration procedure that aims to align IR and RGB videos. Second, a fusion method brings all the essential information from the two video modalities to a single video. We evaluate our approach on a challenging dataset of RGB and IR video pairs collected for firefighters to handle their tasks effectively in challenging visibility conditions such as heavy smoke after a fire, see our project page.

Digital Library: LIM

Published Online: September 2021

Content Fidelity of Deep Learning Methods for Clipping and Over-exposure Correction

117 22

over-exposure
deep learning
GANs
HDR

Mekides Assefa Abebe

DOI

10.2352/issn.2694-118X.2021.LIM-43

Volume 2

Issue 1

Exposure problems, due to standard camera sensor limitations, often lead to image quality degradations such as loss of details and change in color appearance. The quality degradations further hiders the performances of imaging and computer vision applications. Therefore, the reconstruction and enhancement of uderand over-exposed images is essential for various applications. Accordingly, an increasing number of conventional and deep learning reconstruction approaches have been introduced in recent years. Most conventional methods follow color imaging pipeline, which strongly emphasize on the reconstructed color and content accuracy. The deep learning (DL) approaches have conversely shown stronger capability on recovering lost details. However, the design of most DL architectures and objective functions don’t take color fidelity into consideration and, hence, the analysis of existing DL methods with respect to color and content fidelity will be pertinent. Accordingly, this work presents performance evaluation and results of recent DL based overexposure reconstruction solutions. For the evaluation, various datasets from related research domains were merged and two generative adversarial networks (GAN) based models were additionally adopted for tone mapping application scenario. Overall results show various limitations, mainly for severely over-exposed contents, and a promising potential for DL approaches, GAN, to reconstruct details and appearance.

Digital Library: LIM

Published Online: September 2021

On the Semantic Dependency of Video Quality Assessment Methods

45 4

video quality assessment
no-reference quality assessment
scene-type dependency

Mirko Agarla, Luigi Celona

DOI

10.2352/issn.2694-118X.2021.LIM-49

Volume 2

Issue 1

Blind assessment of video quality is a widely covered topic in computer vision. In this work, we perform an analysis of how much the effectiveness of some of the current No-Reference VQA (NR-VQA) methods varies with respect to specific types of scenes. To this end, we automatically annotated the videos from two video quality datasets with user-generated videos whose content is unknown and then estimated the correlation for the different categories of scenes. The results of the analysis highlight that the prediction errors are not equally distributed among the different categories of scenes and indirectly suggest what next generation NR-VQA methods should take into account and model.

Digital Library: LIM

Published Online: September 2021

Camera Colour Correction using Neural Networks

168 61

Colour constancy
discounting illuminant
neural network
hyperspectral imaging

Lindsay MacDonald, Katarina Mayer

DOI

10.2352/issn.2694-118X.2021.LIM-54

Volume 2

Issue 1