The detection of contaminants in everyday food and drinking water is crucial for global public health. For the detection of the heavy metals mercury (Hg) and arsenic (As), our group has proposed a novel paper-based microfluidic device, integrated with a mobile phone and an image analysis pipeline, to capture and analyze sensor images on-site. Still, detecting lower contamination levels remains challenging due to the small number of available data samples and the large intra-class variance in our application. To overcome this challenge, we explore traditional data augmentation and GAN-based augmentation techniques for synthesizing realistic colorimetric images, and we propose a CNN classifier for five-level contamination classification. The proposed system is trained and evaluated on a limited dataset of 126 phone-captured images spanning the five contamination levels, and it yields 88.1% classification accuracy and 91.92% precision, demonstrating the feasibility of the approach. We believe that training deep learning models on limited detection-image datasets in this way presents a clear path toward phone-based contamination-level detection.
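As a minimal sketch of the pipeline described above (not the authors' code; the transform parameters and layer widths are illustrative assumptions), the traditional-augmentation stage and a small five-class CNN could look like this:

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Traditional augmentation: mostly geometric perturbations, since the class
# label is encoded in color; only a slight brightness jitter is used to mimic
# phone capture variation (all ranges are assumptions).
augment = transforms.Compose([
    transforms.RandomRotation(10),
    transforms.RandomResizedCrop(64, scale=(0.9, 1.0)),
    transforms.ColorJitter(brightness=0.1),
    transforms.ToTensor(),
])

class ContaminationCNN(nn.Module):
    """Small CNN mapping a sensor-spot crop to one of five levels."""
    def __init__(self, num_levels=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_levels)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```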
We study modern deep convolutional neural networks for image denoising, in which RGB input images are transformed into RGB output images by feed-forward convolutional networks trained with a loss defined in the RGB color space. Motivated by the gap between human visual perception and objective evaluation metrics such as PSNR or SSIM, we propose a data augmentation technique and demonstrate that it is equivalent to defining a perceptual loss function. A network trained with this augmentation produces visually pleasing denoised results. We also combine an unsupervised design with a bias-free network to counter the overfitting caused by the absence of clean images, and to improve performance when the noise level exceeds the training range.
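One plausible reading of the augmentation-as-perceptual-loss equivalence (our assumption, not necessarily the authors' exact scheme) is to apply a random near-identity color-mixing transform T identically to the noisy input and clean target: in expectation, the RGB MSE over such transformed pairs behaves like a T-weighted quadratic loss on the original pair.

```python
import torch

def random_color_transform(batch):
    """Apply a random near-identity channel-mixing matrix T to an
    (N, 3, H, W) batch; the 0.1 scale is an illustrative assumption."""
    T = torch.eye(3) + 0.1 * torch.randn(3, 3)
    return torch.einsum('ij,njhw->nihw', T, batch), T

def augmented_mse(model, noisy, clean):
    """RGB MSE on a color-transformed pair; averaging over random T
    approximates a perceptually weighted loss on the untransformed pair."""
    noisy_t, T = random_color_transform(noisy)
    clean_t = torch.einsum('ij,njhw->nihw', T, clean)
    return torch.mean((model(noisy_t) - clean_t) ** 2)
```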
The Magdalena Ridge Observatory Interferometer (MROI) uses Shack-Hartmann wavefront sensing (SH-WFS) in a unique design for the back-end stability of its beam relay systems. SH-WFS, however, is sensitive to scintillation from atmospheric turbulence, which can drastically degrade the precision with which it locates the beam profile it sees. A large number of images is normally needed to average out the turbulence effect. Here we use deep learning as an alternative to long averaging cycles: a CNN was trained to map a small number of initial images from a series of star frames to the average image of the entire series, at different positions of the beam profile. Under typical seeing conditions expected at MROI, the results showed that the network can map 10 input frames to the average of 100 within the permissible error margin of 0.1 pixels, and that it generalizes properly to beam position movements not seen during training. The network also outperforms the averaging technique when both operate on small numbers of input frames, such as 10 or 20.
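A minimal sketch of the frame-averaging surrogate described above (layer widths are assumptions, not the paper's configuration): the CNN takes N short-exposure frames stacked as channels and is trained with MSE against the long-series average.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrameAverager(nn.Module):
    """Maps a stack of n_in SH-WFS frames to a predicted long-average frame."""
    def __init__(self, n_in=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_in, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),   # predicted average frame
        )

    def forward(self, frames):                # frames: (B, n_in, H, W)
        return self.net(frames)

# Training target: the mean of the full 100-frame series, e.g.
# loss = F.mse_loss(model(frames_10), frames_100.mean(dim=1, keepdim=True))
```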
Deep neural networks have been applied to an increasing number of computer vision tasks, demonstrating superior performance. Much research has focused on making deep networks more suitable for efficient hardware implementation, targeting low-power and low-latency real-time applications. In [1], Isikdogan et al. introduced a deep neural network design that provides an effective trade-off between flexibility and hardware efficiency. The proposed solution consists of fixed-topology hardware blocks, with partially frozen/partially trainable weights, that can be configured into a full network. Initial results on a few computer vision tasks were presented in [1]. In this paper, we further evaluate this network design by applying it to several additional computer vision use cases and comparing it to other hardware-friendly networks. The experimental results presented here show that the proposed semi-fixed, semi-frozen design achieves competitive performance on a variety of benchmarks, while maintaining very high hardware efficiency.
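To illustrate the partially frozen/partially trainable idea in general terms (this is not the block design from [1]; the split between frozen and trainable parameters is a hypothetical example), a block might freeze its spatial filter bank while keeping a lightweight channel-mixing layer trainable:

```python
import torch.nn as nn

class SemiFrozenBlock(nn.Module):
    """Fixed-topology block: the 3x3 filter bank is frozen (as if baked into
    hardware), while the 1x1 mixing conv remains trainable per task."""
    def __init__(self, ch):
        super().__init__()
        self.fixed = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.fixed.weight.requires_grad = False   # frozen weights
        self.mix = nn.Conv2d(ch, ch, 1)           # trainable weights
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.mix(self.fixed(x)))
```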
Assessing the quality of images is a challenging task. To this end, images must either be evaluated by a pool of subjects following a well-defined assessment protocol, or an objective quality metric must be defined. In this contribution, an objective metric based on neural networks is proposed. The model accounts for the human visual system by computing a saliency map of the image under test. The system consists of two modules: the first is trained on normalized distorted images and learns features from the original image, the distorted image, and the estimated saliency map; it also produces an estimate of the prediction error. The second module (a non-linear regression module) is trained on the available subjective scores. The performance of the proposed metric has been evaluated on state-of-the-art quality assessment datasets, and the achieved results show the effectiveness of the proposed system in matching the subjective quality scores.
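A minimal sketch of the two-module structure (channel counts and layer sizes are our assumptions): a feature module consuming the reference image, the distorted image, and the saliency map, followed by a non-linear regression module fitted to subjective scores.

```python
import torch
import torch.nn as nn

class FeatureModule(nn.Module):
    """Learns features from the channel-concatenated reference image,
    distorted image, and saliency map (3 + 3 + 1 = 7 channels)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(7, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, ref, dist, sal):    # (B,3,H,W), (B,3,H,W), (B,1,H,W)
        return self.conv(torch.cat([ref, dist, sal], dim=1)).flatten(1)

# Non-linear regression module mapping features to a subjective-score estimate.
regressor = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
```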
In this paper, we propose a novel system for remotely estimating a person's respiration rate. Periodic inhalation and exhalation during respiration cycles induce subtle upper-body movements, which appear as local image deformation over time when recorded by a digital camera. This local image deformation can be recovered by estimating the optical flow between consecutive frames. We propose using convolutional neural networks designed for general image registration to estimate the induced optical flow, whose periodicity is then leveraged to obtain the respiration rate through frequency analysis. The proposed system is robust to lighting conditions, camera type (RGB, infrared), clothing, and posture (sitting in a chair or lying in bed); it could be used by individuals with a webcam, or by healthcare centers to monitor patients at night.
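To make the frequency-analysis stage concrete, here is a sketch of the rate-estimation step. The paper uses a registration CNN to estimate the flow; classical Farneback flow (OpenCV) stands in for it here so the pipeline runs end to end, and the frequency band is an assumed plausible breathing range.

```python
import cv2
import numpy as np

def respiration_rate_bpm(gray_frames, fps):
    """Estimate breaths per minute from the periodicity of vertical flow."""
    motion = []
    for prev, curr in zip(gray_frames, gray_frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        motion.append(flow[..., 1].mean())    # mean vertical displacement
    motion = np.asarray(motion) - np.mean(motion)
    spectrum = np.abs(np.fft.rfft(motion))
    freqs = np.fft.rfftfreq(motion.size, d=1.0 / fps)
    band = (freqs >= 0.1) & (freqs <= 0.7)    # ~6-42 breaths per minute
    return 60.0 * freqs[band][np.argmax(spectrum[band])]
```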
Overweight vehicles are a common source of pavement and bridge damage. Mobile crane vehicles in particular often exceed legal per-axle weight limits, carrying their lifting blocks and ballast on the vehicle instead of on a separate trailer. To prevent road deterioration, detecting overweight cranes is desirable for law enforcement. Since the sources of excess crane weight are visible, we propose a camera-based detection system built on convolutional neural networks. We label our dataset iteratively to vastly reduce labeling effort, and we extensively investigate the impact of image resolution, network depth, and dataset size to choose optimal parameters during iterative labeling. We show that iterative labeling with intelligently chosen image resolutions and network depths can vastly speed up (by up to 70×) the rate at which data can be labeled for training classification systems in practical surveillance applications (see the sketch below). The experiments also provide an estimate of the optimal amount of data required to train an effective classification system, which is valuable for classification problems in general. The proposed system achieves an AUC of 0.985 for distinguishing cranes from other vehicles, and AUCs of 0.92 and 0.77 for lifting-block and ballast classification, respectively. The proposed classification system enables effective road monitoring for semi-automatic law enforcement and is attractive for rare-class extraction in general surveillance classification problems.
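A minimal sketch of one round of such an iterative-labeling scheme (the thresholds are our assumptions): a classifier trained on the current labeled set scores the unlabeled pool, confident samples are auto-labeled, and only the uncertain middle band goes to a human reviewer, which is where the labeling speed-up comes from.

```python
import numpy as np

def split_for_labeling(scores, low=0.05, high=0.95):
    """Partition an unlabeled pool by classifier score: indices to auto-label
    as positive or negative, and indices a human should review."""
    scores = np.asarray(scores)
    auto_pos = np.where(scores > high)[0]            # auto-label as crane
    auto_neg = np.where(scores < low)[0]             # auto-label as other
    review = np.where((scores >= low) & (scores <= high))[0]
    return auto_pos, auto_neg, review
```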
The Bidirectional Texture Function (BTF) is one method for reproducing realistic images in Computer Graphics (CG). It can be applied to texture mapping under changing lighting and viewing directions and can reproduce realistic appearance through simple, high-speed processing. In the BTF method, however, a large amount of texture data generally has to be measured and stored in advance. In this paper, to address the measurement time and texture data size required for BTF reproduction, we propose a method for generating a BTF image dataset using deep learning. We recover texture images under various azimuthal lighting conditions from a single texture image, applying a U-Net to this BTF recovery task. The restored and original texture images are compared using SSIM, confirming that the reproducibility of fabric and wood textures is high.
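A minimal two-level U-Net sketch for such an image-to-image recovery task (the paper's exact depth, channel widths, and light-direction conditioning are not reproduced here; all sizes are assumptions):

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    """Maps a single input texture to the texture under a new lighting
    azimuth; the target direction could be encoded as extra input channels."""
    def __init__(self, cin=3, cout=3):
        super().__init__()
        self.enc1 = conv_block(cin, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)   # 64 = upsampled 32 + skip 32
        self.out = nn.Conv2d(32, cout, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.out(d1)
```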
Considering the complexity of a multimedia society and the subjectivity of describing images with words, a visual search application is a valuable tool. This work implements a Content-Based Image Retrieval (CBIR) application for texture images, with the goal of comparing three deep convolutional neural networks (VGG-16, ResNet-50, and DenseNet-161) used as image descriptors by extracting global features from images. For measuring similarity among images and ranking them, we employed cosine similarity, Manhattan distance, Bray-Curtis dissimilarity, and Canberra distance. We confirm that global average pooling applied to convolutional layers provides good texture descriptors, and propose using it when extracting features from VGG-based models. Our best result uses the average pooling layer of DenseNet-161 as a 2208-dim feature vector together with Bray-Curtis dissimilarity. We achieved 73.09% mAP@1 and 76.98% mAP@5 on the Describable Textures Dataset (DTD) benchmark, adapted for image retrieval. Our mAP@1 result is comparable to the state-of-the-art classification accuracy (73.8%). We also investigate the impact on retrieval performance of reducing the number of feature components with PCA: we are able to compress the 2208-dim descriptor down to 128 components with a moderate 3.3-percentage-point drop in mAP@1.
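The best-performing configuration above can be sketched as follows (preprocessing constants are the standard ImageNet ones; `weights="DEFAULT"` assumes torchvision >= 0.13):

```python
import numpy as np
import torch
from torchvision import models, transforms
from scipy.spatial.distance import braycurtis

# Global descriptor: DenseNet-161 feature maps, globally average-pooled
# into a 2208-dim vector.
net = models.densenet161(weights="DEFAULT").eval()
prep = transforms.Compose([
    transforms.Resize((224, 224)), transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

@torch.no_grad()
def describe(pil_image):
    fmap = net.features(prep(pil_image).unsqueeze(0))   # (1, 2208, 7, 7)
    return torch.relu(fmap).mean(dim=(2, 3)).squeeze(0).numpy()

def rank(query_desc, gallery_descs):
    """Indices of gallery images, most similar first (Bray-Curtis)."""
    dists = [braycurtis(query_desc, g) for g in gallery_descs]
    return np.argsort(dists)
```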
This paper presents a new method for no-reference mesh visual quality assessment using a convolutional neural network. We first render 2D images of the 3D mesh from multiple views. Each image is then split into small patches that are fed to a convolutional neural network. The network consists of two convolutional layers with two max-pooling layers; a multilayer perceptron (MLP) with two fully connected layers is then integrated to summarize the learned representation into a single output node. With this network structure, feature learning and regression are combined to predict the quality score of a given distorted mesh without access to the reference mesh. Experiments were successfully conducted on the LIRIS/EPFL general-purpose database. The obtained results show that the proposed method provides good correlation and competitive scores compared to some influential and effective full-reference and reduced-reference methods.
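A minimal sketch of the described patch-level network (patch size, channel counts, and the grayscale-input assumption are ours; the conv/pool/FC structure follows the abstract):

```python
import torch.nn as nn

class PatchQualityNet(nn.Module):
    """Two conv layers with two max-pooling layers, followed by a two-layer
    MLP regressing a quality score for one rendered patch; patch scores can
    then be pooled across patches and views to score the whole mesh."""
    def __init__(self, patch=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        flat = 64 * (patch // 4) ** 2      # spatial size after two 2x pools
        self.mlp = nn.Sequential(
            nn.Linear(flat, 128), nn.ReLU(),
            nn.Linear(128, 1),             # single output node: quality score
        )

    def forward(self, x):                  # x: (B, 1, patch, patch)
        return self.mlp(self.features(x).flatten(1))
```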