Recently, many deep learning applications have been deployed on mobile platforms. To run on mobile hardware, these networks must be quantized. The quantization of computer vision networks has been studied extensively, but there have been few studies on the quantization of image restoration networks. In a previous study, following earlier work on weight quantization for deep learning networks, we examined the effect of quantizing activations and weights on image quality. In this paper, we introduce adaptive bit-depth control of the input patch, maintaining image quality comparable to the floating-point network while achieving a greater reduction in quantization bits than our previous work. The bit depth is controlled adaptively according to the maximum pixel value of the input data block. This preserves the linearity of the values within the block, so the deep neural network does not need to be retrained for a change in data distribution. With the proposed method, we achieved a 5% reduction in hardware area and power consumption for our custom deep network hardware while maintaining image quality in both subjective and objective measurements. This is an important achievement for mobile platform hardware.
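As a rough illustration of the adaptive bit-depth idea, the sketch below rescales each block by a power of two derived from its maximum pixel value before truncating to the target bit depth; the 10-bit input, 8-bit target, and block size are illustrative assumptions, not the paper's hardware parameters.

```python
import numpy as np

def quantize_block_adaptive(block, target_bits=8, full_bits=10):
    """Illustrative sketch (not the authors' implementation): pick a
    power-of-two shift from the block's maximum pixel value, so scaling
    stays linear and the network input distribution is unchanged up to a
    known factor, then truncate LSBs down to target_bits."""
    max_val = int(block.max())
    # Number of MSBs guaranteed to be zero for every pixel in this block.
    headroom = full_bits - max(1, int(np.ceil(np.log2(max_val + 1))))
    # Shift left to use the full range, then truncate LSBs to target_bits.
    shifted = block.astype(np.int32) << headroom
    quantized = shifted >> (full_bits - target_bits)
    # Inverse mapping used when reconstructing the original scale.
    dequantized = (quantized << (full_bits - target_bits)) >> headroom
    return quantized, dequantized, headroom

# Example: a dark 10-bit block keeps more effective precision after the shift.
block = np.random.randint(0, 128, size=(8, 8))   # max value < 2**7
q, dq, shift = quantize_block_adaptive(block)
print(shift, np.abs(block - dq).max())
```

Darker blocks get a larger shift, so fewer significant bits are lost when truncating to the reduced bit depth.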
A lightweight learning-based exposure bracketing strategy is proposed in this paper for high dynamic range (HDR) imaging without access to camera RAW. Some low-cost, power-efficient cameras, such as webcams, video surveillance cameras, sport cameras, mid-tier cellphone cameras, and navigation cameras on robots, can only provide access to 8-bit low dynamic range (LDR) images. Exposure fusion is a classical approach to capturing HDR scenes by fusing images taken with different exposures into an 8-bit tone-mapped HDR image. A key question is which set of exposure settings is optimal for covering the scene dynamic range and achieving a desirable tone. The proposed lightweight neural network predicts these exposure settings for a 3-shot exposure bracketing, given input irradiance information from 1) the histograms of an auto-exposure LDR preview image, and 2) the maximum and minimum levels of the scene irradiance. By avoiding the processing of preview image streams, and the circuitous route of first estimating the scene HDR irradiance and then tone-mapping it to 8-bit images, the proposed method offers a more practical HDR enhancement for real-time, on-device applications. Experiments on a number of challenging images demonstrate the advantages of our method over other state-of-the-art methods, both qualitatively and quantitatively.
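A minimal sketch of what such a lightweight predictor could look like, assuming a 64-bin preview histogram and small fully connected layers; the paper's actual architecture and input encoding are not specified here.

```python
import torch
import torch.nn as nn

class ExposurePredictor(nn.Module):
    """Hypothetical lightweight MLP: maps a preview-image histogram plus
    scene irradiance min/max to 3 exposure values for bracketing."""
    def __init__(self, hist_bins=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hist_bins + 2, 32),  # histogram + min/max irradiance
            nn.ReLU(),
            nn.Linear(32, 16),
            nn.ReLU(),
            nn.Linear(16, 3),              # 3 predicted exposure settings
        )

    def forward(self, hist, irr_min, irr_max):
        x = torch.cat([hist, irr_min, irr_max], dim=-1)
        return self.net(x)

model = ExposurePredictor()
hist = torch.rand(1, 64)                   # normalized preview histogram
ev = model(hist, torch.tensor([[0.01]]), torch.tensor([[3.2]]))
print(ev.shape)  # torch.Size([1, 3])
```

A network this small runs in well under a millisecond on a CPU, which is consistent with the real-time, on-device goal described above.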
Open-source intelligence (OSINT) technologies are becoming increasingly popular with investigative and government agencies, intelligence services, media companies, and corporations [22]. These OSINT technologies use sophisticated techniques and special tools to analyze the continually growing sources of information efficiently [17]. There is a great need worldwide for professional training and further education in this field. Having already presented the overall structure of a professional training concept in this field in a previous paper [25], this series of articles offers individual further-training modules for the worldwide-standard state-of-the-art OSINT tools. The modules presented here are suitable for a professional training program and for an OSINT course in a bachelor's or master's computer science or cybersecurity program at a university. In part 1 of this series of 4 articles, the OSINT tool RiskIQ PassiveTotal [26] is introduced, and its application possibilities are explained using concrete examples. In part 2, the OSINT tool Censys is explained [27]. This part 3 deals with Maltego [28], and part 4 compares the three different tools of parts 1-3 [29].
Due to the use of 3D content in various applications, Stereo Image Quality Assessment (SIQA) has attracted increasing attention to ensure a good viewing experience for users. Several methods have thus been proposed in the literature, with clear improvements achieved by deep learning-based methods. This paper introduces a new deep learning-based no-reference SIQA method using the cyclopean view hypothesis and human visual attention. First, the cyclopean image is built considering the presence of binocular rivalry, which covers the asymmetric distortion case. Second, the saliency map is computed taking the depth information into account. The latter is used to extract patches from the most perceptually relevant regions. Finally, a modified version of the pre-trained VGG-19 is fine-tuned and used to predict the quality score from the selected patches. The performance of the proposed metric has been evaluated on the 3D LIVE phase I and phase II databases. Compared with state-of-the-art metrics, our method gives better results.
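The patch-selection step might be sketched as follows, assuming non-overlapping 32x32 patches ranked by mean depth-weighted saliency; the patch size, patch count, and ranking rule are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def select_salient_patches(image, saliency, patch=32, k=8):
    """Rank non-overlapping patches of the cyclopean image by mean
    saliency and keep the top-k; these would then be scored by the
    fine-tuned VGG-19 and the scores pooled into one quality value."""
    h, w = saliency.shape
    scores, coords = [], []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            scores.append(saliency[y:y + patch, x:x + patch].mean())
            coords.append((y, x))
    top = np.argsort(scores)[::-1][:k]   # most salient patches first
    return [image[coords[i][0]:coords[i][0] + patch,
                  coords[i][1]:coords[i][1] + patch] for i in top]

cyclopean = np.random.rand(128, 128, 3)   # placeholder cyclopean image
saliency = np.random.rand(128, 128)       # placeholder saliency map
patches = select_salient_patches(cyclopean, saliency)
print(len(patches), patches[0].shape)     # 8 (32, 32, 3)
```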
Activity recognition and pose estimation are in general closely related in practical applications, even though they are considered independent tasks. In this paper, we propose artificial 3D coordinates and a CNN for combining activity recognition and pose estimation with 2D and 3D static/dynamic images (a dynamic image is composed of a set of video frames). In other words, we show that the proposed algorithm can be used to solve both problems, activity recognition and pose estimation. An end-to-end optimization process shows that the proposed approach is superior to one that performs activity recognition and pose estimation separately. The performance is evaluated by measuring the recognition rate. The proposed approach enables us to perform learning procedures using different datasets.
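For readers unfamiliar with dynamic images, one common way to collapse a clip into a single image is approximate rank pooling; the sketch below uses the simple weighting alpha_t = 2t - T - 1, which is a standard approximation and may differ from how this paper constructs its dynamic images.

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a list of T frames into one 'dynamic image' using simple
    approximate rank-pooling weights alpha_t = 2t - T - 1, so later
    frames contribute positively and earlier frames negatively."""
    T = len(frames)
    weights = np.array([2 * t - T - 1 for t in range(1, T + 1)],
                       dtype=np.float32)
    stacked = np.stack(frames).astype(np.float32)       # (T, H, W, C)
    di = np.tensordot(weights, stacked, axes=1)          # (H, W, C)
    # Rescale to [0, 255] so it can be fed to a regular CNN.
    di = 255 * (di - di.min()) / (di.max() - di.min() + 1e-8)
    return di.astype(np.uint8)

clip = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
        for _ in range(10)]
print(dynamic_image(clip).shape)  # (64, 64, 3)
```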
Recently, stereo cameras have been widely incorporated into smartphones and autonomous vehicles thanks to their low cost and small-sized packages. Nevertheless, acquiring high-resolution (HR) stereo images is still a challenging problem. While traditional stereo image processing has mainly focused on stereo matching, stereo super-resolution (SR), which is needed for HR images, has drawn less attention. Some deep learning-based stereo image SR works have recently shown promising results. However, they have not fully exploited binocular parallax in SR, which may lead to unrealistic visual perception. In this paper, we present a novel and computationally efficient convolutional neural network (CNN) based deep SR network for stereo images, called ProPaCoL-Net, that learns parallax coherency between the left and right SR images. The proposed ProPaCoL-Net progressively learns parallax coherency via a novel recursive parallax coherency (RPC) module with shared parameters. The RPC module is designed to effectively extract parallax information for left-image SR from the right-view input images, and vice versa. Furthermore, we propose a parallax coherency loss to reliably train the ProPaCoL-Net. Extensive experiments show that the ProPaCoL-Net outperforms the very recent state-of-the-art method by 1.15 dB in average PSNR.
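The weight-sharing recursion at the heart of such an RPC module could be sketched as follows; the channel count, fusion scheme, and number of recursions are assumptions, since the exact ProPaCoL-Net layout is not given in this abstract.

```python
import torch
import torch.nn as nn

class RecursiveParallaxModule(nn.Module):
    """Hypothetical RPC-style block: one convolution is reused across
    all recursion steps (shared parameters), repeatedly fusing the
    current left features with the right-view features."""
    def __init__(self, channels=64, steps=4):
        super().__init__()
        self.steps = steps
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat_left, feat_right):
        out = feat_left
        for _ in range(self.steps):  # same weights reused each recursion
            out = torch.relu(self.fuse(torch.cat([out, feat_right], dim=1)))
        return out

m = RecursiveParallaxModule()
left = torch.rand(1, 64, 32, 32)
right = torch.rand(1, 64, 32, 32)
print(m(left, right).shape)  # torch.Size([1, 64, 32, 32])
```

Sharing one set of weights across recursions is what keeps a progressive design like this computationally efficient: depth grows without growing the parameter count.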
Video Quality Assessment (VQA) is an essential topic in several industries, ranging from video streaming to camera manufacturing. In this paper, we present a novel method for No-Reference VQA. This framework is fast and does not require the extraction of hand-crafted features. We extract convolutional features from a 3D Convolutional Neural Network (C3D) and feed them to a trained Support Vector Regressor (SVR) to obtain a VQA score. We apply transformations to different color spaces to generate more discriminative deep features. We extract features from several layers, with and without overlap, finding the best configuration to improve the VQA score. We tested the proposed approach on the LIVE-Qualcomm dataset. We extensively evaluated the perceptual quality prediction model, obtaining a final Pearson correlation of 0.7749 ± 0.0884 with Mean Opinion Scores, and showed that it achieves good video quality prediction, outperforming other leading state-of-the-art VQA models.
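The features-to-regressor stage can be sketched with scikit-learn as below; extract_c3d_features is a hypothetical stand-in for pooling activations from a chosen C3D layer, and the kernel and feature dimension are assumptions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def extract_c3d_features(video):
    """Placeholder for pooling activations from a C3D layer;
    a real implementation would run the clip through the network."""
    return np.random.rand(4096)

videos = [object()] * 50                    # placeholder clip handles
mos = np.random.uniform(0, 100, size=50)    # mean opinion scores (labels)
X = np.stack([extract_c3d_features(v) for v in videos])

# Standardize features, then regress quality scores with an RBF SVR.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
model.fit(X, mos)
print(model.predict(X[:3]))                 # predicted quality scores
```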
Forensics research has developed several techniques to identify the model and manufacturer of a digital image or video's source camera. However, to the best of our knowledge, no work has been performed to identify the manufacturer and model of the scanner that captured an MRI image. MRI source identification can have several important applications, ranging from discovering scientific fraud and exposing issues around the anonymity and privacy of medical records, to protecting against malicious tampering of medical images and validating AI-based diagnostic techniques whose performance varies across different MRI scanners. In this paper, we propose a new CNN-based approach to learn the forensic traces left by an MRI scanner and use these traces to identify the manufacturer and model of the scanner that captured an MRI image. Additionally, we identify an issue called weight divergence that can occur when training CNNs using a constrained convolutional layer, and we propose three new correction functions to protect against it. Our experimental results show we can identify an MRI scanner's manufacturer with 97.88% accuracy and its model with 91.07% accuracy. Additionally, we show that our proposed correction functions can noticeably improve our CNN's accuracy when performing scanner model identification.
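Constrained convolutional layers of this kind typically follow the Bayar-Stamm prediction-error constraint, where the center tap is fixed to -1 and the remaining taps are normalized to sum to 1 so each filter learns noise-like forensic traces. A sketch of that projection is shown below; the paper's three correction functions for weight divergence are its own contribution and are not reproduced here.

```python
import torch

def enforce_prediction_error_constraint(weight):
    """Project conv kernels onto the constrained form: center tap = -1,
    off-center taps summing to 1, so each filter acts as a prediction-
    error filter. Repeated projections can drift the weights (the
    'weight divergence' issue the paper corrects)."""
    out_c, in_c, kh, kw = weight.shape
    cy, cx = kh // 2, kw // 2
    with torch.no_grad():
        weight[:, :, cy, cx] = 0
        sums = weight.sum(dim=(2, 3), keepdim=True)
        weight /= sums + 1e-12          # off-center taps now sum to 1
        weight[:, :, cy, cx] = -1       # fix the center tap
    return weight

w = torch.randn(3, 1, 5, 5)
w = enforce_prediction_error_constraint(w)
print(w[0, 0].sum().item())  # ~0: each constrained filter sums to zero
```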
Road traffic signs provide vital information about traffic rules, road conditions, and route directions to assist drivers in safe driving. Recognition of traffic signs is one of the key features of Advanced Driver Assistance Systems (ADAS). In this paper, we present a Convolutional Neural Network (CNN) based approach for robust Traffic Sign Recognition (TSR) that can run in real time on low-power embedded systems. To achieve this, we propose a two-stage network: in the first stage, a generic traffic sign detection network localizes the positions of traffic signs in the video footage, and in the second stage a country-specific classification network classifies the detected signs. The network sub-blocks were retrained to generate an optimal network that runs in real time on the Nvidia Tegra platform. The network's computational complexity and model size were further reduced to make it deployable on low-power embedded platforms. Methods such as network customization, weight pruning, and quantization schemes were used to achieve an 8X reduction in computational complexity. The pruned and optimized network was further ported and benchmarked on embedded platforms such as the Texas Instruments Jacinto TDA2x SoC and Qualcomm's Snapdragon 820 Automotive platform.
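Of the techniques named above, magnitude-based weight pruning is the easiest to illustrate; the sketch below zeroes the smallest weights of a layer, with the 70% sparsity level chosen arbitrarily for the example.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.7):
    """Zero out the smallest-magnitude weights of a layer. Combined with
    quantization and layer customization, this kind of pruning is what
    drives the reported reduction in computational complexity."""
    threshold = np.quantile(np.abs(weights).ravel(), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

w = np.random.randn(64, 32)            # toy fully connected layer
pruned, mask = magnitude_prune(w)
print(f"kept {mask.mean():.0%} of weights")
```

In practice the pruned network is fine-tuned afterwards to recover accuracy, and the sparsity pattern is exploited by the embedded inference runtime.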
Change detection from ground vehicles has various applications, such as the detection of roadside Improvised Explosive Devices (IEDs). Although IEDs are hidden, they are often accompanied by visible markers, which can be any kind of object. Because of this, any suspicious change in the environment compared to an earlier moment in time should be detected. Little work has been published on solving this ill-posed problem with deep learning. This paper shows the feasibility of applying convolutional neural networks (CNNs) to HD video to accurately predict the presence and location of such markers in real time. The network is trained for the detection of pixel-level changes in HD video relative to an earlier reference recording. We investigate Siamese CNNs in combination with an encoder-decoder architecture and introduce a modified double-margin contrastive loss function to achieve pixel-level change detection results. Our dataset consists of seven pairs of challenging real-world recordings with geo-tagged test objects. The proposed network architecture can compare two images of 1920×1440 pixels in 150 ms on a GTX 1080 Ti GPU. The proposed network significantly outperforms state-of-the-art networks and algorithms on our dataset in terms of F-1 score, by 0.28 on average.
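A double-margin contrastive loss over per-pixel feature distances could look like the following sketch; the margin values, and whether this matches the paper's exact modification, are assumptions.

```python
import torch

def double_margin_contrastive(dist, label, m_pos=0.3, m_neg=1.0):
    """Double-margin contrastive loss on pixel-wise feature distances:
    unchanged pixels (label 0) are penalized only beyond m_pos, changed
    pixels (label 1) only within m_neg, leaving a dead zone between the
    margins that a single-margin loss does not have."""
    pos = (1 - label) * torch.clamp(dist - m_pos, min=0) ** 2
    neg = label * torch.clamp(m_neg - dist, min=0) ** 2
    return (pos + neg).mean()

# dist: per-pixel L2 distance between the two Siamese encoder outputs.
dist = torch.rand(1, 1, 64, 64)
label = (torch.rand(1, 1, 64, 64) > 0.9).float()  # 1 = changed pixel
print(double_margin_contrastive(dist, label).item())
```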