Keywords

AUGMENTED REALITY
BLOCK-MATCHING, BRIGHT PUPIL
CONVOLUTIONAL NEURAL NETWORK, CLASS SPECIFIC COLLABORATIVE REPRESENTATION, CASCADED CONVOLUTIONAL NEURAL NETWORK, COLOR EXTRACTION, COUNTERFEITING GOODS, COMPUTER VISION, CONVOLUTIONAL FEATURES, COLLISION AVOIDANCE SYSTEM, CONVOLUTIONAL NEURAL NETWORKS, CIFAR, CONVOLUTION
DATA AUGMENTATION, DILATED CONVOLUTIONS, DEEP LEARNING, DATA MINING AND ANALYTICS, DATA FUSION, DEPTH EXTRACTION
FACE ALIGNMENT, FEATURE EXTRACTION, FASHION MARKET, FACE RECOGNITION, FACIAL LANDMARK DETECTION, FASHION IMAGING
HUMAN POSE ESTIMATION, HUMAN DETECTION, HYBRID DICTIONARY LEARNING
IMAGE MATCHING, IMAGE GRADIENTS, IMAGE PROCESSING, IMAGE CLASSIFICATION, IMAGING, IMAGE SEGMENTATION
LEARNING ENHANCEMENT, LIVENESS DETECTION, LOGO RECOGNITION, LOGO DETECTION, LINE-BASED DETECTION
MOBILE SYSTEM, MOBILE, MULTI-VIEW, MOVING OBJECT DETECTION, MULTIMEDIA ANALYSIS, MAJORITY VOTE, MACHINE LEARNING, MNIST
NATURAL LANGUAGE PROCESSING, NEURAL NETWORK
OMNIDIRECTIONAL CAMERA
POOLING, PERSON SEGMENTATION, PHASE CONGRUENCY, POLYNOMIAL, PALM VEIN
REFLECTANCE
SEMANTIC SEGMENTATION, SURVEILLANCE REGION, SVM, SHARED COLLABORATIVE REPRESENTATION, SUPER-PIXEL, SPOOFING ATTACK, SVHN, SUBSPACE LEARNING, SEGMENTATION
TEXTURE-BASED CODING
UAV
VOXEL, VEHICLE RE-IDENTIFICATION (VRI)
WEB, WEB SCRAPING
Pages 560-1 - 560-4,  © Society for Imaging Science and Technology 2018
Digital Library: EI
Published Online: January  2018
Pages 336-1 - 336-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Many human detection algorithms are limited in capability because they lack supplemental algorithms to enhance detection. We propose using two such algorithms to extract vital information that augments human detection for increased accuracy. The first is the computation of depth information. The information needed to obtain depth derives from the camera's change in position from frame to frame. Calibrated stereo cameras can produce accurate depth maps, but the motion that occurs between frames can likewise be exploited to develop a rough depth perception of the objects in the scene. Block-matching and optical-flow algorithms provide the disparities across the image, which in turn yield depth information for the human detection algorithm. The second algorithm is superpixel segmentation. It produces a rough over-segmentation of the imagery that preserves object boundaries while grouping pixels into larger regions. This information can be used to distinguish background from foreground and to create a tight segmentation around the detected human, rather than a bounding-box detection that may include background clutter. Fusing these algorithms with human detection has been shown to increase detection accuracy and to provide better localization of the human in the imagery.
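To make the rough-depth idea concrete, here is a minimal sketch (assuming OpenCV and NumPy; the function and variable names are ours, not the paper's) that computes dense Farnebäck optical flow between consecutive frames and treats flow magnitude as an inverse-depth proxy a detector could consume:

```python
import cv2
import numpy as np

def rough_depth_from_motion(prev_gray, next_gray):
    """Approximate inverse depth from dense optical flow between frames.

    For a moving camera, larger apparent motion generally means a closer
    object, so the flow magnitude serves as a rough depth proxy.
    """
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    magnitude = np.linalg.norm(flow, axis=2)     # per-pixel displacement
    return magnitude / (magnitude.max() + 1e-6)  # normalize to [0, 1]
```

The same map can then be averaged inside each superpixel, so that foreground/background separation uses one depth value per region rather than noisy per-pixel estimates.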

Digital Library: EI
Published Online: January  2018
Pages 467-1 - 467-7,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

This paper presents a new vision-based approach to vehicle re-identification (VRI) for smart transportation systems by fusion of multiple features. Unlike conventional VRI systems, which adopted loop sensors to capture inductive features for classification, we developed a hierarchical method for VRI by coarse-to-fine image matching. More specifically, VRI is performed at the fine level by image matching using distinctive and anonymous features, which are extracted from the large number of interest points detected at the coarse level from the vehicle and its license plate images. To achieve robustness, the thresholding of the matching criteria is based on dynamic analysis of the time series of vehicle images rather than being predefined. In addition, the fusion of multiple features is conducted via a weighted probability scheme. To demonstrate the feasibility of the proposed approach, a series of field tests was conducted in which 301 vehicles were used for data calibration and 1,699 vehicles for validation. The matching rate reaches 73.51%, 85.52%, and over 90%, respectively, using density features, fusion of selected distinctive features, and fusion of multimodal features.
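The two ingredients named above, the data-driven threshold and the weighted-probability fusion, might look roughly like the following sketch (NumPy assumed; the helper names and the mean-plus-k-sigma statistic are our assumptions, not the authors' exact formulation):

```python
import numpy as np

def fuse_match_scores(scores, weights):
    """Weighted-probability fusion of per-feature match scores."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(weights / weights.sum(), scores))

def dynamic_threshold(recent_scores, k=1.0):
    """Accept threshold derived from the running statistics of the
    vehicle-image time series instead of a fixed constant."""
    mu, sigma = np.mean(recent_scores), np.std(recent_scores)
    return mu + k * sigma
```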

Digital Library: EI
Published Online: January  2018
Pages 338-1 - 338-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Work on convolutional neural networks (CNNs) tends to improve performance by adding more input data, modifying existing data, or redesigning the network to better suit the problem. The goal of this work is to supplement the small number of existing methods that use none of these techniques. This research aims to show that, with a standard CNN, classification accuracy can be improved without changes to the data or major network design modifications such as added convolution or pooling layers. A new layer is proposed that is inserted in a similar location as the non-linearity functions in standard CNNs. This new layer creates a localized connectivity for each perceptron to a polynomial of Nth degree, and it can be applied in both the convolutional portion and the fully connected portion of the network. The proposed polynomial layer is added with the idea that the higher dimensionality enables a better description of the input space, which leads to a higher classification rate. Two datasets, MNIST and CIFAR10, are used for classification; each contains 10 distinct classes and has a similar training set size (60,000 and 50,000 images, respectively), though the images are 28 × 28 grayscale and 32 × 32 RGB, respectively. It is shown that the added polynomial layers enable the chosen CNN design to achieve higher accuracy on the MNIST dataset; on CIFAR10, a similar effect appeared only at a lower learning rate.
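One plausible reading of the proposed layer, sketched in PyTorch for the fully connected portion (the paper's exact parameterization may differ): each feature gets its own learnable coefficients of an Nth-degree polynomial, initialized near the identity so training starts from standard behavior.

```python
import torch
import torch.nn as nn

class PolynomialLayer(nn.Module):
    """Per-feature polynomial activation: y_i = sum_k a_{k,i} * x_i**k."""

    def __init__(self, num_features, degree=3):
        super().__init__()
        # coeffs[k] holds the per-feature coefficient of x**(k+1);
        # starting with coeffs[0] = 1 makes the layer begin as identity.
        self.coeffs = nn.Parameter(torch.zeros(degree, num_features))
        with torch.no_grad():
            self.coeffs[0].fill_(1.0)

    def forward(self, x):
        # x: (batch, num_features). Stack x, x**2, ..., x**N and apply
        # the learned per-feature coefficients.
        degree = self.coeffs.shape[0]
        powers = torch.stack([x ** (k + 1) for k in range(degree)])
        return (self.coeffs.unsqueeze(1) * powers).sum(dim=0)
```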

Digital Library: EI
Published Online: January  2018
Pages 339-1 - 339-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Image classification has attracted growing interest in recent years. Consequently, a number of excellent non-parametric classification algorithms, such as collaborative representation based classification (CRC), have emerged and achieved performance superior to parametric classification algorithms. However, for fine-grained image classification tasks, both the class-specific attributes and the shared attributes play significant roles in describing an image. The CRC scheme does not consider this distinction and simply uses all attributes, without separation, to represent an image. In this paper, we propose a hybrid collaborative representation based classification method that describes an image from the perspective of the shared features as well as the class-specific features. Moreover, to reduce the representation error and obtain a precise description, we learn a dictionary for hybrid collaborative representation from the training samples. We conduct extensive experiments on fine-grained image datasets to verify the superior performance of our proposed algorithm compared with conventional approaches.
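For context, the plain CRC core that the hybrid method builds on reduces to a ridge regression over all training samples followed by class-wise residual comparison; a minimal sketch in our notation (the shared/class-specific split and the learned dictionary sit on top of this):

```python
import numpy as np

def crc_classify(D, labels, y, lam=1e-3):
    """Baseline CRC: D is a (d, n) matrix of training samples as columns,
    labels is (n,), y is a (d,) test sample."""
    # Collaborative code over ALL training samples (ridge regression).
    n = D.shape[1]
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    # Assign the class whose samples best reconstruct y.
    residuals = {c: np.linalg.norm(y - D[:, labels == c] @ alpha[labels == c])
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)
```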

Digital Library: EI
Published Online: January  2018
Pages 373-1 - 373-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Face recognition technology has evolved enormously over the past few decades, from the first pioneering works to today's highly accurate state-of-the-art systems, but the ability to resist spoofing attacks was not addressed until recently. While a number of researchers have thrown themselves into the challenging mission of developing effective liveness detection methods against this kind of threat, the existing algorithms are usually limited by factors such as lighting conditions, response speed, and interactivity. In this paper, a novel approach is introduced based on the joint analysis of visible and near-infrared images of faces: three different features (bright pupil, HOG in the nose area, reflectance ratio) are extracted to form the final BPNGR feature vector, and an SVM classifier with an RBF kernel is trained to distinguish between genuine (live) and spoof faces. Experimental results on a self-collected database with 605 samples clearly demonstrate the superiority of our method over previous systems in terms of speed and accuracy.
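The classification stage could be sketched as follows (scikit-learn and scikit-image assumed; the feature computation details are illustrative stand-ins, not the paper's implementation): the three cues are concatenated into one vector and fed to an RBF-kernel SVM.

```python
import numpy as np
from sklearn.svm import SVC
from skimage.feature import hog

def bpngr_like_vector(nose_patch, pupil_brightness, reflectance_ratio):
    """Concatenate the three cues named in the abstract into one vector."""
    h = hog(nose_patch, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([[pupil_brightness, reflectance_ratio], h])

# RBF-kernel SVM, as in the abstract; X: (n, d) features, y: live/spoof.
clf = SVC(kernel="rbf", gamma="scale")
```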

Digital Library: EI
Published Online: January  2018
Pages 374-1 - 374-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Facial landmark localization plays a critical role in many face analysis tasks. In this paper, we present a coarse-to-fine cascaded convolutional neural network system for robust facial landmark localization of faces in the wild. The system consists of two cascaded convolutional neural network levels: the first-level network generates an initial prediction of all facial landmarks, and the second-level networks are cascaded to perform facial component-wise local refinement of the landmark points. We also present a novel data augmentation method for training facial landmark localization networks. Experimental results show that our method outperforms state-of-the-art methods on the 300W [18] common dataset.
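Structurally, the two-level cascade can be sketched as below; coarse_net, refine_nets, and crop_fn are placeholders for the trained models and the component-cropping logic, not APIs from the paper.

```python
def localize_landmarks(image, coarse_net, refine_nets, crop_fn):
    """Level 1 predicts all landmarks at once; level 2 refines each
    facial component (eyes, nose, mouth, ...) on a local crop."""
    landmarks = coarse_net(image)               # dict: component -> points
    for component, net in refine_nets.items():
        patch, offset = crop_fn(image, landmarks, component)
        landmarks[component] = net(patch) + offset  # back to image coords
    return landmarks
```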

Digital Library: EI
Published Online: January  2018
Pages 421-1 - 421-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Cloud architectures are increasingly popular, so combining biometrics with cloud computing has become a trend for many applications. As a relatively new biometric, palm vein recognition has many merits, such as user friendliness, high accuracy, and robustness, and it is convenient to deploy in cloud computing: for example, a cell phone captures a palm vein image and the comparison is performed in the cloud. Usually, to reduce the computational burden on the phone and the volume of transmitted data, the palm vein image is compressed before transmission. However, how image compression affects recognition accuracy is not well studied. This paper empirically studies JPEG compression for three kinds of palm vein feature extraction methods. It is found that the subspace method is robust to image compression, the texture-based method is sensitive, and the line-based method falls in between.
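The compression study can be reproduced in outline with a JPEG encode/decode round trip at decreasing quality (OpenCV assumed; the image and quality grid below are stand-ins):

```python
import cv2
import numpy as np

def jpeg_round_trip(img, quality):
    """Encode and decode at the given JPEG quality, returning the
    degraded image a feature extractor would actually see."""
    ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    assert ok, "JPEG encoding failed"
    return cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE)

palm_image = np.zeros((128, 128), np.uint8)  # stand-in for a real capture
for q in (90, 70, 50, 30, 10):
    degraded = jpeg_round_trip(palm_image, q)
    # Extract subspace / texture-based / line-based features from
    # `degraded` and compare recognition accuracy per quality level.
```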

Digital Library: EI
Published Online: January  2018
Pages 419-1 - 419-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

We propose a deep learning method to retrieve the most similar well-designed 3D model that our system has seen before, given a rough 3D model or scanned 3D data. The retrieved model can either be used directly or serve as a reference for redesign. Our network consists of three different sub-networks: the first deals with object images (2D projections), and the other two deal with voxel representations of the 3D object. At the last stage, we combine the results of all three sub-nets to obtain the object classification. Furthermore, we use the second-to-last layer as a feature vector for feature matching and return a list of the top N most similar well-designed 3D models.
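The retrieval step itself reduces to nearest-neighbor search over the penultimate-layer embeddings; a small sketch (cosine similarity is our choice of metric, not stated in the abstract):

```python
import numpy as np

def top_n_similar(query_feat, gallery_feats, n=5):
    """Return indices of the N gallery models whose embeddings are most
    similar to the query embedding (cosine similarity)."""
    q = query_feat / np.linalg.norm(query_feat)
    G = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(-(G @ q))[:n]
```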

Digital Library: EI
Published Online: January  2018
Pages 375-1 - 375-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

The field of view of a traditional camera is limited, so usually more than three cameras are needed to cover an entire surveillance area. Using multiple cameras requires more effort in camera control and setup, as well as additional algorithms to relate the images from the different cameras. In this paper, we present a multi-feature algorithm that employs a single omnidirectional camera, instead of multiple cameras, to cover the entire surveillance region. We use image gradients, local phase information based on phase congruency, the phase congruency magnitude, and color features, fused together into one descriptor named "Fused Phase, Gradients and Color features" (FPGC). The image gradients and the local phase information based on the phase congruency concept are used to extract human body shape features, and either LUV or grayscale channel features are used depending on the kind of camera. The phase congruency magnitude and orientation of each pixel in the input image are computed with respect to its neighborhood. The resulting images are divided into local regions, and the histogram of oriented phase and the histogram of oriented gradient are determined for each local region and combined. A maximum pooling of the candidate features is generated for one channel of the phase congruency magnitude and the three LUV color channels. All these features are fed to a decision-tree AdaBoost classifier for training and classification. The proposed approach is evaluated on a challenging omnidirectional dataset and shows promising performance.
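As one fragment of the descriptor pipeline, a magnitude-weighted orientation histogram per local region serves both the gradient and the phase-congruency channels, with a boosted decision-tree classifier on top (scikit-learn assumed; bin count and estimator settings are illustrative):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def oriented_histogram(magnitude, orientation, bins=9):
    """Magnitude-weighted histogram of orientations for one local region,
    the building block of the HOG and oriented-phase channels."""
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, np.pi),
                           weights=magnitude)
    return hist / (np.linalg.norm(hist) + 1e-6)

# Decision-tree AdaBoost, as in the abstract; X rows would be FPGC vectors.
clf = AdaBoostClassifier(n_estimators=200)
```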

Digital Library: EI
Published Online: January  2018
