Keywords

AUGMENTED REALITY
BLOCK-MATCHING, BRIGHT PUPIL
CONVOLUTIONAL NEURAL NETWORK, CLASS SPECIFIC COLLABORATIVE REPRESENTATION, CASCADED CONVOLUTIONAL NEURAL NETWORK, COLOR EXTRACTION, COUNTERFEITING GOODS, COMPUTER VISION, CONVOLUTIONAL FEATURES, COLLISION AVOIDANCE SYSTEM, CONVOLUTIONAL NEURAL NETWORKS, CIFAR, CONVOLUTION
DATA AUGMENTATION, DILATED CONVOLUTIONS, DEEP LEARNING, DATA MINING AND ANALYTICS, DATA FUSION, DEPTH EXTRACTION
FACE ALIGNMENT, FEATURE EXTRACTION, FASHION MARKET, FACE RECOGNITION, FACIAL LANDMARK DETECTION, FASHION IMAGING
HUMAN POSE ESTIMATION, HUMAN DETECTION, HYBRID DICTIONARY LEARNING
IMAGE MATCHING, IMAGE GRADIENTS, IMAGE PROCESSING, IMAGE CLASSIFICATION, IMAGING, IMAGE SEGMENTATION
LEARNING ENHANCEMENT, LIVENESS DETECTION, LOGO RECOGNITION, LOGO DETECTION, LINE-BASED DETECTION
MOBILE SYSTEM, MOBILE, MULTI-VIEW, MOVING OBJECT DETECTION, MULTIMEDIA ANALYSIS, MAJORITY VOTE, MACHINE LEARNING, MNIST
NATURAL LANGUAGE PROCESSING, NEURAL NETWORK
OMNIDIRECTIONAL CAMERA
POOLING, PERSON SEGMENTATION, PHASE CONGRUENCY, POLYNOMIAL, PALM VEIN
REFLECTANCE
SEMANTIC SEGMENTATION, SURVEILLANCE REGION, SVM, SHARED COLLABORATIVE REPRESENTATION, SUPER-PIXEL, SPOOFING ATTACK, SVHN, SUBSPACE LEARNING, SEGMENTATION
TEXTURE-BASED CODING
UAV
VOXEL, VEHICLE RE-IDENTIFICATION (VRI)
WEB, WEB SCRAPING
Pages 560-1 - 560-4,  © Society for Imaging Science and Technology 2018
Digital Library: EI
Published Online: January  2018
Pages 336-1 - 336-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Many human detection algorithms are limited in capability because they lack supplemental algorithms to enhance detection. We propose using two such algorithms to extract vital information that augments human detection for increased accuracy. The first is the computation of depth information. The information needed to obtain depth derives from the camera's change in position from frame to frame. Calibrated stereo cameras can produce accurate depth maps, but the motion that occurs between frames can likewise be exploited to develop a rough depth perception of the objects in the scene. Block-matching and optical-flow algorithms provide the disparities across the image, which in turn yield depth information for the human detection algorithm. The second algorithm is superpixel segmentation. It produces a rough over-segmentation of the imagery that preserves object boundaries while grouping pixels into larger regions. This information can be used to distinguish background from foreground and to create a tight segmentation around the detected human, rather than a bounding-box detection that may include background clutter. Fusing these algorithms with human detection has been shown to increase detection accuracy and to provide better localization of the human in the imagery.
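To make the rough-depth idea concrete, here is a minimal sketch (assuming OpenCV and NumPy; the function and variable names are ours, not the paper's) that computes dense Farnebäck optical flow between consecutive frames and treats flow magnitude as an inverse-depth proxy a detector could consume:

```python
import cv2
import numpy as np

def rough_depth_from_motion(prev_gray, next_gray):
    """Approximate inverse depth from dense optical flow between frames.

    For a moving camera, larger apparent motion generally means a closer
    object, so the flow magnitude serves as a rough depth proxy.
    """
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    magnitude = np.linalg.norm(flow, axis=2)     # per-pixel displacement
    return magnitude / (magnitude.max() + 1e-6)  # normalize to [0, 1]
```

The same map can then be averaged inside each superpixel, so that foreground/background separation uses one depth value per region rather than noisy per-pixel estimates.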

Digital Library: EI
Published Online: January  2018
Pages 467-1 - 467-7,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

This paper presents a new vision-based approach to vehicle re-identification (VRI) for smart transportation systems by fusion of multiple features. Unlike conventional VRI systems, which adopted loop sensors to capture inductive features for classification, we developed a hierarchical method for VRI by coarse-to-fine image matching. More specifically, VRI is performed at the fine level by image matching using distinctive and anonymous features, which are extracted from the large number of interest points detected at the coarse level from the vehicle and its license plate images. To achieve robustness, the thresholding of the matching criteria is based on dynamic analysis of the time series of vehicle images rather than being predefined. In addition, the fusion of multiple features is conducted via a weighted probability scheme. To demonstrate the feasibility of the proposed approach, a series of field tests was conducted in which 301 vehicles were used for data calibration and 1,699 vehicles for validation. The matching rate reaches 73.51%, 85.52%, and over 90%, respectively, using density features, fusion of selected distinctive features, and fusion of multimodal features.
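The two ingredients named above, the data-driven threshold and the weighted-probability fusion, might look roughly like the following sketch (NumPy assumed; the helper names and the mean-plus-k-sigma statistic are our assumptions, not the authors' exact formulation):

```python
import numpy as np

def fuse_match_scores(scores, weights):
    """Weighted-probability fusion of per-feature match scores."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(weights / weights.sum(), scores))

def dynamic_threshold(recent_scores, k=1.0):
    """Accept threshold derived from the running statistics of the
    vehicle-image time series instead of a fixed constant."""
    mu, sigma = np.mean(recent_scores), np.std(recent_scores)
    return mu + k * sigma
```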

Digital Library: EI
Published Online: January  2018
Pages 338-1 - 338-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Work on convolutional neural networks (CNNs) tends to improve performance by adding more input data, modifying existing data, or redesigning the network to better suit the problem. The goal of this work is to supplement the small number of existing methods that use none of these techniques. This research aims to show that, with a standard CNN, classification accuracy can be improved without changes to the data or major network design modifications such as added convolution or pooling layers. A new layer is proposed that is inserted in a similar location as the non-linearity functions in standard CNNs. This new layer creates a localized connectivity for each perceptron to a polynomial of Nth degree, and it can be applied in both the convolutional portion and the fully connected portion of the network. The proposed polynomial layer is added with the idea that the higher dimensionality enables a better description of the input space, which leads to a higher classification rate. Two datasets, MNIST and CIFAR10, are used for classification; each contains 10 distinct classes and has a similar training set size (60,000 and 50,000 images, respectively), though the images are 28 × 28 grayscale and 32 × 32 RGB, respectively. It is shown that the added polynomial layers enable the chosen CNN design to achieve higher accuracy on the MNIST dataset; on CIFAR10, a similar effect appeared only at a lower learning rate.
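One plausible reading of the proposed layer, sketched in PyTorch for the fully connected portion (the paper's exact parameterization may differ): each feature gets its own learnable coefficients of an Nth-degree polynomial, initialized near the identity so training starts from standard behavior.

```python
import torch
import torch.nn as nn

class PolynomialLayer(nn.Module):
    """Per-feature polynomial activation: y_i = sum_k a_{k,i} * x_i**k."""

    def __init__(self, num_features, degree=3):
        super().__init__()
        # coeffs[k] holds the per-feature coefficient of x**(k+1);
        # starting with coeffs[0] = 1 makes the layer begin as identity.
        self.coeffs = nn.Parameter(torch.zeros(degree, num_features))
        with torch.no_grad():
            self.coeffs[0].fill_(1.0)

    def forward(self, x):
        # x: (batch, num_features). Stack x, x**2, ..., x**N and apply
        # the learned per-feature coefficients.
        degree = self.coeffs.shape[0]
        powers = torch.stack([x ** (k + 1) for k in range(degree)])
        return (self.coeffs.unsqueeze(1) * powers).sum(dim=0)
```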

Digital Library: EI
Published Online: January  2018
Pages 339-1 - 339-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Image classification has attracted growing interest in recent years. Consequently, a number of excellent non-parametric classification algorithms, such as collaborative representation based classification (CRC), have emerged and achieved performance superior to parametric classification algorithms. However, for fine-grained image classification tasks, both the class-specific attributes and the shared attributes play significant roles in describing an image. The CRC scheme does not consider this distinction and simply uses all attributes, without separation, to represent an image. In this paper, we propose a hybrid collaborative representation based classification method that describes an image from the perspective of the shared features as well as the class-specific features. Moreover, to reduce the representation error and obtain a precise description, we learn a dictionary for hybrid collaborative representation from the training samples. We conduct extensive experiments on fine-grained image datasets to verify the superior performance of our proposed algorithm compared with conventional approaches.
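For context, the plain CRC core that the hybrid method builds on reduces to a ridge regression over all training samples followed by class-wise residual comparison; a minimal sketch in our notation (the shared/class-specific split and the learned dictionary sit on top of this):

```python
import numpy as np

def crc_classify(D, labels, y, lam=1e-3):
    """Baseline CRC: D is a (d, n) matrix of training samples as columns,
    labels is (n,), y is a (d,) test sample."""
    # Collaborative code over ALL training samples (ridge regression).
    n = D.shape[1]
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    # Assign the class whose samples best reconstruct y.
    residuals = {c: np.linalg.norm(y - D[:, labels == c] @ alpha[labels == c])
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)
```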

Digital Library: EI
Published Online: January  2018
Pages 373-1 - 373-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Face recognition technology has evolved enormously over the past few decades, from the first pioneering works to today's highly accurate state-of-the-art systems, but the ability to resist spoofing attacks was not addressed until recently. While a number of researchers have thrown themselves into the challenging mission of developing effective liveness detection methods against this kind of threat, the existing algorithms are usually limited by factors such as lighting conditions, response speed, and interactivity. In this paper, a novel approach is introduced based on the joint analysis of visible and near-infrared images of faces: three different features (bright pupil, HOG in the nose area, reflectance ratio) are extracted to form the final BPNGR feature vector, and an SVM classifier with an RBF kernel is trained to distinguish between genuine (live) and spoof faces. Experimental results on a self-collected database with 605 samples clearly demonstrate the superiority of our method over previous systems in terms of speed and accuracy.
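The classification stage could be sketched as follows (scikit-learn and scikit-image assumed; the feature computation details are illustrative stand-ins, not the paper's implementation): the three cues are concatenated into one vector and fed to an RBF-kernel SVM.

```python
import numpy as np
from sklearn.svm import SVC
from skimage.feature import hog

def bpngr_like_vector(nose_patch, pupil_brightness, reflectance_ratio):
    """Concatenate the three cues named in the abstract into one vector."""
    h = hog(nose_patch, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([[pupil_brightness, reflectance_ratio], h])

# RBF-kernel SVM, as in the abstract; X: (n, d) features, y: live/spoof.
clf = SVC(kernel="rbf", gamma="scale")
```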

Digital Library: EI
Published Online: January  2018
Pages 374-1 - 374-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Facial landmark localization plays a critical role in many face analysis tasks. In this paper, we present a coarse-to-fine cascaded convolutional neural network system for robust facial landmark localization of faces in the wild. The system consists of two cascaded convolutional neural network levels: the first-level network generates an initial prediction of all facial landmarks, and the second-level networks are cascaded to perform facial component-wise local refinement of the landmark points. We also present a novel data augmentation method for training facial landmark localization networks. Experimental results show that our method outperforms state-of-the-art methods on the 300W [18] common dataset.
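Structurally, the two-level cascade can be sketched as below; coarse_net, refine_nets, and crop_fn are placeholders for the trained models and the component-cropping logic, not APIs from the paper.

```python
def localize_landmarks(image, coarse_net, refine_nets, crop_fn):
    """Level 1 predicts all landmarks at once; level 2 refines each
    facial component (eyes, nose, mouth, ...) on a local crop."""
    landmarks = coarse_net(image)               # dict: component -> points
    for component, net in refine_nets.items():
        patch, offset = crop_fn(image, landmarks, component)
        landmarks[component] = net(patch) + offset  # back to image coords
    return landmarks
```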

Digital Library: EI
Published Online: January  2018
Pages 421-1 - 421-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

Cloud architectures are increasingly popular, so combining biometrics with cloud computing has become a trend for many applications. As a relatively new biometric, palm vein recognition has many merits, such as user friendliness, high accuracy, and robustness, and it is convenient to deploy in cloud computing: for example, a cell phone captures a palm vein image and the comparison is performed in the cloud. Usually, to reduce the computational burden on the phone and the volume of transmitted data, the palm vein image is compressed before transmission. However, how image compression affects recognition accuracy is not well studied. This paper empirically studies JPEG compression for three kinds of palm vein feature extraction methods. It is found that the subspace method is robust to image compression, the texture-based method is sensitive, and the line-based method falls in between.
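The compression study can be reproduced in outline with a JPEG encode/decode round trip at decreasing quality (OpenCV assumed; the image and quality grid below are stand-ins):

```python
import cv2
import numpy as np

def jpeg_round_trip(img, quality):
    """Encode and decode at the given JPEG quality, returning the
    degraded image a feature extractor would actually see."""
    ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    assert ok, "JPEG encoding failed"
    return cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE)

palm_image = np.zeros((128, 128), np.uint8)  # stand-in for a real capture
for q in (90, 70, 50, 30, 10):
    degraded = jpeg_round_trip(palm_image, q)
    # Extract subspace / texture-based / line-based features from
    # `degraded` and compare recognition accuracy per quality level.
```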

Digital Library: EI
Published Online: January  2018
Pages 419-1 - 419-6,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

We propose a deep learning method to retrieve the most similar well-designed 3D model that our system has seen before, given a rough 3D model or scanned 3D data. The retrieved model can either be used directly or serve as a reference for redesign. Our network consists of three different sub-networks: the first deals with object images (2D projections), and the other two deal with voxel representations of the 3D object. At the last stage, we combine the results of all three sub-nets to obtain the object classification. Furthermore, we use the second-to-last layer as a feature vector for feature matching and return a list of the top N most similar well-designed 3D models.
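The retrieval step itself reduces to nearest-neighbor search over the penultimate-layer embeddings; a small sketch (cosine similarity is our choice of metric, not stated in the abstract):

```python
import numpy as np

def top_n_similar(query_feat, gallery_feats, n=5):
    """Return indices of the N gallery models whose embeddings are most
    similar to the query embedding (cosine similarity)."""
    q = query_feat / np.linalg.norm(query_feat)
    G = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return np.argsort(-(G @ q))[:n]
```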

Digital Library: EI
Published Online: January  2018
Pages 375-1 - 375-5,  © Society for Imaging Science and Technology 2018
Volume 30
Issue 10

The field of view of a traditional camera is limited, so usually more than three cameras are needed to cover an entire surveillance area. Using multiple cameras requires more effort in camera control and setup, as well as additional algorithms to relate the images from the different cameras. In this paper, we present a multi-feature algorithm that employs a single omnidirectional camera, instead of multiple cameras, to cover the entire surveillance region. We use image gradients, local phase information based on phase congruency, the phase congruency magnitude, and color features, fused together into one descriptor named "Fused Phase, Gradients and Color features" (FPGC). The image gradients and the local phase information based on the phase congruency concept are used to extract human body shape features, and either LUV or grayscale channel features are used depending on the kind of camera. The phase congruency magnitude and orientation of each pixel in the input image are computed with respect to its neighborhood. The resulting images are divided into local regions, and the histogram of oriented phase and the histogram of oriented gradient are determined for each local region and combined. A maximum pooling of the candidate features is generated for one channel of the phase congruency magnitude and the three LUV color channels. All these features are fed to a decision-tree AdaBoost classifier for training and classification. The proposed approach is evaluated on a challenging omnidirectional dataset and shows promising performance.
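As one fragment of the descriptor pipeline, a magnitude-weighted orientation histogram per local region serves both the gradient and the phase-congruency channels, with a boosted decision-tree classifier on top (scikit-learn assumed; bin count and estimator settings are illustrative):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def oriented_histogram(magnitude, orientation, bins=9):
    """Magnitude-weighted histogram of orientations for one local region,
    the building block of the HOG and oriented-phase channels."""
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, np.pi),
                           weights=magnitude)
    return hist / (np.linalg.norm(hist) + 1e-6)

# Decision-tree AdaBoost, as in the abstract; X rows would be FPGC vectors.
clf = AdaBoostClassifier(n_estimators=200)
```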

Digital Library: EI
Published Online: January  2018
