Keywords: face recognition, face simulation, face pose normalization, multi-view face alignment
Pages 1 - 3,  © Society for Imaging Science and Technology 2016
Digital Library: EI
Published Online: February  2016
Pages 1 - 4,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

In this paper, we propose face pose normalization and simulation methods based on multi-view face alignment that improve the performance of face recognition algorithms under large pose variation. The proposed approach has two steps: 1) multi-view face alignment and 2) face pose normalization and simulation. The multi-view face alignment algorithm is inspired by the Supervised Descent Method (SDM), which is considered the state of the art in face alignment. We modify the algorithm for the multi-view problem by replacing the histogram-of-gradient feature with a projection-of-gradient feature in order to handle large pose variation. In addition, the feature scale is adjusted adaptively for different parts of the face, such as the eyes, mouth, and eyebrows. Based on the multi-view face alignment results, 2D face normalization and simulation methods are proposed. Experimental results on many images with pronounced pose changes show that our method can significantly normalize faces with multi-view poses and improve the accuracy of existing common face recognition methods when the faces in the probe sets have large pose variation.
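For readers unfamiliar with SDM, the generic cascaded-regression update it is built on can be written as follows. This is a sketch of the standard SDM formulation only, not of the modified multi-view algorithm in this paper. The symbols are assumptions introduced here: $x_k$ is the landmark estimate at stage $k$, $\phi(\cdot)$ is the local feature extracted around the landmarks (the abstract's projection-of-gradient feature would take this role), and $R_k$, $b_k$ are the regressor and bias learned for stage $k$ from training shapes $x_*^i$.

```latex
% Landmark update at cascade stage k
x_{k+1} = x_k + R_k \, \phi(x_k) + b_k

% Stage-wise training by linear least squares over training samples i
(R_k, b_k) = \arg\min_{R,\, b} \sum_i
  \left\| \left( x_*^i - x_k^i \right) - R \, \phi\!\left( x_k^i \right) - b \right\|_2^2
```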

Digital Library: EI
Published Online: February  2016
Pages 1 - 5,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

Consistent monitoring of a right-of-way (ROW) is an important task for protecting the integrity of pipeline infrastructure. Pipeline monitoring is typically conducted visually by ground-based and airborne inspection crews. In this paper, we present a full-fledged, real-time automated airborne monitoring system that can detect, recognize, and localize machinery threats, such as construction equipment, occurring on a pipeline ROW. In our approach, a modular key frame (MKF) selection technique is developed to improve data processing speed, a pyramid Fourier histogram feature is used for feature extraction, and a cascaded classifier is introduced for object categorization. Experimental results on two real-world datasets indicate that the proposed system is able to detect and recognize objects in challenging conditions such as low illumination, varying resolution, and partial occlusion. The results also show that our system can reach real-time processing speeds with good accuracy, offering a new and useful tool for wide-area pipeline surveillance.

Digital Library: EI
Published Online: February  2016
Pages 1 - 5,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

This paper presents a new system for monitoring retinal microaneurysms, which are regarded as the first sign of diabetic retinopathy (DR). The proposed approach to automatic microaneurysm detection aims to improve the screening of large populations. Most existing computer-aided systems for microaneurysm detection rely on sophisticated medical devices in a clinical environment. However, popular devices such as table-top and portable fundus cameras have limitations that hinder their use outside clinical practice, including complexity of operation, cost, and the need for professional maintenance. Unlike conventional approaches, we developed an automatic mobile retinal microaneurysm detection system that uses a handheld fundus camera to make retinal healthcare and monitoring flexible and convenient. Our system includes: (1) retinal image capture with a handheld fundus camera; (2) retinal image analysis via cloud computing; and (3) microaneurysm detection with a Multi-orientation Sum of Matched Filter and an SVM. The experimental results demonstrate the feasibility of our system through improvements in speed, accuracy, and convenience.

Digital Library: EI
Published Online: February  2016
Pages 1 - 5,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

In real-world face recognition (FR) scenarios, illumination variation is known to be a challenging problem because face appearance changes dramatically with the illumination conditions. To deal with this variation effectively, this paper proposes an illumination-reduced feature learning method using a deep convolutional neural network (DCNN). It is motivated by the capability of deep learning to represent highly complicated nonlinear structures. Our learning method mainly comprises two steps: 1) learning illumination patterns to eliminate the illumination effect and 2) learning to maximize the discriminative power of the feature representation. Experimental results on the CMU Multi-PIE database demonstrate that the proposed method outperforms previous work in terms of FR accuracy.

Digital Library: EI
Published Online: February  2016
Pages 1 - 5,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

In this paper, we describe a method for hazardous material (hazmat) sign location detection based on Fourier shape descriptors. The proposed method matches on both the magnitude and the phase of the Fourier descriptor. The contributions of this paper include a contour extraction method based on color channel clipping followed by image binarization. The experimental results show that our method is robust to geometric distortion, low resolution, blur, varying lighting conditions, and perspective changes.
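To make the idea of a Fourier shape descriptor concrete, here is a minimal sketch of the textbook magnitude-only variant. It is not the paper's matcher, which also uses phase. The function names `fourier_descriptor` and `descriptor_distance`, the parameter `n_coeffs`, and the choice to normalize by the largest magnitude are assumptions made for illustration. The contour is assumed to be an ordered, closed boundary with more than `2 * n_coeffs` points.

```python
import numpy as np

def fourier_descriptor(contour, n_coeffs=8):
    """Magnitude-only Fourier descriptor of a closed contour.

    contour: (N, 2) array of ordered boundary points, N > 2 * n_coeffs.
    """
    pts = np.asarray(contour, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]          # boundary as a complex signal
    coeffs = np.fft.fft(z)
    # Skip the DC term (index 0): it only encodes the contour's position.
    idx = np.r_[1:n_coeffs + 1, -n_coeffs:0]  # low positive and negative frequencies
    mags = np.abs(coeffs[idx])              # magnitudes ignore rotation and start point
    return mags / mags.max()                # dividing by the largest removes scale

def descriptor_distance(d1, d2):
    """Euclidean distance between two descriptors (smaller means more similar)."""
    return float(np.linalg.norm(d1 - d2))

# Usage: an ellipse compared with a rotated, scaled, shifted copy of itself.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
ellipse = np.column_stack([3.0 * np.cos(t), np.sin(t)])
angle = np.deg2rad(40.0)
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle), np.cos(angle)]])
moved = np.roll(2.5 * ellipse @ rot.T + np.array([10.0, -4.0]), 5, axis=0)
print(descriptor_distance(fourier_descriptor(ellipse), fourier_descriptor(moved)))
```

Discarding phase is what gives this variant its rotation and start-point invariance. A matcher that also compares phase, as the abstract describes, can distinguish shapes that share a magnitude spectrum, but it must then handle rotation and start point explicitly.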

Digital Library: EI
Published Online: February  2016
Pages 1 - 5,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

Big data applications are growing rapidly as more sensors are connected to the internet and gather business-critical information for processing. Imaging sensors are an important type of sensor for collecting image and video data, and are widely deployed on smartphones, video surveillance networks, and robots. Traditional databases are designed to ingest and search structured information. The analysis of unstructured information such as images and videos is often done separately. In this paper, we describe a big data system with deep integration of face analysis and recognition in images and videos. We show how the built-in parallelization of the Vertica database system can be used to accelerate feature computation and search. We also show an example application of the system for face re-identification and face search in images and videos.

Digital Library: EI
Published Online: February  2016
Pages 1 - 6,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

This paper investigates the interaction between two people, namely, a caregiver and an infant. A particular type of action in human interaction known as “touch” is described. We propose a method for detecting a “touch event” that uses color and motion features to track the hand positions of the caregiver. Our approach addresses the problem of hand occlusions during tracking. We propose an event recognition method that determines the time at which the caregiver touches the infant and labels it as a “touch event” by analyzing the merging of the contours of the caregiver’s hands with the infant’s contour. The proposed method shows promising results compared to human-annotated data.

Digital Library: EI
Published Online: February  2016
Pages 1 - 6,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

Matrix factorization has been a key technique for learning latent factor models in many computer vision and pattern recognition applications, such as image annotation and collaborative prediction. Specifically, in collaborative filtering problems, the goal of matrix factorization is to predict missing values from a low-rank factorization fitted to the observed entries. Among various algorithms, maximum margin matrix factorization has been a successful approach to discriminative collaborative filtering problems where the input matrix is binary. In this paper, we consider the problem of one-class discriminative collaborative filtering, where the data matrix is binary and only positive values can be observed, i.e., each entry of the data matrix is either observed as positive or missing. Many real applications fall into this category. For example, given an image with an incomplete tag list (cat, tree, garden), we can be sure that the image contains a cat, but we cannot tell whether it contains grass, because the tag list is incomplete. To cope with this problem, we propose one-class Maximum Margin Matrix Factorization (one-class MMMF), which combines the applicability of one-class SVM with the discriminative power of maximum margin matrix factorization. Extensive experiments on both simulated toy data and real benchmark image datasets demonstrate that the proposed approach is considerably superior to traditional approaches, which simply treat unobserved entries as negative.
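As background, the standard two-class maximum margin matrix factorization objective that this work starts from can be sketched as below. This is the conventional MMMF formulation, not the one-class variant proposed in the paper, whose exact objective is not given in the abstract. The notation is an assumption introduced here: $Y_{ij} \in \{-1, +1\}$ are the observed labels, $S$ is the set of observed entries, $X$ is the predicted score matrix, $\|X\|_\Sigma$ is its trace norm, and $C$ is a trade-off constant.

```latex
% Standard MMMF: trace-norm regularization plus hinge loss on observed entries
\min_{X} \; \|X\|_{\Sigma} + C \sum_{(i,j) \in S} \max\!\left(0,\; 1 - Y_{ij} X_{ij}\right)

% Equivalent factored form with X = U V^{\top}
\min_{U,\, V} \; \tfrac{1}{2}\left( \|U\|_F^2 + \|V\|_F^2 \right)
  + C \sum_{(i,j) \in S} \max\!\left(0,\; 1 - Y_{ij} \left(U V^{\top}\right)_{ij}\right)
```

In the one-class setting described above, every observed $Y_{ij}$ equals $+1$, which is why this objective cannot be applied directly without some assumption about the unobserved entries.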

Digital Library: EI
Published Online: February  2016
Pages 1 - 6,  © Society for Imaging Science and Technology 2016
Volume 28
Issue 11

Photo aesthetic quality prediction with machine learning techniques is an active yet challenging research topic. One of the most critical components of this task is obtaining reliable ground truth for photo aesthetic quality through psychophysical experiments. A common approach is to use the average or the majority vote of all collected scores for a photo as its ground truth. However, these traditional approaches do not take into account the different levels of expertise of the experiment subjects. Furthermore, they tend to be unstable when the number of assessments is small. In this paper, we propose a strategy that focuses on improving the reliability of the ground truth estimated from human-given photo aesthetic scores. Instead of simply calculating the majority vote or average score of each photo, we adopt a generative Bayesian approach to simultaneously infer each photo’s true aesthetic quality score, the difficulty of correctly assessing the photo, and each subject’s expertise. The statistical model fits into the expectation-maximization (EM) framework. It models the collected data with a discrete truncated Gaussian distribution whose parameters represent the hidden ground truth score, the difficulty of correctly assessing each photo, and each subject’s expertise.
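The two baseline estimators that the abstract argues against, the per-photo average and the majority vote, can be sketched as follows. This illustrates the baselines only, not the proposed Bayesian/EM model. The function name `baseline_ground_truth`, the integer rating scale 1 to `n_levels`, and the tie-breaking rule are assumptions made for this example.

```python
import numpy as np

def baseline_ground_truth(scores, n_levels=5):
    """Average and majority-vote score per photo.

    scores: (n_subjects, n_photos) integer ratings in 1..n_levels.
    Returns (mean_score, majority_score), each of length n_photos.
    """
    scores = np.asarray(scores)
    mean_score = scores.mean(axis=0)
    # Count how often each rating level was given to each photo.
    counts = np.stack([(scores == level).sum(axis=0)
                       for level in range(1, n_levels + 1)])
    # Most frequent level per photo; ties resolve to the lowest level.
    majority_score = counts.argmax(axis=0) + 1
    return mean_score, majority_score

# Usage: three subjects rate two photos; one subject disagrees on photo 0.
ratings = [[5, 2],
           [5, 3],
           [1, 3]]
mean_score, majority_score = baseline_ground_truth(ratings)
print(mean_score, majority_score)
```

With only three raters, a single dissenting score moves the average for the first photo well away from the majority vote. This is the small-sample instability the abstract mentions, and neither estimator weights subjects by expertise.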

Digital Library: EI
Published Online: February  2016
