IS&T | Library

Conference Overview and Papers Program

Multimedia Analysis
Machine Learning
Mobile
Imaging
Web

Pages A08-1 - A08-6, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-A08

Volume 31

Issue 8

Digital Library: EI

Published Online: January 2019

Smart Fetal Care

computer aided healthcare
soft sensors
signal enhancement
feature extraction
data fusion
robust classification
analysis of time series data
pattern recognition

Jane YOU, Qin LI, Qiaozhu Chen, Zhenhua Guo, Hongbo Yang

Pages 145-1 - 145-5, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-145

Volume 31

Issue 8

The examination of fetal well-being during routine prenatal care plays a crucial role in preventing pregnancy complications and reducing the risks of miscarriage, birth defects and other health problems. However, conventional prenatal screening and diagnosis are conducted by medical professionals in a clinical environment, which is subject to limitations such as manpower, medical devices, location, time and cost of services. This paper presents a new approach to detecting and monitoring fetal movement safely and reliably without constraints of time, environment or cost. Unlike the conventional method, our contribution includes a novel soft sensor pad that automatically and non-intrusively detects fetal movement and uterine contraction, and robust data analysis software that monitors pregnancy health and screens for abnormalities with quantitative assessment. The monitoring belt embedded with the soft sensor pad is wearable, non-intrusive, radiation-free and washable. The new algorithms are robust for noise removal, feature extraction, time-sequence data analysis and decision support to achieve personalized care. Both the design of the soft sensor pad and the functions of the belt are original and unique. The results of preliminary clinical trials demonstrate the feasibility and advantages of our prototype.

Digital Library: EI

Published Online: January 2019

Face Set Recognition

computer vision
deep learning
image processing

Tongyang Liu, Xiaoyu Xiang, Qian Lin, Jan P Allebach

Pages 400-1 - 400-6, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-400

Volume 31

Issue 8

In this paper we present a Cluster Aggregation Network (CAN) for face set recognition. The network takes as input a set of face images, which may be a face video or a cluster containing any number of face images, and produces a compact, fixed-dimensional feature representation of the face set for recognition. The network consists of two modules: a face feature embedding module and a face feature aggregation module. The first module is a deep Convolutional Neural Network (CNN) that maps each face image to a fixed-dimensional vector. The second module is also a CNN, trained to automatically assess the quality of the input face images and weight their feature vectors accordingly. The single aggregated feature vector representing the input set therefore lies inside the convex hull of the individual face image features. Because the quality assessment of an image depends neither on its position in the set nor on the number of images in the set, the aggregation is invariant to both factors. Our CAN is trained with a standard classification loss and no other supervision, and we found that the network is automatically attracted to high-quality face images while repelling low-quality ones, such as blurred, occluded, and non-frontal face images. We trained our networks on the CASIA and YouTube Face datasets, and experiments on the IJB-C video face recognition benchmark show that our method outperforms the current state-of-the-art feature aggregation methods as well as our own challenging baseline aggregation method.
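The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes that the per-image quality scores are already available as plain numbers (in the paper they come from a trained CNN) and that they are turned into weights with a softmax; the abstract does not state the normalization, so the softmax and the function names `softmax` and `aggregate` are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(scores):
    """Turn arbitrary real-valued scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(features, quality_scores):
    """Quality-weighted average of per-image feature vectors.

    features:       list of equal-length feature vectors, one per face image
    quality_scores: one score per image (assumed given; hypothetical input)

    The weights are non-negative and sum to 1, so the result is a convex
    combination of the inputs and lies inside their convex hull.
    """
    weights = softmax(quality_scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

# Usage: three 2-D "face features" with hypothetical quality scores.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
scores = [0.1, 2.0, -1.0]
set_feature = aggregate(feats, scores)
```

Each weight depends only on its own image's score and on a sum over all scores, so reordering the images leaves the result unchanged, and the output has the same dimension for any set size.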

Digital Library: EI

Published Online: January 2019

Dense prediction for micro-expression spotting based on deep sequence model

Micro-expression spotting
dense prediction
Long short-term memory

Thuong-Khanh Tran, Quang-Nhat Vo, Xiaopeng Hong, Guoying Zhao

Pages 401-1 - 401-6, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-401

Volume 31

Issue 8

Micro-expression (ME) analysis has recently become an attractive topic. Nevertheless, studies of MEs mostly focus on the recognition task, while the spotting task is rarely addressed. While micro-expression recognition methods have obtained promising results by applying deep learning techniques, the performance of ME spotting still needs substantial improvement. Most approaches still rely on traditional techniques, such as distance measurements between handcrafted features of frames, which are not robust enough to detect ME locations correctly. In this paper, we propose a novel method for ME spotting based on a deep sequence model. Our framework consists of two main steps: 1) from each position in the video, we extract a spatial-temporal feature that can discriminate MEs from extrinsic movements; 2) we use an LSTM network that exploits both local and global correlations of the extracted features to predict the score of the ME apex frame. Experiments on two publicly available ME spotting databases demonstrate the effectiveness of the proposed method.

Digital Library: EI

Published Online: January 2019

Emotion Recognition Using Convolutional Neural Networks

Deep Learning
Emotion Recognition
Computer Vision

Shaoyuan Xu, Yang Cheng, Qian Lin, Jan Allebach

Pages 402-1 - 402-9, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-402

Volume 31

Issue 8

Emotion plays an important role in daily life, as it helps people communicate with and understand each other more efficiently. Facial expressions can be classified into 7 categories: angry, disgust, fear, happy, neutral, sad and surprise. How to detect and recognize these seven emotions has become a popular topic in the past decade. In this paper, we develop a deep-learning-based emotion recognition system that works on both still images and real-time video. We build our own emotion recognition classification and regression system from scratch, including dataset collection, data preprocessing, model training and testing. Given an image or a real-time video, our system shows the classification and regression results for all 7 emotions. The proposed system is tested on 2 different datasets and achieves an accuracy of over 80%. Moreover, the results of real-time testing demonstrate the feasibility of using convolutional neural networks to detect emotions accurately and efficiently in real time.

Digital Library: EI

Published Online: January 2019

Face Alignment via 3D-Assisted Features

3D-assisted features
Face alignment
Facial landmark detection
Cascaded regression
3D Morphable Model

Song Guo, Fei Li, Hajime Nada, Hidetsugu Uchida, Tomoaki Matsunami, Narishige Abe

Pages 403-1 - 403-8, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-403

Volume 31

Issue 8

In this paper we present a practical 3D-assisted face alignment framework based on cascaded regression. The 3D information embedded in a 2D face image is used to compute two novel components that improve the performance of 2D methods in unconstrained face alignment: the projected local patch and the visibility of each landmark. First, we propose to extract landmark-related features from local patches projected onto the 2D image from the corresponding 3D face model. Local patches of a fixed landmark in the 3D face models for different 2D images cover the same anatomical region of the face, so the extracted features are more accurate for the subsequent regression of landmark locations. Second, we propose to estimate the visibility of 2D landmarks based on the 3D face model, which proves vital for addressing the large-pose face alignment problem. We adopt Local Binary Features (LBF) to extract landmark-related features in the proposed framework and name the new method 3D-Assisted LBF (3DALBF). An extensive evaluation on two face databases shows that 3DALBF achieves better alignment results than the original 2D method while maintaining the speed advantage of 2D methods over 3D methods.

Digital Library: EI

Published Online: January 2019

Face Recognition by the Construction of Matching Cliques of Points

Face recognition
pattern recognition
similarity
human vision
graph matching

Fred Stentiford

Pages 404-1 - 404-5, January 2019, © Society for Imaging Science and Technology 2019

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-404

Volume 31

Issue 8

This paper addresses the problem of face recognition using a graphical representation to identify structure that is common to pairs of images. Matching graphs are constructed in which nodes correspond to local brightness gradient directions and edges depend on the relative orientation of the nodes. Similarity is determined from the size of maximal matching cliques in pattern pairs. The method uses a single reference face image and achieves recognition without a training stage. On samples from MegaFace, the method obtains 100% correct recognition.

Digital Library: EI

Published Online: January 2019

Comparison of texture retrieval techniques using deep convolutional features

CBIR
Deep learning
Texture features
Texture retrieval
Visual similarity
Image features
Image retrieval
Convolutional neural network

Augusto C Valente, Fábio V. M Perez, Guilherme A. S Megeto, Marcos H Cascone, Otavio Gomes, Thomas S Paula, Qian Lin

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-406

Volume 31

Issue 8

Considering the complexity of a multimedia society and the subjective task of describing images with words, a visual search application is a valuable tool. This work implements a Content-Based Image Retrieval (CBIR) application for texture images with the goal of comparing three deep convolutional neural networks (VGG-16, ResNet-50, and DenseNet-161), used as image descriptors by extracting global features from images. For measuring similarity among images and ranking them, we employed cosine similarity, Manhattan distance, Bray-Curtis dissimilarity, and Canberra distance. We confirm that global average pooling applied to convolutional layers provides good texture descriptors, and propose to use it when extracting features from VGG-based models. Our best result uses the average pooling layer from DenseNet-161 as a 2208-dim feature vector along with Bray-Curtis dissimilarity. We achieved 73.09% mAP@1 and 76.98% mAP@5 on the Describable Textures Dataset (DTD) benchmark, adapted for image retrieval. Our mAP@1 result is comparable to the state-of-the-art classification accuracy (73.8%). We also investigate the impact on retrieval performance of reducing the number of feature components with PCA. We are able to compress a 2208-dim descriptor down to 128 components with a moderate drop of 3.3 percentage points in mAP@1.
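As a rough illustration of the retrieval pipeline named in this abstract, the sketch below implements global average pooling and the four measures using their standard textbook definitions. The function names, the tiny vectors, and the convention of treating a 0/0 Canberra term as 0 are assumptions made for this example; the abstract does not give the authors' code or exact conventions.

```python
import math

def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) to a C-dim descriptor."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

def cosine_distance(a, b):
    """1 - cosine similarity, so that smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def bray_curtis(a, b):
    """sum|a_i - b_i| / sum|a_i + b_i|."""
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(abs(x + y) for x, y in zip(a, b))

def canberra(a, b):
    """sum of |a_i - b_i| / (|a_i| + |b_i|); 0/0 terms are counted as 0 (assumption)."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total

def rank(query, gallery, dist):
    """Return gallery indices ordered from most to least similar to the query."""
    return sorted(range(len(gallery)), key=lambda i: dist(query, gallery[i]))

# Usage: rank three hypothetical descriptors against a query.
gallery = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
order = rank([1.0, 0.0], gallery, manhattan)
```

All four functions are written as dissimilarities so that a single `rank` helper can sort by any of them; cosine similarity is converted to a distance by subtracting it from 1.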

Digital Library: EI

Published Online: January 2019

Frame Detection for Photos of Online Fashion Items

On-line fashion images
Detection
Frame

Litao Hu, Jan Allebach, Gautam Golwala, Sathya Sundaram, Perry Lee

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-412

Volume 31

Issue 8

In the competitive online fashion marketplace, it is common for sellers to add artificial elements to their product images in the hope of improving the aesthetic quality of their products. Among the numerous types of artificial elements, we focus in this paper on detecting artificial frames in fashion images, and we propose a novel algorithm based on traditional image processing techniques for this purpose. Although deep learning methods have been very powerful and effective in many image processing tasks in recent years, they have drawbacks in some cases that render them less effective than our method for this particular task. Experimental results on 1000 test images show that our algorithm performs comparably to some of the state-of-the-art deep learning models that have been used for classification.

Digital Library: EI

Published Online: January 2019

Barcode Detection and Decoding in On-line Fashion Images

Computer Vision
Barcode Detection
Barcode Decoding
Skewed Image
Fashion Image
Convolutional Neural Network

Qingyu Yang, Gautam Golwala, Sathya Sundaram, Perry Lee, Jan Allebach

DOI

10.2352/ISSN.2470-1173.2019.8.IMAWM-413

Volume 31

Issue 8