Pages A08-1 - A08-6,  © Society for Imaging Science and Technology 2019
Digital Library: EI
Published Online: January  2019
Pages 145-1 - 145-5,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

Examining fetal well-being during routine prenatal care plays a crucial role in preventing pregnancy complications and reducing the risks of miscarriage, birth defects and other health problems. However, conventional prenatal screening and diagnosis are conducted by medical professionals in a clinical environment and are therefore limited by manpower, medical devices, location, time and cost of services. This paper presents a new approach to detecting and monitoring fetal movement safely and reliably without constraints of time, environment or cost. Unlike the conventional method, our contribution includes a novel soft sensor pad that automatically and non-intrusively detects fetal movement and uterine contraction, together with robust data analysis software that monitors pregnancy health and screens for abnormalities with quantitative assessment. The monitoring belt embedded with the soft sensor pad is wearable, non-intrusive, radiation-free and washable. The new algorithms are robust for noise removal, feature extraction, time sequence data analysis and decision support to achieve personalized care. Both the design of the soft sensor pad and the functions of the belt are original and unique. The results of preliminary clinical trials demonstrate the feasibility and advantages of our prototype.

Digital Library: EI
Published Online: January  2019
Pages 400-1 - 400-6,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

In this paper we present a Cluster Aggregation Network (CAN) for face set recognition. The network takes as input a set of face images, which may be a face video or a cluster containing any number of face images, and produces a compact, fixed-dimensional feature representation of the set for recognition. The network consists of two modules: a face feature embedding module and a face feature aggregation module. The first module is a deep Convolutional Neural Network (CNN) that maps each face image to a fixed-dimensional vector. The second module is also a CNN, trained to automatically assess the quality of the input face images and assign corresponding weights to the images’ feature vectors. A single aggregated feature vector representing the input set is then formed inside the convex hull of the individual face image features. Because the quality assessment does not depend on the position of an image within a set or on the number of images in the set, the aggregation is invariant to these factors. Our CAN is trained with a standard classification loss and no other supervision, and we found that the network is automatically attracted to high-quality face images while repelling low-quality ones, such as blurred, blocked, and non-frontal face images. We trained our networks on the CASIA and YouTube Face datasets, and experiments on the IJB-C video face recognition benchmark show that our method outperforms the current state-of-the-art feature aggregation methods and our challenging baseline aggregation method.
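The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes the per-image quality scores are normalized with a softmax, which is one common way to obtain positive weights that sum to one; the abstract does not state which normalization the authors use. The function name `aggregate` and the plain-list data layout are hypothetical.

```python
import math


def aggregate(features, quality_scores):
    """Pool per-image feature vectors into one set-level vector.

    features: list of equal-length lists of floats, one per face image.
    quality_scores: list of raw quality scores, one per image
                    (assumed to come from a quality-assessment network).
    """
    # Assumption: softmax normalization. It yields positive weights that
    # sum to 1, so the pooled vector is a convex combination of the inputs.
    peak = max(quality_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in quality_scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(features[0])
    pooled = [0.0] * dim
    for w, f in zip(weights, features):
        for d in range(dim):
            pooled[d] += w * f[d]
    return pooled, weights
```

Because the weights depend only on each image's own score and the sum over the set, reordering the images leaves the pooled vector unchanged, and the function accepts sets of any size.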

Digital Library: EI
Published Online: January  2019
Pages 401-1 - 401-6,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

Micro-expression (ME) analysis has recently become an attractive topic. Nevertheless, studies of ME mostly focus on the recognition task, while the spotting task is rarely addressed. Although micro-expression recognition methods have obtained promising results by applying deep learning techniques, the performance of ME spotting still needs to improve substantially. Most approaches still rely on traditional techniques, such as distance measurements between handcrafted features of frames, which are not robust enough to locate MEs correctly. In this paper, we propose a novel method for ME spotting based on a deep sequence model. Our framework consists of two main steps: 1) From each position in the video, we extract a spatial-temporal feature that can discriminate MEs from extrinsic movements. 2) We propose to use an LSTM network that exploits both the local and global correlation of the extracted features to predict the score of the ME apex frame. Experiments on two public databases for ME spotting demonstrate the effectiveness of our proposed method.

Digital Library: EI
Published Online: January  2019
Pages 402-1 - 402-9,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

Emotion plays an important role in daily life, as it helps people communicate with and understand each other more efficiently. Facial expressions can be classified into 7 categories: angry, disgust, fear, happy, neutral, sad and surprise. How to detect and recognize these seven emotions has become a popular topic in the past decade. In this paper, we develop an emotion recognition system that uses deep learning to recognize emotions in both still images and real-time videos. We build our own emotion recognition classification and regression system from scratch, including dataset collection, data preprocessing, model training and testing. Given an image or a real-time video, our system shows the classification and regression results for all 7 emotions. The proposed system is tested on 2 different datasets and achieves an accuracy of over 80%. Moreover, the results obtained from real-time testing demonstrate the feasibility of using convolutional neural networks in real time to detect emotions accurately and efficiently.

Digital Library: EI
Published Online: January  2019
Pages 403-1 - 403-8,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

We present a practical 3D-assisted face alignment framework based on cascaded regression. The 3D information embedded in a 2D face image is used to compute two novel components that improve the performance of 2D methods in unconstrained face alignment: the projected local patch and the visibility of each landmark. First, we propose to extract landmark-related features from local patches projected onto the 2D image from the corresponding 3D face model. Local patches of a fixed landmark in 3D face models for different 2D images cover the same anatomical region of the face, so the extracted features are more accurate for the subsequent regression of landmark locations. Second, we propose to estimate the visibility of 2D landmarks based on the 3D face model, which proves vital for addressing the large-pose face alignment problem. We adopt Local Binary Features (LBF) to extract landmark-related features in the proposed framework, and name the new method 3D-Assisted LBF (3DALBF). An extensive evaluation on two face databases shows that 3DALBF achieves better alignment results than the original 2D method while maintaining the speed advantage of 2D methods over 3D methods.

Digital Library: EI
Published Online: January  2019
Pages 404-1 - 404-5,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

This paper addresses the problem of face recognition using a graphical representation to identify structure that is common to pairs of images. Matching graphs are constructed in which nodes correspond to local brightness gradient directions and edges depend on the relative orientation of the nodes. Similarity is determined from the size of maximal matching cliques in pattern pairs. The method uses a single reference face image and achieves recognition without a training stage. On samples from MegaFace, the method obtains 100% correct recognition.
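The clique-based similarity in this abstract can be sketched with a generic association graph. The abstract does not give the exact node and edge definitions, so everything below is an assumption made for illustration: each image is reduced to hypothetical `(x, y, gradient_direction)` points, a node pairs two points with similar gradient direction, and an edge joins two nodes when the line between the paired points has a similar orientation in both images. The tolerance `tol` and all function names are hypothetical.

```python
import math
from itertools import combinations


def angle_diff(a, b):
    """Smallest absolute difference between two angles in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def build_matching_graph(pts_a, pts_b, tol=0.2):
    """Build an association graph between two point sets.

    Each point is (x, y, gradient_direction); this layout is an assumption.
    """
    # Node (i, j): point i in A and point j in B have similar gradient direction.
    nodes = [(i, j)
             for i, pa in enumerate(pts_a)
             for j, pb in enumerate(pts_b)
             if angle_diff(pa[2], pb[2]) < tol]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # a point may take part in only one correspondence
        # Orientation of the line joining the two points, in each image.
        ang_a = math.atan2(pts_a[k][1] - pts_a[i][1], pts_a[k][0] - pts_a[i][0])
        ang_b = math.atan2(pts_b[l][1] - pts_b[j][1], pts_b[l][0] - pts_b[j][0])
        if angle_diff(ang_a, ang_b) < tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique_size(adj):
    """Size of the largest clique, found with basic Bron-Kerbosch recursion."""
    best = 0

    def expand(size, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            best = max(best, size)  # the current clique is maximal
            return
        for v in list(candidates):
            expand(size + 1, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(0, set(adj), set())
    return best
```

Calling `max_clique_size(build_matching_graph(pts_a, pts_b))` gives a similarity score for the pair: a larger clique means more mutually consistent correspondences. Exact clique search is exponential in the worst case, so this sketch is only practical for small point sets.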

Digital Library: EI
Published Online: January  2019
Pages 406-1 - 406-7,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

Considering the complexity of a multimedia society and the subjective task of describing images with words, a visual search application is a valuable tool. This work implements a Content-Based Image Retrieval (CBIR) application for texture images with the goal of comparing three deep convolutional neural networks (VGG-16, ResNet-50, and DenseNet-161), used as image descriptors by extracting global features from images. For measuring similarity among images and ranking them, we employed cosine similarity, Manhattan distance, Bray-Curtis dissimilarity, and Canberra distance. We confirm that global average pooling applied to convolutional layers provides good texture descriptors, and propose to use it when extracting features from VGG-based models. Our best result uses the average pooling layer from DenseNet-161 as a 2208-dim feature vector along with Bray-Curtis dissimilarity. We achieved 73.09% mAP@1 and 76.98% mAP@5 on the Describable Textures Dataset (DTD) benchmark, adapted for image retrieval. Our mAP@1 result is comparable to the state-of-the-art classification accuracy (73.8%). We also investigate the impact on retrieval performance when reducing the number of feature components with PCA. We are able to compress a 2208-dim descriptor down to 128 components with a moderate 3.3 percentage points drop in mAP@1.
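The four measures named in this abstract, together with global average pooling, have standard textbook definitions, sketched below in plain Python. This is not the authors' code: the function names, the nested-list feature map layout (channels × height × width) and the `rank` helper are assumptions made for illustration, and the ratio-based measures assume descriptors that are not all zeros.

```python
import math


def global_average_pool(feature_map):
    """Average each channel of a channels x height x width nested list."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]


def cosine_similarity(a, b):
    """Higher means more similar; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def bray_curtis(a, b):
    """sum|a - b| / sum|a + b|; assumes the denominator is non-zero."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(abs(x + y) for x, y in zip(a, b))
    return num / den


def canberra(a, b):
    """Sum of |a - b| / (|a| + |b|), skipping components where both are 0."""
    total = 0.0
    for x, y in zip(a, b):
        den = abs(x) + abs(y)
        if den:
            total += abs(x - y) / den
    return total


def rank(query, gallery, dissimilarity):
    """Return gallery indices ordered from most to least similar."""
    return sorted(range(len(gallery)),
                  key=lambda i: dissimilarity(query, gallery[i]))
```

For example, `rank(q, gallery, bray_curtis)` returns the retrieval order for query descriptor `q`. To rank with cosine similarity, pass a dissimilarity such as `lambda a, b: 1 - cosine_similarity(a, b)`.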

Digital Library: EI
Published Online: January  2019
Pages 412-1 - 412-5,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

In the competitive online fashion marketplace, it is common for sellers to add artificial elements to their product images in the hope of improving the aesthetic quality of their products. Among the numerous types of artificial elements, we focus on detecting artificial frames in fashion images, and we propose a novel algorithm based on traditional image processing techniques for this purpose. Although deep learning methods have been very powerful and effective in many image processing tasks in recent years, they have drawbacks in some cases that render them ineffective compared to our method for this particular task. Experimental results on 1000 testing images show that our algorithm performs comparably to some of the state-of-the-art deep learning models that have been used for classification.

Digital Library: EI
Published Online: January  2019
Pages 413-1 - 413-7,  © Society for Imaging Science and Technology 2019
Volume 31
Issue 8

A barcode is a representation of data, including information about goods offered for sale, that frequently appears on manufactured items. In online fashion markets such as Poshmark (a second-hand fashion market), barcodes on the tags of sale items carry identifying information such as the producer and manufacturer. The market needs a system that automatically detects and decodes barcodes in real time. However, existing methods have limitations in detecting 1-D barcodes against varied backgrounds, including tassels, stripes, and clustered text in fashion images. In this research, we focus on identifying barcodes in fashion images and distinguishing them from similar non-barcode image content. This is accomplished by applying a Convolutional Neural Network (CNN) to solve this typical object detection problem. Our results compare the performance of our algorithm with that of a previous method. A traditional method based on hand-crafted features is also proposed for comparison. For the decoding part, we use a package that includes the current common types of decoding schemes to decode the detected barcodes. However, it fails to decode strongly skewed barcode images. A pre-processing step that warps the skewed images is added to increase the decoding success rate.

Digital Library: EI
Published Online: January  2019
