Pages 1 - 4,  © Society for Imaging Science and Technology 2017
Digital Library: EI
Published Online: January  2017
Pages 5 - 12,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

Millions of cameras are openly connected to the Internet for a variety of purposes. This paper takes advantage of this resource to gather visual data. This camera data could be used for a myriad of purposes once two problems are solved: (i) network camera image data needs context to solve real-world problems, and (ii) while contextual data is available, it is not centrally aggregated. The goal is to make it easy to leverage the vast number of network cameras. The database allows users to aggregate camera data from over 119,000 network camera sources across the globe in real time. This paper explains how to collect publicly available information from network cameras. It describes how to analyze websites to retrieve relevant information about the cameras and how to calculate the cameras' refresh rates.
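
The abstract does not say how refresh rates are calculated. One plausible approach, shown here only as a minimal sketch, is to poll a camera faster than it is expected to update, detect when the returned image changes, and take the median time between changes. The function name `estimate_refresh_interval` and the synthetic samples are assumptions for illustration, not the authors' method.

```python
import hashlib
from statistics import median


def estimate_refresh_interval(samples):
    """Estimate a camera's refresh interval in seconds.

    `samples` is a list of (timestamp_seconds, image_bytes) pairs collected
    by polling one camera faster than it is expected to update.
    Returns None if fewer than two content changes were observed.
    """
    change_times = []
    previous_digest = None
    for timestamp, image_bytes in samples:
        digest = hashlib.sha256(image_bytes).hexdigest()
        # A digest that differs from the previous poll means a new frame.
        if previous_digest is not None and digest != previous_digest:
            change_times.append(timestamp)
        previous_digest = digest
    if len(change_times) < 2:
        return None
    intervals = [b - a for a, b in zip(change_times, change_times[1:])]
    # The median is robust to an occasional missed or delayed frame.
    return median(intervals)


# Synthetic data (assumed): poll once per second, frame changes every 5 s.
polled = [(t, f"frame-{t // 5}".encode()) for t in range(31)]
print(estimate_refresh_interval(polled))  # prints 5
```

The estimate can be no finer than the polling period, so the polling period should be well below the expected refresh interval.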

Digital Library: EI
Published Online: January  2017
Pages 13 - 19,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

Today, Online Social Networks (OSNs) have emerged as a pervasive form of media connecting people from all over the world. Among the core functionalities associated with OSNs, Instant Messaging (IM) plays a critical role in real-time communication within virtual online communities. As IM usage continues to grow, it has become the primary means of communication within business, education, and everyday life. Meanwhile, privacy management and data protection remain paramount to the future development of IM technology. In this work, we focus on data protection and privacy management for group chat, where multiple users simultaneously connect to a central server for real-time communication. We describe a novel multimedia IM system supporting user-defined security control over real-time communication in a multiuser environment. The system employs attribute-based encryption (ABE) to provide access control over transmitted user messages. Extensive experiments demonstrate that the new ABE key management mechanism provides a flexible and effective solution to data protection and privacy management for real-time online communication in multiuser environments.

Digital Library: EI
Published Online: January  2017
Pages 20 - 26,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

According to the National Highway Traffic Safety Administration, one in ten fatal crashes and two in ten injury crashes were reported as distracted driver accidents in the United States during 2014. In an attempt to mitigate these alarming statistics, this paper explores using a dashboard camera along with computer vision and machine learning to automatically detect distracted drivers. We consider a dataset that incorporates drivers engaging in seven different distracting behaviors using left and/or right hands. Traditional handcrafted features paired with a Support Vector Machine classifier are contrasted with deep Convolutional Neural Networks. The traditional features include a blend of Histogram of Oriented Gradients and Scale-Invariant Feature Transform descriptors used to create Bags of Words. The deep convolutional methods use transfer learning on AlexNet, VGG-16, and ResNet-152. The results yield 85% accuracy with ResNet and 82.5% accuracy with VGG-16, which outperformed AlexNet by almost 10%. Replacing the fully connected layers with a Support Vector Machine classifier did not improve the classification accuracy. The traditional features yielded much lower accuracy than the deep convolutional networks.

Digital Library: EI
Published Online: January  2017
Pages 27 - 36,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

Recent progress in deep learning methods has shown that key steps in object detection and recognition, including feature extraction, region proposals, and classification, can be done using Convolutional Neural Networks (CNNs) with high accuracy. However, the use of CNNs for object detection and recognition has significant technical challenges that still need to be addressed. One of the most daunting problems is the very large number of training images required for each class/label. One way to address this problem is through data augmentation, where linear and nonlinear transforms are applied to the training data to create "new" training images. Typical transformations include spatial flipping, warping, and other deformations. An important concept in data augmentation is that the deformations applied to the labeled training images do not change the semantic meaning of the classes/labels. In this paper we investigate several approaches to data augmentation. First, several data augmentation techniques are used to increase the size of the training dataset. Then, a Faster R-CNN is trained with the augmented dataset to detect and recognize objects. Our work is focused on two different scenarios: detecting objects in the wild (i.e., commercial logos) and detecting objects captured using a camera mounted on a computer system (i.e., toy animals).
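
As an illustration of the label-preserving idea, the sketch below applies spatial flipping, one of the transformations named in the abstract, to a tiny image and its bounding box annotation. The data layout (nested lists, boxes as `(label, x_min, y_min, x_max, y_max)` with an exclusive `x_max`) and the name `flip_horizontal` are assumptions made for this example, not the paper's pipeline.

```python
def flip_horizontal(image, boxes):
    """Mirror an image left to right and adjust its bounding boxes.

    `image` is a list of rows, each a list of pixel values.
    `boxes` is a list of (label, x_min, y_min, x_max, y_max) tuples in
    pixel coordinates, with x_max and y_max exclusive.
    """
    width = len(image[0])
    flipped_image = [list(reversed(row)) for row in image]
    # Only the x coordinates change; the class label is kept unchanged.
    flipped_boxes = [
        (label, width - x_max, y_min, width - x_min, y_max)
        for label, x_min, y_min, x_max, y_max in boxes
    ]
    return flipped_image, flipped_boxes


# Toy 2x4 image whose "object" (value 9) occupies the right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
boxes = [("logo", 2, 0, 4, 2)]

augmented_image, augmented_boxes = flip_horizontal(image, boxes)
print(augmented_image)  # [[9, 9, 0, 0], [9, 9, 0, 0]]
print(augmented_boxes)  # [('logo', 0, 0, 2, 2)]
```

The flipped pair can be added to the training set alongside the original. A horizontal flip is only label-preserving when mirrored objects still belong to the same class, which may not hold for logos that contain text.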

Digital Library: EI
Published Online: January  2017
Pages 37 - 44,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

We propose the use of a deep network to detect, segment, and characterize a Coordinate Measuring Machine (CMM) probe used in measuring various machine parts. Our motivation is to reduce the time taken for an operator to input various parameters of a CMM probe into the system, so that delay in quality assurance of machine parts can be negated. Using imagery from a high-resolution EO sensor, we design a probe recognition and characterization framework that can segment probe regions, classify various probe-region proposals into generic or specific probe components, and estimate the various configuration parameters of the probe. In order to measure a specific machine part, an operator provides the CMM with an image of an assembled probe. This end-to-end deep network-based framework will then generate configuration parameters suitable for the measurement task. Since the number of machine parts is on the order of thousands, the probe can have multiple configurations. In this work, we perform extensive analysis on a probe dataset captured in our lab and evaluate two main aspects of the framework: its ability to segment regions and its ability to classify those regions as probe components.

Digital Library: EI
Published Online: January  2017
Pages 45 - 50,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

Person detection and recognition have many applications in autonomous driving, smart homes, and smart offices. Knowledge about the presence of a person in the environment can be used in safety solutions such as collision avoidance, in energy conservation solutions such as turning lights and air-conditioning off when no one is around, and in meeting and collaboration solutions such as locating a vacant room. In this paper, we present a solution that can reliably detect and recognize persons under different lighting conditions and poses, based on head detection and recognition using deep learning. The system is shown to achieve good results on a challenging dataset.

Digital Library: EI
Published Online: January  2017
Pages 51 - 59,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

We present a click-based interactive segmentation method for indoor scenes, which allows the user to select an object or region within the scene in a few clicks. The goal of the click-based approach is to provide the user with a simple method that reduces the amount of input required for segmentation. We first present an effective global segmentation strategy, which provides a rough separation of different textures. The user then places a few clicks to segment the target. A novel Trimap assignment strategy is proposed to utilize the click information. To study the performance of our method, psychophysical experiments were conducted to compare our click-based approach with other existing methods.

Digital Library: EI
Published Online: January  2017
Pages 60 - 64,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

In recent years, the flexibility and ease of use of unmanned aerial vehicles (UAVs), together with their affordable cost, have increased drone use by industry and private users. However, drones carry the potential for many illegal activities, from smuggling illicit material and unauthorized reconnaissance and surveillance of targets and individuals to electronic and kinetic attacks in the most threatening scenarios. As a consequence, it has become important to develop effective and affordable countermeasures to report a drone flying over critical areas. In this context, our research selects different short-term parametrizations of environmental audio data in the time and frequency domains to develop a machine learning based UAV warning system, which employs support vector machines to understand and recognize the drone audio fingerprint. Preliminary experimental results have shown the effectiveness of the proposed approach.
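
The abstract does not list which short-term parameters are used. As a rough sketch of what short-term time- and frequency-domain parametrization can look like, the code below splits a signal into frames and computes three common descriptors per frame: short-time energy and zero-crossing rate (time domain) and spectral centroid (frequency domain, via a naive DFT). The feature choice, the function names, and the synthetic 500 Hz tone are all assumptions for illustration; in a system like the one described, such per-frame vectors would be the input to a support vector machine.

```python
import cmath
import math


def frame_features(frame, sample_rate):
    """Return (energy, zero_crossing_rate, spectral_centroid_hz) for a frame."""
    n = len(frame)
    energy = sum(x * x for x in frame) / n
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (n - 1)
    # Naive DFT magnitudes for bins 0..n/2 (fine for short frames).
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)
        )
        magnitudes.append(abs(coeff))
    total = sum(magnitudes)
    if total > 0:
        # Magnitude-weighted mean of the bin frequencies.
        centroid = sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total
    else:
        centroid = 0.0
    return energy, zcr, centroid


def short_term_features(signal, sample_rate, frame_length, hop):
    """Compute one feature tuple per (possibly overlapping) frame."""
    starts = range(0, len(signal) - frame_length + 1, hop)
    return [frame_features(signal[s:s + frame_length], sample_rate) for s in starts]


# Synthetic test tone (assumed): 500 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE + 0.1) for i in range(512)]
features = short_term_features(tone, SAMPLE_RATE, frame_length=256, hop=128)
for energy, zcr, centroid in features:
    print(round(energy, 3), round(zcr, 3), round(centroid, 1))
```

For the pure tone, the spectral centroid sits at the tone frequency and the energy is half the squared amplitude. A practical system would likely replace the naive DFT with an FFT and apply a window function to each frame.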

Digital Library: EI
Published Online: January  2017
Pages 65 - 69,  © Society for Imaging Science and Technology 2017
Volume 29
Issue 10

The thriving online fashion market has increasingly drawn people's attention. More and more small business owners and individual sellers have joined the traditional professional retail industry, which has led to the blooming of image-based online fashion communities and product photography. Accordingly, we have been dedicated to studying how to improve the aesthetic quality of fashion images. In previous work, based on the psychophysical experiments we conducted and the aesthetics evaluation of a given collection of photos, we designed features for aesthetics inference and introduced an SVM predictor to indicate image quality. Using this predictor, we investigate a large range of fashion photos, and our recent findings show that human aesthetic feedback on fashion images depends significantly on two other high-level factors: the nature of the background in the photo, and how the fashion items are displayed. We believe that fashion photos in which the fashion item is worn by a model or placed on a mannequin are more aesthetically pleasing than others, and likewise that people tend to prefer photos with a white background. Furthermore, based on ground truth data that we collected, we perform a statistical analysis to validate these conclusions.

Digital Library: EI
Published Online: January  2017
