Fluorescence microscopy has become a widely used tool for studying various biological structures in tissue and cells in vivo. However, quantitative analysis of these biological structures remains a challenge due to their complexity, which is exacerbated by distortions from lens aberrations and light scattering. Moreover, manual quantification of such image volumes is an intractable and error-prone process, making automated image analysis methods crucial. This paper describes a segmentation method for tubular structures in fluorescence microscopy images using convolutional neural networks with data augmentation and inhomogeneity correction. The segmentation results of the proposed method are visually and numerically compared with those of other microscopy segmentation methods. Experimental results indicate that the proposed method outperforms the other methods at correctly segmenting and identifying multiple tubular structures.
Work on convolutional neural networks (CNNs) tends to improve performance by adding more input data, modifying existing data, or changing the network design to better suit the problem. The goal of this work is to supplement the small number of existing methods that do not use any of these techniques. This research aims to show that, with a standard CNN, classification accuracy can be improved without changes to the data or major network design modifications, such as adding convolution or pooling layers. A new layer is proposed that is inserted at the same location as the non-linearity functions in standard CNNs. This new layer gives each perceptron a localized connectivity to a polynomial of degree N, and it can be used in both the convolutional portion and the fully connected portion of the network. The proposed polynomial layer is added with the idea that the higher dimensionality enables a better description of the input space, which leads to a higher classification rate. Two datasets, MNIST and CIFAR10, are used for classification; each contains 10 distinct classes and has a similar training-set size, 60,000 and 50,000 images, respectively. The datasets differ in that the images are 28 × 28 grayscale and 32 × 32 RGB, respectively. It is shown that the added polynomial layers enable the chosen CNN design to achieve higher accuracy on the MNIST dataset; on CIFAR10, a similar effect was observed only at a lower learning rate.
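The abstract does not specify an implementation, but a minimal PyTorch sketch of such a polynomial layer might look as follows. The class name `PolynomialLayer`, the shared (rather than per-perceptron) coefficients, and the surrounding toy network are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class PolynomialLayer(nn.Module):
    """Hypothetical sketch: maps each activation x to a learned
    N-th degree polynomial a_0 + a_1*x + ... + a_N*x^N, applied
    element-wise where a non-linearity would normally sit."""

    def __init__(self, degree=3):
        super().__init__()
        # One learnable coefficient per power; a_1 starts at 1 so the
        # layer initially behaves like the identity function.
        coeffs = torch.zeros(degree + 1)
        coeffs[1] = 1.0
        self.coeffs = nn.Parameter(coeffs)

    def forward(self, x):
        out = torch.zeros_like(x)
        for n, a in enumerate(self.coeffs):
            out = out + a * x.pow(n)
        return out

# Example: drop the layer in where a ReLU would normally go.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    PolynomialLayer(degree=3),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),
)
x = torch.randn(8, 1, 28, 28)   # MNIST-sized batch
print(model(x).shape)           # torch.Size([8, 10])
```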
A flashover occurs when a fire spreads very rapidly through crevices due to intense heat. Flashovers present one of the most frightening and challenging fire phenomena to those who regularly encounter them: firefighters. Firefighters' safety and lives often depend on their ability to predict flashovers before they occur. Typical pre-flashover fire characteristics include dark smoke, high heat, and rollover ("angel fingers"), and can be quantified by color, size, and shape. Using a color video stream from a firefighter's body camera, we applied generative adversarial networks for image enhancement. The networks were trained to enhance very dark fire and smoke patterns in videos and to monitor dynamic changes in smoke and fire areas. Preliminary tests with limited flashover training videos showed that the method predicted a flashover as early as 55 seconds before it occurred.
Understanding complex events from unstructured video, like scoring a goal in a football game, is an extremely challenging task due to the dynamics, complexity, and variation of video sequences. In this work, we attack this problem by exploiting the capabilities of the recently developed framework of deep learning. We consider independently encoding spatial and temporal information via convolutional neural networks and fusing the resulting features via regularized autoencoders. To demonstrate the capabilities of the proposed scheme, a new dataset is compiled, composed of goal and no-goal sequences. Experimental results demonstrate that extremely high classification accuracy can be achieved from a very limited number of examples by leveraging pre-trained models with fine-tuned fusion of spatio-temporal features.
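As a rough illustration of the spatio-temporal fusion step, the following sketch concatenates pre-extracted spatial and temporal CNN features and learns a compact joint code with a weight-decay-regularized autoencoder. The feature dimensions, layer sizes, and class name are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FusionAutoencoder(nn.Module):
    """Illustrative sketch: fuse pre-extracted spatial and temporal
    CNN features by concatenation, then learn a compact joint code
    with a weight-decay-regularized autoencoder."""

    def __init__(self, spatial_dim=4096, temporal_dim=4096, code_dim=256):
        super().__init__()
        fused_dim = spatial_dim + temporal_dim
        self.encoder = nn.Sequential(nn.Linear(fused_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, fused_dim)

    def forward(self, spatial_feat, temporal_feat):
        fused = torch.cat([spatial_feat, temporal_feat], dim=1)
        code = self.encoder(fused)
        recon = self.decoder(code)
        return code, recon, fused

model = FusionAutoencoder()
# weight_decay supplies the regularization on the autoencoder weights
optim = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

s = torch.randn(4, 4096)   # e.g. features from an appearance (spatial) CNN
t = torch.randn(4, 4096)   # e.g. features from a motion (temporal) CNN
code, recon, fused = model(s, t)
loss = nn.functional.mse_loss(recon, fused)   # reconstruction objective
loss.backward()
optim.step()
# `code` would then feed a small classifier for goal / no-goal prediction.
```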
Person detection and recognition have many applications in autonomous driving, smart homes, and smart offices. Knowledge about the presence of a person in the environment can be used in safety solutions such as collision avoidance, in energy-conservation solutions such as turning off lights and air-conditioning when no one is around, and in meeting and collaboration solutions such as locating a vacant room. In this paper, we present a solution based on head detection and recognition with deep learning that can reliably detect and recognize persons under different lighting conditions and poses. The system is shown to achieve good results on a challenging dataset.
Recent studies have shown that steganalytic approaches based on deep learning frameworks cannot surpass their rich-model-feature-based counterparts in performance. According to our analysis, one of the main causes of this unsatisfactory performance is that the training procedure tends to get stuck at local plateaus or even diverge when starting from a non-ideal initial state. In this paper we investigate how to fit a deep neural network to a rich-model feature set. We regard this as a pre-training procedure and study its effect on deep learning for steganalysis. The state-of-the-art JPEG steganalytic feature set DCTR is selected as the target, and its feature-extraction procedure is divided into multiple sub-modules. A deep learning framework with corresponding sub-networks is proposed. In the pre-training procedure we train the framework from the bottom up, fitting the output of each sub-network to the actual output of the corresponding sub-module of DCTR. The motivation behind this scenario is that we force the proposed framework to learn the nonlinear mapping implicit in DCTR, and we expect that, when trained from an initial state representing an approximate solution of DCTR, it can achieve better performance than DCTR itself.
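A minimal sketch of the described bottom-up pre-training, assuming the DCTR sub-module outputs are available as precomputed target tensors, might look like this in PyTorch. The sub-network architectures, target shapes, and training loop are placeholders, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

# Illustrative sub-networks; in practice each would mirror one stage of the
# DCTR feature extractor (filtering, quantization, histogram pooling, ...).
subnets = nn.ModuleList([
    nn.Sequential(nn.Conv2d(1, 16, 5, padding=2), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 16, 5, padding=2), nn.ReLU()),
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 64)),
])

def pretrain_bottom_up(images, dctr_targets, epochs=5):
    """Fit each sub-network's output to the output of the matching DCTR
    sub-module, training from the bottom sub-network upward.
    `dctr_targets[k]` holds precomputed reference outputs for stage k."""
    stage_input = images
    for k, subnet in enumerate(subnets):
        opt = torch.optim.Adam(subnet.parameters(), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(subnet(stage_input), dctr_targets[k])
            loss.backward()
            opt.step()
        # Freeze this stage's output as input to the next sub-network.
        with torch.no_grad():
            stage_input = subnet(stage_input)

# Toy usage with random stand-ins for images and DCTR stage outputs.
imgs = torch.randn(8, 1, 64, 64)
targets = [torch.randn(8, 16, 64, 64), torch.randn(8, 16, 64, 64), torch.randn(8, 64)]
pretrain_bottom_up(imgs, targets, epochs=1)
```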
Feature-based steganalysis has long been an integral tool for detecting the presence of steganography in communication channels. In this paper, we explore the possibility of utilizing powerful optimization algorithms available in convolutional neural network packages to optimize the design of rich features. To this end, we implemented a new layer that simulates the formation of histograms from truncated and quantized noise residuals computed by convolution. Our goal is to show the potential to compact and further optimize existing features, such as the projection spatial rich model (PSRM).
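One way such a histogram layer could be approximated is with soft (Gaussian) bin assignments over truncated, scaled convolution residuals, which keeps the operation differentiable. The sketch below is an assumption for illustration, not the paper's exact formulation; the single residual filter and bin settings are placeholders.

```python
import torch
import torch.nn as nn

class HistogramLayer(nn.Module):
    """Sketch of a layer that turns truncated, quantized convolution
    residuals into soft histograms.  Bin membership is approximated with
    Gaussian kernels so the layer remains differentiable."""

    def __init__(self, num_bins=11, bin_width=1.0, trunc=5.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-trunc, trunc, num_bins))
        self.bin_width = bin_width
        self.trunc = trunc
        # A single high-pass residual filter; PSRM uses many projections.
        self.residual = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        r = self.residual(x)                                          # noise residual via convolution
        r = torch.clamp(r / self.bin_width, -self.trunc, self.trunc)  # scale (quantize) and truncate
        d = r.unsqueeze(-1) - self.centers                            # distance to every bin center
        weights = torch.exp(-0.5 * d.pow(2))                          # soft bin membership
        return weights.mean(dim=(2, 3))                               # (batch, 1, num_bins) soft histogram

layer = HistogramLayer()
img = torch.randn(2, 1, 64, 64)
print(layer(img).shape)   # torch.Size([2, 1, 11])
```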
It is still difficult to adapt Convolutional Neural Network (CNN) based models for deployment on embedded devices. The heavy computation and large memory footprint of CNN models are the main burden in real applications. In this paper, we propose a "Sparse Shrink" algorithm to prune an existing CNN model. By analyzing the importance of each channel via sparse reconstruction, the algorithm is able to prune redundant feature maps accordingly. The resulting pruned model thus directly saves computational resources. We have evaluated our algorithm on CIFAR-100. As shown in our experiments, we can reduce the number of parameters by 56.77% and the number of multiplications by 73.84% with only a minor decrease in accuracy. These results demonstrate the effectiveness of our "Sparse Shrink" algorithm.
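The abstract does not detail the sparse-reconstruction criterion, but a simple sketch of scoring channel importance with an L1-penalized (Lasso) reconstruction, so that channels with near-zero coefficients become pruning candidates, could look like this. The pooled-activation setup, regression target, and threshold are illustrative assumptions rather than the paper's method.

```python
import numpy as np
from sklearn.linear_model import Lasso

def score_channels(layer_acts, next_layer_acts, alpha=0.01):
    """Sketch of sparse-reconstruction channel scoring (not the paper's
    exact formulation).  layer_acts: (samples, channels) pooled responses
    of the layer to prune; next_layer_acts: (samples,) pooled response of
    the following layer.  The L1 penalty drives the coefficients of
    redundant channels toward zero; those channels can then be pruned."""
    lasso = Lasso(alpha=alpha, max_iter=10000)
    lasso.fit(layer_acts, next_layer_acts)
    return np.abs(lasso.coef_)

rng = np.random.default_rng(0)
acts = rng.random((2000, 128))          # pooled conv feature-map responses
nxt = acts @ rng.random(128)            # stand-in for the next layer's response
scores = score_channels(acts, nxt)
keep = scores > 1e-6                    # prune channels with ~zero weight
print(f"pruning {np.sum(~keep)} of {len(scores)} channels")
```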