Improving a deep convolutional neural network architecture for character recognition

Bogdan-Ionuţ Cirstea; Laurence Likforman-Sulem

doi:10.2352/ISSN.2470-1173.2016.17.DRR-060

Deep architectures based on convolutional neural networks have obtained state-of-the-art results for several recognition tasks. These architectures rely on a cascade of convolutional layers and activation functions. Beyond the number of layers and the number of neurons in each layer, the choice of activation functions, training optimization algorithm and regularization procedure is of great importance. In this work we start from a deep convolutional architecture and describe the effect of recent activation functions, optimization algorithms and regularization procedures when applied to the recognition of handwritten digits from the MNIST dataset. The network achieves a 0.38% error rate, matching and slightly improving on the best known performance of a single model trained without data augmentation at the time the experiments were performed.
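To make the phrase "a cascade of convolutional layers and activation functions" concrete, the following is a minimal pure-Python sketch of two stacked convolution and activation stages applied to a 28x28 input, the size of an MNIST image. It is not the architecture evaluated in this work: the kernel values, the number of stages, the toy input and the use of ReLU as the activation are all assumptions chosen for illustration, and the helper names `conv2d_valid`, `relu` and `cascade` are hypothetical.

```python
# Illustrative sketch only. Kernel values, the number of stages and the
# ReLU activation are assumptions, not the configuration from the paper.

def conv2d_valid(image, kernel):
    """2-D cross-correlation without padding ("valid" mode) on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit: max(0, v)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def cascade(image, kernels):
    """Apply convolution followed by the activation once per kernel, in order."""
    x = image
    for kernel in kernels:
        x = relu(conv2d_valid(x, kernel))
    return x


# Toy 28x28 checkerboard input standing in for an MNIST digit.
image = [[float((i + j) % 2) for j in range(28)] for i in range(28)]

kernels = [
    [[1.0, 0.0, -1.0] for _ in range(3)],  # hypothetical 3x3 edge-like filter
    [[0.25, 0.25], [0.25, 0.25]],          # hypothetical 2x2 averaging filter
]

features = cascade(image, kernels)
# Each "valid" convolution shrinks the map: 28 -> 26 (3x3), then 26 -> 25 (2x2).
print(len(features), len(features[0]))  # prints "25 25"
```

A real network would learn many kernels per layer and add pooling, fully connected layers and a classifier on top; the sketch only shows how each stage feeds the next and how the activation is interleaved with the convolutions.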