Deep networks have revolutionized the image, speech, and pattern recognition communities. Despite recent evidence showing deep networks can rival the human brain for visual object recognition, the expansion of such architectures to general-purpose intelligent reasoning is intractable
due to the number of training parameters. Hierarchical representations have been introduced, but have either been applied only to small problems or been ad hoc in nature. This paper introduces a framework that automatically analyzes and configures a family of smaller deep networks as a replacement
for a single, larger network. By analyzing the linkage coefficients from confusion matrices and class boundaries from spectral clustering, class clusters and subclusters are automatically detected, enabling the framework to divide and conquer large classification problems. The resulting smaller
networks are not only highly scalable, parallelizable, and more practical to train, but also achieve higher classification accuracy. Numerous experiments on network classes, layers, and architecture configurations validate our results.
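
To make the idea of detecting class clusters from a confusion matrix concrete, the following is a minimal sketch, not the framework's actual procedure. It assumes a symmetric similarity built from row-normalized confusion rates and uses plain average-linkage agglomerative merging; the spectral clustering step is omitted entirely. The names `confusion_similarity` and `cluster_classes`, the `threshold` parameter, and the toy matrix are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def confusion_similarity(conf):
    """Symmetric class similarity from a confusion matrix (assumed definition).

    Each row is normalized to rates, and the similarity of classes i and j is
    the mean of the rate at which i is predicted as j and j as i.
    """
    n = len(conf)
    # "or 1" avoids division by zero for a class with no samples.
    rates = [[c / (sum(row) or 1) for c in row] for row in conf]
    return [[0.0 if i == j else 0.5 * (rates[i][j] + rates[j][i])
             for j in range(n)] for i in range(n)]


def cluster_classes(conf, threshold):
    """Group class indices by average-linkage agglomerative merging.

    Repeatedly merges the two groups with the highest average similarity and
    stops once the best pair falls below `threshold` (a hypothetical knob).
    """
    sim = confusion_similarity(conf)
    clusters = [[i] for i in range(len(conf))]

    def avg_link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of current groups with the strongest mutual confusion.
        (a, b), best = max(
            (((a, b), avg_link(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Toy 4-class confusion matrix (rows: true class, columns: predicted class).
# The counts are made up: classes 0/1 and classes 2/3 are mutually confused.
toy_confusion = [
    [80, 18, 1, 1],
    [20, 78, 1, 1],
    [1, 1, 85, 13],
    [2, 0, 15, 83],
]
groups = cluster_classes(toy_confusion, threshold=0.05)
print(groups)
```

Under these assumptions, each resulting group could be handed to its own smaller network, with a coarse classifier routing inputs between groups; how the framework itself selects the cut points is not specified in this abstract.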