In content-based image retrieval, the most challenging problem is the “semantic gap” between the low-level visual features captured by machines and the high-level semantic concepts perceived by humans. This paper focuses on learning high-level image features with convolutional
neural networks (CNNs) for image retrieval. As a deep learning framework, a CNN can extract meaningful image features at different layers and map image content to abstract semantic concepts. These high-level feature descriptors can be better image representations than hand-crafted
feature descriptors, and can thereby improve image retrieval performance. The experimental results show that the invariant feature hierarchies learned layer by layer in a CNN yield effective feature representations. Using a CNN for feature extraction, the approach achieved a mean average precision (MAP)
of 0.707 on CIFAR-10 and 0.244 on CIFAR-100.
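
As an illustration of the MAP metric reported above, the following is a minimal sketch of how mean average precision is commonly computed for retrieval. It rests on assumptions not stated in the abstract: database items are ranked by cosine similarity of their feature vectors, and an item counts as relevant when it shares the query's class label. The function names `average_precision` and `mean_average_precision` are hypothetical, and the evaluation protocol actually used in the paper may differ.

```python
import numpy as np


def average_precision(relevant):
    """Average precision for one ranked list.

    `relevant` is a boolean sequence in rank order (best match first).
    Returns 0.0 when the list contains no relevant item.
    """
    relevant = np.asarray(relevant, dtype=bool)
    if not relevant.any():
        return 0.0
    hits = np.cumsum(relevant)                 # relevant items seen so far
    ranks = np.arange(1, len(relevant) + 1)    # 1-based rank positions
    precision_at_k = hits / ranks
    # Average the precision values at the ranks where a relevant item occurs.
    return float(precision_at_k[relevant].mean())


def mean_average_precision(query_feats, query_labels, db_feats, db_labels):
    """MAP over all queries, ranking the database by cosine similarity.

    Assumption: relevance means "same class label as the query".
    Feature vectors are assumed to be non-zero.
    """
    query_feats = np.asarray(query_feats, dtype=float)
    db_feats = np.asarray(db_feats, dtype=float)
    query_labels = np.asarray(query_labels)
    db_labels = np.asarray(db_labels)

    # L2-normalise so that the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    d = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)

    sims = q @ d.T                                    # (n_queries, n_db)
    order = np.argsort(-sims, axis=1, kind="stable")  # most similar first

    aps = [
        average_precision(db_labels[order[i]] == query_labels[i])
        for i in range(len(q))
    ]
    return float(np.mean(aps))
```

In practice the feature arrays would hold activations taken from a chosen CNN layer, and the labels would be the CIFAR class indices; how the query and database sets are split is a further assumption left to the caller.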