Historical Chinese character recognition suffers from a shortage of labeled training samples. This paper proposes a semi-supervised learning method based on a convolutional neural network (CNN) for historical Chinese character recognition. First, a traditional feature extraction method is used to extract features from the unlabeled sample set. Sample pairs are then constructed according to the distances between the extracted features, and a Siamese network S is trained on these pairs. The network structure and weights of S are used to initialize a second CNN model, T, which is then fine-tuned on a small number of labeled historical Chinese character samples and used for the final evaluation. Experimental results show that the proposed method is effective.
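The pair-construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the two thresholds `near` and `far`, the rule of skipping pairs that fall between them, and the toy feature vectors are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pairs(features, near, far):
    """Build (i, j, label) training pairs from unlabeled feature vectors.

    Assumed rule (hypothetical, not taken from the paper):
      distance <= near  -> label 1 (treated as the same character)
      distance >= far   -> label 0 (treated as different characters)
      otherwise         -> skipped as ambiguous
    """
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        d = euclidean(features[i], features[j])
        if d <= near:
            pairs.append((i, j, 1))
        elif d >= far:
            pairs.append((i, j, 0))
    return pairs


# Toy 2-D features standing in for the traditional hand-crafted features.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
pairs = build_pairs(features, near=0.5, far=3.0)
```

Pairs produced this way would serve as the pseudo-labeled input for training the Siamese network S; the labeled samples are only needed later, when T is fine-tuned.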
Xiaoyi Yu, Wei Fan, Jun Sun, Satoshi Naoi, "Semi-supervised Learning Feature Representation for Historical Chinese Character Recognition," in Proc. IS&T Int’l. Symp. on Electronic Imaging: Visual Information Processing and Communication VIII, 2017, pp. 73-77, https://doi.org/10.2352/ISSN.2470-1173.2017.2.VIPC-411