A rule-based optical character recognition system for recognizing serial numbers on Renminbi (RMB) banknotes is presented. It is based on the observation that the characters, which comprise English letters and digits, can be classified using two hand-crafted features: the opening and the loop. Each character has a distinctive combination of these two features, and classification follows the proposed rule scheme. The system was tested on 2245 RMB bills containing 22313 characters and achieved recognition rates of 99.35% for horizontal characters and 99.84% for vertical characters, with a processing time under 30 ms per banknote.
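To make the loop feature concrete, the following is a minimal sketch, not the system described above. It assumes a loop means a background region fully enclosed by ink in a binarized character image, and it counts such regions with a flood fill. The function name `count_loops`, the `'#'`/`'.'` glyph encoding, and the toy glyphs are hypothetical; the opening feature and the actual classification rules are not specified in the abstract and are not modelled here.

```python
from collections import deque

def count_loops(glyph):
    """Count background regions fully enclosed by ink in a binary glyph.

    `glyph` is a list of equal-length strings; '#' is ink, any other
    character is background (an assumed encoding for illustration).
    """
    h, w = len(glyph), len(glyph[0])
    # Pad with a one-pixel background border so the outer background
    # forms a single connected region reachable from (0, 0).
    ink = (
        [[False] * (w + 2)]
        + [[False] + [ch == "#" for ch in row] + [False] for row in glyph]
        + [[False] * (w + 2)]
    )
    seen = [[False] * (w + 2) for _ in range(h + 2)]

    def flood(r0, c0):
        # 4-connected breadth-first fill over background pixels.
        queue = deque([(r0, c0)])
        seen[r0][c0] = True
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h + 2 and 0 <= nc < w + 2
                        and not ink[nr][nc] and not seen[nr][nc]):
                    seen[nr][nc] = True
                    queue.append((nr, nc))

    flood(0, 0)  # mark the outer background
    loops = 0
    for r in range(h + 2):
        for c in range(w + 2):
            # Any background pixel not yet reached lies inside a loop.
            if not ink[r][c] and not seen[r][c]:
                loops += 1
                flood(r, c)
    return loops

# Toy glyphs (hypothetical, far coarser than real banknote crops).
ZERO = ["###", "#.#", "#.#", "###"]
EIGHT = ["###", "#.#", "###", "#.#", "###"]
SEVEN = ["###", "..#", "..#"]

loop_counts = {name: count_loops(g)
               for name, g in (("0", ZERO), ("8", EIGHT), ("7", SEVEN))}
```

On these toy glyphs the loop count alone separates "0", "8" and "7"; a rule-based classifier of the kind described would combine it with a second feature to distinguish characters that share a loop count.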
Historical Chinese character recognition suffers from a labeling problem: not only are labeled training samples scarce, but the set of labeled classes is itself incomplete. The task is therefore one of "open set" recognition, in which the classes are incompletely labeled at training time and samples from unknown classes can be submitted to the system during testing. This paper proposes a method for open set historical Chinese character recognition. In open set recognition, the features available in the training data cannot effectively characterize the various unknown classes. We assume that features characterizing unknown classes can be derived or learned from other, similar data sets. We therefore combine an auxiliary data set with the open set training data to learn features that represent historical Chinese characters well. The auxiliary data set is translated using Generative Adversarial Networks (GANs) so that the translated data are as close as possible to the historical Chinese character data set. We then construct a neural network for feature extraction and train it with an alternating training method on the translated auxiliary data set and the incompletely labeled historical Chinese character data set. Finally, features are extracted from a certain layer of the trained network, and unknown samples are detected by statistical modelling of the Euclidean metric between samples. Experimental results show that the proposed method is effective.
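The final detection step can be illustrated with a minimal sketch. The abstract does not specify the statistical model, so this example assumes a simple stand-in: each known class is summarized by its centroid in feature space, and a sample is rejected as unknown when its Euclidean distance to the nearest centroid exceeds the class mean distance plus `k` standard deviations. The names `fit_open_set`, `predict`, the parameter `k`, and the two-dimensional toy features are all hypothetical; in the method described, the features would come from a layer of the trained network.

```python
import math
import statistics

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def fit_open_set(features_by_class, k=3.0):
    """For each known class, store its centroid and a distance threshold.

    The threshold mean + k * std of the training distances is an assumed
    rule for illustration, not the model used in the paper.
    """
    model = {}
    for label, vectors in features_by_class.items():
        center = centroid(vectors)
        dists = [math.dist(v, center) for v in vectors]
        threshold = statistics.mean(dists) + k * statistics.pstdev(dists)
        model[label] = (center, threshold)
    return model

def predict(model, x):
    """Return the nearest known label, or None if x looks like an unknown class."""
    label, (center, threshold) = min(
        model.items(), key=lambda item: math.dist(x, item[1][0])
    )
    return label if math.dist(x, center) <= threshold else None

# Toy two-dimensional features for two known classes (hypothetical data).
known = {
    "A": [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
    "B": [[10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]],
}
model = fit_open_set(known)

near_a = predict(model, [0.5, 0.5])      # close to the centroid of "A"
near_b = predict(model, [10.4, 10.6])    # close to the centroid of "B"
far_away = predict(model, [5.0, 5.0])    # far from both known classes
```

Here `far_away` is rejected because it lies well outside the distance range observed for either known class, which is the behaviour an open set recognizer needs for characters whose class was never labeled.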
Extremely low-quality images, on the order of 20 pixels in width, appear with frustrating frequency in many forensic investigations. Even advanced de-noising and super-resolution technologies are unable to extract useful information from such low-quality images. We show, however, that useful information is present in such highly degraded images. We also show that convolutional neural networks can be trained to decipher the contents of highly degraded images of license plates, and that these networks significantly outperform human observers.