In this paper, we propose a document image classification framework based on layout information. Our method does not use OCR; hence, it is completely language independent. Still, we are able to exploit text data by extracting text regions with a novel MSER-based approach. Our MSER formulation is more robust to text distortions than the existing one. We introduce two novel types of image descriptors, supplemented with Fisher vectors based on a Bernoulli mixture model. Classifiers built on these descriptors are assembled into a meta-classification system that is able to classify documents in complex cases where the accuracy of an individual classifier is poor. The processing time of our meta-classification system is low, comparable to that of a single classifier. We show that our method outperforms existing ones in terms of classification accuracy across a wide range of documents from both well-known and machine-generated datasets.
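As an illustration of the Fisher vector idea mentioned above, the following is a minimal sketch of how a Fisher vector can be computed for binary descriptors under a Bernoulli mixture model. It is not the authors' implementation: the function names `bmm_posteriors` and `bmm_fisher_vector`, the use of gradients with respect to the Bernoulli means only, and the omission of Fisher-information normalization are all assumptions made for brevity. The mixture parameters are assumed to be already fitted, with every mean strictly between 0 and 1.

```python
import math


def bmm_posteriors(x, weights, means):
    """Posterior probability of each mixture component for one binary descriptor x.

    weights: list of K mixing weights (positive, summing to 1).
    means:   K lists of D Bernoulli parameters, each strictly in (0, 1).
    """
    log_joint = []
    for w, mu in zip(weights, means):
        ll = math.log(w)
        for xd, m in zip(x, mu):
            ll += math.log(m) if xd else math.log(1.0 - m)
        log_joint.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint)
    unnorm = [math.exp(v - top) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bmm_fisher_vector(descriptors, weights, means):
    """Average gradient of the log-likelihood w.r.t. the Bernoulli means.

    Returns a flat list of length K * D (hypothetical layout: component-major).
    Fisher-information normalization is deliberately left out of this sketch.
    """
    K, D = len(weights), len(means[0])
    fv = [0.0] * (K * D)
    for x in descriptors:
        gamma = bmm_posteriors(x, weights, means)
        for k in range(K):
            for d in range(D):
                mu = means[k][d]
                # d/d(mu_kd) log p(x) = gamma_k * (x_d - mu_kd) / (mu_kd * (1 - mu_kd))
                fv[k * D + d] += gamma[k] * (x[d] - mu) / (mu * (1.0 - mu))
    n = len(descriptors)
    return [v / n for v in fv]


# Toy usage with made-up parameters: two components, two-bit descriptors.
weights = [0.5, 0.5]
means = [[0.9, 0.1], [0.1, 0.9]]
fv = bmm_fisher_vector([[1, 0], [1, 1]], weights, means)
```

The resulting vector has a fixed length regardless of how many local descriptors an image yields, which is what makes it usable as input to a standard classifier.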
Sergey Zavalishin, Andrey Bout, Ilya Kurilin, Michail Rychagov, "Document Image Classification on the Basis of Layout Information," in Proc. IS&T Int’l. Symp. on Electronic Imaging: Visual Information Processing and Communication VIII, 2017, pp. 78-86, https://doi.org/10.2352/ISSN.2470-1173.2017.2.VIPC-412