Volume: 19 | Article ID: art00098_2
A Neural Network Based Color Document Segmentation
DOI: 10.2352/ISSN.2169-4451.2003.19.1.art00098_2 | Published Online: January 2003
Abstract

Document segmentation is the task of distinguishing the different parts of a document image according to their content. In this paper, the document image is segmented into text, pictures, and background. The proposed algorithm consists of four stages: background removal, block segmentation, feature extraction, and recognition. In background removal, local thresholds are used to extract the foreground of the image. In block segmentation, the run-length smoothing algorithm and connected component analysis are applied to divide the document image into a set of regions. Image features and geometric features are then extracted from each region. Finally, these features are fed into the classifier, a three-layer back-propagation neural network, whose output is the recognition result for the region: text or picture. Experiments show that the proposed method segments most document images with simple backgrounds well. The proposed document segmentation system has several advantages: (1) it uses localized thresholds, based on color concepts, to distinguish foreground from background; (2) it discriminates text from pictures by extracting good features; (3) it uses a trainable neural network as the classifier, whose structure can be adjusted flexibly; and (4) it segments precisely, because the classifier is trained on a large number of document images.
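The block segmentation stage relies on the run-length smoothing algorithm, which can be illustrated with a short sketch. The code below is not the paper's implementation: it shows only a horizontal pass on a binary image (1 = foreground, 0 = background), and the function names `rlsa_row` and `rlsa_horizontal`, the threshold value, and the choice to fill only gaps bounded by foreground on both sides are assumptions made for illustration.

```python
def rlsa_row(row, threshold):
    """Smooth one binary row (illustrative helper, not from the paper).

    Background runs (0s) of length <= threshold that lie between two
    foreground pixels (1s) are filled with 1s. Runs touching the row
    edges are left unchanged (an assumption of this sketch).
    """
    out = list(row)
    last_fg = None  # index of the most recent foreground pixel seen
    for i, px in enumerate(row):
        if px:
            if last_fg is not None:
                gap = i - last_fg - 1
                if 0 < gap <= threshold:
                    # Short gap: merge the two foreground runs.
                    for j in range(last_fg + 1, i):
                        out[j] = 1
            last_fg = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the horizontal smoothing pass to every row of a binary image."""
    return [rlsa_row(row, threshold) for row in image]


if __name__ == "__main__":
    page = [
        [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
        [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    ]
    for smoothed in rlsa_horizontal(page, threshold=2):
        print(smoothed)
```

With a threshold of 2, nearby foreground pixels in a row merge into one run while wider gaps stay open, so characters on a text line tend to fuse into blocks that connected component analysis can then label as regions.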

  Cite this article 

Hsiao-Yu Han, "A Neural Network Based Color Document Segmentation," in Proc. IS&T Int'l Conf. on Digital Printing Technologies (NIP19), 2003, pp. 859-864, https://doi.org/10.2352/ISSN.2169-4451.2003.19.1.art00098_2

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2003
NIP & Digital Fabrication Conference
ISSN 2169-4451
Society for Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151, USA