Volume: 16 | Article ID: art00070_2
Content-based Document Enhancement and Resizing
DOI: 10.2352/ISSN.2169-4451.2000.16.1.art00070_2 | Published Online: January 2000

Recent advances in information and communications technologies have increased the need for automated reading and processing of documents. Most of today's documents contain not only text and background, but also graphics, tables, and images. Common image enhancement and interpolation methods apply a single enhancement or interpolation function indiscriminately to the whole image, so the resulting document image suffers from objectionable moiré patterns, edge blurring, and aliasing. Scanned documents must therefore often be segmented before other document processing techniques, such as filtering, resizing, and compression, can be applied. In this paper, we present a new system that segments and labels document images into text, halftone images, and background using feature extraction and unsupervised clustering. Once the segmentation is performed, a specific enhancement or interpolation kernel can be applied to each document component. Each pixel is assigned a feature pattern consisting of a scaled family of differential geometrical invariant features and texture features extracted from the co-occurrence matrix. The invariant feature pattern is then assigned to a specific region using a two-stage neural network system. The first stage is a self-organizing principal components analysis (SOPCA) network that projects the feature vector onto its leading principal axes, that is, the axes that principal components analysis (PCA) would find. The second stage is a self-organizing feature-map (SOFM) network that clusters the output of the SOPCA network into different regions. We demonstrate the power of the SOPCA-SOFM approach for segmenting document images into text, halftone, and background. The proposed content-based filtering and interpolation method results in a noticeable improvement in the enhanced image.
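
The two-stage network described above can be illustrated with a minimal sketch. The abstract does not give the learning rules, so this example makes several assumptions: the SOPCA stage is stood in for by Sanger's generalized Hebbian algorithm (one common self-organizing PCA rule), the SOFM stage is a three-unit one-dimensional Kohonen map, and the per-pixel feature patterns are replaced by synthetic 5-dimensional vectors drawn around three hypothetical cluster centers. The function names (train_sopca, train_sofm, assign_units), learning rates, and iteration counts are illustrative choices, not values from the paper.

```python
import numpy as np

def train_sopca(x, n_components, lr=0.005, epochs=30, seed=0):
    """Sanger's rule (an assumed stand-in for SOPCA): rows of w
    approach the leading principal axes of the centered data x."""
    rng = np.random.default_rng(seed)
    w = 0.1 * rng.standard_normal((n_components, x.shape[1]))
    for _ in range(epochs):
        for i in rng.permutation(len(x)):
            y = w @ x[i]
            # Hebbian term minus a lower-triangular decorrelation term
            w += lr * (np.outer(y, x[i]) - np.tril(np.outer(y, y)) @ w)
    return w

def train_sofm(y, n_units=3, n_iter=3000, seed=0):
    """1-D Kohonen map with exponentially decaying rate and neighborhood."""
    rng = np.random.default_rng(seed)
    grid = np.arange(n_units)
    # Linear initialization: spread units along the first projected axis
    w = np.zeros((n_units, y.shape[1]))
    w[:, 0] = np.linspace(-1.0, 1.0, n_units) * y[:, 0].std()
    w += y.mean(axis=0)
    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.3 * (0.01 / 0.3) ** frac      # learning rate: 0.3 -> 0.01
        sigma = 1.0 * (0.2 / 1.0) ** frac    # neighborhood width: 1.0 -> 0.2
        sample = y[rng.integers(len(y))]
        bmu = np.argmin(np.linalg.norm(w - sample, axis=1))  # best-matching unit
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        w += lr * h[:, None] * (sample - w)
    return w

def assign_units(y, w):
    """Label each projected vector with the index of its nearest map unit."""
    dists = np.linalg.norm(y[:, None, :] - w[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Synthetic stand-in for per-pixel feature patterns (hypothetical data):
# three well-separated clusters in a 5-D feature space.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0, 0],
                    [4, 0, 0, 0, 0],
                    [0, 4, 0, 0, 0]], dtype=float)
truth = np.repeat(np.arange(3), 100)
features = centers[truth] + 0.3 * rng.standard_normal((300, 5))
features -= features.mean(axis=0)            # PCA rules assume zero-mean input

w_pca = train_sopca(features, n_components=2)   # stage 1: project
projected = features @ w_pca.T
w_som = train_sofm(projected)                   # stage 2: cluster
labels = assign_units(projected, w_som)
```

In a real pipeline the rows of features would be the differential-invariant and co-occurrence texture measurements for each pixel, and each map unit would be associated with a region class (text, halftone, or background) so that a class-specific kernel can be applied afterwards.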

  Cite this article 

Mohamed N. Ahmed, Brian E. Cooper, Shaun T. Love, "Content-based Document Enhancement and Resizing," in Proc. IS&T Int'l Conf. on Digital Printing Technologies (NIP16), 2000, pp. 695-702.

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2000
NIP & Digital Fabrication Conference
Society for Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151, USA