Volume: 9 | Article ID: art00055
A heuristic measure for detecting influence of lossy JP2 compression on Optical Character Recognition in the absence of ground truth
DOI: 10.2352/issn.2168-3204.2012.9.1.art00055 | Published Online: January 2012
Abstract

Cultural heritage institutions such as libraries, museums, and archives have carried out large-scale digitisation projects over the last decade. The question of how to store digital master images cost-effectively has made the JPEG 2000 standard (ISO/IEC 15444-1), and in particular the JP2 image file format (JPEG 2000 Part 1), popular in this community. Lossy JP2 encoding of page image masters offers a good balance between file size reduction and preservation of the visible properties of a master image. Lossy encoding means that the original file cannot be restored at the bit level, even when the differences are indistinguishable to the human eye. However, the absence of visible changes does not imply that computational processing of the images is unaffected. In this context we present a heuristic measure that helps to detect undesired influence of lossy JP2 compression on the OCR result in the absence of ground truth.

  Cite this article 

Sven Schlarb, Clemens Neudecker, "A heuristic measure for detecting influence of lossy JP2 compression on Optical Character Recognition in the absence of ground truth," in Proc. IS&T Archiving 2012, 2012, pp. 250-254, https://doi.org/10.2352/issn.2168-3204.2012.9.1.art00055

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2012
Archiving Conference
2161-8798
Society of Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151, USA