This paper describes a computationally efficient storage and retrieval method for (R,G,B) color images of printed documents. The proposed method is based on principal component analysis of the image color distribution. A new similarity measure for image retrieval is introduced, derived from the Tanimoto measure for recognizing similar patterns. This similarity measure is computationally inexpensive, since the vector inner product is the only operation needed to compute it. Several feature sets are tested in computer simulations of the algorithm to demonstrate the efficacy of the image retrieval. It is determined experimentally that the proposed method is not affected by substantial changes in the databases. This is because the features used for document retrieval are not predefined sets; rather, they are extracted directly from the document images submitted for recording or searching. This makes the algorithm robust and attractive for many applications of image storage and retrieval systems.
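As a rough illustration of the ideas summarized above, the following minimal sketch computes a Tanimoto measure from inner products alone and ranks stored documents against a query. It is not the authors' implementation. The feature vector (mean color followed by the first principal axis of the color distribution), the power-iteration shortcut for that axis, the assumption that pixel components are scaled to [0, 1], and the names `color_features` and `retrieve` are all hypothetical choices made for this example; the abstract does not specify the feature sets used in the paper.

```python
import math


def dot(u, v):
    """Plain vector inner product."""
    return sum(a * b for a, b in zip(u, v))


def tanimoto(u, v):
    """Tanimoto measure u.v / (u.u + v.v - u.v), built only from inner products."""
    uv = dot(u, v)
    denom = dot(u, u) + dot(v, v) - uv
    # Two all-zero vectors are treated as identical (an assumption of this sketch).
    return 1.0 if denom == 0 else uv / denom


def color_features(pixels, iterations=100):
    """Hypothetical feature vector: mean (R,G,B) plus the first principal axis.

    `pixels` is a list of (r, g, b) tuples with components assumed in [0, 1].
    """
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    centered = [[p[c] - mean[c] for c in range(3)] for p in pixels]
    # 3x3 covariance matrix of the color distribution.
    cov = [[sum(x[i] * x[j] for x in centered) / n for j in range(3)]
           for i in range(3)]
    # Power iteration approximates the dominant eigenvector of the covariance.
    axis = [1.0 / math.sqrt(3.0)] * 3
    for _ in range(iterations):
        nxt = [dot(row, axis) for row in cov]
        norm = math.sqrt(dot(nxt, nxt))
        if norm == 0:  # degenerate case, e.g. a single-color image
            break
        axis = [x / norm for x in nxt]
    # Fix the sign so the same distribution always yields the same axis.
    if sum(axis) < 0:
        axis = [-x for x in axis]
    return mean + axis


def retrieve(query_pixels, database):
    """Rank stored feature vectors by Tanimoto similarity to the query image.

    `database` maps a document name to a precomputed feature vector.
    """
    q = color_features(query_pixels)
    scored = [(tanimoto(q, feats), name) for name, feats in database.items()]
    return sorted(scored, reverse=True)
```

In this sketch the features are computed from whatever image is submitted, both when a document is stored and when it is used as a query, which mirrors the abstract's point that no predefined feature set is needed. A production system would more likely use a full eigendecomposition than power iteration.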
Mehmet Celenk, Yuan Shao, "Printed Color Document Storage and Retrieval for Image Databases," in Proc. IS&T Int'l Conf. on Digital Printing Technologies (NIP14), 1998, pp. 298-301, https://doi.org/10.2352/ISSN.2169-4451.1998.14.1.art00073_1