To this day, most important documents are still issued on paper. Their security rests on making the cost of producing a counterfeit unattractive relative to the expected profit, which typically requires expensive printing equipment and substrates. This work introduces an approach that verifies paper documents using any internet-enabled device with a camera and a web browser, such as a smartphone or tablet. After the document is recognized and rectified, optical character recognition (OCR) is used to make the text machine-readable. Digital signatures are then used to verify the authenticity and integrity of the data. Beyond that, the requirements of privacy, robustness and usability are satisfied. By using JAB Code, a high-capacity matrix code, the data to be verified can be stored directly on the document, without the need for a database. This brings key security and privacy advantages over database-bound systems. The use of OCR achieves high usability.
Line segmentation is a significant stage in OCR systems: it directly affects the character segmentation stage, which in turn affects the recognition rate. This paper proposes a robust line segmentation algorithm for printed Arabic text, with and without diacritics, based on finding the global maximum peak and detecting the baseline. The algorithm was tested on different font sizes and types; tests on 5 font types with a total of 43,055 lines yielded 99.9% accuracy for text without diacritics and 99.5% accuracy for text with diacritics.
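The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of "global maximum peak" and "baseline detection", not the authors' method. It assumes the peak refers to a horizontal projection profile (ink pixels per row) of a binarized page, that the baseline of a line is the row with the highest ink count, and that bands of rows whose peak is far below the global maximum are diacritics to be attached to the nearest main line. The names `TextLine`, `segment_lines` and the `peak_ratio` threshold are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    top: int       # first row of the line (inclusive)
    bottom: int    # row just past the end of the line (exclusive)
    baseline: int  # row with the highest ink count in the main band


def horizontal_projection(image):
    """Count ink pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in image]


def find_bands(profile):
    """Return (start, end) row ranges, end exclusive, that contain ink."""
    bands, start = [], None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y
        elif value == 0 and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


def _gap(band, line):
    """Number of blank rows between a band and a line (0 if they touch)."""
    start, end = band
    if end <= line.top:
        return line.top - end
    if start >= line.bottom:
        return start - line.bottom
    return 0


def segment_lines(image, peak_ratio=0.5):
    """Split a binary page image into text lines.

    Assumption: a band is a main text line when its own peak reaches
    peak_ratio (0 < peak_ratio <= 1) of the global maximum peak; weaker
    bands are treated as diacritics and merged into the nearest main line.
    """
    profile = horizontal_projection(image)
    global_peak = max(profile, default=0)
    if global_peak == 0:
        return []  # blank page, nothing to segment

    lines, weak_bands = [], []
    for start, end in find_bands(profile):
        band_profile = profile[start:end]
        peak = max(band_profile)
        if peak >= peak_ratio * global_peak:
            baseline = start + band_profile.index(peak)
            lines.append(TextLine(start, end, baseline))
        else:
            weak_bands.append((start, end))

    # Attach each diacritic band to the closest main line.
    for band in weak_bands:
        nearest = min(lines, key=lambda line: _gap(band, line))
        nearest.top = min(nearest.top, band[0])
        nearest.bottom = max(nearest.bottom, band[1])
    return lines
```

For text without diacritics, no weak bands arise and each ink band becomes a line directly. A fixed `peak_ratio` is the simplest possible rule; a real system would likely need to tune it per font size and handle lines whose diacritics touch the neighbouring line.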