Creating and Preserving Dynamic Media
Volume: 12 | Article ID: art00010
PIVAJ: an article-centered platform for digitized newspapers
DOI: 10.2352/issn.2168-3204.2015.12.1.art00010 | Published Online: May 2015
Abstract

PIVAJ is a platform for archived digitized newspapers that emphasizes articles: it extracts them from digitized documents by automated page layout analysis, runs OCR on them, and indexes their text transcriptions so that users can search for content. Crowdsourcing is used to improve the quality of the indexing, both by correcting the transcriptions and by tagging articles with keywords. The platform has been used to give Web access to 550,000 articles generated from a digitized local newspaper. Current developments include further improvements to its OCR as well as graphical interfaces for managing the platform.
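
The indexing and crowdsourcing steps described in the abstract can be illustrated with a toy sketch. This is not PIVAJ's implementation, which the abstract does not detail: the `Article` schema, the `ArticleIndex` class, and all method names are hypothetical, and a plain in-memory inverted index stands in for whatever search engine the platform actually uses. The sketch only shows how a corrected transcription or a new keyword tag could be fed back into the index so that later searches benefit from it.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Article:
    """One article segmented from a newspaper page (hypothetical schema)."""
    article_id: str
    transcription: str                      # OCR text, possibly corrected later
    tags: set = field(default_factory=set)  # crowdsourced keywords


class ArticleIndex:
    """Toy inverted index mapping each term to the ids of matching articles."""

    def __init__(self):
        self.articles = {}
        self.postings = defaultdict(set)

    @staticmethod
    def _terms(text):
        # Lowercased word tokens; a real system would use a proper analyzer.
        return set(re.findall(r"\w+", text.lower()))

    def _terms_for(self, article):
        terms = self._terms(article.transcription)
        for tag in article.tags:
            terms |= self._terms(tag)
        return terms

    def add(self, article):
        self.articles[article.article_id] = article
        for term in self._terms_for(article):
            self.postings[term].add(article.article_id)

    def _remove(self, article_id):
        article = self.articles.pop(article_id)
        for term in self._terms_for(article):
            self.postings[term].discard(article_id)
        return article

    def correct(self, article_id, new_transcription):
        """Apply a crowdsourced transcription correction and re-index."""
        article = self._remove(article_id)
        article.transcription = new_transcription
        self.add(article)

    def tag(self, article_id, keyword):
        """Attach a crowdsourced keyword and re-index."""
        article = self._remove(article_id)
        article.tags.add(keyword)
        self.add(article)

    def search(self, query):
        """Return ids of articles containing every term of the query."""
        terms = self._terms(query)
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))


index = ArticleIndex()
# "rnayor" imitates a typical OCR confusion of "m" with "rn".
index.add(Article("a1", "The rnayor opened the new bridge"))
print(index.search("mayor"))          # set()
index.correct("a1", "The mayor opened the new bridge")
print(index.search("mayor bridge"))   # {'a1'}
index.tag("a1", "inauguration")
print(index.search("inauguration"))   # {'a1'}
```

Removing and re-adding the article on every correction keeps the sketch short; an index at the scale of 550,000 articles would more plausibly update postings incrementally or delegate to a dedicated search engine.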

  Cite this article 

Pierrick Tranouez, Stéphane Nicolas, Julien Lerouge, Thierry Paquet, "PIVAJ: an article-centered platform for digitized newspapers," in Proc. IS&T Archiving 2015, 2015, pp. 40-43, https://doi.org/10.2352/issn.2168-3204.2015.12.1.art00010

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2015
Archiving Conference | ISSN: 2161-8798 | Society for Imaging Science and Technology