Volume: 4 | Article ID: art00037
Archiving a Historic Medico-legal Collection: Automation and Workflow Customization
  DOI: 10.2352/issn.2168-3204.2007.4.1.art00037 | Published Online: January 2007

The U.S. National Library of Medicine (NLM) has acquired a historical collection of documents, released by the Food and Drug Administration, specifying the Notices of Judgment (NJs) against manufacturers of adulterated or misbranded food, drugs and cosmetics. These documents, consisting of 70,000+ pages containing more than 65,000 NJs, are to be preserved and made accessible over the long term due to their legal and historical value.

We developed a preservation system, named SPER (System for Preservation of Electronic Resources), based on DSpace infrastructure, for archiving and disseminating the NJs contained in these documents. For efficiency and cost-effectiveness, we developed algorithms that automatically identify the NJs and extract metadata from their contents; an archivist then reviews and edits the metadata and ingests the NJs into the archive. Contents of the documents are also captured as text streams to provide full-text search capability for the NJs.

These functionalities required a number of changes to the open source DSpace software, including changing the ingest interface and workflow, handling a metadata schema that does not map to Dublin Core, and enhancing the database schema.

This paper describes the overall SPER system, the customized workflow for automated metadata extraction, the automated metadata extraction process, and an estimate of labor savings through automation.

  Cite this article 

Dharitri Misra, Song Mao, John Rees, George R. Thoma, "Archiving a Historic Medico-legal Collection: Automation and Workflow Customization," in Proc. IS&T Archiving 2007, 2007, pp. 157-161.

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2007
Archiving Conference
Society for Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151, USA