Volume: 11 | Article ID: art00030
Building Scalable Web Archives
DOI: 10.2352/issn.2168-3204.2014.11.1.art00030 | Published Online: June 2014

This paper introduces the Internet Memory Foundation (IMF) platform: its distributed infrastructure and the associated tools and workflows that facilitate data management and preservation actions at large scale. Over the past years, IMF's main concern has been scalability in crawling, indexing, preserving, and accessing content. To address these issues, the Foundation developed its own crawler and built a new infrastructure. This paper presents that infrastructure and crawler, shares the challenges met while building them, and describes the approach taken to solve the preservation issues inherent to scalable archives. It also highlights new horizons arising for web archives in relation to analytics use cases.


Leïla Medjkoune, Stanislav Barton, Florent Carpentier, Julien Masanès, Radu Pop, "Building Scalable Web Archives," in Proc. IS&T Archiving 2014, 2014, pp. 138-143.

Copyright © Society for Imaging Science and Technology 2014
Archiving Conference
Society for Imaging Science and Technology