Volume: 11 | Article ID: art00030
Building Scalable Web Archives
DOI: 10.2352/issn.2168-3204.2014.11.1.art00030 | Published Online: June 2014
Abstract

This paper introduces the Internet Memory Foundation (IMF) platform, which is based on a distributed infrastructure, together with the associated tools and workflows that facilitate data management and preservation actions at large scale. IMF's main concern over the past years has been scalability in crawling, indexing, preserving and accessing content. To address these issues, the Foundation developed its own crawler and built a new infrastructure. The paper presents this infrastructure and crawler, shares the challenges met while building them, and describes the approach taken to solve preservation issues inherent to scalable archives. It also highlights new horizons arising for web archives in relation to analytics use cases.

  Cite this article 

Leïla Medjkoune, Stanislav Barton, Florent Carpentier, Julien Masanès, Radu Pop, "Building Scalable Web Archives," in Proc. IS&T Archiving 2014, 2014, pp. 138-143, https://doi.org/10.2352/issn.2168-3204.2014.11.1.art00030

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2014
Archiving Conference
2161-8798
Society for Imaging Science and Technology