Publications in Scientific Journals:
R. King, R. Schmidt, C. Becker, S. Schlarb:
"SCAPE: Big Data meets Digital Preservation";
Special theme: Big Data
The digital collections of scientific and memory institutions, many of which are already in the petabyte range, are growing larger every day. Because the volume of archived digital content worldwide is increasing geometrically, the associated preservation activities must become more scalable; the economics of long-term storage and access demand that they also become more automated. The present state of the art fails to address the need for scalable, automated solutions for tasks such as the characterization or migration of very large volumes of digital content. Standard tools break down when faced with very large or complex digital objects; standard workflows break down when faced with a very large number of objects or heterogeneous collections. In short, digital preservation is becoming an application area of big data, and big data is itself revealing a number of significant preservation challenges.
Keywords: big data, digital preservation, MapReduce, scalability, preservation planning
Electronic version of the publication:
Created from the Publication Database of the Vienna University of Technology.