Title: Software Preservation After the Internet
Author(s): Dragan Espenschied and Klaus Rechert
Abstract: Software preservation must consider knowledge management as a key challenge. We suggest a conceptualization of software preservation approaches that are available at different stages of the software lifecycle and can help memory institutions understand the current state of software items in their collection, the capabilities of their infrastructure, and the completeness and applicability of the knowledge required to successfully steward the collection.
Type: Long Paper
————————————-
Title: Content-Based Characterization of the End of Term Web Archive
Author(s): Mark Phillips, Kristy Phillips and Sawood Alam
Abstract: Since 2008, the End of Term Web Archive has been gathering snapshots of the federal web, consisting of the publicly accessible .gov and .mil websites. In 2022, the End of Term team began to package these crawls into a public dataset, which they released as part of the Amazon Open Data Partnership program. In total, over 460 TB of WARC data was moved from local repositories at the Internet Archive and the University of North Texas Libraries. From the original WARC content, derivative datasets were created that address common use cases for web archives. These derivatives include WAT, WET, CDX, and a format called a WARC Metadata Sidecar. The WARC Metadata Sidecar includes content-based characterizations of files held in the archive, including character set, language, file format identifier, and soft 404 detection. This paper describes the decisions made in the creation of these derivatives and the technologies used, and introduces the WARC Metadata Sidecar, which presents a useful approach for creating and storing auxiliary metadata for web archives.
Type: Long Paper
————————————-
Title: Revision-safe archiving and license-controlled access using distributed ledger technology
Author(s): Sven Schlarb, Roman Karl, Victor-Jan Vos, Carlijn Keijzer and Begoña Sanchez Royo
Abstract: This paper describes an approach that uses Distributed Ledger Technology or, more specifically, blockchain to build a trustworthy digital repository with transparent and traceable process logs for events related to preservation or to requesting or granting access to digital information objects. The approach focuses on a notary use case where the information stored in the blockchain serves as a proof of evidence regarding the existence, integrity, and authenticity of information assets. The principle is demonstrated on a prototype implementation using an Ethereum blockchain, and it can equally be applied using public blockchain services. The aim is to increase trust in archival processes in an easy-to-implement and cost-efficient way.
Type: Short Paper
Event Timeslots (1)
Friday, September 22
TP-1