"An Evaluation of Caching Policies for Memento TimeMaps"

Justin F. Brunelle and Michael L. Nelson have self-archived "An Evaluation of Caching Policies for Memento TimeMaps" in arXiv.org.

Here's an excerpt:

As defined by the Memento Framework, TimeMaps are machine-readable lists of time-specific copies—called "mementos"—of an archived original resource. In theory, as an archive acquires additional mementos over time, a TimeMap should be monotonically increasing. However, there are reasons why the number of mementos in a TimeMap would decrease, for example: archival redaction of some or all of the mementos, archival restructuring, and transient errors on the part of one or more archives. We study TimeMaps for 4,000 original resources over a three month period, note their change patterns, and develop a caching algorithm for TimeMaps suitable for a reverse proxy in front of a Memento aggregator. We show that TimeMap cardinality is constant or monotonically increasing for 80.2% of all TimeMap downloads observed in the observation period. The goal of the caching algorithm is to exploit the ideally monotonically increasing nature of TimeMaps and not cache responses with fewer mementos than the already cached TimeMap. This new caching algorithm uses conditional cache replacement and a Time To Live (TTL) value to ensure the user has access to the most complete TimeMap available. Based on our empirical data, a TTL of 15 days will minimize the number of mementos missed by users, and minimize the load on archives contributing to TimeMaps.
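The conditional replacement rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and method names (`TimeMapCache`, `offer`, `get`) are hypothetical, a TimeMap is reduced to a plain list of memento URIs, and the behavior after the TTL expires (any new response is accepted, even a smaller one) is an assumption that the abstract does not spell out. Only the 15-day TTL and the rule of not caching a response with fewer mementos than the cached TimeMap come from the excerpt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# 15-day TTL, the value the abstract reports as the best trade-off.
TTL_SECONDS = 15 * 24 * 60 * 60


@dataclass
class CachedTimeMap:
    mementos: List[str]   # memento URIs listed in the TimeMap
    fetched_at: float     # time (in seconds) the entry was stored


class TimeMapCache:
    """Hypothetical cache keyed by the URI of the original resource."""

    def __init__(self, ttl_seconds: float = TTL_SECONDS) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: Dict[str, CachedTimeMap] = {}

    def _is_fresh(self, entry: CachedTimeMap, now: float) -> bool:
        return now - entry.fetched_at < self.ttl_seconds

    def get(self, uri: str, now: float) -> Optional[List[str]]:
        """Return the cached memento list, or None if absent or expired."""
        entry = self._entries.get(uri)
        if entry is None or not self._is_fresh(entry, now):
            return None
        return entry.mementos

    def offer(self, uri: str, mementos: List[str], now: float) -> bool:
        """Store a freshly downloaded TimeMap unless it is smaller than a
        still-fresh cached copy. Returns True if the cache was updated."""
        entry = self._entries.get(uri)
        if (
            entry is not None
            and self._is_fresh(entry, now)
            and len(mementos) < len(entry.mementos)
        ):
            # Conditional replacement: keep the more complete TimeMap.
            return False
        # Assumption: an expired entry is replaced unconditionally.
        self._entries[uri] = CachedTimeMap(list(mementos), now)
        return True
```

In this sketch, a smaller response that arrives while the cached copy is fresh is treated as a possible transient archive error and ignored. The expiry step is what would let a genuine decrease, such as a redaction, eventually reach users.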



Author: Charles W. Bailey, Jr.
