Princeton COS 592 - The LOCKSS Peer-to-Peer Digital

The LOCKSS Peer-to-Peer Digital Preservation System

PETROS MANIATIS, Intel Research
MEMA ROUSSOPOULOS, Harvard University
TJ GIULI, Stanford University
DAVID S. H. ROSENTHAL, Stanford University Libraries
and
MARY BAKER, HP Labs

The LOCKSS project has developed and deployed in a world-wide test a peer-to-peer system for preserving access to journals and other archival information published on the Web. It consists of a large number of independent, low-cost, persistent Web caches that cooperate to detect and repair damage to their content by voting in “opinion polls.” Based on this experience, we present a design for and simulations of a novel protocol for voting in systems of this kind. It incorporates rate limitation and intrusion detection to ensure that even some very powerful adversaries attacking over many years have only a small probability of causing irrecoverable damage before being detected.

Categories and Subject Descriptors: H.3.7 [Information Storage and Retrieval]: Digital Libraries; D.4.5 [Operating Systems]: Reliability

General Terms: Design, Economics, Reliability

Additional Key Words and Phrases: Rate limiting, replicated storage, digital preservation

This work is supported by the National Science Foundation (Grant No. 0205667), by the Andrew W. Mellon Foundation, by Sun Microsystems Laboratories, by the Stanford Networking Research Center, by DARPA (contract No. N66001-00-C-8015), by MURI (award No. F49620-00-1-0330), and by Sonera Corporation. Any opinions, findings, and conclusions or recommendations expressed here are those of the authors and do not necessarily reflect the views of these funding agencies.

This article is the extended version of an earlier conference article [Maniatis et al. 2003].

Authors’ addresses: P. Maniatis, Intel Research, 2150 Shattuck Avenue Ste. 1300, Berkeley, CA 94704; email: [email protected]; M. Roussopoulos, Harvard University, 33 Oxford Street, Cambridge, MA 02138; email: [email protected]; TJ Giuli, Computer Science Department, Stanford University, Gates Building 4A-416, 353 Serra Mall, Stanford, CA 94305-9040; email: [email protected]; D. S. H. Rosenthal, LOCKSS, 1454 Page Mill Rd., Palo Alto, CA 94304; email: [email protected]; M. Baker, HP Labs, 1501 Page Mill Road, Mail Stop 1183, Palo Alto, CA 94304; email: [email protected].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or [email protected].

© 2005 ACM 0734-2071/05/0200-0002 $5.00

ACM Transactions on Computer Systems, Vol. 23, No. 1, February 2005, Pages 2–50.

1. INTRODUCTION

Academic publishing is migrating to the Web [Mogge 1999; Tenopir 2004], forcing the libraries that pay for journals to transition from purchasing copies of the material to renting access to the publisher’s copy [Keller et al. 2003]. Unfortunately, rental provides no guarantee of long-term access. Librarians consider it one of their responsibilities to provide future readers with access to important materials. After millennia of experience with physical documents, they have techniques for doing so: acquire lots of copies of the document, distribute them around the world, and lend or copy them when necessary to provide access.

In the LOCKSS¹ program (Lots Of Copies Keep Stuff Safe), we model the physical document system and apply it to Web-published academic journals, providing tools for libraries to take custody of the material to which they subscribe, and to cooperate with other libraries to preserve it and provide access. The LOCKSS approach deploys a large number of independent, low-cost, persistent Web caches that cooperate to detect and repair damage by voting in “opinion polls” on their cached documents. The initial version of the system [Rosenthal and Reich 2000] has been under test since 1999 at about 50 libraries worldwide, and entered production use at many more libraries in 2004. Unfortunately, the protocol now in use does not scale adequately, and analysis of the first design for a revised protocol [Michalakis et al. 2003] showed it to be insufficiently resistant to attack.

¹ LOCKSS is a trademark of Stanford University.

In this work, we present a design for and simulations of a new peer-to-peer opinion poll protocol that addresses these scaling and attack resistance issues. We plan to migrate it to the deployed system shortly. The new protocol is based on our experience with the deployed LOCKSS system and the special characteristics of such a long-term large-scale application. Distributed digital preservation, with its time horizon of many decades and lack of central control, presents both unusual requirements, such as the need to avoid long-term secrets like encryption keys, and unusual opportunities, such as the option to make some system operations inherently very time-consuming without sacrificing usability.

Digital preservation systems must resist both random failures and deliberate attack for a long time. Their ultimate success can be judged only in the distant future. Techniques for evaluating their design must necessarily be approximate and probabilistic; they share this problem with encryption systems. We attempt to evaluate our design in the same way that encryption systems are evaluated, by estimating the computational effort an adversary would need to achieve a given probability of the desired result. In an encryption system, one such desired result is to recover the plaintext. In our case, it is to have the system deliver a corrupt copy of a document. These estimates can be converted to monetary costs using technology cost curves, and thus compared to the value of the plaintext or document at risk.

We introduce our design principles (Section 2) and the deployed test system (Section

