Oracle Labs
How the Data Flows
The library caches communicate with each other continually but very slowly, using the Library Cache Auditing Protocol (LCAP), which Reich and Rosenthal created. LCAP uses unicast and multicast IP datagrams to enable the LOCKSS daemons to challenge each other to vote in polls proving that their respective copies of journal volumes, issues, and articles are the same. If a daemon loses a poll, it fetches a new copy of the damaged content from the publisher or from one of the winning daemons. This mechanism is analogous to inter-library loans.
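The polling-and-repair idea described above can be illustrated with a minimal sketch. It is not the LCAP implementation: the names `digest` and `run_poll` are hypothetical, and comparing copies by SHA-256 digest and deciding by simple majority are assumptions made only for illustration. The real protocol runs slowly over IP datagrams and is considerably more involved.

```python
from __future__ import annotations

import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    # Assumption: copies are compared by SHA-256 digest.
    # The article does not say how LCAP compares content.
    return hashlib.sha256(content).hexdigest()


def run_poll(copies: dict[str, bytes]) -> dict[str, bytes]:
    """Hold one poll over the caches' copies of the same article.

    Each cache "votes" with the digest of its copy. Caches that disagree
    with a strict majority lose the poll and are repaired from a winner,
    much like an inter-library loan.
    """
    votes = {name: digest(data) for name, data in copies.items()}
    winning_digest, count = Counter(votes.values()).most_common(1)[0]

    # Without a strict majority nobody can be declared damaged.
    if count * 2 <= len(copies):
        return dict(copies)

    # Any winning cache can lend its copy to the losers.
    good_copy = next(
        data for name, data in copies.items() if votes[name] == winning_digest
    )
    return {
        name: data if votes[name] == winning_digest else good_copy
        for name, data in copies.items()
    }


if __name__ == "__main__":
    caches = {
        "library-a": b"volume 12, issue 3",
        "library-b": b"volume 12, issue 3",
        "library-c": b"volume 12, iss\x00e 3",  # damaged copy
    }
    repaired = run_poll(caches)
    print(repaired["library-c"] == caches["library-a"])
```

In this toy version the more replicas that take part, the harder it is for a damaged or malicious minority to win a poll, which is the property the article attributes to the real system.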
The system's reliability depends not on LCAP, which is itself unreliable, but on the presence of large numbers of replicas and on the voting mechanism. LCAP provides "public" communication: a daemon cannot be certain which other daemons heard a message it sent. This enables daemons to make their own estimates of the credibility of other daemons using a reputation system.
Perhaps the most intriguing part of the integrity system is that it is designed to leverage the unique characteristics of the LOCKSS system. Because it is distributed rather than centrally administered, there is no single point of failure. Because LOCKSS runs very slowly, an attacker "must persist in taking bad actions over a long period of time," according to Reich and Rosenthal. "By operating slowly even on human time scales, the system makes it easier to detect an attacker and limits the damage he can do before being stopped."
The LOCKSS integrity system is forgiving, too, which is remarkable for an autonomous caching system. For this it relies on maintaining a record of public behavior. Since each cache maintains a registry of every other cache's polling behavior, mistrusted caches are eventually excluded from polling, copying, and lending operations. If the mistrusted cache changes its ways, demonstrating its credibility in a sufficient number of polls over time, it is readmitted to the peer group and then granted voting and lending privileges in the LOCKSS system.
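The exclusion and readmission behavior can be sketched as a small per-cache registry. This is only an illustration: the class `PeerRegistry`, its scoring rule, and both thresholds are hypothetical, since the article does not describe how the reputation system scores behavior. The sketch also assumes that an excluded peer's public poll behavior can still be observed and recorded, which is how it can earn its way back.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PeerRegistry:
    """One cache's private record of how other caches behave in polls."""

    exclude_at: int = -3     # hypothetical: score at which a peer is mistrusted
    readmit_after: int = 5   # hypothetical: consecutive good polls to be readmitted
    scores: dict[str, int] = field(default_factory=dict)
    streaks: dict[str, int] = field(default_factory=dict)
    excluded: set[str] = field(default_factory=set)

    def record(self, peer: str, agreed_with_majority: bool) -> None:
        """Record one observed poll result for a peer."""
        if agreed_with_majority:
            self.scores[peer] = self.scores.get(peer, 0) + 1
            self.streaks[peer] = self.streaks.get(peer, 0) + 1
        else:
            self.scores[peer] = self.scores.get(peer, 0) - 1
            self.streaks[peer] = 0

        if peer in self.excluded:
            # Forgiveness: enough consecutive good polls restores the peer.
            if self.streaks[peer] >= self.readmit_after:
                self.excluded.discard(peer)
                self.scores[peer] = 0
        elif self.scores[peer] <= self.exclude_at:
            self.excluded.add(peer)
            self.streaks[peer] = 0

    def is_trusted(self, peer: str) -> bool:
        """Trusted peers may take part in polling, copying, and lending."""
        return peer not in self.excluded


if __name__ == "__main__":
    registry = PeerRegistry()
    for _ in range(3):
        registry.record("library-c", agreed_with_majority=False)
    print(registry.is_trusted("library-c"))
    for _ in range(5):
        registry.record("library-c", agreed_with_majority=True)
    print(registry.is_trusted("library-c"))
```

Because every cache keeps its own registry, no central authority decides who is trusted, which is consistent with the article's point that the system has no single point of failure.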
Reich and Rosenthal are quick to point out that theirs is not "a general-purpose Web content preservation system." LOCKSS is designed only for Web journals such as those published by Stanford's HighWire Press. To be sure, LOCKSS's slow, methodical polling and copying system is "clearly not suitable for volatile content" such as that of a CNN news site. But Reich and Rosenthal do allow that "it may be possible to apply the system to other types of content."
The current LOCKSS version runs on generic PCs. At current prices, a suitable machine with a 60GB disk in a 1U rack-mount case should cost about $750. The system is distributed as a bootable floppy disk. The system boots and runs Linux from this floppy; there is no operating system installed on the hard disk. The first time the system boots it asks a few questions, then writes the resulting configuration to the floppy, which is then write-locked. At any time, the system can be returned to a known-good state by rebooting it from this write-locked disk.
Each time the system is booted, it downloads, verifies, and installs the necessary application software, including the daemon that manages the LOCKSS cache and the Java virtual machine needed to run it. The system then runs the daemon and starts the HTTP servers that provide the user interface Web pages. The cache's administrator can use these pages to specify the journal volumes to cache and monitor the system's behavior.
For scientists, librarians, and publishers who are concerned that the digital material that has become the record of science will prove as evanescent as the rest of the web, the LOCKSS system is a very promising solution.
It has the capability to deliver on a wide spectrum of needs:
The LOCKSS system is in the midst of a major test involving libraries and publishers around the world. As of September 2001, 45 libraries on five continents have signed on to the project (including Harvard University, the Library of Congress, the New York Public Library, Los Alamos National Laboratory, and the British Library), and 53 publishers are endorsing the LOCKSS beta test. Up-to-date project status is available at:
http://lockss.stanford.edu/projectstatus.htm
The Stanford University Libraries LOCKSS team members are:
The National Science Foundation, Sun Microsystems Laboratories, and Stanford University Libraries funded development and alpha testing of LOCKSS. The worldwide "beta" test in 2001 is made possible through a grant from the Andrew W. Mellon Foundation, equipment donated by and support from Sun Microsystems Laboratories, and support from Stanford University Libraries.
We are grateful to the contributors at our alpha sites:
Special thanks are due to:
