|
|
Farsite
| SIGMETRICS
2000 |
Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs |
| 4th USENIX WSS |
Single Instance Storage in Windows® 2000 |
| 4th IEEE MASCOTS |
Large-Scale Simulation of Replica Placement Algorithms for a Serverless Distributed File System |
| 9th ESA |
Modeling Replica Placement in a Distributed File System: Narrowing the Gap between Analysis and Simulation |
| 15th DISC |
Competitive Hill-Climbing Strategies for Replica Placement in a Distributed File System |
| 4th IEEE SRDS |
Optimizing File Availability in a Secure Serverless Distributed File System |
| 1st IPTPS |
The Sybil Attack |
| USENIX 2002 |
Cooperative Task
Management without Manual Stack Management |
| 22nd ICDCS |
Reclaiming Space from Duplicate Files in a Serverless Distributed File System |
| 5th OSDI |
FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment |
|
18th ACSAC |
A Secure Directory Service based on Exclusive Encryption |
Details
|
Conference |
SIGMETRICS
2000 |
|
Title |
Feasibility of a Serverless Distributed File System Deployed on an Existing
Set of Desktop PCs |
|
Authors |
William J. Bolosky, John R. Douceur, David
Ely, and Marvin Theimer |
|
Abstract |
We consider an architecture for a serverless distributed file system that
does not assume mutual trust among the client computers. The system provides
security, availability, and reliability by distributing multiple encrypted
replicas of each file among the client machines. To assess the feasibility of
deploying this system on an existing desktop infrastructure, we measure and
analyze a large set of client machines in a commercial environment. In
particular, we measure and report results on disk usage and content; file
activity; and machine uptimes, lifetimes, and loads. We conclude that the
measured desktop infrastructure would passably support our proposed system,
providing availability on the order of one unfilled file request per user per
thousand days. |
|
Keywords |
Serverless distributed file system architecture, personal
computer usage data, availability, reliability, security, trust, workload
characterization, analytical modeling, feasibility analysis. |
Citation |
W. J. Bolosky, J. R. Douceur, D.
Ely, M. Theimer; Proceedings of the international conference on
Measurement and modeling of computer systems, 2000, pp. 34-43 |
| Definitive Version |
 |
|
Download |
  |
| Notice |
This copy is posted by permission of the ACM and may not be redistributed. |
|
Conference |
4th
USENIX Windows Systems Symposium |
|
Title |
Single Instance Storage in Windows®
2000 |
|
Authors |
William J. Bolosky, Scott Corbin, David
Goebel, and John R. Douceur |
|
Abstract |
Certain applications, such as Windows 2000's
Remote Install service, can result in a set of files in which many
different files have the same content. Using a traditional file system
to store these files separately results in excessive use of disk and
main memory file cache space. Using hard or symbolic links would
eliminate the excess resource requirements, but changes the semantics of
having separate files, in that updates to one "copy" of a file
would be visible to users of another "copy." We describe the
Single Instance Store (SIS), a component within Windows®
2000 that implements links with the semantics of copies for files stored
on a Windows 2000 NTFS volume. SIS uses copy-on-close to implement the
copy semantics of its links. SIS is structured as a file system filter
driver that implements links and a user level service that detects
duplicate files and reports them to the filter for conversion into
links. Because SIS links are semantically identical to separate files,
SIS creates them automatically when it detects files with duplicate
contents. This paper describes the design and implementation of SIS in
detail, briefly presents measurements of a remote install server showing
a 58% disk space savings by using SIS, and discusses other possible uses
of SIS. |
| Citation |
W. J. Bolosky, S. Corbin, D.
Goebel, J. R. Douceur; Proceedings of the 4th USENIX Windows Systems
Symposium, 2000, pp. 13-24 |
|
Download |
  |
| Notice |
The original publication of this
paper was granted to USENIX. Copyright to this work is retained by
the authors. Permission is granted for the noncommercial
reproduction of the complete work for educational or research purposes. |
|
Conference |
9th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) |
|
Title |
Large-Scale Simulation of Replica Placement Algorithms for a Serverless Distributed File System |
|
Authors |
John R. Douceur and Roger P. Wattenhofer |
|
Abstract |
Farsite is a scalable, distributed file system
that logically functions as a centralized file server but that is
physically implemented on a set of client desktop computers. Farsite
provides high degrees of reliability and availability by storing
replicas of files on multiple machines. Replicas are placed to maximize
the effective system availability, using a distributed, iterative,
randomized placement algorithm. We perform a large-scale simulation of
three candidate algorithms using machine availability data collected
from over 50,000 desktop computers. We find that algorithmic efficiency
and placement efficacy run counter to each other. We fit analytic
functions to the improvement rates and provide explanations for the
fitted curves. We explore the algorithms’ properties through study of
their dynamic behavior. We visualize algorithmic placements and compare
them to theoretical worst cases. We quantify the degree of machine
failure correlation and develop a formula to approximate its effect. |
| Citation |
J. R. Douceur, R. P.
Wattenhofer; Proceedings of 9th IEEE MASCOTS, 2001, pp. 311-319 |
|
Download |
  |
| Notice |
©
2001 IEEE. Personal use of this material is permitted. However, permission
to reprint/republish this material for advertising or promotional purposes
or for creating new collective works for resale or redistribution to
servers or lists, or to reuse any copyrighted component of this work in
other works must be obtained from the IEEE. |
|
Conference |
9th Annual European Symposium on
Algorithms |
|
Title |
Modeling Replica Placement in a Distributed
File System: Narrowing the Gap between Analysis and Simulation |
|
Authors |
John R. Douceur and Roger P. Wattenhofer |
|
Abstract |
We examine the replica placement aspect of a
distributed peer-to-peer file system that replicates and stores files on
ordinary desktop computers. It has been shown that some desktop machines
are available for a greater fraction of time than others, and it is
crucial not to place all replicas of any file on machines with low
availability. In this paper we study the efficacy of three hill-climbing
algorithms for file replica placement. Based on large-scale
measurements, we assume that the distribution of machine availabilities
be uniform. Among other results we show that the MinMax algorithm is
competitive, and that for growing replication factor the MinMax and
MinRand algorithms have the same asymptotic worst-case efficacy. |
| Citation |
J. R. Douceur, R. P. Wattenhofer; Proceedings of
9th ESA, 2001, pp. 356-367 |
|
Download |
  |
Tech Report
(Extended Version) |
  |
| Notice |
©
Springer-Verlag. |
|
Conference |
15th International Symposium on
Distributed Computing |
|
Title |
Competitive Hill-Climbing Strategies for
Replica Placement in a Distributed File System |
|
Authors |
John R. Douceur and Roger P. Wattenhofer |
|
Abstract |
The Farsite distributed file system stores
multiple replicas of files on multiple machines, to provide file access
even when some machines are unavailable. Farsite assigns file replicas
to machines so as to maximally exploit the different degrees of
availability of different machines, given an allowable replication
factor R. We use competitive analysis and simulation to study the
performance of three candidate hill-climbing replica placement
strategies, MinMax, MinRand, and RandRand, each of which successively
exchanges the locations of two file replicas. We show that the MinRand
and RandRand strategies are perfectly competitive for R = 2 and
2/3-competitive for R = 3. For general R, MinRand is at
least 1/2-competitive and RandRand is at least 10/17-competitive. The
MinMax strategy is not competitive. Simulation results show better
performance than the theoretic worst-case bounds. |
| Citation |
J. R. Douceur, R. P. Wattenhofer; Proceedings of
15th DISC, 2001, pp. 48-62 |
|
Download |
  |
Tech Report
(Reformatted) |
  |
| Notice |
©
Springer-Verlag. |
|
Conference |
20th Symposium on Reliable Distributed
Systems |
|
Title |
Optimizing File Availability in a Secure Serverless Distributed File System |
|
Authors |
John R. Douceur and Roger P. Wattenhofer |
|
Abstract |
Farsite is a secure, scalable, distributed file
system that logically functions as a centralized file server but that is
physically realized on a set of client desktop computers. Farsite
provides security, reliability, and availability by storing replicas of
each file on multiple machines. It continuously monitors machine
availability and relocates replicas as necessary to maximize the
effective availability of the system. We evaluate several
replica-placement methods using large-scale simulation with machine
availability data from over 50,000 desktop computers. We find that
initially placing replicas in an availability-sensitive fashion yields
pathological results, whereas very good results are obtained by random
initial placement followed by incremental improvement using a scalable,
distributed, fault-tolerant, and attack-resistant hill-climbing
algorithm. The algorithm is resilient to severe restrictions on
communication and replica placement, and it does not excessively
co-locate replicas of different files on the same set of machines. |
| Citation |
J. R. Douceur, R. P. Wattenhofer; Proceedings of
20th IEEE SRDS, 2001, pp. 4-13 |
|
Download |
  |
| Notice |
©
2001 IEEE. Personal use of this material is permitted. However, permission
to reprint/republish this material for advertising or promotional purposes
or for creating new collective works for resale or redistribution to
servers or lists, or to reuse any copyrighted component of this work in
other works must be obtained from the IEEE. |
|
Conference |
1st International Workshop on Peer-to-Peer
Systems |
|
Title |
The Sybil Attack |
|
Author |
John R. Douceur |
|
Abstract |
Large-scale peer-to-peer
systems face security threats from faulty or hostile remote computing
elements. To resist these threats, many such systems employ redundancy.
However, if a single faulty entity can present multiple identities,
it can control a substantial fraction of the system, thereby undermining
this redundancy. One approach to preventing these “Sybil attacks” is
to have a trusted agency certify identities. This paper shows that,
without a logically centralized authority, Sybil attacks are always
possible except under extreme and unrealistic assumptions of resource
parity and coordination among entities. |
| Citation |
J. R. Douceur; Proceedings of
1st IPTPS, 2002 |
|
Download |
  |
|
Conference |
USENIX 2002 Annual Technical Conference |
|
Title |
Cooperative Task Management without Manual
Stack Management |
|
Authors |
Atul Adya, Jon Howell, Marvin Theimer, William
J. Bolosky, and John R. Douceur |
|
Abstract |
Cooperative task management can provide program
architects with ease of reasoning about concurrency issues. This
property is often espoused by those who recommend “event-driven”
programming over “multi-threaded” programming. Those terms conflate
several issues. In this paper, we clarify the issues, and show how one
can get the best of both worlds: reason more simply about concurrency in
the way “event-driven” advocates recommend, while preserving the
readability and maintainability of code associated with “multithreaded”
programming.
We identify the source of confusion about the two programming styles as
a conflation of two concepts: task management and stack management.
Those two concerns define a two-axis space in which “multithreaded” and
“event-driven” programming are diagonally opposite; there is a third
“sweet spot” in the space that combines the advantages of both
programming styles. We point out pitfalls in both alternative forms of
stack management, manual and automatic, and we supply techniques that
mitigate the danger in the automatic case. Finally, we exhibit adaptors
that enable automatic stack management code and manual stack management
code to interoperate in the same code base. |
| Citation |
A. Adya, J. Howell, M. Theimer, W. J. Bolosky, J. R. Douceur; Proceedings of USENIX 2002 Annual Technical Conference |
|
Download |
  |
| Notice |
The original publication of this
paper was granted to USENIX. Copyright to this work is retained by
the authors. Permission is granted for the noncommercial
reproduction of the complete work for educational or research purposes. |
|
Conference |
22nd International
Conference on Distributed Computing Systems |
| Title |
Reclaiming Space from Duplicate Files in a Serverless
Distributed File System |
| Authors |
John R. Douceur, Atul
Adya, William J. Bolosky, Dan Simon, and Marvin Theimer |
| Abstract |
The Farsite
distributed file system provides availability by replicating
each file onto multiple desktop computers. Since this
replication consumes significant storage space, it is important
to reclaim used space where possible. Measurement of over 500
desktop file systems shows that nearly half of all consumed
space is occupied by duplicate files. We present a mechanism to
reclaim space from this incidental duplication to make it
available for controlled file replication. Our mechanism
includes 1) convergent encryption, which enables duplicate files
to coalesced into the space of a single file, even if the files
are encrypted with different users’ keys, and 2) SALAD, a
Self-Arranging, Lossy, Associative Database for aggregating file
content and location information in a decentralized, scalable,
fault-tolerant manner. Large-scale simulation experiments show
that the duplicate-file coalescing system is scalable, highly
effective, and fault-tolerant. |
| Citation |
J. R. Douceur, A.
Adya, W. J. Bolosky, D. Simon, M. Theimer; Proceedings of 22nd
IEEE ICDCS, 2002. |
| Download |
  |
Tech Report
(Extended Version) |
  |
| Notice |
©
2002 IEEE. Personal use of this material is permitted. However,
permission to reprint/republish this material for advertising or
promotional purposes or for creating new collective works for
resale or redistribution to servers or lists, or to reuse any
copyrighted component of this work in other works must be
obtained from the IEEE. |
|
Conference |
5th Symposium on Operating
Systems Design and Implementation |
|
Title |
FARSITE: Federated, Available, and Reliable
Storage for an Incompletely Trusted Environment |
|
Authors |
Atul Adya, William J. Bolosky, Miguel Castro,
Gerald Cermak, Ronnie Chaiken, John R. Douceur, Jon Howell, Jacob R.
Lorch, Marvin Theimer, Roger P. Wattenhofer |
|
Abstract |
Farsite is a secure, scalable file system that
logically functions as a centralized file server but is physically
distributed among a set of untrusted computers. Farsite provides file
availability and reliability through randomized replicated storage; it
ensures the secrecy of file contents with cryptographic techniques; it
maintains the integrity of file and directory data with a
Byzantine-fault-tolerant protocol; it is designed to be scalable by
using a distributed hint mechanism and delegation certificates for
pathname translations; and it achieves good performance by locally
caching file data, lazily propagating file updates, and varying the
duration and granularity of content leases. We report on the design of
Farsite and the lessons we have learned by implementing much of that
design. |
| Citation |
A. Adya, W. J. Bolosky, M.
Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, J. R. Lorch, M.
Theimer, R. P. Wattenhofer; Proceedings of the 5th OSDI, December 2002. |
|
Download |
  |
| Notice |
The original publication of this
paper was granted to USENIX. Copyright to this work is retained by
the authors. Permission is granted for the noncommercial
reproduction of the complete work for educational or research purposes. |
|
Conference |
18th Annual Computer Security Applications
Conference |
|
Title |
A Secure Directory Service based on Exclusive
Encryption |
|
Authors |
John R. Douceur, Atul Adya, Josh Benaloh,
William J. Bolosky, Gideon Yuval |
|
Abstract |
We describe the design of a Windows file-system
directory service that ensures the persistence, integrity, privacy,
syntactic legality, and case-insensitive uniqueness of the names it
indexes. Byzantine state replication provides persistence and integrity,
and encryption imparts privacy. To enforce Windows’ baroque name syntax
– including restrictions on allowable characters, on the terminal
character, and on several specific names – we develop a cryptographic
process, called “exclusive encryption,” that inherently excludes
syntactically illegal names and that enables the exclusion of
case-insensitively duplicate names without access to their plaintext.
This process excludes entire names by mapping the set of allowed strings
to the set of all strings, excludes certain characters through an
amended prefix encoding, excludes terminal characters through varying
the prefix coding by character index, and supports case-insensitive
comparison of names by extracting and encrypting case information
separately. We also address the issues of hiding name-length information
and access-authorization information, and we report a newly discovered
problem with enforcing case-insensitive uniqueness for Unicode names. |
| Citation |
J. R. Douceur, A. Adya, J.
Benaloh, W. J. Bolosky, G. Yuval; Proceedings of the 18th ACSAC, December 2002. |
|
Download |
  |
| Notice |
©
2002 IEEE. Personal use of this material is permitted. However, permission
to reprint/republish this material for advertising or promotional purposes
or for creating new collective works for resale or redistribution to
servers or lists, or to reuse any copyrighted component of this work in
other works must be obtained from the IEEE. |
|