Parallel-Downloads (a.k.a. swarming)
Parallel-download is a technique for fetching a file
from multiple sources at the same time. Such sources can be Web servers, Web
caches, P2P nodes, etc.
Parallel-downloads were first proposed back in early 1999 in this
paper as a way to solve the server-selection problem.
Since then, parallel-downloads
have been integrated in many Internet applications and have
become the core of many P2P file-swarming systems.
Parallel-downloads
reduce download time and eliminate the server-selection
problem.
Rather than picking a single server from which to download a file, a
number of servers are selected and content is downloaded in
parallel from all of them, with different parts of the
content coming from different servers. As clients download content,
they immediately become sources of that content for other
clients using parallel-downloads.
In the ideal case, clients
experience a transfer rate equal to the sum of the individual transfer rates of
the servers contacted, which helps saturate their
links and provides strong robustness against server failures or
network fluctuations.
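The rate claim can be illustrated with a small calculation. This is a minimal sketch, not a measurement: the function names and the example rates are hypothetical, and capping the total at the client's own link capacity is an added assumption that the text only implies when it speaks of saturating the client's link.

```python
def aggregate_rate(server_rates, client_capacity):
    """Idealized download rate: the sum of the per-server rates,
    capped by the client's own link capacity (an assumption)."""
    return min(sum(server_rates), client_capacity)


def rate_after_failure(server_rates, client_capacity, failed_index):
    """Rate the client still sees if one server stops responding."""
    remaining = [r for i, r in enumerate(server_rates) if i != failed_index]
    return aggregate_rate(remaining, client_capacity)


# Hypothetical example: three servers, rates in Mbit/s, client link of 8 Mbit/s.
rates = [2.0, 3.0, 5.0]
print(aggregate_rate(rates, client_capacity=8.0))                      # 8.0 (link saturated)
print(rate_after_failure(rates, client_capacity=8.0, failed_index=2))  # 5.0
```

Losing the fastest server lowers the rate but does not stop the download, which is the robustness property described above.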

Fig. 1. Parallel-Download Architecture
In addition to applications in content distribution networks
and P2P systems, parallel-downloads can also be used to
aggregate bandwidth from multiple
network interfaces, e.g. wireless interfaces. For instance,
using parallel-downloads, multi-homed wireless devices can
stripe data connections across all their network interfaces. The benefits of
using multiple interfaces in parallel include higher throughput, smaller
bandwidth fluctuations, extended coverage, smooth handoffs, and minimal
disruption as links come and go.
Assume that a client wants to download
a large file. Normally, the client
would download the whole file from a single
source, limiting its throughput
to the speed of the selected source. With parallel-downloads, the
client picks a set of the available
sources and downloads different parts of the file from each of them
in parallel.
To this end, the
file is divided into many small data blocks at each server.
Clients use a greedy request policy: whenever a server becomes idle, the client
requests one of the still-missing blocks from it. In this way, the amount of load assigned to each server is
adjusted dynamically as the download progresses, with faster servers delivering more blocks, and
downloads are significantly accelerated.
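The greedy policy can be sketched as a small event-driven simulation. This is an illustrative sketch under simplifying assumptions, not the scheduler from the paper: every server serves blocks at a fixed rate, requests have no latency, and the name `greedy_download` and its parameters are hypothetical.

```python
import heapq


def greedy_download(num_blocks, block_size, server_rates):
    """Simulate a greedy request policy.

    server_rates: bytes per second for each server (hypothetical values).
    Returns (finish_time, blocks_per_server).
    """
    # Every server starts idle at time 0; the heap holds (time_idle, server_id).
    idle_events = [(0.0, s) for s in range(len(server_rates))]
    heapq.heapify(idle_events)
    blocks_per_server = [0] * len(server_rates)
    finish_time = 0.0
    for _ in range(num_blocks):
        # The next missing block goes to whichever server becomes idle first.
        now, server = heapq.heappop(idle_events)
        done = now + block_size / server_rates[server]
        blocks_per_server[server] += 1
        finish_time = max(finish_time, done)
        heapq.heappush(idle_events, (done, server))
    return finish_time, blocks_per_server


# Hypothetical example: 10 blocks of 1 byte, one slow and one fast server.
finish, per_server = greedy_download(10, 1.0, [1.0, 4.0])
print(per_server)  # [2, 8]
print(finish)      # 2.0
```

In this example the server that is four times faster ends up delivering four times as many blocks without the client ever estimating server speeds in advance; the load split simply falls out of asking whichever server is idle.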
At the
end of the download, the last few pieces of
the file are requested from several servers simultaneously.
This end-game solution greatly reduces the download time
and prevents downloads from being stalled by the slowest
servers. One way to implement parallel-downloads using
standard protocols is
through the use of HTTP/1.1 byte-range requests.
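As a rough illustration of the byte-range approach, the sketch below splits a file into blocks and builds one HTTP `Range` request per block using Python's standard `urllib.request`. The helper names, the mirror URLs, and the block size are hypothetical, and the requests are only constructed, not sent. A server that supports range requests answers such a request with `206 Partial Content`.

```python
import urllib.request


def split_into_ranges(file_size, block_size):
    """Return inclusive (first_byte, last_byte) pairs covering the whole file."""
    return [(start, min(start + block_size, file_size) - 1)
            for start in range(0, file_size, block_size)]


def build_range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for one block of the file."""
    headers = {"Range": f"bytes={first_byte}-{last_byte}"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical example: a 1000-byte file fetched in 300-byte blocks.
ranges = split_into_ranges(file_size=1000, block_size=300)
print(ranges)  # [(0, 299), (300, 599), (600, 899), (900, 999)]

# Placeholder mirror URLs; any server holding the same file could serve any block.
mirrors = ["http://mirror-a.example/file.bin", "http://mirror-b.example/file.bin"]
request = build_range_request(mirrors[0], *ranges[0])
print(request.get_header("Range"))  # bytes=0-299
```

Because each block is addressed only by its byte offsets, the same block can be requested from any mirror, which is what lets a client reassign blocks between servers as the download progresses.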
Related Projects:
Initial work on Parallel-Downloads at Institut Eurecom.
Performance of Parallel-Downloads under extreme loads (Georgia Tech)
MAR: A Mobile Access Commuter Router
Members:
Pablo Rodriguez (pablo@microsoft.com)
Publications:
- P. Rodriguez, E. W. Biersack, "Dynamic Parallel-Access to Replicated Content in the Internet".
In IEEE/ACM Transactions on Networking, August 2002 (also in IEEE Infocom 2000) [pdf].