Parallel-Downloads (a.k.a. swarming)

Parallel-Download is a technique used to fetch files from multiple sources. Such sources can be Web servers, Web caches, P2P nodes, etc. Parallel-downloads were first proposed back in early 1999 in this paper as a way to solve the server-selection problem. Since then, parallel-downloads have been integrated in many Internet applications and have become the core of many P2P file-swarming systems.

Parallel-downloads speed up download time and eliminate the server-selection problem. Rather than picking a specific server to download a file from, the client selects a number of servers and downloads the content from all of them in parallel, fetching different parts of the content from different servers. As clients download content, they immediately become sources of that content for other clients using parallel-downloads.

Clients experience a transfer rate approximately equal to the sum of the individual transfer rates of the servers contacted, which helps saturate their links and ensures strong robustness against server failures and network fluctuations.

Fig1. Parallel-Download Architecture

 

In addition to applications in content distribution networks and P2P systems, parallel-downloads can also be used to aggregate bandwidth from multiple network interfaces, e.g. wireless interfaces. For instance, using parallel-downloads, multi-homed wireless devices can stripe data connections across all their network interfaces. The benefits of using multiple interfaces in parallel include higher throughput, smaller bandwidth fluctuations, extended coverage, smooth handoffs, and minimal disruption when links come and go.

How does it work?

Assume that a client wants to download a large file. Normally, the client would download the whole file from a single source, limiting its throughput to the speed of the selected source. With parallel-downloads, the client picks a set of the available sources and downloads different parts of the file from each of them in parallel. To this end, the file is divided into many small data blocks at each server. Clients use a greedy request policy: whenever a server becomes idle, the client requests one of the still-missing blocks from it. In this way, the load assigned to each server adjusts dynamically as the download progresses, with faster servers delivering more blocks, and downloads are significantly accelerated.
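The greedy request policy can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each server delivers blocks at a fixed rate, there is no request latency, and the function name and parameters are hypothetical rather than taken from the original work.

```python
import heapq


def greedy_parallel_download(num_blocks, server_rates):
    """Simulate a greedy block-request policy (illustrative only).

    server_rates: assumed constant rate of each server, in blocks per second.
    Returns (finish_time_in_seconds, blocks_served_per_server).
    """
    # Each heap entry is (time at which the server becomes idle, server index).
    idle_at = [(0.0, i) for i in range(len(server_rates))]
    heapq.heapify(idle_at)
    served = [0] * len(server_rates)
    finish = 0.0
    for _ in range(num_blocks):
        # The next server to become idle is given the next missing block.
        t, i = heapq.heappop(idle_at)
        done = t + 1.0 / server_rates[i]
        served[i] += 1
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return finish, served


finish, served = greedy_parallel_download(7, [4.0, 2.0, 1.0])
print(finish, served)
```

In this toy setting, with servers delivering 4, 2, and 1 blocks per second, the 7 blocks are split 4/2/1 and the download takes 1.0 second, against 1.75 seconds from the fastest server alone. Faster servers end up serving more blocks without the client having to measure their speed in advance.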

At the end of the download, the last few blocks of the file are requested from several servers simultaneously. This end-game strategy greatly reduces the download time and prevents the download from being stalled by the slowest servers. One way to implement parallel-downloads using standard protocols is through HTTP/1.1 byte-range requests.
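As a rough sketch of how blocks could map onto byte-range requests, the helpers below split a file of known size into fixed-size blocks and format the corresponding HTTP/1.1 Range header values. The function names and the block size are assumptions made for illustration. The only protocol fact relied on is that both offsets in a "bytes=first-last" range are inclusive.

```python
def block_ranges(file_size, block_size):
    """Yield inclusive (first_byte, last_byte) pairs that cover the file."""
    for start in range(0, file_size, block_size):
        # The last block may be shorter than block_size.
        yield start, min(start + block_size, file_size) - 1


def range_header(first_byte, last_byte):
    """Format an HTTP/1.1 Range header value; both offsets are inclusive."""
    return f"bytes={first_byte}-{last_byte}"


for first, last in block_ranges(10, 4):
    print("Range:", range_header(first, last))
```

Each header can be sent to a different server. During the end-game, the same header for a still-missing block would simply be sent to several servers at once, and the first complete response kept.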

 

Related Projects:

Initial work on Parallel-Downloads at Institut Eurecom.

Performance of Parallel-Downloads under extreme loads (Georgia Tech)

MAR: A Mobile Access Commuter Router

Members:

    Pablo Rodriguez (pablo@microsoft.com)

Publications:

  • P. Rodriguez, E. W. Biersack, "Dynamic Parallel-Access to Replicated Content in the Internet". In IEEE/ACM Transactions on Networking, August 2002 (also in IEEE INFOCOM 2000) [pdf]


©2004 Microsoft Corporation. All rights reserved.