Experience from an operational map-reduce cluster reveals that outliers significantly prolong job completion. Mantri culls outliers using cause- and resource-aware techniques. Its strategies include smart restart of outliers, network-aware placement of tasks and protecting outputs of valuable tasks. Deployment in Bing’s production cluster and extensive trace-driven simulation indicate that Mantri is 3.1x more effective than the existing state-of-the-art in improving job completion times.
- Ganesh Ananthanarayanan, Srikanth Kandula, Albert Greenberg, Ion Stoica, Yi Lu, Bikas Saha, and Edward Harris, Reining in the Outliers in Map-Reduce Clusters using Mantri, in 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI), USENIX, 6 October 2010.
- Ganesh Ananthanarayanan, Srikanth Kandula, Albert Greenberg, Ion Stoica, Yi Lu, Bikas Saha, and Edward Harris, Reining in the Outliers in Map-Reduce Clusters using Mantri, no. MSR-TR-2010-69, 15 May 2010.
|Ganesh Ananthanarayanan||UC Berkeley (MSR Intern)|
|Ion Stoica||UC Berkeley|
|Yi Lu||MSR, now at UIUC|
Combating Stragglers in MapReduce Networks @ Brown IPP Symposium on Cloud Computing, May 2010
Combating Stragglers in MapReduce Networks @ MSR-Technology Advisory Board Meeting, July 2010