Programming models such as Hive and DryadLINQ give programmers simple declarative abstractions for writing data-intensive computations that run on large clusters of machines. However, this level of abstraction comes at a cost: it becomes hard to understand, predict, and debug performance. This project aims to build models that predict a query's performance and identify the resources and computations that are its bottlenecks.
- Ankush Desai, Kaushik Rajan, and Kapil Vaswani, Critical Path based Performance Models for Distributed Queries, Technical Report MSR-TR-2012-121, 7 December 2012.
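
As a rough illustration of the critical-path idea named in the report title, the sketch below treats a query plan as a DAG of stages and finds the longest dependency chain, which bounds the query's completion time and points at the stages most worth optimizing. This is a generic textbook computation, not the model from the report; the stage names, durations, and the `critical_path` helper are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, deps):
    """Return (total_time, path) for the longest dependency chain.

    durations: stage name -> estimated run time (hypothetical, in seconds)
    deps:      stage name -> list of stages that must finish first
    """
    finish = {}  # earliest finish time of each stage
    prev = {}    # predecessor on the longest chain leading into each stage
    graph = {stage: deps.get(stage, []) for stage in durations}
    # static_order() yields every stage after all of its predecessors.
    for stage in TopologicalSorter(graph).static_order():
        start = 0.0
        prev[stage] = None
        for parent in graph[stage]:
            if finish[parent] > start:
                start = finish[parent]
                prev[stage] = parent
        finish[stage] = start + durations[stage]
    # Walk back from the stage that finishes last to recover the chain.
    last = max(finish, key=finish.get)
    total = finish[last]
    path = []
    while last is not None:
        path.append(last)
        last = prev[last]
    return total, path[::-1]


# Hypothetical four-stage plan: two scans feed a join, then an aggregate.
durations = {"scan_a": 4.0, "scan_b": 9.0, "join": 5.0, "aggregate": 2.0}
deps = {"join": ["scan_a", "scan_b"], "aggregate": ["join"]}
total, path = critical_path(durations, deps)
```

In this made-up plan the slower scan lies on the critical path, so speeding up `scan_a` alone would not shorten the query. A real model would also have to account for resource contention and data skew, which this sketch ignores.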