Dynamic Orchestration of Massively Data Parallel Execution

While GPUs provide low-cost and efficient platforms for accelerating massively parallel applications, tedious tuning is required to maximize performance. In addition to a complex programming model, there is a lack of performance portability across systems with different runtime properties. Programmers usually make assumptions about runtime properties in order to optimize their code. However, if any of these properties change during execution, the optimized code performs poorly. Overcoming this limitation requires several implementations of the application, each tuned for different runtime properties, but it is not practical for the programmer to write a separate version of the same code for each individual runtime condition. In this talk, I will show how several runtime properties, such as device configuration, input size, dependencies, and data values, impact the performance of a fixed implementation. Next, I will present a static and dynamic compiler framework that relieves the programmer of the burden of fine-tuning different implementations of the same code. This framework allows the programmer to write a program once and use a static compiler to generate different versions of a data parallel application with several tuning parameters. A runtime system then selects the best version and fine-tunes its parameters based on runtime properties. Finally, I will discuss some open challenges and my future plans for providing performance portability across different sets of accelerators.
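
As a rough illustration of the "generate several versions, select one at runtime" idea described in the abstract, the following minimal Python sketch times a few hand-written variants of a sum on a small sample of the input and then runs the fastest one on the full data. It is not the framework presented in the talk: the variants stand in for compiler-generated GPU kernels, and every name in it (`sum_loop`, `sum_chunked`, `VARIANTS`, `select_variant`, `run`) as well as the chunk sizes and sample size are assumptions made for illustration.

```python
import time
from functools import partial


def sum_loop(data):
    """Variant 1: plain element-by-element loop."""
    total = 0
    for x in data:
        total += x
    return total


def sum_chunked(data, chunk):
    """Variant 2: sum fixed-size chunks; `chunk` is the tuning parameter."""
    return sum(sum(data[i:i + chunk]) for i in range(0, len(data), chunk))


# Hypothetical stand-in for the versions a static compiler might emit:
# the same computation with different strategies and tuning parameters.
VARIANTS = {
    "loop": sum_loop,
    "chunked_256": partial(sum_chunked, chunk=256),
    "chunked_4096": partial(sum_chunked, chunk=4096),
}


def select_variant(variants, sample, repeats=3):
    """Time each variant on a sample and return the name of the fastest."""
    best_name, best_time = None, float("inf")
    for name, fn in variants.items():
        start = time.perf_counter()
        for _ in range(repeats):
            fn(sample)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name


def run(data, sample_size=4096):
    """Pick a variant from a sample of the actual input, then run it on all of it."""
    sample = data[:sample_size]
    name = select_variant(VARIANTS, sample)
    return name, VARIANTS[name](data)


if __name__ == "__main__":
    chosen, result = run(list(range(100_000)))
    print(chosen, result)
```

Which variant is chosen depends on the machine and the input, which is the point: the selection is deferred until those runtime properties are known. A real system would also need to account for the cost of the profiling step itself, which this sketch ignores.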

Speaker Details

I am currently a PhD student at the University of Michigan, working with Prof. Scott Mahlke in the Compilers Creating Custom Processors (CCCP) research group. I started my PhD in Fall 2009 and will be graduating in April 2014. I received my bachelor’s degree in electrical engineering from Sharif University of Technology in 2005 and my master’s degree in electrical engineering from the University of Tehran in 2008. My research interests lie in the areas of computer architecture and compilers. I am currently working on designing a new dynamic/static compilation system for future heterogeneous architectures. The goal of this research project is to enable widespread use of different commodity processors and computing components, such as GPUs, in heterogeneous systems.

Date: February 6, 2014
Speakers: Mehrzad Samadi
Affiliation: University of Michigan

- Jeff Running
