Correctness Checking Concepts and Tools for HPC: Call for Action

Speaker  Ganesh Gopalakrishnan

Affiliation  University of Utah

Host  Shaz Qadeer

Duration  01:05:50

Date recorded  14 May 2014

Today's high-performance computing story is one where problems of ever-increasing scale in science and engineering must be solved under strict power budgets. This necessitates the use of heterogeneous computing elements (e.g., CPUs and GPUs) and also causes significant shifts in the use of established programming APIs (e.g., MPI mixed with OpenMP and CUDA). In addition to detecting defects such as data races and deadlocks in this context, a designer increasingly worries about emerging issues such as resilience, floating-point precision, and even the ability to replay executions. My talk will first give a broad overview of our efforts directed at these problems. It will then focus on our tool GKLEE, which helps locate data races in non-trivial CUDA kernels. I will close with two topics: (1) how the same kinds of concurrency errors pertaining to memory orderings are being repeated, and (2) the hope that by emphasizing correctness checking (in addition to the usual fixation on performance tuning) in basic concurrency courses, we might minimize these frequently committed mistakes.

©2014 Microsoft Corporation. All rights reserved.