Talks and Poster Presentations (without Proceedings-Entry):
"The Art of MPI Benchmarking";
Talk: Lunchtime Seminar, Department of Computer Science, University of Innsbruck,
Innsbruck, Austria (invited);
The Message Passing Interface (MPI) is the prevalent programming model used on current supercomputers. Therefore, MPI library developers strive for the best possible performance (shortest run-time) of individual MPI functions across many different supercomputer architectures. Several MPI benchmark suites have been developed to assess the performance of MPI implementations. Unfortunately, the outcome of these benchmarks is often neither reproducible nor statistically sound. To overcome these issues, we show which experimental factors have an impact on the run-time of blocking collective MPI operations and how to measure their effect. We present a new experimental method that allows us to obtain reproducible and statistically sound measurements of MPI functions. To obtain reproducible measurements, a common approach is to synchronize all processes before executing an MPI collective operation. We therefore take a closer look at two commonly used process synchronization schemes: (1) relying on MPI_Barrier or (2) applying a window-based scheme using a common global time. We analyze both schemes experimentally and show the strengths and weaknesses of each approach. Finally, we propose an automatic way to check whether MPI libraries respect self-consistent performance guidelines. In this talk, we take a closer look at the PGMPI framework, which can benchmark MPI functions and detect violations of performance guidelines.
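To illustrate the window-based synchronization scheme mentioned above, the following is a minimal sketch (not the PGMPI code, and independent of any real MPI library): each measurement repetition is assigned a fixed slot of a common global time, and every process converts that global start time to its local clock using a previously estimated clock offset. All function names and parameters here are hypothetical.

```python
def window_start_times(t_global_start, window_len, n_reps):
    """Assign each repetition k a fixed global start time slot.

    Every process begins repetition k at the same global instant,
    t_global_start + k * window_len, instead of waiting in a barrier.
    """
    return [t_global_start + k * window_len for k in range(n_reps)]


def local_start_time(global_time, clock_offset):
    """Convert a global start time to a process's local clock.

    clock_offset is the (assumed already estimated) difference between
    this process's local clock and the reference clock.
    """
    return global_time - clock_offset


# Example: 3 repetitions in 0.5 s windows, starting at global time 100 s.
starts = window_start_times(100.0, 0.5, 3)   # [100.0, 100.5, 101.0]
```

A process would then busy-wait on its local clock until `local_start_time(starts[k], offset)` before issuing the collective, so all processes enter repetition k at (approximately) the same moment; the window length must be chosen large enough that each repetition finishes within its slot.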
Created from the Publication Database of the Vienna University of Technology.