Talks and Poster Presentations (with Proceedings-Entry):
"Energy Characterization and Optimization of Parallel Prefix-Sums Kernels";
Talk: 3rd Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2015) in conjunction with Euro-Par 2015,
- 2015-08-28; in: "Euro-Par 2015: Parallel Processing Workshops, Euro-Par 2015 International Workshops, Vienna, Austria, August 24-25, 2015, Revised Selected Papers",
S. Hunold, A. Costan, D. Gimenez, A. Iosup, L. Ricci, G. Gomez Requena, V. Scarano, A. Varbanescu, S. Scott, S. Lankes, J. Weidendorfer, M. Alexander (ed.);
Springer International Publishing,
Prefix-sums appear in numerous computational tasks, and many performance-efficient parallel prefix-sums algorithms have been introduced for shared and distributed memory architectures. However, as far as we know, neither the energy consumption behavior of these algorithms nor their energy-performance trade-offs are known.
This paper is a first attempt to address the energy aspects of CPPS (cache-aware parallel prefix-sums), a high-performance parallel prefix-sums kernel specific to x86 shared memory architectures. We provide implementation details for CPPS and for the various sequential prefix-sums algorithms that are used as its building blocks. We measure the performance and energy consumption of CPPS under different configurations (sequential prefix-sums kernel, CPU frequency, number of threads, and thread placement policy). The results show significant energy savings, from 24% to 55%, when CPPS is configured with an optimized rather than a non-optimized sequential prefix-sums kernel, across various CPU frequency levels and numbers of threads.
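To illustrate how a sequential prefix-sums kernel can serve as a building block of a parallel one, the following is a minimal sketch of a generic two-pass blocked scheme. It is not the CPPS implementation, which the abstract does not describe: the function names `sequential_prefix_sums` and `blocked_prefix_sums`, the even block partitioning, and the `kernel` parameter are all assumptions made for illustration. In CPython the threads show the structure only and are not expected to yield a speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def sequential_prefix_sums(values):
    """Sequential inclusive prefix-sums kernel (the per-block building block)."""
    return list(accumulate(values))


def blocked_prefix_sums(values, num_threads=4, kernel=sequential_prefix_sums):
    """Generic two-pass blocked inclusive prefix sums.

    Hypothetical helper for illustration; not the CPPS kernel from the paper.
    """
    n = len(values)
    if n == 0:
        return []

    # Split the input into at most num_threads contiguous, non-empty blocks.
    num_blocks = min(num_threads, n)
    block_size = -(-n // num_blocks)  # ceiling division
    blocks = [values[i:i + block_size] for i in range(0, n, block_size)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Pass 1: each worker scans its own block with the sequential kernel.
        local = list(pool.map(kernel, blocks))

        # Sequential step: an exclusive scan over the block totals
        # yields the offset that each block must be shifted by.
        totals = [block[-1] for block in local]
        offsets = [0] + list(accumulate(totals))[:-1]

        # Pass 2: each worker adds its block's offset to its local results.
        def shift(pair):
            block, offset = pair
            return [x + offset for x in block]

        shifted = list(pool.map(shift, zip(local, offsets)))

    return [x for block in shifted for x in block]
```

Because the kernel is passed in as a parameter, a different sequential implementation can be swapped in without touching the parallel structure, which mirrors the kind of configuration choice the abstract describes.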
"Official" electronic version of the publication (accessed through its Digital Object Identifier - DOI)
Created from the Publication Database of the Vienna University of Technology.