

Talks and Poster Presentations (without Proceedings-Entry):

S. Yang, S. Jeong, B. Min, Y. Kim, B. Burgstaller, J. Blieberger:
"Blocking versus Non-Blocking Shared-Memory Multicore Synchronization: Programmability, Scalability and Performance";
Talk: Reliable Software Technologies - Ada-Europe, Warsaw; 2019-06-11 - 2019-06-14.



English abstract:
The mutual-exclusion property of locks stands in the way of the scalability of parallel programs on many-core architectures. Locks permit no progress guarantees: a task may fail inside a critical section and thereby prevent other tasks from accessing shared data. Because of these disadvantages of mutual-exclusion locks, it is desirable to give up method-level locking and allow method calls to overlap in time. Synchronization is then performed at a finer granularity within a method's code, via atomic read-modify-write operations. It thus becomes possible to provide progress guarantees that are unattainable with locks. In particular, a method is non-blocking if a task's pending invocation is never required to wait for another task's pending invocation to complete.
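The non-blocking idea described above can be sketched in a few lines of C++ (the paper's low-level abstraction level targets both Ada and C++ atomics; this minimal example is illustrative and not taken from the paper's benchmark suite):

```cpp
#include <atomic>

// Lock-free counter: fetch_add is an atomic read-modify-write (RMW)
// instruction, so a task that stalls can never block others the way a
// task stalled inside a lock-protected critical section can.
std::atomic<int> counter{0};

// A conditional RMW built from compare-exchange: retry until the value
// read is still current at the moment of the write. Some thread always
// succeeds, which is exactly the lock-free progress guarantee that
// mutual-exclusion locks cannot provide.
bool increment_if_even(std::atomic<int>& c) {
    int old = c.load();
    while (old % 2 == 0) {
        if (c.compare_exchange_weak(old, old + 1))
            return true;   // our update took effect atomically
        // on failure, old holds the freshly observed value; retry
    }
    return false;          // value was odd; no update performed
}
```

An unconditional update such as `counter.fetch_add(1)` is a single hardware RMW instruction; the compare-exchange loop generalizes this to arbitrary read-modify-write logic.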

Non-blocking synchronization is nevertheless conceptually difficult when exposed to the programmer. We investigate three possible abstraction levels for providing programmers with non-blocking synchronization in the context of the Ada programming language: (1) Our lock elision of protected objects with the help of hardware transactional memory shields the programmer from the non-blocking synchronization problem. (2) Concurrent objects are our novel programming primitive to encapsulate the complexity of non-blocking synchronization in a language-level construct. (3) By exposing atomic read-modify-write operations at the language level, programmers gain fine-grained control over the synchronization problem, including the memory consistency model.

We investigate the trade-offs between programmability, scalability and performance of non-blocking synchronization at these three abstraction levels. We contrast the scalability of non-blocking synchronization with state-of-the-art lock-based queues and mutual-exclusion locks.
This paper thereby makes the following contributions:

(1) We introduce lock elision for monitors in the context of Ada protected objects. Our lock-elision technique is based on the Intel Transactional Synchronization Extensions (TSX).
(2) We introduce concurrent objects as a high-level programming primitive for non-blocking monitor constructs in Ada.
(3) We relax several recent state-of-the-art lock and queue synchronization primitives from sequential consistency to acquire-release consistency, to explore the performance advantage of low-level atomic operations and relaxed memory consistency in Ada and C++.
(4) We provide extensive experiments on scalability and performance of the proposed techniques on the x86_64 and ARM v8 hardware platforms.
(5) For reproducibility, we have open-sourced all benchmark code in the form of a shared-memory synchronization benchmark suite.
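The relaxation in contribution (3) can be illustrated with a test-and-set spinlock in C++ (a sketch under the assumption of standard C++11 atomics; the class and member names are hypothetical, not from the paper's open-sourced suite):

```cpp
#include <atomic>

// Relaxing the default sequentially consistent atomics to
// acquire-release consistency: the RMW in lock() needs acquire
// semantics (accesses in the critical section may not be reordered
// before it), and the store in unlock() needs release semantics
// (accesses in the critical section may not be reordered after it).
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        // memory_order_acquire instead of the default seq_cst
        while (locked.exchange(true, std::memory_order_acquire)) {
            // spin on a relaxed load to avoid needless RMW traffic
            while (locked.load(std::memory_order_relaxed)) { }
        }
    }
    void unlock() {
        locked.store(false, std::memory_order_release);
    }
};
```

The performance advantage comes from the weaker fences the compiler may emit: on x86_64 a release store is a plain store, whereas a sequentially consistent store requires a full fence or an implicit one via an exchange instruction; on ARM v8 the pairs map to the cheaper load-acquire/store-release instructions.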