Diploma and Master Theses (authored and supervised):

T. Polzer:
"Fault-Tolerant Hardware Implementation of a Consensus Algorithm";
Supervisors: A. Steininger, T. Handl; Institut für Technische Informatik, 2009.



English abstract:
This thesis develops a new communication model for digital electronic systems. The proposed scheme is comparable to a GALS (globally asynchronous, locally synchronous) system, with the difference that the clock sources have a bounded precision that is known a priori. This loose synchrony is exploited to establish communication that (i) is free of metastability by design and (ii) has fully predictable temporal behavior. As a consequence, the communication scheme exhibits synchronous behavior, which allows the use of techniques otherwise restricted to synchronous systems while avoiding a central clock as a single point of failure. To compensate for the imperfect synchronization of the local clocks (within the defined precision), a FIFO buffer is used on each communication link. The correctness of the approach is formally proved using the theory of distributed systems; for this purpose, the communication activity is modeled as a distributed algorithm. More specifically, it is shown that metastability-free and correct communication is possible provided that the buffer is larger than a certain, formally proved minimum size. Furthermore, an efficient hardware implementation is given and used to show experimentally that the theoretically derived FIFO buffer size requirement is a tight lower bound. A performance comparison with a traditional GALS system shows that our solution performs better.
Based on the new communication model, a fault-tolerant electronic system is developed that tolerates Byzantine faults even when the modules are not replica deterministic. First, the usability of a TMR (triple modular redundancy) system in such a setting is analyzed; as it is found inadequate, it is replaced by a hardware implementation of the well-known Byzantine EIG (exponential information gathering) consensus algorithm. As the EIG algorithm is lockstep synchronous, the lockstep synchronous model is simulated on top of our communication model. The EIG algorithm is adapted so that it can be implemented efficiently in hardware based on the timing established by the lockstep rounds. The equivalence of the adapted algorithm and the original EIG algorithm is shown. Additionally, the hardware implementation of a system tolerating a single Byzantine fault is sketched. The performance and complexity of the implementation are analyzed.

Keywords:
Fault-tolerance, Consensus, VLSI, Metastability, Communication


Related Projects:
Project Head Andreas Steininger:
Verteilte Algorithmen für robuste Takt-Synchronisation