Publication Entry

Doctor's Theses (authored and supervised):

E. Piatkowska:
"Asynchronous Stereo Vision Event-driven Stereo Matching and Tracking for Dynamic Vision Sensors";
Supervisor, Reviewer: M. Gelautz, J. Scharinger, P. Zemčík; Fakultät für Informatik der Technischen Universität Wien, 2018; oral examination: 2018-12-17.

English abstract:

This thesis addresses the problem of stereo reconstruction from a stream of events provided by two dynamic vision sensors (DVS) in a stereo configuration. Dynamic vision sensors consist of self-spiking pixels that react independently and in continuous time to relative light intensity changes by generating 'spikes' encoded in Address Event Representation (AER). As a result, the output of the sensor is not a sequence of frames as in conventional cameras, but an asynchronous stream of events indicating captured intensity changes. The main advantages of these types of sensors are high temporal resolution (better than 10 μs) and wide dynamic range (> 120 dB). Several approaches for stereo matching have been introduced for dynamic vision sensors, including the application of conventional stereo algorithms that operate on 'pseudo frames' built from the address event stream (image-based methods). Although the image-based algorithms are acceptable in performance, they do not exploit the sensor's specific capabilities. Only a few efforts have been invested so far in stereo processing techniques that can be applied directly to the stream of events (event-based methods). These methods preserve the asynchronous aspect of events and are thus better suited to retaining the advantages of dynamic vision sensors. However, there are still various challenges to tackle in event-based stereo matching.
In this thesis, we investigate the feasibility of fully asynchronous stereo vision tailored for dynamic vision sensors. We start out with a thorough analysis of event data from dynamic vision sensors in the context of stereo analysis, with a focus on the events' coincidence in time. We find that single event-to-event matching with the use of timing information as a matching score lacks reliability when dealing with complex scenes and challenging conditions. As the main contribution of this thesis, we propose an adaptive dynamic cooperative network, which is constantly updated while events are generated, making it feasible to preserve the data-driven aspect of the sensor. We develop two cooperative stereo matching algorithms, the first of which employs simple time-based event matching as an input to the cooperative network. In the second algorithm, we suggest using the spatio-temporal neighbourhood of the event as the matching primitive, together with a novel similarity measure that combines time-based correlation and polarity.
Extensive evaluation of the proposed cooperative stereo algorithms demonstrates that the results are comparable to or better than those of competing algorithms in the field. Furthermore, we propose an asynchronous tracking method that is realised by clustering events in three-dimensional space with Gaussian mixture models, and we demonstrate its performance in conjunction with the cooperative stereo matching results.
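
The simple time-based event-to-event matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Event` type, the `match_by_time` function, the 1000 μs window and the disparity limit are all hypothetical, and the sketch assumes rectified sensors, so that corresponding events lie on the same pixel row. It pairs a left-sensor event with the right-sensor event of equal polarity that is closest in time.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One address event (hypothetical layout, not the thesis's data format)."""
    t_us: int      # timestamp in microseconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for an intensity increase, -1 for a decrease


def match_by_time(left: Event,
                  right_events: List[Event],
                  window_us: int = 1000,      # assumed coincidence window
                  max_disparity: int = 40     # assumed search range in pixels
                  ) -> Optional[Tuple[Event, int]]:
    """Return (best right event, disparity), or None if nothing coincides.

    Candidates must share the row and polarity of the left event, lie within
    the disparity range, and fall inside the time window. Among those, the
    candidate with the smallest timestamp difference wins.
    """
    best: Optional[Tuple[Event, int]] = None
    best_dt = window_us + 1
    for r in right_events:
        if r.y != left.y or r.polarity != left.polarity:
            continue
        disparity = left.x - r.x
        if not 0 <= disparity <= max_disparity:
            continue
        dt = abs(left.t_us - r.t_us)
        if dt <= window_us and dt < best_dt:
            best, best_dt = (r, disparity), dt
    return best


# Usage: one left event and four right-sensor candidates.
left = Event(t_us=1000, x=30, y=5, polarity=1)
right = [
    Event(t_us=1010, x=25, y=5, polarity=1),   # same row and polarity, 10 us apart
    Event(t_us=1400, x=20, y=5, polarity=1),   # same row and polarity, 400 us apart
    Event(t_us=1005, x=28, y=6, polarity=1),   # different row, ignored
    Event(t_us=1002, x=27, y=5, polarity=-1),  # opposite polarity, ignored
]
match = match_by_time(left, right)
late = match_by_time(Event(t_us=5000, x=30, y=5, polarity=1), right)
```

A score based on a single pair of timestamps is easily confused when many pixels fire almost simultaneously, which is consistent with the abstract's finding that such matching lacks reliability in complex scenes and with its motivation for the cooperative network.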

German abstract (translated into English):

This thesis addresses the problem of stereo reconstruction from event data delivered by two dynamic vision sensors (DVS) in a stereo configuration. Dynamic vision sensors consist of pixels that react independently and in continuous time to relative changes in light intensity by generating so-called spikes, which are represented in Address Event Representation (AER). As a result, the output of the sensor is not an image sequence as with conventional cameras, but an asynchronous stream of events that indicate the captured intensity changes.
The main advantages of this type of sensor are the high temporal resolution (better than 10 μs) and a wide dynamic range (> 120 dB). Several approaches to stereo matching have been presented for dynamic vision sensors, including the application of conventional stereo algorithms that operate on so-called pseudo frames built from the stream of address events (image-based methods). Although the performance of the image-based algorithms is acceptable, they do not exploit the special capabilities of the sensor. So far, only a few attempts have been made to develop stereo processing techniques that can be applied directly to the event streams (event-based methods). These methods preserve the asynchronous aspect of the events and are therefore better suited to retaining the advantages of dynamic vision sensors. However, various challenges in event-based stereo matching remain to be overcome.
In this dissertation, we investigate the feasibility of fully asynchronous stereo vision tailored to dynamic vision sensors. We begin with a detailed analysis of the event data from dynamic vision sensors in the context of stereo analysis, with a focus on the temporal coincidence of the events. It turns out that single event-to-event matching using timing information as the matching measure is not reliable when faced with complex scenes and challenging conditions. As the most important contribution of this work, we propose an adaptive dynamic cooperative network that is continuously updated while events are generated, which preserves the data-driven aspect of the sensor. We develop two cooperative stereo matching algorithms, the first of which uses simple time-based event matching as input to the cooperative network. In the second algorithm, we propose using a spatio-temporal neighbourhood of the event as the matching primitive and a novel similarity measure that combines time-dependent correlation and polarity.

Electronic version of the publication:

https://publik.tuwien.ac.at/files/publik_274528.pdf

Created from the Publication Database of the Vienna University of Technology.