Diploma and Master Theses (authored and supervised):
"Similarity searching in complex business events and sequences thereof";
Supervisor: G. Raidl;
Institut für Computergraphik und Algorithmen,
final examination: 2009-03.
This thesis contributes novel and formally well-founded methods for similarity searching to the field of
complex event-data analysis, both at the level of single events and at the level of sequences of events. As
event-based systems may produce highly diverse data sets, our main focus is on the highest possible
flexibility. The approaches should also be intelligible to business analysts and, of course, generate meaningful
and intuitive results. Finally, the approaches should be conceptually independent of concrete Complex Event
Processing solutions and instead build upon abstract and generally accepted definitions of events, event types,
Our approach to single-event similarity builds upon geometric ideas of similarity, with event attribute values
defining the relative positioning of two events in an n-dimensional space. The similarity between two
events is thus calculated from weighted attribute-level similarities.
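
As an illustration of the weighted attribute-level idea, the following is a minimal sketch, not the thesis's actual formulation. It assumes events are plain dictionaries, that every attribute-level measure returns a value in [0, 1], and that the overall score is a normalized weighted mean. The names `numeric_similarity`, `exact_match`, and `event_similarity` are hypothetical.

```python
from typing import Any, Callable, Dict

Measure = Callable[[Any, Any], float]


def numeric_similarity(scale: float) -> Measure:
    """Hypothetical measure: linear falloff with distance, clipped to [0, 1]."""
    def sim(a: float, b: float) -> float:
        return max(0.0, 1.0 - abs(a - b) / scale)
    return sim


def exact_match(a: Any, b: Any) -> float:
    """Hypothetical measure for categorical attributes."""
    return 1.0 if a == b else 0.0


def event_similarity(
    e1: Dict[str, Any],
    e2: Dict[str, Any],
    measures: Dict[str, Measure],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of attribute-level similarities (assumed aggregation)."""
    total_weight = sum(weights[attr] for attr in measures)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        weights[attr] * measures[attr](e1[attr], e2[attr]) for attr in measures
    )
    return weighted / total_weight


# Two illustrative "order" events that differ only in their amount.
order_a = {"type": "order", "amount": 100.0}
order_b = {"type": "order", "amount": 150.0}
measures = {"type": exact_match, "amount": numeric_similarity(scale=200.0)}
weights = {"type": 1.0, "amount": 3.0}

similarity = event_similarity(order_a, order_b, measures, weights)
print(similarity)
```

Raising the weight of an attribute makes differences in that attribute dominate the score, which is one way an analyst could tune such a measure to a given data set.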
The proposed approach to event-sequence similarity outperforms existing approaches by allowing analysts to
consider event-level similarities, order, and relative and absolute temporal structures in a highly flexible
manner. It builds upon an assignment-based understanding of sequence similarity, where each unit of the
pattern sequence is considered either represented by a certain event of the target sequence or missing therein.
Our algorithm finds the best possible assignment of the target sequence using a Branch & Bound strategy. This
assignment is then used for calculating the similarity between the given sequences.
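
The assignment-based view can be sketched as a small Branch & Bound search. This is a simplified illustration under stated assumptions, not the algorithm from the thesis: it ignores order and temporal structure, scores a missing pattern unit as 0, assumes event-level similarities lie in [0, 1], and normalizes by the pattern length. The function name `best_assignment` and the bounding rule are hypothetical choices made for this example.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def best_assignment(
    pattern: Sequence[Any],
    target: Sequence[Any],
    sim: Callable[[Any, Any], float],
) -> Tuple[List[Optional[int]], float]:
    """Assign each pattern unit to a distinct target index or to None (missing)."""
    m = len(pattern)
    # Pairwise event-level similarities, computed once.
    s = [[sim(p, t) for t in target] for p in pattern]
    # Optimistic bound: the best each remaining pattern unit could score alone.
    best_row = [max(row, default=0.0) for row in s]
    suffix = [0.0] * (m + 1)
    for i in range(m - 1, -1, -1):
        suffix[i] = suffix[i + 1] + best_row[i]

    best_score = -1.0
    best_assign: List[Optional[int]] = [None] * m
    current: List[Optional[int]] = [None] * m
    used = set()

    def branch(i: int, score: float) -> None:
        nonlocal best_score, best_assign
        # Bound: prune if even the optimistic completion cannot beat the incumbent.
        if score + suffix[i] <= best_score:
            return
        if i == m:
            best_score = score
            best_assign = current.copy()
            return
        # Branch on the most similar unused target events first.
        for j in sorted(range(len(target)), key=lambda k: -s[i][k]):
            if j in used or s[i][j] <= 0.0:
                continue
            used.add(j)
            current[i] = j
            branch(i + 1, score + s[i][j])
            used.remove(j)
            current[i] = None
        # Alternative branch: treat pattern unit i as missing in the target.
        branch(i + 1, score)

    branch(0, 0.0)
    return best_assign, (best_score / m if m else 0.0)


# Illustrative run with exact matching as the event-level similarity.
assignment, score = best_assignment(
    ["a", "b", "c"], ["b", "a", "x"], lambda p, t: 1.0 if p == t else 0.0
)
print(assignment, score)
```

Here the third pattern unit has no counterpart in the target and is reported as missing (`None`), while the first two are matched to target positions 1 and 0. Order and timing constraints of the kind described above would enter as additional terms in the score and the bound.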
We conclude this work with a practical evaluation, in which we apply the event-sequence similarity approach
to real-world scenarios from three application domains. We found that the algorithm performs excellently
for short and sharp-edged sequences in which a majority of events constitute clear and significant characteristics
of the event sequence.
Created from the Publication Database of the Vienna University of Technology.