Diploma and Master Theses (authored and supervised):

E. Jablonskis:
"Adaptive Gesture Recognition System, Transforming Dance Performance into Music";
Supervisor: H. Kaufmann; Institut für Visual Computing & Human-Centered Technology, 2018.

English abstract:
The objective of this thesis was to develop a gesture recognition system that
transforms dance into music using a machine learning algorithm. The thesis is divided into the six
stages of the processing chain: Input, Feature Extraction, Segmentation, Classification, Post-processing,
and Output.
Video cameras with and without markers, wearable sensors, and depth cameras were
considered as sources of input data; the Microsoft Kinect v2 device was chosen as the best option.
Body contour and body skeleton approaches were presented for feature extraction; the Kinect SDK
2.0 was chosen to extract relevant features from the depth image. Segmentation based on music
metrics was chosen over body tracking, with the bar (measure) chosen as the most suitable
unit for splitting the data stream into distinct gestures. For classification, the machine learning
algorithms Dynamic Time Warping (DTW), Hidden Markov Models, Support Vector Machines,
and Artificial Neural Networks were explored; DTW was chosen as the most suitable algorithm.
The EyesWeb environment was chosen for post-processing and to build an overall "gesture engine".
Ableton Live was selected to function as the output.
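
As an illustration of the classification stage, the following is a minimal sketch of DTW-based nearest-template matching. It is not the thesis implementation: the function names, the template format, and the use of plain Euclidean distance between feature vectors are assumptions made for this example only.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension tuples (e.g. joint coordinates
    per frame). The local cost is the Euclidean distance (an assumption).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(segment, templates):
    """Return the label of the template with the smallest DTW distance.

    `templates` is a hypothetical list of (label, sequence) pairs, standing
    in for the recorded training gestures.
    """
    return min(templates, key=lambda t: dtw_distance(segment, t[1]))[0]


# Hypothetical 2D trajectories of a single joint, one segment per bar.
templates = [
    ("raise", [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]),
    ("sweep", [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]),
]
segment = [(0.0, 0.0), (0.0, 0.9), (0.0, 1.1), (0.0, 2.1)]
label = classify(segment, templates)
```

Because DTW warps the time axis, the four-frame segment can still be matched against three-frame templates, which is one reason the technique suits gestures performed at slightly varying speeds.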
The designed system coupled virtual instruments with body parts: the system had to
learn the gestures of each group of body parts and know how those gestures were paired with music clips
in a composition. A working prototype of such a system was implemented and tested. The results
supported the hypothesis of this thesis that a machine learning algorithm could be used for
flexible gesture recognition.
The performance of the system under various conditions was evaluated to reveal its
strengths and weaknesses. Measurements based on Signal Detection Theory were calculated in
both fitting and cross-validation analyses. The results showed a very high prediction accuracy:
over 90% in most cases. The analysis showed that the system performed
best when all predicted gestures were included in the training dataset and when each gesture
had at least 16 training samples.
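
For readers unfamiliar with Signal Detection Theory, the sketch below shows how typical measures can be derived from the four outcome counts of a single gesture class. The abstract does not state which measures were actually reported, so the choice of hit rate, false-alarm rate, accuracy, and d' is an assumption, as are the function name and the example counts.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Compute common Signal Detection Theory measures from outcome counts."""
    n_signal = hits + misses
    n_noise = false_alarms + correct_rejections
    hit_rate = hits / n_signal
    fa_rate = false_alarms / n_noise
    accuracy = (hits + correct_rejections) / (n_signal + n_noise)

    def clamp(rate, n):
        # Keep rates away from exactly 0 and 1 so the z-transform stays
        # finite (one common convention; others exist).
        return min(max(rate, 0.5 / n), 1 - 0.5 / n)

    z = NormalDist().inv_cdf
    d_prime = z(clamp(hit_rate, n_signal)) - z(clamp(fa_rate, n_noise))
    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": fa_rate,
        "accuracy": accuracy,
        "d_prime": d_prime,
    }


# Hypothetical counts, not taken from the thesis.
measures = sdt_measures(hits=45, misses=5, false_alarms=5, correct_rejections=45)
```

Computing these measures on the training data corresponds to a fitting analysis; computing them on held-out folds corresponds to cross-validation.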
The implementation process provided some ideas about how the dance recognition
system could be expanded to offer more features for music creation. The experience of creating
music with gestures also suggested that further advancements in machine learning and human-computer
interfaces will not only enhance the two-way interaction of dance and music but also
build a closer relationship between body and mind.

Keywords:
Gesture Recognition, Dance Performance, Motion Capture

Electronic version of the publication:

Created from the Publication Database of the Vienna University of Technology.