Diploma and Master Theses (authored and supervised):

B. Wedenik:
"A Big Data Analytics Framework for Evaluating Automated Elastic Scalability of the SMACK-Stack";
Supervisor: S. Dustdar, S. Nastic; Institute of Information Systems Engineering, Distributed Systems Group, 2018; final examination: 2018-10-25.



English abstract:
In recent years, the demand for information availability and shorter response times has been increasing. Today's business requirements are changing: waiting hours or even days for the result of a query is no longer acceptable in many sectors. The response needs to be immediate, or the query is discarded. This is where "Fast Data" begins. The SMACK Stack, consisting of Spark, Mesos, Akka, Cassandra and Kafka, provides a robust and versatile platform and toolset for running Fast Data applications successfully. This thesis introduces a framework to correctly scale services and distribute resources within the stack. The main contributions of this thesis are: 1) Development and evaluation of the framework, including the extraction and aggregation of monitoring metrics as well as the scaling service itself. 2) Implementation of two real-world reference applications. 3) Infrastructure management tools to easily deploy the stack in the cloud. 4) Deployment blueprints in the form of recommendations on how to initially set up and configure the available resources. To evaluate the framework, the real-world applications are used for benchmarking. One application is based on IoT data and is mainly I/O-intensive, while the other is compute-bound and provides predictions based on IoT data. The results indicate that the framework performs well at identifying which component is under heavy stress and scaling it automatically. This leads to an increase in throughput of up to 73% in the IoT application, while the prediction application is able to handle up to 169% more messages when using the supervising framework. While the results look promising, there is still potential for future work, such as using machine learning to better handle thresholds or an extended REST API.

Created from the Publication Database of the Vienna University of Technology.