Publication Entry

Diploma and Master Theses (authored and supervised):

R. Plettenberg:
"Frameworks for Distributed Big Data Processing: A Comparison in the Domain of Predictive Maintenance";
Supervisor: M. Wimmer, A. Mazak; Institut für Information Systems Engineering, 2018; final examination: 2018-09-25.

English abstract:

Predictive maintenance is a novel approach for making maintenance decisions, lowering maintenance costs, increasing a plant's capacity and production volume, and positively affecting environmental and employee safety. In predictive maintenance, condition data of machines is constantly collected and analysed to predict future machine failures. Due to the high volume, velocity, and variety of the gathered data, Big Data analytic frameworks are necessary to provide the desired results. The performance of these frameworks strongly influences the overall performance of a predictive maintenance system, raising the need for tools to measure it.
Benchmarks are such tools: they define general workloads against which a system's performance is measured. Due to the wide popularity of Big Data analytics across industries, benchmarks for Big Data analytic frameworks are defined specifically for each domain. While there are currently many benchmarks available for other domains such as retail, social networks, or search engines, there are none available for Big Data analytic frameworks in the application area of predictive maintenance.
This thesis introduces the predictive maintenance benchmark (PMB). The PMB is a benchmark aimed at measuring the performance of Big Data analytic frameworks in the field of predictive maintenance. The data model and workload of the PMB represent typical tasks encountered by a predictive maintenance system. The PMB is implemented in the two most popular Big Data analytic ecosystems, Hadoop and Spark, and the results show Spark outperforming Hadoop in almost every task. For evaluation, findings gathered during implementation and execution of the PMB are analysed. Furthermore, the PMB results are validated against other studies comparing Hadoop and Spark.

German abstract (translated):

Predictive maintenance is a new approach to making maintenance decisions; it enables lower maintenance costs, increased production, and improved safety and environmental awareness. In predictive maintenance, data on the condition of a machine is collected continuously and used to predict future failures. Because of the volume, velocity, and variety of the collected data, analysing it requires specialised software frameworks from the field of Big Data analytics. The performance of these frameworks is decisive for the performance of the entire predictive maintenance system.
Benchmarks make it possible to measure the performance of frameworks and therefore also serve as a basis for comparing them. Because Big Data analytics is used across industries, with correspondingly different areas of application, it is important to compare frameworks within a specific task domain. Such Big Data benchmarks currently exist for retail, social networks, web search, and bioinformatics. However, there is currently no benchmark that covers the field of predictive maintenance.
This thesis therefore introduces the Predictive Maintenance Benchmark (PMB). The PMB aims to test the performance of Big Data analytics frameworks using tasks from the predictive maintenance domain. The data model and workload of the PMB represent typical tasks of a predictive maintenance system. After its development, the PMB is implemented on the two popular Big Data frameworks Hadoop and Spark. The results of the respective implementations serve as the basis for the performance comparison between Hadoop and Spark. Finally, the PMB is evaluated using insights gained during planning, implementation, and analysis of the results. In addition, the PMB results are validated against other studies comparing the performance of Hadoop and Spark.

Keywords:

Big Data/Predictive Maintenance/Benchmark/Hadoop/Spark/Big Data Analytic Frameworks/Raspberry Pi/Machine Learning/Distributed Processing

Created from the Publication Database of the Vienna University of Technology.