Talks and Poster Presentations (with Proceedings Entry):

A. Wurl, A. Falkner, A. Haselböck, A. Mazak:
"Advanced Data Integration with Signifiers: Case Studies for Rail Automation";
Talk: Revised Selected Papers of 6th International Conference, DATA 2017, Madrid, Spain; 24 July 2017 - 26 July 2017; in: "Data Management Technologies and Applications", J. Filipe, J. Bernardino, C. Quix (eds.); Springer International Publishing, (2018), ISBN: 978-3-319-94809-6; pp. 87 - 110.



Abstract:
In Rail Automation, planning future projects requires the integration of business-critical data from heterogeneous, often noisy data sources. Current integration approaches often neglect uncertainties and inconsistencies in the integration process and thus cannot guarantee the necessary data quality. To tackle these issues, we propose a semi-automated process for data import, where the user resolves ambiguous data classifications. The task of finding the correct data warehouse entry for a source value in a proprietary, often semi-structured format is supported by the notion of a signifier, which is a natural extension of composite primary keys. In three different case studies we show that this approach (i) facilitates high-quality data integration while minimizing user interaction, (ii) leverages approximate name matching of railway station and entity names, and (iii) contributes to extracting features from contextual data for data cross-checks and thus supports the planning phases of railway projects.

Keywords:
Data integration, Signifier, Data quality


"Official" electronic version of the publication (according to its Digital Object Identifier - DOI)
http://dx.doi.org/10.1007/978-3-319-94809-6_5

Electronic version of the publication:
doi.org/10.1007/978-3-319-94809-6_5