Talks and Poster Presentations (with Proceedings-Entry):

A. Wurl, A. Falkner, A. Haselböck, A. Mazak:
"Advanced Data Integration with Signifiers: Case Studies for Rail Automation";
Talk: Revised Selected Papers of the 6th International Conference, DATA 2017, Madrid, Spain; 2017-07-24 - 2017-07-26; in: "Data Management Technologies and Applications", J. Filipe, J. Bernardino, C. Quix (eds.); Springer International Publishing (2018), ISBN: 978-3-319-94809-6; 87 - 110.



English abstract:
In Rail Automation, planning future projects requires the integration of business-critical data from heterogeneous, often noisy data sources. Current integration approaches often neglect uncertainties and inconsistencies in the integration process and thus cannot guarantee the necessary data quality. To tackle these issues, we propose a semi-automated process for data import in which the user resolves ambiguous data classifications. The task of finding the correct data warehouse entry for a source value in a proprietary, often semi-structured format is supported by the notion of a signifier, which is a natural extension of composite primary keys. In three different case studies we show that this approach (i) facilitates high-quality data integration while minimizing user interaction, (ii) leverages approximate name matching of railway station and entity names, and (iii) helps extract features from contextual data for data cross-checks, and thus supports the planning phases of railway projects.
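
The semi-automated import described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds `accept` and `margin`, the sample station names, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. A composite key of several name components is compared approximately against the warehouse keys; a clear best match is accepted automatically, and anything ambiguous is handed to the user.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (assumed measure, not the paper's)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def classify(source, warehouse, accept=0.85, margin=0.05):
    """Match a composite source key against warehouse keys.

    Returns ("auto", key) when one key is a clear best match, otherwise
    ("ask_user", candidates) so that a person resolves the ambiguity.
    The thresholds are hypothetical and would need tuning on real data.
    """
    scored = []
    for key in warehouse:
        if len(key) != len(source):
            continue  # only compare keys with the same number of components
        score = sum(similarity(s, k) for s, k in zip(source, key)) / len(key)
        scored.append((score, key))
    scored.sort(reverse=True)

    clear_winner = (
        bool(scored)
        and scored[0][0] >= accept
        and (len(scored) == 1 or scored[0][0] - scored[1][0] >= margin)
    )
    if clear_winner:
        return "auto", scored[0][1]
    return "ask_user", [key for _, key in scored[:3]]


# Hypothetical warehouse keys: (station name, entity type).
warehouse = [
    ("Wien Hauptbahnhof", "signal"),
    ("Wien Westbahnhof", "signal"),
    ("Graz Hauptbahnhof", "point"),
]

print(classify(("wien  hauptbahnhof", "Signal"), warehouse))
print(classify(("Wien bahnhof", "signal"), warehouse))
```

In this sketch the first call differs from a warehouse key only in case and spacing, so it is resolved without user interaction; the second is almost equally close to two stations and is therefore returned as a candidate list for manual resolution.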

Keywords:
Data integration, Signifier, Data quality


"Official" electronic version of the publication (accessed through its Digital Object Identifier - DOI)
http://dx.doi.org/10.1007/978-3-319-94809-6_5


Created from the Publication Database of the Vienna University of Technology.