Talks and Poster Presentations (with Proceedings-Entry):
F. Endel, B. Glock:
"Bringing it all together: How to join and analyze sensitive data from multiple sources";
Talk: The FARR Institute International Conference 2015: Data Intensive Health Research and Care,
St. Andrews, UK;
- 2015-08-28; in: "The FARR Institute International Conference 2015: Data Intensive Health Research and Care",
Routinely collected claims data and other statistical databases storing pseudonymized and anonymized personal information from the social security and healthcare system are increasingly available for analysis in many countries. In Austria, many authorities are responsible for different parts of these systems, depending on region, service type and organizational background. As a result, data collections are heterogeneous and each one can only provide a restricted and biased view. Additionally, joint analyses are challenging, not only due to availability, quality and comparability of data, but also because of technical, organizational, legal and privacy reasons.
We report on the approaches we have developed in Austria for sharing and potentially merging healthcare data from multiple sources.
First, primary care, specialized outpatient care and prescription claims data from 19 different social security institutions had to be merged into an integrated database. Although common pseudonyms exist, key challenges included data quality issues, the harmonization of accounting and coding systems, and the definition of a reliable data model.
Second, data from these databases had to be linked with inpatient databases, which fall under yet other responsibilities and do not share any personal identifiers. Therefore, special record linkage techniques, including probabilistic matching, had to be developed. The specific procedures were presented at the SHIP conferences in 2011 and 2013. A common database spanning all sectors of the healthcare system for 2006 and 2007 is now available, covering over 95% of the Austrian population. Further years are being added, and the procedures are steadily being improved.
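As an illustration of what probabilistic matching means in general, the following minimal sketch scores a candidate record pair with Fellegi-Sunter style agreement weights. It is not the procedure presented at the SHIP conferences: the field names, the m- and u-probabilities and the two thresholds are hypothetical values chosen only for this example.

```python
import math

# Hypothetical parameters per comparison field (illustrative values only):
# m = P(field agrees | both records belong to the same person)
# u = P(field agrees | records belong to different persons)
FIELD_PARAMS = {
    "birth_year": (0.98, 0.02),
    "sex": (0.99, 0.50),
    "postal_region": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the log2 likelihood ratios over all comparison fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def classify(weight, lower=0.0, upper=6.0):
    """Map a weight to a decision; both thresholds are assumed, not estimated."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"


outpatient = {"birth_year": 1950, "sex": "f", "postal_region": "10"}
inpatient = {"birth_year": 1950, "sex": "f", "postal_region": "12"}
print(classify(match_weight(outpatient, inpatient)))
```

In practice the m- and u-probabilities would be estimated from the data rather than fixed by hand, and pairs classified as possible links would need further review.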
The next challenge is to link this database with data collections outside the health sector (e.g. unemployment data), which usually do not share common identifiers. Conventional record linkage techniques are often not viable here as sharing personal information is legally difficult. For this purpose, we have developed a stepwise procedure minimizing the risk of data leakage.
Finally, international data exchange and joint analysis across countries are an emerging challenge. In the EU FP7 project CEPHOS-LINK, fully anonymized data from six participating European countries are to be shared across borders. In addition to the requirements of country-specific legislation, several technical issues, from harmonized variable definitions and trustworthy anonymization to secure data transfer, have to be resolved for joint analyses. As an alternative to actually transferring anonymized and therefore comparatively restricted information, new techniques of distributed analysis and statistical modelling are planned for deployment.
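The idea behind distributed analysis can be sketched with a simple case: each site computes aggregate statistics locally, and only those aggregates, never record-level data, are sent to a coordinator that pools them. The example below is an assumption-laden illustration, not the CEPHOS-LINK method; `SiteSummary`, `summarize` and `pool` are hypothetical names, and the values are made up.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregates one site shares instead of its individual records."""
    n: int
    total: float
    total_sq: float


def summarize(values):
    """Run locally at each site; only the returned summary leaves the site."""
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))


def pool(summaries):
    """Run at the coordinator: pooled mean and sample variance from aggregates."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, variance


# Made-up values for two sites, e.g. lengths of stay in days
site_a = summarize([1.0, 2.0, 3.0])
site_b = summarize([4.0, 5.0])
mean, variance = pool([site_a, site_b])
print(mean, variance)
```

The pooled result equals what a single analysis of the combined records would give, which is the property that makes this kind of design attractive when records cannot be transferred.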
Created from the Publication Database of the Vienna University of Technology.