Talks and Poster Presentations (with Proceedings-Entry):
F. Endel, B. Glock, G. Endel, N. Popper:
"Challenges and Results with the Record Linkage of Austrian Health Insurance Data on Different Sources";
Talk: Informatics for Health 2017 Conference,
- 2017-04-26; in: "Informatics for Health 2017",
Informatics for Health,
Due to data privacy issues, routinely collected data from different sources is pseudonymized (e.g. MBDS, the minimum basic data set from the Federal Ministry of Health, which up to 2015 did not even have a personal identifier). This makes statistical analysis for decision support and health care planning very difficult. Data from insurance carriers (FoKo) is event based: whenever a hospital reports, a new data entry is generated. To enable efficient, significant and quality-assured data analysis for patient-centred assertions, record linkage of these episodes is required, with the aim of finding a personal ID for each MBDS episode.
For historical data (GAP-DRG1) a linkage has been done before. Some challenges remain for new data (GAP-DRG2): in MBDS, data for the whole of Austria is available, but FoKo only contains data for persons insured by the Lower Austrian sickness fund; in addition, a hospital stay may be split into several data entries due to intermediate reporting from the hospital. In GAP-DRG2 an iterative deterministic record linkage is applied: (1) Determine the quality-assured matching variables. (2) Determine the minimum set of matching variables (MVs). (3) Basematch: check for unique matches in three basic MVs. (4) Start Levelmatch1: data entries need to be identical in all MVs except one (the MVs are varied; Step 1: missing; Step 2: contradicting). (5) - (9) The same procedure is applied for up to 6 missing/contradicting MVs. (10) Restart the iterative process at (4) with the remaining episodes.
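The level-wise procedure described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the variable names (`hospital`, `admission`, `discharge`, `age`, `district`), the function names and the toy records are all hypothetical, missing and contradicting values are treated alike as a disagreement, and the sketch assumes a one-to-one relation between MBDS and FoKo entries, although in practice a stay may be split into several FoKo entries.

```python
def n_disagreements(a, b, mvs):
    """Count matching variables that are missing on either side or contradict."""
    return sum(
        1 for v in mvs
        if a.get(v) is None or b.get(v) is None or a[v] != b[v]
    )


def unique_pairs(mbds, foko, mvs, max_diff):
    """Pairs where an MBDS episode has exactly one FoKo candidate and no other
    MBDS episode claims that same FoKo entry as its only candidate."""
    claimed = {}
    for m_id, m in mbds.items():
        cands = [f_id for f_id, f in foko.items()
                 if n_disagreements(m, f, mvs) <= max_diff]
        if len(cands) == 1:
            claimed.setdefault(cands[0], []).append(m_id)
    return [(ms[0], f_id) for f_id, ms in claimed.items() if len(ms) == 1]


def link(mbds, foko, base_mvs, all_mvs, max_level=6):
    """Basematch, then levels 1..max_level, repeated until nothing new is found."""
    mbds, foko = dict(mbds), dict(foko)  # work on copies
    links = {}

    def accept(pairs, label):
        for m_id, f_id in pairs:
            links[m_id] = (f_id, label)  # record where each link came from
            del mbds[m_id]
            del foko[f_id]

    accept(unique_pairs(mbds, foko, base_mvs, 0), "base")
    iteration = 0
    while mbds and foko:
        iteration += 1
        before = len(links)
        for level in range(1, max_level + 1):
            accept(unique_pairs(mbds, foko, all_mvs, level),
                   f"iter{iteration}-level{level}")
        if len(links) == before:  # no progress: stop iterating
            break
    return links


# Hypothetical toy data, for illustration only.
BASE_MVS = ["hospital", "admission", "discharge"]
ALL_MVS = BASE_MVS + ["age", "district"]
mbds = {
    "m1": dict(hospital="A", admission="2015-01-01", discharge="2015-01-05", age=70, district=301),
    "m2": dict(hospital="A", admission="2015-02-01", discharge="2015-02-03", age=40, district=302),
    "m3": dict(hospital="B", admission="2015-03-01", discharge="2015-03-02", age=55, district=303),
}
foko = {
    "f1": dict(hospital="A", admission="2015-01-01", discharge="2015-01-05", age=70, district=301),
    "f2": dict(hospital="A", admission="2015-02-01", discharge=None, age=40, district=302),
    "f3": dict(hospital="B", admission="2015-03-01", discharge="2015-03-04", age=56, district=303),
}
links = link(mbds, foko, BASE_MVS, ALL_MVS)
for m_id, (f_id, label) in sorted(links.items()):
    print(m_id, "->", f_id, "via", label)
```

Removing accepted pairs from both pools is what makes further iterations worthwhile: an episode that had several candidates in one pass may have a unique candidate once the others have been linked. Storing the level label with each link mirrors the idea of documenting the origin of every link.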
In GAP-DRG2, 1,410,165 episodes from FoKo and 1,272,813 episodes from MBDS are to be matched. In the basematch, 611,591 (48.05 %) episodes could be matched. In the first iteration, a total of 1,271,395 (99.88 %) episodes are matched, most of them in level 3 (3 MVs are allowed to be NULL or contradicting, most of them involving the episode's identifier). Finally, after iteration 3, a total of 1,272,104 (99.94 %) matched data entries are found.
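The reported rates can be checked with a short calculation, assuming (as the figures suggest) that the percentages are relative to the number of MBDS episodes. The dictionary keys and the helper name `match_rate` are chosen here for illustration only.

```python
# Counts taken from the text; percentages are assumed to be relative to MBDS.
mbds_total = 1_272_813
matched = {
    "basematch": 611_591,
    "after iteration 1": 1_271_395,
    "after iteration 3": 1_272_104,
}


def match_rate(n_matched, total):
    """Share of matched episodes in percent."""
    return 100.0 * n_matched / total


rates = {stage: match_rate(n, mbds_total) for stage, n in matched.items()}
for stage, rate in rates.items():
    print(f"{stage}: {rate:.4f} %")
```

Under this assumption the computed values agree with the reported ones to within 0.01 percentage points; the value for the first iteration agrees with the reported 99.88 % when truncated rather than rounded to two decimals.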
The main innovations of this procedure include a significant improvement of previously developed methods, mainly concerning reproducibility, stability and adaptability to new data, and documentation of every single step of the linkage procedure, allowing researchers to comprehend the origin of a link and adapt their data analysis strategies. A check whether the registered age differs by +/- one year is included, and quality checks on the basematch showed that age and district differ a lot. The procedure achieves the best possible outcome for the new data sets and is highly suitable for use with new data. Further evaluations will be included by the end of the year.
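A tolerance check of the kind mentioned for the registered age could look like the following sketch. The function name `ages_compatible` and the treatment of missing values as incompatible are assumptions made for illustration, not details taken from the project.

```python
def ages_compatible(age_a, age_b, tolerance=1):
    """True if both ages are present and differ by at most `tolerance` years."""
    if age_a is None or age_b is None:
        return False  # assumption: a missing age is handled as "missing", not as a match
    return abs(age_a - age_b) <= tolerance
```

A difference of one year can arise, for example, when the two sources register the age at different points in time during the same stay.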
The applied deterministic record linkage achieved a match rate of 99.94 %. It is a further developed, structured and improved implementation of the historic matching. As soon as new data is incorporated, the same procedure can be applied to it, for more recent years or for the whole of Austria. This project is part of the K-Project dexhelpp in COMET - Competence Centers for Excellent Technologies that is funded by BMVIT GMWGJ and handled by FFG.
Created from the Publication Database of the Vienna University of Technology.