Re-identifying register data by survey data: an empirical study

[journal article]

Bender, Stefan; Brand, Ruth; Bacher, Johann

Abstract "More and more empirical researchers from universities or research centres would like to use register data collected by statistical agencies or the social security system, because these data can be used for several empirical studies, e.g. the analysis of special groups or quantitative effects of economic policies. Most of the register data required have to be (factually) anonymised before they are disseminated to preserve confidentiality. Therefore re-identification risks for register data are examined by matching a sample of register data with survey data, collected especially for scientific purposes. Three methods were applied: the uniqueness approach, a simple distance estimation and a cluster analysis. The data sets used were two birth cohorts (1964 and 1971) of the German employment statistics (register data) and the German Life History Study. The analysis shows that a re-identification of real persons may be possible by a standard cluster analysis or a simple distance criterion if an intruder has access to additional information. The number of re-identifiable persons is remarkably high although the proportion of re-identifiable persons is less than expected on the basis of the uniqueness approach." (author's abstract)
Keywords data; empirical social research; official statistics; survey; interview; comparison of methods; data preparation; analysis; anonymity; personal data; data protection
Classification Methods and Techniques of Data Collection and Data Analysis, Statistical Methods, Computer Methods
Method basic research; development of methods
Document language English
Publication Year 2001
Page/Pages p. 373-381
Journal Statistical journal of the United Nations Economic Commission for Europe, 18 (2001) 4
Licence Deposit Licence - No Redistribution, No Modifications
