Citation Suggestion

Please use the following Persistent Identifier (PID) to cite this document:
https://doi.org/10.1214/22-SS137

General-purpose imputation of planned missing data in social surveys: Different strategies and their effect on correlations

[journal article]

Axenfeld, Julian B.
Bruch, Christian
Wolf, Christof

Abstract

Planned missing survey data, for example stemming from split questionnaire designs, are becoming increasingly common in survey research, making imputation indispensable to obtain reasonably analyzable data. However, these data can be difficult to impute due to low correlations, many predictors, and limited sample sizes to support imputation models. This paper presents findings from a Monte Carlo simulation in which we investigate the accuracy of correlations after multiple imputation using different imputation methods and predictor set specifications, based on data from the German Internet Panel (GIP). The results show that strategies that simplify the imputation exercise (such as predictive mean matching with dimensionality reduction or restricted predictor sets, linear regression models, or the multivariate normal model without transformation) perform well, whereas generalized linear models for categorical data, classification trees, and imputation models with many predictor variables in particular lead to strong biases.


Planned missing values in social science surveys, for example as a result of a split questionnaire design, are becoming increasingly common in survey research. Imputation is often required to obtain reasonably analyzable data. However, statistical modeling when imputing such data can pose enormous challenges because of low correlations, a large number of possible predictors, and limited sample sizes. This article presents results from a Monte Carlo simulation that, based on data from the German Internet Panel (GIP), examines the validity of correlation estimates in a split questionnaire design under different imputation strategies. The results show that approaches that simplify the imputation can lead to good results (e.g., predictive mean matching with dimensionality reduction or few predictor variables). In contrast, generalized linear models for categorical data, classification trees (CART), and imputation models with many predictor variables in particular can result in strong biases.
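
The kind of Monte Carlo check described in the abstract can be illustrated with a toy sketch. This is not the authors' simulation: it uses synthetic bivariate normal data instead of the GIP, a single planned-missing item instead of full questionnaire modules, and bootstrap-based stochastic linear regression imputation instead of the methods compared in the paper. All function names, parameter values, and the Fisher z pooling step are assumptions made for illustration only.

```python
import numpy as np


def impute_once(x, y, missing, rng):
    """One stochastic regression imputation of y from x (illustrative helper).

    A bootstrap resample of the observed cases stands in for drawing the
    regression parameters, so that imputations differ across repetitions.
    """
    obs = np.flatnonzero(~missing)
    boot = rng.choice(obs, size=obs.size, replace=True)
    slope, intercept = np.polyfit(x[boot], y[boot], deg=1)
    resid = y[boot] - (intercept + slope * x[boot])
    sigma = resid.std(ddof=2)  # residual SD with two estimated coefficients
    y_imp = y.copy()
    n_mis = int(missing.sum())
    # Predicted value plus random noise for every planned-missing case
    y_imp[missing] = intercept + slope * x[missing] + rng.normal(0.0, sigma, n_mis)
    return y_imp


def pooled_correlation(x, y, missing, m, rng):
    """Average the correlation over m imputed datasets on the Fisher z scale."""
    z_values = [
        np.arctanh(np.corrcoef(x, impute_once(x, y, missing, rng))[0, 1])
        for _ in range(m)
    ]
    return float(np.tanh(np.mean(z_values)))


def monte_carlo_bias(rho=0.3, n=500, share_missing=0.5, m=5, reps=200, seed=1):
    """Mean deviation of the pooled correlation from the true value rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = []
    for _ in range(reps):
        data = rng.multivariate_normal(np.zeros(2), cov, size=n)
        x, y = data[:, 0], data[:, 1].copy()
        # Planned missingness: a random share of respondents never gets item y
        missing = rng.random(n) < share_missing
        y[missing] = np.nan
        estimates.append(pooled_correlation(x, y, missing, m, rng))
    return float(np.mean(estimates) - rho)


if __name__ == "__main__":
    print(f"Estimated bias of the pooled correlation: {monte_carlo_bias():+.4f}")
```

In this deliberately easy setting (one predictor, a correctly specified linear model) the bias should be close to zero. The difficulties the abstract points to, such as low correlations, many predictors, and categorical items, are exactly what this sketch leaves out.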

Keywords
survey; questionnaire; correlation; data quality; validity; survey research; data capture; estimation

Classification
Methods and Techniques of Data Collection and Data Analysis, Statistical Methods, Computer Methods

Free Keywords
bias; imputation methods; Monte Carlo simulation; multiple imputation; split questionnaire design; German Internet Panel (GIP)

Document language
English

Publication Year
2022

Page/Pages
p. 182-209

Journal
Statistics Surveys, 16 (2022)

ISSN
1935-7516

Status
Published Version; peer reviewed

Licence
Creative Commons - Attribution 4.0

Funding
Funded by the German Research Foundation (DFG) - Project numbers BL 1148/1-1, BR 5869/1-1, WO 739/20-1


GESIS LogoDFG LogoOpen Access Logo
Home  |  Legal notices  |  Operational concept  |  Privacy policy
© 2007 - 2025 Social Science Open Access Repository (SSOAR).
Based on DSpace, Copyright (c) 2002-2022, DuraSpace. All rights reserved.
 

 

