Citation Suggestion
Please use the following Persistent Identifier (PID) to cite this document:
https://doi.org/10.17620/02671.92
Generation and use of unstructured data in the social, behavioural, and economic sciences: challenges and recommendations
Erhebung und Nutzung unstrukturierter Daten in den Sozial-, Verhaltens- und Wirtschaftswissenschaften: Herausforderungen und Empfehlungen
[working paper]
Corporate Editor
Rat für Sozial- und Wirtschaftsdaten (RatSWD)
Abstract
The increasing digital transformation of society in recent decades has resulted in a number of new data sources for the social, behavioural, and economic sciences. Among many others, they include unstructured data, which are characterised by not being available in a fixed data format and are therefore not easy to process for data analysis (e.g., Facebook posts, Instagram images, YouTube videos, Twitter messages). The use of unstructured data is linked to specific challenges, which arise precisely because the data are not typically collected as part of a controlled, scientific study but are often created in people's natural environments. Building on the results of an expert workshop, we describe the specific challenges of generating and using unstructured data and formulate recommendations for their use. Our recommendations are based on the total error framework and take into account data generation (definition of the units of analysis, coverage and sampling error, non-response, and missing data error), post-collection processing (specification error, validity, measurement error, and error in terms of content), and, lastly, data analysis (record linkage and processing errors, modelling errors, analytical errors). Finally, we discuss open questions and challenges to research using unstructured data. This output paper is aimed at students and researchers in the social, behavioural, and economic sciences on the one hand, and everyone working with unstructured data and drawing inferences from them for practical applications on the other.
Keywords
data capture; data access; data; analysis; data preparation; transparency; data quality; data protection
Classification
Methods and Techniques of Data Collection and Data Analysis, Statistical Methods, Computer Methods
Document language
English
Publication Year
2024
City
Berlin
Page/Pages
30 p.
Series
RatSWD Output Series, 2 (7)
Status
Published Version; reviewed