Creating an Annotated Corpus for Sentiment Analysis of German Product Reviews


Boland, Katarina; Wira-Alam, Andias; Messerschmidt, Reinhard


When citing this document, please always refer to the following Persistent Identifier (PID): http://nbn-resolving.de/urn:nbn:de:0168-ssoar-339398

Further details:
Corporate editor GESIS - Leibniz-Institut für Sozialwissenschaften
Abstract The availability of annotated data is an important prerequisite for the development of machine learning algorithms for sentiment analysis. However, as manually labeling large datasets is time-consuming and expensive, few datasets are available, and most of them represent a small sample of a very narrow domain, e.g. movie reviews or reviews of a certain product type. Additionally, many annotated datasets are available for English texts only. As a result, the influence of different characteristics of the input dataset on the performance of algorithms for sentiment analysis remains unclear when only training data from one specific domain is available or when specific domains are mixed in the test corpus. We therefore introduce a new dataset of German product reviews covering various product types and investigate whether even small variances within this specific domain (different product types) already exhibit different characteristics, e.g. with regard to the difficulty of sentiment annotation. The annotation of this corpus lays the basis for future enhanced annotations of similar corpora and for the extension of our annotations to corpora of inherently different domains. These will then serve to investigate the influence of different corpus characteristics on different algorithms for sentiment analysis and as a basis for applying machine learning methods to sentence-wise sentiment analysis of German texts.
Classification Natural sciences, engineering and technology, applied sciences
Document language English
Publication year 2013
Place of publication Mannheim
Pages 16 pp.
Series GESIS-Technical Reports, 2013/05
ISSN 1868-9051
Status Published version
License Deposit Licence - No redistribution, no modifications