
[conference paper]

dc.contributor.author: Zielinski, Andrea [de]
dc.contributor.author: Mutschke, Peter [de]
dc.date.accessioned: 2018-06-28T08:26:20Z
dc.date.available: 2018-06-28T08:26:20Z
dc.date.issued: 2018 [de]
dc.identifier.isbn: 979-10-95546-00-9 [de]
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/57723
dc.description.abstract: In this paper, we describe our effort to create a new corpus for the evaluation of detecting and linking so-called survey variables in social science publications (e.g., "Do you believe in Heaven?"). The task is to recognize survey variable mentions in a given text, disambiguate them, and link them to the corresponding variable within a knowledge base. Since there are generally hundreds of candidates to link to and due to the wide variety of forms they can take, this is a challenging task within NLP. The contribution of our work is the first gold standard corpus for the variable detection and linking task. We describe the annotation guidelines and the annotation process. The produced corpus is multilingual - German and English - and includes manually curated word and phrase alignments. Moreover, it includes text samples that could not be assigned to any variables, denoted as negative examples. Based on the new dataset, we conduct an evaluation of several state-of-the-art text classification and textual similarity methods. The annotated corpus is made available along with an open-source baseline system for variable mention identification and linking. [en]
dc.language: en [de]
dc.relation: info:eu-repo/grantAgreement/EC/H2020/654021 [de]
dc.rights: info:eu-repo/semantics/openAccess [de]
dc.subject.ddc: Publizistische Medien, Journalismus, Verlagswesen [de]
dc.subject.ddc: News media, journalism, publishing [en]
dc.subject.ddc: Literatur, Rhetorik, Literaturwissenschaft [de]
dc.subject.ddc: Literature, rhetoric and criticism [en]
dc.subject.other: text mining; semantic textual similarity; paraphrase detection; linking [de]
dc.title: Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications [de]
dc.type: info:eu-repo/semantics/conferenceObject [de]
dc.description.review: begutachtet (peer reviewed) [de]
dc.description.review: peer reviewed [en]
dc.source.collection: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC) [de]
dc.publisher.country: DEU
dc.subject.classoz: Informationswissenschaft [de]
dc.subject.classoz: Information Science [en]
dc.subject.classoz: Literaturwissenschaft, Sprachwissenschaft, Linguistik [de]
dc.subject.classoz: Science of Literature, Linguistics [en]
dc.subject.thesoz: Sozialwissenschaft [de]
dc.subject.thesoz: social science [en]
dc.subject.thesoz: Publikation [de]
dc.subject.thesoz: publication [en]
dc.subject.thesoz: Daten [de]
dc.subject.thesoz: data [en]
dc.subject.thesoz: Algorithmus [de]
dc.subject.thesoz: algorithm [en]
dc.subject.thesoz: Computerlinguistik [de]
dc.subject.thesoz: computational linguistics [en]
dc.identifier.urn: urn:nbn:de:0168-ssoar-57723-2
dc.rights.licence: Creative Commons - Namensnennung, Nicht kommerz., Keine Bearbeitung 4.0 [de]
dc.rights.licence: Creative Commons - Attribution-Noncommercial-No Derivative Works 4.0 [en]
ssoar.contributor.institution: GESIS [de]
internal.status: formal und inhaltlich fertig erschlossen [de]
internal.identifier.thesoz: 10058540
internal.identifier.thesoz: 10041401
internal.identifier.thesoz: 10034708
internal.identifier.thesoz: 10035039
internal.identifier.thesoz: 10040387
dc.type.stock: incollection [de]
dc.type.document: Konferenzbeitrag [de]
dc.type.document: conference paper [en]
internal.identifier.classoz: 1080500
internal.identifier.classoz: 30200
internal.identifier.document: 16
dc.contributor.corporateeditor: European Language Resources Association (ELRA)
dc.source.conference: International Conference on Language Resources and Evaluation (LREC) [de]
dc.event.city: Miyazaki (Japan) [de]
internal.identifier.corporateeditor: 96
internal.identifier.ddc: 070
internal.identifier.ddc: 800
dc.date.conference: 2018 [de]
dc.source.conferencenumber: 11 [de]
dc.description.pubstatus: Veröffentlichungsversion [de]
dc.description.pubstatus: Published Version [en]
internal.identifier.licence: 20
internal.identifier.pubstatus: 1
internal.identifier.review: 1
dc.subject.classhort: 30200 [de]
dc.subject.classhort: 50200 [de]
ssoar.wgl.collection: true [de]
internal.pdf.version: 1.5
internal.pdf.valid: true
internal.pdf.wellformed: true
internal.check.openaire: true [de]
internal.check.abstractlanguageharmonizer: CERTAIN
internal.check.languageharmonizer: CERTAIN_RETAINED

