Show simple item record

[conference paper]

dc.contributor.author: Espín-Noboa, Lisette [de]
dc.contributor.author: Karimi, Fariba [de]
dc.contributor.author: Ribeiro, Bruno [de]
dc.contributor.author: Lerman, Kristina [de]
dc.contributor.author: Wagner, Claudia [de]
dc.date.accessioned: 2023-08-16T10:39:47Z
dc.date.available: 2023-08-16T10:39:47Z
dc.date.issued: 2021 [de]
dc.identifier.issn: 2364-8228 [de]
dc.identifier.uri: https://www.ssoar.info/ssoar/handle/document/88562
dc.description.abstract: Social networks are very important carriers of information. For instance, the political leaning of our friends can serve as a proxy to identify our own political preferences. This explanatory power is leveraged in many scenarios ranging from business decision-making to scientific research to infer missing attributes using machine learning. However, factors affecting the performance and the direction of bias of these algorithms are not well understood. To this end, we systematically study how structural properties of the network and the training sample influence the results of collective classification. Our main findings show that (i) mean classification performance can empirically and analytically be predicted by structural properties such as homophily, class balance, edge density and sample size, (ii) small training samples are enough for heterophilic networks to achieve high and unbiased classification performance, even with imperfect model estimates, (iii) homophilic networks are more prone to bias issues and low performance when group size differences increase, (iv) when sampling budgets are small, partial crawls achieve the most accurate model estimates, and degree sampling achieves the highest overall performance. Our findings help practitioners to better understand and evaluate their results when sampling budgets are small or when no ground-truth is available. [de]
dc.language: en [de]
dc.subject.ddc: Naturwissenschaften [de]
dc.subject.ddc: Science [en]
dc.subject.ddc: Sozialwissenschaften, Soziologie [de]
dc.subject.ddc: Social sciences, sociology, anthropology [en]
dc.subject.other: Collective inference; Input bias; Network structure; Output bias; Relational classification; Research; Sampling bias [de]
dc.title: Explaining classification performance and bias via network structure and sampling technique [de]
dc.description.review: begutachtet (peer reviewed) [de]
dc.description.review: peer reviewed [en]
dc.identifier.url: localfile:/var/tmp/crawlerFiles/deepGreen/3236dc54396a4852b04d118c60c28340/3236dc54396a4852b04d118c60c28340.pdf [de]
dc.source.journal: Applied Network Science
dc.source.volume: 6 [de]
dc.publisher.country: CHE [de]
dc.subject.classoz: Erhebungstechniken und Analysetechniken der Sozialwissenschaften [de]
dc.subject.classoz: Methods and Techniques of Data Collection and Data Analysis, Statistical Methods, Computer Methods [en]
dc.subject.thesoz: soziales Netzwerk [de]
dc.subject.thesoz: social network [en]
dc.subject.thesoz: Datengewinnung [de]
dc.subject.thesoz: data capture [en]
dc.subject.thesoz: Stichprobe [de]
dc.subject.thesoz: sample [en]
dc.subject.thesoz: Datenqualität [de]
dc.subject.thesoz: data quality [en]
dc.identifier.urn: urn:nbn:de:0168-ssoar-88562-5
dc.rights.licence: Creative Commons - Namensnennung 4.0 [de]
dc.rights.licence: Creative Commons - Attribution 4.0 [en]
ssoar.contributor.institution: GESIS [de]
internal.status: formal und inhaltlich fertig erschlossen [de]
internal.identifier.thesoz: 10053143
internal.identifier.thesoz: 10040547
internal.identifier.thesoz: 10037472
internal.identifier.thesoz: 10055811
dc.type.stock: article [de]
dc.type.document: Konferenzbeitrag [de]
dc.type.document: conference paper [en]
dc.type.document: Zeitschriftenartikel [de]
dc.type.document: journal article [en]
internal.identifier.classoz: 10105
internal.identifier.journal: 2724
internal.identifier.document: 16
internal.identifier.document: 32
internal.identifier.ddc: 500
internal.identifier.ddc: 300
dc.identifier.doi: https://doi.org/10.1007/s41109-021-00394-3 [de]
dc.description.pubstatus: Veröffentlichungsversion [de]
dc.description.pubstatus: Published Version [en]
internal.identifier.licence: 16
internal.identifier.pubstatus: 1
internal.identifier.review: 1
ssoar.wgl.collection: true [de]
internal.dda.reference: crawler-deepgreen-217@@3236dc54396a4852b04d118c60c28340

