Exploring Wikipedia talk pages for conflict detection

Abstract : The present study focuses on Wikipedia talk pages, the online discussions in which authors debate the composition and content of Wikipedia articles. These pages provide new data for describing and analysing collaborative writing processes, which often involve conflicts. Many previous studies have explored Wikipedia conflicts, highlighting contrasting editing patterns related to cooperation, conflict or quality. Most of these studies belong to the social sciences, and linguistic analyses are uncommon in this context. The linguistic characteristics of conflicts in Wikipedia talk pages therefore remain little described in the literature. Our objective is to analyse relevant linguistic cues that may help identify and characterize conflicts on Wikipedia talk pages. To this end, we apply two automatic methods. The first is the supervised automatic classification of conflicting vs. harmonious discussion threads. The second is a multidimensional analysis of the data, which helps profile the Wikipedia talk genre and highlights key features and oppositions at a global level. The analyses are carried out on the WikiTalk corpus, a resource based on the French Wikipedia talk pages (160M words, 3M posts, 1M threads). The corpus includes a wide range of metadata, providing extra-linguistic characterization of the Wikipedia discussions.
Document type :
Book sections

https://hal.archives-ouvertes.fr/hal-01678227
Contributor : Lydia-Mai Ho-Dac
Submitted on : Tuesday, January 9, 2018 - 9:48:29 AM
Last modification on : Wednesday, July 10, 2019 - 1:32:54 AM

Licence


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Citation

Lydia-Mai Ho-Dac, Veronika Laippala, Céline Poudat, Ludovic Tanguy. Exploring Wikipedia talk pages for conflict detection. Darja Fišer and Michael Beißwenger. Investigating Computer-Mediated Communication: Corpus-Based Approaches to Language in the Digital World, ⟨Ljubljana University Press, Faculty of Arts⟩, pp.146-168, 2017, Translation Studies and Applied Linguistics, 978-961-237-961-2. ⟨10.4312/9789612379612⟩. ⟨hal-01678227⟩
