The XML Wikipedia Corpus

Ludovic Denoyer (1), Patrick Gallinari (1)
(1) MALIRE - Machine Learning and Information Retrieval
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : Wikipedia is a well-known, free-content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. The encyclopedia comprises millions of articles in different languages.
Document type :
Journal articles
Contributor : Lip6 Publications
Submitted on : Tuesday, July 7, 2015 - 10:51:32 AM
Last modification on : Friday, May 24, 2019 - 5:22:25 PM



Ludovic Denoyer, Patrick Gallinari. The XML Wikipedia Corpus. SIGIR Forum, Association for Computing Machinery (ACM), 2006, 40 (1), pp.64-69. ⟨10.1145/1147197.1147210⟩. ⟨hal-01172244⟩


