The Wikipedia XML Corpus

Ludovic Denoyer 1, Patrick Gallinari 1
1 MALIRE - Machine Learning and Information Retrieval
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract : This article presents the general Wikipedia XML Collection developed for Structured Information Retrieval and Structured Machine Learning. The collection has been built from the Wikipedia encyclopedia. We detail in particular which parts of the collection were used during INEX 2006 for the Ad-hoc track and for the XML Mining track. Note that other tracks of INEX, for example the multimedia track, have also been based on this collection.
Document type :
Conference papers
Contributor : Lip6 Publications
Submitted on : Wednesday, June 22, 2016 - 2:52:54 PM
Last modification on : Thursday, March 21, 2019 - 2:43:03 PM


Ludovic Denoyer, Patrick Gallinari. The Wikipedia XML Corpus. Advances in XML Information Retrieval and Evaluation: Fifth Workshop of the INitiative for the Evaluation of XML Retrieval (INEX'06), Dec 2006, Dagstuhl, Germany. pp.12-19, ⟨10.1007/978-3-540-73888-6_2⟩. ⟨hal-01335922⟩


