ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures

Abstract : This article presents ANCOR_Centre, a French coreference corpus available under a Creative Commons licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and is one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations involving nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource.
Document type :
Conference papers
ELRA. LREC'2014, 9th Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland. pp.MUZERELLE14.150, 2014, <http://www.lrec-conf.org/proceedings/lrec2014/index.html>


https://hal.archives-ouvertes.fr/hal-01075679
Contributor : Jean-Yves Antoine
Submitted on : Sunday, October 19, 2014 - 3:57:57 PM
Last modification on : Thursday, April 28, 2016 - 3:43:27 PM
Document(s) archived on : Tuesday, January 20, 2015 - 10:44:20 AM

File

2014_LREC_ANCOR.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01075679, version 1

Citation

Judith Muzerelle, Anaïs Lefeuvre, Emmanuel Schang, Jean-Yves Antoine, Aurore Pelletier, et al. ANCOR_Centre, a Large Free Spoken French Coreference Corpus: description of the Resource and Reliability Measures. ELRA. LREC'2014, 9th Language Resources and Evaluation Conference, May 2014, Reykjavik, Iceland. pp.MUZERELLE14.150, 2014, <http://www.lrec-conf.org/proceedings/lrec2014/index.html>. <hal-01075679>
