
The ETAPE corpus for the evaluation of speech-based TV content processing in the French language

Guillaume Gravier 1, Gilles Adda 2, Niklas Paulson 3, Matthieu Carré 3, Aude Giraudel 4, Olivier Galibert 5
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 Traitement du Langage parlé
LIMSI - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
Abstract : This paper presents a comprehensive overview of existing data for the evaluation of spoken content processing in a multimedia framework for the French language. We focus on the ETAPE corpus, which will be made publicly available by ELDA at the end of 2012, after completion of the evaluation, and recall existing resources resulting from previous evaluation campaigns. The ETAPE corpus consists of 30 hours of TV and radio broadcasts, selected to cover a wide variety of topics and speaking styles, emphasizing spontaneous speech and multiple speaker areas.
Document type : Conference papers

Contributor : Guillaume Gravier
Submitted on : Sunday, July 1, 2012 - 9:54:29 PM
Last modification on : Tuesday, March 15, 2022 - 3:22:40 AM
Long-term archiving on : Tuesday, October 2, 2012 - 9:30:56 AM




  • HAL Id : hal-00712591, version 1


Guillaume Gravier, Gilles Adda, Niklas Paulson, Matthieu Carré, Aude Giraudel, et al. The ETAPE corpus for the evaluation of speech-based TV content processing in the French language. LREC - Eighth international conference on Language Resources and Evaluation, 2012, Turkey. ⟨hal-00712591⟩


