Conference papers

Caractérisation et détection de parole spontanée dans de larges collections de documents audio (Characterization and detection of spontaneous speech in large collections of audio documents)

Abstract : Processing spontaneous speech is one of the many challenges that Automatic Speech Recognition (ASR) systems have to deal with. The main markers of spontaneous speech are disfluencies (filled pauses, repetitions, repairs and false starts), and many studies have focused on detecting and correcting them. In this study we define spontaneous speech as unprepared speech, as opposed to prepared speech, where utterances contain well-formed sentences close to those found in written documents. Disfluencies are very good indicators of unprepared speech, but they are not the only ones: ungrammaticality, language register and prosodic patterns are also important. This paper proposes a set of acoustic and linguistic features that can be used to characterize and detect spontaneous speech segments in large audio databases. To better define this notion of unprepared speech, a set of speech segments representing an 11-hour corpus of French Broadcast News (BN) has been manually labelled according to level of spontaneity. We present an evaluation of our features on this corpus and describe the correlation between the level of spontaneity and the word error rate (WER) obtained by a state-of-the-art ASR decoder on this BN corpus.
Document type :
Conference papers
Contributor : bibliothèque Universitaire Déposants HAL-Avignon
Submitted on : Wednesday, May 25, 2016 - 10:38:14 AM
Last modification on : Wednesday, October 20, 2021 - 4:31:09 PM


  • HAL Id : hal-01321187, version 1



Vincent Jousse, Yannick Estève, Frédéric Béchet, Thierry Bazillon, Georges Linares. Caractérisation et détection de parole spontanée dans de larges collections de documents audio. JEP, Jun 2008, Avignon, France. ⟨hal-01321187⟩


