Caractérisation et détection de parole spontanée dans de larges collections de documents audio - HAL open archive
Conference paper. Year: 2008

Caractérisation et détection de parole spontanée dans de larges collections de documents audio

Abstract

Processing spontaneous speech is one of the many challenges that Automatic Speech Recognition (ASR) systems face. The main markers of spontaneous speech are disfluencies (filled pauses, repetitions, repairs and false starts), and many studies have focused on detecting and correcting them. In this study we define spontaneous speech as unprepared speech, as opposed to prepared speech, in which utterances consist of well-formed sentences close to those found in written documents. Disfluencies are of course very good indicators of unprepared speech, but they are not the only ones: ungrammaticality and language register are also important, as are prosodic patterns. This paper proposes a set of acoustic and linguistic features that can be used to characterize and detect spontaneous speech segments in large audio databases. To better define this notion of unprepared speech, a set of speech segments making up an 11-hour corpus (French Broadcast News) has been manually labelled according to level of spontaneity. We present an evaluation of our features on this corpus and describe the correlation between the Word Error Rate obtained by a state-of-the-art ASR decoder on this Broadcast News (BN) corpus and the level of spontaneity.
No file deposited

Dates and versions

hal-01317613, version 1 (18-05-2016)

Identifiers

  • HAL Id: hal-01317613, version 1

Cite

Vincent Jousse, Yannick Estève, Frédéric Béchet, Thierry Bazillon, Georges Linares. Caractérisation et détection de parole spontanée dans de larges collections de documents audio. JEP, Jun 2008, Avignon, France. ⟨hal-01317613⟩
225 views
0 downloads
