Using Broad Phonetic Classes to Guide Search in Automatic Speech Recognition

Stefan Ziegler 1, Bogdan Ludusan 2, Guillaume Gravier 2
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This work presents a novel framework to guide the Viterbi decoding process of a hidden Markov model-based speech recognition system by means of broad phonetic classes. In a first step, decision trees are employed, along with frame- and segment-based attributes, to detect broad phonetic classes in the speech signal. The detected phonetic classes are then used to reinforce paths in the search process, either at every frame or at phonetically significant landmarks. Results obtained on French broadcast news data show a relative improvement in word error rate of about 2% with respect to the baseline.
Document type :
Conference papers
Cited literature [12 references]

https://hal.archives-ouvertes.fr/hal-00758427
Contributor : Guillaume Gravier
Submitted on : Wednesday, November 28, 2012 - 5:00:31 PM
Last modification on : Friday, November 16, 2018 - 1:22:28 AM
Long-term archiving on : Saturday, December 17, 2016 - 5:43:13 PM

File

LDASR_interspeech12.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00758427, version 1

Citation

Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier. Using Broad Phonetic Classes to Guide Search in Automatic Speech Recognition. INTERSPEECH - Annual Conference of the International Speech Communication Association, 2012, United States. ⟨hal-00758427⟩
