
Modèles de langue exploitant la similarité structurelle entre séquences pour la reconnaissance de la parole (Language models exploiting structural similarity between sequences for speech recognition)

Christian Gillot 1, 2
1 SYNALP - Natural Language Processing : representations, inference and semantics
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : The role of a stochastic language model is to give the best possible estimate of the probability of a sequence of words in a given language. It is an essential component of any speech recognition system and has a great influence on performance. The state-of-the-art models most commonly used are n-gram models smoothed with the Kneser-Ney technique. These models rely on occurrence statistics of word sequences, typically up to a length of 5, computed on a large training corpus. This thesis starts with an empirical study of the errors of a state-of-the-art French speech recognition system, which shows that many regular language phenomena are out of reach of n-gram models. The thesis therefore explores an approach dual to the prevailing statistical paradigm: memory models that efficiently handle specific phenomena, used in synergy with n-gram models, which efficiently model the main trends. The notion of similarity between long n-grams is studied in order to identify the relevant contexts to take into account in a first similarity language model. The data extracted from the corpus are combined via a Gaussian kernel to compute a new score. Integrating this non-probabilistic model improves the performance of a recognition system. A second model is then introduced; it is probabilistic, which allows a better integration of the similarity approach with the existing models, and it improves perplexity on text.
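To make the idea of combining corpus evidence through a Gaussian kernel more concrete, here is a minimal sketch in Python. It is not the model described in the thesis: the distance function (`context_distance`, a count of mismatched positions), the bandwidth `sigma`, and the function `kernel_scores` are all assumptions chosen for illustration. The sketch only shows how contexts that are similar to, but not identical with, the current history can each contribute a weighted vote for the next word.

```python
import math
from collections import defaultdict


def context_distance(a, b):
    """Hypothetical distance: number of positions at which two
    equal-length word contexts differ (an assumption, not the
    similarity measure studied in the thesis)."""
    return sum(x != y for x, y in zip(a, b))


def kernel_scores(corpus, context, sigma=1.0):
    """Score candidate next words by summing Gaussian-kernel weights
    over every corpus position whose preceding words resemble `context`.

    The result is an unnormalized score per word, not a probability.
    """
    n = len(context)
    scores = defaultdict(float)
    for i in range(n, len(corpus)):
        seen = tuple(corpus[i - n:i])          # context observed in the corpus
        d = context_distance(seen, context)    # 0 means an exact match
        scores[corpus[i]] += math.exp(-(d * d) / (2.0 * sigma ** 2))
    return dict(scores)


corpus = "the cat sat on the mat the dog sat on the rug".split()
scores = kernel_scores(corpus, ("the", "cat", "sat"))
best = max(scores, key=scores.get)
print(best, round(scores[best], 4))
```

In this toy corpus the exact context "the cat sat" and the near match "the dog sat" both vote for the same continuation, which is the kind of evidence a fixed-order n-gram count on the exact history would not pool. Because the scores are unnormalized, such a model is non-probabilistic in the sense used in the abstract.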

Cited literature [62 references]
Contributor : Christophe Cerisara
Submitted on : Monday, January 18, 2016 - 4:53:15 PM
Last modification on : Friday, February 26, 2021 - 3:28:06 PM
Long-term archiving on : Friday, November 11, 2016 - 10:01:01 AM


  • HAL Id : tel-01258153, version 1


Christian Gillot. Modèles de langue exploitant la similarité structurelle entre séquences pour la reconnaissance de la parole. Artificial Intelligence [cs.AI]. Université de Lorraine, 2012. French. ⟨tel-01258153⟩


