Theses

Modèles de langue exploitant la similarité structurelle entre séquences pour la reconnaissance de la parole

Christian Gillot 1, 2
1 SYNALP - Natural Language Processing : representations, inference and semantics
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : The role of a stochastic language model is to give the best possible estimate of the probability of a sequence of words in a given language. It is an essential component of any speech recognition system and has a great influence on its performance. The most commonly used state-of-the-art models are n-gram models smoothed with the Kneser-Ney technique. These models rely on occurrence statistics of word sequences, typically up to length 5, computed on a large training corpus. This thesis starts with an empirical study of the errors of a state-of-the-art French speech recognition system, which shows that many regular language phenomena are out of reach of n-gram models. The thesis therefore explores an approach dual to the prevailing statistical paradigm: memory models, which efficiently handle specific phenomena, are used in synergy with n-gram models, which efficiently capture the main trends. The notion of similarity between long n-grams is studied in order to identify the relevant contexts to take into account in a first similarity language model. The data extracted from the corpus are combined via a Gaussian kernel to compute a new score. Integrating this non-probabilistic model improves the performance of a recognition system. A second model is then introduced; it is probabilistic, which allows a better integration of the similarity approach with existing models, and it improves perplexity on text.
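As an illustration of the Gaussian-kernel idea mentioned in the abstract, the following is a minimal sketch and not the formula used in the thesis. It assumes that similarity between two equal-length contexts is measured by a Hamming distance over words, and that observations are weighted by a Gaussian function of that distance; the function names, the distance, and the `sigma` parameter are all hypothetical choices made for this example.

```python
import math


def hamming_distance(a, b):
    """Number of positions at which two equal-length word sequences differ."""
    if len(a) != len(b):
        raise ValueError("contexts must have the same length")
    return sum(x != y for x, y in zip(a, b))


def kernel_score(context, word, observations, sigma=1.0):
    """Kernel-weighted relative frequency of `word` after contexts similar to `context`.

    `observations` is a list of (context_tuple, next_word) pairs taken from a corpus.
    Each observation is weighted by exp(-d^2 / (2 * sigma^2)), where d is the
    Hamming distance between its context and the query context (an assumption
    of this sketch, not the similarity measure defined in the thesis).
    """
    total = 0.0
    match = 0.0
    for obs_context, obs_word in observations:
        d = hamming_distance(context, obs_context)
        weight = math.exp(-(d * d) / (2.0 * sigma * sigma))
        total += weight
        if obs_word == word:
            match += weight
    return match / total if total > 0 else 0.0


# Toy data: three observed (context, next word) pairs.
observations = [
    (("the", "cat"), "sat"),
    (("the", "dog"), "sat"),
    (("a", "bird"), "flew"),
]
score = kernel_score(("the", "cat"), "sat", observations, sigma=1.0)
```

A small `sigma` makes the score depend almost only on exact context matches, as a plain n-gram count would; a larger `sigma` lets partially matching contexts contribute.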

https://hal.archives-ouvertes.fr/tel-01258153
Contributor : Christophe Cerisara
Submitted on : Monday, January 18, 2016 - 4:53:15 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:01 PM
Document(s) archived on : Friday, November 11, 2016 - 10:01:01 AM

Identifiers

  • HAL Id : tel-01258153, version 1

Citation

Christian Gillot. Modèles de langue exploitant la similarité structurelle entre séquences pour la reconnaissance de la parole. Intelligence artificielle [cs.AI]. Université de Lorraine, 2012. Français. ⟨tel-01258153⟩
