Conference papers

An Hybrid Language Model for a Continuous Dictation Prototype

Kamel Smaïli 1 Imed Zitouni 2 François Charpillet 3 Jean-Paul Haton 2
1 SMarT - Statistical Machine Translation and Speech Modelization and Text
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
3 MAIA - Autonomous intelligent machine
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract : This paper describes the combination of a stochastic language model and a formal grammar modeled as a unification grammar. The stochastic model is trained on 42 million words extracted from the newspaper Le Monde and is based on smoothed 3-gram and 3-class models. The 3-class model is represented by a Markov chain made up of four states. Several experiments were carried out to determine which values are best for a specific training and test corpus. The experiments indicate that the unification grammar strongly reduces the number of hypotheses (sentences) produced by the stochastic model.
Contributor : Kamel Smaïli
Submitted on : Tuesday, February 3, 2015 - 7:12:36 PM
Last modification on : Tuesday, December 18, 2018 - 4:40:21 PM


  • HAL Id : hal-01112905, version 1


Kamel Smaïli, Imed Zitouni, François Charpillet, Jean-Paul Haton. An Hybrid Language Model for a Continuous Dictation Prototype. 5th European Conference on Speech Communication and Technology, Sep 1997, Rhodes, Greece. ⟨hal-01112905⟩


