An Hybrid Language Model for a Continuous Dictation Prototype

Kamel Smaïli 1 Imed Zitouni 2 François Charpillet 3 Jean-Paul Haton 2
1 SMarT - Statistical Machine Translation and Speech Modelization and Text
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
3 MAIA - Autonomous intelligent machine
Inria Nancy - Grand Est, LORIA - AIS - Department of Complex Systems, Artificial Intelligence & Robotics
Abstract: This paper describes the combination of a stochastic language model and a formal grammar modeled as a unification grammar. The stochastic model is trained on 42 million words extracted from the newspaper Le Monde and is based on smoothed 3-gram and 3-class models. The 3-class model is represented by a Markov chain made up of four states. Several experiments were carried out to determine which values are best for a specific training and test corpus. The experiments indicate that the unification grammar strongly reduces the number of hypotheses (sentences) produced by the stochastic model.
Document type:
Conference paper
5th European Conference on Speech Communication and Technology, Sep 1997, Rhodes, Greece
https://hal.archives-ouvertes.fr/hal-01112905
Contributor: Kamel Smaïli
Submitted on: Tuesday, February 3, 2015 - 19:12:36
Last modified on: Tuesday, December 18, 2018 - 16:40:21

Identifiers

  • HAL Id : hal-01112905, version 1

Citation

Kamel Smaïli, Imed Zitouni, François Charpillet, Jean-Paul Haton. An Hybrid Language Model for a Continuous Dictation Prototype. 5th European Conference on Speech Communication and Technology, Sep 1997, Rhodes, Greece. 〈hal-01112905〉
