Combinaison de modèles de langage pour l'identification de thèmes (Combining language models for topic identification)

Brigitte Bigi; Renato de Mori; Marc El Bèze; Thierry Spriet

Conference paper. Year: 1998

Brigitte Bigi

Role: Author
PersonId : 7990
IdHAL : brigittebigi
ORCID : 0000-0003-1834-6918
IdRef : 079410790

Laboratoire Informatique d'Avignon

Renato de Mori

Role: Author
PersonId : 981954

Laboratoire Informatique d'Avignon

Marc El Bèze

Role: Author
PersonId : 949557

Laboratoire Informatique d'Avignon

Thierry Spriet

Role: Author
PersonId : 7705
IdHAL : thierry-spriet
IdRef : 079038131

Laboratoire Informatique d'Avignon

Abstract

A new statistical method for language modeling and spoken document classification is proposed. It is based on a mixture of topic-dependent probabilities. Each topic-dependent probability is in turn a mixture of n-gram probabilities and a probability derived from Kullback-Leibler (KL) distances between keyword unigrams and the distribution obtained from the content of a cache memory. Experimental results on topic classification, using a corpus of 60 Mwords from the French newspaper Le Monde, show the excellent performance of the cache memory and its complementary role in providing different statistics for the decision process.
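
The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the add-alpha smoothing, the use of exp(-KL) normalised over topics as the cache-based probability, the linear interpolation weight `lam`, and all function names and toy data below are assumptions made for illustration only.

```python
import math
from collections import Counter


def unigram_dist(words, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed keyword vocabulary."""
    counts = Counter(w for w in words if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions on the same support, with q[w] > 0."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def topic_scores(cache_words, keyword_models, ngram_probs, lam=0.5):
    """Per-topic mixture of an n-gram probability and a KL-based cache score.

    Assumptions (hypothetical, not from the paper):
    - the cache score of a topic is exp(-KL(cache || topic)), normalised over topics;
    - the two components are combined by linear interpolation with weight lam.
    """
    vocab = list(next(iter(keyword_models.values())))
    cache = unigram_dist(cache_words, vocab)
    kl_score = {t: math.exp(-kl_divergence(cache, m))
                for t, m in keyword_models.items()}
    kl_norm = sum(kl_score.values())
    ngram_norm = sum(ngram_probs.values())
    return {t: lam * ngram_probs[t] / ngram_norm
               + (1 - lam) * kl_score[t] / kl_norm
            for t in keyword_models}


# Toy data: two topics, four keywords, and a short cache of recent words.
vocab = ["match", "goal", "vote", "law"]
models = {
    "sport": unigram_dist(["match", "goal", "goal", "match"], vocab),
    "politics": unigram_dist(["vote", "law", "law", "vote"], vocab),
}
cache = ["goal", "match", "the", "goal"]   # "the" is not a keyword and is ignored
ngram = {"sport": 0.5, "politics": 0.5}    # uninformative n-gram component
scores = topic_scores(cache, models, ngram)
best = max(scores, key=scores.get)
```

In this toy setting the n-gram component is deliberately uninformative, so the decision rests on the cache statistics alone, which mirrors the complementary role of the cache mentioned in the abstract.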

Keywords

Topic identification

Domains

Computation and Language [cs.CL]; Information and communication sciences

Main file

bigi1998jep.pdf (139.89 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01392234

Submitted on: Thursday, 15 December 2016, 15:04:05

Last modified on: Tuesday, 14 January 2020, 10:38:06

Long-term archiving on: Thursday, 16 March 2017, 17:43:46

Dates and versions

hal-01392234, version 1 (15-12-2016)

Identifiers

HAL Id: hal-01392234, version 1

Cite

Brigitte Bigi, Renato de Mori, Marc El Bèze, Thierry Spriet. Combinaison de modèles de langage pour l'identification de thèmes. XXIIèmes Journées d'Etudes sur la Parole, 1998, Martigny, Suisse. pp.347-350. ⟨hal-01392234⟩

Collections

UNIV-AVIGNON LIA
