Modeling topic dependencies in semantically coherent text spans with copulas

Abstract : The exchangeability assumption in topic models such as Latent Dirichlet Allocation (LDA) often leads to inconsistent topics being inferred for the words of text spans such as noun phrases, which are usually expected to be topically coherent. We propose copulaLDA, which extends LDA by integrating part of the text structure into the model and relaxing the conditional independence assumption between the word-specific latent topics given the per-document topic distributions. To this end, we assume that the words of text spans such as noun phrases are topically bound, and we model this dependence with copulas. We demonstrate empirically the effectiveness of copulaLDA on both intrinsic and extrinsic evaluation tasks on several publicly available corpora.
Keywords : Copulas
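
The idea of binding the topics of the words in a span with a copula can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a copula family, so the Gaussian copula used here is an assumption, and the function name `sample_span_topics` and the parameter `rho` are hypothetical. The sketch draws dependent uniforms from the copula and pushes each one through the inverse CDF of the per-document topic distribution `theta`, so every word keeps the marginal `theta` while the topics within a span become correlated.

```python
import math
import numpy as np

def sample_span_topics(theta, span_len, rho, rng):
    """Draw one topic per word of a span, coupled through a Gaussian copula.

    theta    : per-document topic distribution (1-D array summing to 1)
    span_len : number of words in the span (e.g. a noun phrase)
    rho      : assumed pairwise correlation in [0, 1]; 0 recovers
               independent LDA-style draws, 1 forces a single shared topic
    """
    # Equicorrelated standard normals: one shared factor plus independent noise.
    shared = rng.standard_normal()
    noise = rng.standard_normal(span_len)
    x = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise
    # The standard normal CDF gives each coordinate a Uniform(0, 1) marginal.
    u = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x])
    # Inverse CDF of Categorical(theta), so each word's marginal stays theta.
    cdf = np.cumsum(theta)
    z = np.searchsorted(cdf, u, side="right")
    # Guard against rounding in the cumulative sum.
    return np.minimum(z, len(theta) - 1)

rng = np.random.default_rng(0)
theta = np.array([0.5, 0.3, 0.2])
bound = sample_span_topics(theta, 4, 1.0, rng)  # fully bound span
free = sample_span_topics(theta, 4, 0.0, rng)   # independent draws
```

With `rho = 1.0` all words in the span receive the same topic, and with `rho = 0.0` the draws are independent, which corresponds to the conditional independence assumption that the abstract says copulaLDA relaxes.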
Document type :
Conference paper
International Conference on Computational Linguistics (COLING), Dec 2016, Osaka, Japan. 〈http://coling2016.anlp.jp/〉

https://hal.archives-ouvertes.fr/hal-01406613
Contributor : Massih-Reza Amini
Submitted on : Thursday, December 1, 2016 - 13:33:54
Last modified on : Thursday, October 11, 2018 - 08:48:05

Identifiers

  • HAL Id : hal-01406613, version 1

Citation

Georgios Balikas, Hesam Amoualian, Marianne Clausel, Eric Gaussier, Massih-Reza Amini. Modeling topic dependencies in semantically coherent text spans with copulas. International Conference on Computational Linguistics (COLING), Dec 2016, Osaka, Japan. 〈http://coling2016.anlp.jp/〉. 〈hal-01406613〉
