Conference papers

Streaming-LDA: A Copula-based Approach to Modeling Topic Dependencies in Document Streams

Abstract : We propose two new models for capturing topic and word-topic dependencies between consecutive documents in document streams. The first model is a direct extension of the Latent Dirichlet Allocation (LDA) model and uses a Dirichlet distribution to balance the influence of the LDA prior parameters against the topic and word-topic distributions of the previous document. The second extension uses copulas, which constitute a generic tool for modeling dependencies between random variables. We rely here on Archimedean copulas, and more precisely on Frank copulas, as they are symmetric and associative and are thus appropriate for exchangeable random variables. Our experiments, conducted on three standard collections that have been used in several studies on topic modeling, show that our proposals outperform previous ones (such as dynamic topic models and temporal LDA), both in terms of perplexity and for tracking similar topics in a document stream.
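As an illustration of the copula family named in the abstract, the following is a minimal sketch of the standard bivariate Frank copula, C_theta(u, v) = -(1/theta) * ln(1 + (e^(-theta*u) - 1)(e^(-theta*v) - 1) / (e^(-theta) - 1)). It is not the authors' implementation and does not reproduce their model; the function name frank_copula and the value theta = 5.0 are assumptions chosen only for this example. It checks numerically the two properties the abstract relies on, symmetry and associativity.

```python
import math


def frank_copula(u: float, v: float, theta: float) -> float:
    """Standard bivariate Frank copula C_theta(u, v), defined for theta != 0.

    u and v are expected to lie in [0, 1]. The name and signature are
    illustrative assumptions, not taken from the paper.
    """
    if theta == 0:
        raise ValueError("theta must be non-zero (theta -> 0 is the independence limit)")
    # expm1 and log1p keep the computation accurate for small arguments.
    denom = math.expm1(-theta)
    numer = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(numer / denom) / theta


theta = 5.0  # hypothetical dependence parameter, chosen only for this example
u, v, w = 0.3, 0.7, 0.5

# Uniform margins: C(u, 1) = u.
margin_gap = abs(frank_copula(u, 1.0, theta) - u)

# Symmetry: C(u, v) = C(v, u).
symmetry_gap = abs(frank_copula(u, v, theta) - frank_copula(v, u, theta))

# Associativity: C(C(u, v), w) = C(u, C(v, w)).
left = frank_copula(frank_copula(u, v, theta), w, theta)
right = frank_copula(u, frank_copula(v, w, theta), theta)
associativity_gap = abs(left - right)

print(margin_gap, symmetry_gap, associativity_gap)
```

All three gaps should be zero up to floating-point rounding. How the paper couples such a copula with the topic and word-topic distributions of consecutive documents is not stated in the abstract and is not sketched here.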

Cited literature [25 references]
Contributor : Hesam Amoualian
Submitted on : Tuesday, July 12, 2016 - 3:19:19 PM
Last modification on : Wednesday, November 3, 2021 - 6:47:17 AM


Files produced by the author(s)



Hesam Amoualian, Marianne Clausel, Eric Gaussier, Massih-Reza Amini. Streaming-LDA: A Copula-based Approach to Modeling Topic Dependencies in Document Streams. 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Aug 2016, San Francisco, United States. pp. 695-704 ⟨10.1145/2939672.2939781⟩. ⟨hal-01344779⟩


