Conference papers

Text mining with constrained tensor decomposition

Abstract : Text mining, as a special case of data mining, refers to the estimation of knowledge or parameters needed for a given purpose, such as unsupervised clustering, from the observation of various documents. In this context, the topic of a document can be seen as a hidden variable, and words are multi-view variables related to each other through a topic. The main goal of this paper is to estimate the probabilities of topics and the conditional probabilities of words given topics. To this end, we use a nonnegative Canonical Polyadic (CP) decomposition of a third-order moment tensor of observed words. Our computer simulations show that the proposed algorithm performs better than a previously proposed algorithm, which applies the robust tensor power method after whitening by the second-order moment. Moreover, as our cost function includes a nonnegativity constraint on the estimated probabilities, we never obtain negative values in our estimates, whereas this often occurs with the power method combined with deflation. In addition, unlike deflation-based techniques, our algorithm is capable of handling over-complete cases, where the number of hidden variables is larger than that of multi-view variables. Further, the method proposed herein supports a larger degree of over-completeness than modified versions of the tensor power method, which have been customized to handle the over-complete case.
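
The idea summarized in the abstract can be illustrated with a minimal sketch. It assumes a single-topic model in which three words of a document are conditionally independent given the topic, so that the third-order moment tensor takes the form T = sum_r p[r] a_r ⊗ a_r ⊗ a_r, where p holds the topic probabilities and the columns a_r hold the word probabilities given topic r. The fitting routine below uses generic multiplicative updates for a nonnegative CP model; this is a swapped-in textbook technique, not the algorithm proposed in the paper, and it does not enforce the symmetry of the three factors. All function names (`moment_tensor`, `khatri_rao`, `nonneg_cp`, `to_probabilities`) and the toy values of `p_true` and `A_true` are hypothetical.

```python
import numpy as np

def moment_tensor(p, A):
    """Exact third-order moment: T[i,j,k] = sum_r p[r] A[i,r] A[j,r] A[k,r]."""
    return np.einsum("r,ir,jr,kr->ijk", p, A, A, A)

def khatri_rao(B, C):
    """Column-wise Kronecker product; row index is j * C.shape[0] + k."""
    return np.einsum("jr,kr->jkr", B, C).reshape(-1, B.shape[1])

def nonneg_cp(T, rank, n_iter=2000, seed=0, eps=1e-12):
    """Nonnegative CP fit by multiplicative updates (illustrative helper,
    not the paper's algorithm). Factors stay nonnegative by construction."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((n, rank)) + 0.1 for n in T.shape]
    # Mode-n unfoldings, ordered to match khatri_rao of the remaining factors.
    unfoldings = [np.moveaxis(T, n, 0).reshape(T.shape[n], -1) for n in range(3)]
    for _ in range(n_iter):
        for n in range(3):
            B, C = [factors[m] for m in range(3) if m != n]
            numer = unfoldings[n] @ khatri_rao(B, C)
            denom = factors[n] @ ((B.T @ B) * (C.T @ C)) + eps
            factors[n] = factors[n] * numer / denom
    return factors

def to_probabilities(factors):
    """Normalize columns to sum to one; the removed scale becomes the topic weight."""
    sums = [F.sum(axis=0) for F in factors]
    weights = sums[0] * sums[1] * sums[2]      # estimated topic probabilities
    words_given_topic = factors[0] / sums[0]   # estimated P(word | topic)
    return weights, words_given_topic

# Toy ground truth (assumed values): 2 topics over a 6-word vocabulary.
p_true = np.array([0.6, 0.4])
A_true = np.array([[0.40, 0.05],
                   [0.30, 0.05],
                   [0.10, 0.10],
                   [0.10, 0.10],
                   [0.05, 0.30],
                   [0.05, 0.40]])

T = moment_tensor(p_true, A_true)
factors = nonneg_cp(T, rank=2)
weights, words_given_topic = to_probabilities(factors)

T_hat = np.einsum("ir,jr,kr->ijk", *factors)
rel_err = np.linalg.norm(T_hat - T) / np.linalg.norm(T)
print("relative reconstruction error:", rel_err)
print("estimated topic weights:", weights)
```

Because every update multiplies a nonnegative factor by a ratio of nonnegative quantities, the estimates cannot become negative, which mirrors the nonnegativity property stated in the abstract. The recovered topics are identified only up to a permutation of the columns, and in practice the tensor would be estimated from word co-occurrence counts rather than built from known parameters.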

Contributor : Pierre Comon
Submitted on : Monday, July 1, 2019 - 5:25:44 PM
Last modification on : Thursday, May 27, 2021 - 2:16:01 PM




  • HAL Id : hal-02084803, version 3


Elaheh Sobhani, Pierre Comon, Christian Jutten, Massoud Babaie-Zadeh. Text mining with constrained tensor decomposition. LOD 2019 - 5th International Conference on Machine Learning, Optimization, and Data Science, Sep 2019, Certosa di Pontignano, Siena, Italy. ⟨hal-02084803⟩


