Conference paper, Year: 2016

TOM: A library for topic modeling and browsing

Abstract

In this paper, we present TOM (TOpic Modeling), a Python library for topic modeling and browsing. Its objective is to support the efficient analysis of a text corpus from start to finish through the discovery of latent topics. To this end, TOM features advanced functions for preparing and vectorizing a text corpus. It also offers a unified interface for two topic models (LDA, using either variational inference or Gibbs sampling, and NMF, using alternating least squares with a projected gradient method), and implements three state-of-the-art methods for estimating the optimal number of topics with which to model a corpus. In addition, TOM constructs an interactive Web-based browser that makes it easy to explore a topic model and the related corpus.
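
The vectorize-then-factorize workflow described in the abstract can be illustrated with a small, dependency-free sketch. This is not TOM's API: the function names (`tfidf_matrix`, `nmf`, `top_words`, `reconstruction_error`) and the toy corpus are hypothetical. The NMF solver below uses multiplicative updates, a simpler stand-in for the alternating least squares with projected gradient mentioned above, chosen only to keep the example short.

```python
import math
import random
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocabulary, matrix) with one smoothed TF-IDF row per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: j for j, tok in enumerate(vocab)}
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    n_docs = len(docs)
    matrix = []
    for toks in tokenized:
        row = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
            row[index[tok]] = count * idf
        matrix.append(row)
    return vocab, matrix

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize v (docs x terms) as w (docs x topics) times h (topics x terms).

    Assumption: multiplicative updates, not the ALS / projected-gradient
    solver named in the abstract.
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

def top_words(h, vocab, n=3):
    """List the n highest-weighted vocabulary terms for each topic row of h."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

def reconstruction_error(v, w, h):
    """Frobenius norm of v - w h."""
    approx = matmul(w, h)
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(v, approx) for a, b in zip(ra, rb)))

# Toy corpus, invented for illustration only.
docs = [
    "topic model corpus text",
    "corpus text vectorize topic",
    "gradient matrix factorization least squares",
    "matrix factorization gradient method",
]
vocab, v = tfidf_matrix(docs)
w, h = nmf(v, n_topics=2)
for k, words in enumerate(top_words(h, vocab)):
    print(f"topic {k}: {' '.join(words)}")
```

Each row of `h` weights the vocabulary for one topic, and each row of `w` gives a document's mixture over topics. Comparing `reconstruction_error` across several values of `n_topics` gives a rough feel for model fit, though it is not one of the three estimation methods the paper refers to.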
No file deposited

Dates and versions

hal-01442868, version 1 (21-01-2017)

Identifiers

  • HAL Id: hal-01442868, version 1

Cite

Adrien Guille, Edmundo-Pavel Soriano-Morales. TOM: A library for topic modeling and browsing. Conférence sur l'Extraction et la Gestion des Connaissances, Jan 2016, Reims, France. ⟨hal-01442868⟩