Conference paper. Year: 2015

Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis

Abstract

Topic segmentation traditionally relies on lexical cohesion measured through word re-occurrences to output a dense segmentation, either linear or hierarchical. In this paper, a novel organization of the topical structure of textual content is proposed. Rather than searching for topic shifts to yield dense segmentation, we propose an algorithm to extract topically focused fragments organized in a hierarchical manner. This is achieved by leveraging the temporal distribution of word re-occurrences, searching for bursts, to skirt the limits imposed by a global counting of lexical re-occurrences within segments. Comparison to a reference dense segmentation on varied datasets indicates that we can achieve a better topic focus while retrieving all of the important aspects of a text.
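
To make the idea of a "burst" of word re-occurrences concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the burst model, so a simple gap threshold stands in for it. The function name `find_bursts` and the parameters `max_gap` and `min_count` are hypothetical and chosen only for illustration. A word is treated as bursting over a token interval when it re-occurs at least `min_count` times with no more than `max_gap` tokens between consecutive occurrences.

```python
from collections import defaultdict


def find_bursts(tokens, max_gap=5, min_count=2):
    """Return {word: [(start, end), ...]} of token-index intervals.

    Assumption (not from the paper): a burst is a maximal run of
    occurrences of the same word whose consecutive positions are at
    most `max_gap` tokens apart, with at least `min_count` occurrences.
    """
    # Record every position at which each (lowercased) word occurs.
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok.lower()].append(i)

    bursts = {}
    for word, pos in positions.items():
        intervals = []
        start = prev = pos[0]
        count = 1
        for p in pos[1:]:
            if p - prev <= max_gap:
                # Still inside the current run of close re-occurrences.
                count += 1
            else:
                # Gap too large: close the current run, start a new one.
                if count >= min_count:
                    intervals.append((start, prev))
                start, count = p, 1
            prev = p
        if count >= min_count:
            intervals.append((start, prev))
        if intervals:
            bursts[word] = intervals
    return bursts


tokens = ["wine", "grape", "wine", "harvest",
          "a", "b", "c", "d", "e", "f", "g", "h",
          "tax", "wine", "tax", "budget", "tax"]
print(find_bursts(tokens, max_gap=3))
```

In this toy input, "wine" bursts only at the start even though it re-occurs once later, which is the kind of locality a global count of re-occurrences within a segment would miss. Under this heuristic, rerunning with a larger `max_gap` merges nearby runs into coarser intervals, which is one simple way to picture fragments at several levels of granularity.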
Main file
114_Paper.pdf (486.89 KB)
Origin: files produced by the author(s)

Dates and versions

hal-01186443, version 1 (24-08-2015)

Identifiers

  • HAL Id: hal-01186443, version 1

Cite

Anca Simon, Pascale Sébillot, Guillaume Gravier. Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis. Recent Advances on Natural Language Processing, 2015, Hissar, Bulgaria. ⟨hal-01186443⟩
586 views
208 downloads
