Conference papers

Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis

Anca Simon (1), Pascale Sébillot (1), Guillaume Gravier (1)
(1) LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract: Topic segmentation traditionally relies on lexical cohesion measured through word re-occurrences to output a dense segmentation, either linear or hierarchical. In this paper, a novel organization of the topical structure of textual content is proposed. Rather than searching for topic shifts to yield dense segmentation, we propose an algorithm to extract topically focused fragments organized in a hierarchical manner. This is achieved by leveraging the temporal distribution of word re-occurrences, searching for bursts, to skirt the limits imposed by a global counting of lexical re-occurrences within segments. Comparison to a reference dense segmentation on varied datasets indicates that we can achieve a better topic focus while retrieving all of the important aspects of a text.
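As a rough illustration of what searching for bursts of word re-occurrences can mean, the Python sketch below flags stretches of a token sequence in which a word recurs at short intervals. It is a simplified gap-threshold heuristic under assumed parameters, not the algorithm proposed in the paper; the function name `find_bursts` and the parameters `max_gap` and `min_count` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def find_bursts(tokens, max_gap=3, min_count=3):
    """Return {word: [(start, end), ...]} for each run in which consecutive
    occurrences of a word are at most `max_gap` tokens apart and the run
    holds at least `min_count` occurrences.

    Illustrative heuristic only (assumed thresholds), not the paper's method.
    """
    # Record the positions at which each word occurs.
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    bursts = {}
    for word, pos in positions.items():
        runs = []
        start = prev = pos[0]
        count = 1
        for p in pos[1:]:
            if p - prev <= max_gap:
                # Still inside the current run of closely spaced occurrences.
                count += 1
            else:
                # Gap too large: close the current run and open a new one.
                if count >= min_count:
                    runs.append((start, prev))
                start, count = p, 1
            prev = p
        # Close the last run.
        if count >= min_count:
            runs.append((start, prev))
        if runs:
            bursts[word] = runs
    return bursts


tokens = "a x a y a b b z b q r s t u v a".split()
bursts = find_bursts(tokens)
```

In this toy sequence, the word `a` is frequent overall, but only its first three occurrences are close enough together to form a burst; the late, isolated occurrence is left out. This is the kind of locality that a global count of re-occurrences within a segment cannot capture.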

Cited literature: 15 references

Contributor: Guillaume Gravier
Submitted on: Monday, August 24, 2015 - 10:30:40 PM
Last modification on: Thursday, January 20, 2022 - 5:33:11 PM
Long-term archiving on: Wednesday, November 25, 2015 - 7:13:22 PM


  • HAL Id: hal-01186443, version 1


Anca Simon, Pascale Sébillot, Guillaume Gravier. Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis. Recent Advances on Natural Language Processing, 2015, Hissar, Bulgaria. ⟨hal-01186443⟩


