The stochastic topic block model for the clustering of vertices in networks with textual edges - HAL open archive
Journal article, Statistics and Computing, Year: 2016

The stochastic topic block model for the clustering of vertices in networks with textual edges

Abstract

Due to the significant increase in communications between individuals via social media (Facebook, Twitter, LinkedIn) or electronic formats (email, web, e-publication) in the past two decades, network analysis has become an unavoidable discipline. Many random graph models have been proposed to extract information from networks based on person-to-person links only, without taking into account information on the contents. This paper introduces the stochastic topic block model (STBM), a probabilistic model for networks with textual edges. We address here the problem of discovering meaningful clusters of vertices that are coherent with respect to both the network interactions and the text contents. A classification variational expectation-maximization (C-VEM) algorithm is proposed to perform inference. Simulated data sets are considered to assess the proposed approach and to highlight its main features. Finally, we demonstrate the effectiveness of our methodology on two real-world data sets: a directed communication network and an undirected co-authorship network.
Main file
STBM.pdf (4.12 MB)
STBM-supp.pdf (253.26 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01519743, version 1 (09-05-2017)

Cite

Charles Bouveyron, Pierre Latouche, Rawya Zreik. The stochastic topic block model for the clustering of vertices in networks with textual edges. Statistics and Computing, 2016, ⟨10.1007/s11222-016-9713-7⟩. ⟨hal-01519743⟩