The stochastic topic block model for the clustering of vertices in networks with textual edges

Abstract: Due to the significant increase in communication between individuals via social media (Facebook, Twitter) or electronic formats (email, web, co-authorship) over the past two decades, network analysis has become an unavoidable discipline. Many random graph models have been proposed to extract information from networks based on person-to-person links only, without taking into account information on the contents. In this paper, we develop the stochastic topic block model (STBM), a probabilistic model for networks with textual edges. We address the problem of discovering meaningful clusters of vertices that are coherent with respect to both the network interactions and the text contents. A classification variational expectation-maximization (C-VEM) algorithm is proposed to perform inference. Simulated data sets are considered in order to assess the proposed approach and highlight its main features. Finally, we demonstrate the effectiveness of our model on two real-world data sets: a communication network and a co-authorship network.
Document type:
Journal article
Statistics and Computing, Springer Verlag (Germany), 2016, <10.1007/s11222-016-9713-7>

https://hal.archives-ouvertes.fr/hal-01299161
Contributor: Pierre Latouche
Submitted on: Thursday, 7 April 2016 - 11:57:47
Last modified on: Wednesday, 8 February 2017 - 14:43:06
Document(s) archived on: Monday, 14 November 2016 - 19:30:20

File

article-STBM.pdf
Files produced by the author(s)

Citation

Charles Bouveyron, P Latouche, R Zreik. The stochastic topic block model for the clustering of vertices in networks with textual edges. Statistics and Computing, Springer Verlag (Germany), 2016, <10.1007/s11222-016-9713-7>. <hal-01299161>
