Multilayer Network Data Clustering - HAL open archive
Preprint, Working Paper. Year: 2019

Multilayer Network Data Clustering

Abstract

Network data appears in very diverse applications, such as biological, social, or sensor networks. Clustering network nodes into categories or communities has thus become a very common task in machine learning and data mining. Network data comes with some information about the network edges. In some cases, this network information is even given as multiple views or layers, each one representing a different type of relationship between the network nodes. Increasingly often, network nodes also carry a signal or feature vector. In this paper, we propose to extend the node clustering problem, which commonly considers only the network information, to a setting where both the network information and the node features are considered together for the node embedding. Specifically, we design a generic two-step algorithm for multilayer graph data clustering. The first step aggregates the different layers of network information into a graph representation that is the geometric mean of the layer-wise network Laplacian matrices. The second step uses a neural net to learn a non-linear embedding of the network nodes that is consistent with the structure given by the network representation. We propose a novel algorithm for efficiently training the neural net via stochastic gradient descent, where the pairwise distances between the outputs are minimized on the net while keeping these outputs orthogonal. We demonstrate with an extensive set of experiments on synthetic and real datasets that our method leads to a significant improvement over state-of-the-art multilayer graph clustering algorithms, as it judiciously combines node features and network information in the node embedding.
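
The two steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation and rests on several assumptions: it handles only two layers, using the standard closed-form geometric mean of two symmetric positive-definite matrices; it adds a small ridge `eps` to each Laplacian to make it positive definite (a hypothetical choice of regularization); and it replaces the neural net with a plain eigenvector embedding, a linear stand-in that ignores node features but likewise returns orthogonal outputs. The helper names and the toy edge lists are invented for illustration.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix from an undirected edge list."""
    w = np.zeros((n, n))
    for i, j in edges:
        w[i, j] = w[j, i] = 1.0
    return w

def laplacian(adj):
    """Combinatorial Laplacian L = D - W."""
    return np.diag(adj.sum(axis=1)) - adj

def spd_power(mat, power):
    """Real power of a symmetric positive-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * vals**power) @ vecs.T

def geometric_mean(a, b):
    """Two-matrix geometric mean A # B = A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2)."""
    a_half = spd_power(a, 0.5)
    a_inv_half = spd_power(a, -0.5)
    middle = spd_power(a_inv_half @ b @ a_inv_half, 0.5)
    mean = a_half @ middle @ a_half
    return (mean + mean.T) / 2  # remove round-off asymmetry

def spectral_embedding(mat, k):
    """Eigenvectors of the k smallest eigenvalues (orthonormal columns)."""
    _, vecs = np.linalg.eigh(mat)  # eigenvalues come back in ascending order
    return vecs[:, :k]

# Toy two-layer network on 6 nodes (hypothetical edge lists).
n = 6
eps = 0.1  # assumed ridge so that each Laplacian is positive definite
layer1 = adjacency(n, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
layer2 = adjacency(n, [(0, 1), (1, 2), (3, 4), (4, 5), (0, 5)])
lap1 = laplacian(layer1) + eps * np.eye(n)
lap2 = laplacian(layer2) + eps * np.eye(n)

# Step 1: aggregate the layers into a single graph representation.
mean_lap = geometric_mean(lap1, lap2)

# Step 2 (stand-in): embed the nodes with orthonormal coordinates.
embedding = spectral_embedding(mean_lap, 2)
```

The rows of `embedding` could then be passed to any standard clustering routine such as k-means. The eigenvector embedding minimizes the same kind of objective the abstract describes (small pairwise distances along the graph under an orthogonality constraint), but only over linear solutions; the neural net in the paper additionally takes the node features as input.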

Dates and versions

hal-02150018, version 1 (06-06-2019)

Cite

Mireille El Gheche, Giovanni Chierchia, Pascal Frossard. Multilayer Network Data Clustering. 2019. ⟨hal-02150018⟩