Conference papers

Scalable sparse tensor decompositions in distributed memory systems

Oguz Kaya, Bora Uçar
Abstract : We investigate an efficient parallelization of the most common iterative sparse tensor decomposition algorithms on distributed memory systems. A key operation in each iteration of these algorithms is the matricized tensor times Khatri-Rao product (MTTKRP). This operation amounts to element-wise vector multiplications and reductions that depend on the sparsity of the tensor. We investigate a fine-grain and a coarse-grain task definition for this operation, and propose hypergraph partitioning-based methods for these task definitions to achieve load balance as well as to reduce communication requirements. We also design a distributed memory sparse tensor library, HyperTensor, which implements a well-known algorithm for the CANDECOMP-PARAFAC (CP) tensor decomposition using the task definitions and the associated partitioning methods. We use this library to test the proposed implementation of MTTKRP in the context of CP decomposition, and report scalability results for up to 1024 MPI ranks. We demonstrate speedups of up to 194-fold using 512 MPI processes on a well-known real-world dataset, and significantly better performance than a state-of-the-art implementation.
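As a rough illustration of why MTTKRP reduces to element-wise vector multiplication and reduction, the following is a minimal sequential sketch for a 3-way sparse tensor stored as a list of nonzeros. It is not taken from HyperTensor: the function name `mttkrp_mode0`, the coordinate-list storage, and the plain Python lists used for the factor matrices are all assumptions made for this example, and no distribution or partitioning is shown.

```python
# Hypothetical sketch (not HyperTensor's API): mode-0 MTTKRP for a 3-way
# sparse tensor given as coordinates (i, j, k) with matching values.

def mttkrp_mode0(coords, vals, B, C, num_rows):
    """Return M with M[i][r] = sum over nonzeros (i, j, k) of x * B[j][r] * C[k][r]."""
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), x in zip(coords, vals):
        row_b, row_c, out = B[j], C[k], M[i]
        for r in range(rank):
            # Element-wise product of two factor-matrix rows, scaled by the
            # nonzero value, then reduced (accumulated) into output row i.
            out[r] += x * row_b[r] * row_c[r]
    return M


# Tiny example: a 2 x 2 x 2 tensor with three nonzeros and rank-2 factors.
coords = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]
vals = [2.0, 3.0, 4.0]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
M = mttkrp_mode0(coords, vals, B, C, num_rows=2)
print(M)
```

Only the nonzeros are visited, so the work and the set of factor-matrix rows touched follow the sparsity pattern of the tensor. In a parallel setting, how the iterations of the outer loop are grouped into tasks determines both the load balance and which rows must be communicated.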
Contributor : Equipe Roma
Submitted on : Monday, December 14, 2015 - 3:51:21 PM
Last modification on : Monday, June 14, 2021 - 10:25:00 AM
Long-term archiving on : Saturday, April 29, 2017 - 11:48:45 AM


Oguz Kaya, Bora Uçar. Scalable sparse tensor decompositions in distributed memory systems. International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Nov 2015, Austin, TX, United States. ⟨10.1145/2807591.2807624⟩. ⟨hal-01148202v2⟩


