Preprint, working paper. Year: 2019

Distributed sparse BSS for large-scale datasets

Abstract

Blind Source Separation (BSS) [1] is widely used to analyze multichannel data stemming from fields as diverse as astrophysics and medicine. However, existing methods do not efficiently handle very large datasets. In this work, we propose a new method coined DGMCA (Distributed Generalized Morphological Component Analysis), in which the original BSS problem is decomposed into subproblems that can be tackled in parallel, alleviating the large-scale issue. We propose to use the RCM (Riemannian Center of Mass) [6][7] to aggregate, during the iterative process, the estimates yielded by the different subproblems. The approach is made robust both by a careful choice of the RCM weights and by the adaptation of the heuristic parameter-choice strategy proposed in [4] to the parallel framework. The results obtained show that the proposed approach can handle large-scale problems with a linear acceleration, performing at the same level as GMCA while maintaining an automatic choice of parameters.

I. LARGE-SCALE BLIND SOURCE SEPARATION

Given m row observations of size t stacked in a matrix Y assumed to follow the linear model Y = AS + N, the objective of BSS [1] is to estimate the matrices A (size m × n) and S (size n × t), up to a mere permutation and scaling indeterminacy. In this model, A mixes the n row sources in S, the observations being corrupted by some unknown noise N (size m × t). We will assume that n ≤ m. While ill-posed, this problem can be regularized by assuming the sparsity of S [2]. The estimation then turns into the minimization of:

$$\hat{A}, \hat{S} = \underset{A,S}{\mathrm{arg\,min}}\ \frac{1}{2}\,\|Y - AS\|_F^2 + \|\Lambda \odot S\|_1 + i_{\{X \,:\, \|X^k\|_2 = 1,\ \forall k\}}(A), \qquad (1)$$

with $\|\cdot\|_F$ the Frobenius norm, $\Lambda$ the regularization parameters and $i_C(\cdot)$ the indicator function of the set $C$. The first term is a data-fidelity term, the second enforces the sparsity of S, and the last avoids degenerate solutions with $\|A\|_F^2 \to 0$ by enforcing unit-norm columns. To tackle Eq. (1), the GMCA algorithm [4] has met with tremendous success due to an automatic decreasing-parameter strategy that makes it robust. However, in this work we will assume that the data Y are large-scale in the sense that t can take huge values (e.g., up to 10^9 samples), which makes the treatment of Y as a whole intractable. In this context, using GMCA or most other algorithms is prohibitive.

II. PROPOSED METHOD

This difficulty motivates the construction of J subproblems (j) of the type $Y_j = A S_j + N_j$, where j denotes a subset of $t_j$ columns of the corresponding matrices. We use disjoint sets with $\sum_j t_j = t$. The two sketches below illustrate, first, the sparse alternating update behind Eq. (1) and, second, the distributed splitting and RCM aggregation.
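To make Eq. (1) concrete, here is a minimal sketch of one GMCA-style alternating update: a least-squares estimate of S followed by soft thresholding (the proximal operator of the l1 term), then a least-squares estimate of A projected onto unit-norm columns (the indicator term). This is an illustrative toy, not the authors' GMCA implementation; the fixed threshold `lam` stands in for the automatic decreasing strategy of [4].

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the l1 norm: entrywise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def gmca_step(Y, A, lam):
    """One alternating update for Eq. (1): sparse S given A, then unit-column A given S."""
    # Least-squares estimate of S, followed by sparsity-promoting thresholding.
    S = soft_threshold(np.linalg.pinv(A) @ Y, lam)
    # Least-squares estimate of A, then projection onto unit-norm columns
    # (the indicator term of Eq. (1), avoiding degenerate scalings).
    A = Y @ np.linalg.pinv(S)
    A /= np.maximum(np.linalg.norm(A, axis=0, keepdims=True), 1e-12)
    return A, S
```

In practice GMCA iterates such updates while decreasing the thresholds; the sketch only shows the shape of one iteration.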
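Building on the previous sketch, the following illustrates the distributed scheme of Section II under simplifying assumptions: Y is split column-wise into J disjoint blocks (e.g. `Y_blocks = np.array_split(Y, J, axis=1)`, so that the block sizes sum to t), each block is updated independently (here reusing `gmca_step` above, a step that is trivially parallelizable), and the J estimates of each column of A are fused by a weighted Riemannian center of mass on the unit sphere, computed with a plain Karcher-mean fixed point. The uniform weights and the function names are illustrative choices; the robust weighting of the actual DGMCA method is not reproduced here, and consistent column signs/permutations across blocks are assumed.

```python
import numpy as np

def sphere_mean(points, weights, n_iter=20):
    """Weighted Riemannian center of mass of unit vectors (columns of `points`)."""
    mu = points @ weights
    mu /= np.linalg.norm(mu)
    for _ in range(n_iter):
        # Log map at mu: project each point onto the tangent space, scaled by geodesic distance.
        dots = np.clip(mu @ points, -1.0, 1.0)
        theta = np.arccos(dots)                       # geodesic distances to mu
        v = points - np.outer(mu, dots)               # tangent-space directions
        norms = np.maximum(np.linalg.norm(v, axis=0), 1e-12)
        w = (v / norms * theta) @ weights             # weighted mean in the tangent space
        t = np.linalg.norm(w)
        if t < 1e-12:                                 # converged: mean update is negligible
            break
        mu = np.cos(t) * mu + np.sin(t) * w / t       # exp map back onto the sphere
    return mu

def dgmca_like_step(Y_blocks, A, lam):
    """One distributed iteration: independent per-block updates, then RCM aggregation of A."""
    estimates = [gmca_step(Yj, A.copy(), lam)[0] for Yj in Y_blocks]  # parallelizable loop
    J = len(estimates)
    weights = np.full(J, 1.0 / J)  # uniform weights; the paper uses a robust weighting
    # Aggregate each column of A across the J subproblem estimates on the unit sphere.
    A_new = np.column_stack([
        sphere_mean(np.column_stack([Aj[:, k] for Aj in estimates]), weights)
        for k in range(A.shape[1])
    ])
    return A_new
```

Aggregating on the sphere rather than averaging in Euclidean space respects the unit-norm constraint on the columns of A imposed by Eq. (1), which is why the RCM is the natural fusion operator here.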

Dates and versions

hal-02088466 , version 1 (18-04-2019)

Identifiers

  • HAL Id : hal-02088466 , version 1

Cite

Tobias I Liaudat, Jerome Bobin, Christophe Kervazo. Distributed sparse BSS for large-scale datasets. 2019. ⟨hal-02088466⟩

Collections

CEA CEA-DRF
