Audio source separation in reverberant environments: non-negative matrix factorization and time-domain representation of the convolutive mixture

Simon Leglaive; Roland Badeau; Gael Richard

Conference paper, Year: 2017

Simon Leglaive

Role: Author
PersonId: 20853
IdHAL: simon-leglaive
ORCID: 0000-0002-8219-1298
IdRef: 25312171X

Télécom ParisTech

Roland Badeau

Role: Author
PersonId: 1121
IdHAL: rbadeau
ORCID: 0000-0002-9630-6877
IdRef: 106938134

Télécom ParisTech

Gael Richard

Role: Author
PersonId: 14146
IdHAL: gael-richard
IdRef: 094977208

Télécom ParisTech

Abstract

This paper addresses multichannel audio source separation in under-determined reverberant mixtures. We target a semi-blind scenario in which the mixing filters are assumed to be known. The proposed method works directly with the time-domain mixture signals. Because this approach represents the convolutive mixing process exactly, it is well suited to separating highly reverberant mixtures. The source signals are represented in the modified discrete cosine transform (MDCT) domain with a Gaussian model based on non-negative matrix factorization (NMF). Source inference relies on a variational expectation-maximization algorithm. We show experimentally the benefit of jointly using a time-domain representation of the convolutive mixture and an NMF-based source model.
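The two modelling ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`convolutive_mix`, `nmf_variances`), the array shapes, and the synthetic decaying filters standing in for room impulse responses are all assumptions made for illustration. The MDCT itself and the variational expectation-maximization inference are not implemented here.

```python
import numpy as np


def convolutive_mix(sources, filters):
    """Time-domain convolutive mixture (hypothetical helper).

    sources: array of shape (J, T), one row per source signal.
    filters: array of shape (I, J, L), filter from source j to channel i.
    Returns an array of shape (I, T + L - 1): each channel is the sum
    over sources of the full linear convolution filter * source.
    """
    n_src, n_samples = sources.shape
    n_chan, n_src_f, filt_len = filters.shape
    assert n_src == n_src_f, "filters and sources disagree on source count"
    mix = np.zeros((n_chan, n_samples + filt_len - 1))
    for i in range(n_chan):
        for j in range(n_src):
            mix[i] += np.convolve(filters[i, j], sources[j])
    return mix


def nmf_variances(W, H):
    """Rank-K non-negative model of the variances of transform coefficients.

    W: (F, K) non-negative spectral templates.
    H: (K, N) non-negative temporal activations.
    Under a Gaussian NMF source model, coefficient (f, n) is assumed
    zero-mean Gaussian with variance (W @ H)[f, n].
    """
    return W @ H


rng = np.random.default_rng(0)

# Under-determined setting: more sources (J = 3) than channels (I = 2).
J, I, T, L = 3, 2, 1000, 64
sources = rng.standard_normal((J, T))
# Synthetic exponentially decaying filters, an assumption standing in
# for measured room impulse responses.
filters = rng.standard_normal((I, J, L)) * np.exp(-np.arange(L) / 16.0)
mixture = convolutive_mix(sources, filters)

# Draw coefficients from the Gaussian NMF model (illustrative sizes).
F, N, K = 32, 20, 4
W = rng.random((F, K))
H = rng.random((K, N))
variances = nmf_variances(W, H)
coeffs = rng.standard_normal((F, N)) * np.sqrt(variances)
```

In this sketch the mixture is computed by full linear convolution, with no narrowband or short-filter approximation, which is the property the abstract attributes to the time-domain representation. Separation would then amount to inferring the sources given `mixture` and `filters`, a step this example leaves out.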

Subject areas

Signal and image processing [eess.SP]

Main file

LeglaiveBadeauRichard_final.pdf (293.66 KB)

Origin: Files produced by the author(s)

Contributor: Roland Badeau

https://hal.science/hal-01540481

Submitted on: Thursday, 29 June 2017, 15:58:30

Last modified on: Monday, 9 October 2023, 12:49:42

Long-term archiving on: Thursday, 18 January 2018, 02:28:06

Dates and versions

hal-01540481, version 1 (29-06-2017)

Identifiers

HAL Id: hal-01540481, version 1

Cite

Simon Leglaive, Roland Badeau, Gael Richard. Séparation de sources audio en milieu réverbérant : Factorisation en matrices non-négatives et représentation temporelle du mélange convolutif. Colloque GRETSI, Sep 2017, Juan-Les-Pins, France. ⟨hal-01540481⟩

Collections

INSTITUT-TELECOM PARISTECH UNIV-PARIS-SACLAY LTCI IDS S2A ANR

76 views

156 downloads