Multichannel audio source separation with deep neural networks

Aditya Arie Nugraha; Antoine Liutkus; Emmanuel Vincent

Report (Research Report) Year: 2015

Aditya Arie Nugraha

Role: Author
PersonId: 967049

Speech Modeling for Facilitating Oral-Based Communication

Antoine Liutkus

Role: Author
PersonId: 2740
IdHAL: antoine-liutkus
ORCID: 0000-0002-3458-6498
IdRef: 167600419

Speech Modeling for Facilitating Oral-Based Communication

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech Modeling for Facilitating Oral-Based Communication

Abstract

This technical report addresses multichannel audio source separation. A few studies have addressed single-channel audio source separation with deep neural networks (DNNs). We introduce a new framework for multichannel source separation in which (1) spectral and spatial parameters are updated iteratively, in a manner similar to the expectation-maximization (EM) algorithm, and (2) DNNs are used in the spectral updates. We evaluated several systems based on the proposed framework by participating in the "professionally-produced music recording" task of SiSEC 2015. Experimental results show that the framework performed well in separating singing voice and other instruments from a mixture containing multiple musical instruments.
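The alternation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything below is an assumption: the sketch uses the local Gaussian model that is common in EM-based multichannel separation (each source is described by a time-frequency power spectrum v and a per-frequency spatial covariance R), and the names separate, wiener_filter and spectral_model are hypothetical. In particular, spectral_model is only a stand-in for the trained DNNs, whose architecture and training are not specified here.

```python
import numpy as np


def wiener_filter(x, v, R, eps):
    """Multichannel Wiener filtering under an assumed local Gaussian model.

    x: mixture STFT, shape (F, N, I); v: source power spectra, shape (J, F, N);
    R: source spatial covariances, shape (J, F, I, I).
    """
    eye = np.eye(x.shape[-1])
    sigma_c = v[..., None, None] * R[:, :, None, :, :]   # (J, F, N, I, I)
    sigma_x = sigma_c.sum(axis=0) + eps * eye            # (F, N, I, I)
    W = sigma_c @ np.linalg.inv(sigma_x)                 # Wiener gains
    c = (W @ x[None, ..., None])[..., 0]                 # (J, F, N, I)
    return c, W, sigma_c


def separate(x, v_init, spectral_model, n_iter=3, eps=1e-8):
    """EM-like alternation of spatial and spectral updates (illustrative only).

    spectral_model is a hypothetical callable mapping (J, F, N) power spectra
    to refined (J, F, N) power spectra; it plays the role of the DNNs.
    """
    F, N, I = x.shape
    J = v_init.shape[0]
    eye = np.eye(I)
    v = np.maximum(v_init, eps)
    R = np.tile(np.eye(I, dtype=complex), (J, F, 1, 1))  # start from identity

    for _ in range(n_iter):
        # E-step-like: posterior second-order moments of the source images.
        c, W, sigma_c = wiener_filter(x, v, R, eps)
        Rc = c[..., :, None] * c[..., None, :].conj() + (eye - W) @ sigma_c

        # Spatial update: average the moments normalized by the power spectra.
        R = (Rc / v[..., None, None]).mean(axis=2)
        R = 0.5 * (R + R.conj().swapaxes(-1, -2)) + eps * eye  # keep Hermitian

        # Spectral update: closed-form estimate, then refinement by the model.
        z = np.einsum("jfab,jfnba->jfn", np.linalg.inv(R), Rc).real / I
        v = np.maximum(spectral_model(np.maximum(z, eps)), eps)

    # Final filtering with the last parameter estimates.
    c, _, _ = wiener_filter(x, v, R, eps)
    return c, v, R
```

As a usage note, calling separate(x, v_init, spectral_model=lambda z: z) runs the alternation with an identity function in place of the DNNs, which reduces the loop to a plain EM-style iteration. In a real system the initial spectra and the refinement step would come from trained networks.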

Keywords

expectation-maximization (EM) algorithm; SiSEC; source separation; deep neural networks

Domains

Signal and image processing [eess.SP]

Main file

RR-8740.pdf (798.26 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01163369

Submitted on: Thursday, July 16, 2015, 17:56:19

Last modified on: Monday, September 11, 2023, 17:41:19

Long-term archiving on: Wednesday, April 26, 2017, 06:50:17

Dates and versions

hal-01163369, version 1 (12-06-2015)

hal-01163369, version 2 (16-07-2015)

hal-01163369, version 3 (05-02-2016)

hal-01163369, version 4 (12-05-2016)

hal-01163369, version 5 (21-06-2016)

Identifiers

HAL Id: hal-01163369, version 2

Cite

Aditya Arie Nugraha, Antoine Liutkus, Emmanuel Vincent. Multichannel audio source separation with deep neural networks. [Research Report] RR-8740, INRIA. 2015. ⟨hal-01163369v2⟩

2145 views

10565 downloads
