An interactive audio source separation framework based on non-negative matrix factorization

Ngoc Q. K. Duong; Alexey Ozerov; Louis Chevallier; Joel Sirot

Conference paper, Year: 2014

Ngoc Q. K. Duong

Role: Author
PersonId: 946470

Technicolor R & I [Cesson Sévigné]

Alexey Ozerov

Role: Author
PersonId: 930358

Technicolor R & I [Cesson Sévigné]

Louis Chevallier

Role: Author

Technicolor R & I [Cesson Sévigné]

Joel Sirot

Role: Author

Technicolor R & I [Cesson Sévigné]

Abstract

Though audio source separation offers a wide range of applications in audio enhancement and post-production, its performance remains unsatisfactory, especially for single-channel mixtures with limited training data. In this paper we present a novel interactive source separation framework that allows end-users to provide feedback at each separation step so as to gradually improve the result. For this purpose, a prototype graphical user interface (GUI) is developed to help users annotate time-frequency regions of the displayed spectrogram where a source can be labeled as active, inactive, or well-separated. This user feedback, which is partially new with respect to state-of-the-art annotations, is then taken into account in a proposed uncertainty-based learning algorithm to constrain the source estimates in the next separation step. The considered framework is based on non-negative matrix factorization and is shown to be effective even without using any isolated training data.
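As a rough illustration of how an NMF-based separator can take user annotations into account, the Python sketch below factorizes a non-negative spectrogram and applies only the simplest of the three annotation types, "inactive". It is not the uncertainty-based learning algorithm proposed in the paper, whose details are not given in this abstract. The function name `nmf_separate`, its parameters, the choice of Kullback-Leibler multiplicative updates, and the frame-level (rather than time-frequency) inactivity mask are all assumptions made for illustration.

```python
import numpy as np

def nmf_separate(V, n_sources=2, comps_per_source=4, inactive=None,
                 n_iter=200, eps=1e-9, seed=0):
    """Illustrative NMF separation with frame-level 'inactive' annotations.

    V        : non-negative spectrogram, shape (F, T).
    inactive : optional bool array, shape (n_sources, T); True where the
               user marked a source as silent (hypothetical annotation format).
    Returns the dictionary W, the activations H, and one estimate per source.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    K = n_sources * comps_per_source
    W = rng.random((F, K)) + eps
    H = rng.random((K, T)) + eps
    # owner[k] is the index of the source that component k belongs to.
    owner = np.repeat(np.arange(n_sources), comps_per_source)

    if inactive is not None:
        # Zeroed activations remain zero under multiplicative updates,
        # so the annotation acts as a hard constraint.
        H[inactive[owner]] = 0.0

    for _ in range(n_iter):
        # Multiplicative updates for the KL divergence.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)

    # Soft (Wiener-like) masking: share each bin of V among the sources
    # in proportion to their modeled contributions.
    WH = W @ H + eps
    estimates = [V * ((W[:, owner == j] @ H[owner == j]) / WH)
                 for j in range(n_sources)]
    return W, H, estimates
```

In an interactive loop of the kind the abstract describes, the user would inspect the estimates, add or refine annotations, and rerun the factorization. This sketch restarts from a random initialization on each call and treats annotations as hard constraints, which is a simplification.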

Keywords

Interactive audio source separation, nonnegative matrix factorization, uncertainty-based learning, time-frequency annotation, user feedback

Domains

Signal and Image Processing [eess.SP]

Main file

icassp2014_revised.pdf (326.71 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00960717

Submitted on: Tuesday, March 18, 2014, 15:57:24

Last modified on: Tuesday, March 18, 2014, 16:01:48

Long-term archiving on: Wednesday, June 18, 2014, 13:25:58

Dates and versions

hal-00960717 , version 1 (18-03-2014)

Identifiers

HAL Id : hal-00960717 , version 1

Cite

Ngoc Q. K. Duong, Alexey Ozerov, Louis Chevallier, Joel Sirot. An interactive audio source separation framework based on non-negative matrix factorization. IEEE International Conference on Acoustics Speech and Signal Processing, May 2014, Florence, Italy. ⟨hal-00960717⟩
