Filterbank coefficients selection for segmentation in singer turns

Audio segmentation is often the first step of an audio indexing system. It produces segments that are expected to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare two types of acoustic features for this task: filterbank coefficients (FBANK) and Mel-frequency cepstral coefficients (MFCC). FBANK features outperformed MFCC on a "clean" singing corpus. We describe a coefficient selection method that yielded a further improvement on this corpus. A 75.8% F-measure was obtained with FBANK features selected with this method, corresponding to a 30.6% absolute gain compared to MFCC. On another corpus composed of ethno-musicological recordings, both feature types showed a similar performance of about 60%. This corpus is more difficult because instruments overlap with the singing and because the recording quality is lower.
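As an illustration of how an F-measure can be computed for a segmentation task of this kind, the following is a minimal sketch in Python. It is not the evaluation protocol of the paper: the function name `boundary_f_measure`, the 0.5 s tolerance, and the one-to-one matching of detected and reference boundaries are all assumptions made for this example.

```python
def boundary_f_measure(reference, hypothesis, tolerance=0.5):
    """Score detected boundaries against reference boundaries (times in seconds).

    Assumption for this sketch: a detected boundary counts as correct when it
    lies within `tolerance` seconds of a reference boundary, and each
    reference boundary can be matched at most once.
    Returns (f_measure, precision, recall).
    """
    ref = sorted(reference)
    hyp = sorted(hypothesis)

    matched = 0
    i = j = 0
    # Walk both sorted lists in parallel, pairing the closest candidates.
    while i < len(ref) and j < len(hyp):
        if abs(ref[i] - hyp[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif hyp[j] < ref[i]:
            j += 1  # detected boundary with no reference nearby (false alarm)
        else:
            i += 1  # reference boundary that was not detected (miss)

    precision = matched / len(hyp) if hyp else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure, precision, recall


# Hypothetical singer-turn boundaries, in seconds.
reference_turns = [10.0, 20.0, 30.0, 40.0]
detected_turns = [10.2, 20.3, 35.0]
f, p, r = boundary_f_measure(reference_turns, detected_turns)
print(f"F-measure={f:.3f} precision={p:.3f} recall={r:.3f}")
```

If the 30.6% absolute gain is read as a difference in percentage points, the figures in the abstract imply an F-measure of roughly 45.2% for MFCC on the clean corpus.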

Keywords

Niobium Frequency modulation Hafnium

Domains

Image synthesis and virtual reality [cs.GR]; Signal and image processing [eess.SP]; Image processing [eess.IV]; Computer vision and pattern recognition [cs.CV]; Artificial intelligence [cs.AI]

Main file

thlithi_17205.pdf (359.11 KB)

Origin: Files produced by the author(s)

Contributor: Open Archive Toulouse Archive Ouverte (OATAO)

https://hal.science/hal-01447347

Submitted on: Thursday, January 26, 2017, 17:50:36

Last modified on: Monday, November 20, 2023, 11:44:21

Long-term archiving on: Thursday, April 27, 2017, 14:58:29

Dates and versions

hal-01447347, version 1 (26-01-2017)

Identifiers

HAL Id: hal-01447347, version 1
OATAO : 17205

Cite

Marwa Thlithi, Julien Pinquier, Thomas Pellegrini, Régine André-Obrecht. Filterbank coefficients selection for segmentation in singer turns. 14th International Workshop on Content-Based Multimedia Indexing (CBMI 2016), Jun 2016, Bucharest, Romania. pp. 1-6. ⟨hal-01447347⟩


Collections

UNIV-TLSE2 CNRS UNIV-LEMANS SMS UT1-CAPITOLE LIUM LIUM-LST IRIT IRIT-SAMOVA IRIT-SI IRIT-UT3 TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP

177 views

130 downloads