A two level strategy for audio segmentation

Sébastien Lefèvre; Nicole Vincent

doi:10.1016/j.dsp.2010.07.003

Journal article. Digital Signal Processing. Year: 2011


Sébastien Lefèvre

Role: Author
PersonId : 3219
IdHAL : sebastien-lefevre
ORCID : 0000-0002-2384-8202
IdRef : 113437773

Laboratoire des Sciences de l'Image, de l'Informatique et de la Télédétection

Nicole Vincent

Role: Author
PersonId : 835856

Centre de Recherche en Informatique de Paris 5

Abstract

This paper addresses audio segmentation. Audio tracks are split into short sequences, each of which is assigned to one of several classes; every sequence can then be analyzed further depending on the class it belongs to. We first describe simple techniques for segmentation into two or three classes. These methods rely on amplitude, spectral, or cepstral analysis, and on classical Hidden Markov Models. Motivated by the limitations of these approaches, we propose a two-level segmentation process. Segmentation is performed by computing several features for each audio sequence, either on a complete audio segment or on a frame (a set of samples) that is a subset of the audio segment. The proposed approach for microsegmentation of audio data combines a K-means classifier at the segment level with a Multidimensional Hidden Markov Model system that uses the frame decomposition of the signal. A first classification is obtained using the K-means classifier and segment-based features. The final result then comes from Multidimensional Hidden Markov Models and frame-based features, together with these intermediate results. Multidimensional Hidden Markov Models are an extension of classical Hidden Markov Models dedicated to multicomponent data. They are particularly well suited to our case, where each audio segment can be characterized by several features of different natures. We illustrate our methods on the analysis of football audio tracks.
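
The first level described in the abstract (a K-means classification over segment-based features) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the features (RMS energy and zero-crossing rate), the frame length, the synthetic data, and all function names are hypothetical choices made for the example. The second level (Multidimensional Hidden Markov Models) is not attempted here; `frame_features` only shows the kind of frame decomposition such a model could consume.

```python
import numpy as np

def split_frames(segment, frame_len):
    """Cut a 1-D segment into non-overlapping frames (trailing samples dropped)."""
    n = len(segment) // frame_len
    return segment[: n * frame_len].reshape(n, frame_len)

def frame_features(segment, frame_len=256):
    """Per-frame features. RMS and zero-crossing rate are assumed, not from the paper."""
    frames = split_frames(segment, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    zcr = np.mean(np.diff(np.sign(frames), axis=1) != 0, axis=1)
    return np.column_stack([rms, zcr])

def segment_features(segment, frame_len=256):
    """Segment-level features: mean and standard deviation of the frame features."""
    f = frame_features(segment, frame_len)
    return np.concatenate([f.mean(axis=0), f.std(axis=0)])

def kmeans(X, k, n_iter=20):
    """Plain K-means with a deterministic farthest-first initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Final assignment against the last set of centers.
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), centers

# Synthetic stand-in for an audio track: 10 quiet segments, then 10 loud ones.
rng = np.random.default_rng(0)
track = [0.01 * rng.standard_normal(2048) for _ in range(10)]
track += [1.0 * rng.standard_normal(2048) for _ in range(10)]

X = np.array([segment_features(s) for s in track])
labels, centers = kmeans(X, k=2)
```

On this toy data the two groups of segments fall into separate clusters. Real audio would need features chosen for the classes of interest, and the cluster labels would only serve as the intermediate result passed on to the frame-level model.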

Domains

Image processing [eess.IV]

Main file

dsp.pdf (219.61 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-00512744

Submitted on: Tuesday, 31 August 2010, 14:49:06

Last modified on: Friday, 24 March 2023, 14:52:53

Long-term archiving on: Tuesday, 23 October 2012, 15:20:39

Dates and versions

hal-00512744 , version 1 (31-08-2010)

Identifiers

HAL Id : hal-00512744 , version 1
DOI : 10.1016/j.dsp.2010.07.003

Cite

Sébastien Lefèvre, Nicole Vincent. A two level strategy for audio segmentation. Digital Signal Processing, 2011, 21 (2), pp.270-277. ⟨10.1016/j.dsp.2010.07.003⟩. ⟨hal-00512744⟩


Collections

CNRS

