Detection of nonlinguistic vocalizations using ALISP sequencing

In this paper, we present a generic methodology to detect nonlinguistic vocalizations using ALISP (Automatic Language Independent Speech Processing), a data-driven audio segmentation approach. The proposed method adapts ALISP models using Maximum Likelihood Linear Regression (MLLR) and Maximum a Posteriori (MAP) techniques; the adapted models then make it possible to detect local regions of nonlinguistic vocalizations with the standard Viterbi decoding algorithm. We also illustrate how a simple majority voting scheme, applied with a sliding window over ALISP sequences, can automatically eliminate outliers from the Viterbi-predicted sequence. We evaluate our method on the detection of laughter, a nonlinguistic vocalization, in comparison with global acoustic models such as GMMs, left-to-right HMMs and ergodic HMMs. The results indicate that adapted ALISP acoustic models perform better than global acoustic models in terms of F-measure. Moreover, our majority voting scheme on ALISP sequences further improves performance, yielding, in total, an increase of 19.6%, 8.1% and 5.6% in F-measure over the global acoustic models (GMMs, left-to-right HMMs and ergodic HMMs, respectively).
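The sliding-window majority vote mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote_smooth`, the window size, the centered window, the tie-breaking rule (keep the current label on ties) and the label strings are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def majority_vote_smooth(labels, window=5):
    """Relabel each position with the majority label of a centered window.

    Hypothetical helper: window shape and tie-breaking are assumptions,
    not details taken from the paper.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, current in enumerate(labels):
        # The window is truncated at the sequence boundaries.
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        top_count = max(counts.values())
        if counts[current] == top_count:
            # On a tie, keep the label the decoder originally produced.
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


# Illustrative decoder output with two isolated outliers (indices 2 and 8).
decoded = ["speech", "speech", "laugh", "speech", "speech",
           "laugh", "laugh", "laugh", "speech", "laugh", "laugh"]
smoothed = majority_vote_smooth(decoded, window=3)
```

With a window of 3, the isolated `laugh` at index 2 and the isolated `speech` at index 8 are both replaced by their neighbours' label, leaving one contiguous speech region followed by one contiguous laughter region.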

Keywords

ALISP sequencing; Acoustic models; Audio segmentation; Model adaptation

Domains

Signal and Image Processing [eess.SP]

https://hal.science/hal-01275101

Submitted on: Tuesday, February 16, 2016, 17:10:51

Last modified on: Monday, October 9, 2023, 12:49:39

Dates and versions

hal-01275101, version 1 (16-02-2016)

Identifiers

HAL Id: hal-01275101, version 1
DOI: 10.1109/ICASSP.2013.6639132

Cite

Sathish Pammi, Houssemeddine Khemiri, Dijana Petrovska-Delacrétaz, Gérard Chollet. Detection of nonlinguistic vocalizations using ALISP sequencing. ICASSP 2013: 38th IEEE International Conference on Acoustics, Speech and Signal Processing, May 2013, Vancouver, Canada. pp. 7557-7561, ⟨10.1109/ICASSP.2013.6639132⟩. ⟨hal-01275101⟩

Collections

INSTITUT-TELECOM CNRS TELECOM-SUDPARIS PARISTECH LTCI IDS
