Singing voice detection with deep recurrent neural networks

Simon Leglaive 1, 2, Romain Hennequin 1, Roland Badeau 2
2 Télécom ParisTech - TSI
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract : In this paper, we propose a new method for singing voice detection based on a Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Network (RNN). This classifier takes both past and future temporal context into account when deciding on the presence or absence of singing voice, thus exploiting the inherently sequential nature of short-term features extracted from a piece of music. The BLSTM-RNN contains several hidden layers, so it can extract from low-level features a simple representation fitted to our task. The results we obtain significantly outperform state-of-the-art methods on a common database.
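To illustrate the idea of combining past and future context described in the abstract, here is a minimal sketch in plain Python. It is not the authors' implementation: it uses a single bidirectional layer rather than several hidden layers, random untrained weights, and made-up feature frames. All names (`init_lstm`, `lstm_pass`, `blstm_detect`) and all sizes are hypothetical and chosen only for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_lstm(n_in, n_hid, rng):
    """Random, untrained weights for one LSTM direction (illustrative only)."""
    # Each gate unit has one weight row over [input frame, previous hidden, bias].
    size = n_in + n_hid + 1
    return {gate: [[rng.uniform(-0.1, 0.1) for _ in range(size)]
                   for _ in range(n_hid)]
            for gate in ("i", "f", "o", "g")}

def lstm_pass(frames, params, n_hid, reverse=False):
    """Run an LSTM over the frames; return one hidden vector per frame."""
    h = [0.0] * n_hid  # hidden state
    c = [0.0] * n_hid  # cell state
    n = len(frames)
    order = range(n - 1, -1, -1) if reverse else range(n)
    outputs = [None] * n
    for t in order:
        z = list(frames[t]) + h + [1.0]  # input, recurrent input, bias term

        def pre(gate, k):
            return sum(w * v for w, v in zip(params[gate][k], z))

        new_h, new_c = [], []
        for k in range(n_hid):
            i = sigmoid(pre("i", k))    # input gate
            f = sigmoid(pre("f", k))    # forget gate
            o = sigmoid(pre("o", k))    # output gate
            g = math.tanh(pre("g", k))  # candidate cell value
            ck = f * c[k] + i * g
            new_c.append(ck)
            new_h.append(o * math.tanh(ck))
        h, c = new_h, new_c
        outputs[t] = h
    return outputs

def blstm_detect(frames, fwd, bwd, out_w, n_hid):
    """Per-frame probability of singing voice from past and future context."""
    h_past = lstm_pass(frames, fwd, n_hid)                  # reads frames 0..t
    h_future = lstm_pass(frames, bwd, n_hid, reverse=True)  # reads frames T-1..t
    probs = []
    for a, b in zip(h_past, h_future):
        z = a + b + [1.0]  # concatenate both directions, plus bias term
        probs.append(sigmoid(sum(w * v for w, v in zip(out_w, z))))
    return probs

# Hypothetical sizes: 4 features per frame, 3 hidden units, 6 frames.
rng = random.Random(0)
n_in, n_hid = 4, 3
fwd = init_lstm(n_in, n_hid, rng)
bwd = init_lstm(n_in, n_hid, rng)
out_w = [rng.uniform(-0.1, 0.1) for _ in range(2 * n_hid + 1)]
frames = [[rng.random() for _ in range(n_in)] for _ in range(6)]
probs = blstm_detect(frames, fwd, bwd, out_w, n_hid)
```

Each entry of `probs` is a value between 0 and 1 for one frame. Thresholding it (for example at 0.5, an arbitrary choice here) would give a presence/absence decision. Because the backward pass reads the sequence from its end, changing a later frame can change the output for an earlier one, which a forward-only network cannot do.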
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01110035
Contributor : Roland Badeau
Submitted on : Monday, April 27, 2015 - 2:45:09 PM
Last modification on : Thursday, October 17, 2019 - 12:36:51 PM
Long-term archiving on : Wednesday, April 19, 2017 - 7:32:13 AM

File

Leglaive-ICASSP-2015.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01110035, version 1

Citation

Simon Leglaive, Romain Hennequin, Roland Badeau. Singing voice detection with deep recurrent neural networks. 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia. pp.121-125. ⟨hal-01110035⟩

Metrics

  • Record views : 972
  • File downloads : 2935