Singing voice detection with deep recurrent neural networks

Simon Leglaive 1, 2; Romain Hennequin 1; Roland Badeau 2
2 Télécom ParisTech - TSI
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract : In this paper, we propose a new method for singing voice detection based on a Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Network (RNN). This classifier takes both past and future temporal context into account when deciding on the presence or absence of singing voice, thereby exploiting the inherently sequential nature of short-term features extracted from a piece of music. Because the BLSTM-RNN contains several hidden layers, it can extract from low-level features a simple representation suited to the task. The results we obtain significantly outperform state-of-the-art methods on a common database.
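To illustrate how a bidirectional recurrent classifier can use both past and future context for a frame-level decision, here is a minimal sketch in pure Python. It is not the authors' implementation: it uses a single BLSTM layer rather than several hidden layers, the weights are random and untrained, and the names `make_lstm`, `lstm_pass` and `blstm_voice_probs`, the feature dimension and the hidden size are all hypothetical choices made for this example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lstm(n_in, n_hid, rng):
    """Random (untrained) weights for one LSTM direction.

    Four gates, each an n_hid x (n_in + n_hid + 1) matrix; the last
    column multiplies a constant 1.0 and acts as the bias.
    """
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hid + 1)]
             for _ in range(n_hid)] for _ in range(4)]

def lstm_pass(frames, weights, n_hid):
    """Run one LSTM direction over a list of feature vectors."""
    h = [0.0] * n_hid  # hidden state
    c = [0.0] * n_hid  # cell state
    states = []
    for x in frames:
        z = list(x) + h + [1.0]  # input, previous hidden state, bias term
        pre = [[sum(w * v for w, v in zip(row, z)) for row in gate]
               for gate in weights]
        i = [sigmoid(a) for a in pre[0]]    # input gate
        f = [sigmoid(a) for a in pre[1]]    # forget gate
        o = [sigmoid(a) for a in pre[2]]    # output gate
        g = [math.tanh(a) for a in pre[3]]  # candidate cell update
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        states.append(h)
    return states

def blstm_voice_probs(frames, fwd, bwd, out_w, n_hid):
    """Per-frame probability of singing voice from a one-layer BLSTM."""
    forward = lstm_pass(frames, fwd, n_hid)               # past context
    backward = lstm_pass(frames[::-1], bwd, n_hid)[::-1]  # future context
    probs = []
    for hf, hb in zip(forward, backward):
        z = hf + hb + [1.0]  # both directions concatenated, plus bias
        probs.append(sigmoid(sum(w * v for w, v in zip(out_w, z))))
    return probs

# Toy data: 6 frames of 3 made-up features each (assumed sizes).
rng = random.Random(0)
N_IN, N_HID = 3, 4
fwd = make_lstm(N_IN, N_HID, rng)
bwd = make_lstm(N_IN, N_HID, rng)
out_w = [rng.uniform(-0.5, 0.5) for _ in range(2 * N_HID + 1)]
frames = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(6)]
probs = blstm_voice_probs(frames, fwd, bwd, out_w, N_HID)

# Changing only the last frame also changes the decision for the first
# frame, because the backward pass carries future context to it.
changed = [row[:] for row in frames]
changed[-1] = [5.0, -5.0, 5.0]
probs_changed = blstm_voice_probs(changed, fwd, bwd, out_w, N_HID)
print(probs[0], probs_changed[0])
```

In a forward-only RNN the first frame's output would be unaffected by the edit to the last frame; the backward pass is what lets every frame's decision depend on what follows it.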
Document type :
Conference papers
Cited literature : 22 references
Contributor : Roland Badeau
Submitted on : Monday, April 27, 2015 - 2:45:09 PM
Last modification on : Thursday, October 17, 2019 - 12:36:51 PM
Long-term archiving on : Wednesday, April 19, 2017 - 7:32:13 AM

  • HAL Id : hal-01110035, version 1


Simon Leglaive, Romain Hennequin, Roland Badeau. Singing voice detection with deep recurrent neural networks. 40th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia. pp.121-125. ⟨hal-01110035⟩