FOLDED CQT RCNN FOR REAL-TIME RECOGNITION OF INSTRUMENT PLAYING TECHNIQUES - HAL open archive
Conference paper, Year: 2019

Abstract

In recent years, deep learning has produced state-of-the-art performance in timbre and instrument classification. However, only a few models currently deal with the recognition of advanced Instrument Playing Techniques (IPTs), and none of them addresses the problem in real time. Furthermore, most studies rely on a single sound bank for both training and testing, so their methodology provides no assurance that the results generalize to other sounds. In this article, we extend state-of-the-art convolutional neural networks to the classification of IPTs. We build the first IPT corpus from independent sound banks, annotate it with the JAMS standard and make it freely available. Our models yield consistently high accuracies on a homogeneous subset of this corpus. However, only a proper taxonomy of IPTs and specifically defined input transforms offer adequate resilience under the "minus-1db" methodology, which assesses the ability of the models to generalize. In particular, we introduce a novel Folded Constant Q-Transform adjusted to the requirements of IPT classification. Finally, we discuss the use of our classifier in real time.
Main file
000086.pdf (786.49 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02472560, version 1 (10-02-2020)

Identifiers

  • HAL Id: hal-02472560, version 1

Cite

Jean-Francois Ducher, Philippe Esling. FOLDED CQT RCNN FOR REAL-TIME RECOGNITION OF INSTRUMENT PLAYING TECHNIQUES. International Society for Music Information Retrieval, Nov 2019, Delft, Netherlands. ⟨hal-02472560⟩
135 views
148 downloads
