A weakly-supervised discriminative model for audio-to-score alignment

Rémi Lajugie (1, 2), Piotr Bojanowski (1, 3), Philippe Cuvillier (4, 5), Sylvain Arlot (6), Francis Bach (1, 2)
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
3 WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique de l'École normale supérieure, Inria de Paris
4 Repmus - Représentations musicales
STMS - Sciences et Technologies de la Musique et du Son
5 MuTant - Synchronous Realtime Processing and Programming of Music Signals
UPMC - Université Pierre et Marie Curie - Paris 6, IRCAM, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
Abstract: In this paper, we consider a new discriminative approach to the problem of audio-to-score alignment. We consider the two distinct kinds of information provided by music scores: (i) an exact ordered list of musical events and (ii) approximate prior information about the relative durations of events. We extend the basic dynamic time warping algorithm to a convex problem that learns optimal classifiers for all events while jointly aligning files, using this weak supervision only. We show that the relative durations of events can easily be used as a penalization of our cost function, which drastically improves the performance of our approach. We demonstrate the validity of our approach on a large and realistic dataset.
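For readers unfamiliar with the baseline that the abstract extends, the following is a minimal sketch of a basic dynamic-time-warping-style alignment between audio frames and an ordered list of score events. It is not the authors' convex, weakly-supervised model: it learns nothing, and it assumes that a cost matrix `cost[t, k]` (the cost of assigning frame `t` to event `k`) is already given. The function name `align_frames_to_events` and the step rule (stay on the current event or advance to the next one) are illustrative assumptions.

```python
import numpy as np


def align_frames_to_events(cost):
    """Monotone alignment of T audio frames to K ordered score events.

    cost[t, k] is an assumed, precomputed cost of assigning frame t to
    event k. The path starts on event 0, ends on event K - 1, and at each
    frame either stays on the current event or advances to the next one.
    Returns (path, total_cost), where path[t] is the event index of frame t.
    """
    cost = np.asarray(cost, dtype=float)
    n_frames, n_events = cost.shape
    if n_frames < n_events:
        raise ValueError("need at least one frame per event")

    # acc[t, k]: minimal cost of any valid path ending at (frame t, event k).
    acc = np.full((n_frames, n_events), np.inf)
    acc[0, 0] = cost[0, 0]
    for t in range(1, n_frames):
        for k in range(n_events):
            stay = acc[t - 1, k]
            advance = acc[t - 1, k - 1] if k > 0 else np.inf
            acc[t, k] = cost[t, k] + min(stay, advance)

    # Backtrack from the last frame on the last event.
    path = np.empty(n_frames, dtype=int)
    k = n_events - 1
    path[-1] = k
    for t in range(n_frames - 1, 0, -1):
        if k > 0 and acc[t - 1, k - 1] < acc[t - 1, k]:
            k -= 1  # the cheaper predecessor is the previous event
        path[t - 1] = k
    return path, float(acc[-1, -1])
```

In this sketch the ordered event list is encoded by the monotone step rule. A duration prior of the kind described in the abstract would enter as an additional penalty on how long the path stays on each event; that penalty is not implemented here.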
Document type:
Conference paper
41st International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Mar 2016, Shanghai, China. Proceedings of the 41st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 〈http://www.icassp2016.org/〉

Cited literature: 26 references

https://hal.archives-ouvertes.fr/hal-01251018
Contributor: Philippe Cuvillier
Submitted on: Tuesday, January 5, 2016 - 15:12:06
Last modified on: Monday, May 29, 2017 - 14:23:41
Document(s) archived on: Thursday, April 7, 2016 - 15:26:31

File

icassp2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01251018, version 1

Collections

IRCAM | UPMC | STMS | INRIA | PSL

Citation

Rémi Lajugie, Piotr Bojanowski, Philippe Cuvillier, Sylvain Arlot, Francis Bach. A weakly-supervised discriminative model for audio-to-score alignment. 41st International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Mar 2016, Shanghai, China. Proceedings of the 41st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). 〈http://www.icassp2016.org/〉. 〈hal-01251018〉


Metrics

Record views: 532

Document downloads: 476