A weakly-supervised discriminative model for audio-to-score alignment

Rémi Lajugie (1, 2), Piotr Bojanowski (1, 3), Philippe Cuvillier (4, 5), Sylvain Arlot (6), Francis Bach (1, 2)
2 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
3 WILLOW - Models of visual object recognition and scene understanding
Inria de Paris, DI-ENS - Département d'informatique de l'École normale supérieure
4 Repmus - Représentations musicales
STMS - Sciences et Technologies de la Musique et du Son
5 MuTant - Synchronous Realtime Processing and Programming of Music Signals
Inria de Paris, UPMC - Université Pierre et Marie Curie - Paris 6, IRCAM, CNRS - Centre National de la Recherche Scientifique
Abstract : In this paper, we consider a new discriminative approach to the problem of audio-to-score alignment. We consider the two distinct kinds of information provided by music scores: (i) an exact ordered list of musical events and (ii) approximate prior information about the relative durations of events. We extend the basic dynamic time warping algorithm to a convex problem that learns optimal classifiers for all events while jointly aligning files, using this weak supervision only. We show that the relative durations of events can easily be used as a penalization of our cost function, which drastically improves the performance of our approach. We demonstrate the validity of our approach on a large and realistic dataset.
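
For readers unfamiliar with the "basic dynamic time warping algorithm" that the abstract takes as its starting point, the following is a minimal sketch of it, not the convex formulation proposed in the paper. It assumes a precomputed cost matrix in which entry (t, e) measures the mismatch between audio frame t and score event e, and it assumes that each frame is assigned to exactly one event, with events visited in score order. The function name `dtw_align` and the toy costs are hypothetical and chosen only for illustration.

```python
import numpy as np


def dtw_align(cost):
    """Align T audio frames to E ordered score events (assumes T >= E).

    cost[t, e] is an assumed, precomputed mismatch between frame t and
    event e. Returns the total cost and, for every frame, the index of
    the event it is assigned to. The path starts at event 0, ends at
    event E - 1, and either stays on an event or advances by one.
    """
    T, E = cost.shape
    acc = np.full((T, E), np.inf)  # accumulated cost table
    acc[0, 0] = cost[0, 0]         # the first frame must match the first event
    for t in range(1, T):
        for e in range(E):
            stay = acc[t - 1, e]
            advance = acc[t - 1, e - 1] if e > 0 else np.inf
            acc[t, e] = cost[t, e] + min(stay, advance)

    # Backtrack from the last frame and last event to recover the path.
    e = E - 1
    path = [e]
    for t in range(T - 1, 0, -1):
        if e > 0 and acc[t - 1, e - 1] < acc[t - 1, e]:
            e -= 1
        path.append(e)
    path.reverse()
    return acc[T - 1, E - 1], path


# Toy example: four frames, two events; the first two frames fit event 0.
toy_cost = np.array([[0.0, 1.0],
                     [0.0, 1.0],
                     [1.0, 0.0],
                     [1.0, 0.0]])
total, path = dtw_align(toy_cost)
```

In this sketch the costs are fixed in advance. The abstract instead describes learning the event classifiers jointly with the alignment and adding a penalty based on relative event durations, neither of which is shown here.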
Document type :
Conference papers

Cited literature [26 references]

https://hal.archives-ouvertes.fr/hal-01251018
Contributor : Philippe Cuvillier
Submitted on : Tuesday, January 5, 2016 - 3:12:06 PM
Last modification on : Thursday, March 21, 2019 - 2:29:49 PM
Document(s) archived on : Thursday, April 7, 2016 - 3:26:31 PM

File

icassp2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01251018, version 1

Citation

Rémi Lajugie, Piotr Bojanowski, Philippe Cuvillier, Sylvain Arlot, Francis Bach. A weakly-supervised discriminative model for audio-to-score alignment. 41st International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Mar 2016, Shanghai, China. ⟨hal-01251018⟩
