Benchmarking benchmarks: introducing new automatic indicators for benchmarking Spoken Language Understanding corpora

Abstract: Empirical evaluation is nowadays the main evaluation paradigm in Natural Language Processing for assessing the relevance of a new machine-learning-based model. While large corpora are available for tasks such as Automatic Speech Recognition, this is not the case for other tasks such as Spoken Language Understanding (SLU), which consists in translating spoken transcriptions into a formal representation often based on semantic frames. Corpora such as ATIS or SNIPS are widely used to compare systems; however, differences in performance among systems are often very small, not statistically significant, and can be produced by biases in the data collection or the annotation scheme, as we showed on the ATIS corpus ("Is ATIS too shallow?", Interspeech 2018). In this study we propose a new methodology for assessing the relevance of an SLU corpus. We claim that taking into account only system performance does not provide enough insight into what is covered by current state-of-the-art models and what is left to be done. We apply our methodology to a set of 4 SLU systems and 5 benchmark corpora (ATIS, SNIPS, M2M, MEDIA) and automatically produce several indicators assessing the relevance (or not) of each corpus for benchmarking SLU models.
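The abstract points out that performance differences between SLU systems on these benchmarks are often not statistically significant. As an illustration only, not the indicators proposed in the paper, the sketch below shows a paired bootstrap significance test over per-utterance scores of two systems; the function name, inputs, and toy scores are hypothetical.

```python
import random

def paired_bootstrap_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Paired bootstrap test: estimate how often system A fails to beat
    system B when the evaluation utterances are resampled with replacement.

    scores_a, scores_b: per-utterance scores (e.g. 1.0 if the predicted
    semantic frame is fully correct, else 0.0), aligned on the same test
    utterances. These inputs and the scoring scheme are illustrative only.
    """
    assert len(scores_a) == len(scores_b)
    rng = random.Random(seed)
    n = len(scores_a)
    observed_delta = sum(scores_a) / n - sum(scores_b) / n
    worse_or_equal = 0
    for _ in range(n_resamples):
        # Resample utterance indices; the same indices are used for both
        # systems, which is what makes the test "paired".
        idx = [rng.randrange(n) for _ in range(n)]
        delta = sum(scores_a[i] for i in idx) / n - sum(scores_b[i] for i in idx) / n
        if delta <= 0:
            worse_or_equal += 1
    # Approximate p-value for "A is not actually better than B on this corpus".
    return observed_delta, worse_or_equal / n_resamples

if __name__ == "__main__":
    # Toy example: two systems whose exact-match accuracies differ by 0.5 points.
    sys_a = [1.0] * 960 + [0.0] * 40   # 96.0% accuracy
    sys_b = [1.0] * 955 + [0.0] * 45   # 95.5% accuracy
    delta, p = paired_bootstrap_test(sys_a, sys_b)
    print(f"delta={delta:.3f}  p={p:.3f}")
```

With differences this small on a 1000-utterance test set, the p-value typically stays well above common significance thresholds, which is the kind of situation the paper argues calls for corpus-level indicators rather than leaderboard comparisons alone.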

https://hal.archives-ouvertes.fr/hal-02270633

Identifiers

  • HAL Id: hal-02270633, version 1

Citation

Frédéric Béchet, Christian Raymond. Benchmarking benchmarks: introducing new automatic indicators for benchmarking Spoken Language Understanding corpora. InterSpeech, Sep 2019, Graz, Austria. ⟨hal-02270633⟩
