KRAUTS: A German Temporally Annotated News Corpus

Abstract: In recent years, temporal tagging, i.e., the extraction and normalization of temporal expressions, has become a vibrant research area. Several tools have been made available, and new strategies have been developed. Because each domain poses its own challenges, new methods should be evaluated on diverse text types. Despite significant efforts towards multilinguality in temporal tagging, for every language except English, annotated corpora exist only for a single domain. For German, for example, only a narrative-style corpus has been manually annotated so far, so German temporal tagging performance on news articles cannot be evaluated. In this paper, we present KRAUTS, a new German temporally annotated corpus containing two subsets of news documents: articles from the daily newspaper DOLOMITEN and from the weekly newspaper DIE ZEIT. Overall, the corpus contains 192 documents with 1,140 annotated temporal expressions, and it has been made publicly available to further boost research in temporal tagging.
Document type:
Conference paper
LREC 2018 - 11th International Conference on Language Resources and Evaluation, May 2018, Miyazaki, Japan

https://hal.archives-ouvertes.fr/hal-01844834
Contributor: Anne-Lyse Minard
Submitted on: Thursday, July 19, 2018 - 17:31:16
Last modified on: Thursday, November 15, 2018 - 11:59:01

Identifiers

  • HAL Id: hal-01844834, version 1

Citation

Jannik Strötgen, Anne-Lyse Minard, Lukas Lange, Manuela Speranza, Bernardo Magnini. KRAUTS: A German Temporally Annotated News Corpus. LREC 2018 - 11th International Conference on Language Resources and Evaluation, May 2018, Miyazaki, Japan. 〈hal-01844834〉
