Conference paper, Year: 2019

WikiWars-UA : Ukrainian corpus annotated with temporal expressions

Abstract

Tool reliability and the reproducibility of study results are important features of modern Natural Language Processing (NLP) tools and methods. Scientific research is increasingly criticized for the lack of reproducibility of its results. A first step towards reproducibility is the availability of freely usable tools and corpora. In our work, we are interested in the automatic processing of unstructured documents for the extraction of temporal information. Our main objective is to create a reference corpus in Ukrainian annotated with temporal information, such as dates (absolute and relative), periods, and times, together with their normalization. The approach relies on the adaptation of an existing application, automatic pre-annotation of the WikiWars corpus in Ukrainian, and its manual correction. The reference corpus makes it possible to reliably evaluate the current version of the automatic temporal annotator and to prepare future work on this topic. The corpus is freely available for research at https://github.com/thhamon/WikiWarsUA
No file deposited

Dates and versions

hal-02371237, version 1 (19-11-2019)

Identifiers

  • HAL Id: hal-02371237, version 1

Cite

Natalia Grabar, Thierry Hamon. WikiWars-UA : Ukrainian corpus annotated with temporal expressions. Computational Linguistics and Intelligent Systems, Apr 2019, Kharkiv, Ukraine. ⟨hal-02371237⟩