Conference paper, Year: 2017

Unsupervised acquisition of morphological resources for Ukrainian

Abstract

Morphological resources are a recurrent need because they enable the development of NLP tools and applications for a given language. They provide the basic information that such tools need in order to perform more sophisticated processing (information retrieval, morpho-syntactic tagging, etc.). We propose to acquire morphological resources for the Ukrainian language. The proposed method exploits corpora to extract words that are morphologically related to one another. It has two versions: without and with processing of prefixes. The association strength between two words indicates how likely they are to share a morphological and semantic relation. We use three corpora (literary, medical and general-language) and evaluate the results obtained. Depending on the corpus, precision varies between 67% and 86%. We also compare the results from the different corpora, which shows that there is little redundancy between the corpora. The currently available resource contains 3,315 fully validated pairs of words.
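The abstract does not say how candidate pairs are generated or how association strength is measured, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that candidates are words from the same sentence sharing a long initial string, that association strength is pointwise mutual information (PMI) over sentence co-occurrence, and that prefix processing means stripping a known prefix before comparison. The function names, thresholds, prefix list and toy corpus are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def strip_prefix(word, prefixes, min_rest=3):
    """Remove the first matching prefix, keeping at least min_rest characters."""
    for p in prefixes:
        if word.startswith(p) and len(word) - len(p) >= min_rest:
            return word[len(p):]
    return word


def shared_stem_len(a, b):
    """Length of the common initial substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def related_pairs(sentences, min_stem=4, max_suffix=3, prefixes=()):
    """Score co-occurring word pairs that share a long stem.

    Returns {(w1, w2): PMI}, where PMI is computed over sentence
    co-occurrence (an assumed association measure).
    """
    n_sent = len(sentences)
    word_df = Counter()  # number of sentences containing each word
    pair_df = Counter()  # number of sentences containing both words
    for tokens in sentences:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        for w1, w2 in combinations(vocab, 2):
            a, b = strip_prefix(w1, prefixes), strip_prefix(w2, prefixes)
            stem = shared_stem_len(a, b)
            # Keep the pair if the stem is long and the endings are short.
            if stem >= min_stem and max(len(a), len(b)) - stem <= max_suffix:
                pair_df[(w1, w2)] += 1
    return {
        pair: math.log2(count * n_sent / (word_df[pair[0]] * word_df[pair[1]]))
        for pair, count in pair_df.items()
    }


# Toy corpus (hypothetical): one token list per sentence.
corpus = [
    ["лікар", "працює", "в", "лікарні"],
    ["лікарня", "має", "нового", "лікаря"],
    ["книга", "лежить", "на", "столі"],
    ["треба", "писати", "і", "переписати"],
]

plain = related_pairs(corpus)                            # version without prefixes
with_prefixes = related_pairs(corpus, prefixes=("пере",))  # version with prefixes

for pair, score in sorted(with_prefixes.items()):
    print(pair, round(score, 2))
```

In this sketch the prefix-aware run additionally pairs "писати" with "переписати", which the plain run misses because the two words differ from the second character onwards. Precision, as reported in the abstract, would then be the share of proposed pairs that a human validator accepts.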
File not deposited

Dates and versions

hal-01736400, version 1 (16-03-2018)

Identifiers

  • HAL Id: hal-01736400, version 1

Cite

Thierry Hamon, Natalia Grabar. Unsupervised acquisition of morphological resources for Ukrainian. Computational Linguistics and Intelligent Systems, Apr 2017, Kharkiv, Ukraine. ⟨hal-01736400⟩