The Thesaurus Occitan: a multimedia database dedicated to Occitan dialects. Presentation of its morphosyntax module.

Pierre-Aurélien Georges 1
1 BCL, équipe Diachronie, Dialectologie, et Phonologie (DDP) [2008..2015]
BCL - Bases, Corpus, Langage (UMR 7320 - UNS / CNRS)
Abstract: The Module MorphoSyntaxique (abbreviated MMS) is a software tool designed specifically for the syntactic and morphosyntactic analysis of Occitan dialects. It is part of the Thesaurus Occitan (THESOC) multimedia database, of which a general presentation can be found in these proceedings in another article by Guylaine Brun-Trigaud. Following the THESOC's general guidelines (i.e. localised and oral data only), this module contains both oral texts (including ethnotexts) and single sentences, such as answers to morphosyntactic questionnaires. The "oral data" criterion can be relaxed somewhat: although the module was originally conceived for processing oral data, its part-of-speech tagger and syntactic parser can also process written texts, provided they are written in a familiar or popular style close to the oral register. The locations where these texts and sentences were collected are stored in the database, which in the long term enables comparison between dialects on a morphosyntactic or syntactic basis and opens new perspectives for dialectology.
Document type:
Conference paper
Jose Luis ORMAETXEA; Gotzon AURREKOETXEA OLABARRI. Tools for Linguistic Variation (EUDIA-2), Oct 2009, Vitoria Gasteiz, Spain. ASJU-ren gehigarriak, LIII, UPV-EHU, Bilbao, Tools for Linguistic Variation (LIII), pp.107-118, 2010, Anejos del Anuario del Seminario de Filología Vasca "Julio de Urquijo". <https://sites.google.com/site/edakeudia/Home/aurkibidea/jardunaldi-biltzar/hizkuntza-bariazioa-aztertzeko-teknologia-jardunaldi-internazionala>

https://hal.archives-ouvertes.fr/hal-01277767
Contributor: Pierre-Aurélien Georges
Submitted on: Tuesday, 23 February 2016 - 18:14:58
Last modified on: Friday, 26 February 2016 - 01:09:58
Document(s) archived on: Tuesday, 24 May 2016 - 11:42:02

File

PAG Vitoria Gasteiz 2009.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01277767, version 1

Citation

Pierre-Aurélien Georges. The Thesaurus Occitan: a multimedia database dedicated to Occitan dialects. Presentation of its morphosyntax module. Jose Luis ORMAETXEA; Gotzon AURREKOETXEA OLABARRI. Tools for Linguistic Variation (EUDIA-2), Oct 2009, Vitoria Gasteiz, Spain. ASJU-ren gehigarriak, LIII, UPV-EHU, Bilbao, Tools for Linguistic Variation (LIII), pp.107-118, 2010, Anejos del Anuario del Seminario de Filología Vasca "Julio de Urquijo". <https://sites.google.com/site/edakeudia/Home/aurkibidea/jardunaldi-biltzar/hizkuntza-bariazioa-aztertzeko-teknologia-jardunaldi-internazionala>. <hal-01277767>
