Conference paper, Year: 2019

Natural Language Processing for Regional Languages of France: Lessons Learned from the RESTAURE Project

Delphine Bernhard

Abstract

The RESTAURE project (2015-2018) aimed to provide computational resources and natural language processing (NLP) tools for three regional languages of France: Alsatian, Occitan and Picard. It brought together researchers from four research units located in Strasbourg (LiLPa), Toulouse (CLLE-ERSS), Amiens (Habiter le monde) and Orsay (LIMSI). In this presentation, we will discuss the results of the RESTAURE project, focusing on the obstacles and challenges, the successful outcomes and the next steps. We will assess to what extent the project followed recent recommendations on improving digital language vitality for under-resourced and minority languages, to which Alsatian, Occitan and Picard belong [Soria et al., 2013; Ceberio Berger et al., 2018]. In particular, we will show how the cooperation between the research units involved made it possible to compensate, to some extent, for the lack of human resources and specialists for the regional languages under study. We will also explain how we re-used existing standards and proven methods in the process of resource building. Finally, this presentation will serve as an opportunity to detail the resources and tools developed during the project, which have been made available on the Zenodo platform (https://zenodo.org/communities/restaure/) under a Creative Commons Attribution-ShareAlike 4.0 license.

References

Ceberio Berger, K., Gurrutxaga Heraiz, A., Baroni, P., Davyth, H., Kruse, E., Quochi, V., Russo, I., Salonen, T., Sarhimaa, A., and Soria, C. (2018). Digital Language Survival Kit. The DLDP Recommendations to Improve Digital Vitality. Technical report. http://www.dldp.eu/sites/default/files/documents/DLDP_Digital-Language-Survival-Kit.pdf.

Soria, C., Mariani, J., and Zoli, C. (2013). Dwarfs sitting on the giants' shoulders - how LTs for regional and minority languages can benefit from piggybacking major languages. In Proceedings of XVII FEL Conference, pages 73-79.
No file deposited

Dates and versions

hal-02378175, version 1 (25-11-2019)

Identifiers

  • HAL Id: hal-02378175, version 1

Cite

Delphine Bernhard. Natural Language Processing for Regional Languages of France: Lessons Learned from the RESTAURE Project. New Ways of Analyzing Dialectal Variation, Nov 2019, Paris, France. ⟨hal-02378175⟩

Collections

SITE-ALSACE ANR
