Conference paper, Year: 2020

Multilingual AMR-to-Text Generation

Abstract

Generating text from structured data is challenging because it requires bridging the gap between (i) structure and natural language (NL) and (ii) semantically underspecified input and fully specified NL output. Multilingual generation brings an additional challenge: generating into languages with varied word order and morphological properties. In this work, we focus on Abstract Meaning Representations (AMRs) as structured input, where previous research has overwhelmingly focused on generating only into English. We leverage advances in cross-lingual embeddings, pretraining, and multilingual models to create multilingual AMR-to-text models that generate in twenty-one different languages. For eighteen languages, based on automatic metrics, our multilingual models surpass baselines that generate into a single language. We analyse the ability of our multilingual models to accurately capture morphology and word order using human evaluation, and find that native speakers judge our generations to be fluent.

Main file

amr_to_text_generation__camera_ready_ (2).pdf (673.08 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02999676, version 1 (11-11-2020)

Identifiers

  • HAL Id: hal-02999676, version 1

Cite

Angela Fan, Claire Gardent. Multilingual AMR-to-Text Generation. 2020 Conference on Empirical Methods in Natural Language Processing, Nov 2020, Punta Cana, Dominican Republic. ⟨hal-02999676⟩