Conference paper, Year: 2023

Phylogeny-Inspired Soft Prompts For Data-to-Text Generation in Low-Resource Languages

Abstract

Most work on verbalising knowledge graphs (KGs) has focused on high-resource languages such as English, Russian, Czech or Arabic. In this paper, we focus on KG-to-text generation where the output text is in Breton, Irish or Welsh. To overcome the small size of the parallel training data, we combine the strengths of a multilingual encoder-decoder model with denoising fine-tuning on monolingual data and soft prompt fine-tuning on a small quantity of KG/text data. We furthermore structure the soft prompt into multiple sub-prompts designed to capture the similarities and differences between English, knowledge graphs and the three target languages. Our experiments show that our approach outperforms strong baselines and that all sub-prompts contribute to performance.
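
The idea of a soft prompt built from several sub-prompts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how sub-prompts are laid out, so the grouping below (a shared KG-input block, then Celtic, then Brittonic or Goidelic, then the language itself), the names `PATHS`, `init_subprompts`, `build_prompt` and `prepend_prompt`, and the sizes `EMBED_DIM` and `SUBPROMPT_LEN` are all assumptions made for illustration. English, which the abstract also mentions, is left out of this sketch.

```python
import random

EMBED_DIM = 8      # hypothetical embedding size, for illustration only
SUBPROMPT_LEN = 2  # hypothetical number of vectors per sub-prompt

# Assumed layout: each target language uses a sequence of sub-prompts,
# ordered from most shared to most language-specific. Breton and Welsh are
# Brittonic languages and Irish is Goidelic; the use of these groups as
# sub-prompts is an assumption of this sketch.
PATHS = {
    "breton": ["kg_input", "celtic", "brittonic", "breton"],
    "welsh": ["kg_input", "celtic", "brittonic", "welsh"],
    "irish": ["kg_input", "celtic", "goidelic", "irish"],
}


def init_subprompts(paths, length, dim, seed=0):
    """Create one block of `length` random vectors per distinct sub-prompt name.

    In a real setup these vectors would be trainable parameters.
    """
    rng = random.Random(seed)
    names = sorted({name for path in paths.values() for name in path})
    return {
        name: [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(length)]
        for name in names
    }


def build_prompt(subprompts, paths, language):
    """Concatenate the sub-prompts on the path of the given language."""
    prompt = []
    for name in paths[language]:
        prompt.extend(subprompts[name])
    return prompt


def prepend_prompt(prompt, input_embeddings):
    """Place the soft prompt in front of the embedded input sequence."""
    return prompt + input_embeddings


subprompts = init_subprompts(PATHS, SUBPROMPT_LEN, EMBED_DIM)
breton_prompt = build_prompt(subprompts, PATHS, "breton")
welsh_prompt = build_prompt(subprompts, PATHS, "welsh")
print(len(breton_prompt))  # 4 sub-prompts of 2 vectors each: 8
```

Under this assumed layout, the Breton and Welsh prompts share every block except the last one, so training on one language also updates vectors used by the other, while Irish shares only the KG-input and Celtic blocks with them.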
Main file
revision_aacl23_79.pdf (3.12 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04199557, version 1 (05-10-2023)
hal-04199557, version 2 (22-01-2024)

License

Attribution

Identifiers

  • HAL Id: hal-04199557, version 2

Cite

William Soto, Yannick Parmentier, Claire Gardent. Phylogeny-Inspired Soft Prompts For Data-to-Text Generation in Low-Resource Languages. IJCNLP-AACL 2023: The 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, ACL, Nov 2023, Bali, Indonesia. ⟨hal-04199557v2⟩
