Journal article in Machine Translation, 2021

Investigating alignment interpretability for low-resource NMT

Abstract

The attention mechanism in Neural Machine Translation (NMT) models added flexibility to translation systems, along with the possibility of visualizing soft-alignments between source and target representations. While there is much debate about the relationship between attention and the output produced by neural models [26, 35, 43, 38], in this paper we propose a different assessment: we investigate soft-alignment interpretability in low-resource scenarios. We experimented with different architectures (RNN [5], 2D-CNN [15], and Transformer [39]), comparing them with regard to their ability to produce directly exploitable alignments. To evaluate exploitability, we replicated the Unsupervised Word Segmentation (UWS) task from Godard et al. [22], in which source words are translated into unsegmented phone sequences. After training, the resulting soft-alignments are used to produce a segmentation of the target side. Our results showed that an RNN-based NMT model produced the most exploitable alignments in this scenario. We then investigated methods for increasing its UWS scores, comparing the following approaches: monolingual pre-training, input representation augmentation (hybrid model), and explicit word length optimization during training. We reached the best results with the hybrid model, which uses an intermediate monolingual-rooted segmentation from a non-parametric Bayesian model [25] to enrich the input representation before training.
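To make the alignment-to-segmentation idea concrete, the following is a minimal sketch, not the paper's exact procedure: given a soft-alignment (attention) matrix over a target phone sequence, each phone is hard-aligned to its highest-scoring source word, and a word boundary is hypothesized wherever that assignment changes. The function name, its interface, and the toy data are illustrative assumptions, not taken from the article.

# Sketch: derive a target-side segmentation from a soft-alignment matrix
# (assumed shape: num_target_phones x num_source_words).
import numpy as np

def segment_from_alignment(phones, alignment):
    """phones: list of target phone symbols.
    alignment: np.ndarray, one row per phone, each row a soft-alignment
               distribution over source words.
    Returns a list of phone groups (hypothesized words)."""
    assert alignment.shape[0] == len(phones)
    aligned_word = alignment.argmax(axis=1)   # hard alignment per phone
    words, current = [], [phones[0]]
    for i in range(1, len(phones)):
        if aligned_word[i] != aligned_word[i - 1]:
            words.append(current)             # boundary: aligned source word changed
            current = []
        current.append(phones[i])
    words.append(current)
    return words

# Toy example: 5 phones softly aligned to 2 source words.
phones = ["m", "b", "o", "k", "a"]
alignment = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.6, 0.4],
    [0.2, 0.8],
    [0.1, 0.9],
])
print(segment_from_alignment(phones, alignment))
# -> [['m', 'b', 'o'], ['k', 'a']]

Under this assumed scheme, the quality of the induced segmentation depends directly on how interpretable the soft-alignments are, which is what the paper's UWS evaluation measures.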

Dates and versions

hal-03139744 , version 1 (12-02-2021)

Identifiers

Cite

Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier. Investigating alignment interpretability for low-resource NMT. Machine Translation, 2021, ⟨10.1007/s10590-020-09254-w⟩. ⟨hal-03139744⟩

