Building a Treebank for French, pp.165-187, 2003. ,
Arabert: Transformer-based model for arabic language understanding, 2020. ,
Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond, Transactions of the Association for Computational Linguistics, vol.7, pp.597-610, 2019. ,
A neural probabilistic language model, Journal of machine learning research, vol.3, pp.1137-1155, 2003. ,
Domain adaptation with structural correspondence learning, Proceedings of the 2006 conference on empirical methods in natural language processing, pp.120-128, 2006. ,
N-gram counts and language models from the common crawl, Proceedings of the Language Resources and Evaluation Conference, 2014. ,
Spanish pre-trained bert model and evaluation data, 2020. ,
Enhanced lstm for natural language inference, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.1657-1668, 2017. ,
A unified architecture for natural language processing: Deep neural networks with multitask learning, Proceedings of the 25th international conference on Machine learning, pp.160-167, 2008. ,
Natural language processing (almost) from scratch, Journal of machine learning research, vol.12, pp.2493-2537, 2011. ,
Xnli: Evaluating cross-lingual sentence representations, Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp.2475-2485, 2018. ,
Unsupervised cross-lingual representation learning at scale, 2019. ,
The ligm-alpage architecture for the spmrl 2013 shared task: Multiword expression analysis and dependency parsing, Proceedings of the EMNLP Workshop on Statistical Parsing of Morphologically Rich Languages, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00932372
Semi-supervised sequence learning, Advances in neural information processing systems, pp.3079-3087, 2015. ,
, Bertje: A dutch bert model, 2019.
, Robbert: a dutch roberta-based language model, 2020.
Bert: Pre-training of deep bidirectional transformers for language understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.1, pp.4171-4186, 2019. ,
Deep biaffine attention for neural dependency parsing, ICLR, 2016. ,
Multiun: A multilingual corpus from united nation documents, Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10), 2010. ,
Multifit: Efficient multilingual language model fine-tuning, Proceedings of the 2019 conference on empirical methods in natural language processing (EMNLP), pp.1532-1543, 2019. ,
Reducing transformer depth on demand with structured dropout, International Conference on Learning Representations, 2019. ,
Efficient training of bert by progressively stacking, International Conference on Machine Learning, pp.2337-2346, 2019. ,
Arabic word sense disambiguation for and by machine translation. Theses, Faculté des Sciences économiques et de gestion, 2018. ,
URL : https://hal.archives-ouvertes.fr/tel-02139438
Long short-term memory, Neural computation, vol.9, issue.8, pp.1735-1780, 1997. ,
Universal language model fine-tuning for text classification, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.328-339, 2018. ,
Deep networks with stochastic depth, European conference on computer vision, pp.646-661, 2016. ,
Adam: A method for stochastic optimization, 2014. ,
Constituency parsing with a self-attentive encoder, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.2676-2686, 2018. ,
Multilingual constituency parsing with self-attention and pre-training, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.3499-3505, 2019. ,
Moses: Open source toolkit for statistical machine translation, Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of the demo and poster sessions, pp.177-180, 2007. ,
Europarl: A parallel corpus for statistical machine translation, Machine Translation Summit, pp.79-86, 2005. ,
Adaptation of deep bidirectional multilingual transformers for russian language, 2019. ,
Cross-lingual language model pretraining, Advances in neural information processing systems, 2019. ,
Albert: A lite bert for self-supervised learning of language representations, 2019. ,
Findings of the first shared task on machine translation robustness, p.91, 2019. ,
Opensubtitles2015: Extracting large parallel corpora from movie and tv subtitles, International Conference on Language Resources and Evaluation, 2016. ,
, Roberta: A robustly optimized bert pretraining approach, 2019.
, CamemBERT: a Tasty French Language Model. arXiv e-prints, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02445946
Learned in translation: Contextualized word vectors, Advances in Neural Information Processing Systems, pp.6294-6305, 2017. ,
, Data dumps - Meta, discussion about Wikimedia projects, Meta, 2019.
Distributed representations of words and phrases and their compositionality, Proceedings of the 26th International Conference on Neural Information Processing Systems, vol.2, pp.3111-3119, 2013. ,
A semantic concordance, Proceedings of the workshop on Human Language Technology, HLT '93, pp.303-308, 1993. ,
Wordnet: a lexical database for english, Communications of the ACM, vol.38, issue.11, pp.39-41, 1995. ,
Babelnet: Building a very large multilingual semantic network, Proceedings of the 48th annual meeting of the association for computational linguistics, pp.216-225, 2010. ,
SemEval-2013 Task 12: Multilingual Word Sense Disambiguation, Proceedings of the Seventh International Workshop on Semantic Evaluation, vol.2, pp.222-231, 2013. ,
Word sense disambiguation: A survey, ACM Computing Surveys, vol.41, issue.2, 2009. ,
Phobert: Pretrained language models for vietnamese, 2020. ,
Transformers without tears: Improving the normalization of self-attention, 2019. ,
fairseq: A fast, extensible toolkit for sequence modeling, Proceedings of NAACL-HLT 2019: Demonstrations, 2019. ,
Glove: Global vectors for word representation, 2014. ,
Deep contextualized word representations, Proceedings of NAACL-HLT, pp.2227-2237, 2018. ,
Very deep self-attention networks for end-to-end speech recognition, 2019. ,
AlBERTo: Italian BERT Language Understanding Model for NLP Challenging Tasks Based on Tweets, Proceedings of the Sixth Italian Conference on Computational Linguistics, p.2481, 2019. ,
Cross-language text classification using structural correspondence learning, Proceedings of the 48th annual meeting of the association for computational linguistics, pp.1118-1127, 2010. ,
Improving language understanding by generative pre-training, 2018. ,
Exploring the limits of transfer learning with a unified text-to-text transformer, 2019. ,
, Know what you don't know: Unanswerable questions for squad, 2018.
Unsupervised pretraining for sequence to sequence learning, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.383-391, 2017. ,
Overview of the SPMRL 2013 shared task: A cross-framework evaluation of parsing morphologically rich languages, Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pp.146-182, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00877096
Using wiktionary as a resource for wsd: the case of french verbs, Proceedings of the 13th International Conference on Computational Semantics-Long Papers, pp.259-270, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02436417
Neural machine translation of rare words with subword units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol.1, pp.1715-1725, 2016. ,
Dbnary: Wiktionary as a lmf based multilingual rdf network, Language Resources and Evaluation Conference, 2012. ,
Billions of parallel words for free: Building and using the EU bookshop corpus, Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC, pp.1850-1855, 2014. ,
, Portuguese named entity recognition using bert-crf, 2019.
The enronsent corpus, 2011. ,
Sequence to sequence learning with neural networks, Advances in neural information processing systems, pp.3104-3112, 2014. ,
Parallel data, tools and interfaces in opus, Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12), 2012. ,
Attention is all you need, Advances in neural information processing systems, pp.5998-6008, 2017. ,
Tensor2tensor for neural machine translation, Proceedings of the 13th Conference of the Association for Machine Translation in the Americas, vol.1, pp.193-199, 2018. ,
UF-SAC: Unification of Sense Annotated Corpora and Tools, Language Resources and Evaluation Conference (LREC), 2018. ,
Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation, Proceedings of the 10th Global Wordnet Conference, 2019. ,
URL : https://hal.archives-ouvertes.fr/hal-02131872
, Multilingual is not enough: Bert for finnish, 2019.
GLUE: A multi-task benchmark and analysis platform for natural language understanding, Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pp.353-355, 2018. ,
Superglue: A stickier benchmark for general-purpose language understanding systems, 2019. ,
Learning deep transformer models for machine translation, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.1810-1822, 2019. ,
A broad-coverage challenge corpus for sentence understanding through inference, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.1, pp.1112-1122, 2018. ,
Huggingface's transformers: State-of-the-art natural language processing, ArXiv, 2019. ,
Why deep transformers are difficult to converge? from computation order to lipschitz restricted parameter initialization, 2019. ,
, Paws-x: A cross-lingual adversarial dataset for paraphrase identification, 2019.
Xlnet: Generalized autoregressive pretraining for language understanding, Advances in neural information processing systems, 2019. ,
, Fixup initialization: Residual learning without normalization, 2019.
Paws: Paraphrase adversaries from word scrambling, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol.1, pp.1298-1308, 2019. ,