T. Mihaylov and A. Frank, Discourse-aware semantic selfattention for narrative reading comprehension, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP-IJCNLP), pp.2541-2552, 2019.

C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang et al., Exploring the limits of transfer learning with a unified text-to-text transformer, 2019.

M. Sap, R. Lebras, E. Allaway, C. Bhagavatula, N. Lourie et al., ATOMIC: an atlas of machine commonsense for if-then reasoning, 2018.

R. Speer and C. Havasi, Representing general relational knowledge in ConceptNet 5, Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp.3679-3686, 2012.

Y. Sun, S. Wang, Y. Li, S. Feng, H. Hao-tian et al., ERNIE 2.0: A continual pre-training framework for language understanding, 2019.

I. Tenney, D. Das, and E. Pavlick, BERT rediscovers the classical NLP pipeline, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.4593-4601, 2019.

A. Wang, Y. Pruksachatkun, N. Nangia, A. Singh, J. Michael et al., Superglue: A stickier benchmark for general-purpose language understanding systems, 2019.

Z. Zhang, X. Han, Z. Liu, X. Jiang, M. Sun et al., ERNIE: Enhanced language representation with informative entities, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp.1441-1451, 2019.

Z. Zhang, Y. Wu, H. Zhao, Z. Li, S. Zhang et al., Semantics-aware BERT for language understanding, the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020), 2020.