Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, Neural Probabilistic Language Models, Journal of machine learning research, vol.3, issue.Feb, pp.1137-1155, 2003.
DOI : 10.1007/3-540-33486-6_6

URL : https://hal.archives-ouvertes.fr/hal-01434258

Y. Chen and A. Eisele, MultiUN v2: UN documents with multilingual alignments, LREC, pp.2500-2504, 2012.

V. Chvátal and D. Sankoff, Longest common subsequences of two random sequences, Journal of Applied Probability, vol.12, issue.02, pp.306-315, 1975.
DOI : 10.1007/BF01504345

R. Collobert and J. Weston, A unified architecture for natural language processing, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.160-167, 2008.
DOI : 10.1145/1390156.1390177

Z. Dong and Q. Dong, HowNet - a hybrid language and knowledge resource, Proceedings of the 2003 International Conference on Natural Language Processing and Knowledge Engineering, pp.820-824, 2003.

S. Gahbiche-Braham, H. Bonneau-Maynard, T. Lavergne, and F. Yvon, Joint segmentation and POS tagging for Arabic using a CRF-based classifier, LREC, pp.2107-2113, 2012.

Khaleej, Khaleej and Watan corpus, https://sites.google.com/site/mouradabbas9/corpora, 2004.

KSUCorpus, King Saud University corpus, 2012.

V. I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady, pp.707-710, 1966.

C. Lioma and R. Blanco, Part of Speech Based Term Weighting for Information Retrieval, European Conference on Information Retrieval, pp.412-423, 2009.
DOI : 10.1108/eb026526

C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, 2008.

Meedan, Meedan's open source Arabic English, https://github.com/anastaw/meedan-memory, 2012.

T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, Recurrent neural network based language model, Interspeech, p.3, 2010.

T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient estimation of word representations in vector space, Proceedings of the International Conference on Learning Representations (ICLR), Workshop Track, arXiv:1301.3781, 2013.

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, Distributed representations of words and phrases and their compositionality, Advances in Neural Information Processing Systems, pp.3111-3119, 2013.

T. Mikolov, W. Yih, and G. Zweig, Linguistic regularities in continuous space word representations, HLT-NAACL, pp.746-751, 2013.

G. A. Miller, WordNet: a lexical database for English, Communications of the ACM, vol.38, issue.11, pp.39-41, 1995.

A. Mnih and G. E. Hinton, A scalable hierarchical distributed language model, Advances in Neural Information Processing Systems 21, pp.1081-1088, 2009.

MSR-Video, Microsoft Research video corpus, https://www.microsoft.com/en-us/download/details.aspx?id=52422, 2016.

R. Navigli and S. P. Ponzetto, BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network, Artificial Intelligence, vol.193, pp.217-250, 2012.
DOI : 10.1016/j.artint.2012.07.001

S. B. Needleman and C. D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins, Journal of Molecular Biology, vol.48, issue.3, pp.443-453, 1970.

J. Pennington, R. Socher, and C. D. Manning, GloVe: Global Vectors for Word Representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.1532-1543, 2014.
DOI : 10.3115/v1/D14-1162

URL : http://nlp.stanford.edu/projects/glove/glove.pdf

Quran, Raw Quran text, 2007.

H. M. Raafat, M. A. Zahran, and M. Rashwan, Arabase - a database combining different Arabic resources with lexical and semantic information, KDIR/KMIS, pp.233-240, 2013.

M. K. Saad and W. Ashour, OSAC: Open source Arabic corpora, 6th ArchEng Int. Symposiums, EEECS, 2010.

G. Salton and C. Buckley, Term-weighting approaches in automatic text retrieval, Information Processing & Management, vol.24, issue.5, pp.513-523, 1988.
DOI : 10.1016/0306-4573(88)90021-0

URL : http://ecommons.cornell.edu/bitstream/1813/6721/1/87-881.pdf

D. Schwab, Approche hybride - lexicale et thématique - pour la modélisation, la détection et l'exploitation des fonctions lexicales en vue de l'analyse sémantique de texte, 2005.

G. Sérasset, DBnary: Wiktionary as a Lemon-based multilingual lexical resource in RDF, Semantic Web, vol.6, issue.4, pp.355-361, 2015.
DOI : 10.1007/s10579-012-9182-3

J. Tiedemann, News from OPUS - A collection of multilingual parallel corpora with tools and interfaces, Recent Advances in Natural Language Processing, pp.237-248, 2009.
DOI : 10.1075/cilt.309.19tie

J. Tiedemann, Parallel data, tools and interfaces in OPUS, LREC, pp.2214-2218, 2012.

J. Turian, L. Ratinov, and Y. Bengio, Word representations: a simple and general method for semi-supervised learning, Proceedings of the 48th annual meeting of the association for computational linguistics, pp.384-394, 2010.

P. D. Turney and P. Pantel, From frequency to meaning: Vector space models of semantics, Journal of Artificial Intelligence Research, vol.37, pp.141-188, 2010.

Wikiar, Arabic Wikipedia corpus, 2006.

W. E. Winkler, The state of record linkage and current research problems, Statistical Research Division, 1999.

M. A. Zahran, A. Magooda, A. Y. Mahgoub et al., Word representations in vector space and their applications for Arabic, International Conference on Intelligent Text Processing and Computational Linguistics, pp.430-443, 2015.