H. Adel, N. T. Vu, and T. Schultz, Combination of recurrent neural networks and factored language models for code-switching language modeling, Proceedings of ACL, pp.206-211, 2013.

Ž. Agić, Cross-lingual parser selection for low-resource languages, Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies, pp.1-10, 2017.

Ž. Agić, D. Hovy, and A. Søgaard, If all you have is a bit of the Bible: Learning POS taggers for truly low-resource languages, The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015), pp.268-272, 2015.

Ž. Agić, A. Johannsen, B. Plank, H. Martínez Alonso, N. Schluter et al., Multilingual projection for parsing truly low-resource languages, Transactions of the Association for Computational Linguistics, vol.4, p.301, 2016.

Ž. Agić, J. Tiedemann, K. Dobrovoljc, S. Krek, D. Merkler et al., Cross-lingual dependency parsing of related languages with rich morphosyntactic tagsets, Proceedings of the EMNLP 2014 Workshop on Language Technology for Closely Related Languages and Language Variants, pp.13-24, 2014.

M. S. C. Almeida, C. Pinto, H. Figueira, P. Mendes, and A. F. T. Martins, Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.408-418, 2015.
DOI : 10.3115/v1/P15-1040

W. Ammar, G. Mulcaire, M. Ballesteros, C. Dyer, and N. A. Smith,

M. Artetxe, G. Labaka, and E. Agirre, Learning bilingual word embeddings with (almost) no bilingual data, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.451-462, 2017.
DOI : 10.18653/v1/P17-1042

M. Artetxe, G. Labaka, E. Agirre, and K. Cho, Unsupervised neural machine translation, 2017.

E. Asgari and H. Schütze, Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.113-124, 2017.
DOI : 10.18653/v1/D17-1011

D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, Proceedings of the International Conference on Learning Representations (ICLR), 2015.

D. Bakker, Language sampling, The Oxford Handbook of Linguistic Typology, pp.100-127, 2010.

C. Banea, R. Mihalcea, J. Wiebe, and S. Hassan, Multilingual subjectivity analysis using machine translation, Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pp.127-135, 2008.
DOI : 10.3115/1613715.1613734

E. M. Bender, Linguistically naïve != language independent, Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?, ILCL '09, pp.26-32, 2009.
DOI : 10.3115/1642038.1642044

E. M. Bender, On achieving and evaluating language-independence in NLP, Linguistic Issues in Language Technology, pp.1-26, 2011.

E. M. Bender, Language collage: Grammatical description with the lingo grammar matrix, LREC, pp.2447-2451, 2014.

E. M. Bender, Linguistic typology in natural language processing, Linguistic Typology, vol.20, issue.3, pp.645-660, 2016.

E. M. Bender, M. W. Goodman, J. Crowgey, and F. Xia, Towards creating precision grammars from interlinear glossed text: Inferring large-scale typological properties, 2013.

B. Berlin and P. Kay, Basic color terms: Their universality and evolution, 1969.

Y. Berzak, R. Reichart, and B. Katz, Reconstructing Native Language Typology from Foreign Language Usage, Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pp.21-29, 2014.
DOI : 10.3115/v1/W14-1603

Y. Berzak, R. Reichart, and B. Katz, Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL, Proceedings of the Nineteenth Conference on Computational Natural Language Learning, pp.94-102, 2015.
DOI : 10.18653/v1/K15-1010

B. Bickel, Typology in the 21st century: Major current developments, Linguistic Typology, vol.11, issue.1, pp.239-251, 2007.
DOI : 10.1515/LINGTY.2007.018

B. Bickel, Distributional typology: Statistical inquiries into the dynamics of linguistic diversity, The Oxford Handbook of Linguistic Analysis, pp.901-923, 2015.

B. Bickel, J. Nichols, T. Zakharko, A. Witzlack-Makarevich et al.,

J. Bjerva and I. Augenstein, From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018.
DOI : 10.18653/v1/N18-1083

P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, Enriching word vectors with subword information, Transactions of the ACL, vol.5, pp.135-146, 2017.

J. A. Botha and P. Blunsom, Compositional morphology for word representations and language modelling, ICML, pp.1899-1907, 2014.

M. Bowerman and S. Choi, Shaping meanings for language: universal and language-specific in the acquisition of semantic categories. In Language acquisition and conceptual development, pp.475-511, 2001.

C. Braud, O. Lacroix, and A. Søgaard, Cross-lingual and cross-domain discourse segmentation of entire documents, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp.237-243, 2017.
DOI : 10.18653/v1/P17-2037

J. Bybee and J. L. McClelland, Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition, The Linguistic Review, pp.381-410, 2005.

J. L. Bybee, The diachronic dimension in explanation, Explaining language universals. Basil Blackwell, pp.350-379, 1988.

S. Chandar, S. Lauly, H. Larochelle, M. Khapra, B. Ravindran et al., An autoencoder approach to learning bilingual word representations, Advances in Neural Information Processing Systems, pp.1853-1861, 2014.

M. Chang, L. Ratinov, and D. Roth, Guiding semi-supervision with constraint-driven learning, ACL, pp.280-287, 2007.

X. Chen, B. Athiwaratkun, Y. Sun, K. Weinberger, and C. Cardie, Adversarial deep averaging networks for cross-lingual sentiment classification. arXiv preprint, 2017.
DOI : 10.1162/tacl_a_00039

S. Cohen and N. A. Smith, Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL '09, pp.74-82, 2009.
DOI : 10.3115/1620754.1620766

URL : http://dl.acm.org/ft_gateway.cfm?id=1620766&type=pdf

S. B. Cohen, Bayesian Analysis in Natural Language Processing. Synthesis Lectures on Human Language Technologies, 2016.

R. Coke, B. King, and D. R. Radev, Classifying syntactic regularities for hundreds of languages, 2016.

C. Collins and R. Kayne, Syntactic structures of the world's languages, 2009.

M. Collins, Discriminative training methods for hidden Markov models, Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, EMNLP '02, pp.1-8, 2002.
DOI : 10.3115/1118693.1118694

URL : http://dl.acm.org/ft_gateway.cfm?id=1118694&type=pdf

B. Comrie, Language universals and linguistic typology: Syntax and morphology, 1989.

A. Conneau, G. Lample, M. Ranzato, L. Denoyer, and H. Jégou, Word translation without parallel data, arXiv preprint, 2017.

A. Copestake, D. Flickinger, C. Pollard, and I. A. Sag, Minimal recursion semantics: An introduction, Research on Language and Computation, pp.281-332, 2005.
DOI : 10.1007/s11168-006-6327-9

G. G. Corbett, Implicational hierarchies, The Oxford Handbook of Linguistic Typology, pp.190-205, 2010.

R. Cotterell and J. Eisner, Probabilistic Typology: Deep Generative Models of Vowel Inventories, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.1182-1192, 2017.
DOI : 10.18653/v1/P17-1109

R. Cotterell and J. Eisner, A Deep Generative Model of Vowel Formant Typology, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp.37-46, 2018.
DOI : 10.18653/v1/N18-1004

R. Cotterell and H. Schütze, Morphological Word-Embeddings, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1287-1292, 2015.
DOI : 10.3115/v1/N15-1140

K. Crammer and Y. Singer, Ultraconservative Online Algorithms for Multiclass Problems, Journal of Machine Learning Research, vol.3, pp.951-991, 2003.
DOI : 10.1007/3-540-44581-1_7

S. Cristofaro and P. Ramat, Introduzione alla tipologia linguistica. Carocci, 1999.

W. Croft, Typology and universals, 2002.

W. Croft, D. Nordquist, K. Looney, and M. Regan, Linguistic typology meets universal dependencies, Proceedings of the 15th International Workshop on Treebanks and Linguistic Theories (TLT15), pp.63-75, 2017.

J. Daiber, M. Stanojević, and K. Sima'an, Universal reordering via linguistic typology, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp.3167-3176, 2016.

R. G. D'Andrade, The development of cognitive anthropology, 1995.

D. Das and S. Petrov, Unsupervised part-of-speech tagging with bilingual graph-based projections, ACL, pp.600-609, 2011.

H. Daumé III, Non-parametric Bayesian areal linguistics, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL '09, pp.593-601, 2009.
DOI : 10.3115/1620754.1620841

H. Daumé III and L. Campbell, A Bayesian model for discovering typological implications, ACL, pp.65-72, 2007.

D. Dediu and M. Cysouw, Some Structural Aspects of Language Are More Stable than Others: A Comparison of Seven Methods, PLoS ONE, vol.8, issue.1, p.e55009, 2013.
DOI : 10.1371/journal.pone.0055009

D. Dediu and S. C. Levinson, Abstract Profiles of Structural Stability Point to Universal Tendencies, Family-Specific Factors, and Ancient Connections between Languages, PLoS ONE, vol.7, issue.9, p.e45198, 2012.
DOI : 10.1371/journal.pone.0045198

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), pp.1-38, 1977.
DOI : 10.1111/j.2517-6161.1977.tb01600.x

A. Deri and K. Knight, Grapheme-to-Phoneme Models for (Almost) Any Language, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.399-408, 2016.
DOI : 10.18653/v1/P16-1038

M. T. Diab, Word sense disambiguation within a multilingual framework, 2003.

R. M. W. Dixon, Ergativity, 1994.

M. S. Dryer, Large linguistic areas and language sampling, Studies in Language: International Journal sponsored by the Foundation "Foundations of Language", pp.257-292, 1989.

M. S. Dryer and M. Haspelmath, WALS Online, Max Planck Institute for Evolutionary Anthropology, 2013.

M. Dunn, S. J. Greenhill, S. C. Levinson, and R. D. Gray, Evolved structure of language shows lineage-specific trends in word-order universals, Nature, vol.473, issue.7345, pp.79-82, 2011.

L. Duong, T. Cohn, S. Bird, and P. Cook, Low Resource Dependency Parsing: Cross-lingual Parameter Sharing in a Neural Network Parser, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp.845-850, 2015.
DOI : 10.3115/v1/P15-2139

L. Duong, T. Cohn, S. Bird, and P. Cook, A Neural Network Model for Low-Resource Universal Dependency Parsing, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.339-348, 2015.
DOI : 10.18653/v1/D15-1040

L. Duong, H. Kanayama, T. Ma, S. Bird, and T. Cohn, Learning crosslingual word embeddings without bilingual corpora. arXiv preprint, 2016.

W. H. Durham, Coevolution: Genes, culture, and human diversity, 1991.

G. Durrett, A. Pauls, and D. Klein, Syntactic transfer using a bilingual lexicon, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp.1-11, 2012.

N. Evans, Semantic typology, The Oxford Handbook of Linguistic Typology, pp.504-533, 2011.

N. Evans and S. C. Levinson, The myth of language universals: Language diversity and its importance for cognitive science, Behavioral and Brain Sciences, vol.32, issue.05, pp.429-448, 2009.
DOI : 10.1017/S0140525X0999094X

M. Fang and T. Cohn, Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp.587-593, 2017.
DOI : 10.18653/v1/P17-2093

M. Faruqui, J. Dodge, S. K. Jauhar, C. Dyer, E. Hovy et al., Retrofitting Word Vectors to Semantic Lexicons, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1606-1615, 2015.
DOI : 10.3115/v1/N15-1184

A. Moreo Fernández, A. Esuli, and F. Sebastiani, Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification, Journal of Artificial Intelligence Research, vol.55, pp.131-163, 2015.
DOI : 10.1613/jair.4762

K. Ganchev and D. Das, Cross-lingual discriminative learning of sequence models with posterior regularization, pp.1996-2006, 2013.

K. Ganchev, J. Gillenwater, and B. Taskar, Dependency grammar induction via bitext projection constraints, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL-IJCNLP '09, pp.369-377, 2009.
DOI : 10.3115/1687878.1687931

K. Ganchev, J. Gillenwater, and B. Taskar, Posterior regularization for structured latent variable models, Journal of Machine Learning Research, vol.11, pp.2001-2049, 2010.

N. Garg and J. Henderson, A Bayesian model of multilingual unsupervised semantic role induction, 2016.

R. Georgi, F. Xia, and W. Lewis, Comparing language similarity across genetic and typologically-based groupings, COLING, pp.385-393, 2010.

D. Gerz, E. M. Ponti, J. Naradowsky, R. Reichart, A. Korhonen et al., Language modeling for morphologically rich languages: Character-aware modeling for word-level prediction, Transactions of the Association for Computational Linguistics, 2018.

D. Gillick, C. Brunk, O. Vinyals, and A. Subramanya, Multilingual language processing from bytes, Proceedings of NAACL-HLT, pp.1296-1306, 2016.

A. Globerson and T. S. Jaakkola, Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations, NIPS, pp.553-560, 2007.

R. Goedemans, J. Heinz, and H. van der Hulst, 2014.

S. Gouws, Y. Bengio, and G. Corrado, Bilbowa: Fast bilingual distributed representations without word alignments, International Conference on Machine Learning, pp.748-756, 2015.

S. Gouws and A. Søgaard, Simple task-specific bilingual word embeddings, Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1386-1390, 2015.
DOI : 10.3115/v1/N15-1157

E. Grave and N. Elhadad, A convex and feature-rich discriminative approach to dependency grammar induction, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.1375-1384, 2015.
DOI : 10.3115/v1/P15-1133

J. H. Greenberg, Some universals of grammar with particular reference to the order of meaningful elements, pp.73-113, 1963.

J. H. Greenberg, Universals of language, 1966.
DOI : 10.1515/9783110899771

J. Guo, W. Che, H. Wang, and T. Liu, Exploiting multi-typed treebanks for parsing with deep multi-task learning. arXiv preprint, 2016.

J. Guo, W. Che, D. Yarowsky, H. Wang, and T. Liu, Cross-lingual Dependency Parsing Based on Distributed Representations, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.1234-1244, 2015.
DOI : 10.3115/v1/P15-1119

J. Guo, W. Che, D. Yarowsky, H. Wang, and T. Liu, A representation learning framework for multi-source transfer parsing, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, pp.2734-2740, 2016.

T. Ha, J. Niehues, and A. Waibel, Toward multilingual neural machine translation with universal encoder and decoder. arXiv preprint, 2016.

H. Hammarström, R. Forkel, M. Haspelmath, and S. Bank, Glottolog 2.7, Max Planck Institute for the Science of Human History, 2016.

J. Hana, A. Feldman, and C. Brew, A resource-light approach to Russian morphology: Tagging Russian using Czech resources, EMNLP, pp.222-229, 2004.

I. Hartmann, M. Haspelmath, and B. Taylor, Valency Patterns Leipzig, Max Planck Institute for Evolutionary Anthropology, 2013.

M. Haspelmath, Optimality and diachronic adaptation, Zeitschrift für Sprachwissenschaft, vol.18, issue.2, pp.180-205, 1999.
DOI : 10.1515/zfsw.1999.18.2.180

M. Haspelmath, Pre-established categories don't exist: Consequences for language description and typology, Linguistic Typology, pp.119-132, 2007.

M. Haspelmath and U. Tadmor, WOLD. Max Planck Institute for Evolutionary Anthropology, 2009.

K. M. Hermann and P. Blunsom, Multilingual distributed representations without word alignment, arXiv preprint arXiv:1312.6173, 2013.

R. Hwa, P. Resnik, A. Weinberg, C. I. Cabezas, and O. Kolak, Bootstrapping parsers via syntactic projection across parallel texts, Natural Language Engineering, vol.11, issue.03, pp.311-325, 2005.
DOI : 10.1017/S1351324905003840

N. Ide, T. Erjavec, and D. Tufis, Sense discrimination with parallel corpora, Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, pp.61-66, 2002.
DOI : 10.3115/1118675.1118683

URL : http://dl.acm.org/ft_gateway.cfm?id=1118683&type=pdf

F. V. Jensen, An introduction to Bayesian networks, 1996.

M. Johnson, M. Schuster, Q. V. Le, M. Krikun, Y. Wu et al., Google's multilingual neural machine translation system: Enabling zero-shot translation, 2016.
DOI : 10.1162/tacl_a_00065

M. R. Key and B. Comrie, IDS, Max Planck Institute for Evolutionary Anthropology, 2015.

M. M. Khapra, S. Joshi, A. Chatterjee, and P. Bhattacharyya, Together we can: Bilingual bootstrapping for WSD, ACL, pp.561-569, 2011.

S. Kim, K. Toutanova, and H. Yu, Multilingual named entity recognition using parallel data and metadata from Wikipedia, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp.694-702, 2012.

A. Klementiev, I. Titov, and B. Bhattarai, Inducing crosslingual distributed representations of words, COLING, pp.1459-1474, 2012.

N. Komodakis, N. Paragios, and G. Tziritas, MRF Energy Minimization and Beyond via Dual Decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.33, issue.3, pp.531-552, 2011.
DOI : 10.1109/TPAMI.2010.108

URL : https://hal.archives-ouvertes.fr/hal-00856311

M. Kozhevnikov and I. Titov, Cross-lingual transfer of semantic role labeling models, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp.1190-1200, 2013.

S. Lauly, A. Boulanger, and H. Larochelle, Learning multilingual word representations using a bag-of-words autoencoder. arXiv preprint, 2014.

E. Lefever, V. Hoste, and M. D. Cock, Parasense or how to use parallel corpora for word sense disambiguation, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pp.317-322, 2011.

O. Levy and Y. Goldberg, Dependency-Based Word Embeddings, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp.302-308, 2014.
DOI : 10.3115/v1/P14-2050

M. P. Lewis, G. F. Simons, and C. D. Fennig, Ethnologue: Languages of the world, 2016.

W. D. Lewis and F. Xia, Automatically identifying computationally relevant typological features, IJCNLP, pp.685-690, 2008.

S. Li, J. V. Graça, and B. Taskar, Wiki-ly supervised part-of-speech tagging, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp.1389-1398, 2012.

P. Littell, D. R. Mortensen, and L. Levin,

H. Liu, Dependency direction as a means of word-order typology: A method based on dependency treebanks, Lingua, vol.120, issue.6, pp.1567-1578, 2010.
DOI : 10.1016/j.lingua.2009.10.001

X. Lu, Exploring word order universals: a probabilistic graphical model approach, Proceedings of ACL (Student Research Workshop), pp.150-157, 2013.

T. Luong, H. Pham, and C. D. Manning, Bilingual Word Representations with Monolingual Quality in Mind, Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, pp.151-159, 2015.
DOI : 10.3115/v1/W15-1521

X. Ma and F. Xia, Unsupervised Dependency Parsing with Transferring Distribution via Parallel Guidance and Entropy Regularization, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.1337-1348, 2014.
DOI : 10.3115/v1/P14-1126

I. Maddieson, S. Flavier, E. Marsico, C. Coupé, and F. Pellegrino, LAPSyd: Lyon-Albuquerque phonological systems database, INTERSPEECH, pp.3022-3026, 2013.
URL : https://hal.archives-ouvertes.fr/halshs-01179104

A. Majid, M. Bowerman, M. van Staden, and J. S. Boster, The semantic categories of cutting and breaking events: A crosslinguistic perspective, Cognitive Linguistics, vol.18, issue.2, pp.133-152, 2007.

C. Malaviya, G. Neubig, and P. Littell, Learning Language Representations for Typology Prediction, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp.2529-2535, 2017.
DOI : 10.18653/v1/D17-1268

G. S. Mann and A. McCallum, Generalized expectation criteria for semi-supervised learning of conditional random fields, ACL, pp.870-878, 2008.

T. Mayer and M. Cysouw, Language comparison through sparse multilingual word alignment, Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH, pp.54-62, 2012.

R. McDonald, K. Crammer, and F. Pereira, Online large-margin training of dependency parsers, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, ACL '05, pp.91-98, 2005.
DOI : 10.3115/1219840.1219852

R. McDonald, S. Petrov, and K. Hall, Multi-source transfer of delexicalized dependency parsers, EMNLP, pp.62-72, 2011.

S. M. Michaelis, P. Maurer et al., Atlas of Pidgin and Creole Language Structures Online, Max Planck Institute for Evolutionary Anthropology, 2013.

T. Mikolov, Q. V. Le, and I. Sutskever, Exploiting similarities among languages for machine translation, arXiv preprint, 2013.

T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, Distributed representations of words and phrases and their compositionality, NIPS, pp.3111-3119, 2013.

S. Moran, D. McCloy, and R. Wright, PHOIBLE Online, Max Planck Institute for Evolutionary Anthropology, 2014.

N. Mrkšić, I. Vulić, D. Ó Séaghdha, I. Leviant, R. Reichart et al., Semantic specialization of distributional word vector spaces using monolingual and cross-lingual constraints, Transactions of the Association for Computational Linguistics, vol.5, issue.1, pp.309-324, 2017.

N. Mrkšić, D. Ó Séaghdha, B. Thomson, M. Gašić et al., Counter-fitting word vectors to linguistic constraints, NAACL-HLT, pp.142-148, 2016.

Y. Murawaki, Diachrony-aware induction of binary latent representations from typological features, Proceedings of the Eighth International Joint Conference on Natural Language Processing, pp.451-461, 2017.

T. Naseem, R. Barzilay, and A. Globerson, Selective sharing for multilingual dependency parsing, ACL, pp.629-637, 2012.

T. Naseem, B. Snyder, J. Eisenstein, and R. Barzilay, Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches, Journal of Artificial Intelligence Research, vol.36, pp.341-385, 2009.
DOI : 10.1613/jair.2843

T. Naseem, H. Chen, R. Barzilay, and M. Johnson, Using universal linguistic knowledge to guide grammar induction, Proc. of EMNLP, 2010.

R. Navigli, Word sense disambiguation, ACM Computing Surveys, vol.41, issue.2, p.10, 2009.
DOI : 10.1145/1459352.1459355

J. Nichols, Language diversity in space and time, 1992.

J. Niehues, T. Herrmann, S. Vogel, and A. Waibel, Wider context by using bilingual language models in machine translation, Proceedings of the Sixth Workshop on Statistical Machine Translation, pp.198-206, 2011.

J. Nivre, M.-C. de Marneffe, F. Ginter, Y. Goldberg, J. Hajič, C. D. Manning et al., Universal dependencies v1: A multilingual treebank collection, LREC, pp.1659-1666, 2016.

H. O'Horan, Y. Berzak, I. Vulić, R. Reichart, and A. Korhonen, Survey on the use of typological information in natural language processing, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp.1297-1308, 2016.

D. Osborne, S. Narayan, and S. B. Cohen, Encoding prior knowledge with eigenword embeddings, Transactions of the ACL, 2016.
DOI : 10.1162/tacl_a_00108

R. Östling, Word order typology through multilingual word alignment, ACL, pp.205-211, 2015.

R. Östling and J. Tiedemann, Continuous multilinguality with language vectors. arXiv preprint, 2016.

S. Padó and M. Lapata, Cross-linguistic projection of role-semantic information, Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pp.859-866, 2005.
DOI : 10.3115/1220575.1220683

S. Padó and M. Lapata, Cross-lingual Annotation Projection for Semantic Roles, Journal of Artificial Intelligence Research, vol.36, issue.1, pp.307-340, 2009.
DOI : 10.1613/jair.2863

N. Pappas and A. Popescu-Belis, Multilingual hierarchical attention networks for document classification, 8th International Joint Conference on Natural Language Processing (IJCNLP), p.231134, 2017.

A. Pawley, A language which defies description by ordinary means, The role of theory in language description, pp.87-130, 1993.

M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark et al., Deep Contextualized Word Representations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp.2227-2237, 2018.
DOI : 10.18653/v1/N18-1202

F. Plank and E. Filimonova, Universals archive, 1996.
DOI : 10.1515/lingty.2006.015

L. van der Plas, P. Merlo, and J. Henderson, Scaling up automatic cross-lingual semantic role annotation, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pp.299-304, 2011.

E. M. Ponti, R. Reichart, A. Korhonen, and I. Vulić, Isomorphic transfer of syntactic structures for cross-lingual NLP, 2018.

E. M. Ponti, I. Vulić, and A. Korhonen, Decoding Sentiment from Distributed Representations of Sentences, Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), pp.22-32, 2017.
DOI : 10.18653/v1/S17-1003

P. Prettenhofer and B. Stein, Cross-language text classification using structural correspondence learning, Proceedings of the 48th annual meeting of the association for computational linguistics, pp.1118-1127, 2010.
DOI : 10.1145/2036264.2036277

URL : http://arxiv.org/pdf/1008.0716.pdf

M. S. Rasooli and M. Collins, Density-Driven Cross-Lingual Transfer of Dependency Parsers, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.328-338, 2015.
DOI : 10.18653/v1/D15-1039

R. Rosa and Z. Žabokrtský, KLcpos3 - a Language Similarity Measure for Delexicalized Parser Transfer, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp.243-249, 2015.
DOI : 10.3115/v1/P15-2040

M. Ross, Social networks and kinds of speech community event, Archaeology and Language I, Routledge, pp.209-261, 1997.

S. Rothe and H. Schütze, AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.1793-1803, 2015.
DOI : 10.3115/v1/P15-1173

G. Rotman, I. Vulić, and R. Reichart, Bridging languages through images with deep partial canonical correlation analysis, Proceedings of ACL 2018, 2018.

R. Saha Roy, R. Katare, N. Ganguly, and M. Choudhury, Automatic discovery of adposition typology, Proceedings of COLING, pp.1037-1046, 2014.

S. Ruder, A survey of cross-lingual embedding models, Journal of Artificial Intelligence Research, 2018.

E. Sapir, Language: An introduction to the study of speech, 1921.

F. Schlegel, Über die Sprache und Weisheit der Indier, 1808.

P. Schone and D. Jurafsky, Language-independent induction of part of speech class labels using only language universals, IJCAI-2001 Workshop "Text Learning: Beyond Supervision", 2001.

R. Sennrich and B. Haddow, Linguistic Input Features Improve Neural Machine Translation, Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers, pp.83-91, 2016.
DOI : 10.18653/v1/W16-2209

C. Silberer and S. P. Ponzetto, UHD: Cross-lingual word sense disambiguation using multilingual co-occurrence graphs, Proceedings of the 5th International Workshop on Semantic Evaluation, pp.134-137, 2010.

D. A. Smith and J. Eisner, Parser adaptation and projection with quasi-synchronous grammar features, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 2, EMNLP '09, pp.822-831, 2009.
DOI : 10.3115/1699571.1699620

URL : http://dl.acm.org/ft_gateway.cfm?id=1699620&type=pdf

B. Snyder, Unsupervised Multilingual Learning, 2010.
DOI : 10.3115/1613715.1613851

URL : http://dl.acm.org/ft_gateway.cfm?id=1613851&type=pdf

B. Snyder and R. Barzilay, Unsupervised multilingual learning for morphological segmentation, Proceedings of ACL-08: HLT, pp.737-745, 2008.

B. Snyder, T. Naseem, J. Eisenstein, and R. Barzilay, Adding more languages improves unsupervised multilingual part-of-speech tagging, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL '09, pp.83-91, 2009.
DOI : 10.3115/1620754.1620767

URL : http://dl.acm.org/ft_gateway.cfm?id=1620767&type=pdf

A. Søgaard, Data point selection for cross-language adaptation of dependency parsers, ACL, pp.682-686, 2011.

A. Søgaard, Ž. Agić, H. Martínez Alonso, B. Plank, B. Bohnet et al., Inverted indexing for cross-lingual NLP, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.1713-1722, 2015.
DOI : 10.3115/v1/P15-1165

A. Søgaard and J. Wulff, An empirical study of non-lexical extensions to delexicalized transfer, Proceedings of COLING 2012: Posters, pp.1181-1190, 2012.

V. I. Spitkovsky, H. Alshawi, and D. Jurafsky, Lateen EM: Unsupervised training with multiple objectives, applied to dependency grammar induction, EMNLP, pp.1269-1280, 2011.

K. Spreyer and J. Kuhn, Data-driven dependency parsing of new languages using incomplete and noisy training data, Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL '09, pp.12-20, 2009.
DOI : 10.3115/1596374.1596380

R. Sproat, Language typology in speech and language technology, Linguistic Typology, vol.20, issue.3, pp.635-644, 2016.
DOI : 10.1515/lingty-2016-0034

O. Täckström, D. Das, S. Petrov, R. McDonald, and J. Nivre, Token and type constraints for cross-lingual part-of-speech tagging, Transactions of the Association for Computational Linguistics, vol.1, pp.1-12, 2013.

O. Täckström, R. McDonald, and J. Nivre, Target language adaptation of discriminative transfer parsers, NAACL-HLT, pp.1061-1071, 2013.

O. Täckström, R. McDonald, and J. Uszkoreit, Cross-lingual word clusters for direct transfer of linguistic structure, NAACL-HLT, pp.477-487, 2012.

K. S. Tai, R. Socher, and C. D. Manning, Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp.1556-1566, 2015.
DOI : 10.3115/v1/P15-1150

URL : https://doi.org/10.3115/v1/p15-1150

H. Takamura, R. Nagata, and Y. Kawasaki, Discriminative analysis of linguistic features for typological study, LREC, pp.69-76, 2016.

L. Talmy, Path to Realization: A Typology of Event Conflation, Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on The Grammar of Event Structure, pp.480-519, 1991.
DOI : 10.3765/bls.v17i0.1620

B. Taskar, C. Guestrin, and D. Koller, Max-margin Markov networks, NIPS, pp.25-32, 2004.

Y. W. Teh, H. Daumé III, and D. M. Roy, Bayesian agglomerative clustering with coalescents, Proceedings of NIPS, pp.1473-1480, 2007.

J. Tiedemann, Cross-lingual dependency parsing with universal dependencies and predicted pos labels, Proceedings of the Third International Conference on Dependency Linguistics, pp.340-349, 2015.

I. Titov and A. Klementiev, Crosslingual induction of semantic roles, Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, pp.647-656, 2012.

Y. Tsvetkov, S. Sitaram, M. Faruqui, G. Lample, P. Littell et al., Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1357-1366, 2016.
DOI : 10.18653/v1/N16-1161

URL : https://doi.org/10.18653/v1/n16-1161

S. Upadhyay, M. Faruqui, C. Dyer, and D. Roth, Cross-lingual Models of Word Embeddings: An Empirical Comparison, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.1661-1670, 2016.
DOI : 10.18653/v1/P16-1157

URL : https://doi.org/10.18653/v1/p16-1157

I. Vulić, W. De Smet, and M. Moens, Identifying word translations from comparable corpora using latent topic models, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers, pp.479-484, 2011.

I. Vulić, G. Glavaš, N. Mrkšić, and A. Korhonen, Post-specialisation: Retrofitting vectors of words unseen in lexical resources, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.516-527, 2018.

I. Vulić and A. Korhonen, Is "universal syntax" universally useful for learning distributed word representations?, The 54th Annual Meeting of the Association for Computational Linguistics, pp.518-524, 2016.

I. Vulić and M. Moens, Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction, Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp.719-725, 2015.

I. Vulić, N. Mrkšić, R. Reichart, D. Ó Séaghdha et al., Morph-fitting: Fine-tuning word vector spaces with simple language-specific rules, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp.56-68, 2017.

B. Wälchli and M. Cysouw, Lexical typology through similarity semantics: Toward a semantic map of motion verbs, Linguistics, vol.50, issue.3, pp.671-710, 2012.

X. Wan, Co-training for cross-lingual sentiment classification, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL-IJCNLP '09, pp.235-243, 2009.
DOI : 10.3115/1687878.1687913

URL : http://dl.acm.org/ft_gateway.cfm?id=1687913&type=pdf

D. Wang and J. Eisner, The galactic dependencies treebanks: Getting more data by synthesizing new languages, Transactions of the Association for Computational Linguistics, vol.4, pp.491-505, 2016.
DOI : 10.1162/tacl_a_00113

URL : https://doi.org/10.1162/tacl_a_00113

D. Wang and J. Eisner, Fine-grained prediction of syntactic typology: Discovering latent structure with supervised learning, Transactions of the Association for Computational Linguistics, vol.5, 2017.

M. Wang and C. D. Manning, Cross-lingual pseudo-projected expectation regularization for weakly supervised learning, Transactions of the Association for Computational Linguistics, vol.2, pp.55-66, 2014.
DOI : 10.1162/tacl_a_00165

URL : https://doi.org/10.1162/tacl_a_00165

S. Wichmann, E. W. Holman, and C. H. Brown, The ASJP Database (version 17), Max Planck Institute for Evolutionary Anthropology, 2016.

G. Wisniewski, N. Pécheux, S. Gahbiche-Braham, and F. Yvon, Cross-Lingual Part-of-Speech Tagging through Ambiguous Learning, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.1779-1785, 2014.
DOI : 10.3115/v1/D14-1187

URL : https://hal.archives-ouvertes.fr/hal-01908356

M. Xiao and Y. Guo, Distributed Word Representation Learning for Cross-Lingual Dependency Parsing, Proceedings of the Eighteenth Conference on Computational Natural Language Learning, pp.119-129, 2014.
DOI : 10.3115/v1/W14-1613

Z. Yang, R. Salakhutdinov, and W. Cohen, Multi-task cross-lingual sequence tagging from scratch, arXiv preprint, 2016.

D. Yarowsky, G. Ngai, and R. Wicentowski, Inducing multilingual text analysis tools via robust projection across aligned corpora, Proceedings of the first international conference on Human language technology research, HLT '01, pp.1-8, 2001.
DOI : 10.3115/1072133.1072187

URL : http://www.cs.jhu.edu/~yarowsky/pdfpubs/hlt2001.pdf

D. Zeman and P. Resnik, Cross-language parser adaptation between related languages, Proceedings of IJCNLP, pp.35-42, 2008.

O. Zennaki, N. Semmar, and L. Besacier, Inducing multilingual text analysis tools using bidirectional recurrent neural networks, pp.450-460, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01374205

D. Zhang, Q. Mei, and C. Zhai, Cross-lingual latent topic extraction, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp.1128-1137, 2010.

Y. Zhang and R. Barzilay, Hierarchical Low-Rank Tensors for Multilingual Transfer Parsing, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.1857-1867, 2015.
DOI : 10.18653/v1/D15-1213

URL : https://doi.org/10.18653/v1/d15-1213

Y. Zhang, D. Gaddy, R. Barzilay, and T. Jaakkola, Ten Pairs to Tag - Multilingual POS Tagging via Coarse Mapping between Embeddings, Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.1307-1317, 2016.
DOI : 10.18653/v1/N16-1156

URL : https://doi.org/10.18653/v1/n16-1156

Y. Zhang, R. Reichart, R. Barzilay, and A. Globerson, Learning to map into a universal POS tagset, EMNLP, pp.1368-1378, 2012.

G. Zhou, T. He, J. Zhao, and W. Wu, A subspace learning framework for cross-lingual sentiment classification with partial parallel data, Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), pp.1426-1432, 2015.

X. Zhou, X. Wan, and J. Xiao, Cross-Lingual Sentiment Classification with Bilingual Document Representation Learning, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.1403-1412, 2016.
DOI : 10.18653/v1/P16-1133

URL : https://doi.org/10.18653/v1/p16-1133

W. Y. Zou, R. Socher, D. Cer, and C. D. Manning, Bilingual word embeddings for phrase-based machine translation, Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp.1393-1398, 2013.