Table D.3 presents the results of this evaluation together with an error analysis. The proportion of correctly extracted links ranges from 78% (Spanish) to 97.3% (Dutch). These figures are for the evaluation carried out before the filtering phase, which is described in Section 9.

Table D.3 gives a mean score of 87.6% for the unfiltered extraction. This score is biased, however, because the smallest language editions are over-represented: each edition counts equally in the mean, regardless of how many links were extracted from it. Weighting the scores by the size of the extraction (Table D.2) yields a more reliable estimate.
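The difference between the two averages can be sketched as follows. This is a minimal illustration, not the procedure actually used for the tables: the function names are hypothetical, and the extraction sizes are invented placeholders (the real sizes are in Table D.2 and are not reproduced here). Only the two per-language scores, 78% and 97.3%, come from the text above.

```python
def unweighted_mean(scores):
    """Plain average: every language edition counts equally."""
    return sum(scores.values()) / len(scores)


def size_weighted_mean(scores, sizes):
    """Average of per-language scores weighted by extraction size."""
    if set(scores) != set(sizes):
        raise ValueError("scores and sizes must cover the same languages")
    total = sum(sizes.values())
    if total <= 0:
        raise ValueError("total extraction size must be positive")
    return sum(scores[lang] * sizes[lang] for lang in scores) / total


# Scores taken from the text (Spanish and Dutch).
scores = {"es": 0.78, "nl": 0.973}

# HYPOTHETICAL extraction sizes, chosen only to illustrate the effect;
# they are not the values of Table D.2.
sizes = {"es": 3000, "nl": 1000}

plain = unweighted_mean(scores)
weighted = size_weighted_mean(scores, sizes)
print(f"unweighted: {plain:.4f}  size-weighted: {weighted:.4f}")
```

With these placeholder sizes, the larger edition pulls the weighted estimate toward its own score, which is the correction the paragraph above describes. When all editions have the same size, the two averages coincide.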

J. Aberdeen, S. Bayer, R. Yeniterzi, B. Wellner, C. Clark et al., The MITRE Identification Scrubber Toolkit: Design, training, and assessment, International Journal of Medical Informatics, vol.79, issue.12, pp.79-849, 2010.
DOI : 10.1016/j.ijmedinf.2010.09.007

A. Ahmad, S. M. Halawani, and I. A. Albidewi, Novel ensemble methods for regression via classification problems, Expert Systems with Applications, vol.39, issue.7, pp.6396-6401, 2012.
DOI : 10.1016/j.eswa.2011.12.029

K. Ahmad, L. Gillam, and L. Tostevin, University of Surrey Participation in TREC8 : Weirdness indexing for logical document extrapolation and retrieval (wilder), 1999.

I. A. Al-sughaiyer and I. A. Al-kharashi, Arabic morphological analysis techniques: A comprehensive survey, Journal of the American Society for Information Science and Technology, vol.16, issue.3, pp.189-213, 2004.
DOI : 10.1002/asi.10368

L. Al-sulaiti and E. S. Atwell, The design of a corpus of Contemporary Arabic, International Journal of Corpus Linguistics, vol.11, issue.2, pp.135-171, 2006.
DOI : 10.1075/ijcl.11.2.02als

A. Alajmi, E. Saad, and R. Darwish, Toward an arabic stop-words list generation, International Journal of Computer Applications, p.46, 2012.

M. Alamgir and U. Von-luxburg, Multi-agent Random Walks for Local Clustering on Graphs, 2010 IEEE International Conference on Data Mining, pp.18-27, 2010.
DOI : 10.1109/ICDM.2010.87

L. Alexeeva, Proceedings of the theoretical foundations of terminology comparison between eastern europe and western countries in conjunction with the 14th European Symposium on Language for Special Purposes (LSP), chapter Interaction Between Terminology and Philosophy, 2006.

J. Allwood, A. Hendrikse, and E. Ahlse?nahlse?n, Words and alternative basic units for linguistic analysis. Linguistic Theory and Raw Sound, pp.9-25, 2010.

S. Ananiadou, A methodology for automatic term recognition, Proceedings of the 15th conference on Computational linguistics -, pp.1034-1038, 1994.
DOI : 10.3115/991250.991317

S. Anderson, A-Morphoo Morpholoo. Cambridge Studies in Linguistics, 1992.

S. R. Anderson, The morpheme : Its nature and use. The Oxford Handbook of Inflection, 2013.

J. Atserias, L. Villarejo, G. Rigau, E. Agirre, J. Carroll et al., The meaning multilingual central repository, Proceedings of the Second International WordNet Conference, pp.80-210, 2004.

S. Aubin and T. Hamon, Improving Term Extraction with Terminological Resources, Advancc in Natural Language Processing, pp.380-387, 2006.
DOI : 10.1007/11816508_39

URL : https://hal.archives-ouvertes.fr/hal-00091444

T. Baccouche and S. Mejri, Norme grammaticale et description linguistique??: le cas de l'arabe, Langages, vol.167, issue.3, pp.27-37, 2007.
DOI : 10.3917/lang.167.0027

URL : https://hal.archives-ouvertes.fr/halshs-00410716

E. Badawi, M. Carter, M. Carter, and A. Gully, Modern Written Arabic : A Comprehensive Grammar Routledge Comprehensive Grammars, 2013.

D. Bakker, ?. Mu, A. Velupillai, V. Wichmann, S. Brown et al., Adding typology to lexicostatistics: A combined approach to language classification, Linguistic Typoloo, pp.169-181, 2009.
DOI : 10.1515/LITY.2009.009

T. Baldwin, J. Pool, and S. M. Colowick, Panlex and lextract : Translating all words of all languages of the world, COLING 2010, 23rd International Conference on Computational Linguistics, Demonstrations Volume, pp.23-27, 2010.

A. Baroni, Alphabetic vs. non-alphabetic writing : Linguistic fit and natural tendencies, Rivista di Linguistica, vol.23, issue.2, pp.127-159, 2011.

M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta, The wacky wide web : a collection of very large linguistically processed web-crawled corpora. Language resourcc and evaluation, pp.209-226, 2009.

M. Baroni and A. Kilgarriff, Large linguistically-processed web corpora for multiple languages, Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations on, EACL '06, pp.87-90, 2006.
DOI : 10.3115/1608974.1608976

M. Bastian, S. Heymann, and M. Jacomy, Gephi : an open source software for exploring and manipulating networks, ICWSM, vol.8, pp.361-362, 2009.

J. Baudouin-de-courtenay, A Baudouin de Courtenay anthology, 1895.

S. D. Bay, Multivariate discretization of continuous variables for set mining, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining , KDD '00, pp.315-319, 2000.
DOI : 10.1145/347090.347159

K. R. Beesley, Finite-state morphological analysis and generation of Arabic at Xerox Research : Status and plans in 2001, The Arabic Language Processing : Statt and Prospect?39th Annual Meeting of the Association for Computational Linguistics, pp.1-8, 2001.

R. Benczes, Creative Compounding in English : The Semantics of Metaphorical and Metonymical Noun-noun Combinations. Human cognitive processing, 2006.
DOI : 10.1075/hcp.19

E. Bender, On achieving and evaluating language-independence in NLP, Linguistic Issuu in Language Technoloo, vol.6, issue.0, 2011.

C. Bentz, D. Kiela, F. Hill, and P. Buttery, Zipf's law and the grammar of languages (abstract). A quantitative study of Old and Modern, 2014.

D. Bernhard, Apprentissage non supervisé de familles morphologiques : Comparaison de méthodes et aspects multilingues, Traitement Automatique dd Languu, vol.51, issue.2, pp.11-39, 2010.

B. Bhat, L. Poddar, and P. Bhattacharyya, IndoNet : A Multilingual Lexical Knowledge Network for Indian Languages, Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2013.

A. Bisetto and S. Scalise, Classification of compounds. university of bologna, 2009.

F. Bond, Z. Chang, and K. Uchimoto, Extracting bilingual terms from mainly monolingual data, 14th Annual Meeting of the Association for Natural Language Processing, 2008.

S. Boulaknadel, B. Daille, and D. Aboutajdine, A multi-word term extraction program for Arabic language, Proceedings of the Sixth International Conference on Language Resourcc and Evaluation (LREC'08) European Language Resources Association (ELRA), 2008.

D. Bourigault, Surface grammatical analysis for the extraction of terminological noun phrases, Proceedings of the 14th conference on Computational linguistics -, pp.977-981, 1992.
DOI : 10.3115/992383.992415

D. Bourigault, N. Aussenac-gilles, and J. Charlet, Construction de ressources terminologiques ou ontologiques ?? partir de textes Un cadre unificateur pour trois ??tudes de cas, Revue d'intelligence artificielle, vol.18, issue.1, pp.87-110, 2004.
DOI : 10.3166/ria.18.87-110

D. Bourigault and C. Fabre, Approche linguistique pour l'analyse syntaxique de corpus. Cahiers de grammaire, pp.131-151, 2000.

D. Bourigault and C. Jacquemin, Term extraction + term clustering, Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics -, pp.15-22, 1999.
DOI : 10.3115/977035.977039

T. Buckwalter, Buckwalter Arabic morphological analyzer version 1.0. Linguistic Data Consortium (LDC) catalog number, 2002.

T. Bynon, Approaches to morphological typology, Morphologie / Morpholoo. Ein internationall Handbuch zur Flexion und Wortbildung, 2004.

. T. Cabre?mcabre?m, La terminologie : théorie, méthode et applications. Les Presses de l'Université d'Ottawa, 1998.

. T. Cabre?mcabre?m, On diversity and terminology, Terminoloo, vol.2, issue.1, pp.1-16, 1995.

M. V. Campenhoudt, Que nous reste-t-il d'Eugen Wüster ?, 2006.

D. Candel, Wüster par lui-même, Terminologie : problèmm théoriquu, pp.15-32, 2004.

G. Cao, J. Gao, and J. Nie, A system to mine large-scale bilingual dictionaries from monolingual web pages. Proceedings of thé MT Summit XI -The Eleventh Machine Translation Summit, pp.57-64, 2007.

L. Cao, X. Zhao, H. Zheng, and B. Y. Zhao, Atll : Approximating shortest paths in social graphs, 2011.

S. A. Caraballo and E. Charniak, Determining the specificity of nouns from text, Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp.63-70, 1999.

J. Carletta, Assessing agreement on classification tasks : the kappa statistic, Computational linguistics, vol.22, issue.2, pp.249-254, 1996.

N. Catach, Les Histoires de l'Écriture ? Panorama critique ?, Histoire Épistémologie Langage, vol.19, issue.2, pp.177-185, 1997.

Y. Cen, Z. Han, and P. Ji, Chinese Term Recognition and Extraction Based on Hidden Markov Model, 2008 IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, pp.219-224, 2008.
DOI : 10.1109/PACIIA.2008.242

N. Chawla, K. W. , L. O. Kegelmeyer, and W. , SMOTE : synthetic minority over-sampling technique, Journal of Artificial Intelligence Research, vol.16, 2002.

M. Chen, B. Chang, and W. Pei, A Joint Model for Unsupervised Chinese Word Segmentation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.854-863, 2014.
DOI : 10.3115/v1/D14-1092

F. Chung, Random walks and local cuts in graphs, Linear Algebra and its Applications, vol.423, issue.1, pp.22-32, 2007.
DOI : 10.1016/j.laa.2006.07.018

A. Clark, Combining distributional and morphological information for part of speech induction, Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics , EACL '03, pp.59-66, 2003.
DOI : 10.3115/1067807.1067817

V. Claveau and M. &-l-'homme, Structuring terminology using analogy-based machine learning, Proceedings of the 7th International Conference on Terminoloo and Knowledge Engineering, pp.17-18, 2005.

L. Cle?mentcle?ment and E. Villemonte-de-la-clergerie, MAF : a morphosyntactic annotation framework, Proc. of the 2nd Language & Technoloo Conference (LT'05), pp.90-94, 2005.

B. Comrie, Language universals and linguistic typoloo, 1989.

M. D. Conrado, T. A. Pardo, and S. O. Rezende, A machine learning approach to automatic term extraction using a rich feature set, Proceedings of the NAACL HLT 2013 Student Research Workshop, pp.16-23, 2013.

G. G. Corbett, The number of genders in Polish. Papers and Studii in Contrastive Linguistics, VXI, Note : journal is now called Pozna? Studies in Contemporary Linguistics, pp.83-89, 1983.

M. Creutz and K. Lagus, Inducing the morphological lexicon of a natural language from unannotated text, Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning, 2005.

J. F. Da-silva and G. P. Lopes, A local maxima method and a fair dispersion normalization for extracting multi-word units from corpora, Sixth Meeting on Mathematics of Language, 1999.

I. Dagan, A. Itai, and U. Schwall, Two languages are more informative than one, Proceedings of the 29th annual meeting on Association for Computational Linguistics -, pp.130-137, 1991.
DOI : 10.3115/981344.981361

B. Daille, Approche mixte pour l'extraction automatique de terminologie : statistiquu lexicall et filtrr linguistiquu, 1994.

B. Daille, Conceptual structuring through term variations, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment -, pp.9-16, 2003.
DOI : 10.3115/1119282.1119284

URL : https://hal.archives-ouvertes.fr/hal-00456518

B. Daille, Variations and application-oriented terminology engineering, Terminoloo, vol.11, issue.1, pp.181-197, 2005.
DOI : 10.1075/bct.2.09dai

URL : https://hal.archives-ouvertes.fr/hal-00442194

B. Daille, Building bilingual terminologies from comparable corpora : The ttc termsuite, The 5th Workshop on Building and Using Comparable Corpora, pp.2-9, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00819594

B. Daille and H. Blancafort, Knowledge-poor and knowledge-rich approaches for multilingual terminology extraction, Proceedings, 13th International Conference on Intelligent Text Processing and Computational Linguistics, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00820322

P. T. Daniels, The Handbook of Linguistics, chapter Chapter, 2003.

S. David and P. Plante, De la nécessité d'une approche morpho-syntaxique dans l'analyse de textes, Intelligence artificielle et sciencc cognitivv au Québec, vol.3, issue.3, pp.140-154, 1990.

D. Davidov and A. Rappoport, Enhancement of lexical concepts using cross-lingual web mining, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 2, EMNLP '09, pp.852-861, 2009.
DOI : 10.3115/1699571.1699624

P. Dayan, Unsupervised learning. The MIT encyclopedia of the cognitive sciencc, 1999.

G. De-melo, Graph-based Methods for Large-Scale Multilingual Knowledge Integration, 2012.

G. De-melo and G. Weikum, Towards a universal wordnet by learning from combined evidence, Proceeding of the 18th ACM conference on Information and knowledge management, CIKM '09, 2009.
DOI : 10.1145/1645953.1646020

G. De-melo and G. Weikum, Towards a universal wordnet by learning from combined evidence, Proceeding of the 18th ACM conference on Information and knowledge management, CIKM '09, 2009.
DOI : 10.1145/1645953.1646020

V. De-paiva, A. Rademaker, and G. De-melo, Openwordnet-pt : An open brazilian wordnet for reasoning, Proceedings of the 24th International Conference on Computational Linguistics, 2012.

H. De?jeande?jean, E. Gaussier, and F. Sadat, Bilingual Terminology Extraction : An Approach based on a Multilingual thesaurus Applicable to Comparable Corpora, Proceedings of the 19th International Conference on Computational Linguistics COLING 2002, pp.218-224, 2002.

M. T. Diab, The feasibility of bootstrapping an Arabic wordnet leveraging parallel corpora and and an English wordnet, Proceedings of the Arabic Language Technologii and Resourcc, 2004.

Z. Dong, Q. Dong, and C. Hao, Word segmentation needs change-from a linguist's view, Proceedings of CIPS-SIGHAN Joint Conference on Chinese Language Processing, pp.1-7, 2010.

J. Dougherty, R. Kohavi, and M. Sahami, Supervised and Unsupervised Discretization of Continuous Features, Machine Learning : Proceedings of the Twelfth International Conference, pp.194-202, 1995.
DOI : 10.1016/B978-1-55860-377-6.50032-3

M. S. Dryer, Frequency and pragmatically unmarked word order, p.105, 1995.
DOI : 10.1075/tsl.30.06dry

M. S. Dryer, Word order. Language typoloo and syntactic description, pp.61-131, 2007.

M. S. Dryer, Prefixing vs Suffixing in Inflectional Morpholoo The World Atll of Language Structurr Online, 2013.

G. Drzazga, The Puzzle of Grammatical Gender : Insights from the Cognitive Theory of Translation and the Nature of Polish Hybrid Nouns, 2013.

O. Ducrot and J. Schaeffer, Nouveau dictionnaire encyclopédique dd sciencc du langage. Points (Paris), 1995.

H. Dyvik, Translations as semantic mirrors : from parallel corpus to wordnet, Proceedings of the Workshop Multilinguality in the lexicon II at the 13th biennial European Conference on Artificial Intelligence (ECAI'98), pp.24-44, 1998.

J. Edachery, A. Sen, and F. J. Brandenburg, Graph Clustering Using Distance-k Cliques, Graph drawing, pp.98-106, 1999.
DOI : 10.1007/3-540-46648-7_10

M. Ehrmann, LL Entitt Nomméé, de la Linguistique au TAL : Statut théorique et méthodd de désambiguïsation, 2008.

H. B. Eifring and R. Theil, Linguistics for students of Asian and African languagg, 2004.

E. Ayari and S. , Évaluation transparente du traitement dd éléments de réponse à une question factuelle, 2009.

E. Hadi, W. M. Timimi, I. Dabbadie, M. Choukri, K. Hamon et al., Terminological resources acquisition tools : Toward a user-oriented evaluation model, Proceedings of the 5th International Conference on Language Resourcc and Evaluation (LREC'06) European Language Resources Association (ELRA), pp.945-948, 2006.

V. Estivill-castro, Why so many clustering algorithms, ACM SIGKDD Explorations Newsletter, vol.4, issue.1, pp.65-75, 2002.
DOI : 10.1145/568574.568575

O. Etzioni, K. Reiter, S. Soderland, and M. Sammer, Lexical translation with application to image search on the web, 2007.

S. Evert, The Statistics of Word Cooccurrencc : Word Pairs and Collocations, 2005.

S. Evert, Corpora and collocations (extended manuscript), 2007.

X. Fan, N. Shimizu, and H. Nakagawa, Automatic extraction of bilingual terms from a Chinese-Japanese parallel corpus, Proceedings of the 3rd International Universal Communication Symposium on, IUCS '09, pp.41-45, 2009.
DOI : 10.1145/1667780.1667789

J. Fang, L. Sui, and H. Jian, Comparative Analysis of Continuous Entropy Estimation with Different Unsupervised Discretization Methods, Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013), 2013.
DOI : 10.2991/iccsee.2013.94

X. Farreres, G. Rigau, &. Rodri?guezrodri?guez, and H. , Using wordnet for building wordnets, Computing Research Repository, p.980, 1998.

C. Fellbaum and E. , WordNet : An Electronic Lexical Database, 1998.

J. Feuillet, Introduction à la typologie linguistique. Bibliothèque de grammaire et de linguistique, 2006.

J. R. Firth, A synopsis of linguistic theory 1930-55, pp.1952-59, 1957.

J. Foo, Computational terminology : Exploring bilingual and monolingual term extraction, 2012.

J. Foo and M. Merkel, Using machine learning to perform automatic term recognition, Proceedings of the LREC 2010 Workshop on Methods for automatic acquisition of Language Resourcc and their evaluation methods, pp.49-54, 2010.

S. Fortunato, Community detection in graphs, Physics Reports, vol.486, issue.3-5, pp.75-174, 2010.
DOI : 10.1016/j.physrep.2009.11.002

K. Frantzi, S. Ananiadou, and J. Tsuji, The c-value/nc-value method of automatic recognition for multi-word terms, Proceedings of the ECDL, 1998.

P. Fung and P. Cheung, Multi-level bootstrapping for extracting parallel sentences from a quasi-comparable corpus, Proceedings of the 20th international conference on Computational Linguistics , COLING '04, 2004.
DOI : 10.3115/1220355.1220506

P. Fung and K. Mckeown, Finding terminology translations from non-parallel corpora, Proceedings of the 5th Annual Workshop on Very Large Corpora, pp.192-202, 1997.

B. Gaillard, B. Gaume, and E. Navarro, Invariants and variability of synonymy networks : Self mediated agreement by confluence, Proceedings of TextGraphs-6 : Graphbased Methods for Natural Language Processing, pp.15-23, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00992057

B. Gaume, Balades Aléatoires dans les petits mondes lexicaux, I3 Information Interaction Intelligence, 2004.

B. Gaume, Mapping the forms of meaning in small worlds, International Journal of Intelligent Systems, vol.6, issue.2, pp.848-862, 2008.
DOI : 10.1002/int.20275

URL : https://hal.archives-ouvertes.fr/hal-01322013

I. J. Gelb, A study of writing : The foundations of grammatoloo, 1952.

D. Gil, Linguistic Fieldwork, chapter Escaping Eurocentrism : fieldwork as a process of unlearning, 2001.

A. L. Gilbert, T. Regier, P. Kay, and R. B. Ivry, Whorf hypothesis is supported in the right visual field but not the left, Proceedings of the National Academy of Sciencc of the United Statt of America, pp.489-494, 2006.
DOI : 10.1073/pnas.0509868103

?. Go, A. Kerslake, and C. , Turkish : A Comprehensive Grammar, 2005.

J. Goldsmith, Linguistica : An automatic morphological analyzer, Proceedings of 36th meeting of the Chicago Linguistic Society, 2000.

J. H. Greenberg, A Quantitative Approach to the Morphological Typology of Language, International Journal of American Linguistics, vol.26, issue.3, pp.178-194, 1960.
DOI : 10.1086/464575

J. H. Greenberg, Some universals of grammar with particular reference to the order of meaningful elements, pp.73-113, 1963.

J. H. Greenberg, Language universals (With special reference to feature hierarchii), 1966.

?. G. Grigonyte, ?. E. Rimkute, A. Utka, and L. Boizou, Experiments on Lithuanian term extraction, Proceedings of NODALIDA 2011 Conference, pp.82-89, 2011.

C. Grouin, Anonymisation de documents cliniquu : performancc et limitt dd méthodd symboliquu et par apprentissage statistique, 2013.

T. R. Gruber, A translation approach to portable ontology specifications. Knowledge acquisition, pp.199-220, 1993.

N. Guarino, D. Oberle, and S. Staab, What is an ontology ? In Handbook on ontologii, pp.1-17, 2009.

?. Hammarstro, H. Borin, and L. , Unsupervised Learning of Morphology, Computational Linguistics, vol.23, issue.3, pp.309-350, 2011.
DOI : 10.1162/coli.2009.35.4.35409

B. Hamp and H. Feldweg, Germanet -a lexical-semantic net for German, Proceedings of ACL workshop Automatic Information Extraction and Building of Lexical Semantic Resourcc for NLP Applications, pp.9-15, 1997.

J. Han and M. Kamber, Data Mining, 2006.
DOI : 10.1007/978-1-4899-7993-3_104-2

V. Hanoka and B. Sagot, Wordnet extension made simple : A multilingual lexiconbased approach using wiki resources, Proceedings of the Eight International Conference on Language Resourcc and Evaluation (LREC'12) European Language Resources Association (ELRA), 2012.
URL : https://hal.archives-ouvertes.fr/hal-00701606

V. Hanoka and B. Sagot, YaMTG : An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01022306

J. A. Hartigan and M. A. Wong, Algorithm AS 136: A K-Means Clustering Algorithm, Applied Statistics, vol.28, issue.1, pp.100-108, 1979.
DOI : 10.2307/2346830

M. Haspelmath, An Empirical Test of the Agglutination Hypothesis, of Studii in Natural Language and Linguistic Theory, pp.13-29, 2009.
DOI : 10.1007/978-1-4020-8825-4_2

D. Healy, Complete Vietnamese : Teach Yourself. Complete Languages, 2012.

U. Heid, A linguistic bootstrapping approach to the extraction of term candidates from German text, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, vol.5, issue.2, 1999.
DOI : 10.1075/term.5.2.06hei

H. Hjelm, Identifying cross language term equivalents using statistical machine translation and distributional association measures, Proceedings of NODALIDA, pp.97-104, 2007.

Y. Hu, M. Li, P. Zhang, Y. Fan, and Z. Di, Community detection by signaling on complex networks, Physical Review E, vol.78, issue.1, p.16115, 2008.
DOI : 10.1103/PhysRevE.78.016115

M. D. Humphries and K. Gurney, Network ???Small-World-Ness???: A Quantitative Method for Determining Canonical Network Equivalence, PLoS ONE, vol.76, issue.4, p.2051, 2008.
DOI : 10.1371/journal.pone.0002051.s003

N. Ide, T. Erjavec, and ?. D. Tufis, Sense discrimination with parallel corpora, Proceedings of the ACL-02 workshop on Word sense disambiguation recent successes and future directions -, pp.61-66, 2002.
DOI : 10.3115/1118675.1118683

M. Ideue, K. Yamamoto, M. Utiyama, and E. Sumita, A comparison of unsupervised bilingual term extraction methods using phrase tables, Proc. MT Summit XIII, 2011.

H. Isahara, F. Bond, K. Uchimoto, M. Utiyama, and K. Kanzaki, Development of the Japanese WordNet, Proceedings of the Sixth International Conference on Language Resourcc and Evaluation (LREC'08) European Language Resources Association (ELRA), 2008.

M. Ismail, An empirical investigation of the impact of discretization on common data distributions, 2003.

A. Ittoo and G. Bouma, Term extraction from sparse, ungrammatical domain-specific documents, Expert Systems with Applications, vol.40, issue.7, pp.2530-2540, 2013.
DOI : 10.1016/j.eswa.2012.10.067

C. Jacquemin, Syntagmatic and paradigmatic representations of term variation, Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics -, pp.341-348, 1999.
DOI : 10.3115/1034678.1034733

A. Jain, K. Nandakumar, and A. Ross, Score normalization in multimodal biometric systems, Pattern Recognition, vol.38, issue.12, pp.2270-2285, 2005.
DOI : 10.1016/j.patcog.2005.01.012

K. S. Jones, A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL, Journal of Documentation, vol.28, issue.1, pp.11-21, 1972.
DOI : 10.1108/eb026526

J. S. Justeson and S. M. Katz, Technical terminology: some linguistic properties and an algorithm for identification in text, Natural Language Engineering, vol.2, issue.01, pp.9-27, 1995.
DOI : 10.1109/72.363484

K. Kageura, Toward the theoretical study of terms: A sketch from the linguistic viewpoint, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, vol.2, issue.2, pp.239-257, 1995.
DOI : 10.1075/term.2.2.04kag

K. Kageura and B. Umino, Methods of automatic term recognition: A review, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, vol.3, issue.2, pp.259-289, 1996.
DOI : 10.1075/term.3.2.03kag

D. Kamholz, J. Pool, and S. M. Colowick, Panlex : Building a resource for panlingual lexical translation, Proceedings of the Ninth International Conference on Language Resourcc and Evaluation (LREC'14) European Language Resources Association (ELRA), 2014.

N. Kando, K. Kuriyama, T. Nozue, K. Eguchi, H. Kato et al., The NTCIR workshop : the first evaluation workshop on Japanese text retrieval and cross-lingual information retrieval, Proceedings of the 4th International Workshop on Information Retrieval with Asian Languagg (1RAL'99), 1999.

B. Khaliq and J. Carroll, Induction of root and pattern lexicon for unsupervised morphological analysis of Arabic, International Joint Conference on Natural Language Processing, pp.1012-1016, 2013.

S. Khoja and R. Garside, Stemming arabic text, 1999.

S. N. Kim, T. Baldwin, and K. Min-yen, An unsupervised approach to domainspecific term extraction, Proceedings of the Australasian Language Technoloo Association Workshop, pp.9-13, 2009.

S. Kirkpatrick and M. Vecchi, Optimization by Simulated Annealing, Science, vol.220, issue.4598, pp.671-680, 1983.
DOI : 10.1126/science.220.4598.671

G. Klir and M. Wierman, Uncertainty-Based Information : Elements of Generalized Information Theory, Studies in Fuzziness and Soft Computing, 1999.
DOI : 10.1007/978-3-7908-1869-7

M. Knowles and R. Moon, Introducing Metaphor, 2006.

P. Koehn and K. Knight, Knowledge sources for word-level translation models, Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing, pp.27-35, 2001.

R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, IJCAI, pp.1137-1145, 1995.

O. Kohonen, S. Virpioja, and K. Lagus, Semi-supervised learning of concatenative morphology, Proceedings of the 11th Meeting of the ACL Special Interest Group on BIBLIOGRAPHIE 295, 2010.

S. Kotsiantis and D. Kanellopoulos, Discretization techniques : A recent survey, GESTS International Transactions on Computer Science and Engineering, vol.32, issue.1, pp.47-58, 2006.

Z. Kozareva and E. Hovy, A semi-supervised method to learn and construct taxonomies using the web, Proceedings of the 2010 conference on empirical methods in natural language processing, pp.1110-1118, 2010.

T. Kudo, K. Yamamoto, and Y. Matsumoto, Applying Conditional Random Fields to Japanese Morphological Analysis, EMNLP, pp.230-237, 2004.

S. Kullback and R. A. Leibler, On information and sufficiency. The Annals of Mathematical Statistics, pp.79-86, 1951.

M. Kurimo, S. Virpioja, V. T. Turunen, G. W. Blackwood, and W. Byrne, Overview and results of morpho challenge, Multilingual Information Access Evaluation I. Text Retrieval Experiments, pp.578-597, 2009.

J. Lafferty, A. Mccallum, and F. Pereira, Conditional random fields : Probabilistic models for segmenting and labeling sequence data, pp.282-289, 2001.

C. Laughlin, Intuition : The inside story : Interdisciplinary perspectives. chapter The Nature of Intuition : A neuropsychological Approach, 1997.

T. Lavergne, . Cappe?ocappe?o, and F. Yvon, Practical very large scale CRFs, Proceedings the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pp.504-513, 2010.

V. Leitchik and S. Shelov, Proceedings of the theoretical foundations of terminology comparison between eastern europe and western countries in conjunction with the 14th European Symposium on Language for Special Purposes (LSP), chapter Some Basics Concepts of Terminology : Tradition and Innovations, 2006.

L. S. Li, Y. Z. Dang, J. Zhang, and D. Li, Domain term extraction based on conditional random fields combined with active learning strategy, Journal of Information & Computational Science, vol.9, issue.7, pp.1931-1940, 2012.

R. Lieber, P. , and &. M. Baker, The Oxford handbook of compounding, 2009.
DOI : 10.1093/oxfordhb/9780199695720.001.0001

D. Lin, Automatic identification of non-compositional phrases, Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics -, pp.317-324, 1999.
DOI : 10.3115/1034678.1034730

F. Liu, D. Pennell, F. Liu, and Y. Liu, Unsupervised approaches for automatic keyword extraction using meeting transcripts, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics on, NAACL '09, pp.620-628, 2009.
DOI : 10.3115/1620754.1620845

P. Liu, W. Li, N. Lin, and X. Li, Do Chinese Readers Follow the National Standard Rules for Word Segmentation during Reading?, PLoS ONE, vol.119, issue.2, 2013.
DOI : 10.1371/journal.pone.0055440.s001

W. Liu, A. Weichselbraun, A. Scharl, and E. Chang, Semi-automatic ontology extension using spreading activation, Journal of Universal Knowledge Management, vol.1, pp.50-58, 2005.

R. T. Lo, B. He, and I. Ounis, Automatically building a stopword list for an information retrieval system, Journal on Digital Information Management : Special Issue on the 5th Dutch-Belgian Information Retrieval Workshop (DIR), pp.17-24, 2005.

R. Longadge and S. Dongre, Class imbalance problem in data mining review, 2013.

L. M. Ló-pez, I. F. Ruiz, R. M. Bueno, and F. T. Ruiz, Dynamic Discretization of Continuous Values from Time Series, Lecture Nott in Computer Science, vol.1810, pp.280-291, 2000.
DOI : 10.1007/3-540-45164-1_30

N. V. Loukachevitch, Automatic term recognition needs multiple evidence, Proceedings of the Eight International Conference on Language Resourcc and Evaluation (LREC'12) European Language Resources Association (ELRA), pp.2401-2407, 2012.

M. Lud and G. Widmer, Relative Unsupervised Discretization for Association Rule Mining, Principll of data mining and knowledge discovery, pp.148-158, 2000.
DOI : 10.1007/3-540-45372-5_15

D. B. Lurie, Language, writing, and disciplinarity in the Critique of the ???Ideographic Myth???: Some proleptical remarks, Language & Communication, vol.26, issue.3-4, pp.250-269, 2006.
DOI : 10.1016/j.langcom.2006.02.015

L. 'homme and M. , Sur la notion de «terme». Meta : Journal dd traducteursMeta :/ Translators, Journal, vol.50, issue.4, pp.1112-1132, 2005.

P. Magistry, Unsupervised Word Segmentation and Wordhood Assessment. The case for Mandarin Chinese, 2013.

P. Magistry and B. Sagot, Unsupervized Word Segmentation : the case for Mandarin Chinese, ACL -Annual Meeting of the Association for Computational Linguistics - 2012, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00701200

O. Maimon and L. Rokach, Data Mining and Knowledge Discovery Handbook. The Kluwer International Series in Engineering and Computer Science, 2005.

A. Makkai, Idiom Structure in English, Number 48 in Janua Linguarum. Series Maior. De Gruyter, 1972.
DOI : 10.1515/9783110812671

B. B. Mandelbrot, An informational theory of the statistical structure of languages, Communication theory : papers read at a Symposium on "Applications of Communication Theory" held at the Institution of Electrical Engineers, pp.486-502, 1952.

C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, 1999.

C. D. Manning, P. Raghavan, and H. Schütze, Scoring, term weighting, and the vector space model, in Introduction to Information Retrieval, chapter Ch, 2008.

S. Martin, W. M. Brown, R. Klavans, and K. W. Boyack, OpenOrd: an open-source toolbox for large graph layout, Visualization and Data Analysis 2011, pp.786806-786806, 2011.
DOI : 10.1117/12.871402

S. S. Mausam, O. Etzioni, D. S. Weld, M. Skinner, and J. Bilmes, Compiling a massive, multilingual dictionary via probabilistic inference, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL-IJCNLP '09, pp.262-270, 2009.
DOI : 10.3115/1687878.1687917

A. McCallum and W. Li, Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons, Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, pp.188-191, 2003.
DOI : 10.3115/1119176.1119206

A. K. McCallum, Mallet : A machine learning for language toolkit, 2002.

J. J. McCarthy, A prosodic theory of nonconcatenative morphology, Linguistic Inquiry, vol.12, pp.373-418, 1981.

A. McEnery and Z. Xiao, The Lancaster Corpus of Mandarin Chinese : A corpus for monolingual and contrastive language study. Religion, pp.3-4, 2004.

B. T. McInnes, Extending the log likelihood measure to improve collocation identification, 2004.

I. A. Mel'čuk, Leçon inaugurale faite le vendredi 10 janvier internationale : vers une linguistique sens-texte, 1997.

I. A. Mel'čuk, Collocations and lexical functions, pp.23-54, 1998.

C. M. Meyer and I. Gurevych, Wiktionary: A new rival for expert-built lexicons? Exploring the possibilities of collaborative lexicography, Electronic Lexicography, chapter 13, pp.259-291, 2012.
DOI : 10.1093/acprof:oso/9780199654864.003.0013

I. Meyer, Standardizing Terminology for Better Communication : Practice, Applied Theory, and Results, chapter Concept Management for Terminology : A Knowledge Engineering Approach, 1993.

S. Milgram, The small world problem, Psychology Today, vol.67, issue.1, pp.61-67, 1967.
DOI : 10.1037/e400002009-005

M. Moens, Information Extraction : Algorithms and Prospects in a Retrieval Context. The Information Retrieval Series, 2006.

M. A. Molinero, B. Sagot, and L. Nicolas, A morphological and syntactic wide-coverage lexicon for Spanish : The Leffe, RANLP 2009 - Recent Advances in Natural Language Processing, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00616693

V. G. Mollineda and R. A. Sotoca, The class imbalance problem in pattern classification and learning, Simposio de Inteligencia Computacional, SICO'2007 (IEEE Computational Intelligence Society, SC). Congreso Español de Informática, 2007.

T. Mondary, A. Nazarenko, H. Zargayouna, and S. Barreaux, The Quaero Evaluation Campaign on Term Extraction, The eighth international conference on Language Resources and Evaluation (LREC), pp.663-669, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00699356

E. A. Moravcsik, What is universal about typology?, Linguistic Typology, pp.27-41, 2007.

P. Muller and P. Langlais, Comparaison d'une approche miroir et d'une approche distributionnelle pour l'extraction de mots sémantiquement reliés, Traitement Automatique des Langues Naturelles (TALN), issue.1, pp.235-246, 2011.

M. Nagata, T. Saito, and K. Suzuki, Using the web as a bilingual dictionary, Proceedings of the workshop on Data-driven methods in machine translation, pp.1-8, 2001.
DOI : 10.3115/1118037.1118050

A. Naït-Ali and R. Fournier, Traitement du signal et de l'image pour la biométrie, 2012.

H. Nakagawa and T. Mori, A simple but powerful automatic term extraction method, COLING-02 on COMPUTERM 2002 : second international workshop on computational terminology, pp.1-7, 2002.
DOI : 10.3115/1118771.1118778

R. Navigli and S. P. Ponzetto, BabelNet : Building a Very Large Multilingual Semantic Network, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp.216-225, 2010.

R. Navigli and S. P. Ponzetto, BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network, Artificial Intelligence, vol.193, pp.217-250, 2012.
DOI : 10.1016/j.artint.2012.07.001

R. Navigli, P. Velardi, and S. Faralli, A graph-based algorithm for inducing lexical taxonomies from scratch, IJCAI, pp.1872-1877, 2011.

R. Nazar, L. Wanner, and J. Vivaldi, Two step flow in bilingual lexicon extraction from unrelated corpora, Proceedings of the EAMT (European Association for Machine Translation) Conference, 2008.

A. Nazarenko and H. Zargayouna, Evaluating term extraction, Proceedings of International Conference Recent Advances in Natural Language Processing, pp.299-304, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00517090

A. Nazarenko, H. Zargayouna, O. Hamon, and J. Van-puymbrouck, Évaluation des outils terminologiques : enjeux, difficultés et propositions, Traitement Automatique des Langues (TAL), vol.50, issue.1, pp.257-281, 2009.

J. Nichols, What, if anything, is typology?, Linguistic Typology, pp.231-238, 2007.

J. Nicholson, T. Cohn, and T. Baldwin, Evaluating a morphological analyser of Inuktitut, Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics : Human Language Technologies, NAACL HLT '12, pp.372-376, 2012.

H. Niemann, Pattern analysis and understanding, volume 4 of Springer series in information sciences, 1990.

N. H. Noor, S. Sapuan, and F. Bond, Creating the Open Wordnet Bahasa, pp.255-264, 2011.

J. Oh, K. Lee, and K. Choi, Term recognition using technical dictionary hierarchy, Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, ACL '00, pp.496-503, 2000.
DOI : 10.3115/1075218.1075281

N. Ordan and S. Wintner, Hebrew wordnet : a test case of aligning lexical databases across languages, International Journal of Translation, vol.19, pp.39-58, 2007.

J. L. Packard, The Morphology of Chinese : A Linguistic and Cognitive Approach, 2000.

B. Pakendorf, Contact in the prehistory of the Sakha (Yakuts) : Linguistic and genetic perspectives, Netherlands Graduate School of Linguistics, 2007.

Y. Park, R. J. Byrd, and B. K. Boguraev, Automatic glossary extraction, Proceedings of the 19th international conference on Computational linguistics, pp.1-7, 2002.
DOI : 10.3115/1072228.1072370

P. Pecina and P. Schlesinger, Combining association measures for collocation extraction, Proceedings of the COLING/ACL on Main conference poster sessions, pp.651-658, 2006.
DOI : 10.3115/1273073.1273157

S. Petrović, J. Šnajder, and B. Dalbelo Bašić, Extending lexical association measures for collocation extraction, Computer Speech & Language, vol.24, issue.2, pp.383-394, 2009.
DOI : 10.1016/j.csl.2009.06.001

E. Pianta, L. Bentivogli, and C. Girardi, Multiwordnet : developing an aligned multilingual database, Proceedings of the First International Conference on Global WordNet, 2002.

M. Pinnis, N. Ljubešić, D. Ştefănescu, I. Skadiņa et al., Term extraction, tagging, and mapping tools for under-resourced languages, Proceedings of the 10th Conference on Terminology and Knowledge Engineering, pp.20-21, 2012.

A. Polguère, Collocations et fonctions lexicales : pour un modèle d'apprentissage, Les Collocations. Analyse et traitement, pp.117-133, 2003.

M. Polinsky and R. Kluender, Linguistic typology and theory construction: Common challenges ahead, Linguistic Typology, vol.11, issue.1, pp.273-283, 2007.
DOI : 10.1515/LINGTY.2007.022

I. Popescu and G. Altmann, Zipf's mean and language typology, Glottometrics, issue.16, pp.31-37, 2008.

F. Provost, Machine learning from imbalanced data sets 101, Proceedings of the AAAI'2000 workshop on imbalanced data sets, pp.1-3, 2000.

A. Przepiórkowski, R. L. Górski, B. Lewandowska-Tomaszczyk, and M. Łaziński, Towards the National Corpus of Polish, Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), 2008.

R. B. Rao, G. Fung, and R. Rosales, On the Dangers of Cross-Validation. An Experimental Evaluation, SDM, pp.588-596, 2008.
DOI : 10.1137/1.9781611972788.54

R. Rapp, Identifying word translations in non-parallel texts, Proceedings of the 33rd annual meeting on Association for Computational Linguistics -, pp.320-322, 1995.
DOI : 10.3115/981658.981709

L. Ratinov and D. Roth, Design challenges and misconceptions in named entity recognition, Proceedings of the Thirteenth Conference on Computational Natural Language Learning, CoNLL '09, pp.147-155, 2009.
DOI : 10.3115/1596374.1596399

P. Resnik and D. Yarowsky, A perspective on word sense disambiguation methods and their evaluation, Proceedings of the SIGLEX Workshop, 1997.

A. Rey, La Terminologie : noms et notions. Que Sais-Je ?, 1979.

G. Rondeau, Introduction à la terminologie, Gaëtan Morin, 1984.

E. Rosch, Cognitive representations of semantic categories., Journal of Experimental Psychology: General, vol.104, issue.3, pp.192-233, 1975.
DOI : 10.1037/0096-3445.104.3.192

E. Rosch, Principles of Categorization, pp.27-48, 1978.
DOI : 10.1016/B978-1-4832-1446-7.50028-5

G. Rubio, Chasing the semitic root : The skeleton in the closet, Aula Orientalis, vol.23, pp.45-63, 2005.

M. Sadeghi and J. Vegas, Automatic identification of light stop words for Persian information retrieval systems, Journal of Information Science, vol.27, issue.3, 2014.
DOI : 10.1177/0165551514530655

J. Sager, Practical Course in Terminology Processing, 1990.

B. Sagot, Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish, Human Language Technology. Challenges of the Information Society, pp.85-95, 2009.
DOI : 10.1007/11551874_20
URL : https://hal.archives-ouvertes.fr/inria-00614709

B. Sagot, The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French, 7th international conference on Language Resources and Evaluation, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00521242

B. Sagot, Construction de ressources lexicales pour le traitement automatique des langues, Lingvisticae Investigationes Supplementa, pp.217-254, 2013.
DOI : 10.1075/lis.30.07sag
URL : https://hal.archives-ouvertes.fr/hal-00927281

B. Sagot, Delex, a freely-available, large-scale and linguistically grounded morphological lexicon for German, Proceedings of [...], 2014.

B. Sagot and P. Boullier, SxPipe 2 : architecture pour le traitement présyntaxique de corpus bruts, pp.155-188, 2008.

B. Sagot and D. Fišer, Building a free French wordnet from multilingual resources, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00614708

B. Sagot and D. Fišer, Automatic extension of WOLF, Proceedings of the 6th Global WordNet Conference, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00655774

B. Sagot and R. Stern, Aleda, a free large-scale entity database for French, LREC 2012 : eighth international conference on Language Resources and Evaluation, p.4, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00699300

E. F. Sang and J. Veenstra, Representing text chunks, Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics, pp.173-179, 1999.
DOI : 10.3115/977035.977059

E. Sapir, Language : An Introduction to the Study of Speech, 1921.
DOI : 10.1017/CBO9781139629430

C. Saranya and G. Manikandan, A study on normalization techniques for privacy preserving data mining, International Journal of Engineering & Technology, vol.5, issue.3, 2013.

A. Savary, Recensement et description des mots composés - méthodes et applications. Theses, 2000.

B. Say, D. Zeyrek, K. Oflazer, and U. Özge, Development of a corpus and a treebank for present-day written Turkish, Proceedings of the eleventh international conference of Turkish linguistics, pp.183-192, 2002.

P. Schachter, The subject in Tagalog : Still none of the above, 1996.

S. E. Schaeffer, Stochastic Local Clustering for Massive Graphs, Advances in knowledge discovery and data mining, pp.354-360, 2005.
DOI : 10.1007/11430919_42

S. E. Schaeffer, Graph clustering, Computer Science Review, vol.1, issue.1, pp.27-64, 2007.
DOI : 10.1016/j.cosrev.2007.05.001

G. Sérasset, Dbnary : Wiktionary as a lemon-based multilingual lexical resource in RDF, Semantic Web Journal - Special issue on Multilingual Linked Open Data, 2012.

M. A. Serrano, A. Flammini, and F. Menczer, Beyond Zipf's law : Modeling the structure of human language, 2009.

F. Sha and F. Pereira, Shallow parsing with conditional random fields, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, NAACL '03, pp.134-141, 2003.
DOI : 10.3115/1073445.1073473

C. Shannon, A mathematical theory of communication, The Bell System Technical Journal, vol.27, 1948.

J. Šíma and S. E. Schaeffer, On the NP-completeness of some graph cluster measures, SOFSEM 2006 : Theory and Practice of Computer Science, pp.530-537, 2006.

P. Skorik, Grammatika čukotskogo jazyka : Fonetika i morfologija imennyx častej reči, Nauka, 1977.

F. Smadja, Retrieving collocations from text : Xtract, Computational Linguistics, vol.19, pp.143-177, 1993.

A. Spencer, Morphological theory : An introduction to word structure in generative grammar, 1991.

D. A. Spielman and S. Teng, A Local Clustering Algorithm for Massive Graphs and Its Application to Nearly Linear Time Graph Partitioning, SIAM Journal on Computing, vol.42, issue.1, pp.1-26, 2013.
DOI : 10.1137/080744888

S. Stamou, K. Oflazer, K. Pala, D. Christoudoulakis, D. Cristea et al., Balkanet a multilingual semantic network for the Balkan languages, pp.21-25, 2002.

P. Štekauer, S. Valera, and L. Körtvélyessy, Word-Formation in the World's Languages : A Typological Survey, 2012.

C. Sutton and A. McCallum, An Introduction to Conditional Random Fields, Foundations and Trends in Machine Learning, vol.4, issue.4, 2010.
DOI : 10.1561/2200000013

P. Syal and D. Jindal, An Introduction to Linguistics : Language, Grammar and Semantics. Eastern Economy Edition, 2007.

B. Szmrecsanyi and B. Kortmann, The morphosyntax of varieties of English worldwide: A quantitative perspective, Lingua, vol.119, issue.11, pp.1643-1663, 2009.
DOI : 10.1016/j.lingua.2007.09.016

K. Taghva, R. Elkhoury, and J. Coombs, Arabic stemming without a root dictionary, International Conference on Information Technology: Coding and Computing (ITCC'05), Volume II, pp.152-157, 2005.
DOI : 10.1109/ITCC.2005.90

I. Tellier, I. Eshkol, S. Taalab, and J. Prost, POS-tagging for oral texts with CRF and category decomposition, Research in Computing Science, vol.46, pp.79-90, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00467951

R. Temmerman, Towards New Ways of Terminology Description : The Sociocognitive Approach. Terminology and Lexicography Research and Practice Series, 2000.

J. Tiedemann, News from OPUS - A collection of multilingual parallel corpora with tools and interfaces, Recent Advances in Natural Language Processing, pp.237-248, 2009.
DOI : 10.1075/cilt.309.19tie

L. Torgo and J. Gama, Search-based class discretization, Machine Learning : ECML-97, pp.266-273, 1997.
DOI : 10.1007/3-540-62858-4_91

M. Torii, K. Wagholikar, and H. Liu, Using machine learning for concept extraction on clinical documents from multiple data sources, Journal of the American Medical Informatics Association, vol.18, issue.5, pp.580-587, 2011.
DOI : 10.1136/amiajnl-2011-000155

R. Tsarfaty, D. Seddah, Y. Goldberg, S. Kübler, M. Candito et al., Statistical parsing of morphologically rich languages (SPMRL) : what, how and whither, Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages, pp.1-12, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00525751

Y. Tsuruoka, J. Tsujii, and S. Ananiadou, Fast full parsing by linear-chain conditional random fields, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL '09, pp.790-798, 2009.
DOI : 10.3115/1609067.1609155

D. Tufiş, D. Cristea, and S. Stamou, BalkaNet : Aims, Methods, Results and Perspectives. A General Overview, Romanian Journal on Information Science and Technology. Special Issue on BalkaNet, vol.7, pp.9-34, 2004.

M. Uschold and M. Gruninger, Ontologies and semantics for seamless connectivity, ACM SIGMOD Record, vol.33, issue.4, pp.58-64, 2004.
DOI : 10.1145/1041410.1041420

H. Uszkoreit, New chances for deep linguistic processing, Proceedings of COLING 2002, 2002.

A. S. Valderrábanos, A. Belskis, and L. I. Moreno, Multilingual terminology extraction and validation, Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC'02), 2002.

G. B. Van-huyssteen and B. Verhoeven, A Taxonomy for Afrikaans and Dutch Compounds, Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014), pp.3-4, 2014.
DOI : 10.3115/v1/W14-5704

V. N. Vapnik, The Nature of Statistical Learning Theory, 1995.

P. Velardi and F. Sclano, Termextractor : a web application to learn the common terminology of interest groups and research communities, 7ème Conférence "Terminologie et intelligence artificielle", pp.85-94, 2007.

O. Kohonen, S. Virpioja, L. Leppänen, and K. Lagus, Semi-supervised extensions to Morfessor Baseline, in Kurimo et al., pp.30-34, 2010.

S. Virpioja, P. Smit, S.-A. Grönroos, and M. Kurimo, Morfessor 2.0 : Python implementation and extensions for Morfessor Baseline, 2013.

S. Visa and A. Ralescu, Issues in mining imbalanced data sets-a review paper, Proceedings of the sixteen midwest artificial intelligence and cognitive science conference, pp.67-73, 2005.

J. Vivaldi, L. Màrquez, and H. Rodríguez, Improving Term Extraction by System Combination Using Boosting, Proceedings of European Conference on Machine Learning (ECML'01), 2001.
DOI : 10.1007/3-540-44795-4_44

J. Vivaldi and H. Rodríguez, Evaluation of terms and term extraction systems: A practical approach, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, vol.13, issue.2, pp.225-248, 2007.
DOI : 10.1075/term.13.2.06viv

W. von Humboldt, Ueber das Entstehen der grammatischen Formen, und ihren Einfluss auf die Ideenentwicklung, reprinted in : Über die Sprache, Jürgen Trabant, 1822.

W. von Humboldt, Über die Verschiedenheit des menschlichen Sprachbaues und ihren Einfluß auf die geistige Entwickelung des Menschengeschlechts, 1836.

P. Vossen, EuroWordNet : a multilingual database with lexical semantic networks, 1998.
DOI : 10.1007/978-94-017-1491-4

X. Wan, J. Yang, and J. Xiao, Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction, Annual Meeting-Association for Computational Linguistics, p.552, 2007.

A. Wang, M. Kan, D. Andrade, T. Onishi, and K. Ishikawa, Chinese informal word normalization : an experimental study, Proceedings of the Sixth International Joint Conference on Natural Language Processing, pp.127-135, 2013.

W. Wang, S. Yaman, K. Precoda, C. Richey, and G. Raymond, Detection of agreement and disagreement in broadcast conversations, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics : Human Language Technologies : short papers, pp.374-378, 2011.

D. Watts and S. Strogatz, Collective dynamics of "small-world" networks, Nature, vol.393, pp.440-442, 1998.

W. Weaver, Science and Complexity, American scientist, vol.36, issue.4, pp.536-544, 1948.
DOI : 10.1007/978-1-4899-0718-9_30

P. Weissenhofer, Conceptology in Terminology Theory, Semantics and Word-formation : A Morpho-conceptually Based Approach to Classification Exemplified by the English Baseball Terminology. IITF-series, International Institute for Terminology Research, 1995.

B. L. Welch, The Generalization of `Student's' Problem when Several Different Population Variances are Involved, Biometrika, vol.34, issue.1/2, pp.28-35, 1947.
DOI : 10.2307/2332510

J. Wermter, Collocation and term extraction using linguistically enhanced statistical methods, 2009.

J. Wermter and U. Hahn, Collocation extraction based on modifiability statistics, Proceedings of the 20th international conference on Computational Linguistics, COLING '04, 2004.
DOI : 10.3115/1220355.1220496

J. Wermter and U. Hahn, Paradigmatic modifiability statistics for the extraction of complex multi-word terms, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pp.843-850, 2005.
DOI : 10.3115/1220575.1220681

J. Wermter and U. Hahn, You can't beat frequency (unless you use linguistic knowledge), Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL, ACL '06, pp.785-792, 2006.
DOI : 10.3115/1220175.1220274

W. J. Wilbur and K. Sirotkin, The automatic identification of stop words, Journal of Information Science, vol.39, issue.2, pp.45-55, 1992.
DOI : 10.1177/016555159201800106

G. Williams, Sur les caractéristiques de la collocation, pp.9-16, 2001.

I. H. Witten and E. Frank, Data mining, ACM SIGMOD Record, vol.31, issue.1, 2005.
DOI : 10.1145/507338.507355

S. Wright, Handbook of Terminology Management : Basic Aspects of Terminology Management, chapter Term Selection : The Initial Phase of Terminology Management, 1997.

E. Wüster, Internationale Sprachnormung in der Technik, besonders in der Elektrotechnik [International language standardization, especially within electrotechnics], 1931.

E. Wüster and L. Bauer, Einführung in die allgemeine Terminologielehre und terminologische Lexikographie, 1979.

A. Xanthos, Apprentissage automatique de la morphologie : le cas des structures racine-schème. Sciences pour la communication, 2008.

J. Xu, A. Fraser, and R. Weischedel, Empirical studies in strategies for Arabic retrieval, Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '02, pp.269-274, 2002.
DOI : 10.1145/564376.564424

Y. Xu, C. Ringlstetter, and R. Goebel, A continuum-based approach for tightness analysis of chinese semantic units, pp.569-578, 2009.

H. Yang and J. Callan, A metric-based framework for automatic taxonomy induction, Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL-IJCNLP '09, 2009.
DOI : 10.3115/1687878.1687918

Y. Yang, Discretization for naive-Bayes learning, 2003.

B. E. Zawada and P. Swanepoel, On the empirical adequacy of terminological concept theories: The case for prototype theory, Terminology International Journal of Theoretical and Applied Issues in Specialized Communication, vol.1, issue.2, pp.253-275, 1994.
DOI : 10.1075/term.1.2.03zaw

X. Zhang, Y. Song, and A. Fang, Term recognition using Conditional Random fields, Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010), pp.1-6, 2010.
DOI : 10.1109/NLPKE.2010.5587809

Z. Zhang, J. Iria, C. Brewster, and F. Ciravegna, A Comparative Evaluation of Term Recognition Algorithms, Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC08), 2008.

V. Zhikov, H. Takamura, and M. Okumura, An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL, Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp.832-842, 2010.
DOI : 10.1527/tjsai.28.347

F. Zou, F. L. Wang, X. Deng, S. Han, and L. S. Wang, Automatic construction of Chinese stop word list, Proceedings of the 5th WSEAS international conference on Applied computer science, pp.1010-1015, 2006.