D. T. Abreu, A semântica de construções com verbos-suporte e o paradigma Framenet, 2011.

O. Acosta, A. Villavicencio, and V. Moreira, Identification and treatment of multiword expressions applied to information retrieval, In Kordoni et al, pp.101-109, 2011.

I. Alegria, O. Ansa, X. Artola, N. Ezeiza, K. Gojenola et al., Representation and treatment of multiword expressions in Basque, Proceedings of the Workshop on Multiword Expressions Integrating Processing, MWE '04, pp.48-55, 2004.
DOI : 10.3115/1613186.1613193

A. Alsina, J. Bresnan, and P. Sells, Complex Predicates, 1997.

A. Anastasiadi-Symeonidi, Neology in Modern Greek (in Greek), 1986.

J. Apresian, I. Boguslavsky, L. Iomdin, and L. Tsinman, Lexical functions as a tool of ETAP-3, Proc. of the First MTT Conference, 2003.

V. D. Araujo, C. Ramisch, and A. Villavicencio, Fast and flexible MWE candidate generation with the mwetoolkit, In Kordoni et al, pp.134-136, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00959163

M. F. Athayde, Construções com verbo-suporte (Funktionsverbgefüge) do português e do alemão. Number 1 in Cadernos do CIEG Centro Interuniversitário de Estudos Germanísticos, 2001.

S. Atkins, The DANTE database: Its contribution to English lexical research, and in particular to complementing the FrameNet data, In A Way with Words: Recent Advances in Lexical Theory and Analysis. A Festschrift for Patrick Hanks, 2010.

S. Atkins, C. Fillmore, and C. R. Johnson, Lexicographic Relevance: Selecting Information From Corpus Evidence, International Journal of Lexicography, vol.16, issue.3, pp.251-280, 2003.
DOI : 10.1093/ijl/16.3.251

M. Attia, A. Toral, L. Tounsi, P. Pecina, and J. van Genabith, Automatic extraction of Arabic multiword expressions, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.18-26, 2010.

R. H. Baayen, Word Frequency Distributions, volume 18 of Text, Speech and Language Technology, 2001.

M. Bai, J. You, K. Chen, and J. S. Chang, Acquiring translation equivalences of multiword expressions by normalized correlation frequencies, Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 2, EMNLP '09, pp.478-486, 2009.
DOI : 10.3115/1699571.1699574

T. Baldwin, Bootstrapping deep lexical resources, Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition, DeepLA '05, pp.67-76, 2005.
DOI : 10.3115/1631850.1631858

T. Baldwin, Deep lexical acquisition of verb-particle constructions, Computer Speech & Language, vol.19, issue.4, pp.398-414, 2005.
DOI : 10.1016/j.csl.2005.02.004

T. Baldwin, A resource for evaluating the deep lexical acquisition of English verb-particle constructions, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.1-2, 2008.

T. Baldwin, MWEs and topic modelling: Enhancing machine learning with linguistics, p.1, 2011.

T. Baldwin, C. Bannard, T. Tanaka, and D. Widdows, An empirical model of multiword expression decomposability, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.89-96, 2003.
DOI : 10.3115/1119282.1119294

T. Baldwin and T. Tanaka, Translation by machine of complex nominals, Proceedings of the Workshop on Multiword Expressions Integrating Processing, MWE '04, pp.24-31, 2004.
DOI : 10.3115/1613186.1613190

T. Baldwin and A. Villavicencio, Extracting the unextractable, Proceedings of the 6th conference on Natural language learning, COLING-02, pp.98-104, 2002.
DOI : 10.3115/1118853.1118854

S. Banerjee and T. Pedersen, The Design, Implementation, and Use of the Ngram Statistics Package, Proceedings of the Fourth International Conference on Intelligent Text Processing and Computational Linguistics, pp.370-381, 2003.
DOI : 10.1007/3-540-36456-0_38

C. Bannard, Learning about the meaning of verb-particle constructions from corpora, Computer Speech & Language, vol.19, issue.4, pp.467-478, 2005.
DOI : 10.1016/j.csl.2005.02.003

C. Bannard, A measure of syntactic flexibility for automatically identifying multiword expressions in corpora, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.1-8, 2007.
DOI : 10.3115/1613704.1613705

C. Bannard, T. Baldwin, and A. Lascarides, A statistical approach to the semantics of verb-particles, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.65-72, 2003.
DOI : 10.3115/1119282.1119291

J. Baptista, A. Correia, and G. Fernandes, Frozen sentences of Portuguese, Proceedings of the Workshop on Multiword Expressions Integrating Processing, MWE '04, pp.72-79, 2004.
DOI : 10.3115/1613186.1613196
URL : https://hal.archives-ouvertes.fr/hal-01025937

A. Barreiro and L. M. Cabral, ReEscreve: a translator-friendly multi-purpose paraphrasing software tool, The Twelfth Machine Translation Summit, Proceedings of the Workshop Beyond Translation Memories: New Tools for Translators, pp.1-8, 2009.

R. Basili, M. T. Pazienza, and P. Velardi, A "not-so-shallow" parser for collocational analysis, Proceedings of the 15th conference on Computational linguistics, pp.447-453, 1994.
DOI : 10.3115/991886.991965

S. Bergsma, D. Lin, and R. Goebel, Web-scale N-gram models for lexical disambiguation, IJCAI, pp.1507-1512, 2009.

D. Biber, S. Johansson, G. Leech, S. Conrad, and E. Finegan, Longman Grammar of Spoken and Written English, 1999.

E. Bick, The parsing system Palavras, 2000.

C. Boitet, Y. Bey, M. Tomokio, W. Cao, and H. Blanchon, IWSLT-06: experiments with commercial MT systems and lessons from subjective evaluations, International Workshop on Spoken Language Translation, 2006.

D. Bolinger, The phrasal verb in English, Harvard UP, vol.187, 1971.

F. Bonin, F. Dell'Orletta, S. Montemagni, and G. Venturi, A contrastive approach to multi-word extraction from domain-specific corpora, Proc. of the Seventh LREC, 2010.

F. Bonin, F. Dell'Orletta, G. Venturi, and S. Montemagni, Contrastive filtering of domain-specific multi-word terms from different types of corpora, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.76-79, 2010.

D. Bouamor, N. Semmar, and P. Zweigenbaum, Improved statistical machine translation using multiword expressions, Proceedings of the International Workshop on Using Linguistic Information for Hybrid Machine Translation, pp.15-20, 2011.

S. Boulaknadel, B. Daille, and D. Aboutajdine, A multi-word term extraction program for Arabic language, Proc. of the Sixth LREC, pp.1485-1488, 2008.

G. Bouma and B. V. Moirón, Corpus-based Acquisition of Collocational Prepositional Phrases, Proc. of the Twelfth Conf. of CLIN, pp.23-37, 2001.
DOI : 10.1163/9789004334038_004

T. Briscoe, J. Carroll, and R. Watson, The second release of the RASP system, Proceedings of the COLING/ACL on Interactive presentation sessions, pp.77-80, 2006.
DOI : 10.3115/1225403.1225423

P. F. Brown, V. J. Pietra, S. A. Pietra, and R. L. Mercer, The mathematics of statistical machine translation: parameter estimation, Comp. Ling, vol.19, issue.2, pp.263-311, 1993.

F. Bu, X. Zhu, and M. Li, Measuring the non-compositionality of multiword expressions, Proc. of the 23rd COLING, The Coling 2010 Organizing Committee, pp.116-124, 2010.

L. Burnard, User Reference Guide for the British National Corpus, 2007.

C. Butnariu, S. N. Kim, P. Nakov, D. Ó Séaghdha, S. Szpakowicz et al., SemEval-2 Task 9: The interpretation of noun compounds using paraphrasing verbs and prepositions, Proc. of the 5th SemEval, pp.39-44, 2010.

M. Butt, The light verb jungle, Proceedings of the Workshop on Multi-Verb Constructions, pp.243-246, 2003.

M. T. Cabré, La terminologia. La teoria, els mètodes, les aplicacions, Empúries, vol.527, 1992.

N. Calzolari and R. Bindi, Acquisition of lexical information from a large textual Italian corpus, Proc. of the 13th COLING, pp.54-59, 1990.

N. Calzolari, C. Fillmore, R. Grishman, N. Ide, A. Lenci et al., Towards best practice for multiword expressions in computational lexicons, 2002.

M. Carpuat and M. Diab, Task-based evaluation of multiword expressions: a pilot study in statistical machine translation, Proc. of HLT: The 2010 Annual Conf. of the NAACL (NAACL 2010), pp.242-245, 2010.

P. Carvalho, L. Sarmento, J. Teixeira, and M. J. Silva, Liars and saviors in a sentiment annotated corpus of comments to political debates, Proc. of the 49th ACL: HLT (ACL HLT 2011), pp.564-568, 2011.

D. Català and J. Baptista, Spanish adverbial frozen expressions, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.33-40, 2007.
DOI : 10.3115/1613704.1613709

T. Chakraborty and S. Bandyopadhyay, Identification of reduplication in Bengali corpus and their semantic analysis: A rule-based approach, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.72-75, 2010.

T. Chakraborty, D. Das, and S. Bandyopadhyay, Semantic clustering: an attempt to identify multiword expressions in Bengali, In Kordoni et al, pp.8-13, 2011.

S. F. Chen and J. Goodman, An empirical study of smoothing techniques for language modeling, Computer Speech & Language, vol.13, issue.4, pp.359-394, 1999.
DOI : 10.1006/csla.1999.0128

Y. Choueka, Looking for needles in a haystack or locating interesting collocational expressions in large textual databases, RIAO'88, pp.609-624, 1988.

O. Christ, A modular and flexible architecture for an integrated corpus query system, COMPLEX 1994, pp.23-32, 1994.

K. Church, How many multiword expressions do people know?, ACM Transactions on Speech and Language Processing, vol.10, issue.2, pp.137-144, 2011.
DOI : 10.1145/2483691.2483693

K. Church and P. Hanks, Word association norms, mutual information, and lexicography, Proceedings of the 27th annual meeting on Association for Computational Linguistics, pp.22-29, 1990.
DOI : 10.3115/981623.981633

C. R. Conejo, O verbo-suporte fazer na língua portuguesa: um exercício de análise de base funcionalista, 2008.

M. Constant and A. Sigogne, MWU-aware part-of-speech tagging with a CRF model and lexical resources, In Kordoni et al, pp.49-56, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00621585

P. Cook, A. Fazly, and S. Stevenson, Pulling their weight, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.41-48, 2007.
DOI : 10.3115/1613704.1613710

P. Cook, A. Fazly, and S. Stevenson, The VNC-tokens dataset, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.19-22, 2008.

P. Cook, S. Stevenson, B. V. Moirón, A. Villavicencio, D. McCarthy et al., Classifying particle semantics in English verb-particle constructions, Proceedings of the Workshop on Multiword Expressions Identifying and Exploiting Underlying Properties, MWE '06, pp.45-53, 2006.
DOI : 10.3115/1613692.1613702

B. C. Dias-da-Silva, Brazilian Portuguese wordnet: A computational linguistic exercise of encoding bilingual relational lexicons, International Journal of Computational Linguistics and Applications, vol.1, issue.1-2, pp.137-150, 2010.

J. F. da Silva, G. Dias, S. Guilloré, and J. G. Lopes, Using LocalMaxs Algorithm for the Extraction of Contiguous and Non-contiguous Multiword Lexical Units, Proceedings of the 9th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence, pp.113-132, 1999.
DOI : 10.1007/3-540-48159-1_9

B. F. Daille, A. Korhonen, D. McCarthy, and A. Villavicencio, Conceptual structuring through term variations, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.9-16, 2003.
DOI : 10.3115/1119282.1119284
URL : https://hal.archives-ouvertes.fr/hal-00456518

B. Daille, S. Dufour-Kowalski, and E. Morin, French-English multi-word term alignment based on lexical context analysis, Proc. of the Fourth LREC, pp.919-922, 2004.

L. Danlos and P. Samvelian, Translation of the predicative element of a sentence: category switching, aspect and diathesis, Proceedings of the Fourth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI), pp.21-34, 1992.

D. Das, S. Pal, T. Mondal, T. Chakraborty, and S. Bandyopadhyay, Automatic extraction of complex predicates in Bengali, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.36-44, 2010.

T. V. De-cruys and B. V. Moirón, Semantics-based multiword expression extraction, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.25-32, 2007.
DOI : 10.3115/1613704.1613708

H. de Medeiros Caseli, C. Ramisch, M. das Graças Volpe Nunes, and A. Villavicencio, Alignment-based extraction of multiword expressions, Language Resources and Evaluation, vol.29, issue.1, pp.59-77, 2010.
DOI : 10.1007/s10579-009-9097-9

H. de Medeiros Caseli, A. Villavicencio, A. Machado, and M. J. Finatto, Statistically-driven alignment-based multiword expression identification for technical domains, Proc. of the ACL Workshop on MWEs: Identification, pp.1-8, 2009.

H. Déjean, É. Gaussier, and F. Sadat, An approach based on multilingual thesauri and model combination for bilingual lexicon extraction, Proceedings of the 19th international conference on Computational linguistics, 2002.
DOI : 10.3115/1072228.1072394

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the RSS. Series B, vol.39, issue.1, pp.1-38, 1977.

B. Devereux and F. Costello, Learning to interpret novel noun-noun compounds, Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition, CACLA '07, pp.89-96, 2007.
DOI : 10.3115/1629795.1629807

G. Dias, Multiword unit hybrid extraction, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.41-48, 2003.
DOI : 10.3115/1119282.1119288
URL : http://acl.ldc.upenn.edu/W/W03/W03-1806.pdf

G. Doddington, Automatic evaluation of machine translation quality using ngram co-occurrence statistics, Proc. of the Second HLT Conf, pp.128-132, 2002.

A. Doucet and H. Ahonen-Myka, Non-contiguous word sequences for information retrieval, Proceedings of the Workshop on Multiword Expressions Integrating Processing, MWE '04, pp.88-95, 2004.
DOI : 10.3115/1613186.1613198
URL : https://hal.archives-ouvertes.fr/hal-00324779

M. Dras, Automatic identification of support verbs: A step towards a definition of semantic weight, Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, pp.451-458, 1995.

J. Duan, R. Lu, W. Wu, Y. Hu, and Y. Tian, A bio-inspired approach for multiword expression extraction, Proc. of the COLING/ACL 2006 Main Conference Poster Sessions, pp.176-182, 2006.

I. Duarte, A. Gonçalves, M. Miguel, A. Mendes, I. Hendrickx et al., Light verbs features in European Portuguese, Proceedings of the Interdisciplinary Workshop on Verbs: The Identification and Representation of Verb Features, 2010.

T. Dunning, Accurate methods for the statistics of surprise and coincidence, Comp. Ling, vol.19, issue.1, pp.61-74, 1993.

M. S. Duran and C. Ramisch, How do you feel? investigating lexical-syntactic patterns in sentiment expression, Proceedings of Corpus Linguistics 2011: Discourse and Corpus Linguistics Conference, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00959162

M. S. Duran, C. Ramisch, S. M. Aluísio, and A. Villavicencio, Identifying and analyzing Brazilian Portuguese complex predicates, In Kordoni et al, pp.74-82, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00959161

A. Esuli and F. Sebastiani, SENTIWORDNET: A publicly available lexical resource for opinion mining, Proc. of the Sixth LREC, pp.417-422, 2006.

B. Di Eugenio and M. Glass, The Kappa Statistic: A Second Look, Computational Linguistics, vol.23, issue.1, pp.95-101, 2004.
DOI : 10.1086/266577

S. Evert, The Statistics of Word Cooccurrences: Word Pairs and Collocations, 2004.

S. Evert and B. Krenn, Using small random samples for the manual evaluation of statistical association measures, Computer Speech & Language, vol.19, issue.4, pp.450-466, 2005.
DOI : 10.1016/j.csl.2005.02.005

A. Fazly, P. Cook, and S. Stevenson, Unsupervised Type and Token Identification of Idiomatic Expressions, Computational Linguistics, vol.19, issue.1, pp.61-103, 2009.
DOI : 10.2307/3001968

A. Fazly and S. Stevenson, Automatically constructing a lexicon of verb phrase idiomatic combinations, Proc. of the 11th Conf. of the EACL, 2006.

A. Fazly and S. Stevenson, Distinguishing subtypes of multiword expressions using linguistically-motivated statistical measures, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.9-16, 2007.
DOI : 10.3115/1613704.1613706

A. Fazly, S. Stevenson, and R. North, Automatically learning semantic knowledge about multiword predicates, Language Resources and Evaluation, vol.58, issue.4, pp.61-89, 2007.
DOI : 10.1007/s10579-007-9017-9

C. Fellbaum, WordNet: An Electronic Lexical Database (Language, Speech, and Communication), 1998.

C. J. Fillmore, P. Kay, and M. C. O'Connor, Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone, Language, vol.64, issue.3, pp.501-538, 1988.
DOI : 10.2307/414531

M. Finlayson and N. Kulkarni, Detecting multi-word expressions improves word sense disambiguation, pp.20-24, 2011.

J. R. Firth, Papers in Linguistics 1934-1951, 1957.

J. L. Fleiss, Measuring nominal scale agreement among many raters., Psychological Bulletin, vol.76, issue.5, pp.378-382, 1971.
DOI : 10.1037/h0031619

M. L. Forcada, Apertium: traducció automàtica de codi obert per a les llengües romàniques, Linguamática, vol.1, issue.1, pp.13-23, 2009.

A. Fotopoulou, Une classification des phrases à compléments figés en grec moderne : étude morphosyntaxique des phrases figées, 1993.

A. Fotopoulou, L'ordre des mots dans les phrases figées à un complément libre en grec moderne, pp.37-48, 1997.

A. Fotopoulou, G. Giannopoulos, M. Zourari, and M. Mini, Automatic recognition and extraction of multiword nominal expressions from corpora (in Greek), Proceedings of the 29th Annual Meeting, 2008.

K. Frantzi, S. Ananiadou, and H. Mima, Automatic recognition of multi-word terms: the C-value/NC-value method, International Journal on Digital Libraries, vol.3, issue.2, pp.115-130, 2000.
DOI : 10.1007/s007999900023

F. Fritzinger, M. Weller, and U. Heid, A survey of idiomatic preposition-noun-verb triples on token level, Proc. of the Seventh LREC, pp.2908-2914, 2010.

W. A. Gale and K. Church, A program for aligning sentences in bilingual corpora, Proceedings of the 29th annual meeting on Association for Computational Linguistics, pp.75-102, 1993.
DOI : 10.3115/981344.981367

A. Gil and G. Dias, Using masks, suffix array-based data structures and multidimensional arrays to compute positional n-gram statistics from corpora, Proc. of the ACL Workshop on MWEs: Analysis, Acquisition and Treatment, pp.25-32, 2003.

A. J. Gill, R. M. French, D. Gergle, and J. Oberlander, The language of emotion in short blog texts, Proceedings of the ACM 2008 conference on Computer supported cooperative work, CSCW '08, 2008.
DOI : 10.1145/1460563.1460612

R. Girju, D. Moldovan, M. Tatu, and D. Antohe, On the semantics of noun compounds, Computer Speech & Language, vol.19, issue.4, pp.479-496, 2005.
DOI : 10.1016/j.csl.2005.02.006

R. Girju, P. Nakov, V. Nastase, S. Szpakowicz, P. Turney et al., Classification of semantic relations between nominals, SemEval-2007 and Beyond, pp.105-121, 2009.
DOI : 10.1007/s10579-009-9083-2

I. J. Good, The population frequencies of species and the estimation of population parameters, Biometrika, vol.40, issue.3-4, pp.237-264, 1953.
DOI : 10.1093/biomet/40.3-4.237

F. Graliński, A. Savary, M. Czerepowicka, and F. Makowiecki, Computational lexicography of multi-word units: How efficient can it be?, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.1-9, 2010.

R. Granada, L. Lopes, C. Ramisch, C. Trojahn, R. Vieira et al., A comparable corpus based on aligned multilingual ontologies, Proceedings of the ACL 2012 First Workshop on Multilingual Modeling, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00954213

S. Green, M. de Marneffe, J. Bauer, and C. D. Manning, Multiword expression identification with tree substitution grammars: A parsing tour de force with French, Proc. of the 2011 EMNLP, pp.725-735, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01111383

G. Grefenstette, The World Wide Web as a resource for example-based machine translation tasks, Proc. of the Twenty-First Translating and the Computer, 1999.

N. Grégoire, Design and implementation of a lexicon of Dutch multiword expressions, Proceedings of the Workshop on a Broader Perspective on Multiword Expressions, MWE '07, pp.17-24, 2007.
DOI : 10.3115/1613704.1613707

N. Grégoire, DuELME: a Dutch electronic lexicon of multiword expressions, Language Resources and Evaluation, vol.19, issue.4, pp.23-39, 2010.
DOI : 10.1007/s10579-009-9094-z

A. Gurrutxaga and I. Alegria, Automatic extraction of NV expressions in Basque: Basic issues on cooccurrence techniques, In Kordoni et al, pp.2-7, 2011.

P. Haugereid and F. Bond, Extracting transfer rules for multiword expressions from parallel corpora, pp.92-100, 2011.

A. Hautli and S. Sulger, Extracting and classifying Urdu multiword expressions, Proc. of the ACL 2011 SRW, pp.24-29, 2011.

G. Hazelbeck and H. Saito, A hybrid approach for functional expression identification in a Japanese reading assistant, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.80-83, 2010.

U. Heid and M. Weller, Tools for collocation extraction: Preferences for active vs. passive, Proc. of the Sixth LREC, pp.1266-1272, 2008.

I. Hendrickx, A. Mendes, S. Pereira, A. Gonçalves, and I. Duarte, Complex predicates annotation in a corpus of Portuguese, Proceedings of the ACL 2010 Fourth Linguistic Annotation Workshop, pp.100-108, 2010.

H. H. Hoang, S. N. Kim, and M. Kan, A re-examination of lexical association measures, Proceedings of the Workshop on Multiword Expressions Identification, Interpretation, Disambiguation and Applications, MWE '09, pp.31-39, 2009.
DOI : 10.3115/1698239.1698246

D. Hogan, J. Foster, and J. van Genabith, Decreasing lexical data sparsity in statistical syntactic parsing - experiments with named entities, In Kordoni et al, pp.14-19, 2011.

C. Huang, A. Kilgarriff, Y. Wu, C. Chiu, S. Smith et al., Chinese sketch engine and the extraction of grammatical collocations, Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing, pp.48-55, 2005.

J. D. Hwang, A. Bhatia, C. Bonial, A. Mansouri, A. Vaidya et al., Propbank annotation of multilingual light verb constructions, Proceedings of the ACL 2010 Fourth Linguistic Annotation Workshop, pp.82-90, 2010.

S. Ikehara, M. Tokuhisa, and J. Murakami, Pattern Dictionary Development based on Non-Compositional Language Model for Japanese Compound and Complex Sentences, Proc. of the 22nd COLING, The Coling 2008 Organizing Committee, pp.353-360, 2008.
DOI : 10.1142/S0219427907001640

T. Izumi, K. Imamura, G. Kikui, and S. Sato, Standardizing complex functional expressions in Japanese predicates: Applying theoretically-based paraphrasing rules, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.63-71, 2010.

R. Jackendoff, Twistin' the night away, Language, pp.534-559, 1997.

O. Jespersen, A Modern English Grammar on Historical Principles, 1965.

A. Joshi, Multi-word expressions as discourse relation markers (DRMs), Proc. of the COLING Workshop on MWEs: from Theory to Applications, p.89, 2010.

T. Joyce and I. Srdanović, Comparing lexical relationships observed within Japanese collocation data and Japanese word association norms, Proceedings of the workshop on Cognitive Aspects of the Lexicon, COGALEX '08, pp.1-8, 2008.
DOI : 10.3115/1598848.1598850

D. Jurafsky and J. H. Martin, Speech and Language Processing, 2008.

J. S. Justeson and S. M. Katz, Technical terminology: some linguistic properties and an algorithm for identification in text, Natural Language Engineering, vol.2, issue.01, pp.9-27, 1995.
DOI : 10.1109/72.363484

F. Keller and M. Lapata, Using the Web to Obtain Frequencies for Unseen Bigrams, Computational Linguistics, vol.24, issue.2, pp.459-484, 2003.
DOI : 10.1093/ijl/3.4.235

A. Kilgarriff, Googleology is Bad Science, Computational Linguistics, vol.29, issue.8, pp.147-151, 2007.
DOI : 10.1162/089120103322711604

A. Kilgarriff and G. Grefenstette, Introduction to the Special Issue on the Web as Corpus, Computational Linguistics, vol.19, issue.1, pp.333-347, 2003.
DOI : 10.1038/21987

J. Kim, T. Ohta, Y. Tateisi, and J. Tsujii, GENIA ontology, 2006.

S. Kim, Z. Yang, M. Song, and J. Ahn, Retrieving collocations from Korean text, Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp.71-81, 1999.

S. Kim and E. Hovy, Determining the sentiment of opinions, Proceedings of the 20th international conference on Computational Linguistics, COLING '04, pp.1367-1373, 2004.
DOI : 10.3115/1220355.1220555

S. N. Kim and T. Baldwin, Standardised evaluation of English noun compound interpretation, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.39-42, 2008.

S. N. Kim and T. Baldwin, How to pick out token instances of English verb-particle constructions, Language Resources and Evaluation, vol.19, issue.2, pp.97-113, 2010.
DOI : 10.1007/s10579-009-9099-7

S. N. Kim and M. Kan, Re-examining automatic keyphrase extraction approaches in scientific articles, Proceedings of the Workshop on Multiword Expressions Identification, Interpretation, Disambiguation and Applications, MWE '09, pp.9-16, 2009.
DOI : 10.3115/1698239.1698242

S. N. Kim and P. Nakov, Large-scale noun compound interpretation using bootstrapping and the web as a corpus, Proc. of the 2011 EMNLP, pp.648-658, 2011.

R. Kneser and H. Ney, Improved backing-off for M-gram language modeling, 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.
DOI : 10.1109/ICASSP.1995.479394

K. Knight, Decoding complexity in word-replacement translation models, Comp. Ling, vol.25, issue.4, pp.607-615, 1999.

K. Knight and P. Koehn, What's new in statistical machine translation, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology Tutorials, NAACL '03, p.5, 2003.
DOI : 10.3115/1075168.1075173

P. Koehn, Europarl: A parallel corpus for statistical machine translation, Proc. of the Tenth MT Summit (MT Summit 2005), pp.79-86, 2005.

P. Koehn, Statistical Machine Translation, Cambridge UP, 2010.
DOI : 10.1017/CBO9780511815829
URL : https://hal.archives-ouvertes.fr/hal-01433972

P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico et al., Moses, Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL '07, pp.177-180, 2007.
DOI : 10.3115/1557769.1557821

P. Koehn, F. J. Och, and D. Marcu, Statistical phrase-based translation, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, NAACL '03, pp.48-54, 2003.
DOI : 10.3115/1073445.1073462
URL : http://acl.ldc.upenn.edu/N/N03/N03-1017.pdf

I. Korkontzelos and S. Manandhar, Can recognising multiword expressions improve shallow parsing?, Proc. of HLT: The 2010 Annual Conf. of the NAACL (NAACL 2010), pp.636-644, 2010.

B. Krenn, Description of evaluation resource - German PP-verb data, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.7-10, 2008.

M. Krieger and M. J. Finatto, Introdução à Terminologia: teoria & prática. Editora Contexto, 2004.

N. Kulkarni and M. Finlayson, jMWE: A Java toolkit for detecting multi-word expressions, In Kordoni et al, pp.122-124, 2011.

S. Langer, A linguistic test battery for support verb constructions, Special issue of Linguisticae Investigationes, pp.171-184, 2004.
URL : https://hal.archives-ouvertes.fr/hal-01017138

S. Langer, A formal specification of support verb constructions, Semantik im Lexikon, pp.179-202, 2005.
URL : https://hal.archives-ouvertes.fr/hal-01101490

M. Lapata, The Disambiguation of Nominalizations, Computational Linguistics, vol.23, issue.1, pp.357-388, 2002.
DOI : 10.1093/ijl/3.4.235

M. Lapata and F. Keller, Web-based models for natural language processing, ACM Transactions on Speech and Language Processing, vol.2, issue.1, pp.1-31, 2005.
DOI : 10.1145/1075389.1075392

É. Laporte, T. Nakamura, and S. Voyatzi, A French corpus annotated for multiword nouns, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.27-30, 2008.
URL : https://hal.archives-ouvertes.fr/halshs-00286552

É. Laporte and S. Voyatzi, An electronic dictionary of French multiword adverbs, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.31-34, 2008.
URL : https://hal.archives-ouvertes.fr/halshs-00286546

E. Lavagnino and J. Park, Conceptual structure of automatically extracted multiword terms from domain specific corpora: a case study for Italian, Proc. of the 2nd COGALEX workshop, The Coling 2010 Organizing Committee, pp.48-55, 2010.

J. Lee, Two types of Korean light verb constructions in a typed feature structure grammar, pp.40-48, 2011.

L. Lee, A. Aw, M. Zhang, and H. Li, EM-based hybrid model for bilingual terminology extraction from comparable corpora, Proc. of the 23rd COLING (COLING 2010) - Posters, The Coling 2010 Organizing Committee, pp.639-646, 2010.

Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur et al., Joshua, Proceedings of the Fourth Workshop on Statistical Machine Translation, StatMT '09, pp.135-139, 2009.
DOI : 10.3115/1626431.1626459

E. Linardaki, C. Ramisch, A. Villavicencio, and A. Fotopoulou, Towards the construction of language resources for Greek multiword expressions: Extraction and evaluation, Proc. of the LREC Workshop on Exploitation of multilingual resources and tools for Central and, pp.31-40, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00959198

B. Lohse, J. A. Hawkins, and T. Wasow, Domain Minimization in English Verb-Particle Constructions, Language, vol.80, issue.2, pp.238-261, 2004.
DOI : 10.1353/lan.2004.0089

A. Lopez, Statistical machine translation, ACM Computing Surveys, vol.40, issue.3, pp.1-49, 2008.
DOI : 10.1145/1380584.1380586

J. M. López, R. Gil, R. García, I. Cearreta, and N. Garay, Towards an Ontology for Describing Emotions, Emerging Technologies and Information Systems for the Knowledge Society, pp.96-104, 2008.
DOI : 10.1007/978-3-540-87781-3_11

U. Manber and G. Myers, Suffix Arrays: A New Method for On-Line String Searches, SODA '90: Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, pp.319-327, 1990.
DOI : 10.1137/0222058

M. Mangeot and A. Chalvin, Dictionary building with the jibiki platform: the GDEF case, Proc. of the Sixth LREC, pp.1666-1669, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00968611

M. Mangeot and C. Ramisch, A serious lexical game for building a Portuguese lexical-semantic network, Proceedings of the ACL 2012 3rd Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources and their Applications to NLP, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00954214

C. D. Manning and H. Schütze, Foundations of statistical natural language processing, 1999.

C. Marchello-Nizia, Les verbes supports en diachronie : le cas du français, Langages, vol.30, issue.121, pp.91-98, 1996.
DOI : 10.3406/lgge.1996.1742

S. Martens, Varro: An algorithm and toolkit for regular structure discovery in treebanks, Proc. of the 23rd COLING (COLING 2010) - Posters, The Coling 2010 Organizing Committee, pp.810-818, 2010.

S. Martens and V. Vandeghinste, An efficient, generic approach to extracting multi-word expressions from dependency trees, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.84-87, 2010.

Y. Y. Mathieu, Annotation of Emotions and Feelings in Texts, Affective Computing and Intelligent Interaction, pp.350-357, 2005.
DOI : 10.1007/11573548_45

D. McCarthy, B. Keller, and J. Carroll, Detecting a continuum of compositionality in phrasal verbs, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.73-80, 2003.
DOI : 10.3115/1119282.1119292

D. McCarthy, S. Venkatapathy, and A. Joshi, Detecting compositionality of verb-object combinations using selectional preferences, Proc. of the 2007 Joint Conference on EMNLP and Computational NLL (EMNLP-CoNLL 2007), pp.369-379, 2007.

I. D. Melamed, Automatic discovery of non-compositional compounds in parallel data, Proc. of the 2nd EMNLP (EMNLP-2), pp.97-108, 1997.

I. Mel'čuk, N. Arbatchewsky-Jumarie, L. Elnitsky, L. Iordanskaja, and A. Lessard, Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques I. Les presses de l, 1984.

I. Mel'čuk and A. Polguère, A formal lexicon in the meaning-text theory (or how to do lexica with words), Comp. Ling, vol.13, issue.3-4, pp.261, 1987.

Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques IV. Les presses de l

I. Mel'čuk, N. Arbatchewsky-Jumarie, L. Dagenais, L. Elnitsky, L. Iordanskaja et al., Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques II. Les presses de l, 1988.

I. Mel'čuk, N. Arbatchewsky-Jumarie, L. Iordanskaja, and S. Mantha, Dictionnaire explicatif et combinatoire du français contemporain. Recherches lexico-sémantiques III. Les presses de l, 1992.

I. Mel'čuk, A. Clas, and A. Polguère, Introduction à la lexicologie explicative et combinatoire, Editions Duculot, vol.256, 1995.

C. Messiant, T. Poibeau, and A. Korhonen, Lexschem: a large subcategorization lexicon for French verbs, Proc. of the Sixth LREC, pp.533-538, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00539025

A. Michou and V. Seretan, A tool for multi-word expression extraction in modern Greek using syntactic parsing, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics: Demonstrations Session on, EACL '09, pp.45-48, 2009.
DOI : 10.3115/1609049.1609061

M. Mini and A. Fotopoulou, Typology of multiword verbal expressions in modern Greek dictionaries: limits and differences (in Greek), Proceedings of the 18th International Symposium of Theoretical & Applied Linguistics, School of English, pp.491-503, 2009.

E. Morin and B. Daille, Compositionality and lexical alignment of multi-word terms, Language Resources and Evaluation, vol.44, issue.1-2, pp.79-95, 2010.
DOI : 10.1007/s10579-009-9098-8

URL : https://hal.archives-ouvertes.fr/hal-00417686

A. Moustaki, Les expressions figées είμαι/être Prép C W en grec moderne, 1995.

A. Mukerjee, A. Soni, and A. M. Raina, Detecting complex predicates in Hindi using POS projection across parallel corpora, Proceedings of the Workshop on Multiword Expressions Identifying and Exploiting Underlying Properties, MWE '06, pp.28-35, 2006.
DOI : 10.3115/1613692.1613699

P. Nakov, Using the web as an implicit training set, Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, 2007.
DOI : 10.3115/1220575.1220680

P. Nakov, Improved statistical machine translation using monolingual paraphrases, Proc. of the 18th ECAI, pp.338-342, 2008.

P. Nakov, Paraphrasing verbs for noun compound interpretation, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.46-49, 2008.

P. Nakov and M. A. Hearst, Search engine statistics beyond the n-gram, Proceedings of the Ninth Conference on Computational Natural Language Learning, CONLL '05, pp.17-24, 2005.
DOI : 10.3115/1706543.1706547

P. Nakov and M. A. Hearst, Solving relational similarity problems using the web as a corpus, Proc. of the 46th ACL: HLT (ACL-08: HLT), pp.452-460, 2008.

A. Nematzadeh, A. Fazly, and S. Stevenson, Child Acquisition of Multiword Verbs: A Computational Investigation, Cognitive Aspects of Computational Language Acquisition, 2012.
DOI : 10.1007/978-3-642-31863-4_9

M. H. Neves, Estudo das construções com verbos-suporte em português, Gramática do português falado VI: Desenvolvimentos, pp.201-231, 1996.

M. E. Newman, Power laws, Pareto distributions and Zipf's law, Contemporary Physics, vol.46, issue.5, pp.323-351, 2005.
DOI : 10.1103/PhysRevLett.75.2055

URL : http://arxiv.org/abs/cond-mat/0412004

J. Nicholson and T. Baldwin, Interpretation of compound nominalisations using corpus and web statistics, Proceedings of the Workshop on Multiword Expressions Identifying and Exploiting Underlying Properties, MWE '06, pp.54-61, 2006.
DOI : 10.3115/1613692.1613703

J. Nicholson and T. Baldwin, Interpreting compound nominalisations, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.43-45, 2008.

R. North, Computational measures of the acceptability of light verb constructions, 2005.

F. J. Och, Statistical machine translation: Foundations and recent advances, Proc. of the Tenth MT Summit (MT Summit 2005), 2005.

F. J. Och and H. Ney, Improved statistical alignment models, Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, ACL '00, pp.440-447, 2000.
DOI : 10.3115/1075218.1075274

F. J. Och and H. Ney, A Systematic Comparison of Various Statistical Alignment Models, Computational Linguistics, vol.29, issue.1, pp.19-51, 2003.
DOI : 10.1109/89.817451

F. J. Och and H. Ney, The Alignment Template Approach to Statistical Machine Translation, Computational Linguistics, vol.30, issue.4, pp.417-449, 2004.
DOI : 10.1162/089120103321337458

T. Ohta, Y. Tateishi, and J.-D. Kim, The GENIA corpus, Proceedings of the second international conference on Human Language Technology Research, pp.82-86, 2002.
DOI : 10.3115/1289189.1289260

S. Pal, S. K. Naskar, P. Pecina, S. Bandyopadhyay, and A. Way, Handling named entities and compound verbs in phrase-based statistical machine translation, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.45-53, 2010.

B. Pang and L. Lee, Opinion Mining and Sentiment Analysis, volume 2 of Foundations and Trends in Information Retrieval, 2008.

H. Papageorgiou, P. Prokopidis, V. Giouli, and S. Piperidis, A unified POS tagging architecture and its application to Greek, Proc. of the Second LREC, pp.1455-1462, 2000.

K. Papineni, S. Roukos, T. Ward, and W. Zhu, BLEU, Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL '02, pp.311-318, 2002.
DOI : 10.3115/1073083.1073135

D. Pearce, Synonymy in collocation extraction, WordNet and Other Lexical Resources: Applications, Extensions and Customizations (NAACL 2001 Workshop), pp.41-46, 2001.

D. Pearce, A comparative evaluation of collocation extraction techniques, Proc. of the Third LREC, pp.1530-1536, 2002.

P. Pecina, An extensive empirical study of collocation extraction methods, Proceedings of the ACL Student Research Workshop on, ACL '05, pp.13-18, 2005.
DOI : 10.3115/1628960.1628964

P. Pecina, A machine learning approach to multiword expression extraction, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.54-57, 2008.

P. Pecina, Reference data for Czech collocation extraction, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.11-14, 2008.

P. Pecina, Lexical association measures and collocation extraction, Language Resources and Evaluation, vol.44, issue.1-2, pp.137-158, 2010.
DOI : 10.1007/s10579-009-9101-4

T. Pedersen, Fishing for exactness, Proc. of the South-Central SAS Users Group Conference (SCSUG-96), pp.188-200, 1996.

T. Pedersen, S. Banerjee, B. McInnes, S. Kohli, M. Joshi et al., The n-gram statistics package (Text::NSP): A flexible tool for identifying n-grams, collocations, and word associations, pp.131-133, 2011.

S. S. Piao, P. Rayson, D. Archer, A. Wilson, and T. McEnery, Extracting multiword expressions with a semantic tagger, Proceedings of the ACL 2003 workshop on Multiword expressions analysis, acquisition and treatment, pp.49-56, 2003.
DOI : 10.3115/1119282.1119289

S. S. Piao, G. Sun, P. Rayson, and Q. Yuan, Automatic extraction of Chinese multiword expressions with a statistical tool, Proc. of the EACL Workshop on MWEs in Multilingual Context (EACL-MWE 2006), 2006.

E. Planas and O. Furuse, Multi-level similar segment matching algorithm for translation memories and Example-based Machine Translation, Proceedings of the 18th conference on Computational linguistics, 2000.
DOI : 10.3115/992730.992736

J. Preiss, T. Briscoe, and A. Korhonen, A system for large-scale acquisition of verbal, nominal and adjectival subcategorization frames from corpora, Proc. of the 45th ACL, pp.912-919, 2007.

C. Ramisch, Multiword terminology extraction for domain-specific documents, École Nationale Supérieure d'Informatique et de Mathématiques Appliquées, 2009.

C. Ramisch, A generic framework for multiword expressions treatment: from acquisition to applications, Proc. of the ACL 2012 SRW, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00954212

C. Ramisch, Une plate-forme générique et ouverte pour le traitement des expressions polylexicales, Actes de 14e Rencontres des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL 2012), 2012.

C. Ramisch, V. D. Araujo, and A. Villavicencio, A broad evaluation of techniques for automatic acquisition of multiword expressions, Proc. of the ACL 2012 SRW, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00954205

C. Ramisch, H. De-medeiros-caseli, A. Villavicencio, A. Machado, and M. J. Finatto, A Hybrid Approach for Multiword Expression Identification, Proc. of the 9th PROPOR, pp.65-74, 2010.
DOI : 10.1007/978-3-642-12320-7_9

URL : https://hal.archives-ouvertes.fr/hal-00959199

C. Ramisch, P. Schreiner, M. Idiart, and A. Villavicencio, An evaluation of methods for the extraction of multiword expressions, Proc. of the LREC Workshop Towards a Shared Task for MWEs, pp.50-53, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01200613

C. Ramisch, A. Villavicencio, and C. Boitet, Multiword expressions in the wild? The mwetoolkit comes in handy, Proc. of the 23rd COLING (COLING 2010), Demonstrations, The Coling 2010 Organizing Committee, pp.57-60, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00959176

C. Ramisch, A. Villavicencio, and C. Boitet, mwetoolkit: a framework for multiword expression identification, Proc. of the Seventh LREC, pp.662-669, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00959174

C. Ramisch, A. Villavicencio, and C. Boitet, Web-based and combined language models: a case study on noun compound identification, Proc. of the 23rd COLING (COLING 2010), Posters, The Coling 2010 Organizing Committee, pp.1041-1049, 2010.
URL : https://hal.archives-ouvertes.fr/hal-01002431

C. Ramisch, A. Villavicencio, L. Moura, and M. Idiart, Picking them up and figuring them out, Proceedings of the Twelfth Conference on Computational Natural Language Learning, CoNLL '08, pp.49-56, 2008.
DOI : 10.3115/1596324.1596334

URL : https://hal.archives-ouvertes.fr/hal-01200612

E. Ranchhod, Construções com nomes predicativos na crónica geral de espanha de 1344, Lindley Cintra. Homenagem ao Homem, ao Mestre e ao Cidadão, pp.667-682, 1999.

R. Rapp, The computation of associative responses to multiword stimuli, Proceedings of the workshop on Cognitive Aspects of the Lexicon, COGALEX '08, pp.102-109, 2008.
DOI : 10.3115/1598848.1598865

P. Rayson, S. Piao, S. Sharoff, S. Evert, and B. V. Moirón, Lang. Res. & Eval. Special Issue on Multiword expression: hard going or plain sailing, vol.44, 2010.

P. Rayson, S. Piao, S. Sharoff, S. Evert, and B. V. Moirón, Multiword expressions: hard going or plain sailing?, Lang. Res. & Eval. Special Issue on Multiword expression: hard going or plain sailing, pp.1-5, 2010.

Z. Ren, Y. Lü, J. Cao, Q. Liu, and Y. Huang, Improving statistical machine translation using domain bilingual multiword expressions, Proceedings of the Workshop on Multiword Expressions Identification, Interpretation, Disambiguation and Applications, MWE '09, pp.47-54, 2009.
DOI : 10.3115/1698239.1698249

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.212.5231

G. Rio-torto, O Léxico: semântica e gramática das unidades lexicais, Estudos sobre léxico e gramática, pp.11-34, 2006.

P. Rychlý and P. Smrz, Manatee, bonito and word sketches for Czech, Proceedings of the Second International Conference on Corpus Linguistics, pp.124-131, 2004.

I. Sag, T. Baldwin, F. Bond, A. Copestake, and D. Flickinger, Multiword Expressions: A Pain in the Neck for NLP, Proc. of the 3rd CICLing, pp.1-15, 2002.
DOI : 10.1007/3-540-45715-1_1

M. Salkoff, Automatic translation of support verb constructions, Proceedings of the 13th conference on Computational linguistics, pp.243-246, 1990.
DOI : 10.3115/991146.991189

E. Sanjuan, J. Dowdall, F. Ibekwe-sanjuan, and F. Rinaldi, A symbolic approach to automatic multiword term structuring, Computer Speech & Language, vol.19, issue.4, pp.524-542, 2005.
DOI : 10.1016/j.csl.2005.02.002

URL : https://hal.archives-ouvertes.fr/hal-00636158

H. Schmid, Probabilistic part-of-speech tagging using decision trees, Proceedings of the International Conference on New Methods in Language Processing, pp.44-49, 1994.

P. Schone and D. Jurafsky, Is knowledge-free induction of multiword unit dictionary headwords a solved problem, Proc. of the 2001 EMNLP, pp.100-108, 2001.

W. Schuler and A. Joshi, Tree-rewriting models of multi-word expressions, pp.25-30, 2011.

V. Seretan, Collocation extraction based on syntactic parsing, 2008.

V. Seretan and E. Wehrli, Multilingual collocation extraction, Proceedings of the Workshop on Multilingual Language Resources and Interoperability, MLRI '06, pp.40-49, 2006.
DOI : 10.3115/1613162.1613168

V. Seretan and E. Wehrli, Multilingual collocation extraction with a syntactic parser, Language Resources and Evaluation, vol.43, issue.1, pp.71-85, 2009.
DOI : 10.1007/s10579-008-9075-7

V. Seretan and E. Wehrli, Fipscoview: On-line visualisation of collocations extracted from multilingual parallel corpora, In Kordoni et al, pp.125-127, 2011.

S. Shimohata, T. Sugio, and J. Nagata, Retrieving collocations by co-occurrences and word order constraints, Proc. of the 35th ACL and 8th Conf. of the EACL (ACL-EACL 1997), pp.476-481, 1997.

T. Shinozaki and M. Ostendorf, Cross-validation and aggregated EM training for robust parameter estimation, Computer Speech & Language, vol.22, issue.2, pp.185-195, 2008.
DOI : 10.1016/j.csl.2007.07.005

H. M. Silva, Verbos-suporte ou expressões cristalizadas?, Soletras, pp.175-182, 2009.

J. Silva and G. Lopes, A local maxima method and a fair dispersion normalization for extracting multi-word units from corpora, Proceedings of the Sixth Meeting on Mathematics of Language (MOL6), pp.369-381, 1999.

J. Silva and G. Lopes, Towards automatic building of document keywords, Proc. of the 23rd COLING (COLING 2010), Posters, The Coling 2010 Organizing Committee, pp.1149-1157, 2010.

M. J. Silva, P. Carvalho, L. Sarmento, E. Oliveira, and P. Magalhães, The design of OPTIMISM, an opinion mining system for Portuguese politics, Proc. of the Fourteenth Portuguese Conference on Artificial Intelligence, pp.565-576, 2006.

J. Sinclair, Collins COBUILD Dictionary of Phrasal Verbs, Collins COBUILD, 1989.

R. M. Sinha, Mining complex predicates in Hindi using a parallel Hindi-English corpus, Proceedings of the Workshop on Multiword Expressions Identification, Interpretation, Disambiguation and Applications, MWE '09, pp.40-46, 2009.
DOI : 10.3115/1698239.1698247

R. M. Sinha, Stepwise mining of multi-word expressions in Hindi, pp.110-115, 2011.

F. A. Smadja, Retrieving collocations from text: Xtract, Comp. Ling, vol.19, issue.1, pp.143-177, 1993.

S. Spina, The dictionary of Italian collocations: Design and integration in an online learning environment, Proc. of the Seventh LREC, pp.3202-3208, 2010.

M. Steedman, On Becoming a Discipline, Computational Linguistics, vol.34, issue.1, pp.137-144, 2008.
DOI : 10.1162/coli.2008.34.1.137

S. Stevenson, A. Fazly, and R. North, Statistical measures of the semiproductivity of light verb constructions, Proc. of the ACL Workshop on MWEs: Integrating Processing, pp.1-8, 2004.

A. Stolcke, SRILM - an extensible language modeling toolkit, Proc. of the Seventh ICSLP, Third INTERSPEECH Event, pp.901-904, 2001.

S. Stymne, A comparison of merging strategies for translation of German compounds, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop on, EACL '09, pp.61-69, 2009.
DOI : 10.3115/1609179.1609187

S. Stymne, Pre-and postprocessing for statistical machine translation into Germanic languages, Proc. of the ACL 2011 SRW, pp.12-17, 2011.

T. Tanaka and T. Baldwin, Noun-noun compound machine translation: A feasibility study on shallow processing, Proc. of the ACL Workshop on MWEs: Analysis, Acquisition and Treatment, pp.17-24, 2003.

S. Teufel and G. Grefenstette, Corpus-based method for automatic identification of support verbs for nominalizations, Proc. of the 7th Conf. of the EACL (EACL 1995), pp.98-103, 1995.

C. Tillmann and H. Ney, Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation, Computational Linguistics, vol.29, issue.1, pp.97-133, 2003.
DOI : 10.1109/89.817451

K. Uchiyama, T. Baldwin, and S. Ishizaki, Disambiguating Japanese compound verbs, Computer Speech & Language, vol.19, issue.4, pp.497-512, 2005.
DOI : 10.1016/j.csl.2005.02.001

B. Vauquois, A survey of formal grammars and algorithms for recognition and transformation in mechanical translation, IFIP Congress, pp.1114-1122, 1968.

S. Venkatapathy and A. K. Joshi, Using information about multi-word expressions for the word-alignment task, Proceedings of the Workshop on Multiword Expressions Identifying and Exploiting Underlying Properties, MWE '06, pp.20-27, 2006.
DOI : 10.3115/1613692.1613697

S. Venkatsubramanyan and J. Perez-Carballo, Multiword expression filtering for building knowledge maps, Proceedings of the Workshop on Multiword Expressions Integrating Processing, MWE '04, pp.40-47, 2004.
DOI : 10.3115/1613186.1613192

URL : http://acl.ldc.upenn.edu/acl2004/mwe/pdf-onecolumn/venkatsubramanyan-1col.pdf

A. Villavicencio, F. Bond, A. Korhonen, and D. McCarthy, Introduction to the special issue on multiword expressions: Having a crack at a hard nut, Computer Speech & Language, vol.19, issue.4, pp.365-377, 2005.
DOI : 10.1016/j.csl.2005.05.001

A. Villavicencio, M. Idiart, C. Ramisch, V. D. Araujo, B. Yankama et al., Get out but don't fall down: verb-particle constructions in child language, pp.43-50, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00954204

A. Villavicencio, V. Kordoni, Y. Zhang, M. Idiart, and C. Ramisch, Validation and evaluation of automatically acquired multiword expressions for grammar engineering, Proc. of the 2007 Joint Conference on EMNLP and Computational NLL (EMNLP-CoNLL 2007), pp.1034-1043, 2007.
URL : https://hal.archives-ouvertes.fr/hal-01200614

A. Villavicencio, C. Ramisch, A. Machado, H. De-medeiros-caseli, and M. J. Finatto, Identificação de expressões multipalavra em domínios específicos, Linguamática, vol.2, issue.1, pp.15-33, 2010.

V. Vincze, I. Nagy T., and G. Berend, Detecting noun compounds and light verb constructions: a contrastive study, In Kordoni et al, pp.116-121, 2011.

W. Weaver, Translation, Machine Translation of Languages: Fourteen Essays, pp.15-23, 1955.

E. Wehrli, Translating idioms, Proc. of the 36th ACL and 17th COLING (ACL-COLING 1998), pp.1388-1392, 1998.

E. Wehrli, V. Seretan, and L. Nerima, Sentence analysis and collocation identification, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.27-35, 2010.

M. Weller and U. Heid, Extraction of German multiword expressions from parsed corpora using context features, Proc. of the Seventh LREC, pp.3195-3201, 2010.

J. Wermter and U. Hahn, You can't beat frequency (unless you use linguistic knowledge) - A qualitative evaluation of association measures for collocation and term extraction, Proc. of the 21st COLING and 44th ACL (COLING/ACL 2006), pp.785-792, 2006.

Y. Xu, R. Goebel, C. Ringlstetter, and G. Kondrak, Application of the tightness continuum measure to Chinese information retrieval, Proc. of the COLING Workshop on MWEs: from Theory to Applications, pp.54-62, 2010.

M. Yamamoto and K. Church, Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus, Computational Linguistics, vol.27, issue.1, pp.1-30, 2001.
DOI : 10.1007/BF01206331

D. Yarowsky, One sense per collocation, Proceedings of the workshop on Human Language Technology, HLT '93, pp.266-271, 1993.
DOI : 10.3115/1075671.1075731

A. Zaninello and M. Nissim, Creation of lexical resources for a characterisation of multiword expressions in Italian, Proc. of the Seventh LREC, pp.654-661, 2010.

S. Zarrieß and J. Kuhn, Exploiting translational correspondences for pattern-independent MWE identification, Proc. of the ACL Workshop on MWEs: Identification, pp.23-30, 2009.

Y. Zhang and V. Kordoni, Automated deep lexical acquisition for robust open texts processing, Proc. of the Sixth LREC, pp.275-280, 2006.

Y. Zhang, V. Kordoni, A. Villavicencio, and M. Idiart, Automated multiword expression prediction for grammar engineering, Proceedings of the Workshop on Multiword Expressions Identifying and Exploiting Underlying Properties, MWE '06, pp.36-44, 2006.
DOI : 10.3115/1613692.1613700

Next, the target MWEs are described by defining multilevel patterns that rely on expert linguistic knowledge, intuition, empirical observation and

For filtering, a multitude of methods is available, ranging from simple occurrence-count thresholds to stopword lists and sophisticated association measures

Finally, the filtered candidates are either injected directly into an NLP application or validated manually before being applied. Another use for the validated candidates is to build a machine learning model, which can be applied to new corpora in order to

To date, there is no consensus on an optimal method for MWE acquisition

It is therefore not possible to determine whether a single method exists for all MWEs, or whether one should instead look for a combination of methods or a subset of ... phrasal verbs through the use of a deeper parser that can capture long-distance dependencies in syntactically variable expressions. Potentially, syntactic information can provide new features for the translation model. Corpus-based detection of the compositionality of phrasal verbs (McCarthy et al. 2003) could help generate more accurate translations. We also want to study other multiword constructions that affect the equivalences and asymmetries between languages. Our long-term goal is to integrate MWEs into MT systems in order to obtain high-quality translations by combining statistical and linguistic information.

One of the results of this thesis is the CAMELEON project, funded by grant 707-11 of the CAPES-COFECUB call. Its main goal is to study automatic and collaborative techniques for developing ontological and lexical resources for multilingual applications. We want to study the contribution of a collaborative lexical resource management system to the filtering and validation of automatically acquired MWEs. We also have ongoing experiments, and some published preliminary results, on the automatic acquisition of a comparable corpus representing a sample of the language used in scientific conferences in French, Portuguese and English (Granada et al. 2012). At the same time, we are testing the feasibility of an approach based on the serious lexical game JeuxDeMots for building ... These experiments provide an interesting experimental setting for future research on the role of MWEs in the resources and applications created in these three languages

Of course, this ... numerous linguistic studies. In the early 2000s, ... asked whether the automatic identification of MWEs was a solved problem, and the answer that article gave at the time was negative. Likewise, more recent specialised publications give indications that this is still true today. For example, the prefaces of the latest journal issues devoted to MWEs (Villavicencio et al.; Rayson et al. 2010b) and the proceedings of the MWE workshop (Kordoni et al. 2011a) list several challenges in MWE processing, such as multilingualism, representation in lexicons and application-based evaluation.

One of the main contributions of this work lies in the fact that it represents a step towards the integration of automatically extracted MWEs into real NLP applications. Nevertheless, given the complexity of the problem, this processing must be continually improved, since it seems unlikely that a definitive and unified solution for MWE processing in NLP applications can be proposed in the near future. In the long term, our goal can therefore be summarised as improving and extending the work presented here. For while, on the one hand, we have taken an important first step, on the other hand the road ahead is still long.

It is practically impossible to crawl and download all the ever-growing text of the web, but search engines can be used to estimate the counts of words in the web. We use Google and Yahoo! search APIs and the implementation of the mwetoolkit.

TreeTagger. The TreeTagger is a freely downloadable POS tagger available for several languages, with good performance for English. It performs not only POS tagging but also sentence splitting, tokenisation and lemmatisation of the text. The TreeTagger is freely available at http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/. The tagset used by the TreeTagger in English is available

PALAVRAS. This deep syntactic parsing tool for Portuguese was used for the analysis of Portuguese text. It supports tokenisation, sentence splitting, POS tagging, lemmatisation, dependency parsing annotation and shallow semantic annotation. In most cases, only the first four features were used.

RASP. The RASP parser is a freely downloadable tool for the syntactic analysis of English text. It provides not only POS tagging but also constituent and dependency trees.