R. Artstein and M. Poesio, Inter-Coder Agreement for Computational Linguistics, Computational Linguistics, vol.34, issue.4, pp.555-596, 2008.
DOI : 10.1162/coli.07-034-R2

P. S. Bayerl and K. I. Paul, What Determines Inter-Coder Agreement in Manual Annotations? A Meta-Analytic Investigation, Computational Linguistics, vol.37, issue.4, pp.699-725, 2011.
DOI : 10.1162/COLI_a_00074

C. Benzitoun, K. Fort, and B. Sagot, TCOF-POS : un corpus libre de français parlé annoté en morphosyntaxe, Proceedings of the Traitement Automatique des Langues Naturelles (TALN), pp.99-112, 2012.

Y. Bestgen, Quels indices pour mesurer l'efficacité en segmentation thématique?, Proceedings of the TALN'09, 2009.

A. Bookstein, V. A. Kulyukin, and T. Raita, Generalized Hamming distance, Information Retrieval, vol.5, issue.4, pp.353-375, 2002.

J. Cohen, A Coefficient of Agreement for Nominal Scales, Educational and Psychological Measurement, vol.20, issue.1, pp.37-46, 1960.
DOI : 10.1177/001316446002000104

J. Cohen, Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit, Psychological Bulletin, vol.70, issue.4, pp.213-220, 1968.
DOI : 10.1037/h0026256

K. Fort, C. François, O. Galibert, and M. Ghribi, Analyzing the impact of prevalence on the evaluation of a manual annotation campaign, Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00709174

U. Gut and P. S. Bayerl, Measuring the reliability of manual annotations of speech corpora, Proceedings of the Speech Prosody, pp.565-568, 2004.

K. L. Gwet, Handbook of Inter-rater Reliability, 2012.

K. Krippendorff, On the reliability of unitizing contiguous data, Sociological Methodology, vol.25, pp.47-76, 1995.

Y. Mathet and A. Widlöcher, Une approche holiste et unifiée de l'alignement et de la mesure d'accord inter-annotateurs, Proceedings of the Traitement Automatique des Langues Naturelles, 2011.

L. Pevzner and M. A. Hearst, A Critique and Improvement of an Evaluation Metric for Text Segmentation, Computational Linguistics, vol.28, issue.1, pp.19-36, 2002.
DOI : 10.1162/089120102317341756

D. Reidsma and J. Carletta, Reliability Measurement without Limits, Computational Linguistics, vol.34, issue.3, pp.319-326, 2008.
DOI : 10.1162/coli.2008.34.3.319
URL : http://doi.org/10.1162/coli.2008.34.3.319

N. Schluter, Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources, 2011.