Abstract: The level of agreement among participants is a key aspect of gesture elicitation studies, and it is typically quantified by means of agreement rates (AR). We show that this measure is problematic because it does not account for chance agreement. The problem of chance agreement has been discussed extensively in a range of scientific fields in the context of inter-rater reliability studies. We review chance-corrected agreement coefficients that are routinely used in inter-rater reliability studies and show how to apply them to gesture elicitation studies. We also discuss how to compute interval estimates for these coefficients and how to use them for statistical inference.