Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use

Abstract: This article is a position paper on crowdsourced microworking systems, and especially Amazon Mechanical Turk, whose use in language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the literature of the field, this type of online working platform makes it possible to develop all sorts of quality language resources very quickly and at a very low price, using people who do the work as a hobby or to earn some extra cash. We shall demonstrate here that the situation is far from ideal, whether in terms of quality, price, workers' status, or ethics, and we recall alternatives that already exist or have been proposed. Our goal here is threefold: (1) to inform researchers, so that they can make their own choices with all the relevant elements in mind; (2) to ask funding agencies and scientific associations for help, and to develop alternatives; (3) to propose practical and organizational solutions that improve the development of new language resources while limiting the risk of ethical and legal issues, without sacrificing price or quality.
Document type:
Conference paper
5th Language and Technology Conference, Nov 2011, Poznan, Poland. 2011

https://hal.archives-ouvertes.fr/hal-00648187
Contributor: Karën Fort
Submitted on: Monday, 5 December 2011 - 11:56:42
Last modified on: Friday, 4 January 2019 - 17:33:24
Document(s) archived on: Friday, 16 November 2012 - 14:21:17

File

ltc-56-adda_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00648187, version 1

Citation

Gilles Adda, Benoît Sagot, Karën Fort, Joseph Mariani. Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use. 5th Language and Technology Conference, Nov 2011, Poznan, Poland. 2011. 〈hal-00648187〉
