Conference papers

Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use

Abstract : This article is a position paper about crowdsourced microworking systems, especially Amazon Mechanical Turk, whose use in language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the articles of the domain, this type of online working platform makes it possible to develop all sorts of quality language resources very quickly and at a very low price, relying on people who do the work as a hobby or to earn some extra cash. We shall demonstrate here that the situation is far from ideal, be it from the point of view of quality, price, workers' status or ethics, and recall alternatives that already exist or have been proposed. Our goal here is threefold: (1) to inform researchers, so that they can make their own choices with all the elements of the reflection in mind; (2) to ask funding agencies and scientific associations for help, and to develop alternatives; (3) to propose practical and organizational solutions to improve the development of new language resources, while limiting the risks of ethical and legal issues without sacrificing price or quality.

https://hal.archives-ouvertes.fr/hal-00648187
Contributor : Karën Fort
Submitted on : Monday, December 5, 2011 - 11:56:42 AM
Last modification on : Friday, March 27, 2020 - 3:07:04 AM
Document(s) archived on : Friday, November 16, 2012 - 2:21:17 PM

File

ltc-56-adda_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00648187, version 1

Citation

Gilles Adda, Benoît Sagot, Karen Fort, Joseph Mariani. Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use. 5th Language and Technology Conference, Nov 2011, Poznan, Poland. ⟨hal-00648187⟩
