Context-aware worker selection for efficient quality control in crowdsourcing

Abstract : Crowdsourcing has proved its ability to address large-scale data collection tasks at a low cost and in a short time. However, because it depends on unknown workers, the quality of the crowdsourcing process is questionable and must be controlled. At the same time, maintaining the efficiency of crowdsourcing requires the time and cost overhead of this quality control to stay low. Current quality control techniques suffer from high time and budget overheads and from their dependency on prior knowledge about individual workers. In this thesis, we address these limitations by proposing the CAWS (Context-Aware Worker Selection) method, which operates in two phases. In an offline phase, the correlations between the workers' declarative profiles and the task types are learned. Then, in an online phase, the learned profile models are used to select the most reliable online workers for the incoming tasks depending on their types. Using declarative profiles eliminates the need for any probing process, which reduces the time and the budget while maintaining the crowdsourcing quality. In order to evaluate CAWS, we introduce an information-rich dataset called CrowdED (Crowdsourcing Evaluation Dataset). The generation of CrowdED relies on a constrained sampling approach that produces a dataset respecting the requester's budget and type constraints. Through its generality and richness, CrowdED also helps fill the benchmarking gap present in the crowdsourcing community. Using CrowdED, we evaluate the performance of CAWS in terms of quality, time and budget gain. Results show that automatic grouping is able to achieve a learning quality similar to job-based grouping, and that CAWS is able to outperform the state-of-the-art profile-based worker selection when it comes to quality, especially when strong budget and time constraints exist.
Finally, we propose CREX (CReate Enrich eXtend), which provides the tools to select and sample input tasks and to automatically generate custom crowdsourcing campaign sites in order to extend and enrich CrowdED.
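
The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names (`learn_profile_models`, `select_workers`), the representation of a declarative profile as a tuple, the use of per-pair average accuracy as the "learned model", and the neutral default score of 0.5 for unseen profiles are all assumptions made purely for illustration.

```python
from collections import defaultdict


def learn_profile_models(history):
    """Offline phase (illustrative): estimate accuracy per (profile, task_type).

    `history` is an iterable of (profile, task_type, was_correct) records,
    where `profile` is a hashable declarative profile (hypothetical format).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [correct answers, total answers]
    for profile, task_type, was_correct in history:
        entry = counts[(profile, task_type)]
        entry[0] += int(was_correct)
        entry[1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


def select_workers(model, online_workers, task_type, k, default=0.5):
    """Online phase (illustrative): pick the k online workers whose declared
    profile has the highest learned accuracy for the incoming task type.

    `online_workers` maps worker id -> profile. Profiles never seen offline
    receive the assumed neutral score `default`.
    """
    ranked = sorted(
        online_workers.items(),
        key=lambda item: model.get((item[1], task_type), default),
        reverse=True,
    )
    return [worker_id for worker_id, _ in ranked[:k]]


# Toy data, invented for the example only.
history = [
    (("student",), "translation", True),
    (("student",), "translation", True),
    (("engineer",), "translation", False),
    (("engineer",), "labeling", True),
]
model = learn_profile_models(history)
online = {"w1": ("engineer",), "w2": ("student",), "w3": ("artist",)}
selected = select_workers(model, online, "translation", k=2)
```

Because selection relies only on declared profiles and the offline model, no probing tasks are sent to workers at selection time, which is the source of the time and budget savings the abstract refers to.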

Cited literature : 151 references
Contributor : Tarek Awwad
Submitted on : Tuesday, February 12, 2019 - 10:10:43 AM
Last modification on : Friday, May 17, 2019 - 10:34:42 AM
Long-term archiving on : Monday, May 13, 2019 - 12:13:39 PM


Manuscript Tarek Awwad v2.pdf
Files produced by the author(s)


  • HAL Id : tel-01965446, version 1


Tarek Awwad. Context-aware worker selection for efficient quality control in crowdsourcing. Performance [cs.PF]. Université de Lyon; Universität Passau, 2018. English. ⟨NNT : 2018LYSEI099⟩. ⟨tel-01965446⟩


