Using Worker Self-Assessments for Competence-Based Pre-Selection in Crowdsourcing Microtasks

Abstract
Paid crowdsourcing platforms have evolved into large marketplaces where requesters can tap into human intelligence for a multitude of purposes, and workers can earn money for their effort. In this work, we focus on the competencies of individual crowd workers. Drawing on self-assessment theories from psychology, we show that crowd workers often lack awareness of their true level of competence. As a consequence, even workers who intend to maintain a high reputation tend to take up tasks that are beyond their competence. We reveal the diversity of individual worker competencies and make a case for competence-based pre-selection in crowdsourcing marketplaces. We demonstrate the implications of flawed self-assessments on real-world microtasks, and propose a novel worker pre-selection method that accounts for the accuracy of worker self-assessments. In a sentiment analysis task, our method improved accuracy by over 15% compared to traditional performance-based worker pre-selection; in an image validation task, it improved accuracy by nearly 6%. These results show that requesters on crowdsourcing platforms can benefit from considering worker self-assessments, in addition to worker performance, during pre-selection.
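To make the pre-selection idea concrete, the sketch below combines a worker's measured accuracy with how well their self-assessment matches it, and keeps the top-scoring workers. This is a minimal, hypothetical Python rendering, not the paper's actual scoring function: the `Worker` fields, the `selection_score` blend, and the `alpha` weight are all illustrative assumptions.

```python
# Illustrative sketch of competence-based pre-selection (assumed scoring,
# not the authors' formulation). A worker's self-assessment error is the gap
# between the accuracy they predict for themselves and the accuracy they
# actually achieve on qualification (gold) questions.

from dataclasses import dataclass


@dataclass
class Worker:
    worker_id: str
    actual_accuracy: float     # measured on qualification questions, in [0, 1]
    predicted_accuracy: float  # worker's own estimate of their accuracy, in [0, 1]


def selection_score(w: Worker, alpha: float = 0.5) -> float:
    """Blend raw performance with self-assessment accuracy.

    `alpha` is a hypothetical weighting parameter trading off performance
    against how well the worker knows their own competence.
    """
    self_assessment_accuracy = 1.0 - abs(w.predicted_accuracy - w.actual_accuracy)
    return alpha * w.actual_accuracy + (1.0 - alpha) * self_assessment_accuracy


def preselect(workers: list[Worker], k: int) -> list[Worker]:
    """Keep the k workers with the highest combined score."""
    return sorted(workers, key=selection_score, reverse=True)[:k]


if __name__ == "__main__":
    pool = [
        Worker("w1", actual_accuracy=0.90, predicted_accuracy=0.92),  # skilled and aware
        Worker("w2", actual_accuracy=0.90, predicted_accuracy=0.60),  # skilled but unaware
        Worker("w3", actual_accuracy=0.55, predicted_accuracy=0.95),  # overconfident
    ]
    for w in preselect(pool, k=2):
        print(w.worker_id, round(selection_score(w), 3))
```

Under this assumed scheme, a requester would run `preselect` over qualification-round data before opening the main task, so that an overconfident worker ranks below an equally accurate but well-calibrated one.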
References
- Yoram Bachrach, Thore Graepel, Gjergji Kasneci, Michal Kosinski, and Jurgen Van Gael. 2012. Crowd IQ: Aggregating opinions to boost performance. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS'12), Volume 1. International Foundation for Autonomous Agents and Multiagent Systems, 535--542.
- David Boud. 2013. Enhancing Learning Through Self-Assessment. Routledge.
- Katherine A. Burson, Richard P. Larrick, and Joshua Klayman. 2006. Skilled or unskilled, but still unaware of it: How perceptions of difficulty drive miscalibration in relative comparisons. Journal of Personality and Social Psychology 90, 1 (2006), 60.
- Carrie J. Cai, Shamsi T. Iqbal, and Jaime Teevan. 2016. Chain reactions: The impact of order on microtask chains. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 3143--3154.
- Lydia B. Chilton, John J. Horton, Robert C. Miller, and Shiri Azenkot. 2010. Task search in a human computation market. In Proceedings of the ACM SIGKDD Workshop on Human Computation. ACM, 1--9.
- Djellel Eddine Difallah, Michele Catasta, Gianluca Demartini, Panagiotis G. Ipeirotis, and Philippe Cudré-Mauroux. 2015. The dynamics of micro-task crowdsourcing: The case of Amazon MTurk. In Proceedings of the 24th International Conference on World Wide Web (WWW'15). ACM, 238--247.
- Steven Dow, Anand Kulkarni, Scott Klemmer, and Björn Hartmann. 2012. Shepherding the crowd yields better work. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. ACM, 1013--1022.
- Christoph Dukat and Simon Caton. 2013. Towards the competence of crowdsourcees: Literature-based considerations on the problem of assessing crowdsourcees' qualities. In Proceedings of the 3rd International Conference on Cloud and Green Computing. IEEE, 536--540.
- David Dunning. 2011. The Dunning-Kruger effect: On being ignorant of one's own ignorance. Advances in Experimental Social Psychology 44 (2011), 247.
- David Dunning, Chip Heath, and Jerry M. Suls. 2004. Flawed self-assessment: Implications for health, education, and the workplace. Psychological Science in the Public Interest 5, 3 (2004), 69--106.
- Joyce Ehrlinger and David Dunning. 2003. How chronic self-views influence (and potentially mislead) estimates of performance. Journal of Personality and Social Psychology 84, 1 (2003), 5.
- Joyce Ehrlinger, Kerri Johnson, Matthew Banner, David Dunning, and Justin Kruger. 2008. Why the unskilled are unaware: Further explorations of (absent) self-insight among the incompetent. Organizational Behavior and Human Decision Processes 105, 1 (2008), 98--121.
- Ujwal Gadiraju, Besnik Fetahu, and Ricardo Kawase. 2015. Training workers for improving performance in crowdsourcing microtasks. In Proceedings of the 10th European Conference on Technology Enhanced Learning (EC-TEL'15). Springer, 100--114.
- Ujwal Gadiraju, Ricardo Kawase, and Stefan Dietze. 2014. A taxonomy of microtasks on the web. In Proceedings of the 25th ACM Conference on Hypertext and Social Media. ACM, 218--223.
- Ujwal Gadiraju, Ricardo Kawase, Stefan Dietze, and Gianluca Demartini. 2015. Understanding malicious behavior in crowdsourcing platforms: The case of online surveys. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI'15). ACM, 1631--1640.
- Lilly C. Irani and M. Six Silberman. 2013. Turkopticon: Interrupting worker invisibility in Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 611--620.
- Nicolas Kaufmann, Thimo Schulze, and Daniel Veit. 2011. More than fun and money. Worker motivation in crowdsourcing: A study on Mechanical Turk. In Proceedings of the 17th Americas Conference on Information Systems (AMCIS'11). Association for Information Systems, Detroit, Michigan, USA.
- Gabriella Kazai. 2011. In search of quality in crowdsourcing for search engine evaluation. In Advances in Information Retrieval. Springer, 165--176.
- Gabriella Kazai, Jaap Kamps, and Natasa Milic-Frayling. 2011. Worker types and personality traits in crowdsourcing relevance labels. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management. ACM, 1941--1944.
- Gabriella Kazai, Jaap Kamps, and Natasa Milic-Frayling. 2012. The face of quality in crowdsourcing relevance labels: Demographics, personality and labeling accuracy. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 2583--2586.
- Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 453--456.
- Aniket Kittur, Jeffrey V. Nickerson, Michael Bernstein, Elizabeth Gerber, Aaron Shaw, John Zimmerman, Matt Lease, and John Horton. 2013. The future of crowd work. In Proceedings of the 2013 Conference on Computer Supported Cooperative Work. ACM, 1301--1318.
- Michal Kosinski, Yoram Bachrach, Gjergji Kasneci, Jurgen Van Gael, and Thore Graepel. 2012. Crowd IQ: Measuring the intelligence of crowdsourcing platforms. In Proceedings of the 4th Annual ACM Web Science Conference. ACM, 151--160.
- Marian Krajc and Andreas Ortmann. 2008. Are the unskilled really that unaware? An alternative explanation. Journal of Economic Psychology 29, 5 (2008), 724--738.
- Joachim Krueger and Ross A. Mueller. 2002. Unskilled, unaware, or both? The better-than-average heuristic and statistical regression predict errors in estimates of own performance. Journal of Personality and Social Psychology 82, 2 (2002), 180.
- Justin Kruger and David Dunning. 1999. Unskilled and unaware of it: How difficulties in recognizing one's own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology 77, 6 (1999), 1121.
- Chinmay Kulkarni, Koh Pang Wei, Huy Le, Daniel Chia, Kathryn Papadopoulos, Justin Cheng, Daphne Koller, and Scott R. Klemmer. 2015. Peer and self assessment in massive online classes. In Design Thinking Research. Springer, 131--168.
- John Le, Andy Edmonds, Vaughn Hester, and Lukas Biewald. 2010. Ensuring quality in crowdsourced search relevance evaluation: The effects of training question distribution. In Proceedings of the SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation. 21--26.
- Catherine C. Marshall and Frank M. Shipman. 2013. Experiences surveying the crowd: Reflections on methods, participation, and reliability. In Proceedings of the 5th Annual ACM Web Science Conference. ACM, 234--243.
- David Martin, Benjamin V. Hanrahan, Jacki O'Neill, and Neha Gupta. 2014. Being a turker. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing. ACM, 224--235.
- David Martin, Jacki O'Neill, Neha Gupta, and Benjamin V. Hanrahan. 2016. Turking in a global labour market. Computer Supported Cooperative Work (CSCW) 25, 1 (2016), 39--77.
- Winter Mason and Siddharth Suri. 2012. Conducting behavioral research on Amazon's Mechanical Turk. Behavior Research Methods 44, 1 (2012), 1--23.
- Tyler M. Miller and Lisa Geraci. 2011. Training metacognition in the classroom: The influence of incentives and feedback on exam predictions. Metacognition and Learning 6, 3 (2011), 303--314.
- Edward Newell and Derek Ruths. 2016. How one microtask affects another. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 3155--3166.
- David Oleson, Alexander Sorokin, Greg P. Laughlin, Vaughn Hester, John Le, and Lukas Biewald. 2011. Programmatic gold: Targeted and scalable quality assurance in crowdsourcing. Human Computation 11, 11 (2011).
- Joel Ross, Lilly Irani, M. Six Silberman, Andrew Zaldivar, and Bill Tomlinson. 2010. Who are the crowdworkers? Shifting demographics in Mechanical Turk. In CHI'10 Extended Abstracts on Human Factors in Computing Systems. ACM, 2863--2872.
- Thomas Schlösser, David Dunning, Kerri L. Johnson, and Justin Kruger. 2013. How unaware are the unskilled? Empirical tests of the signal extraction counter explanation for the Dunning-Kruger effect in self-evaluation of performance. Journal of Economic Psychology 39 (2013), 85--100.
- Barry Schwartz. 2004. The Paradox of Choice: Why More Is Less. Ecco, New York.
- Barry Schwartz and Andrew Ward. 2004. Doing better but feeling worse: The paradox of choice. Positive Psychology in Practice (2004), 86--104.
- Han Yu, Zhiqi Shen, Chunyan Miao, and Bo An. 2012. Challenges and opportunities for trust management in crowdsourcing. In Proceedings of the 2012 IEEE/WIC/ACM International Conferences on Intelligent Agent Technology (IAT'12). IEEE Computer Society, 486--493.
- Ujwal Gadiraju and Stefan Dietze. 2017. Improving learning through achievement priming in crowdsourced information finding microtasks. In Proceedings of the Seventh International Learning Analytics & Knowledge Conference (LAK'17). ACM, Vancouver, BC, Canada, 105--114.