research-article
Open Access

Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size

Published: 05 May 2016

Abstract

We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BeeWatch, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BeeWatch can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label.
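To make the incremental idea concrete, the sketch below illustrates the general scheme described in the abstract: labels are solicited one at a time, a posterior over the 22 candidate species is updated after each label, and solicitation stops once the consensus label is sufficiently confident. This is an illustrative sketch only, not the authors' model; the single fixed labeler accuracy, the confidence threshold, and names such as `incremental_consensus` and `user_accuracy` are assumptions introduced here for demonstration.

```python
# Illustrative sketch (not the paper's exact model): incremental Bayesian
# consensus labeling with an early-stopping rule. After each solicited label,
# the posterior over species is updated; we stop soliciting once the most
# probable species exceeds a confidence threshold, keeping the crowd small.
# The accuracy value and threshold below are hypothetical.

import numpy as np

def incremental_consensus(solicit_label, n_species=22, user_accuracy=0.7,
                          confidence=0.95, max_labels=10):
    """Solicit labels until the posterior over species is confident enough.

    solicit_label: callable returning the next crowd label (0..n_species-1).
    user_accuracy: assumed probability that a labeler picks the true species;
                   errors are spread uniformly over the other species.
    """
    posterior = np.full(n_species, 1.0 / n_species)   # uniform prior
    labels = []
    for _ in range(max_labels):
        y = solicit_label()
        labels.append(y)
        # Likelihood of observing label y under each candidate true species.
        likelihood = np.full(n_species, (1.0 - user_accuracy) / (n_species - 1))
        likelihood[y] = user_accuracy
        posterior *= likelihood
        posterior /= posterior.sum()
        if posterior.max() >= confidence:   # consensus is reliable: stop early
            break
    return int(posterior.argmax()), posterior.max(), labels

# Example: a simulated noisy crowd whose true answer is species 3.
rng = np.random.default_rng(0)
def noisy_labeler(true_species=3, accuracy=0.7, n_species=22):
    if rng.random() < accuracy:
        return true_species
    return int(rng.choice([s for s in range(n_species) if s != true_species]))

species, conf, used = incremental_consensus(noisy_labeler)
print(f"consensus: species {species} at confidence {conf:.2f} after {len(used)} labels")
```

With a large label space such as 22 species, even a modest per-labeler accuracy concentrates the posterior quickly once a few labelers agree, which is consistent with the abstract's finding that around three to five users typically suffice; majority voting, by contrast, cannot exploit this and needs a fixed, larger crowd.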




  • Published in

    ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 4
    Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
    July 2016
    498 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/2906145
    • Editor: Yu Zheng

    Copyright © 2016 Owner/Author

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 5 May 2016
    • Accepted: 1 May 2015
    • Revised: 1 March 2015
    • Received: 1 January 2015
    Published in ACM TIST, Volume 7, Issue 4


    Qualifiers

    • research-article
    • Research
    • Refereed
