Efficient Crowd Exploration of Large Networks: The Case of Causal Attribution

Authors:
Daniel Berenberg
University of Vermont, Burlington, VT, USA

James P. Bagrow
University of Vermont, Burlington, VT, USA

Proceedings of the ACM on Human-Computer Interaction, Volume 2, Issue CSCW, Article No. 24, pp. 1–25. https://doi.org/10.1145/3274293

Published: 01 November 2018

Abstract

Accurately and efficiently crowdsourcing complex, open-ended tasks can be difficult, as crowd participants tend to favor short, repetitive "microtasks". We study the crowdsourcing of large networks where the crowd provides the network topology via microtasks. Crowds can explore many types of social and information networks, but we focus on the network of causal attributions, an important network that signifies cause-and-effect relationships. We conduct experiments on Amazon Mechanical Turk (AMT) testing how workers can propose and validate individual causal relationships, and we introduce a method for independent crowd workers to explore large networks. The core of the method, Iterative Pathway Refinement, is a theoretically principled mechanism for efficient exploration via microtasks. We evaluate the method using synthetic networks and apply it on AMT to extract a large-scale causal attribution network. Worker interactions reveal important characteristics of causal perception, and the generated network data can help improve our understanding of causality and causal inference.
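The abstract does not spell out how Iterative Pathway Refinement works, so the following is not the authors' method. It is only a minimal sketch of the general idea of exploring a hidden network through many short, independent steps, using plain self-avoiding walks (one of the article's author tags) on a synthetic random graph. The graph model, the function names (`make_random_digraph`, `self_avoiding_walk`, `explore`), and all parameter values are assumptions chosen for illustration.

```python
import random


def make_random_digraph(n, p, rng):
    """Hypothetical synthetic ground truth: a directed random graph on n nodes
    where each possible edge u -> v exists independently with probability p."""
    return {
        u: [v for v in range(n) if v != u and rng.random() < p]
        for u in range(n)
    }


def self_avoiding_walk(graph, start, max_steps, rng):
    """Follow out-edges without revisiting a node.

    Stops after max_steps steps or when every neighbor has been visited."""
    path = [start]
    visited = {start}
    while len(path) - 1 < max_steps:
        options = [v for v in graph[path[-1]] if v not in visited]
        if not options:
            break
        nxt = rng.choice(options)
        path.append(nxt)
        visited.add(nxt)
    return path


def explore(graph, n_walks, max_steps, rng):
    """Run independent walks and collect the edges they traverse.

    Assumption for illustration: one walk stands in for one worker's pathway,
    and each step stands in for one microtask."""
    discovered = set()
    nodes = list(graph)
    for _ in range(n_walks):
        path = self_avoiding_walk(graph, rng.choice(nodes), max_steps, rng)
        discovered.update(zip(path, path[1:]))
    return discovered


rng = random.Random(0)
truth = make_random_digraph(200, 0.03, rng)
total_edges = sum(len(vs) for vs in truth.values())
found = explore(truth, n_walks=100, max_steps=10, rng=rng)
print(f"discovered {len(found)} of {total_edges} edges")
```

In a sketch like this, comparing the number of discovered edges against the number of steps taken is one simple way to think about exploration efficiency on a synthetic network; the evaluation in the paper itself may differ.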


Published in
Proceedings of the ACM on Human-Computer Interaction, Volume 2, Issue CSCW
November 2018
4104 pages
EISSN: 2573-0142
DOI: 10.1145/3290265
Editors:
Karrie Karahalios, University of Illinois & Adobe
Andrés Monroy-Hernández, Snap Inc.
Airi Lampinen, Stockholm University
Geraldine Fitzpatrick, Vienna University of Technology
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
amazon mechanical turk
causal attribution
causality
crowdsourcing
crowdwork
microtasks
network motifs
networks
self-avoiding walks
Qualifiers
- research-article