DOI: 10.1145/3173574.3173722 (CHI Conference Proceedings)
research-article
Open Access

Distance and Attraction: Gravity Models for Geographic Content Production

Published: 19 April 2018

ABSTRACT

Volunteered Geographic Information (VGI), such as contributions to OpenStreetMap and geotagged Wikipedia articles, is often assumed to be produced locally. However, recent work has found that peer-produced VGI is frequently contributed by non-locals. We evaluate gravity models as a description of VGI production across hundreds of content types from Wikipedia, OpenStreetMap, and eBird, and show that these models can describe more than 90% of "VGI flows" for some content types. Our findings advance geographic HCI theory by suggesting some of the spatial mechanisms that underpin VGI production. We also discuss design implications that can help (a) human and algorithmic consumers of VGI evaluate the perspectives it contains and (b) address geographic coverage variations in these platforms (e.g., via more effective volunteer recruitment strategies).
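The abstract does not spell out the form of the gravity models named in the title. As a rough illustration only, the sketch below uses the generic textbook form, in which the flow between two places grows with the "mass" (attraction) of each place and decays with the distance between them. The function name, the constant `k`, and the exponents `alpha`, `beta`, and `gamma` are hypothetical placeholders for illustration, not the paper's specification or fitted values.

```python
def gravity_flow(k, mass_origin, mass_dest, distance,
                 alpha=1.0, beta=1.0, gamma=2.0):
    """Generic gravity model (illustrative assumption, not the paper's model):

        flow = k * mass_origin**alpha * mass_dest**beta / distance**gamma

    mass_origin / mass_dest: attraction terms, e.g. populations (assumed).
    distance: separation between the two places, must be positive.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * (mass_origin ** alpha) * (mass_dest ** beta) / (distance ** gamma)


# Hypothetical example: the same pair of places at two different separations.
near = gravity_flow(1e-6, 50_000, 200_000, distance=10.0)
far = gravity_flow(1e-6, 50_000, 200_000, distance=20.0)

# With gamma = 2, doubling the distance cuts the predicted flow to about a quarter.
ratio = near / far
print(ratio)
```

In practice the constant and exponents would be estimated from observed flows rather than fixed in advance; the values above are chosen only to make the distance decay easy to see.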
