DOI: 10.1145/2908131.2908172
short-paper
Open Access

A manifesto for data sharing in social media research

Published: 22 May 2016

ABSTRACT

More and more researchers want to share research data collected from social media to allow for reproducibility and comparability of results. With this paper we want to encourage them to pursue this aim, despite the initial obstacles they may face. Sharing can occur in various, more or less formal ways. We provide background information that allows researchers to decide whether, how and where to share, depending on their specific situation (data, platform, targeted user group, research topic, etc.). Ethical, legal and methodological considerations are important for making this decision. Based on these three dimensions we develop a framework for social media data sharing that can act as a first set of guidelines to help social media researchers make practical decisions for their own projects. In the long run, different stakeholders should join forces to enable better data sharing practices for social media researchers. This paper is intended as our call to action for the broader research community to advance current practices of data sharing.

