ABSTRACT
More and more researchers want to share research data collected from social media to allow for reproducibility and comparability of results. With this paper we want to encourage them to pursue this aim despite the initial obstacles they may face. Sharing can occur in various more or less formal ways. We provide background information that allows researchers to decide whether, how and where to share, depending on their specific situation (data, platform, targeted user group, research topic etc.). Ethical, legal and methodological considerations are important for making this decision. Based on these three dimensions we develop a framework for sharing social media data that can act as a first set of guidelines to help social media researchers make practical decisions for their own projects. In the long run, different stakeholders should join forces to enable better data sharing practices for social media researchers. This paper is intended as our call to action for the broader research community to advance current practices of data sharing.
Index Terms
- A manifesto for data sharing in social media research