Abstract
The spread of misinformation on social media, whether malicious or accidental, can harm individuals and society, especially in time-sensitive situations such as real-world emergencies. In this work, we developed models for the automated verification of rumors (unverified information) that propagate through Twitter. To predict the veracity of rumors, we identified salient features by examining three aspects of information spread: the linguistic style used to express rumors, the characteristics of the people involved in propagating them, and the network propagation dynamics. Each rumor (a collection of tweets) is represented as a time series of these features, and Hidden Markov Models generate the predicted veracity from that time series. The verification algorithm was trained and tested on 209 rumors representing 938,806 tweets collected from real-world events, including the 2013 Boston Marathon bombings, the 2014 Ferguson unrest, and the 2014 Ebola epidemic, as well as many other rumors about real-world events reported on popular websites that document public rumors. The algorithm correctly predicted the veracity of 75% of the rumors faster than any other public source, including journalists and law enforcement officials. The ability to track rumors and predict their outcomes may have practical applications for news consumers, financial markets, journalists, and emergency services and, more generally, may help minimize the impact of false information on Twitter.
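The abstract does not specify how the Hidden Markov Models are configured. As a rough illustration only, one common way to classify a time series with HMMs is to keep one model per class and pick the class whose model assigns the sequence the higher likelihood. The sketch below assumes features have already been quantized into discrete symbols (0, 1, 2); the names `TRUE_MODEL`, `FALSE_MODEL`, and `classify`, and all of the probabilities, are hypothetical toy values rather than anything taken from the paper. Training the parameters (for example with Baum-Welch) is omitted.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    n_states = len(start)
    # alpha[s] is proportional to P(observations so far, current state = s).
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow; the logs of the scales sum to log P(obs).
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical toy models (assumptions, not the paper's parameters).
# Each model: (start probabilities, transition matrix, emission matrix).
TRUE_MODEL = (
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2]],  # favours symbol 0
)
FALSE_MODEL = (
    [0.5, 0.5],
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.3, 0.6], [0.2, 0.3, 0.5]],  # favours symbol 2
)


def classify(obs):
    """Label a symbol sequence by comparing likelihoods under the two toy models."""
    ll_true = forward_log_likelihood(obs, *TRUE_MODEL)
    ll_false = forward_log_likelihood(obs, *FALSE_MODEL)
    return "true" if ll_true >= ll_false else "false"


if __name__ == "__main__":
    print(classify([0, 0, 1, 0]))
    print(classify([2, 2, 1, 2]))
```

Comparing per-class likelihoods keeps the decision rule simple and lets a prediction be recomputed as each new time step of a rumor arrives, which fits the goal of predicting veracity early.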
Index Terms
- Rumor Gauge: Predicting the Veracity of Rumors on Twitter