Rumor Gauge: Predicting the Veracity of Rumors on Twitter

Authors:
Soroush Vosoughi, Massachusetts Institute of Technology, Cambridge, Massachusetts (ORCID: 0000-0002-2564-8909)
Mostafa ‘Neo’ Mohsenvand, Massachusetts Institute of Technology, Cambridge, Massachusetts
Deb Roy, Massachusetts Institute of Technology, Cambridge, Massachusetts

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 4, Article No. 50, pp. 1–36. https://doi.org/10.1145/3070644

Published: 14 July 2017

Abstract

The spread of malicious or accidental misinformation in social media, especially in time-sensitive situations such as real-world emergencies, can have harmful effects on individuals and society. In this work, we developed models for automated verification of rumors (unverified information) that propagate through Twitter. To predict the veracity of rumors, we identified salient features of rumors by examining three aspects of information spread: the linguistic style used to express rumors, the characteristics of the people involved in propagating information, and network propagation dynamics. Hidden Markov Models then predict the veracity of a rumor (a collection of tweets) from a time series of these features. The verification algorithm was trained and tested on 209 rumors representing 938,806 tweets collected from real-world events, including the 2013 Boston Marathon bombings, the 2014 Ferguson unrest, and the 2014 Ebola epidemic, as well as many other rumors about real-world events reported on popular websites that document public rumors. The algorithm correctly predicted the veracity of 75% of the rumors faster than any other public source, including journalists and law enforcement officials. The ability to track rumors and predict their outcomes may have practical applications for news consumers, financial markets, journalists, and emergency services and, more generally, may help minimize the impact of false information on Twitter.
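The abstract does not spell out how the Hidden Markov Models are configured, so the following is only a minimal sketch of one common way to classify a time series with HMMs: keep one model per class and pick the class whose model assigns the sequence the higher likelihood. Everything concrete here is an assumption made for illustration, not the authors' setup: the two-state models, the three-symbol alphabet (a single feature quantized to low/medium/high), all probability values, and the names `forward_log_likelihood` and `predict_veracity`.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow; the product of scales is P(obs).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical hand-set parameters (NOT taken from the paper).
# Observation symbols: 0 = low, 1 = medium, 2 = high feature value.
TRUE_HMM = {
    "start": [0.6, 0.4],
    "trans": [[0.7, 0.3], [0.2, 0.8]],
    "emit": [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
}
FALSE_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.6, 0.4], [0.3, 0.7]],
    "emit": [[0.1, 0.3, 0.6], [0.3, 0.4, 0.3]],
}

def predict_veracity(obs):
    """Return 'true' or 'false' depending on which class model explains
    the observed feature sequence better."""
    ll_true = forward_log_likelihood(obs, **TRUE_HMM)
    ll_false = forward_log_likelihood(obs, **FALSE_HMM)
    return "true" if ll_true >= ll_false else "false"

if __name__ == "__main__":
    # A rumor whose (hypothetical) feature stays low over four time steps.
    print(predict_veracity([0, 0, 0, 0]))
```

In a real system the model parameters would be estimated from labeled rumors rather than set by hand, and the observations would be multi-dimensional feature vectors rather than a single quantized symbol.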

Index Terms

Rumor Gauge: Predicting the Veracity of Rumors on Twitter
1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
    2. Natural language processing
      1. Information extraction
2. Information systems
  1. Information retrieval
  2. Information systems applications
    1. Data mining

Published in
ACM Transactions on Knowledge Discovery from Data Volume 11, Issue 4
Special Issue on KDD 2016 and Regular Papers
November 2017
419 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3119906
Editor:
Jie Tang
Tsinghua University, China
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 14 July 2017
- Accepted: 1 March 2017
- Revised: 1 October 2016
- Received: 1 November 2015

Author Tags
Twitter
fake news
propagation
rumor
veracity prediction
Qualifiers
- research-article
- Research
- Refereed