ABSTRACT
Fact checking has captured the attention of the media and the public alike. It has also recently received strong attention from the computer science community, in particular from data and knowledge management, natural language processing, and information retrieval; we refer to these fields collectively as "content management". In this paper, we identify the fact checking tasks that can be performed with the help of content management technologies and survey recent research in this area, before laying out some perspectives for the future. We hope our work will provide interested researchers, journalists, and fact checkers with an entry point into the existing literature and help develop a roadmap for future research and development.
Index Terms
- A Content Management Perspective on Fact-Checking