Selection Bias in News Coverage: Learning it, Fighting it

Authors:
Dylan Bourgeois, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Jérémie Rappaz, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
Karl Aberer, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland

WWW '18: Companion Proceedings of the The Web Conference 2018, April 2018, Pages 535–543, https://doi.org/10.1145/3184558.3188724

Published: 23 April 2018


ABSTRACT

News entities must select and filter the coverage they broadcast through their respective channels, since the set of world events is too large to be treated exhaustively. The subjective nature of this filtering induces biases due to, among other things, resource constraints, editorial guidelines, ideological affinities, or even the fragmented nature of the information at a journalist's disposal. The magnitude and direction of these biases are, however, largely unknown. The absence of ground truth, the sheer size of the event space, and the lack of an exhaustive set of absolute features to measure make it difficult to observe the bias directly, to characterize the nature of the leaning, and to factor it out to ensure neutral coverage of the news. In this work, we introduce a methodology to capture the latent structure of the media's decision process at a large scale. Our contribution is threefold. First, we show that media coverage is predictable using personalization techniques, and we evaluate our approach on a large set of events collected from the GDELT database. Second, we show that a personalized and parametrized approach not only achieves higher accuracy in coverage prediction, but also provides an interpretable representation of the selection bias. Finally, we propose a method that selects a set of sources by leveraging the latent representation. These selected sources provide more diverse and egalitarian coverage while retaining the most actively covered events.
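The abstract does not spell out the model, so the following is only a minimal sketch of what "predicting coverage with personalization techniques" could look like. It assumes a pairwise ranking objective in the style of Bayesian Personalized Ranking over a binary source-by-event coverage relation; this is a swapped-in illustration, not necessarily the authors' formulation. The function names (`train_bpr`, `score`), the hyperparameters, and the toy outlets and events are all hypothetical.

```python
import math
import random


def train_bpr(coverage, n_events, dim=2, epochs=300, lr=0.05, reg=0.01, seed=0):
    """Fit source and event embeddings with BPR-style pairwise updates.

    coverage maps a source name to the set of event ids it reported on.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    sources = sorted(coverage)
    # Small random initial embeddings for sources (P) and events (Q).
    P = {s: [rng.gauss(0.0, 0.1) for _ in range(dim)] for s in sources}
    Q = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_events)]
    for _ in range(epochs):
        for s in sources:
            uncovered = [e for e in range(n_events) if e not in coverage[s]]
            if not uncovered:
                continue  # no negative example to contrast against
            for i in sorted(coverage[s]):
                j = rng.choice(uncovered)  # sampled event the source skipped
                p, qi, qj = P[s], Q[i], Q[j]
                # Score margin between the covered and the uncovered event.
                x = sum(pk * (a - b) for pk, a, b in zip(p, qi, qj))
                g = 1.0 / (1.0 + math.exp(x))  # sigmoid(-x), the gradient weight
                for k in range(dim):
                    pk, a, b = p[k], qi[k], qj[k]  # values before the update
                    p[k] += lr * (g * (a - b) - reg * pk)
                    qi[k] += lr * (g * pk - reg * a)
                    qj[k] += lr * (-g * pk - reg * b)
    return P, Q


def score(P, Q, source, event):
    """Predicted affinity of a source for an event (dot product)."""
    return sum(pk * qk for pk, qk in zip(P[source], Q[event]))


# Toy data (invented): two pairs of outlets with disjoint coverage.
coverage = {
    "outlet_a": {0, 1, 2},
    "outlet_b": {0, 1, 2},
    "outlet_c": {3, 4, 5},
    "outlet_d": {3, 4, 5},
}
P, Q = train_bpr(coverage, n_events=6)
for s in sorted(coverage):
    ranked = sorted(range(6), key=lambda e: score(P, Q, s, e), reverse=True)
    print(s, ranked)
```

In a sketch like this, the learned source vectors in `P` play the role of the latent representation: outlets with similar coverage patterns end up close together, which is the kind of structure a source-selection step could then exploit.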


Published in
WWW '18: Companion Proceedings of the The Web Conference 2018, April 2018, 2023 pages
ISBN: 9781450356404
General Chairs: Pierre-Antoine Champin (Université Claude Bernard Lyon 1, France), Fabien Gandon (Inria, Université Côte d'Azur, CNRS, I3S, France), Lionel Médini (Université Claude Bernard Lyon 1, CNRS, LIRIS, France)
Program Chairs: Mounia Lalmas (Spotify, UK), Panagiotis G. Ipeirotis (New York University, USA)
Copyright © 2018 ACM
Publisher
International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland
Author Tags
echo-chamber
factorization methods
media pluralism
news coverage
ranking methods
selection bias
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
