ABSTRACT
This paper presents an algorithm audit of the Google Top Stories box, a prominent component of search engine results and a powerful driver of traffic to news publishers. As such, it plays an important role in shaping user attention towards news outlets and topics. By analyzing the number of appearances of news article links, we contribute a series of novel analyses that provide an in-depth characterization of news source diversity and its implications for attention via Google search. We present results indicating a considerable degree of source concentration (with variation among search terms), a slight exaggeration in the ideological skew of news in comparison to a baseline, and a quantification of how the presentation of items translates into traffic and attention for publishers. We contribute insights that underscore the power that Google wields in exposing users to diverse news information, and raise important questions and opportunities for future work on algorithmic news curation.
Supplemental Material
This is a .csv file with all the search terms used in this study, generated through a method of selecting trending hard news topics and their most appropriate search term, as explained in the paper. The table also lists the contextual topics for each search term, the dates on which each term was used to collect data, and links to the corresponding Google Trends pages.
Index Terms
- Search as News Curator: The Role of Google in Shaping Attention to News Information