Bias on the web

Published: 23 May 2018

Abstract

Bias in Web data and use taints the algorithms behind Web-based applications, delivering equally biased results.

          • Published in

            Communications of the ACM  Volume 61, Issue 6
            June 2018
            97 pages
            ISSN:0001-0782
            EISSN:1557-7317
            DOI:10.1145/3229066
            Copyright © 2018 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Qualifiers

            • research-article
            • Popular
            • Refereed
