research-article

FA*IR: A Fair Top-k Ranking Algorithm

Authors:
Meike Zehlike

TU Berlin, Berlin, Germany

TU Berlin, Berlin, Germany
View Profile

,
Francesco Bonchi

ISI Foundation, Turin, Italy

ISI Foundation, Turin, Italy
View Profile

,
Carlos Castillo

Universitat Pompeu Fabra, Barcelona, Spain

Universitat Pompeu Fabra, Barcelona, Spain
View Profile

,
Sara Hajian

NTENT, Barcelona, Spain

NTENT, Barcelona, Spain
View Profile

,
Mohamed Megahed

TU Berlin, Berlin, Germany

TU Berlin, Berlin, Germany
View Profile

,
Ricardo Baeza-Yates

Universitat Pompeu Fabra, Barcelona, Spain

Universitat Pompeu Fabra, Barcelona, Spain
View Profile

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge ManagementNovember 2017Pages 1569–1578https://doi.org/10.1145/3132847.3132938

Published:06 November 2017Publication History

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management

Pages 1569–1578

ABSTRACT

In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n » k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria.

Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-k should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above.

An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.

References

Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. Machine Bias. ProPublica (May. 2016).Google Scholar
Solon Barocas and Andrew D Selbst. 2014. Big data's disparate impact. SSRN 2477899 (2014).Google Scholar
Francesco Bonchi, Sara Hajian, Bud Mishra, and Daniele Ramazzotti. 2017. Exposing the probabilistic causal structure of discrimination. International Journal of Data Science and Analytics, Vol. 3, 1 (2017), 1--21.Google ScholarCross Ref
Toon Calders and Sicco Verwer. 2010. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery Vol. 21, 2 (2010), 277--292. Google ScholarDigital Library
Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries Proc. of SIGIR. ACM Press, 335--336. Google ScholarDigital Library
L Elisa Celis, Amit Deshpande, Tarun Kathuria, and Nisheeth K Vishnoi. 2016. How to be Fair and Diverse? arXiv:1610.07183 (2016).Google Scholar
L Elisa Celis, Damian Straszak, and Nisheeth K Vishnoi. 2017. Ranking with Fairness Constraints. arXiv:1704.06840 (2017).Google Scholar
Sushma Channamsetty and Michael D Ekstrand. 2017. Recommender Response to Diversity and Popularity Bias in User Profiles (short paper) 30th International Florida Artificial Intelligence Research Society Conference.Google Scholar
Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. 2017. Algorithmic decision making and the cost of fairness. arXiv:1701.08230 (2017). Google ScholarDigital Library
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. Fairness through awareness. In Proc. of ITCS. ACM Press, 214--226. Google ScholarDigital Library
Evelyn Ellis and Philippa Watson. 2012. EU anti-discrimination law. Oxford University Press.Google Scholar
Michael Feldman, Sorelle A Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. 2015. Certifying and removing disparate impact. In Proc. of KDD. ACM Press, 259--268. Google ScholarDigital Library
Sorelle A Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. 2016. On the (im) possibility of fairness. arXiv:1609.07236 (2016).Google Scholar
Batya Friedman and Helen Nissenbaum. 1996. Bias in computer systems. ACM Transactions on Information Systems Vol. 14, 3 (1996), 330--347. Google ScholarDigital Library
Sara Hajian, Francesco Bonchi, and Carlos Castillo. 2016. Algorithmic Bias: From Discrimination Discovery to Fairness-aware Data Mining KDD Tutorials. Google ScholarDigital Library
Sara Hajian and Josep Domingo-Ferrer. 2013. A methodology for direct and indirect discrimination prevention in data mining. IEEE TKDE, Vol. 25, 7 (2013). Google ScholarDigital Library
Sara Hajian, Josep Domingo-Ferrer, and Oriol Farràs. 2014. Generalization-based privacy preservation and discrimination prevention in data publishing and mining. Data Mining and Knowledge Discovery Vol. 28, 5--6 (2014), 1158--1188. Google ScholarDigital Library
Moritz Hardt, Eric Price, and Nati Srebro. 2016. Equality of opportunity in supervised learning. In Proc. of NIPS. Curran Associates, Inc., 3315--3323. Google ScholarDigital Library
Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, and Aaron Roth. 2016. Fair Learning in Markovian Environments. arXiv:1611.03071 (2016).Google Scholar
Kalervo J"arvelin and Jaana Kek"al"ainen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems Vol. 20, 4 (2002), 422--446. Google ScholarDigital Library
Faisal Kamiran, Toon Calders, and Mykola Pechenizkiy. 2010. Discrimination aware decision tree learning. In Proc. of ICDM. IEEE CS, 869--874. Google ScholarDigital Library
Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. 2012. Fairness-aware classifier with prejudice remover regularizer. Machine Learning and Knowledge Discovery in Databases. Springer, 35--50.Google Scholar
Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. 2016. Inherent trade-offs in the fair determination of risk scores. arXiv:1609.05807 (2016).Google Scholar
Juhi Kulshrestha, Muhammad B. Zafar, Motahhare Eslami, Saptarshi Ghosh, Johnnatan Messias, and Krishna P. Gummadi. 2017. Quantifying search bias: Investigating sources of bias for political searches in social media Proc. of CSCW. Google ScholarDigital Library
Matevvz Kunaver and Tomavz Povzrl. 2017. Diversity in recommender systems--A survey. Knowledge-Based Systems Vol. 123 (2017), 154--162. Google ScholarDigital Library
N=at=an Lerner. 2003. Group rights and discrimination in international law. Vol. Vol. 77. Martinus Nijhoff Publishers.Google Scholar
M. Lichman. 2013. UCI Machine Learning Repository. (2013).Google Scholar
Cathy O'Neil. 2016. Weapons of math destruction: How big data increases inequality and threatens democracy. Crown Publishing Group. Google Scholar
Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. 2009 a. Integrating induction and deduction for finding evidence of discrimination Proc. of AI and Law. ACM Press, 157--166. Google ScholarDigital Library
Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. 2009 b. Measuring Discrimination in Socially-Sensitive Decision Records Proc. of SDM. SIAM, 581--592.Google Scholar
Dino Pedreshi, Salvatore Ruggieri, and Franco Turini. 2008. Discrimination-aware data mining. In Proc. of KDD. ACM Press, 560--568. Google ScholarDigital Library
Tetsuya Sakai and Ruihua Song. 2011. Evaluating diversified search results using per-intent graded relevance Proc. of SIGIR. ACM Press, 1043--1052. Google ScholarDigital Library
Thomas Sowell. 2005. Affirmative action around the world: An empirical analysis. Yale University Press.Google Scholar
The College Board. 2014. SAT Percentile Ranks. (2014).Google Scholar
Tània Verge. 2010. Gendering representation in Spain: Opportunities and limits of gender quotas. J. of Women, Politics & Policy Vol. 31, 2 (2010), 166--190.Google ScholarCross Ref
Ke Yang and Julia Stoyanovich. 2016. Measuring Fairness in Ranked Outputs. In Proc. of FATML.Google Scholar
Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi. 2015. Fairness constraints: A mechanism for fair classification. arXiv:1507.05259 (2015).Google Scholar
Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. 2013. Learning fair representations. In Proc. of ICML. 325--333. Google ScholarDigital Library
Indre vZliobaite, Faisal Kamiran, and Toon Calders. 2011. Handling conditional discrimination. In Proc. of ICDM. IEEE CS, 992--1001. Google ScholarDigital Library

Index Terms

FA*IR: A Fair Top-k Ranking Algorithm
1. Applied computing
  1. Law, social and behavioral sciences
    1. Law
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Top-k retrieval in databases

Recommendations

Equity of Attention: Amortizing Individual Fairness in Rankings
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

Rankings of people and items are at the heart of selection-making, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, ...
Read More
Finding k-dominant skylines in high dimensional space
SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data

Given a d-dimensional data set, a point p dominates another point q if it is better than or equal to q in all dimensions and better than q in at least one dimension. A point is a skyline point if there does not exists any point that can dominate it. ...
Read More
Re-ranking search results using query logs
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management

This work addresses two common problems in search, frequently occurring with underspecified user queries: the top-ranked results for such queries may not contain documents relevant to the user's search intent, and fresh and relevant pages may not get ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN:9781450349185
DOI:10.1145/3132847
General Chairs:
Ee-Peng Lim
Singapore Management University, Singapore
,
Marianne Winslett
University of Illinois at Urbana-Champaign, USA, and Advanced Digital Sciences Center, Singapore
,
Program Chairs:
Mark Sanderson
RMIT, Australia
,
Ada Fu
Chinese University of Hong Kong, Hong Kong
,
Jimeng Sun
Georgia Tech, USA
,
Shane Culpepper
RMIT, Australia
,
Eric Lo
Chinese University of Hong Kong, Hong Kong
,
Joyce Ho
Emory University, USA
,
Debora Donato
Mix Tech, Inc., USA
,
Rakesh Agrawal
Data Insights Laboratories, USA
,
Yu Zheng
Microsoft Research Asia, China
,
Carlos Castillo
Qatar Computing Research Institute, Qatar
,
Aixin Sun
Nanyang Technological University, Singapore
,
Vincent S. Tseng
National Cheng Kung University, Taiwan
,
Chenliang Li
Wuhan University, China
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 6 November 2017
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
algorithmic fairness
bias in computer systems
ranking
top-k selection
Qualifiers
- research-article
Conference

Acceptance Rates
CIKM '17 Paper Acceptance Rate171of855submissions,20%Overall Acceptance Rate1,861of8,427submissions,22%
More
Upcoming Conference
CIKM '24

Sponsor:

sigir

sigir

The 33rd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2024

Boise , ID , USA
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 241
  Total Citations
  View Citations
- 1,956
  Total Downloads
- Downloads (Last 12 months)292
- Downloads (Last 6 weeks)44
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

FA*IR: A Fair Top-k Ranking Algorithm

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management

ABSTRACT

References

Cited By

Index Terms

Recommendations

Equity of Attention: Amortizing Individual Fairness in Rankings

Finding k-dominant skylines in high dimensional space

Re-ranking search results using query logs