ABSTRACT
In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n » k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria.
Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-k should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above.
An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.
- Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. Machine Bias. ProPublica (May. 2016).Google Scholar
- Solon Barocas and Andrew D Selbst. 2014. Big data's disparate impact. SSRN 2477899 (2014).Google Scholar
- Francesco Bonchi, Sara Hajian, Bud Mishra, and Daniele Ramazzotti. 2017. Exposing the probabilistic causal structure of discrimination. International Journal of Data Science and Analytics, Vol. 3, 1 (2017), 1--21.Google ScholarCross Ref
- Toon Calders and Sicco Verwer. 2010. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery Vol. 21, 2 (2010), 277--292. Google ScholarDigital Library
- Jaime Carbonell and Jade Goldstein. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries Proc. of SIGIR. ACM Press, 335--336. Google ScholarDigital Library
- L Elisa Celis, Amit Deshpande, Tarun Kathuria, and Nisheeth K Vishnoi. 2016. How to be Fair and Diverse? arXiv:1610.07183 (2016).Google Scholar
- L Elisa Celis, Damian Straszak, and Nisheeth K Vishnoi. 2017. Ranking with Fairness Constraints. arXiv:1704.06840 (2017).Google Scholar
- Sushma Channamsetty and Michael D Ekstrand. 2017. Recommender Response to Diversity and Popularity Bias in User Profiles (short paper) 30th International Florida Artificial Intelligence Research Society Conference.Google Scholar
- Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. 2017. Algorithmic decision making and the cost of fairness. arXiv:1701.08230 (2017). Google ScholarDigital Library
- Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. 2012. Fairness through awareness. In Proc. of ITCS. ACM Press, 214--226. Google ScholarDigital Library
- Evelyn Ellis and Philippa Watson. 2012. EU anti-discrimination law. Oxford University Press.Google Scholar
- Michael Feldman, Sorelle A Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. 2015. Certifying and removing disparate impact. In Proc. of KDD. ACM Press, 259--268. Google ScholarDigital Library
- Sorelle A Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. 2016. On the (im) possibility of fairness. arXiv:1609.07236 (2016).Google Scholar
- Batya Friedman and Helen Nissenbaum. 1996. Bias in computer systems. ACM Transactions on Information Systems Vol. 14, 3 (1996), 330--347. Google ScholarDigital Library
- Sara Hajian, Francesco Bonchi, and Carlos Castillo. 2016. Algorithmic Bias: From Discrimination Discovery to Fairness-aware Data Mining KDD Tutorials. Google ScholarDigital Library
- Sara Hajian and Josep Domingo-Ferrer. 2013. A methodology for direct and indirect discrimination prevention in data mining. IEEE TKDE, Vol. 25, 7 (2013). Google ScholarDigital Library
- Sara Hajian, Josep Domingo-Ferrer, and Oriol Farràs. 2014. Generalization-based privacy preservation and discrimination prevention in data publishing and mining. Data Mining and Knowledge Discovery Vol. 28, 5--6 (2014), 1158--1188. Google ScholarDigital Library
- Moritz Hardt, Eric Price, and Nati Srebro. 2016. Equality of opportunity in supervised learning. In Proc. of NIPS. Curran Associates, Inc., 3315--3323. Google ScholarDigital Library
- Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, and Aaron Roth. 2016. Fair Learning in Markovian Environments. arXiv:1611.03071 (2016).Google Scholar
- Kalervo J"arvelin and Jaana Kek"al"ainen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems Vol. 20, 4 (2002), 422--446. Google ScholarDigital Library
- Faisal Kamiran, Toon Calders, and Mykola Pechenizkiy. 2010. Discrimination aware decision tree learning. In Proc. of ICDM. IEEE CS, 869--874. Google ScholarDigital Library
- Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma. 2012. Fairness-aware classifier with prejudice remover regularizer. Machine Learning and Knowledge Discovery in Databases. Springer, 35--50.Google Scholar
- Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. 2016. Inherent trade-offs in the fair determination of risk scores. arXiv:1609.05807 (2016).Google Scholar
- Juhi Kulshrestha, Muhammad B. Zafar, Motahhare Eslami, Saptarshi Ghosh, Johnnatan Messias, and Krishna P. Gummadi. 2017. Quantifying search bias: Investigating sources of bias for political searches in social media Proc. of CSCW. Google ScholarDigital Library
- Matevvz Kunaver and Tomavz Povzrl. 2017. Diversity in recommender systems--A survey. Knowledge-Based Systems Vol. 123 (2017), 154--162. Google ScholarDigital Library
- N=at=an Lerner. 2003. Group rights and discrimination in international law. Vol. Vol. 77. Martinus Nijhoff Publishers.Google Scholar
- M. Lichman. 2013. UCI Machine Learning Repository. (2013).Google Scholar
- Cathy O'Neil. 2016. Weapons of math destruction: How big data increases inequality and threatens democracy. Crown Publishing Group. Google Scholar
- Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. 2009 a. Integrating induction and deduction for finding evidence of discrimination Proc. of AI and Law. ACM Press, 157--166. Google ScholarDigital Library
- Dino Pedreschi, Salvatore Ruggieri, and Franco Turini. 2009 b. Measuring Discrimination in Socially-Sensitive Decision Records Proc. of SDM. SIAM, 581--592.Google Scholar
- Dino Pedreshi, Salvatore Ruggieri, and Franco Turini. 2008. Discrimination-aware data mining. In Proc. of KDD. ACM Press, 560--568. Google ScholarDigital Library
- Tetsuya Sakai and Ruihua Song. 2011. Evaluating diversified search results using per-intent graded relevance Proc. of SIGIR. ACM Press, 1043--1052. Google ScholarDigital Library
- Thomas Sowell. 2005. Affirmative action around the world: An empirical analysis. Yale University Press.Google Scholar
- The College Board. 2014. SAT Percentile Ranks. (2014).Google Scholar
- Tània Verge. 2010. Gendering representation in Spain: Opportunities and limits of gender quotas. J. of Women, Politics & Policy Vol. 31, 2 (2010), 166--190.Google ScholarCross Ref
- Ke Yang and Julia Stoyanovich. 2016. Measuring Fairness in Ranked Outputs. In Proc. of FATML.Google Scholar
- Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P Gummadi. 2015. Fairness constraints: A mechanism for fair classification. arXiv:1507.05259 (2015).Google Scholar
- Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. 2013. Learning fair representations. In Proc. of ICML. 325--333. Google ScholarDigital Library
- Indre vZliobaite, Faisal Kamiran, and Toon Calders. 2011. Handling conditional discrimination. In Proc. of ICDM. IEEE CS, 992--1001. Google ScholarDigital Library
Index Terms
- FA*IR: A Fair Top-k Ranking Algorithm
Recommendations
Equity of Attention: Amortizing Individual Fairness in Rankings
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information RetrievalRankings of people and items are at the heart of selection-making, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, ...
Finding k-dominant skylines in high dimensional space
SIGMOD '06: Proceedings of the 2006 ACM SIGMOD international conference on Management of dataGiven a d-dimensional data set, a point p dominates another point q if it is better than or equal to q in all dimensions and better than q in at least one dimension. A point is a skyline point if there does not exists any point that can dominate it. ...
Re-ranking search results using query logs
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge managementThis work addresses two common problems in search, frequently occurring with underspecified user queries: the top-ranked results for such queries may not contain documents relevant to the user's search intent, and fresh and relevant pages may not get ...
Comments