ABSTRACT
When a user has watched, say, 70 romance movies and 30 action movies, it is reasonable to expect the personalized list of recommended movies to consist of about 70% romance and 30% action movies as well. This important property is known as calibration, and it has recently received renewed attention in the context of fairness in machine learning. In the recommended list of items, calibration ensures that the various (past) areas of interest of a user are reflected in their corresponding proportions. Calibration is especially important because recommender systems optimized toward accuracy (e.g., ranking metrics) in the usual offline setting can easily produce recommendations in which the lesser interests of a user are crowded out by the user's main interests, which we show empirically as well as in thought experiments. Calibrated recommendations prevent this. To this end, we outline metrics for quantifying the degree of calibration, as well as a simple yet effective re-ranking algorithm for post-processing the output of recommender systems.
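To make the 70/30 example concrete, the following is a minimal sketch of one way the degree of calibration could be quantified. It is an illustration under stated assumptions, not necessarily the metric the paper defines: it assumes each movie has exactly one genre, it compares the genre shares of the watch history with those of the recommended list using a Kullback-Leibler divergence, and the function names and the smoothing weight `alpha` are hypothetical choices made for this example.

```python
import math
from collections import Counter


def genre_distribution(genres):
    """Return the share of each genre in a list of genre labels.

    Assumption for this sketch: every item carries exactly one genre.
    """
    counts = Counter(genres)
    total = sum(counts.values())
    return {genre: count / total for genre, count in counts.items()}


def kl_miscalibration(p, q, alpha=0.01):
    """Illustrative miscalibration score: KL divergence of p from a smoothed q.

    p: genre shares in the user's history.
    q: genre shares in the recommended list.
    alpha: hypothetical smoothing weight that mixes a little of p into q,
           so the score stays finite when a genre is missing from q.
    """
    score = 0.0
    for genre, p_g in p.items():
        q_g = (1 - alpha) * q.get(genre, 0.0) + alpha * p_g
        score += p_g * math.log(p_g / q_g)
    return score


# Watch history from the abstract's example: 70 romance, 30 action.
p = genre_distribution(["romance"] * 70 + ["action"] * 30)

# A top-10 list that mirrors the history, and one where action is crowded out.
calibrated = genre_distribution(["romance"] * 7 + ["action"] * 3)
crowded_out = genre_distribution(["romance"] * 10)

print(kl_miscalibration(p, calibrated))
print(kl_miscalibration(p, crowded_out))
```

Under these assumptions, the list that mirrors the history scores essentially zero, while the all-romance list scores clearly higher because the user's lesser interest is absent from it. A re-ranking step could then trade off such a score against predicted relevance.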