Calibrated Recommendations

Research article · Open Access · DOI: 10.1145/3240323.3240372

Published: 27 September 2018

ABSTRACT

When a user has watched, say, 70 romance movies and 30 action movies, it is reasonable to expect the personalized list of recommended movies to be composed of about 70% romance and 30% action movies as well. This important property is known as calibration, and it recently received renewed attention in the context of fairness in machine learning. In the recommended list of items, calibration ensures that the various (past) areas of interest of a user are reflected with their corresponding proportions. Calibration is especially important in light of the fact that recommender systems optimized toward accuracy (e.g., ranking metrics) in the usual offline setting can easily produce recommendations where the lesser interests of a user get crowded out by the user's main interests, which we show empirically as well as in thought experiments. This can be prevented by calibrated recommendations. To this end, we outline metrics for quantifying the degree of calibration, as well as a simple yet effective re-ranking algorithm for post-processing the output of recommender systems.
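The calibration idea in the abstract can be made concrete with a small sketch: measure miscalibration as a divergence between the genre distribution of a user's history, p, and that of the recommended list, q. The snippet below uses a smoothed KL divergence for this purpose; the movie IDs, genre labels, and function names are illustrative assumptions, not code from the paper.

```python
import math
from collections import Counter

def genre_distribution(items, genres_of):
    """Distribution over genres across a list of items (uniform item weights)."""
    counts = Counter(g for item in items for g in genres_of[item])
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def calibration_kl(p_history, q_recs, alpha=0.01):
    """KL(p || q~), with q~ = (1 - alpha)*q + alpha*p, so the divergence
    stays finite when a genre is missing from the recommended list."""
    kl = 0.0
    for g, p in p_history.items():
        q = (1 - alpha) * q_recs.get(g, 0.0) + alpha * p
        kl += p * math.log(p / q)
    return kl

# Hypothetical movies and genre labels for illustration.
genres = {"m1": ["romance"], "m2": ["romance"], "m3": ["action"]}
history = ["m1"] * 70 + ["m3"] * 30        # 70% romance, 30% action
recs_skewed = ["m2"] * 10                  # all romance: action is crowded out
recs_balanced = ["m2"] * 7 + ["m3"] * 3    # mirrors the history proportions

p = genre_distribution(history, genres)
print(calibration_kl(p, genre_distribution(recs_skewed, genres)))    # clearly > 0
print(calibration_kl(p, genre_distribution(recs_balanced, genres)))  # ~ 0 (calibrated)
```

A re-ranker in the spirit of the abstract could then greedily build the final list, at each step adding the candidate that maximizes accuracy score minus a weight times this divergence; the exact objective and weighting are given in the paper itself.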


Supplemental Material

p154-steck.mp4


Index Terms

  1. Calibrated recommendations

Published in

RecSys '18: Proceedings of the 12th ACM Conference on Recommender Systems
September 2018, 600 pages
ISBN: 9781450359016
DOI: 10.1145/3240323

Copyright © 2018 Owner/Author

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

RecSys '18 paper acceptance rate: 32 of 181 submissions, 18%. Overall acceptance rate: 254 of 1,295 submissions, 20%.
