skip to main content
article
Free Access

An efficient boosting algorithm for combining preferences

Authors Info & Claims
Published:01 December 2003Publication History
Skip Abstract Section

Abstract

We study the problem of learning to accurately rank a set of objects by combining a given collection of ranking or preference functions. This problem of combining preferences arises in several applications, such as that of combining the results of different search engines, or the "collaborative-filtering" problem of ranking movies for a user based on the movie rankings provided by other users. In this work, we begin by presenting a formal framework for this general problem. We then describe and analyze an efficient algorithm called RankBoost for combining preferences based on the boosting approach to machine learning. We give theoretical results describing the algorithm's behavior both on the training data, and on new test data not seen during training. We also describe an efficient implementation of the algorithm for a particular restricted but common case. We next discuss two experiments we carried out to assess the performance of RankBoost. In the first experiment, we used the algorithm to combine different web search strategies, each of which is a query expansion for a given domain. The second experiment is a collaborative-filtering task for making movie recommendations.

References

  1. Brian T. Bartell, Garrison W. Cottrell, and Richard K. Belew. Automatic combination of multiple ranked retrieval systems. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1994. Google ScholarGoogle Scholar
  2. Peter L. Bartlett. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2): 525-536, March 1998. Google ScholarGoogle Scholar
  3. John S. Breese, David Heckerman, and Carl Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 43-52, 1998. Google ScholarGoogle Scholar
  4. Rich Caruana, Shumeet Baluja, and Tom Mitchell. Using the future to "sort out" the present: Rankprop and multitask learning for medical risk evaluation. In Advances in Neural Information Processing Systems 8, pages 959-965, 1996.Google ScholarGoogle Scholar
  5. William W. Cohen, Robert E. Schapire, and Yoram Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243-270, 1999. Google ScholarGoogle Scholar
  6. Michael Collins. Discriminative reranking for natural language parsing. In Proceedings of the Seventeenth International Conference on Machine Learning, 2000. Google ScholarGoogle Scholar
  7. K. Crammer and Y. Singer. A new family of online algorithms for category ranking. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002. Google ScholarGoogle Scholar
  8. Koby Crammer and Yoram Singer. Pranking with ranking. In Advances in Neural Information Processing Systems 14, 2001.Google ScholarGoogle Scholar
  9. Luc Devroye, Lázló Györfi, and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1996.Google ScholarGoogle Scholar
  10. O. Etzioni, S. Hanks, T. Jiang, R. M. Karp, O. Madani, and O. Waarts. Efficient information gathering on the internet. In 37th Annual Symposium on Foundations of Computer Science, 1996. Google ScholarGoogle Scholar
  11. Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1): 119-139, August 1997. Google ScholarGoogle Scholar
  12. David Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1): 78-150, 1992. Google ScholarGoogle Scholar
  13. David Haussler, Michael Kearns, Nick Littlestone, and Manfred K. Warmuth. Equivalence of models for polynomial learnability. Information and Computation, 95(2): 129-161, December 1991. Google ScholarGoogle Scholar
  14. Will Hill, Larry Stead, Mark Rosenstein, and George Furnas. Recommending and evaluating choices in a virtual community of use. In Human Factors in Computing Systems CHI'95 Conference Proceedings, pages 194-201, 1995. Google ScholarGoogle Scholar
  15. Raj D. Iyer, David D. Lewis, Robert E. Schapire, Yoram Singer, and Amit Singhal. Boosting for document routing. In Proceedings of the Ninth International Conference on Information and Knowledge Management, 2000. Google ScholarGoogle Scholar
  16. Guy Lebanon and John Lafferty. Cranking: Combining rankings using conditional probability models on permutations. In Proceedings of the Nineteenth International Conference on Machine Learning, 2002. Google ScholarGoogle Scholar
  17. Paul Resnick, Neophytos Iacovou, Mitesh Sushak, Peter Bergstrom, and John Riedl. Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of Computer Supported Cooperative Work, 1995. Google ScholarGoogle Scholar
  18. Gerard Salton. Automatic text processing: the transformation, analysis and retrieval of information by computer. Addison-Wesley, 1989. Google ScholarGoogle Scholar
  19. Gerard Salton and Michael J. McGill. Introduction to Modern Information Retrieval. McGraw-Hill, 1983. Google ScholarGoogle Scholar
  20. Robert E. Schapire, Yoav Freund, Peter Bartlett, and Wee Sun Lee. Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5): 1651-1686, October 1998.Google ScholarGoogle Scholar
  21. Robert E. Schapire and Yoram Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3): 297-336, December 1999. Google ScholarGoogle Scholar
  22. Upendra Shardanand and Pattie Maes. Social information filtering: Algorithms for automating "word of mouth". In Human Factors in Computing Systems CHI'95 Conference Proceedings, 1995. Google ScholarGoogle Scholar
  23. V. N. Vapnik. Estimation of Dependences Based on Empirical Data. Springer-Verlag, 1982. Google ScholarGoogle Scholar
  24. Marilyn A. Walker, Owen Rambow, and Monica Rogati. SPoT: A trainable sentence planner. In Proceedings of the 2nd Annual Meeting of the North American Chapter of the Associataion for Computational Linguistics, 2001. Google ScholarGoogle Scholar

Index Terms

  1. An efficient boosting algorithm for combining preferences

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image The Journal of Machine Learning Research
        The Journal of Machine Learning Research  Volume 4, Issue
        12/1/2003
        1486 pages
        ISSN:1532-4435
        EISSN:1533-7928
        Issue’s Table of Contents

        Publisher

        JMLR.org

        Publication History

        • Published: 1 December 2003
        Published in jmlr Volume 4, Issue

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader