DOI: 10.1145/1008992.1009006
Article

Discriminative models for information retrieval

Published: 25 July 2004

ABSTRACT

Discriminative models have been preferred over generative models for many machine learning problems in recent years owing to some of their attractive theoretical properties. In this paper, we explore the applicability of discriminative classifiers to IR. We compare the performance of two popular discriminative models, the maximum entropy model and support vector machines, with that of language modeling, the state-of-the-art generative model for IR. Our experiments on ad-hoc retrieval indicate that although maximum entropy performs significantly worse than language models, support vector machines are on par with them. We argue that the main reason to prefer SVMs over language models is their ability to learn arbitrary features automatically, as demonstrated by our experiments on the home-page finding task of TREC-10.
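The generative baseline the abstract refers to, the language modeling approach, ranks documents by the likelihood of the query under a smoothed document language model. As a rough illustration only (not the paper's implementation), the following minimal sketch scores documents by query likelihood with Jelinek-Mercer smoothing; the toy documents, the query, and the smoothing weight `lam=0.5` are all illustrative assumptions.

```python
import math
from collections import Counter

def lm_score(query, doc, collection, lam=0.5):
    """Query-likelihood score with Jelinek-Mercer smoothing:
    log P(q|d) = sum over query terms t of
    log((1 - lam) * P(t|d) + lam * P(t|C)),
    where P(t|d) is the document's maximum-likelihood term probability
    and P(t|C) is the collection's (the smoothing background)."""
    d_counts = Counter(doc)
    c_counts = Counter(collection)
    d_len, c_len = len(doc), len(collection)
    score = 0.0
    for t in query:
        p_d = d_counts[t] / d_len if d_len else 0.0
        p_c = c_counts[t] / c_len
        score += math.log((1 - lam) * p_d + lam * p_c)
    return score

# Toy collection: two short "documents" (hypothetical data).
docs = {
    "d1": "the language model ranks documents by query likelihood".split(),
    "d2": "support vector machines learn a discriminative decision boundary".split(),
}
collection = [t for d in docs.values() for t in d]
query = "language model".split()

# Rank documents by descending query likelihood.
ranking = sorted(docs, key=lambda d: lm_score(query, docs[d], collection),
                 reverse=True)
```

On this toy data, d1 outranks d2 because both query terms occur in d1, while d2 is scored only through the smoothed collection probabilities. A discriminative model would instead learn a decision boundary over arbitrary query-document features, which is the flexibility the paper argues for.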


Published in:
SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
July 2004, 624 pages
ISBN: 1581138814
DOI: 10.1145/1008992
Copyright © 2004 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
