ABSTRACT
Discriminative models have been preferred over generative models in many machine learning problems in the recent past owing to some of their attractive theoretical properties. In this paper, we explore the applicability of discriminative classifiers for IR. We compare the performance of two popular discriminative models, namely the maximum entropy model and support vector machines, with that of language modeling, the state-of-the-art generative model for IR. Our experiments on ad-hoc retrieval indicate that, although maximum entropy is significantly worse than language models, support vector machines are on par with language models. We argue that the main reason to prefer SVMs over language models is their ability to learn arbitrary features automatically, as demonstrated by our experiments on the home-page finding task of TREC-10.
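To make the contrast concrete, the sketch below frames retrieval as binary classification with a linear SVM over query-document features and, for comparison, scores documents with a Dirichlet-smoothed query-likelihood language model. This is a minimal illustration under assumed choices, not the paper's actual system: the feature set (term-frequency, idf, URL depth), the toy training data, and the use of scikit-learn's LinearSVC are all hypothetical stand-ins.

```python
# Illustrative sketch (not the paper's implementation): ranking as binary
# classification with an SVM over hand-crafted query-document features,
# contrasted with a generative query-likelihood language-model score.
# Feature choices and toy data below are hypothetical.
import math
from sklearn.svm import LinearSVC

# Toy query-document pairs: each row is a feature vector such as
# [sum of log term frequencies, sum of idf of matched terms, URL depth].
X_train = [
    [2.3, 4.1, 1.0],   # relevant home page
    [1.9, 3.8, 1.0],   # relevant home page
    [0.4, 1.2, 4.0],   # non-relevant deep page
    [0.7, 0.9, 5.0],   # non-relevant deep page
]
y_train = [1, 1, 0, 0]  # 1 = relevant, 0 = non-relevant

# Discriminative ranker: the SVM learns a weight for each feature directly,
# so arbitrary evidence (URL depth, anchor text, link counts) can be added
# without redesigning the model.
svm = LinearSVC(C=1.0)
svm.fit(X_train, y_train)

def svm_score(features):
    """Rank documents by signed distance from the separating hyperplane."""
    return float(svm.decision_function([features])[0])

def lm_score(query_terms, doc_terms, collection_terms, mu=2000.0):
    """Generative baseline: query likelihood with Dirichlet smoothing,
    log P(q|d) = sum_w log((tf(w,d) + mu * P(w|C)) / (|d| + mu))."""
    doc_len = len(doc_terms)
    coll_len = len(collection_terms)
    score = 0.0
    for w in query_terms:
        tf = doc_terms.count(w)
        p_coll = (collection_terms.count(w) + 1) / (coll_len + 1)  # crude background model
        score += math.log((tf + mu * p_coll) / (doc_len + mu))
    return score

print(svm_score([2.0, 3.5, 1.0]))  # higher score = ranked as more likely relevant
print(lm_score(["retrieval"], ["retrieval", "models", "retrieval"],
               ["retrieval", "svm", "models", "language"]))
```

Documents would be sorted by either score to produce a ranking; the point of the contrast is that adding a new source of evidence to the SVM only means appending a feature column, whereas the language model requires that evidence be expressed within its generative formulation (e.g., as a prior).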