ABSTRACT
Probabilistic document retrieval systems consistent with the two Poisson independence model outperform those based on the binary independence model if terms are distributed as the model's assumptions describe. The Two Poisson Effectiveness Hypothesis suggests that retrieval models based upon the two Poisson model will outperform binary independence models when used on a “real-world” database, where independence and two Poisson term occurrence distributions fail to hold, because the information gained by incorporating term frequencies will more than compensate for the non-Poisson distributions of terms. Searches of the MED1033 database suggest that when terms are not independent and term occurrence frequencies are not distributed in a two Poisson manner, the binary independence sequential retrieval model outperforms the two Poisson independence retrieval model.
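To make the contrast between the two models concrete, here is a minimal sketch, not the authors' implementation. It assumes the usual textbook form of the two Poisson model, in which a term's within-document frequency follows a mixture of two Poisson distributions, while the binary independence model sees only whether the term is present. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k: int, weight: float, lam_a: float, lam_b: float) -> float:
    """Mixture of two Poisson distributions over term frequency k.

    weight is the assumed share of documents drawn from the first
    component; lam_a and lam_b are the two assumed mean frequencies.
    """
    return weight * poisson_pmf(k, lam_a) + (1.0 - weight) * poisson_pmf(k, lam_b)


def binary_view(k: int) -> int:
    """What a binary model keeps: presence (1) or absence (0) of the term."""
    return 1 if k > 0 else 0


# Illustrative (made-up) parameters: 20% of documents use the term heavily.
WEIGHT, LAM_A, LAM_B = 0.2, 3.0, 0.5

for k in range(5):
    p = two_poisson_pmf(k, WEIGHT, LAM_A, LAM_B)
    print(f"tf={k}  two-Poisson P(tf)={p:.4f}  binary feature={binary_view(k)}")
```

The binary feature is identical for every nonzero frequency, which is the term frequency information the hypothesis expects the two Poisson model to exploit.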
- Probabilistic models for document retrieval: a comparison of performance on experimental and synthetic data bases