DOI: 10.1145/1183579.1183585
Article

Size doesn't always matter: exploiting PageRank for query routing in distributed IR

Published: 11 November 2006

ABSTRACT

PageRank authority scores have proven to be a powerful ingredient of local document scoring. Query routing, i.e., carefully selecting a small subset of promising peers for a particular query from a large network, closely resembles local document scoring, so it is natural to expect that authority scores could also benefit query routing, which is one of the biggest challenges in P2P Web search. For that purpose, we showcase JXP, an authority score measure that converges to true global PageRank scores in a distributed environment. Subsequently, we present several possible strategies for incorporating authority scores into query routing, including a hybrid strategy that combines authority scores with other existing measures. Preliminary experimental results support our hypothesis that authority scores can be highly beneficial to query routing.
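To make the idea of a hybrid routing strategy concrete, here is a minimal sketch. It is not the method from the paper: the abstract does not say how the scores are combined, so the linear mix, the max-normalization, and every name in the code (`PeerStats`, `route_query`, `alpha`) are assumptions chosen for illustration. It assumes that each peer already has a query-dependent content score from some existing routing measure and an authority score, for example the summed authority of its matching pages.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    """Hypothetical per-peer summary; the field names are assumptions."""
    peer_id: str
    content_score: float    # score from an existing, content-based routing measure
    authority_score: float  # e.g. summed authority of the peer's matching pages


def _normalize(values):
    """Scale values to [0, 1] by dividing by the maximum (all zeros stay zero)."""
    top = max(values, default=0.0)
    return [v / top if top > 0 else 0.0 for v in values]


def route_query(peers, k, alpha=0.5):
    """Return the ids of the k peers with the highest hybrid score.

    alpha weights the authority component: alpha=0 ranks by content only,
    alpha=1 by authority only. The linear mix is an illustrative assumption.
    """
    content = _normalize([p.content_score for p in peers])
    authority = _normalize([p.authority_score for p in peers])
    scored = [
        (alpha * a + (1 - alpha) * c, p.peer_id)
        for p, c, a in zip(peers, content, authority)
    ]
    # Highest score first; ties are broken by peer id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [peer_id for _, peer_id in scored[:k]]
```

Calling `route_query(peers, k=2, alpha=0.5)` on a list of `PeerStats` returns the two peer ids to forward the query to. Varying `alpha` moves the ranking between purely content-based and purely authority-based selection.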


Published in

P2PIR '06: Proceedings of the international workshop on Information retrieval in peer-to-peer networks
November 2006
66 pages
ISBN: 1595935274
DOI: 10.1145/1183579

Copyright © 2006 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
