ABSTRACT
PageRank authority scores have proven to be a powerful ingredient in local document scoring. Query routing, i.e., carefully selecting a small subset of promising peers for a particular query from a large network, closely resembles local document scoring, which suggests that authority scores could also benefit query routing, one of the biggest challenges in P2P Web search. To this end, we present JXP, an authority score measure that converges to the true global PageRank scores in a distributed environment. We then present several strategies for incorporating authority scores into query routing, including a hybrid strategy that combines authority scores with other existing measures. Preliminary experimental results support our hypothesis that authority scores can be highly beneficial to query routing.
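The abstract does not say how the hybrid strategy combines its ingredients. As a purely illustrative sketch, assume each peer has one aggregated authority score and one query-specific content score, and that the two are max-normalized and mixed linearly with a weight `alpha`. The function names, the normalization, the linear mix, and the sample numbers are all assumptions made for this example, not the method evaluated in the paper.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1] by dividing by the maximum (all zeros stay zero)."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {peer: 0.0 for peer in scores}
    return {peer: value / top for peer, value in scores.items()}


def route_query(
    authority: Dict[str, float],
    content: Dict[str, float],
    alpha: float = 0.5,
    k: int = 2,
) -> List[str]:
    """Return the k peers with the highest combined score.

    authority: hypothetical per-peer authority mass (e.g. summed page scores)
    content:   hypothetical per-peer, per-query content-based quality score
    alpha:     assumed weight of the authority score in the linear mix
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    auth_n = normalize(authority)
    cont_n = normalize(content)
    peers = set(auth_n) | set(cont_n)
    combined = {
        p: alpha * auth_n.get(p, 0.0) + (1 - alpha) * cont_n.get(p, 0.0)
        for p in peers
    }
    # Sort by score descending, then by peer name for deterministic ties.
    return sorted(combined, key=lambda p: (-combined[p], p))[:k]


# Made-up scores for three peers, for illustration only.
authority = {"A": 0.6, "B": 0.3, "C": 0.1}
content = {"A": 1.0, "B": 4.0, "C": 2.0}
selected = route_query(authority, content, alpha=0.5, k=2)
```

Setting `alpha` to 1 routes on authority alone and setting it to 0 routes on content alone, so in this sketch the same function covers both pure strategies and the mixed one.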
Index Terms
- Size doesn't always matter: exploiting PageRank for query routing in distributed IR