Abstract
Although the interest of a Web page depends closely on its content and on each reader's cultural background, a measure of page authority can be defined that depends only on the topological structure of the Web. PageRank is a notable way to attach a score to Web pages on the basis of Web connectivity. In this article, we look inside PageRank to disclose its fundamental properties concerning stability, the complexity of the computational scheme, and the critical role of the parameters involved in the computation. Moreover, we introduce a circuit analysis that allows us to understand the distribution of the page score, the way different Web communities interact with each other, the role of dangling pages (pages with no outlinks), and the secrets for promotion of Web pages.
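To make the idea of a purely link-based score concrete, here is a minimal sketch of the common random-surfer formulation of PageRank, computed by power iteration. It is not necessarily the exact scheme analyzed in the article: the function name `pagerank`, the damping value 0.85, the uniform redistribution of score from dangling pages, and the toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank (illustrative sketch, assumed formulation).

    links maps each page to the list of pages it links to.
    Returns a dict of scores that sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Assumption: score held by dangling pages (no outlinks)
        # is spread uniformly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its damped score to its targets in equal shares.
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)

        # Stop when the L1 change between iterations is below the tolerance.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy Web: "d" is a dangling page with no inlinks.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
scores = pagerank(example_links)
```

In this toy graph the score depends only on connectivity: page "c", which is linked from both "a" and "b", ends up above "b", while the isolated dangling page "d" receives only the baseline share.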