Abstract
Consider a website containing a collection of webpages with data such as in Yahoo or the Open Directory project. Each page is associated with a weight representing the frequency with which that page is accessed by users. In the tree hierarchy representation, accessing each page requires the user to travel along the path leading to it from the root. By enhancing the index tree with additional edges (hotlinks) one may reduce the access cost of the system. In other words, the hotlinks reduce the expected number of steps needed to reach a leaf page from the tree root, assuming that the user knows which hotlinks to take. The hotlink enhancement problem involves finding a set of hotlinks minimizing this cost.
This article proposes the first exact algorithm for the hotlink enhancement problem. This algorithm runs in polynomial time for trees with logarithmic depth. Experiments conducted with real data show that significant improvement in the expected number of accesses per search can be achieved in websites using this algorithm. These experiments also suggest that the simple and much faster heuristic proposed previously by Czyzowicz et al. [2003] creates hotlinks that are nearly optimal in the time savings they provide to the user.
The version of the hotlink enhancement problem in which the weight distribution on the leaves is unknown is discussed as well. We present a polynomial-time algorithm that is optimal for any tree of any depth.
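The access cost described in the abstract can be made concrete with a small sketch. It assumes the cost of a leaf is the fewest clicks needed to reach it from the root when tree edges and hotlinks may both be followed, which is one reading of "the user knows which hotlinks to take". The function name `expected_access_cost`, the toy tree, and the weights are hypothetical illustrations, not data or code from the article, and the sketch only evaluates a given set of hotlinks; it is not the exact algorithm the article proposes.

```python
from collections import deque


def expected_access_cost(children, weights, root, hotlinks=()):
    """Weighted average number of steps from the root to a leaf page.

    children: dict mapping each internal page to a list of its child pages
    weights:  dict mapping each leaf page to its access frequency
    hotlinks: iterable of (source, target) shortcut edges (assumed model:
              the user always follows the shortest available route)
    """
    # Copy the tree edges, then add the hotlinks as extra directed edges.
    adjacency = {page: list(kids) for page, kids in children.items()}
    for source, target in hotlinks:
        adjacency.setdefault(source, []).append(target)

    # Breadth-first search yields the fewest clicks from the root to each page.
    steps = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for nxt in adjacency.get(page, []):
            if nxt not in steps:
                steps[nxt] = steps[page] + 1
                queue.append(nxt)

    # Normalize the weights so the result is an expectation.
    total = sum(weights.values())
    return sum(w * steps[leaf] for leaf, w in weights.items()) / total


# Hypothetical toy directory: the leaf "x" is popular but buried three levels deep.
children = {"root": ["a", "b"], "a": ["a1", "a2"], "a1": ["x", "y"], "b": ["z"]}
weights = {"x": 6, "y": 1, "a2": 1, "z": 2}

print(expected_access_cost(children, weights, "root"))
print(expected_access_cost(children, weights, "root", hotlinks=[("root", "x")]))
```

In this toy instance a single hotlink from the root to the most frequently accessed leaf lowers the expected cost from 2.7 to 1.5 steps. Choosing the best set of hotlinks, rather than evaluating a given one, is the optimization problem the article addresses.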
References
- Armstrong, R., Freitag, D., Joachims, T., and Mitchell, T. 1995. WebWatcher: A learning apprentice for the World Wide Web. In Working Notes of the AAAI Spring Symposium: Information Gathering from Heterogeneous, Distributed Environments. Stanford, CA, AAAI Press, 6--12.
- Attardi, di Marco, S., and Salvi, D. 1998. Categorization by context. J. Universal Comput. Sci. 4, 9, 719--736.
- Bose, P., Krizanc, D., Langerman, S., and Morin, P. 2002. Asymmetric communication protocols via hotlink assignments. In Proceedings of the 9th Colloquium on Structural Information and Communication Complexity. 33--39.
- Bose, P., Czyzowicz, J., Gasieniec, L., Kranakis, E., Krizanc, D., Pelc, A., and Martin, M. V. 2000. Strategies for hotlink assignments. In Proceedings of the 11th International Symposium on Algorithms and Computation (ISAAC). 23--34.
- Czyzowicz, J., Kranakis, E., Krizanc, D., Pelc, A., and Vargas Martin, M. 2003. Enhancing hyperlink structure for improving web performance. J. Web Eng. 1, 2, 93--127.
- Dmoz. 2007. DMOZ website. www.dmoz.org.
- Fink, J., Kobsa, A., and Nill, A. 1996. User-oriented adaptivity and adaptability in the AVANTI project. In Designing for the Web: Empirical Studies. Microsoft Usability Group, Redmond, WA.
- Gerstel, O., Kutten, S., Matichin, R., and Peleg, D. 2003. Hotlink enhancement algorithms for web directories. In Proceedings of the International Symposium on Algorithms and Computation (ISAAC). 68--77.
- Glassman, S. 1994. A caching relay for the World Wide Web. In Proceedings of the 1st International World Wide Web Conference. 69--76.
- Google. 2007. Google website. http://www.google.com/.
- Kranakis, E., Krizanc, D., and Shende, S. 2004. Approximating hotlink assignment. Inf. Proc. Lett. 90, 3, 121--128.
- Matichin, R. and Peleg, D. 2003. Approximation algorithm for hotlink assignments in web directories. In Proceedings of the 8th Workshop on Algorithms and Data Structures. Ottawa, Canada, 271--280.
- Perkowitz, M. and Etzioni, O. 1999. Towards adaptive web sites: Conceptual framework and case study. In Proceedings of the 8th World Wide Web Conference.
- Pessoa, A., Laber, E., and Souza, C. 2004a. Efficient implementation of a hotlink assignment algorithm for web sites. In Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX).
- Pessoa, A., Laber, E., and Souza, C. 2004b. Efficient algorithms for the hotlink assignment problem: The worst case search. In Proceedings of the International Symposium on Algorithms and Computation (ISAAC).
- Yahoo. 2007. Yahoo website. http://www.yahoo.com/.
Index Terms
- Reducing human interactions in Web directory searches