Characterizing the scalability of a large web-based shopping system

Published: 01 August 2001

Abstract

This article presents an analysis of five days of workload data from a large Web-based shopping system. The multitier environment of this Web-based shopping system includes Web servers, application servers, database servers, and an assortment of load-balancing and firewall appliances. We characterize user requests and sessions and determine their impact on system performance scalability. The purpose of our study is to assess scalability and support capacity planning exercises for the multitier system. We find that horizontal scalability is not always an adequate mechanism for supporting increased workloads and that personalization and robots can have a significant impact on system scalability.

Reviews

Guenter Haring

The authors present an analysis of five days of workload data from a large Web-based shopping system. Their purpose is to investigate the issues affecting the performance and scalability of such a system. The Web-based shopping system under study has a multi-tier architecture typical of e-commerce sites, including Web servers, application servers, database servers, and an assortment of load-balancing and firewall appliances. This architecture is described in section 3 of the paper. The authors then discuss the sources of their measurement data at the various levels. In section 5, the results of the (HTTP-level) workload characterization are presented, including the distribution of requests by resource type, site usage during measurement periods, resource referencing patterns, and client request behaviors. In section 6, the authors characterize classes of requests based on their performance impact on the system. The impact of these request classes on system scalability is also discussed. The authors identify three classes of requests with different resource demands: cacheable, non-cacheable, and search. The section also investigates the sensitivity of system scalability to request class mix and request cache hit rate. While section 6 analyzes individual requests, section 7 presents a session-level characterization of the system under study. It discusses issues pertaining to the two kinds of sources that use the system, namely users and robots, as well as session-level characteristics for two different time periods. Clustering techniques are used to categorize individual sessions (both user and robot), based on their performance impact, to support the evaluation of system scalability. The last section covers performance and scalability issues.

Although the part on capacity planning and scalability could have been more extensive, the paper represents a good workload characterization of a real Web-based shopping system.

Online Computing Reviews Service


Published in

ACM Transactions on Internet Technology, Volume 1, Issue 1
Aug. 2001
140 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/383034

Copyright © 2001 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

