Abstract
Commercial applications such as databases and Web servers constitute the largest and fastest-growing segment of the market for multiprocessor servers. Ongoing innovations in disk subsystems, along with the ever increasing gap between processor and memory speeds, have elevated memory system design as the critical performance factor for such workloads. However, most current server designs have been optimized to perform well on scientific and engineering workloads, potentially leading to design decisions that are non-ideal for commercial applications. The above problem is exacerbated by the lack of information on the performance requirements of commercial workloads, the lack of available applications for widespread study, and the fact that most representative applications are too large and complex to serve as suitable benchmarks for evaluating trade-offs in the design of processors and servers.

This paper presents a detailed performance study of three important classes of commercial workloads: online transaction processing (OLTP), decision support systems (DSS), and Web index search. We use the Oracle commercial database engine for our OLTP and DSS workloads, and the AltaVista search engine for our Web index search workload. This study characterizes the memory system behavior of these workloads through a large number of architectural experiments on Alpha multiprocessors augmented with full system simulations to determine the impact of architectural trends. We also identify a set of simplifications that make these workloads more amenable to monitoring and simulation without affecting representative memory system behavior. We observe that systems optimized for OLTP versus DSS and index search workloads may lead to diverging designs, specifically in the size and speed requirements for off-chip caches.
Memory system characterization of commercial workloads
ISCA '98: Proceedings of the 25th annual international symposium on Computer architecture