Statistical sampling of microarchitecture simulation

Authors:
Roland E. Wunderlich, Computer Architecture Laboratory at Carnegie Mellon, Pittsburgh, PA
Thomas F. Wenisch, Computer Architecture Laboratory at Carnegie Mellon, Pittsburgh, PA
Babak Falsafi, Computer Architecture Laboratory at Carnegie Mellon, Pittsburgh, PA
James C. Hoe, Computer Architecture Laboratory at Carnegie Mellon, Pittsburgh, PA

ACM Transactions on Modeling and Computer Simulation, Volume 16, Issue 3, pp 197–224, https://doi.org/10.1145/1147224.1147225

Published: 01 July 2006

Abstract

Current software-based microarchitecture simulators are many orders of magnitude slower than the hardware they simulate. Hence, most microarchitecture design studies draw their conclusions from drastically truncated benchmark simulations that are often inaccurate and misleading. This article presents the Sampling Microarchitecture Simulation (SMARTS) framework as an approach to enable fast and accurate performance measurements of full-length benchmarks. SMARTS accelerates simulation by selectively measuring in detail only an appropriate benchmark subset. SMARTS prescribes a statistically sound procedure for configuring a systematic sampling simulation run to achieve a desired quantifiable confidence in estimates. Analysis of the SPEC CPU2000 benchmark suite shows that CPI and energy per instruction (EPI) can be estimated to within ±3% with 99.7% confidence by measuring fewer than 50 million instructions per benchmark. In practice, inaccuracy in microarchitectural state initialization introduces an additional uncertainty which we empirically bound to ∼2% for the tested benchmarks. Our implementation of SMARTS achieves an actual average error of only 0.64% on CPI and 0.59% on EPI for the tested benchmarks, running with average speedups of 35 and 60 over detailed simulation of 8-way and 16-way out-of-order processors, respectively.
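
The link between sample size and a target such as "±3% with 99.7% confidence" can be illustrated with a standard result from sampling theory: for a large population, the relative half-width of the confidence interval on a sample mean is roughly z · V / √n, where V is the coefficient of variation of the measured quantity and z ≈ 3 for 99.7% confidence. The sketch below is not the SMARTS implementation. The function names, the synthetic per-unit CPI values, and the sampling interval are all assumptions made for illustration, and the formula omits the finite-population correction.

```python
import math
import random
import statistics

# Roughly three standard deviations cover 99.7% under a normal approximation.
Z_99_7 = 3.0


def systematic_sample(values, interval, offset=0):
    """Take every `interval`-th element, starting at index `offset`."""
    return values[offset::interval]


def estimate_with_confidence(sample, z=Z_99_7):
    """Return (sample mean, relative half-width of the confidence interval)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    cv = statistics.stdev(sample) / mean  # coefficient of variation
    return mean, z * cv / math.sqrt(n)


def required_sample_size(cv, rel_error, z=Z_99_7):
    """Smallest n with z * cv / sqrt(n) <= rel_error (large-population case)."""
    return math.ceil((z * cv / rel_error) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-unit CPI values standing in for a full benchmark run.
    population = [1.0 + rng.random() for _ in range(100_000)]

    # Measure only every 100th unit in "detail" (assumed interval).
    sample = systematic_sample(population, interval=100)
    mean, rel_half_width = estimate_with_confidence(sample)

    print(f"estimated CPI: {mean:.3f} +/- {100 * rel_half_width:.2f}%")
    print(f"true mean CPI: {statistics.fmean(population):.3f}")

    cv = statistics.stdev(sample) / mean
    print("units needed for +/-3%:", required_sample_size(cv, 0.03))
```

Because the required sample size grows with the square of V, a benchmark whose per-unit CPI varies widely needs far more measured units than a steady one to reach the same confidence interval.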

Published in

ACM Transactions on Modeling and Computer Simulation Volume 16, Issue 3
July 2006
119 pages
ISSN:1049-3301
EISSN:1558-1195
DOI:10.1145/1147224
Copyright © 2006 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Microarchitecture simulation
SPEC CPU2000 simulation
cold-start bias
simulation sampling
statistical sampling