Abstract
Standard benchmarking provides run-times for given programs on given machines, but it fails to explain why those results were obtained (in terms of either machine or program characteristics), and it cannot predict run-times for that program on some other machine, or for other programs on that machine. We have developed a machine-independent model of program execution to characterize both machine performance and program execution. By merging these machine and program characterizations, we can estimate execution time for arbitrary machine/program combinations. Our technique allows us to identify the operations, either on the machine or in the programs, that dominate the benchmark results. This information helps designers improve the performance of future machines and helps users tune their applications to better exploit the performance of existing machines. Here we apply our methodology to characterize benchmarks and predict their execution times. We present extensive run-time statistics for a large set of benchmarks, including the SPEC and Perfect Club suites, and we show how these statistics can be used to identify important shortcomings in the programs. In addition, we give execution time estimates for a large sample of programs and machines and compare these against measured benchmark results. Finally, we develop a metric for program similarity that makes it possible to classify benchmarks with respect to a large set of characteristics.
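To make the merging step concrete, the sketch below (a hypothetical Python rendering; the operation names, counts, and timings are invented for illustration, not taken from the paper) shows the kind of linear model the abstract describes: a program is characterized by its dynamic operation counts, a machine by its time per abstract operation, and the predicted run-time is their dot product. It also includes one plausible reading of the similarity metric.

```python
import math

def predict_runtime(op_counts, op_times):
    """Estimated time = sum over abstract operations of count * time-per-op."""
    return sum(op_counts[op] * op_times[op] for op in op_counts)

# Hypothetical characterizations (illustrative values only):
program = {"flop_add": 2.0e8, "flop_mul": 1.5e8, "mem_ref": 5.0e8}  # dynamic counts
machine = {"flop_add": 4e-9, "flop_mul": 6e-9, "mem_ref": 1e-8}     # seconds per op

print(f"estimated run-time: {predict_runtime(program, machine):.2f} s")

# One plausible form of a program-similarity metric (an assumption, not the
# paper's definition): Euclidean distance between normalized
# operation-frequency vectors, so programs that spend their time on the same
# mix of operations are "close".
def similarity_distance(p, q):
    ops = set(p) | set(q)
    total_p, total_q = sum(p.values()), sum(q.values())
    return math.sqrt(sum(((p.get(o, 0.0) / total_p) -
                          (q.get(o, 0.0) / total_q)) ** 2 for o in ops))
```

Because the machine and program characterizations are kept separate, measuring m machines and n programs yields m + n sets of measurements but m x n run-time predictions, which is the economy the abstract's "arbitrary machine/program combinations" refers to.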