Abstract
The past 10 years have delivered two significant revolutions. (1) Microprocessor design has been transformed by the limits of chip power, wire latency, and Dennard scaling---leading to multicore processors and heterogeneity. (2) Managed languages and an entirely new software landscape emerged---revolutionizing how software is deployed, is sold, and interacts with hardware. Researchers most often examine these changes in isolation. Architects mostly grapple with microarchitecture design through the narrow software context of native sequential SPEC CPU benchmarks, while language researchers mostly consider microarchitecture in terms of performance alone. This work explores the clash of these two revolutions over the past decade by measuring power, performance, energy, and scaling, and considers what the results may mean for the future. Our diverse findings include the following: (a) native sequential workloads do not approximate managed workloads or even native parallel workloads; (b) diverse application power profiles suggest that future applications and system software will need to participate in power optimization and management; and (c) software and hardware researchers need access to real measurements to optimize for power and energy.
- Azizi, O., Mahesri, A., Lee, B.C., Patel, S.J., Horowitz, M. Energy-performance tradeoffs in processor architecture and circuit design: a marginal cost analysis. In ISCA (2010). Google ScholarDigital Library
- Blackburn, S.M. et al. Wake up and smell the coffee: Evaluation methodologies for the 21st century. CACM 51, 8 (2008), 83--89. Google ScholarDigital Library
- Bohr, M. A 30 year retrospective on Dennard's MOSFET scaling paper. IEEE SSCS Newsletter 12, 1 (2007), 11--13 (http://dx.doi.org/10.1109/N-SSC.2007.4785534).Google Scholar
- David, H., Gorbatov, E., Hanebutte, U.R., Khanna, R., Le, C. RAPL: memory power estimation and capping. In ISLPED (2010). Google ScholarDigital Library
- Emer, J.S., Clark, D.W. A characterization of processor performance in the VAX-11/780. In ISCA (1984). Google ScholarDigital Library
- Esmaeilzadeh, H., Blem, E., St. Amant, R., Sankaralingam, K., Burger, D. Dark silicon and the end of multicore scaling. In ISCA (2011). Google ScholarDigital Library
- Fan, X., Weber, W.D., Barroso, L.A. Power provisioning for a warehouse-sized computer. In ISCA (2007). Google ScholarDigital Library
- Hardavellas, N., Ferdman, M., Falsafi, B., Ailamaki, A. Toward dark silicon in servers. IEEE Micro 31, 4 (2011), 6--15. Google ScholarDigital Library
- Hrishikesh, M.S., Burger, D., Jouppi, N.P., Keckler, S.W., Farkas, K.I., Shivakumar, P. The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays. In International Symposium on Computer Architecture (2002). Google ScholarDigital Library
- Isci, C., Martonosi, M. Runtime power monitoring in high-end processors: Methodology and empirical data. In MICRO (2003). Google ScholarDigital Library
- ITRS Working Group. International technology roadmap for semiconductors, 2011.Google Scholar
- Le Sueur, E., Heiser, G. Dynamic voltage and frequency scaling: the laws of diminishing returns. In HotPower (2010). Google ScholarDigital Library
- Li, S., Ahn, J.H., Strong, R.D., Brockman, J.B., Tullsen, D.M., Jouppi, N.P. McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In MICRO (2009). Google ScholarDigital Library
- Li, Y., Lee, B., Brooks, D., Hu, Z., Skadron, K. CMP design space exploration subject to physical contraints. In HPCA (2006).Google ScholarCross Ref
- Moore, G.E. Cramming more components onto integrated circuits. Electronics 38, 8 (19 Apr 1965), 114--117.Google Scholar
- Mudge, T. Power: a first-class architectural design constraint. Computer 34, 4 (Apr. 2001), 52--58. Google ScholarDigital Library
- Sasanka, R., Adve, S.V., Chen, Y.K., Debes, E. The energy efficiency of CMP vs. SMT for multimedia workloads. In ICS (2004). Google ScholarDigital Library
- Singhal, R. Inside Intel next generation Nehalem microarchitecture. Intel Developer Forum (IDF) presentation (August 2008), 2011.Google Scholar
- Tullsen, D.M., Eggers, S.J., Levy, H.M. Simultaneous multithreading: maximizing on-chip parallelism. In ISCA (1995). Google ScholarDigital Library
Index Terms
- Looking back and looking forward: power, performance, and upheaval
Recommendations
Looking back on the language and hardware revolutions: measured power, performance, and scaling
ASPLOS '11This paper reports and analyzes measured chip power and performance on five process technology generations executing 61 diverse benchmarks with a rigorous methodology. We measure representative Intel IA32 processors with technologies ranging from 130nm ...
Looking back on the language and hardware revolutions: measured power, performance, and scaling
ASPLOS '11This paper reports and analyzes measured chip power and performance on five process technology generations executing 61 diverse benchmarks with a rigorous methodology. We measure representative Intel IA32 processors with technologies ranging from 130nm ...
Looking back on the language and hardware revolutions: measured power, performance, and scaling
ASPLOS XVI: Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systemsThis paper reports and analyzes measured chip power and performance on five process technology generations executing 61 diverse benchmarks with a rigorous methodology. We measure representative Intel IA32 processors with technologies ranging from 130nm ...
Comments