ABSTRACT
Fast Track is a software speculation system that enables unsafe optimization of sequential code. It speculatively runs optimized code to improve performance and then checks the correctness of the speculative code by running the original program on multiple processors. We present the interface design and system implementation of Fast Track. The interface lets a programmer or a profiling tool mark fast-track code regions, while the run-time system manages the parallel execution of the speculative process and its checking processes and ensures the correct display of program outputs. At the core of the run-time system is a novel concurrent algorithm that balances exploitable parallelism against available processors when the fast track is too slow or too fast. The programming interface closely affects the run-time support: our system permits both explicit and implicit end markers for speculatively optimized code regions, as well as extensions that allow multiple tracks and user-defined correctness checking. We discuss possible uses of speculative optimization and demonstrate the effectiveness of our prototype system through examples of unsafe semantic optimization and a general system for fast memory-safety checking, which reduces checking time by factors between 2 and 7 for large sequential code on an 8-CPU system.