Instruction path coprocessors

Published: 01 May 2000

Abstract

This paper presents the concept of an Instruction Path Coprocessor (I-COP), which is a programmable on-chip coprocessor, with its own mini-instruction set, that operates on the core processor's instructions to transform them into an internal format that can be more efficiently executed. It is located off the critical path of the core processor to ensure that it does not negatively impact the core processor's cycle time or pipeline depth. An I-COP is highly versatile and can be used to implement different types of instruction transformations to enhance the IPC of the core processor. We study four potential applications of the I-COP to demonstrate the feasibility of this concept and investigate the design issues of such a coprocessor. A prototype instruction set for the I-COP is presented along with an implementation framework that facilitates achieving high I-COP performance. Initial results indicate that the I-COP is able to efficiently implement the trace cache fill unit as well as the register move, stride data prefetching and linked data structure prefetching trace optimizations.
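
The abstract names stride data prefetching as one of the trace optimizations without describing its mechanics here. As a rough illustration only, the sketch below shows the general idea of stride detection: remember the last address and stride seen for each load, and once the same stride has repeated often enough, predict the next address. The names `StrideEntry` and `StrideDetector`, the table layout, and the confidence threshold are hypothetical choices for this example; they are not taken from the paper and do not represent the I-COP instruction set or the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StrideEntry:
    """Per-load state (hypothetical layout, not from the paper)."""
    last_addr: int
    stride: int = 0
    confidence: int = 0


class StrideDetector:
    """Toy stride detector keyed by the address (pc) of a load instruction."""

    def __init__(self, threshold: int = 2) -> None:
        # Number of consecutive stride repeats required before predicting.
        # The default of 2 is an assumption made for this sketch.
        self.threshold = threshold
        self.table: Dict[int, StrideEntry] = {}

    def observe(self, pc: int, addr: int) -> Optional[int]:
        """Record one load; return a predicted prefetch address or None."""
        entry = self.table.get(pc)
        if entry is None:
            # First time this load is seen: nothing to compare against yet.
            self.table[pc] = StrideEntry(last_addr=addr)
            return None

        stride = addr - entry.last_addr
        if stride != 0 and stride == entry.stride:
            # Same non-zero stride as last time: grow confidence.
            entry.confidence += 1
        else:
            # Stride changed (or is zero): start over with the new stride.
            entry.stride = stride
            entry.confidence = 0
        entry.last_addr = addr

        if entry.confidence >= self.threshold:
            return addr + entry.stride
        return None


if __name__ == "__main__":
    detector = StrideDetector()
    for address in (100, 108, 116, 124, 132):
        print(address, detector.observe(0x40, address))
```

With the default threshold, the first three loads in the usage loop yield no prediction; predictions begin only after the stride of 8 has repeated twice. A hardware or coprocessor realization would differ in many respects (table size, replacement, timing), none of which this sketch attempts to model.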


Published in

ACM SIGARCH Computer Architecture News, Volume 28, Issue 2
Special Issue: Proceedings of the 27th annual international symposium on Computer architecture (ISCA '00)
May 2000
325 pages
ISSN: 0163-5964
DOI: 10.1145/342001

ISCA '00: Proceedings of the 27th annual international symposium on Computer architecture
June 2000
327 pages
ISBN: 1581132328
DOI: 10.1145/339647

Copyright © 2000 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States
