DOI: 10.1145/513918.513928
Article

A fast on-chip profiler memory

Published: 10 June 2002

ABSTRACT

Profiling an application executing on a microprocessor is part of the solution to numerous software and hardware optimization and design automation problems. Most current profiling techniques suffer from runtime overhead, inaccuracy, or slowness, and the traditional non-intrusive method of using a logic analyzer does not work for today's systems-on-a-chip with embedded cores. We introduce a novel on-chip memory architecture that overcomes these limitations. The architecture, which we call ProMem, is based on a pipelined binary tree structure. It achieves single-cycle throughput, so it can keep up with today's fastest pipelined processors. It can also be laid out efficiently and scales very well, becoming more efficient the larger it gets. The memory can be used in a wide variety of common profiling situations, such as instruction profiling, value profiling, and network traffic profiling, which in turn can be used to guide numerous design automation tasks.
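
The abstract does not describe ProMem's internals, so the following is only an illustrative software model of how a pipelined binary tree can sustain single-cycle throughput, not the paper's design. It assumes a fixed set of tracked keys (for example, instruction addresses) stored as a binary search tree with one pipeline stage per tree level. The class name `PipelinedTreeProfiler` and its methods are hypothetical. Because each stage touches only the nodes of its own level, a new key can enter every cycle while earlier keys are still descending.

```python
class PipelinedTreeProfiler:
    """Toy cycle-level model: one pipeline stage per tree level (assumed design)."""

    def __init__(self, keys):
        keys = sorted(set(keys))
        # Number of levels so that a complete tree of that depth holds all keys.
        self.depth = max(1, len(keys).bit_length())
        size = 2 ** self.depth - 1
        self.node_key = [None] * size   # heap-layout binary search tree
        self.count = [0] * size         # one counter per tracked key
        self._fill(keys, 0)
        self.stages = [None] * self.depth  # in-flight (key, node index) per level
        self.cycles = 0

    def _fill(self, keys, idx):
        # Place the median at this node so the tree stays balanced.
        if not keys:
            return
        mid = len(keys) // 2
        self.node_key[idx] = keys[mid]
        self._fill(keys[:mid], 2 * idx + 1)
        self._fill(keys[mid + 1:], 2 * idx + 2)

    def tick(self, key=None):
        """Advance every stage by one cycle, optionally accepting a new key."""
        # Walk from the deepest stage up so each request moves exactly one level.
        for level in range(self.depth - 1, -1, -1):
            req = self.stages[level]
            self.stages[level] = None
            if req is None:
                continue
            k, idx = req
            node = self.node_key[idx]
            if node is None:
                continue                 # miss: key is not tracked
            if k == node:
                self.count[idx] += 1     # hit: bump this key's counter
            elif level + 1 < self.depth:
                child = 2 * idx + 1 if k < node else 2 * idx + 2
                self.stages[level + 1] = (k, child)
        if key is not None:
            self.stages[0] = (key, 0)    # one new key accepted per cycle
        self.cycles += 1

    def profile(self, stream):
        """Feed one key per cycle, then drain the pipeline; return the counts."""
        for key in stream:
            self.tick(key)
        for _ in range(self.depth):
            self.tick()
        return {k: c for k, c in zip(self.node_key, self.count) if k is not None}


tracked = [0x100, 0x104, 0x108, 0x10C, 0x110]
stream = [0x100, 0x104, 0x100, 0x200, 0x110, 0x100]
profiler = PipelinedTreeProfiler(tracked)
counts = profiler.profile(stream)
```

In this model the total cycle count is the stream length plus the tree depth, so the per-key lookup latency is hidden by the pipeline and only paid once when draining.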


    • Published in

      DAC '02: Proceedings of the 39th annual Design Automation Conference
      June 2002
      956 pages
      ISBN:1581134614
      DOI:10.1145/513918

      Copyright © 2002 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      DAC '02 Paper Acceptance Rate: 147 of 491 submissions, 30%
      Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%
