DOI: 10.1145/1878921.1878923
research-article

Balancing memory and performance through selective flushing of software code caches

Published: 24 October 2010

ABSTRACT

Dynamic binary translators (DBTs) are becoming increasingly important because of their power and flexibility. However, the high memory demands of DBTs present an obstacle for all platforms, and especially embedded systems. The memory demand is typically controlled by placing a limit on cached translations and forcing the DBT to flush all translations upon reaching the limit. This solution manifests as a performance inefficiency because many flushed translations require retranslation. Ideally, translations should be selectively flushed to minimize retranslations for a given memory limit. However, three obstacles exist: (1) it is difficult to predict which selections will minimize retranslation, (2) selective flushing results in greater book-keeping overheads than full flushing, and (3) the emergence of multicore processors and multi-threaded programming complicates most flushing algorithms. These issues have led to the widespread adoption of full flushing as a standard protocol. In this paper, we present a partial flushing approach aimed at reducing retranslation overhead and improving overall performance, given a fixed memory budget. Our technique applies uniformly to single-threaded and multi-threaded guest applications.
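
The trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the abstract does not say how translations are selected for flushing, so oldest-first (FIFO) eviction stands in as an assumed policy. The function name `simulate`, the parameter `evict_count`, and the sample trace are hypothetical. Setting `evict_count` equal to `limit` models full flushing; a smaller value models a partial flush.

```python
from collections import OrderedDict


def simulate(trace, limit, evict_count):
    """Count translations and retranslations for a bounded code cache.

    trace       -- sequence of guest code addresses, in execution order
    limit       -- maximum number of cached translations
    evict_count -- translations dropped when the limit is reached;
                   evict_count == limit models a full flush, a smaller
                   value models a partial flush (oldest-first here, which
                   is an assumption, not the paper's selection policy)
    """
    cache = OrderedDict()   # address -> placeholder, kept in insertion order
    seen = set()            # addresses translated at least once
    translations = 0
    retranslations = 0
    for addr in trace:
        if addr in cache:
            continue        # hit: the cached translation is reused
        if len(cache) >= limit:
            # Make room by dropping the oldest cached translations.
            for _ in range(min(evict_count, len(cache))):
                cache.popitem(last=False)
        cache[addr] = True
        translations += 1
        if addr in seen:
            retranslations += 1   # this address had been flushed earlier
        seen.add(addr)
    return translations, retranslations


# Hypothetical trace: four code regions, revisited after the cache fills.
trace = [0, 1, 2, 3, 2, 3, 0, 1]
full = simulate(trace, limit=3, evict_count=3)
partial = simulate(trace, limit=3, evict_count=1)
```

On this particular trace the partial flush causes fewer retranslations than the full flush. Other traces can reverse that result, which reflects the first obstacle the abstract lists: it is hard to predict which selections minimize retranslation.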

    Published in

      CASES '10: Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
      October 2010
      276 pages
      ISBN: 9781605589039
      DOI: 10.1145/1878921

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 52 of 230 submissions, 23%
