DOI: 10.1145/2151024.2151046
research-article

DDGacc: boosting dynamic DDG-based binary optimizations through specialized hardware support

Published: 03 March 2012

ABSTRACT

Dynamic Binary Translation (DBT) and Dynamic Binary Optimization (DBO) in software are widely used for several reasons, including performance, design simplification, and virtualization. However, the software layer in such systems introduces non-negligible overheads that affect performance and user experience, so reducing DBT/DBO overheads is of paramount importance. Reduced overheads also have useful side effects on the rest of the software layer, such as allowing optimizations to be applied earlier. A cost-effective solution is to provide hardware support that speeds up the primitives of the software layer, automating the DBT/DBO mechanisms while leaving the heuristics to software, which is more flexible. In this work, we characterize the overheads of a DBO system built on DynamoRIO that implements several basic optimizations. We find that computing the Data Dependence Graph (DDG) accounts for 5%-10% of the execution time. We therefore propose hardware support for this task in the form of a new functional unit, called DDGacc, which is integrated into a conventional pipelined processor and operated through new ISA instructions. Our evaluation shows that DDGacc reduces the cost of computing the DDG by 32x, which reduces overall execution time by 5%-10% on average and by up to 18% for applications where the DBO optimizes large code footprints.
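To make the DDG task concrete, the following is a minimal sketch of how a software layer might compute register data dependences for one straight-line code region. It is an illustration only, not the algorithm used by DynamoRIO or by DDGacc: the `Instr` record, the `build_ddg` function, and the register names are hypothetical, and memory dependences are ignored.

```python
from collections import namedtuple

# Hypothetical instruction record (not a DynamoRIO type):
# `dests` and `srcs` are tuples of register names.
Instr = namedtuple("Instr", ["text", "dests", "srcs"])


def build_ddg(block):
    """Return register dependence edges as (earlier, later, kind) tuples.

    kind is "RAW" (true), "WAR" (anti) or "WAW" (output) dependence.
    Assumption: straight-line code, registers only, no memory operands.
    """
    edges = set()
    last_writer = {}  # register -> index of its most recent writer
    readers = {}      # register -> indices that read it since that write

    for i, ins in enumerate(block):
        # Reads depend on the most recent write of each source register.
        for reg in ins.srcs:
            if reg in last_writer:
                edges.add((last_writer[reg], i, "RAW"))
        # Writes must follow earlier reads and the earlier write.
        for reg in ins.dests:
            for j in readers.get(reg, []):
                edges.add((j, i, "WAR"))
            if reg in last_writer:
                edges.add((last_writer[reg], i, "WAW"))
        # Update the tracking state only after all edges for i are emitted.
        for reg in ins.srcs:
            readers.setdefault(reg, []).append(i)
        for reg in ins.dests:
            last_writer[reg] = i
            readers[reg] = []
    return edges


if __name__ == "__main__":
    example = [
        Instr("r1 = load [r0]", ("r1",), ("r0",)),
        Instr("r2 = r1 + r3", ("r2",), ("r1", "r3")),
        Instr("r1 = r4 * r5", ("r1",), ("r4", "r5")),
        Instr("r2 = r1 + r2", ("r2",), ("r1", "r2")),
    ]
    for edge in sorted(build_ddg(example)):
        print(edge)
```

Each instruction is checked against per-register bookkeeping, so the work grows with the size of the optimized region. This is the kind of repetitive primitive the abstract proposes to offload to hardware while the optimization heuristics stay in software.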

Published in

    VEE '12: Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
    March 2012
    248 pages
    ISBN: 9781450311762
    DOI: 10.1145/2151024

    ACM SIGPLAN Notices, Volume 47, Issue 7
    VEE '12
    July 2012
    229 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2365864

    Copyright © 2012 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate: 80 of 235 submissions, 34%