skip to main content
10.1145/3297858.3304006acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article
Open Access

Compress Objects, Not Cache Lines: An Object-Based Compressed Memory Hierarchy

Authors Info & Claims
Published:04 April 2019Publication History

ABSTRACT

Existing cache and main memory compression techniques compress data in small fixed-size blocks, typically cache lines. Moreover, they use simple compression algorithms that focus on exploiting redundancy within a block. These techniques work well for scientific programs that are dominated by arrays. However, they are ineffective on object-based programs because objects do not fall neatly into fixed-size blocks and have a more irregular layout. We present the first compressed memory hierarchy designed for object-based applications. We observe that (i) objects, not cache lines, are the natural unit of compression for these programs, as they traverse and operate on object pointers; and (ii) though redundancy within each object is limited, redundancy across objects of the same type is plentiful. We exploit these insights through Zippads, an object-based compressed memory hierarchy, and COCO, a cross-object-compression algorithm. Building on a recent object-based memory hierarchy, Zippads transparently compresses variable-sized objects and stores them compactly. As a result, Zippads consistently outperforms a state-of-the-art compressed memory hierarchy: on a mix of array- and object-dominated workloads, Zippads achieves 1.63x higher compression ratio and improves performance by 17%.

References

  1. Alaa R Alameldeen and David A Wood. 2004 a. Adaptive cache compression for high-performance processors. In Proc. ISCA-31 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Alaa R Alameldeen and David A Wood. 2004 b. Frequent pattern compression: A significance-based compression scheme for L2 caches . Technical Report 1500. Dept. Comp. Sci., Univ. Wisconsin-Madison.Google ScholarGoogle Scholar
  3. Bowen Alpern, Steve Augart, Stephen M Blackburn, Maria Butrico, Anthony Cocchi, Perry Cheng, Julian Dolby, Stephen Fink, David Grove, Michael Hind, et almbox. 2005. The Jikes research virtual machine project: Building an open-source research community. IBM Systems Journal , Vol. 44, 2 (2005). Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Angelos Arelakis, Fredrik Dahlgren, and Per Stenstrom. 2015. HyComp: A hybrid cache compression method for selection of data-type-specific compression methods. In Proc. MICRO-48 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Angelos Arelakis and Per Stenstrom. 2014. SC2: A statistical compression cache scheme. In Proc. ISCA-41 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Nathan Beckmann and Daniel Sanchez. 2015. Bridging Theory and Practice in Cache Replacement . Technical Report MIT-CSAIL-TR-2015-034. Massachusetts Institute of Technology.Google ScholarGoogle Scholar
  7. Nathan Beckmann and Daniel Sanchez. 2017. Maximizing Cache Performance Under Uncertainty . In Proc. HPCA-23 .Google ScholarGoogle ScholarCross RefCross Ref
  8. Stephen M Blackburn, Robin Garner, Chris Hoffmann, Asjad M Khang, Kathryn S McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J Eliot B Moss, Aashish Phansalkar, Darko Stefanovic, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proc. OOPSLA . Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Stephen M Blackburn and Kathryn S McKinley. 2008. Immix: A mark-region garbage collector with space efficiency, fast collection, and mutator performance. In Proc. PLDI . Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Hans-J Boehm. 2002. An artificial garbage collection benchmark . http://hboehm.info/gc/gc_bench.html, archived at https://perma.cc/Y4BY-7RN4 .Google ScholarGoogle Scholar
  11. Guangyu Chen, Mahmut Kandemir, and Mary J Irwin. 2005. Exploiting frequent field values in Java objects for reducing heap memory requirements. In VEE . Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Guangyu Chen, M Kandemir, Narayanan Vijaykrishnan, Mary Jane Irwin, Bernd Mathiske, and Mario Wolczko. 2003. Heap compression for memory-constrained Java environments. In Proc. OOPSLA . Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Xi Chen, Lei Yang, Robert P Dick, Li Shang, and Haris Lekatsas. 2010. C-Pack: A high-performance microprocessor cache compression algorithm. IEEE transactions on very large scale integration (VLSI) systems , Vol. 18, 8 (2010). Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. David Cheriton, Amin Firoozshahian, Alex Solomatnikov, John P Stevenson, and Omid Azizi. 2012. HICAMP: Architectural support for efficient concurrency-safe shared structured data access. In Proc. ASPLOS-XVII . Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Esha Choukse, Mattan Erez, and Alaa Alameldeen. 2018a. Compresso: Pragmatic main memory compression. In Proc. MICRO-51 .Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Esha Choukse, Mattan Erez, and Alaa Alameldeen. 2018b. CompressPoints: An evaluation methodology for compressed memory systems. Computer Architecture Letters , Vol. 17, 2 (2018).Google ScholarGoogle ScholarCross RefCross Ref
  17. Magnus Ekman and Per Stenstrom. 2005. A robust main-memory compression scheme. In Proc. ISCA-32 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Jason Evans. 2005. jemalloc http://jemalloc.net/.Google ScholarGoogle Scholar
  19. Yee Ling Gan. 2018. Redesigning the memory hierarchy for memory-safe programming languages. Master's thesis. Massachusetts Institute of Technology.Google ScholarGoogle Scholar
  20. Sanjay Ghemawat and Paul Menage. 2005. TCMalloc: Thread-Caching Malloc http://goog-perftools.sourceforge.net/doc/tcmalloc.html .Google ScholarGoogle Scholar
  21. Google. 2004. Guava: Google Core Libraries for Java . https://github.com/google/guava .Google ScholarGoogle Scholar
  22. Erik G Hallnor and Steven K Reinhardt. 2005. A unified compressed memory hierarchy. In Proc. HPCA-11 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Nangate Inc. 2008. The NanGate 45nm Open Cell Library . http://www.nangate.com/?page_id=2325 .Google ScholarGoogle Scholar
  24. Jungrae Kim, Michael Sullivan, Esha Choukse, and Mattan Erez. 2016. Bit-plane compression: Transforming data for better compression in many-core architectures. In Proc. ISCA-43 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Seikwon Kim, Seonyoung Lee, Taehoon Kim, and Jaehyuk Huh. 2017. Transparent dual memory compression architecture. In Proc. PACT-26 .Google ScholarGoogle ScholarCross RefCross Ref
  26. Jan Kotek. 2012. JDBM: A simple transactional persistent engine for Java . http://jdbm.sourceforge.net/.Google ScholarGoogle Scholar
  27. Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building customized program analysis tools with dynamic instrumentation. In Proc. PLDI . Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Barak Naveh. 2003. JGraphT . http://jgrapht.org .Google ScholarGoogle Scholar
  29. Tri M Nguyen and David Wentzlaff. 2015. MORC: A manycore-oriented compressed cache. In Proc. MICRO-48 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Biswabandan Panda and André Seznec. 2016. Dictionary sharing: An efficient cache compression scheme for compressed caches. In Proc. MICRO-49 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Gennady Pekhimenko, Evgeny Bolotin, Nandita Vijaykumar, Onur Mutlu, Todd C Mowry, and Stephen W Keckler. 2016. A case for toggle-aware compression for GPU systems. In Proc. HPCA-22 .Google ScholarGoogle ScholarCross RefCross Ref
  32. Gennady Pekhimenko, Tyler Huberty, Rui Cai, Onur Mutlu, Phillip B Gibbons, Michael A Kozuch, and Todd C Mowry. 2015. Exploiting compressed block size as an indicator of future reuse. In Proc. HPCA-21 .Google ScholarGoogle ScholarCross RefCross Ref
  33. Gennady Pekhimenko, Vivek Seshadri, Yoongu Kim, Hongyi Xin, Onur Mutlu, Phillip B Gibbons, Michael A Kozuch, and Todd C Mowry. 2012a. Linearly compressed pages: A low-complexity, low-latency main memory compression framework . Technical Report SAFARI 2012-002. Carnegie Mellon University.Google ScholarGoogle Scholar
  34. Gennady Pekhimenko, Vivek Seshadri, Yoongu Kim, Hongyi Xin, Onur Mutlu, Phillip B Gibbons, Michael A Kozuch, and Todd C Mowry. 2013. Linearly compressed pages: a low-complexity, low-latency main memory compression framework. In Proc. MICRO-46 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Gennady Pekhimenko, Vivek Seshadri, Onur Mutlu, Phillip B Gibbons, Michael A Kozuch, and Todd C Mowry. 2012b. Base-delta-immediate compression: Practical data compression for on-chip caches. In Proc. PACT-21 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Roldan Pozo and Bruce Miller. 2004. SciMark 2.0 . https://math.nist.gov/scimark2/.Google ScholarGoogle Scholar
  37. Moinuddin K. Qureshi, David Thompson, and Yale N. Patt. 2005. The V-Way cache: Demand based associativity via global replacement. In Proc. ISCA-32 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Andrey Rodchenko, Christos Kotselidis, Andy Nisbet, Antoniu Pop, and Mikel Luján. 2017. MaxSim: A simulation platform for managed applications. In Proc. ISPASS .Google ScholarGoogle ScholarCross RefCross Ref
  39. Daniel Sanchez and Christos Kozyrakis. 2013. ZSim: Fast and accurate microarchitectural simulation of thousand-core systems. In Proc. ISCA-40 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Somayeh Sardashti, Angelos Arelakis, Per Stenström, and David A Wood. 2015. A primer on compression in the memory hierarchy. Synthesis Lectures on Computer Architecture , Vol. 10, 5 (2015). Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Somayeh Sardashti, André Seznec, and David A Wood. 2014. Skewed compressed caches. In Proc. MICRO-47 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Somayeh Sardashti and David A Wood. 2013. Decoupled compressed cache: Exploiting spatial locality for energy-optimized compressed caching. In Proc. MICRO-46 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Jennifer B Sartor, Stephen M Blackburn, Daniel Frampton, Martin Hirzel, and Kathryn S McKinley. 2010. Z-rays: divide arrays and conquer speed and flexibility. In Proc. PLDI . Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Jennifer B Sartor, Wim Heirman, Stephen M Blackburn, Lieven Eeckhout, and Kathryn S McKinley. 2014. Cooperative cache scrubbing. In Proc. PACT-23 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Jennifer B Sartor, Martin Hirzel, and Kathryn S McKinley. 2008. No bit left behind: The limits of heap data compression. In Proc. ISMM . Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Ali Shafiee, Meysam Taassori, Rajeev Balasubramonian, and Al Davis. 2014. MemZip: Exploring unconventional benefits from memory compression. In Proc. HPCA-20 .Google ScholarGoogle ScholarCross RefCross Ref
  47. Dimitrios Skarlatos, Nam Sung Kim, and Josep Torrellas. 2017. PageForge: A near-memory content-aware page-merging architecture. In Proc. MICRO-50 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Standard Performance Evaluation Corporation. 2006. SPECjbb2005 (Java Server Benchmark) . https://www.spec.org/jbb2005/.Google ScholarGoogle Scholar
  49. Sun Microsystems. 2006. Memory management in the Java HotSpot virtual machine . http://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf .Google ScholarGoogle Scholar
  50. Yingying Tian, Samira M Khan, Daniel A Jiménez, and Gabriel H Loh. 2016. Last-level cache deduplication. In Proc. ICS'16 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. R Brett Tremaine, Peter A Franaszek, John T Robinson, Charles O Schulz, T Basil Smith, Michael E Wazlowski, and P Maurice Bland. 2001. IBM memory expansion technology (MXT). IBM Journal of Research and Development (2001). Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Po-An Tsai, Yee Ling Gan, and Daniel Sanchez. 2018. Rethinking the memory hierarchy for modern languages. In Proc. MICRO-51 .Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. 2013. Speedy transactions in multicore in-memory databases. In Proc. SOSP-24 . Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Christian Wimmer, Michael Haupt, Michael L Van De Vanter, Mick Jordan, Laurent Daynès, and Douglas Simon. 2013. Maxine: An approachable virtual machine for, and in, Java. ACM Transactions on Architecture and Code Optimization (TACO) , Vol. 9, 4 (2013). Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Clifford Wolf. 2012. Yosys Open Synthesis Suite . http://www.clifford.at/yosys/.Google ScholarGoogle Scholar
  56. Yi Zhao, Jin Shi, Kai Zheng, Haichuan Wang, Haibo Lin, and Ling Shao. 2009. Allocation wall: A limiting factor of Java applications on emerging multi-core platforms. In Proc. OOPSLA . Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Compress Objects, Not Cache Lines: An Object-Based Compressed Memory Hierarchy

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
      April 2019
      1126 pages
      ISBN:9781450362405
      DOI:10.1145/3297858

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 4 April 2019

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      ASPLOS '19 Paper Acceptance Rate74of351submissions,21%Overall Acceptance Rate535of2,713submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader