This synthesis lecture presents the current state of the art in applying low-latency, lossless hardware compression algorithms to caches, main memory, and the memory/cache link. Several non-trivial challenges must be addressed to make data compression work well in this context. First, since compressed data must be decompressed before it can be accessed, decompression latency falls on the critical memory access path. This imposes a significant constraint on the choice of compression algorithms. Second, while conventional memory systems store fixed-size entities such as data types, cache blocks, and memory pages, these entities vary in size once compression is applied. Handling variable-size entities has a significant impact on how caches are organized and how main-memory resources are managed. We systematically discuss solutions to these problems found in the open literature.

Chapter 2 provides the foundations of data compression by first introducing the fundamental concept of value locality. We then introduce a taxonomy of compression algorithms and show how previously proposed algorithms fit within that logical framework. Chapter 3 discusses the different ways that cache memory systems can employ compression, focusing on the trade-offs between latency, capacity, and complexity of alternative ways to compact compressed cache blocks. Chapter 4 discusses issues in applying data compression to main memory, and Chapter 5 covers techniques for compressing data on the cache-to-memory links. This book should help a skilled memory system designer understand the fundamental challenges in applying compression to the memory hierarchy and introduce them to the state-of-the-art techniques for addressing those challenges.
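To make the notion of value locality concrete, the sketch below (illustrative only, not code from the book) implements a base-plus-delta scheme in the spirit of Base-Delta-Immediate (BDI) compression: when the words in a cache block cluster around a common value, the block can be stored as one full-width base plus narrow deltas, and decompression is a latency-friendly parallel add. The function names and word widths here are assumptions for illustration.

```python
def bdi_compress(block, delta_bytes=1):
    """Try to encode a block of 8-byte words as a base plus narrow deltas.

    Returns (base, delta_bytes, deltas) if every delta fits in a signed
    delta_bytes-wide field, or None if the block is incompressible
    under this scheme.
    """
    base = block[0]
    limit = 1 << (8 * delta_bytes - 1)          # signed range of a delta field
    deltas = [w - base for w in block]
    if all(-limit <= d < limit for d in deltas):
        return (base, delta_bytes, deltas)
    return None

def bdi_decompress(base, delta_bytes, deltas):
    """Rebuild the original words; in hardware this is a parallel add."""
    return [base + d for d in deltas]

# A block of 8 words exhibiting value locality: 64 bytes shrink to
# one 8-byte base plus 8 one-byte deltas (16 bytes).
block = [1000, 1003, 998, 1005, 1001, 1000, 997, 1004]
compressed = bdi_compress(block)
assert compressed is not None
assert bdi_decompress(*compressed) == block
```

Note the trade-off the abstract describes: compressed blocks now have data-dependent sizes (16 bytes here, 64 if `bdi_compress` returns `None`), which is exactly what complicates cache organization and main-memory resource management.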
Cited By
- Lascorz A, Mahmoud M, Zadeh A, Nikolic M, Ibrahim K, Giannoula C, Abdelhadi A and Moshovos A. Atalanta: A Bit is Worth a "Thousand" Tensor Values. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 85-102.
- Tsai P and Sanchez D. Compress Objects, Not Cache Lines. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 229-242.
- Stevens J, Ranjan A and Raghunathan A. AxBA: An Approximate Bus Architecture Framework. 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1-8.
- Li Y, Park J, Alian M, Yuan Y, Qu Z, Pan P, Wang R, Schwing A, Esmaeilzadeh H and Kim N. A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks. Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, pp. 175-188.
- Sardashti S and Wood D (2017). Could Compression Be of General Use? Evaluating Memory Compression across Domains. ACM Transactions on Architecture and Code Optimization, 14:4, pp. 1-24. Online publication date: 20-Dec-2017.