Abstract
Classic caching algorithms leverage recency, access count, and/or other properties of cached blocks at per-block granularity. However, for media such as flash, which incur performance and wear penalties for small overwrites, implementing cache policies at a larger granularity is beneficial. Recent research has focused on buffering small blocks and writing them in larger units, sometimes called containers, but it has not explored the ramifications of, and best strategies for, caching compound blocks that consist of logically distinct but physically co-located blocks. A container may hold highly diverse blocks: a mixture of frequently accessed, infrequently accessed, and invalidated blocks.
We propose and evaluate Pannier, a flash cache layer that provides high performance while extending flash lifespan. Pannier uses three main techniques: (1) leveraging block access counts to manage cache containers, (2) incorporating block liveness as a property to improve flash cache space efficiency, and (3) designing a multi-step feedback controller to ensure a flash cache reaches its desired lifespan while maintaining performance. Our evaluation shows that Pannier improves flash cache performance and extends lifespan beyond previous per-block and container-aware caching policies. More fundamentally, our investigation highlights the importance of creating new policies for caching compound blocks in flash.
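To make the three techniques concrete, the following is a minimal sketch, not Pannier's actual algorithm, which the abstract does not specify. It assumes a hypothetical `Container` of `Block` objects, an illustrative keep-score that averages the access counts of live blocks over all slots (so invalidated blocks lower a container's priority), and a simple one-threshold write-budget controller that stands in for the multi-step feedback controller. All names, the scoring formula, and the `step` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """One cached block inside a container (hypothetical structure)."""
    access_count: int = 0
    live: bool = True  # False once the block has been invalidated


@dataclass
class Container:
    """A group of blocks written to flash as a single unit."""
    blocks: List[Block] = field(default_factory=list)

    def live_fraction(self) -> float:
        """Fraction of slots that still hold valid data."""
        if not self.blocks:
            return 0.0
        return sum(1 for b in self.blocks if b.live) / len(self.blocks)

    def score(self) -> float:
        """Assumed keep-score: access counts of live blocks, averaged over
        all slots, so dead slots pull the container's score down."""
        if not self.blocks:
            return 0.0
        hits = sum(b.access_count for b in self.blocks if b.live)
        return hits / len(self.blocks)


def pick_victim(containers: List[Container]) -> int:
    """Return the index of the container with the lowest keep-score."""
    return min(range(len(containers)), key=lambda i: containers[i].score())


def adjust_admission(bytes_written: float, elapsed_days: float,
                     endurance_bytes: float, lifespan_days: float,
                     current: float, step: float = 0.1) -> float:
    """One feedback step on the fraction of insertions admitted to flash.

    The write budget grows linearly with elapsed time. If cumulative writes
    exceed the budget, admit less; otherwise admit more. The result is
    clamped to [0, 1]. This is a deliberate simplification.
    """
    if elapsed_days <= 0 or lifespan_days <= 0:
        return current
    budget = endurance_bytes * elapsed_days / lifespan_days
    if bytes_written > budget:
        return max(0.0, current - step)
    return min(1.0, current + step)


# Usage: four containers with different mixes of hot, cold, and dead blocks.
hot = Container([Block(10), Block(8)])
mixed = Container([Block(20), Block(5, live=False)])
cold = Container([Block(1), Block(0)])
dead = Container([Block(50, live=False), Block(30, live=False)])
victim = pick_victim([hot, mixed, cold, dead])
```

In this sketch the fully invalidated container is evicted first even though its blocks once had high access counts, which is the intuition behind treating liveness as a first-class property. The controller trades hit ratio for endurance only while writes are running ahead of the lifespan budget.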
Index Terms
- Pannier: Design and Analysis of a Container-Based Flash Cache for Compound Objects