DOI: 10.1145/3190508.3190524
research-article
Open Access

Reducing DRAM footprint with NVM in Facebook

Published: 23 April 2018

ABSTRACT

Popular SSD-based key-value stores consume a large amount of DRAM in order to provide high-performance database operations. However, DRAM can be expensive for data center providers, especially given recent global supply shortages that have resulted in increasing DRAM costs. In this work, we design a key-value store, MyNVM, which leverages an NVM block device to reduce DRAM usage and the total cost of ownership, while providing latency and queries-per-second (QPS) comparable to MyRocks on a server with a much larger amount of DRAM. Replacing DRAM with NVM introduces several challenges. In particular, NVM has limited read bandwidth, and it wears out quickly under a high write bandwidth.

We design novel solutions to these challenges, including using small block sizes with a partitioned index, aligning blocks post-compression to reduce read bandwidth, utilizing dictionary compression, implementing an admission control policy for which objects get cached in NVM to control its durability, as well as replacing interrupts with a hybrid polling mechanism. We implemented MyNVM and measured its performance in Facebook's production environment. Our implementation reduces the size of the DRAM cache from 96 GB to 16 GB, and incurs a negligible impact on latency and queries-per-second compared to MyRocks. Finally, to the best of our knowledge, this is the first study on the usage of NVM devices in a commercial data center environment.
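To make the idea of an admission control policy concrete, the following is a minimal sketch in Python. The abstract does not specify which policy MyNVM uses, so the rule shown here (admit a block to NVM only the second time it is offered within a recent window) is an assumption chosen for illustration, and the class name `GhostAdmissionCache` and its parameters are hypothetical. The point it illustrates is the trade-off named above: rejecting blocks that are seen only once reduces the number of writes to the NVM device, at the cost of a lower hit rate.

```python
from collections import OrderedDict


class GhostAdmissionCache:
    """Toy model of an NVM cache with write-limiting admission control.

    Assumption for illustration only: a block is admitted the second time
    it is offered while its key is still remembered in a small "ghost" list.
    This is not claimed to be the policy used by MyNVM.
    """

    def __init__(self, capacity, ghost_capacity):
        self.capacity = capacity              # max blocks held in the simulated NVM cache
        self.ghost_capacity = ghost_capacity  # max keys remembered without their data
        self.cache = OrderedDict()            # key -> block, least recently used first
        self.ghost = OrderedDict()            # keys offered once recently, not admitted
        self.nvm_writes = 0                   # each admission costs one NVM write

    def get(self, key):
        """Return the cached block and refresh its recency, or None on a miss."""
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key]
        return None

    def offer(self, key, block):
        """Offer a block (e.g. one evicted from DRAM). Return True if written to NVM."""
        if key in self.cache:
            # Already cached: refresh recency, no new write.
            self.cache.move_to_end(key)
            return False
        if key not in self.ghost:
            # First sighting: remember the key only, do not write the block.
            self.ghost[key] = None
            if len(self.ghost) > self.ghost_capacity:
                self.ghost.popitem(last=False)
            return False
        # Second sighting within the window: admit and pay one NVM write.
        del self.ghost[key]
        self.cache[key] = block
        self.nvm_writes += 1
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used block
        return True
```

In this sketch, a workload dominated by blocks that are touched once produces no NVM writes at all, while blocks that recur are admitted after one extra miss. A larger `ghost_capacity` admits more blocks and therefore writes more.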

  • Published in

    EuroSys '18: Proceedings of the Thirteenth EuroSys Conference
    April 2018
    631 pages
    ISBN: 9781450355841
    DOI: 10.1145/3190508

    Copyright © 2018 Owner/Author

    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article

    Acceptance Rates

    EuroSys '18 Paper Acceptance Rate: 43 of 262 submissions, 16%
    Overall Acceptance Rate: 241 of 1,308 submissions, 18%
