DOI: 10.1145/3316781.3317827
research-article

FlashGPU: Placing New Flash Next to GPU Cores

Published: 02 June 2019

ABSTRACT

We propose FlashGPU, a new GPU architecture that tightly blends new flash (Z-NAND) with massive GPU cores. Specifically, we replace global memory with Z-NAND, which exhibits ultra-low latency. We also architect a flash core that manages request dispatch and address translation underneath the L2 cache banks of the GPU cores. While Z-NAND is a hundred times faster than conventional 3D-stacked flash, its latency is still longer than that of DRAM. To address this shortcoming, we propose a dynamic page-placement and buffer manager in the Z-NAND subsystems that is aware of the bulk and parallel memory access characteristics of GPU applications, thereby offering high throughput and low energy consumption.
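
To make the two roles named in the abstract more concrete (address translation and buffering in front of flash), the following is a minimal illustrative sketch in Python. It is not the FlashGPU design: the class `ToyFlashCore`, the 4 KB page size, the append-only allocation, and the LRU eviction policy are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per flash page (assumed value, not from the paper)


class ToyFlashCore:
    """Hypothetical model: page-level address translation plus an LRU page buffer."""

    def __init__(self, buffer_pages=2):
        self.mapping = {}            # logical page number -> physical page number
        self.next_free = 0           # next unused physical page (append-only allocation)
        self.buffer = OrderedDict()  # buffered logical pages, oldest first
        self.buffer_pages = buffer_pages
        self.flash_reads = 0         # number of reads that had to go to flash

    def translate(self, addr):
        """Map a byte address to (physical page, offset), allocating on first touch."""
        lpn, offset = divmod(addr, PAGE_SIZE)
        if lpn not in self.mapping:
            self.mapping[lpn] = self.next_free
            self.next_free += 1
        return self.mapping[lpn], offset

    def read(self, addr):
        """Return True on a buffer hit, False when the page is fetched from flash."""
        lpn = addr // PAGE_SIZE
        self.translate(addr)  # make sure the page has a physical location
        if lpn in self.buffer:
            self.buffer.move_to_end(lpn)  # mark as most recently used
            return True
        self.flash_reads += 1
        self.buffer[lpn] = True
        if len(self.buffer) > self.buffer_pages:
            self.buffer.popitem(last=False)  # evict the least recently used page
        return False


core = ToyFlashCore(buffer_pages=2)
hits = [core.read(a) for a in (0, 128, 4096, 8192, 0)]
```

In this toy model, the second access (address 128) falls in the same page as the first and is served from the buffer, while the remaining accesses go to flash. Bulk accesses that stay within a page are the case where such a buffer helps most, which is the kind of access characteristic the abstract says the manager exploits.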


  • Published in

    DAC '19: Proceedings of the 56th Annual Design Automation Conference 2019
    June 2019
    1378 pages
    ISBN: 9781450367257
    DOI: 10.1145/3316781

    Copyright © 2019 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%
