ABSTRACT
Persistent Memory (PM) is an emerging family of technologies that are persistent, byte addressable, and respond at near-memory speeds. PM devices, also referred to as non-volatile DIMMs or NVDIMMs, connect to the low-latency CPU memory interconnect. PM-based solutions can achieve local persistency within a microsecond, which is two orders of magnitude faster than modern Flash solutions [1].
PM is the first storage medium that is faster than high-speed networks and faster than operating system thread scheduling time. Thus, current PM-based solutions are local. They do not comply with common enterprise practices, which require that data remain available even in the face of a given number of failures, such as an entire node crash.
This work focuses on replicating PM-resident data sets between nodes, which is vital for PM to become mainstream. We leverage RDMA-supporting network gear and the first application-agnostic PM-based file system that supports mirroring (Plexistor M1FS 3.0). The server on the left-hand side of Figure 1 is the application server running the benchmarks, and the server on the right-hand side runs PM-over-Fabric (PMoF), which owns the secondary copy of the data alongside the file system metadata required to mount the file system after a failure occurs.
The experimental setup uses commodity off-the-shelf hardware and a 100GbE (RoCE) network. We first explore a synthetic benchmark (FIO) and then a TPC-C-like benchmark (DBT-2) on top of a Postgres database. In both cases, we use working sets that fit in the PM tier, because we do not want tiering to mask the performance implications of mirroring.
Figure 2a shows the overall latency (as seen by the application) as a function of different stress levels. Three different access sizes were measured, in both single-threaded and multithreaded variants. Small accesses, including local persistency and asynchronous mirroring to the second node, were measured to complete within 1 to 2 microseconds under typical storage consumption. At very high loads, hardware resources become congested and latency soars.
Figure 2b shows results for similar benchmarks, with one important difference: write requests are synchronous (i.e., files are opened with the O_SYNC flag). These semantics mean that the file system must also guarantee that the data written and the metadata describing it have reached the PMoF node before acknowledging the write system call. Synchronous mirroring is nearly 2.5 µs slower than asynchronous mirroring for typical loads, which is mostly due to the round-trip delay. These results are an order of magnitude faster than modern block-based replication solutions.
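The synchronous write semantics described above can be illustrated at the system-call level. The following is a minimal sketch, assuming a POSIX system where `os.O_SYNC` is available; the helper name `write_sync` and the file path are hypothetical, and nothing here is taken from the benchmark configuration used in this work.

```python
import os
import tempfile


def write_sync(path: str, payload: bytes) -> int:
    """Write payload to path with O_SYNC and return the bytes written.

    With O_SYNC, each write() returns only after the file system reports
    the data and the metadata needed to retrieve it as durable. On a
    mirrored file system this would presumably include the remote copy
    (an assumption about the setup, not something this sketch verifies).
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")  # hypothetical path
        print(write_sync(target, b"x" * 4096))
```

Without the `os.O_SYNC` flag, the same call sequence would correspond to the asynchronous case, where the write may be acknowledged before the data is guaranteed to be durable.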
Databases may support replication at the database layer, as an alternative to maintaining data redundancy at the storage layer. Each approach has its advantages, but the rule of thumb for PostgreSQL, as presented at PGCon IL 2017, anticipates 50% fewer transactions per second when a secondary copy is maintained. Figures 3a and 3b show that PMoF-based mirroring adds negligible performance overhead to real-life applications. Compared to Postgres on a single-node deployment, transaction rate and response time were measured to be only 2.0% to 2.2% lower.
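To put the two overhead figures side by side, the sketch below normalizes throughput to a single-node baseline of 1.0. The function and variable names are illustrative only; the percentages are the ones quoted above (the 50% rule of thumb and the measured 2.0% to 2.2% range).

```python
def relative_throughput(percent_lower: float) -> float:
    """Return throughput relative to a single-node baseline of 1.0."""
    return 1.0 - percent_lower / 100.0


# Rule of thumb quoted for database-layer replication.
rule_of_thumb = relative_throughput(50.0)

# Measured range for PMoF-based mirroring.
pmof_worst = relative_throughput(2.2)
pmof_best = relative_throughput(2.0)

print(rule_of_thumb, pmof_worst, pmof_best)
```

Under this normalization, the mirrored deployment retains roughly 98% of the single-node transaction rate, whereas the rule of thumb anticipates about half.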
[1] N. Katzburg, A. Golander, and S. Weiss. Storage becomes first class memory. In IEEE Int'l Conf. on the Science of Electrical Eng. (ICSEE), pages 1-5, 2016.
Index Terms
- Persistent memory over fabric (PMoF)