10.25394/PGS.12730685.v1 Timothy A Pritchett Timothy A Pritchett Workload Driven Designs for Cost-Effective Non-Volatile Memory Hierarchies Purdue University Graduate School 2020 SSD Caching Storage Caches Write Buffering Selective Allocation STT-MRAM Flash Cache Non-Volatile Cache Non-Volatile Write Buffer Computer Engineering Software Engineering 2020-07-28 23:38:28 Thesis https://hammer.purdue.edu/articles/thesis/Workload_Driven_Designs_for_Cost-Effective_Non-Volatile_Memory_Hierarchies/12730685 Compared to traditional hard-disk drives (HDDs), non-volatile (NV) memory technologies offer significant performance advantages on one hand, but also incur significant cost and asymmetric write-performance on the other. A common strategy to manage such cost- and performance-differentials is to use hierarchies such that a small, but intensely accessed, working set is staged in the NV storage (selective caching). However, when this working set includes write-heavy data, the low write-lifetime of NV storage necessitates significant over-provisioning to maintain required lifespans (e.g., storage lifespan must match or exceed 3 year server lifespan). One may think that employing DRAM-based write-buffers can filter writes that trickle through to the NV storage and thus alleviate the write-pressure felt at the NV storage. Unfortunately, selective caches, when used with common recency-based or frequency-based replacement, have access patterns that require large write buffers (e.g., 100MB+ relative to a 12GB cache) to filter writes adequately. Further, these large DRAM write-buffers also require backup-power to ensure the durability of disk writes. More sophisticated replacement policies that combine recency and frequency can reduce the size of the DRAM buffer (while preserving write-filtering), but are so computationally-expensive that they can limit the I/O rate, especially for simple controllers (e.g., RAID controller). <br>My first contribution is the design and implementation of WriteGuard– a self-tuning sieving write-buffer algorithm that filters writes as well as the highly-effective (but computationally-expensive) algorithms while requiring lightweight computation comparable to a simple LRU-based write-buffer. While WriteGuard reduces the capacity needed for DRAM buffering (to approx. 64 MB), it does not eliminate the need for DRAM buffers (and corresponding power backup).<br>For my second thrust, I identify two specific application characteristics – (1) the vast majority of the write-buffer’s contents is composed of write-dominant blocks, and (2) the vast majority of blocks in the write-buffer are overwritten within a period of 28 hours. I show that these characteristics help enable a high-density, optimized STT-MRAM as a replacement for DRAM, which enables durable write-buffers (thus eliminating the cost of power backup for the write-buffer). My optimized STT-MRAM-based write buffer achieves higher density by (a) trading off superfluous durability by exploiting characteristic (2), and (b) deoptimizing the read-performance of STT-MRAM by leveraging characteristic (1). Together, the techniques increase the density of STT-MRAM by 20% with low or no impact on write-buffer performance.<br>