Energy-efficient Memory System Design with Spintronics
Modern computing platforms, from servers to mobile devices, demand ever-increasing amounts of memory to keep up with the growing amounts of data they process, and to bridge the widening processor-memory gap. A large and growing fraction of chip area and energy is expended in memories, which face challenges with technology scaling due to increased leakage, process variations, and unreliability. At the same time, data-intensive workloads such as machine learning and data analytics place increasing demands on memory systems. Consequently, improving the energy efficiency and performance of memory systems is an important challenge for computing system designers.
Spintronic memories, which offer several desirable characteristics - near-zero leakage, high density, non-volatility and high endurance - are of great interest for designing future memory systems. However, these memories are not drop-in replacements for current memory technologies, viz. Static Random Access Memory (SRAM) and Dynamic Random Access Memory (DRAM). They pose unique challenges, such as variable access times and higher write latency and write energy. This dissertation explores new approaches to improving the energy efficiency of spintronic memory systems.
The dissertation first explores the design of approximate memories, in which the need to store and access data precisely is forgone in return for improvements in energy efficiency. This is of particular interest, since many emerging workloads exhibit an inherent ability to tolerate approximations to their underlying computations and data while still producing outputs of acceptable quality. The dissertation proposes that approximate spintronic memories can be realized either by reducing the amount of data that is written to or read from them, or by reducing the energy consumed per access. To reduce memory traffic, the dissertation proposes approximate memory compression, wherein a quality-aware memory controller transparently compresses data written to memory and decompresses data read from it. For broader applicability, the quality-aware memory controller can be programmed with the memory regions that can tolerate approximations, and it conforms to a specified error constraint for each such region. To reduce the per-access energy, various mechanisms are identified at the circuit and architecture levels that yield substantial energy benefits at the cost of small probabilities of read, write or retention failures. Based on these mechanisms, a quality-configurable Spin Transfer Torque Magnetic RAM (STT-MRAM) array is designed in which read/write operations can be performed at varying levels of accuracy and energy at runtime, depending on the needs of applications. To illustrate the utility of the proposed quality-configurable memory array, it is evaluated as an L2 cache in the context of a general-purpose processor, and as a scratchpad memory for a domain-specific vector processor.
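The idea of a controller that is programmed with approximable regions and a per-region error constraint can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Region` and `QualityAwareController`, the 16-bit word size, the use of an absolute-error bound on non-negative integer words, and in particular the "compression" itself, which is simple low-order-bit truncation standing in for whatever scheme the dissertation actually uses.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int          # first address of the region (inclusive)
    end: int            # end address of the region (exclusive)
    max_abs_error: int  # largest tolerated absolute error per stored word


class QualityAwareController:
    """Toy model with hypothetical names; not the dissertation's design."""

    def __init__(self, word_bits=16):
        self.word_bits = word_bits
        self.regions = []
        self.store = {}  # block address -> (dropped_bits, truncated words)

    def add_region(self, start, end, max_abs_error):
        self.regions.append(Region(start, end, max_abs_error))

    def _dropped_bits(self, addr):
        for r in self.regions:
            if r.start <= addr < r.end:
                # Largest k such that 2**k - 1 <= max_abs_error, so that
                # dropping k low-order bits never violates the constraint.
                k = (r.max_abs_error + 1).bit_length() - 1
                return min(k, self.word_bits)
        return 0  # addresses outside every region are stored exactly

    def write_block(self, addr, words):
        k = self._dropped_bits(addr)
        self.store[addr] = (k, [w >> k for w in words])
        return len(words) * (self.word_bits - k)  # bits actually written

    def read_block(self, addr):
        k, truncated = self.store[addr]
        return [t << k for t in truncated]  # transparent "decompression"


ctrl = QualityAwareController(word_bits=16)
ctrl.add_region(0x1000, 0x2000, max_abs_error=3)
data = [100, 203, 77, 4095]
approx_bits = ctrl.write_block(0x1000, data)  # inside the approximable region
exact_bits = ctrl.write_block(0x0, data)      # outside it, stored precisely
```

In this toy model the block in the approximable region needs fewer bits than the precise one, and every word read back stays within the region's error bound. A real controller would also have to track compressed block sizes and layout, which the sketch ignores.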
The dissertation also explores the design of caches with Domain Wall Memory (DWM), a more advanced spintronic memory technology that offers unparalleled density arising from a unique tape-like structure. However, this structure also leads to serialized access to the bits in each bit-cell, which increases access latency and degrades overall performance. To mitigate the performance overheads, the dissertation proposes a reconfigurable DWM-based cache architecture that modulates the active bits per tape with minimal overheads depending on the application's memory access characteristics. The proposed cache is evaluated in a general-purpose processor and improvements in performance are demonstrated over both CMOS and previously proposed spintronic caches.
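The link between the number of active bits per tape and access latency can be made concrete with a rough cost model. This is a sketch under stated assumptions, not the dissertation's architecture: it assumes one access port per tape, a head that stays where the last access left it, one shift per bit position moved, and a hypothetical mapping of each address to position `addr % active_bits`. The function name `total_shifts` and the access trace are invented for illustration.

```python
def total_shifts(trace, active_bits):
    """Count shift operations for a sequence of accesses to one tape.

    Assumes a single access port and that the tape stays at the position
    of the previous access (both are simplifying assumptions).
    """
    head = 0
    shifts = 0
    for addr in trace:
        pos = addr % active_bits   # hypothetical address-to-position mapping
        shifts += abs(pos - head)  # serialized access: shift to the target bit
        head = pos
    return shifts


trace = [0, 31, 5, 28, 2, 30]       # made-up access pattern
shifts_32 = total_shifts(trace, 32)  # all 32 bits of the tape active
shifts_8 = total_shifts(trace, 8)    # only 8 bits of the tape active
```

For this trace, using fewer active bits cuts the number of shifts substantially. The model leaves out the other side of the trade-off: with fewer active bits the tape holds less data, so capacity-sensitive applications may see more misses, which is presumably why the choice is tied to the application's memory access characteristics.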
In summary, the dissertation suggests directions to improve the energy efficiency of spintronic memories and re-affirms their potential for the design of future memory systems.