Dynamically Replicated Memory: Building Reliable Systems from Nanoscale Resistive Memories

ASPLOS 2010: 15th International Conference on Architectural Support for Programming Languages and Operating Systems, Pittsburgh, PA |

Published by ACM

ASPLOS Best Paper Award

DRAM is facing severe scalability challenges in sub-45nm technology nodes due to precise charge placement and sensing hurdles in deep-submicron geometries. Resistive memories, such as phase-change memory (PCM), already scale well beyond DRAM and are a promising DRAM replacement. Unfortunately, PCM is write-limited, and current approaches to managing writes must decommission pages of PCM when the first bit fails. This paper presents dynamically replicated memory (DRM), the first hardware and operating system interface designed for PCM that allows continued operation through graceful degradation when hard faults occur. DRM reuses memory pages that contain hard faults by dynamically forming pairs of complementary pages that act as a single page of storage. No changes are required to the processor cores, the cache hierarchy, or the operating system’s page tables. By changing the memory controller, the TLBs, and the operating system to be DRM-aware, we can improve the lifetime of PCM by up to 40x over conventional error-detection techniques.