Dynamically Replicated Memory: Building resilient systems from unreliable nanoscale memories

Engin Ipek, Jeremy Condit, Edmund B Nightingale, Doug Burger, and Thomas Moscibroda

Abstract

DRAM is facing severe scalability challenges in sub-45nm technology nodes due to precise charge placement and sensing hurdles in deep-submicron geometries. Resistive memories, such as phase-change memory (PCM), already scale well beyond DRAM and are a promising DRAM replacement. Unfortunately, PCM is write-limited, and current approaches to managing writes must decommission pages of PCM when the first bit fails. This paper presents dynamically replicated memory (DRM), the first hardware and operating system interface designed for PCM that allows continued operation through graceful degradation when hard faults occur. DRM reuses memory pages that contain hard faults by dynamically forming pairs of complementary pages that act as a single page of storage. No changes are required to the processor cores, the cache hierarchy, or the operating system’s page tables. By changing the memory controller, the TLBs, and the operating system to be DRM-aware, we can improve the lifetime of PCM by up to 40x over conventional error-detection techniques.
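
The pairing idea in the abstract can be illustrated with a minimal software sketch. It is not the paper's hardware design: it assumes that two faulty pages are "complementary" when no byte offset is faulty in both, and that pairs are formed greedily. The names `compatible`, `pair_pages`, and `read_byte` are hypothetical and used only for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def compatible(faults_a: Set[int], faults_b: Set[int]) -> bool:
    """Assumed rule: two pages can be paired if no offset is faulty in both."""
    return faults_a.isdisjoint(faults_b)


def pair_pages(fault_map: Dict[int, Set[int]]) -> List[Tuple[int, int]]:
    """Greedily pair faulty pages whose fault sets do not overlap.

    fault_map maps a page number to the set of faulty byte offsets in it.
    """
    unpaired: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for page in sorted(fault_map):
        # Look for an earlier, still-unpaired page that complements this one.
        partner = next(
            (p for p in unpaired if compatible(fault_map[p], fault_map[page])),
            None,
        )
        if partner is None:
            unpaired.append(page)
        else:
            unpaired.remove(partner)
            pairs.append((partner, page))
    return pairs


def read_byte(offset: int, primary: Sequence[int], backup: Sequence[int],
              primary_faults: Set[int]) -> int:
    """Serve a read from the primary page unless that offset is faulty there."""
    return backup[offset] if offset in primary_faults else primary[offset]


# Hypothetical fault map: page number -> faulty byte offsets.
faults = {0: {1, 5}, 1: {5}, 2: {2, 3}, 3: {7}}
pairs = pair_pages(faults)
```

In this sketch, pages 0 and 1 cannot be paired because both are faulty at offset 5, so each is matched with a later page whose faults fall elsewhere. Together, each pair provides one fully readable page of storage.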

Details

Publication type: Proceedings
Published in: Fifteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '10). Won Best Paper Award.