Due to the emerging needs of large-scale Web applications, DRAM technique in storage systems has been increasing rapidly in recent years. However, DRAM has its own limitations. In most cases, DRAM just acts as a cache for other storage systems. Also, developers have to manage consistency between caches and the backing storage. Besides, cache misses and backing store overheads are another two reasons that result in the difficulties to capture DRAM's full performance potential.
In this paper, the authors present RAMCloud, a DRAM-based storage system, that keeps all data in DRAM all the time with disks used only as backups. For RAMCloud, the most challenging part of the design is to provide a high level of durability and availability without impacting system performance. To solve availability problem, RAMCloud leverages fast crash recovery. It keeps only one single copy of data in DRAM with redundant copies stored on disk or flash. During the recovery phase, RAMCloud uses both data parallelism and pipelining to speed up recovery. The authors implement RAMCloud and evaluate its crash recovery properties. Their experimental results show that RAMCloud can provide durable and available DRAM-based storage for the same price and energy usage as today's volatile DRAM caches.
- The authors design and implement RAMCloud, providing inexpensive durability and availability by recovering quickly from crashes.
- Apart from durability and availability, RAMCloud has small impacts on the performance of the system.
- RAMCloud takes advantage of the system's large scale to recover quickly after crashes.
- RAMCloud uses a log-structured approach for all its data in DRAM as well as on disk, which provides high write throughput and simplifies many issues related to crash recovery.
- RAMCloud utilizes randomized approaches to provide near-optimal results at low cost.
- RAMCloud does not consider locality very well. It only guarantees that small tablets and adjacent keys in large tables may be stored together.