Sandook delivers pooled SSD performance-boosting management

MIT and Tufts University researchers have devised a performance-boosting management scheme for pooled SSDs that overcomes the performance dips caused by block erase-and-write cycles and garbage collection.

The scheme is called Sandook. It handles short- and long-term sources of SSD performance degradation separately, using a two-layer control structure fed with data from agent software running in storage servers.

SSD performance varies with the ratio of read and write requests. Because writing requires a block-level erase-and-write process, it is much slower than reading, and a preponderance of writes to an SSD can slow its read performance. SSDs can also vary in performance between suppliers and even within a batch of SSDs from one supplier. When an SSD’s controller needs to reclaim deleted cells from within blocks of cells, it has to copy the good data out of the block, erase the block’s contents, and return the block to the pool of unused blocks. This is known as garbage collection. It typically runs at the individual SSD controller’s discretion and can cause a temporary but substantial drop in performance.
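The garbage collection steps described above can be illustrated with a toy model. This is a minimal sketch, not how any real SSD firmware works: the `Block` class, the use of `None` to mark deleted pages, and the `garbage_collect` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy erase block: each page holds live data (bytes) or None if deleted."""
    pages: list = field(default_factory=list)


def garbage_collect(victim: Block, spare: Block, free_pool: list) -> int:
    """Copy live pages out of `victim`, erase it, and return it to the free pool.

    Returns the number of pages copied. This copying is internal work that
    competes with host reads and writes while collection runs.
    """
    live = [page for page in victim.pages if page is not None]
    spare.pages.extend(live)   # copy the good data elsewhere
    victim.pages.clear()       # block-level erase
    free_pool.append(victim)   # the block is reusable again
    return len(live)


# A block with two live pages and two deleted pages.
victim = Block(pages=[b"a", None, b"c", None])
spare = Block()
free_pool = []
copied = garbage_collect(victim, spare, free_pool)
```

The more live pages a victim block still holds, the more copying is needed before it can be erased, which is one reason the performance drop can be substantial.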

The researchers envisage a cluster of compute nodes, within which are compute servers and storage servers. They have a central controller running in the compute cluster, client software in the compute servers and agent software running in the storage servers. The storage servers contain commodity, off-the-shelf SSDs, each with their own internal controller.

Sandook concept diagram.

The Sandook controller maintains a registry of SSDs, each of which it has tested to produce a performance profile, such as its IOPS rating. It assigns a read or write mode to each SSD, and receives updates about each drive's performance every 200ms from the Sandook agents running in each storage server. It uses this information to compute read and write weights, adjusted for the cluster's global IO demand, and sends them to the Sandook client software running in the compute servers. These weights are called scheduling decisions.
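The article does not give the formula the controller uses, so the following is only a sketch of the general idea under a simple assumption: within each mode, an SSD's weight is proportional to its most recently reported IOPS. The function name `scheduling_weights` and the dictionary layout are hypothetical, and the adjustment for global IO demand is left out.

```python
def scheduling_weights(reported_iops, modes):
    """Return {"read": {ssd: weight}, "write": {ssd: weight}}.

    reported_iops: {ssd: latest IOPS figure reported by an agent}
    modes: {ssd: "read" or "write"}, as assigned by the controller
    Assumption: weights are proportional to reported IOPS within each mode.
    """
    weights = {"read": {}, "write": {}}
    for mode in weights:
        members = {s: iops for s, iops in reported_iops.items() if modes[s] == mode}
        total = sum(members.values())
        for ssd, iops in members.items():
            weights[mode][ssd] = iops / total if total else 0.0
    return weights


reported_iops = {"ssd0": 300_000, "ssd1": 100_000, "ssd2": 200_000, "ssd3": 200_000}
modes = {"ssd0": "read", "ssd1": "read", "ssd2": "write", "ssd3": "write"}
weights = scheduling_weights(reported_iops, modes)
```

In this example the faster read-mode drive, `ssd0`, receives three quarters of the read weight, while the two equally fast write-mode drives split the write weight evenly.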

The researchers specify that, for reads, Sandook builds upon block replication, which it already uses for fault tolerance, to provide flexibility in routing read requests among replicas on different SSDs. For writes, Sandook adopts a log-structured design, allowing writes to be directed to any SSD regardless of current block locations. This high degree of freedom guarantees that scheduling policies can be acted upon without restriction.
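To see why these two design choices give the scheduler freedom, consider a toy model with append-only logs and a block map. This is a sketch under stated assumptions, not Sandook's implementation: the `LogStructuredPool` class, its methods, and the `prefer` parameter are hypothetical.

```python
class LogStructuredPool:
    """Toy pool: each SSD is an append-only log; a map tracks block replicas."""

    def __init__(self, ssd_names):
        self.logs = {name: [] for name in ssd_names}
        self.block_map = {}  # logical block id -> list of (ssd, offset)

    def write(self, block_id, data, targets):
        """Append `data` to each target SSD, wherever the old copies live."""
        locations = []
        for ssd in targets:
            self.logs[ssd].append(data)
            locations.append((ssd, len(self.logs[ssd]) - 1))
        self.block_map[block_id] = locations  # older copies become stale

    def read(self, block_id, prefer=None):
        """Read from the preferred replica if it exists, else the first one."""
        locations = self.block_map[block_id]
        ssd, offset = next(
            (loc for loc in locations if loc[0] == prefer), locations[0]
        )
        return self.logs[ssd][offset]


pool = LogStructuredPool(["ssd0", "ssd1", "ssd2"])
pool.write(7, b"v1", targets=["ssd0", "ssd1"])
pool.write(7, b"v2", targets=["ssd1", "ssd2"])  # rewrite lands on different SSDs
latest = pool.read(7, prefer="ssd2")
```

Because a rewrite only appends and repoints the map, the caller is free to choose any target SSDs, and because every block has more than one replica, a read can be steered away from a busy drive.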

The Sandook client software presents a familiar block device interface to data access requesters, transparently routing storage requests to the most suitable SSDs based on the Sandook controller’s scheduling decisions. The client software also receives real-time SSD status from Sandook agent software running in the storage servers, and can then down-weight an individual SSD if it’s running a garbage collection process. Read and write requests then go to alternative SSDs, helping to ensure the overall storage system’s tail latency doesn’t become excessive.
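One way to picture this client-side behavior is weighted random selection with a temporary penalty for congested drives. The sketch below is illustrative only: the `GC_PENALTY` value, the function names, and the use of random selection are assumptions, not details taken from the research.

```python
import random

GC_PENALTY = 0.1  # hypothetical factor applied to an SSD reporting congestion


def effective_weights(controller_weights, congested):
    """Scale down the controller's weight for any SSD flagged as congested."""
    return {
        ssd: weight * (GC_PENALTY if ssd in congested else 1.0)
        for ssd, weight in controller_weights.items()
    }


def pick_ssd(controller_weights, congested, rng=random):
    """Choose a target SSD at random, in proportion to its effective weight."""
    weights = effective_weights(controller_weights, congested)
    ssds = list(weights)
    return rng.choices(ssds, weights=[weights[s] for s in ssds], k=1)[0]


controller_weights = {"ssd0": 0.5, "ssd1": 0.5}
congested = {"ssd1"}  # an agent has signalled garbage collection on ssd1
target = pick_ssd(controller_weights, congested)
```

With these numbers `ssd1` still receives a small share of requests, so the client keeps some traffic on it and restores its full weight as soon as the congestion signal clears.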

The Sandook agents have three roles. They (1) handle block read and write requests issued by Sandook clients, forwarding them to the corresponding SSD; (2) provide hardware-agnostic observability (periodic profiling and real-time signals), sharing this information with the Sandook controller to inform its decision making; and (3) send SSD congestion information to Sandook clients so that transient, short-term events, such as garbage collection on an SSD, can be handled almost instantly and locally by diverting reads and writes to other SSDs in the storage server.

The memory and CPU overhead of monitoring the performance of (typically tens of) SSDs on a single storage server is negligible.

What are the benefits of adding the three-component Sandook software to a compute cluster? The researchers tested four workloads:

  • LeanStore - a high-performance storage engine for on-line transaction processing, optimized for multi-core CPUs and NVMe SSDs.
  • Machine learning - training a Unet3D CNN model using PyTorch on a 180GB dataset.
  • LZ4 - image compression of ImageNet ILSVRC2015 dataset images using LZ4.
  • Storage server - a high-performance open-source block storage server and a latency-critical application.

Overall, Sandook achieves a 30 - 82 percent raw I/O throughput improvement over existing systems that each tackle a single source of performance variability, while maintaining sub-millisecond tail latency. For unmodified applications sharing a pool of SSDs, Sandook achieves a 12 - 94 percent improvement in end-to-end performance.

Specifically, compared to prior systems, it delivers 1.7x storage throughput, 1.12–1.94x application throughput, 71–88 percent lower latency, and 23 percent higher GPU utilization - without specialized hardware or application code changes.

The Sandook paper is entitled “Unleashing The Potential of Datacenter SSDs by Taming Performance Variability” and can be read (downloadable PDF) here. The research will be presented at the USENIX Symposium on Networked Systems Design and Implementation (NSDI 2026), held in Renton, Washington, May 4–6.