Published August 2020 | Version v1
Dissertation Open

Evolving Storage Stack for Predictability and Efficiency

Creators

  • 1. University of Chicago

Description

With the exponential growth of data, which is expected to reach 175 zettabytes by 2025, cloud storage is increasingly becoming the central hub for data management and processing. Among the many benefits cloud platforms promise, predictable performance and cost-efficiency are two fundamental factors driving the success of modern cloud storage. However, as modern cloud storage infrastructure changes rapidly in both software and hardware, new challenges emerge for achieving predictable performance efficiently. In more detail, modern data-intensive applications and the new wave of computing paradigms (e.g., data analytics, ML, serverless) drive the storage stack to undergo a radical shift towards more feature-rich software designs on top of increasingly heterogeneous architectures. As a result, today's cloud storage stack is extremely heavyweight and complex, burning 10-20% of data center CPU cycles and introducing severe performance non-determinism (i.e., long tail latencies). Unfortunately, the deployment of new acceleration hardware (e.g., NVMe SSDs and I/O co-processors) only partially addresses the problem. Due to the intrinsic complexities and idiosyncrasies of the hardware (e.g., NAND Flash management) and the lack of system-level support, it remains a challenge to design performant and cost-efficient cloud storage systems. In particular, achieving sub-millisecond latency predictability in a cost-efficient manner is the new battlefield. Rooted in a deep understanding and analysis of the existing software/hardware stack, this dissertation focuses on building new abstractions, interfaces, and end-to-end storage systems to achieve predictable performance and cost-efficiency using a software/hardware co-design approach. By revisiting the challenges across different layers in a holistic manner, the co-design approach opens up simple yet powerful system-level policy designs that opportunistically exploit hardware idiosyncrasies and heterogeneity.
The systems we build reduce latency spikes by up to orders of magnitude and increase revenue by 20x.

Files

Li_uchicago_0330D_15435.pdf

Files (3.6 MB)

Size: 3.6 MB
md5:9d2846a35dc5ef46d709a84d5a791185

Additional details

Identifiers

Other
oai:uchicago.tind.io:2642

UChicago Information

Division(s)
Physical Sciences Division
Department(s)
Computer Science