Abstract
As we look toward exascale, it is clear that high-capacity HPC storage systems will incorporate the large populations of hard disk drives that have previously been deployed only at cloud-level service providers. Further, with the rapid increase in network performance, the number of disks per storage server will need to increase dramatically to pair efficiently with current networking technology. With massive populations of disks integrated within local systems, the probability of correlated failures across a large number of components becomes a critical concern in preventing data loss. To guarantee data survival under catastrophic failures, this dissertation strengthens fault tolerance to improve system reliability, provides more flexibility for understanding system reliability, and further improves system reliability with less storage overhead.
The first part of this dissertation strengthens existing data protection schemes with higher fault tolerance. We present a novel declustered parity scheme, single-overlap declustered parity (SODP), that ensures at most one overlapping disk between any two stripesets. This maximizes the number of simultaneous disk failures tolerated and minimizes disk rebuild time by balancing parity stripes across disks. Rather than trading fault tolerance against rebuild performance, SODP takes a first step toward achieving both. Our evaluation shows that, compared to the state of the art, SODP can achieve a 30x improvement in the probability of data loss during failure bursts.
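The single-overlap property described above can be checked mechanically. The following Python sketch is illustrative only and is not the SODP construction from the dissertation: the function names and the grid-based layout are assumptions introduced here. It builds one well-known family of layouts with the property (the lines of a p x p grid of disks, for prime p) and verifies that no two stripesets share more than one disk.

```python
from itertools import combinations


def max_pairwise_overlap(stripesets):
    """Return the largest number of disks shared by any two stripesets."""
    return max(
        (len(set(a) & set(b)) for a, b in combinations(stripesets, 2)),
        default=0,
    )


def grid_layout(p):
    """Hypothetical single-overlap layout over p * p disks (p must be prime).

    Disk (r, c) has id r * p + c. Each stripeset is one "line" of the grid:
    either a row (fixed r), or the set of disks with c = (slope * r + offset) mod p.
    Any two distinct lines meet in at most one disk.
    """
    # Rows: fixed r, every c.
    stripesets = [[r * p + c for c in range(p)] for r in range(p)]
    # Sloped lines: one disk per row, column determined by slope and offset.
    for slope in range(p):
        for offset in range(p):
            stripesets.append(
                [r * p + (slope * r + offset) % p for r in range(p)]
            )
    return stripesets


layout = grid_layout(5)  # 25 disks, 30 stripesets of width 5
print(len(layout), max_pairwise_overlap(layout))

# A layout that violates the property: the two stripesets share disks 1 and 2.
print(max_pairwise_overlap([[0, 1, 2], [1, 2, 3]]))
```

In this toy layout every disk belongs to the same number of stripesets (p + 1), so rebuild work for a failed disk is spread evenly over the other disks. That evenness is the balancing effect the paragraph attributes to declustered parity.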
The second part of this dissertation provides the flexibility to understand how the interactions between fault tolerance and rebuild performance together impact system reliability. We design a practical and flexible tool, fractional-overlap declustered parity (FODP), to explore the trade-offs between the number of failure domains and rebuild performance. This gives us fine-grained control to accommodate different reliability requirements and system sizes. Furthermore, we introduce FODP-Plus-One, which adds parity on top of FODP to further protect data. Our detailed analysis shows that FODP and FODP-Plus-One yield a significant reduction in the probability and granularity of data loss under various failure regimes.
The third part of this dissertation explores how SODP/FODP can be integrated into tiered parity, which layers two levels of protection schemes on top of one another. With the tiered architecture, few established principles exist to guide system designs to tolerate both temporally and spatially correlated failures. This work systematically explores the design space for balancing fault tolerance and rebuild performance at each tier and evaluates how different data protection techniques impact system reliability under various failure regimes. Based on the analysis, we identify a set of design principles that storage architects can use to tolerate correlated failures. By applying these principles, we present a novel tiered parity scheme, Tiered FODP (TFODP), where the top tier is deployed with the minimal FODP technique for high fault tolerance and the bottom tier is designed with the maximal FODP to provide high rebuild performance. Our evaluation shows that TFODP can achieve higher system reliability with less storage overhead.