Tuesday, October 31, 2006

Paper: "EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors"

EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors
Junfeng Yang, Can Sar, and Dawson Engler, Stanford University


Storage systems such as file systems, databases, and RAID systems have a simple, basic contract: you give them data, they do not lose or corrupt it. Often they store the only copy, making its irrevocable loss almost arbitrarily bad. Unfortunately, their code is exceptionally hard to get right, since it must correctly recover from any crash at any program point, no matter how their state was smeared across volatile and persistent memory.

This paper describes EXPLODE, a system that makes it easy to systematically check real storage systems for errors. It takes user-written, potentially system-specific checkers and uses them to drive a storage system into tricky corner cases, including crash recovery errors. EXPLODE uses a novel adaptation of ideas from model checking, a comprehensive, heavy-weight formal verification technique, that makes its checking more systematic (and hopefully more effective) than a pure testing approach while being just as lightweight.

EXPLODE is effective. It found serious bugs in a broad range of real storage systems (without requiring source code): three version control systems, Berkeley DB, an NFS implementation, ten file systems, a RAID system, and the popular VMware GSX virtual machine. We found bugs in every system we checked, 36 bugs in total, typically with little effort.


Blogger Geoffrey Lefebvre said...

EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors
Junfeng Yang, Can Sar, and Dawson Engler, Stanford University

Presented by Junfeng Yang

Junfeng Yang began his talk by noting that storage systems errors are the worst kind of error since they can result in corruption of persistent state, possibly leading to permanent loss of data. This problem is compounded with the fact that storage systems are complicated, hard to get right and that it's hard to do comprehensive testing on them.

To address this issue, the authors present EXPLODE a system to find errors in storage system. EXPLODE borrows ideas from model checking by being comprehensive, but instead of running the checked system inside a model checker, EXPLODE runs client defined checkers inside the checked system, thus minimizing the changes required to the checked system. Running on a real system allows to easily check storage stacks. A file system checker can be used to find errors in storage layers either above or below the checked system. Another advantage of this approach is that checkers are easy to write. A checker for a storage system can be written in less than 200 lines of C++ and often consists of a simple wrapping layer around existing utilities such as mkfs and fsck. The authors state that this work completely subsumes their previous work (FiSC).

Because bugs are often triggered by corner cases, EXPLODE's core idea is to explore all choices. EXPLODE provides choose(N), A N-way fork, that allows checkers to fork at every decision point during testing and explore every possible operations. Before exploring a decision point, EXPLODE checkpoints the state of the system. A checkpoint is simply the recorded sequence of return values from choose(). To restore a checkpoint, EXPLODE deterministically replays this sequence from the initial state. These two mechanisms allow EXPLODE to perform an exhaustive state exploration. At any point during testing, a checker has the ability to force crashes. Upon a forced crash, EXPLODE generates crash disks based on all possible orderings of dirty buffers. A checker supplied routine then tests all disks for specific invariants.

A challenge faced by the authors was dealing with non-determinism. Care must be taken when choose() is called inside the kernel. The choose primitive could be called by non-checking code such as interrupt handler. EXPLODE solves this problem by filtering on thread ID. EXPLODE must deterministically schedule all threads involved with the checked system. This includes the checker's thread but also all threads belonging to the checked storage system. By playing with thread's priorities, EXPLODE is able to enforce deterministic scheduling most of the time and can detect when it fails to do so.

Junfeng then presented an evaluation of EXPLODE. The authors tested various file and storage systems such ext2, ext3, JFS, the VFS layer, NFS, Berkeley DB and VMWare with EXPLODE and found bugs in all of them.

Someone from UCSD asked about the option to not replay instrumented kernel functions and if doing so could lead to non-determinism. Junfeng answered that short traces were deterministic since calls to these kernel functions succeed most of the time but that long traces had non-determinism.

Someone asked if it was possible to test sub-components of file systems. Junfeng answered that this wasn't possible and another person asked how EXPLODE handled non-determinism created by disk actually performing operation out of order internally. Junfeng answered that EXPLODE uses a RAM disk which avoids this problem.

9:51 PM  

Post a Comment

<< Home