EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors
Junfeng Yang, Can Sar, Dawson Engler
Stanford University

Why check storage systems?
- Storage system errors are among the worst
  - kernel panic, data loss and corruption
- Complicated code, hard to get right
  - Simultaneously worry about speed, failures and crashes
  - Hard to comprehensively test for failures, crashes
- Goal: comprehensively check many storage systems with little work

EXPLODE summary
- Comprehensive: uses ideas from model checking
- Fast, easy
  - Check a storage system: 200 lines of C++ code
  - Main requirement: 1 device driver. Runs on Linux, FreeBSD
- General, real: check live systems
  - Can run, can check, even without source code
- Effective
  - Checked 10 Linux FS, 3 version control systems, Berkeley DB, Linux RAID, NFS, VMware GSX 3.2/Linux
  - Bugs in all, 36 in total, mostly data loss
- Subsumes our old work FiSC [OSDI 2004]

Checking complicated stacks
- All real
- Stack of storage systems
  - subversion: an open-source version control system
- User-written checker on top
- Recovery tools run after EXPLODE-simulated crashes
- (stack diagram: a subversion checker ("ok?") on top of subversion (%svnadmin recover), an NFS client looped back to an NFS server, JFS (%fsck.jfs), and software RAID1 (%mdadm --assemble --run --force --update=resync; %mdadm -a), over a checking disk and a crash disk)

Outline
- Core idea
- Checking interface
- Implementation
- Results
- Related work, conclusion and future work

Core idea: explore all choices
- Bugs are often triggered by corner cases
- How to find them: drive execution down into these tricky corner cases
- When execution reaches a point where the program can do one of N different actions, fork execution and in the first child do the first action, in the second do the second, etc.

External choices
- Fork and do every possible operation (figure: on a small FS under /root: creat, link, unlink, mkdir, rmdir, …)
- Users write code to check; EXPLODE "amplifies" the checks
- Explore the generated states as well
- Speed hack: hash states, discard if seen

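The exploration this slide describes is essentially an explicit-state search loop. A minimal sketch follows; State, apply_op, NUM_OPS and MAX_DEPTH are placeholders for illustration, not EXPLODE internals.

    #include <deque>
    #include <functional>
    #include <string>
    #include <unordered_set>
    #include <vector>

    struct State {
        std::vector<unsigned char> image;  // stand-in for a snapshot of the checked system
        int depth = 0;
    };

    size_t hash_state(const State& s) {
        return std::hash<std::string>()(std::string(s.image.begin(), s.image.end()));
    }

    const int NUM_OPS = 5;    // e.g. creat, link, unlink, mkdir, rmdir
    const int MAX_DEPTH = 4;  // bound the toy search so it terminates

    // Placeholder for "perform one external operation on a copy of the state".
    State apply_op(const State& s, int op) {
        State t = s;
        t.image.push_back((unsigned char)op);
        t.depth++;
        return t;
    }

    void explore(const State& initial) {
        std::deque<State> work{initial};
        std::unordered_set<size_t> seen{hash_state(initial)};
        while (!work.empty()) {
            State s = work.front();
            work.pop_front();
            if (s.depth >= MAX_DEPTH) continue;
            for (int op = 0; op < NUM_OPS; op++) {      // "fork": one child per operation
                State child = apply_op(s, op);
                // Speed hack from the slide: hash states, discard if already seen.
                if (seen.insert(hash_state(child)).second)
                    work.push_back(child);              // explore generated states as well
            }
        }
    }
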
Internal choices
- Fork and explore internal choices (figure: during a creat under /root: kmalloc returns 0, buffer cache misses)

How to expose choices
- To explore an N-choice point, users instrument the code using choose(N): a conceptual N-way fork that returns K in the K'th child

      void* kmalloc(size_t s) {
          if (choose(2) == 0)
              return NULL;   /* child 0: fail the allocation */
          …                  /* child 1: normal memory allocation */
      }

- We instrumented 7 kernel functions in Linux

Crashes
- Dirty blocks can be written in any order; a crash can happen at any point
- (figure: a creat under /root leaves dirty blocks in the buffer cache; write all subsets of them to disk, run fsck on each, then check)
- Users write code to check the recovered FS

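As a rough illustration of the "write all subsets" step: every subset of the dirty buffers is one disk the crash could have left behind. The helpers below are stand-ins for EXPLODE's RAM-disk machinery, the FS's own fsck, and the user's checker, so treat the whole thing as a sketch.

    #include <cstddef>
    #include <vector>

    struct Block { long blkno; std::vector<unsigned char> data; };

    // Hypothetical stand-ins for the real machinery.
    void write_subset_to_scratch_disk(const std::vector<Block>&) { /* copy blocks onto a scratch image */ }
    void run_fsck() { /* run the FS's recovery tool on the scratch image */ }
    bool user_check() { return true; /* user-supplied check of the recovered FS */ }

    void check_all_crashes(const std::vector<Block>& dirty) {
        size_t n = dirty.size();
        for (size_t mask = 0; mask < (size_t{1} << n); mask++) {
            std::vector<Block> subset;
            for (size_t i = 0; i < n; i++)
                if (mask & (size_t{1} << i))
                    subset.push_back(dirty[i]);
            write_subset_to_scratch_disk(subset);  // simulate one possible crash
            run_fsck();                            // run recovery
            if (!user_check()) {
                // record the failing subset so the crash can be replayed
            }
        }
    }
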
Outline
- Core idea: explore all choices
- Checking interface
  - What EXPLODE provides
  - What users do to check their storage system
- Implementation
- Results
- Related work, conclusion and future work

What EXPLODE provides
- choose(N): conceptual N-way fork, returns K in the K'th child execution
- check_crashes_now(): check all crashes that can happen at the current moment
  - The paper has more methods for checking crashes
  - Users embed non-crash checks in their code; EXPLODE amplifies them
- err(): record a trace for deterministic replay

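Read as a header, the interface described on this slide might look roughly like the declarations below; the slide does not give exact signatures, so these are assumptions for illustration only.

    /* explode.h -- hypothetical declarations; real signatures may differ */
    int  choose(int n);          /* conceptual N-way fork: returns K in the K'th child execution */
    void check_crashes_now();    /* check every crash that can happen at the current moment */
    void err(const char* msg);   /* record a trace of the current run for deterministic replay */
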
What users do
- Example: ext3 on RAID
- checker: drive ext3 to do something: mutate(); then verify that what ext3 did was correct: check()
- storage component: set up, repair, and tear down ext3 and RAID; written once per system
- assemble a checking stack

FS Checker: mutate
- mutate() drives ext3 to do something
- (figure: a choose() call picks an operation: creat file, sync, fsync, rm file, mkdir, rmdir; further choose() calls pick the target, …/0 through …/4)

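A minimal sketch of what such a mutate() could look like, assuming the hypothetical declarations above; the mount point, the choose() arguments, and the exact operation set are illustrative, since the slide only shows the shape of the choice tree.

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int  choose(int n);         // assumed EXPLODE interface (see earlier sketch)
    void check_crashes_now();

    void mutate() {
        char path[64];
        // Pick a target under an illustrative mount point: .../0 through .../4.
        std::snprintf(path, sizeof path, "/mnt/ext3-test/%d", choose(5));
        switch (choose(6)) {                         // pick an operation
        case 0: close(creat(path, 0666)); break;     // creat file
        case 1: sync(); check_crashes_now(); break;  // sync, then check all crashes now
        case 2: {                                    // fsync file
            int fd = open(path, O_RDWR);
            if (fd >= 0) { fsync(fd); close(fd); }
            break;
        }
        case 3: unlink(path); break;                 // rm file
        case 4: mkdir(path, 0777); break;            // mkdir
        case 5: rmdir(path); break;                  // rmdir
        }
    }
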
FS Checker: check
- Check the file exists
- Check the file contents match
- Found a JFS fsync bug, caused by reusing a directory inode as a file inode
- Checkers can be simple (50 lines) or very complex (5,000 lines)
- Whatever you can express in C++, you can check

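And a correspondingly small sketch of check(): verify that the files the checker believes it created still exist and hold the expected bytes. The expected_files bookkeeping and the err() signature are assumptions for the example.

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <map>
    #include <string>

    void err(const char* msg);  // assumed EXPLODE interface (see earlier sketch)

    // Filled in by mutate() as it successfully creates and writes files.
    std::map<std::string, std::string> expected_files;

    void check() {
        for (const auto& kv : expected_files) {
            const std::string& path = kv.first;
            const std::string& expected = kv.second;
            struct stat st;
            if (stat(path.c_str(), &st) != 0) {          // check the file exists
                err("file missing after recovery");
                continue;
            }
            int fd = open(path.c_str(), O_RDONLY);
            std::string actual(expected.size(), '\0');
            ssize_t n = (fd >= 0) ? read(fd, &actual[0], actual.size()) : -1;
            if (fd >= 0) close(fd);
            if (n != (ssize_t)expected.size() || actual != expected)
                err("file contents do not match");       // check the contents match
        }
    }
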
ext3 Component
- storage component: initialize, repair, set up, and tear down your system
- threads(): returns a list of kernel thread IDs, for deterministic error replay
- Wrappers to existing utilities
- Write once per system
- Real code on next slide

ext3 Component (code)
- (the slide shows the real ext3 storage-component code; it was not captured in this transcript)

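Since the real component code was lost in extraction, the following is only a reconstruction of its likely shape from the method list on the previous slide. The Component base class, the method names, the device and mount-point paths, and the use of shell utilities are all assumptions; only the idea of wrapping mkfs/mount/umount/fsck and reporting kernel thread IDs comes from the talk.

    #include <cstdlib>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for EXPLODE's storage-component interface.
    struct Component {
        virtual ~Component() {}
        virtual void init()   = 0;  // one-time setup, e.g. mkfs
        virtual void mount()  = 0;
        virtual void umount() = 0;
        virtual void repair() = 0;  // run the recovery utility after a simulated crash
        virtual std::vector<int> threads() = 0;  // kernel thread IDs, for deterministic replay
    };

    struct Ext3 : Component {
        std::string dev = "/dev/test-disk";   // illustrative device (EXPLODE supplies the disk)
        std::string mnt = "/mnt/ext3-test";   // illustrative mount point

        void init()   override { std::system(("mkfs.ext3 -F " + dev).c_str()); }
        void mount()  override { std::system(("mount -t ext3 " + dev + " " + mnt).c_str()); }
        void umount() override { std::system(("umount " + mnt).c_str()); }
        void repair() override { std::system(("fsck.ext3 -y " + dev).c_str()); }
        std::vector<int> threads() override {
            // e.g. the pid of this mount's kjournald thread, looked up after mount()
            return {};
        }
    };
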
Stack
- assemble a checking stack
- Let EXPLODE know how the subsystems are connected, so it can initialize, set up, tear down, and repair the entire stack
- Real code on next slide

Stack (code)
- (the slide shows the real stack-assembly code; it was not captured in this transcript)

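The stack-assembly code was likewise lost. As a purely illustrative sketch reusing the hypothetical Component/Ext3 classes above: a stack is just an ordered list of components that gets set up bottom-to-top, torn down top-to-bottom, and repaired bottom-to-top after a crash.

    #include <memory>
    #include <vector>

    struct CheckingStack {
        std::vector<std::unique_ptr<Component>> layers;  // bottom layer first

        void setup() {
            for (auto& c : layers) { c->init(); c->mount(); }        // bottom to top
        }
        void teardown() {
            for (auto it = layers.rbegin(); it != layers.rend(); ++it)
                (*it)->umount();                                     // top to bottom
        }
        void repair() {
            for (auto& c : layers) c->repair();                      // e.g. mdadm before fsck
        }
    };

    // ext3 on software RAID1: Raid1 would be another component wrapping mdadm,
    // written once, analogous to Ext3 above.
    //
    //   CheckingStack stack;
    //   stack.layers.push_back(std::make_unique<Raid1>());  // bottom
    //   stack.layers.push_back(std::make_unique<Ext3>());   // top
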
Outline
- Core idea: explore all choices
- Checking interface: 200 lines of C++ to check a system
- Implementation
  - Checkpoint and restore states
  - Deterministic replay
  - Checking process
  - Checking crashes
  - Checking "soft" application crashes
- Results
- Related work, conclusion and future work

Recall: core idea
- "Fork" at each decision point to explore all choices
- state: a snapshot of the checked system

How to checkpoint a live system?
- Hard to checkpoint live kernel memory; VM cloning is heavyweight
- Instead, checkpoint: record all choose() returns starting from the initial state S0
- restore: umount, restore S0, re-run the code, making the K'th choose() return the K'th recorded value
- Key to the EXPLODE approach
- (figure: S = S0 + redo(2, 3): a state is the initial state plus the recorded choice sequence)

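A tiny sketch of the "state = S0 + recorded choices" idea, with all names invented for the example: a checkpoint stores only the values choose() has returned since S0, and restore replays that script against a fresh copy of S0.

    #include <cstddef>
    #include <vector>

    struct Checkpoint {
        std::vector<int> choices;   // K'th entry = value the K'th choose() returned since S0
    };

    struct Replayer {
        std::vector<int> script;    // recorded choices to replay (empty for a fresh run)
        size_t next = 0;
        std::vector<int> trace;     // choices made so far, i.e. the next checkpoint

        // While the script lasts, return the recorded value; beyond it, the real
        // choose() is where execution branches into fresh children (0 here for brevity).
        int choose(int n) {
            int k = (next < script.size()) ? script[next++] : 0;
            trace.push_back(k);
            return k % n;
        }
        Checkpoint checkpoint() const { return Checkpoint{trace}; }
    };

    // restore(S): umount, restore the byte-identical initial disk image S0, then
    // re-run the checked code with Replayer{S.choices} so the K'th choose()
    // returns the K'th recorded value and execution retraces the same path.
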
Deterministic replay
- Needed to recreate states and diagnose bugs
- Sources of non-determinism:
  - Kernel choose() can be called by other code
    - Fix: filter by thread IDs; no choose() in interrupts
  - Kernel threads
    - Opportunistic hack: setting priorities; worked well
    - Can't use locks: deadlock (A holds the lock, then yields to B)
  - Other requirements in the paper
- Worst case: a non-repeatable error; automatically detect and ignore

EXPLODE: put it all together
- (architecture diagram: user code and the EXPLODE runtime at user level; EKM = EXPLODE device driver)

Outline
- Core idea: explore all choices
- Checking interface: 200 lines of C++ to check a system
- Implementation
- Results
  - Lines of code
  - Errors found
- Related work, conclusion and future work

EXPLODE core lines of code

  Kernel patch:     Linux 1,915 (+ 2,194 generated), FreeBSD 1,210
  User-level code:  6,323

- 3 kernels: Linux 2.6.11, 2.6.15, FreeBSD 6.0
- The FreeBSD patch doesn't have all functionality yet

Checkers lines of code, errors found

  Storage System Checked     Component        Checker     Bugs
  10 file systems            744              5,477       18
  Storage applications:
    CVS                      27               68          1
    "EXPENSIVE"              30               124         3
    Berkeley DB              82               202         6
    Subversion               (not ported yet)             1
  Transparent subsystems:
    RAID                     144              FS + 137    2
    NFS                      34               FS          4
    VMware GSX/Linux         54               FS          1
  Total                      1,115            6,008       36

Outline
- Core idea: explore all choices
- Checking interface: 200 lines of C++ to check a system
- Implementation
- Results
  - Lines of code
  - Errors found
- Related work, conclusion and future work

FS sync checking results
- (results table not captured; a marked entry indicates a failed check)
- Apps rely on sync operations, yet they are broken

ext2 fsync bug
- Events to trigger the bug: truncate A, creat B, write B, fsync B, crash!, fsck.ext2
- (memory vs. disk diagram: the block holding B's fsync'd data is A's old indirect block, which A's on-disk metadata still references)
- Bug is fundamental, due to ext2 asynchrony

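Written out as plain POSIX calls (paths illustrative; the crash and the fsck.ext2 run are simulated by EXPLODE), the triggering sequence is roughly:

    #include <fcntl.h>
    #include <unistd.h>

    void trigger_ext2_fsync_bug() {
        truncate("A", 0);            // truncate A: frees A's indirect block, but only in memory
        int fd = creat("B", 0666);   // creat B: the freed block gets reused for B
        write(fd, "data", 4);        // write B
        fsync(fd);                   // fsync B: B reaches disk, A's truncate still hasn't
        close(fd);
        // crash!  fsck.ext2 then recovers using A's stale on-disk metadata,
        // which still claims the block now holding B's data.
    }
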
Classic app mistake: "atomic" rename
- Atomically update file A to avoid corruption:

      fd = creat(A_tmp, …);
      write(fd, …);
      fsync(fd);
      close(fd);
      rename(A_tmp, A);

- Problem: rename guarantees nothing about data

Outline
- Core idea: explore all choices
- Checking interface: 200 lines of C++ to check a system
- Implementation
- Results: checked many systems, found many bugs
- Related work, conclusion and future work

Related work
- FS testing
  - IRON
- Static analysis
- Traditional software model checking
- Theorem proving
- Other techniques

Conclusion and future work
- EXPLODE
  - Easy: needs 1 device driver, simple user interface
  - General: can run, can check, without source
  - Effective: checked many systems, 36 bugs
- Future work:
  - Work closely with storage system implementers to check more systems and more properties
  - Smart search
  - Automatic diagnosis
  - Automatically inferring "choice points"
- The approach is general, applicable to distributed systems, secure systems, …