
071797a1e1eaaf672565b2156ce218fd.ppt
- Number of slides: 65
Efficient SAT/BDD-based Techniques for Predicate Abstraction
Shuvendu K. Lahiri, Microsoft Research, Redmond
Joint work with Thomas Ball, Randy Bryant, Byron Cook, Robert Nieuwenhuis, Albert Oliveras
Program analysis and abstraction
- Unbounded state space
  - Unbounded integers, arrays, heap
  - State exploration may not terminate
- Abstraction
  - Construct an overapproximation of program behavior
  - Abstract domain/operators ensure that the analysis terminates
Automatic predicate abstraction
- Graf & Saïdi, CAV '97
- Underlying framework
  - Abstract Interpretation, Cousot & Cousot '77
- Idea
  - Given a set of predicates P = {P1, …, Pk}
    - Formulas describing properties of the system state
  - Finite abstraction
    - Abstraction(s) = the subset of {P1, …, Pk} that holds on s
    - At most 2^k abstract states
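The abstraction function on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the predicate names, the two-variable state, and the small range of concrete states are all assumptions chosen for the example.

```python
from typing import Callable, Dict, FrozenSet

State = Dict[str, int]
Predicate = Callable[[State], bool]


def abstraction(state: State, predicates: Dict[str, Predicate]) -> FrozenSet[str]:
    """Return the names of the predicates that hold on the concrete state."""
    return frozenset(name for name, holds in predicates.items() if holds(state))


# Hypothetical predicates over integer variables x and y.
predicates = {
    "x < y": lambda s: s["x"] < s["y"],
    "x = 2": lambda s: s["x"] == 2,
}

# Map a small (assumed) range of concrete states to their abstract states.
abstract_states = {
    abstraction({"x": x, "y": y}, predicates)
    for x in range(-3, 4)
    for y in range(-3, 4)
}
```

With k = 2 predicates there can be at most 2^2 = 4 distinct abstract states, however many concrete states are explored.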
Predicate abstraction in practice
- Boolean programs from C programs
  - SLAM
- Software model checking
  - BLAST, MAGIC, …
- Loop invariant synthesis for arrays and lists
  - ESC-JAVA, …
- Distributed protocol verification
  - UCLID, Murphi, …
Definitions
- Predicates
  - Literals in some theory T
  - P = {x = 1, x = y, x < y + 2, f(x) = f(y) + 2, …}
- Formula
  - Boolean combination of predicates
  - Example: a Boolean combination of x = 1 and x < y + 2
Fundamental operation: predicate cover F_P(φ)
- P: set of predicates; φ: formula
- Predicate cover of φ
  - Weakest expression over P that implies φ
- Partitioning defined by the predicates
  - A minterm over P: a conjunction of the predicates Pi or their negations
Example
- P: {x < y, x = 2}
- φ: y > 1
- [Figure: the x-y plane partitioned into the minterms over P (x < y or not, x = 2 or x ≠ 2), with the region F_P(φ) marked]
Traditional approaches
- P: set of predicates; φ: formula
- Predicate cover F_P(φ): the weakest expression over P that implies φ
- Check which minterms imply φ
  - Use a decision procedure to check each implication
- Exponential number of decision procedure calls
Traditional approaches
- Large number of decision procedure calls
  - Worst case exponential in |P|
    - Exponential behavior often seen in practice
  - Each decision procedure call can be expensive
- Limits scalability
  - F_P(φ) is invoked a few thousand times during a single software verification run
  - Tools have to sacrifice precision for efficiency
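The explicit minterm-by-minterm approach can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the "decision procedure" is a brute-force check over a small bounded domain that stands in for a real theorem prover. It uses the predicates P = {x < y, x = 2} and the formula φ: y > 1 from the earlier example.

```python
from itertools import product

DOMAIN = range(-5, 6)  # assumption: a small bounded domain stands in for the integers


def implies(antecedent, consequent):
    """Stand-in decision procedure: check antecedent => consequent on every state in DOMAIN."""
    return all(
        consequent(x, y)
        for x, y in product(DOMAIN, repeat=2)
        if antecedent(x, y)
    )


def predicate_cover(predicates, phi):
    """Return the minterms (as polarity tuples) that imply phi, plus the number of checks made."""
    cover = []
    calls = 0
    for polarity in product([True, False], repeat=len(predicates)):
        # A minterm fixes every predicate to hold or not hold.
        def minterm(x, y, polarity=polarity):
            return all(p(x, y) == sign for p, sign in zip(predicates, polarity))

        calls += 1  # one decision procedure call per minterm: 2^|P| calls in total
        if implies(minterm, phi):
            cover.append(polarity)
    return cover, calls


predicates = [lambda x, y: x < y, lambda x, y: x == 2]
phi = lambda x, y: y > 1
cover, calls = predicate_cover(predicates, phi)
```

On this input the only minterm that implies φ is (x < y) ∧ (x = 2), and the loop makes 2^2 = 4 implication checks, which is the exponential cost the slide refers to.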
Overview of the talk
- Two approaches to predicate abstraction
  - Symbolic Decision Procedures
  - Satisfiability Modulo Theories (SMT) based
- Symbolic decision procedures (SDP)
  - [Lahiri, Ball, Cook CAV '05]
- SMT-based predicate abstraction
  - Eager [Lahiri, Bryant, Cook CAV '03]
  - DPLL(T) based [Lahiri, Oliveras, Nieuwenhuis CAV '06]
- Challenges ahead
Predicate Abstraction using Symbolic Decision Procedures
Overview of SDP
- Symbolic Decision Procedures
  - Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
Computing F_P(φ)
- F_P distributes over conjunction
  - F_P(φ1 ∧ φ2) = F_P(φ1) ∧ F_P(φ2)
  - First convert φ to an equivalent conjunctive normal form (CNF)
- Suffices to compute F_P(e1 ∨ e2 ∨ … ∨ en)
  - Each ei is a literal
- Rest of the talk: assume n = 1 (for simplicity)
  - Concentrate on computing F_P(e)
Decision Procedure (DP)
- Input
  - A set G = {g1, …, gm} of literals
  - A literal e
- Output
  - Is G ⇒ e valid?
  - Equivalently, is g1 ∧ … ∧ gm ∧ ¬e UNSAT?
  - Is G ∪ {¬e} UNSAT?
Symbolic Decision Procedure (SDP)
- Input
  - A set G = {g1, …, gm} of atomic expressions
  - An atomic expression e
- Output
  - Representation for {G' | G' ⊆ G, and G' ∪ {¬e} is UNSAT}
- "Symbolic" decision procedure
  - One run of SDP(G, e) represents an exponential (2^|G|) number of runs of DP(G', e), one for each subset G' ⊆ G
Predicate Abstraction and SDP
- PBar = {¬p | p ∈ P}
- SDP(P ∪ PBar, e) represents F_P(e)
  - F_P(e): all minterms over P ∪ PBar that imply e
  - SDP(P ∪ PBar, e): {G' | G' ⊆ P ∪ PBar, and G' ∪ {¬e} is UNSAT}
A Decision Procedure for Equality Logic
- Atomic expressions
  - x = y, x ≠ y
- Inference rules (R)
  - Reflexivity, Symmetry, Transitivity
  - Contradiction
    - x = y, x ≠ y ⊢ ⊥
- An inference rule generates a new expression from existing expressions
- Example: G = {a = b, b = c}; e: (a = c)
  - Start from G ∪ {¬e} = {a = b, b = c, a ≠ c}
  - Transitivity derives a = c from a = b and b = c
  - a = c together with a ≠ c yields ⊥
- Procedure: start from G ∪ {¬e} and apply R for lg(|G|) steps
  - If the result contains ⊥, answer UNSAT; otherwise answer SAT
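A saturation-based check of this kind can be sketched as below. This is an assumption-laden illustration rather than the procedure from the talk: the function name `dp_equality` is hypothetical, equalities are encoded as pairs of variable names, and the loop runs to a fixpoint instead of stopping after a fixed number of steps.

```python
def dp_equality(G, e):
    """Return True iff G ∪ {¬e} is unsatisfiable, i.e. the equalities in G imply e.

    G: iterable of (u, v) pairs meaning u = v; e: a pair (u, v) meaning u = v.
    """
    derived = set()
    for u, v in G:
        derived |= {(u, v), (v, u)}  # symmetry

    changed = True
    while changed:  # saturate under transitivity until nothing new is derived
        changed = False
        for u, w in list(derived):
            for w2, v in list(derived):
                if w == w2 and u != v and (u, v) not in derived:
                    derived.add((u, v))
                    changed = True

    u, v = e
    # Reflexivity handles u == v; otherwise a derived u = v contradicts ¬e.
    return u == v or (u, v) in derived


implied = dp_equality([("a", "b"), ("b", "c")], ("a", "c"))
not_implied = dp_equality([("a", "b")], ("a", "c"))
```

The first call reproduces the example G = {a = b, b = c}, e: a = c, where transitivity derives a = c and the contradiction rule fires.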
Symbolic DP for Equality Logic
- Example: G = {a = b, b = c, a = d, d = c}; e: (a = c)
- Modifications
  - Introduce a Boolean variable [g] for each expression g in G
    - Add "true" for ¬e
  - Construct a "shared" expression for the derivations
- [Figure: derivation DAG whose leaves [a = b], [b = c], [a = d], [d = c] and true label the expressions a = b, b = c, a = d, d = c and a ≠ c; a = c is derived both from (a = b, b = c) and from (a = d, d = c)]
- SDP(G, e)
  - The expression representing "⊥" after lg(|G|) steps
- Output
  - A shared Boolean expression with [.] variables in the leaves
SDP for Equality Logic
- Expression representing "⊥" after lg(|G|) steps
  - Shared expression for {G' | G' ⊆ G, and DP(G', e) reports UNSAT}
- Shared expression can be computed in polynomial time
  - Derivations repeated for lg(|G|) steps
  - Each step has at most |V|^2 atomic expressions
    - |V|: number of variables in G
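The symbolic construction can be sketched for equalities only. This is a hedged illustration, not the authors' implementation: the names `sdp_equality`, `mk_and`, `mk_or` and `evaluate` are hypothetical, expressions are plain nested tuples shared by reference, the variable `("var", i)` plays the role of [g_i], and the sketch assumes every equality relates two distinct variables. Each round doubles the derivation length covered, so ceil(lg |G|) rounds suffice.

```python
from itertools import combinations, product
from math import ceil, log2


def mk_and(a, b):
    """Conjunction with constant folding."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)


def mk_or(a, b):
    """Disjunction with constant folding."""
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)


def sdp_equality(G, e):
    """Build a shared expression that is true exactly for the subsets of G that imply e."""
    terms = sorted({t for eq in G for t in eq} | set(e))
    key = lambda u, v: (u, v) if u <= v else (v, u)
    expr = {pair: False for pair in combinations(terms, 2)}
    for i, (u, v) in enumerate(G):
        expr[key(u, v)] = mk_or(expr[key(u, v)], ("var", i))  # leaf [g_i]
    steps = ceil(log2(len(G))) if len(G) > 1 else 0
    for _ in range(steps):
        new = {}
        for (u, v), old in expr.items():
            acc = old
            for w in terms:  # transitivity through every intermediate variable w
                if w != u and w != v:
                    acc = mk_or(acc, mk_and(expr[key(u, w)], expr[key(w, v)]))
            new[(u, v)] = acc
        expr = new
    return expr[key(*e)]


def evaluate(node, assignment, memo=None):
    """Evaluate the DAG under a truth assignment to the [g_i] variables."""
    if isinstance(node, bool):
        return node
    memo = {} if memo is None else memo
    if id(node) not in memo:  # shared sub-expressions are evaluated once
        if node[0] == "var":
            memo[id(node)] = assignment[node[1]]
        elif node[0] == "and":
            memo[id(node)] = evaluate(node[1], assignment, memo) and evaluate(node[2], assignment, memo)
        else:
            memo[id(node)] = evaluate(node[1], assignment, memo) or evaluate(node[2], assignment, memo)
    return memo[id(node)]


G = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]
dag = sdp_equality(G, ("a", "c"))
selected = [bits for bits in product([False, True], repeat=len(G)) if evaluate(dag, bits)]
```

For the example G = {a = b, b = c, a = d, d = c} and e: a = c, the resulting expression is equivalent to ([a = b] ∧ [b = c]) ∨ ([a = d] ∧ [d = c]), so a single construction answers all 2^4 subset queries.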
SDP for other theories
- Bounded-depth saturating theory T
  - Decision procedure for T can be implemented by saturation
  - Provide a function Depth: G → Nat, to denote the max. depth to iterate R
- [Figure: starting from G ∪ {¬e}, apply R for Depth(G) steps; if the result contains ⊥ answer UNSAT, otherwise SAT]
SDP for other theories
- Equality with Uninterpreted Functions (EUF)
  - Expressions: f(x) = f(g(y)), x = f(z)
  - Depth(G) < 3m
    - m is the number of terms in G
  - Polynomial complexity of SDP
- Difference Logic (DIFF)
  - Expressions: difference constraints relating x and y + c (e.g. x ≤ y + c)
  - Depth(G) < lg(|G|)
  - Pseudo-polynomial complexity of SDP
    - Depends on the size of constants in G
Combining SDPs for two theories
- Extend the Nelson-Oppen method for combining decision procedures for two theories T1, T2
  - [Nelson, Oppen TOPLAS '79]
  - The decision procedures communicate via equalities over shared variables
- Given SDP1 and SDP2 for theories T1, T2
  - Disjoint signatures, convex theories
  - Each theory generates derivations of all equalities between variables
  - Complexity of the resultant SDP (for T1 ∪ T2) only increases linearly in the number of variables
Combining SDP for two theories
- [Figure: SDP1 (on G1) and SDP2 (on G2) run in alternation, exchanging derived equalities {x = y} over shared variables; N: number of shared variables]
Combining SDP for theories
- Combined SDP for EUF + DIFF
  - Pseudo-polynomial complexity
  - Important fragment of most program verification queries (especially in SLAM)
SDP to Predicate Abstraction
- Output of SDP is an expression DAG
  - Represents F_P(e)
  - Can be used directly to construct Boolean programs (with intermediate variables)
- To compute an explicit expression for F_P(e)
  - Construct a Binary Decision Diagram (BDD) from the SDP output, and enumerate prime implicants
  - BDDs crucial for exploiting the shared representation
Evaluation
- SLAM benchmarks
  - Generated 665 predicate abstraction queries from device driver verification
  - Decision procedure (Zapato) based approach: 27904 s
  - SDP based approach: 273 s
  - 100X speedup
Challenges
- SDP for other interesting theories and combinations
  - Linear arithmetic, non-convex theories
- Incremental SDPs
  - Useful for combining SDPs
- Output-sensitive predicate abstraction?
  - Complexity is polynomial in the number of minterms in the output
Conclusion
- Predicate abstraction via symbolic decision procedures
  - Polynomial algorithms for useful theories
- Modular combination of Symbolic Decision Procedures for theories
  - Can design SDP for each theory in isolation
- Simple prototype implementation
  - Promising results on SLAM queries
SMT-based predicate abstraction
Satisfiability Modulo Theories (SMT)
- SMT
  - Decide satisfiability of a (ground) first-order formula with respect to a background theory T
  - Example (EUF): g(a) = c ∧ (f(g(a)) ≠ f(c) ∨ g(a) = d) ∧ c ≠ d
- SMT solvers
  - Leverage the efficient Boolean search of Boolean satisfiability (SAT) solvers
SMT for predicate abstraction
- Input
  - A formula φ, a set of predicates P over a theory T
- Output
  - G_P(φ): external predicate cover of φ
  - Same as ¬F_P(¬φ)
- Main idea [Lahiri et al. CAV '03, Clarke et al. FMSD '04]
  1. Introduce fresh Boolean variables B = {b1, …, bn}
  2. Construct the formula φ ∧ (⋀i (bi ⇔ Pi))
  3. Enumerate all the models over B
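The projection onto B and its relation to F_P can be checked on a small example. This sketch rests on several assumptions: it reads the external predicate cover as the strongest expression over P implied by φ (the dual of F_P), it replaces the SMT solver with brute-force enumeration over a bounded domain, and the function names and the formula φ: y > x + 1 are invented for illustration.

```python
from itertools import product

DOMAIN = range(-5, 6)  # assumption: a small bounded domain stands in for the integers
STATES = list(product(DOMAIN, repeat=2))


def models_over_B(phi, predicates):
    """Project the models of phi ∧ ⋀(b_i ⇔ P_i) onto B = (b_1, ..., b_n)."""
    return {tuple(p(x, y) for p in predicates) for x, y in STATES if phi(x, y)}


def minterms_implying(phi, predicates):
    """F_P(phi): the minterms all of whose states (within DOMAIN) satisfy phi."""
    result = set()
    for m in product([False, True], repeat=len(predicates)):
        states = [(x, y) for x, y in STATES if tuple(p(x, y) for p in predicates) == m]
        if all(phi(x, y) for x, y in states):
            result.add(m)
    return result


predicates = [lambda x, y: x < y, lambda x, y: x == 2]
phi = lambda x, y: y > x + 1  # hypothetical formula

external_cover = models_over_B(phi, predicates)

# Duality check: the minterms NOT implying ¬phi are exactly the models over B.
all_minterms = set(product([False, True], repeat=len(predicates)))
dual = all_minterms - minterms_implying(lambda x, y: not phi(x, y), predicates)
```

Here both computations give the two minterms in which x < y holds, so the external cover of φ is simply x < y.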
Eager SMT techniques
- Methodology
  - Translates a (ground) formula into an equisatisfiable Boolean formula
  - Use off-the-shelf SAT solvers to check the satisfiability
- Tools: UCLID
- [Figure: equisatisfiable translation from φ(X, B) to φ_bool(A, B); A: variables introduced during translation]
Predicate abstraction using eager SMT techniques
- Methodology
  - [Lahiri, Bryant, Cook CAV '03]
  - Translates the (ground) formula φ ∧ (⋀i (bi ⇔ Pi)) into a Boolean formula φ_bool(A, B)
    - Equisatisfiable translation that also preserves solutions over the Boolean variables B
    - A: variables introduced during translation
  - Use off-the-shelf BDD or SAT solvers to perform AllSAT over B
- Implemented in UCLID
  - Uses SATQE (Kroening)
Advantage over explicit approach
- Single call to SAT-based quantification engine
  - Removes exponential number of calls to theorem prover
- Learning in incremental SAT
  - Retains conflict clauses across different solutions
- Leverage future advances in SAT
  - Without any change to the framework
Evaluation
- Compared with a black-box decision procedure based approach
  - Das, Dill and Park, CAV '99
- SLAM benchmarks
  - Device driver verification
  - Eager SMT technique improves 50-100X on many benchmarks
- Distributed protocol verification (UCLID)
  - Lahiri, Bryant VMCAI '04
  - Decision procedure (SVC/CVC) based approach unable to finish on most examples
    - > 10,000 theorem prover calls
Lazy SMT techniques
- Integrate a theory solver (T-solver) with a SAT solver
  - Lazily rule out T-inconsistent Boolean models using the theory solver
  - CVC-Lite, Verifun, MathSAT, Barcelogic, …
- Barcelogic tool
  - R. Nieuwenhuis and A. Oliveras, CAV '05
  - Optimizations (based on DPLL(T))
    1. Check partial Boolean models for T-inconsistency
    2. Upon T-inconsistency, use the explanation as a conflicting clause and perform backjump
    3. Theory (unit) propagation to generate implied facts
Predicate abstraction using lazy methods
- Lahiri, Nieuwenhuis, Oliveras CAV '06, using Barcelogic
- Enumerate all the models over B for ψ = [φ ∧ (⋀i (bi ⇔ Pi))]
  while ψ is T-satisfiable do
    1. M := T-model for ψ using the SMT solver
    2. M_B := project M onto B
    3. Consider ¬M_B as a conflicting clause
       1. Perform conflict analysis to generate a backjump clause
       2. Optionally add the backjump clause
    4. Backjump and continue
  return all models over B
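The enumeration loop can be approximated with blocking clauses, as in the following sketch. It is a deliberately simplified stand-in, not the Barcelogic algorithm: `find_model` is a hypothetical brute-force search over a bounded domain that replaces the SMT solver, each found projection is blocked outright, and the search restarts on every iteration instead of backjumping and reusing learned clauses. The formula φ: y > x + 1 is invented for the example.

```python
from itertools import product

DOMAIN = range(-5, 6)  # assumption: a small bounded domain stands in for the integers


def find_model(phi, predicates, blocked):
    """Stand-in solver: find a state satisfying phi whose projection onto B is not blocked."""
    for x, y in product(DOMAIN, repeat=2):
        if phi(x, y):
            projection = tuple(p(x, y) for p in predicates)
            if projection not in blocked:
                return (x, y), projection
    return None


def all_models_over_B(phi, predicates):
    """Enumerate every assignment to B that extends to a model of phi ∧ ⋀(b_i ⇔ P_i)."""
    blocked = set()  # each entry plays the role of one blocking clause
    solver_calls = 0
    while True:
        solver_calls += 1
        found = find_model(phi, predicates, blocked)
        if found is None:  # no unblocked model is left
            return blocked, solver_calls
        _, projection = found
        blocked.add(projection)  # rule out this assignment to B from now on


predicates = [lambda x, y: x < y, lambda x, y: x == 2]
phi = lambda x, y: y > x + 1  # hypothetical formula
models, solver_calls = all_models_over_B(phi, predicates)
```

The number of solver calls is one more than the number of models over B (three calls for two models here), in contrast to the 2^|P| implication checks of the explicit approach.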
Experimental results
- SLAM benchmarks
  - ~5 seconds on 665 benchmarks
  - > 100X improvement on SDP based approach
- Hardware and protocol benchmarks [UCLID]
  - 7 sets of benchmarks
  - 22X to 143X improvement over Eager-SMT based approach
- Linked list verification [Lahiri, Qadeer POPL '06]
  - 4 sets of benchmarks
  - 31X to 40X improvement over Eager-SMT based approach
- SDP-based technique not applied on the latter two classes
  - Need support for (sound) quantifier-reasoning
Hardware and protocol benchmarks
Benchmark    Preds  Eager (secs)  Lazy (secs)  # minterms  # cubes
Aodv            21           657          4.6        2916      458
Bakery          32           245           11         426      294
BRP             22           3.5          0.1          30       24
Cache_ibm       16            34          1.3         326      123
Cache_ibm2      26          1119           23        2238     1022
Dlx             23           335           13       30808     2704
OOO             25           921           36       10728      242
- # cubes: number of prime implicants in the BDD for the minterms
1. Theory propagation crucial for benchmarks with arithmetic
   - E.g. 17X slowdown in OOO without it
2. Reusing lemmas and clauses improves 1.5X to 3X on most examples
Conclusions
- Relatively easy to adapt an SMT solver to perform predicate abstraction
  - Clear benefit from leveraging learned clauses and not restarting the search after each model
- Improvements in SMT translate to the predicate abstraction case
Summary
- Symbolic decision procedures
  - Can construct DAG representation of output in polynomial time for useful theories
  - Modular combination of SDPs
  - Require more optimizations to make them practical
- SMT-based procedures
  - AllSAT using SAT solvers (Eager) or SMT solvers (Lazy)
  - Can leverage SMT solvers without much effort
  - Lazy approaches benefit from tighter SAT+theory reasoning
Challenges for predicate abstraction tools
- Predicate abstraction with non-ground formulas
  - Quantifiers were removed with simple instantiation techniques for UCLID/list verification benchmarks
- Generate partial models during AllSAT
  - Should improve the performance when the ratio of #minterms : #cubes is large
- Incremental refinement of approximations
  - Construct refined approximation of F_P(φ) from coarser approximations, without repeating work
  - Some initial directions in CAV '06 paper
- Refining the abstraction (incrementally) with monotonically increasing set of predicates
Questions?
Overview
- Predicate Abstraction
- Symbolic Decision Procedures (SDP)
  - Predicate abstraction
- SDP for Equality Logic
- Combining SDP for two theories
- Implementation and Results
- Related Work
Zap Overview
- [Ball, Lahiri, Musuvathi]
- Many automated program analysis tools require symbolic reasoning
  - e.g. unit testing, model checking, static analysis, …
- Support symbolic operations for such tools
  - Support richer operations, apart from validity checking
  - Support useful theories for program analysis
  - Leverage advances in SAT solving and theorem proving
- [Figure: client tools SLAM/SDV, MUTT, Zing and Boogie (unit testing, model checking, static analysis) built on top of the Zap theorem prover]
Symbolic Reasoning for Automated Software Analysis
- Validity / Satisfiability
- Model generation
  - Useful in test case generation
- Quantifier elimination
  - Image operation in model checking
- Abstract interpretation operations
  - Abstract transformers, join, widen
- Interpolants
  - For abstraction-refinement
Interesting Theories
- Equality with uninterpreted functions (EUF)
- Linear arithmetic
- Arrays
- Bounded integers
- Lists
- Sets
- Combine the symbolic operations for different theories
F_P(φ)
Evaluation
- SLAM benchmarks
  - Generated 665 predicate abstraction queries from device driver verification
  - Decision procedure based approach: 27904 s
  - SDP based approach: 273 s
  - 100X speedup
- Synthetic benchmark
  - Comparison with UCLID
  - More than 100X speedup
Related Work
- Decision procedure based
  - Calls a decision procedure to check implication with each minterm
  - [Das & Dill], [Saidi & Shankar], …
- Boolean quantifier elimination based
  - [Lahiri, Bryant, Cook, CAV '03], [Clarke et al., FMSD '04]
  - Performs predicate abstraction by quantifier elimination
  - Reduces restricted first-order quantifier elimination to Boolean quantifier elimination
Experimental Setup
- Symbolic method
  - Incremental SAT-based method
    - SATQE: simple extension to Zchaff
      - Built by Daniel Kroening at CMU
- Explicit method
  - Algorithm of Das, Dill & Park, CAV '99
    - Avoids exponential worst case in many cases in practice
    - Uses SVC as a decision procedure
- Device driver benchmarks from SLAM toolkit
  - Ball and Rajamani, MSR
  - Queries during construction of Boolean programs from C
Evaluation on SLAM-benchmarks
Example  # Preds  Explicit #Calls  Explicit time (sec)  Symbolic #Prop-vars  Symbolic SAT-based time (sec)
Dr. 10        19            >7576                >1000                  115                            9.9
Dr. 13        20            >7351                >1000                  234                           44.7
Dr. 15        23            >7237                >1000                  336                           68.2
Dr. 17        15             3041                  507                  105                            6.1
Dr. 3         13             2023                  355                  125                            7.0
- BDD based approach worse than SAT on larger benchmarks
Symbols
Challenges ahead