Reasoning with Propositional Knowledge: Frameworks for Boolean Satisfiability and Knowledge Compilation

Knot Pipatsrisawat, UCLA, May 12, 2010

Electronic design automation Planning Software verification Protein structure prediction

Sudoku Minesweeper Mastermind

Core computational Logic problem • Lots of constraints • A solution must satisfy all constraints

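Propositional Logic (Boolean Logic) • Variables (for example X, Y) take the value true or false • Logical operators: ∧ (and/conjoin), ∨ (or/disjoin), ¬ (negation) • Propositional formula: variables combined with these operators (the example uses A, B, C, D) • Conjunctive normal form (CNF): a conjunction of clauses, where each clause is a disjunction of literals (the example has four clauses over A, B, C, E, F, G, H)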
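A minimal sketch of how a CNF might be held in code, assuming the common integer-literal convention (a positive integer is a variable, a negative integer its negation). The formula `example_cnf` is illustrative and is not the one shown on the slide; the function names are hypothetical.

```python
from itertools import product

def satisfies(cnf, assignment):
    """True if every clause has at least one literal made true by assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_models(cnf):
    """Enumerate all satisfying assignments by trying every combination."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            models.append(assignment)
    return models

# Illustrative CNF (an assumption): (x1 or x2) and (not x1 or x3)
example_cnf = [[1, 2], [-1, 3]]
models = brute_force_models(example_cnf)
```

Enumeration is exponential in the number of variables, so this only serves to make the definitions concrete.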

Sudoku vs. CNF • Can we fill all squares? / Can we set all variables to satisfy all clauses? [Boolean satisfiability (SAT)] • How many ways can we fill this board? / How many solutions are there? • Does the top-left square always have to be ‘7’? / Is “A=true” implied by this formula? • Knowledge compilation: the CNF is compiled into a circuit of and/or nodes on which these questions are easy [Figure: and/or circuit over A to H]

Overview of Contributions • SAT – Model of state-of-the-art SAT algorithm [Ch. 3] – Proof-theoretic characterization of the algorithm [Ch. 4] – 2 techniques for improving the algorithm [Ch. 5] • Knowledge Compilation – New compilation language [Ch. 6] (includes OBDD as a special case) – Fundamental properties • Supported operations [Ch. 7] • Size lower and upper bounds [Ch. 8] • Compactness (relative to other languages) [Ch. 7]

Boolean Satisfiability

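State-of-the-Art SAT Solver [Figure: search tree branching on X, Y, Z with true (T) and false (F) edges; a branch ends in a conflict when a clause over X, Y, Z is falsified, and other branches end in a conflict or a solution]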

State-of-the-Art SAT Solver [Figure: the same search tree, with Z = true derived on a branch instead of being decided] • Unit resolution: from (X ∨ ¬Y ∨ Z), X=false, Y=true, derive Z = true
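The unit resolution step on this slide can be sketched as a small propagation loop. This is a minimal sketch, not the solver's implementation: literals are integers (negative means negated), and the clause is assumed to be (X ∨ ¬Y ∨ Z), the only polarity under which X=false and Y=true force Z=true. The function name `unit_propagate` is hypothetical.

```python
def unit_propagate(cnf, assignment):
    """Apply unit resolution until nothing changes.

    Returns (assignment, conflict_clause); conflict_clause is None when
    no clause is falsified.
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if abs(lit) not in assignment]
            if not open_lits:
                return assignment, clause  # every literal is false: conflict
            if len(open_lits) == 1:
                # unit clause: the remaining literal must be true
                assignment[abs(open_lits[0])] = open_lits[0] > 0
                changed = True
    return assignment, None

# X=1, Y=2, Z=3; assumed clause (X or not Y or Z) with X=false, Y=true
implied, conflict = unit_propagate([[1, -2, 3]], {1: False, 2: True})
```

Real solvers avoid rescanning every clause by using watched literals; the loop above favors clarity over speed.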

State-of-the-Art SAT Solver [Clause-Learning SAT Solver] [Figure: search tree on X, Y, Z in which the decision on X is not related to the conflict; branches end in conflicts and, after learning, in a solution] • Clause learning: Y and Z cannot both be false, so the conflict clause (Y ∨ Z) is learned

Clause Learning Conflict Current decisions X=true, Y=true + Some clauses Unit resolution ( X Y Z) ( Y W) ( Z W U) ( Y U) Conflicting implications Z=true, W=true, U=false,

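Clause Learning [Figure: a resolution derivation over the variables X, Y, Z, W, U, V; starting from some clauses of the formula, each step resolves two clauses into a resolvent until the conflict clause is reached] • Clause learning = resolution • The conflict clause is added to the formula and can be used to derive future conflict clauses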
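A single resolution step, the building block of the derivation on this slide, might look like the following. This is a sketch under the integer-literal convention (negative means negated); the clauses in the usage line are illustrative and are not the ones in the figure, and `resolve` is a hypothetical helper name.

```python
def resolve(clause_a, clause_b, var):
    """Resolve two clauses on var.

    clause_a must contain var and clause_b must contain -var; the
    resolvent is the union of the remaining literals.
    """
    if var not in clause_a or -var not in clause_b:
        raise ValueError("clauses do not clash on the given variable")
    return frozenset(set(clause_a) - {var}) | frozenset(set(clause_b) - {-var})

# (x1 or x2) resolved with (not x1 or x3) on x1 gives (x2 or x3)
resolvent = resolve([1, 2], [-1, 3], 1)
```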

Clause-Learning SAT Solver • Early decisions may be bad • Restarting allows the solver to avoid getting stuck • Conflict clauses are kept across restarts [Figure: search tree in which an early bad assignment to X and Y leads to repeated conflicts on W and Z]

Clause-Learning SAT Solver [Figure: search on X, Y, Z where each conflict yields a conflict clause, until false is derived]

Clause-Learning SAT Solver • The derivation of false is a resolution proof that the CNF is unsatisfiable (refutation proof)

UNSAT CNF → Clause-learning SAT solver [refutation generator] → Resolution refutation proof

OPEN PROBLEM: Are proofs in CLR as “short” as those in general resolution? • Proof system = a set of proofs • CLR: all refutation proofs producible by clause-learning SAT solvers • General resolution: all possible refutation proofs based on resolution

CLR p-simulates general resolution [Ch. 4, CP-09] • CLR: all refutation proofs producible by clause-learning SAT solvers • General resolution: all possible refutation proofs based on resolution

Implications • Theoretically: – Provides a proof-theoretic characterization of the clause-learning SAT algorithm – Shows that clause-learning SAT solvers are practical implementations of resolution • Practically: – Key components are highlighted – Limit of resolution-based reasoning reached • Better heuristics are needed • Going beyond resolution

Key Concepts 1. Empowerment – A property of conflict clauses that makes them powerful – A clause is empowering if it can be used to produce a new unit implication (every conflict clause is empowering) – Example: given (A ∨ B) ∧ (¬B ∨ C) ∧ (A ∨ ¬C ∨ D), the clause (A ∨ C) is not empowering, while (A ∨ D) is empowering 2. Restarting – Gives solvers the freedom to pursue short proofs
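The empowerment test can be sketched with unit resolution alone. The sketch below assumes the slide's example is (A ∨ B) ∧ (¬B ∨ C) ∧ (A ∨ ¬C ∨ D), a reconstruction, since the negation signs are not legible in the slide text, and it assumes the candidate clause is already implied by the formula. It reads "empowering" as: for some literal of the clause, asserting the negation of the other literals lets unit resolution derive neither that literal nor a conflict. Function names are hypothetical.

```python
def unit_propagate(cnf, assignment):
    """Return (assignment, conflict_found) after exhaustive unit resolution."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            open_lits = [l for l in clause if abs(l) not in assignment]
            if not open_lits:
                return assignment, True  # clause falsified
            if len(open_lits) == 1:
                assignment[abs(open_lits[0])] = open_lits[0] > 0
                changed = True
    return assignment, False

def is_empowering(cnf, clause):
    """Check empowerment of a clause that is assumed to be implied by cnf."""
    for lit in clause:
        # Assert the negation of every other literal in the clause.
        others = {abs(o): not (o > 0) for o in clause if o != lit}
        result, conflict = unit_propagate(cnf, others)
        if not conflict and result.get(abs(lit)) != (lit > 0):
            return True  # unit resolution alone cannot reach this literal
    return False

# A=1, B=2, C=3, D=4 (assumed polarities, see the note above)
cnf = [[1, 2], [-2, 3], [1, -3, 4]]
```

With these assumptions, `is_empowering(cnf, [1, 3])` is false because unit resolution already derives each literal of (A ∨ C) from the negation of the other, while `is_empowering(cnf, [1, 4])` is true because setting D=false does not let unit resolution derive A.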

Techniques for Faster SAT Solving • A new clause-learning algorithm [Chap. 5, AAAI-08] – Targets shorter clauses without compromising empowerment • A new restart policy [Chap. 5, SAT-09] – Encourages narrow proofs

A New Clause-Learning Algorithm [Figure: the resolution derivation used in conflict analysis, over the variables X, Y, Z, W, U, V, ending in the conflict clause; an intermediate resolvent is marked “This can be learned too”]

New Conflict Clauses Could be Shorter
Problem set            Avg. clause size ratio
DIFP                   0.33
DLX IQ UNSAT           0.49
FPGA                   0.64
FVP UNSAT              0.53
IBM                    0.43
PIPE OOO               0.48
SAT Competition 2007   0.41
SAT-Race 2006          0.42
VLIW UNSAT             0.56
New clauses are on average 33% the length of normal conflict clauses

Performance on UNSAT Instances [Figure: running time profile; x-axis: number of solved problems (600 to 900); y-axis: running time (s) (0 to 1800); legend entries: normal, Rsat, Rsat+Bi-AC, new] • Much worse than normal algorithm if we do not insist on empowerment! • New technique solved 30 more problems

A New Restart Policy • Existing policies: restart when too many conflicts (many conflicts = long proof) • Proof width = size of largest clause in proof • [Ben-Sasson & Wigderson ’01] Every CNF with a short proof must also have a narrow proof (hard to find) • Encourages the solver to find narrow proofs – There are only O(n^k) clauses of length k • Restart whenever many long (length > k) clauses have been learned
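The last bullet can be sketched as a small counter that the solver consults after each learned clause. This is only an illustration of the idea: the class name, the parameters `width_limit` (the k above) and `max_long_clauses`, and the choice to reset the counter on restart are assumptions, not details taken from the thesis.

```python
class WidthBasedRestartPolicy:
    """Signal a restart once many long clauses have been learned."""

    def __init__(self, width_limit, max_long_clauses):
        self.width_limit = width_limit            # k: clauses longer than this count as long
        self.max_long_clauses = max_long_clauses  # how many long clauses trigger a restart
        self.long_clauses = 0

    def on_learned_clause(self, clause):
        """Record a learned clause; return True if the solver should restart now."""
        if len(clause) > self.width_limit:
            self.long_clauses += 1
        if self.long_clauses >= self.max_long_clauses:
            self.long_clauses = 0  # assumed: counting starts over after a restart
            return True
        return False

policy = WidthBasedRestartPolicy(width_limit=2, max_long_clauses=2)
decisions = [policy.on_learned_clause(c) for c in ([1, 2, 3], [1, 2], [1, 2, 3, 4], [4, 5, 6])]
```

Short clauses never move the counter, so a run that keeps learning narrow clauses is left undisturbed.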

Performance on Grid-Pebbling Formula [Figure: running time (s) versus problem size]

Experimental Results: Industrial problems [Figure: problems solved versus time limit (s)]

Experimental Results: Combinatorial problems [Figure: problems solved versus time limit (s)]

Knowledge Compilation

Knowledge Compilation [Figure: a CNF over S, T, U, V, W, X, Y, Z is given to a compiler, which produces a compiled structure; an evaluator answers queries on the compiled structure in polytime]

Knowledge Compilation [Figure: the same compiler and evaluator pipeline, with an and/or circuit over A, B, C, D as the compiled structure] • What form to use for the compiled structure?

Compilation Language Evaluation • Polytime operations: satisfiability, validity, clausal entailment, sentential entailment, implicant testing, equivalence testing, model counting, model enumeration, projection (existential quantification), conditioning, conjoin, disjoin, negate • Succinctness [Figure: example and/or circuit over A, B, C, D]

Negation Normal Form (NNF) • Rooted DAG (circuit) [Figure: and/or circuit with literals over A, B, C, D at the leaves]

Decomposability • No two children of an and-node share a variable • Decomposable negation normal form (DNNF) [Figure: and/or circuit over A, B, C, D in which the children of each and-node are annotated with their variable sets]
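Decomposability is easy to state as a check over a circuit, and it is what makes satisfiability testing simple. The sketch below uses hypothetical node classes (`Lit`, `And`, `Or`) that are not from the thesis, and it recurses without caching, so a practical version would memoize results per node to stay linear on a DAG.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:
    var: str
    positive: bool = True

@dataclass(frozen=True)
class And:
    children: tuple

@dataclass(frozen=True)
class Or:
    children: tuple

def variables(node):
    """Set of variables mentioned below a node."""
    if isinstance(node, Lit):
        return frozenset([node.var])
    result = frozenset()
    for child in node.children:
        result |= variables(child)
    return result

def is_decomposable(node):
    """True if no two children of any and-node share a variable."""
    if isinstance(node, Lit):
        return True
    if isinstance(node, And):
        seen = set()
        for child in node.children:
            child_vars = variables(child)
            if seen & child_vars:
                return False
            seen |= child_vars
    return all(is_decomposable(child) for child in node.children)

def dnnf_satisfiable(node):
    """Satisfiability test that is only valid on decomposable NNF."""
    if isinstance(node, Lit):
        return True
    if isinstance(node, And):
        return all(dnnf_satisfiable(child) for child in node.children)
    return any(dnnf_satisfiable(child) for child in node.children)

good = And((Or((Lit("A"), Lit("B"))), Lit("C")))
bad = And((Lit("A"), Lit("A", False)))  # children share variable A
```

The satisfiability test works because the children of an and-node cannot contradict each other when they share no variables, so each child can be checked independently.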

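Conjoining is Hard for DNNF [Figure: one DNNF whose and-node splits the variables into {A, B, C, D} and {X, Y, Z, W}, and another whose and-node splits them into {A, X, B, Y} and {C, Z, D, W}; how to decompose their conjunction is marked “?”]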

Structured Decomposability [Chap. 6, AAAI-08] • Every and-node decomposes variables according to a global scheme [Figure: and/or circuit in which every and-node splits the variables into {A, B} and {C, D}]

Structured Decomposability • Vtree T: full binary tree with variables at leaves • DNNF respecting T = structured DNNF (SDNNF) [Figure: a vtree over A, B, C, D, E next to an SDNNF that respects it]
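One way to make "respecting a vtree" concrete is the check below. It is a sketch under stated assumptions: every and-node is required to have exactly two children, and an and-node is accepted when some internal vtree node has the left child's variables under its left subtree and the right child's variables under its right subtree. The encoding (nested pairs for vtrees, tagged tuples for circuit nodes) and all function names are hypothetical.

```python
def vtree_vars(vtree):
    """Variables at the leaves of a vtree given as nested pairs of names."""
    if isinstance(vtree, str):
        return frozenset([vtree])
    return vtree_vars(vtree[0]) | vtree_vars(vtree[1])

def internal_nodes(vtree):
    """Yield every internal (non-leaf) node of the vtree."""
    if isinstance(vtree, str):
        return
    yield vtree
    yield from internal_nodes(vtree[0])
    yield from internal_nodes(vtree[1])

def node_vars(node):
    """Variables below a circuit node ("lit", name, sign) or (kind, children)."""
    if node[0] == "lit":
        return frozenset([node[1]])
    result = frozenset()
    for child in node[1]:
        result |= node_vars(child)
    return result

def respects(node, vtree):
    """True if every and-node splits its variables as some vtree node does."""
    if node[0] == "lit":
        return True
    children = node[1]
    if node[0] == "and":
        if len(children) != 2:
            return False
        left, right = node_vars(children[0]), node_vars(children[1])
        if not any(left <= vtree_vars(v[0]) and right <= vtree_vars(v[1])
                   for v in internal_nodes(vtree)):
            return False
    return all(respects(child, vtree) for child in children)

vtree = (("A", "B"), ("C", "D"))
a, b, c = ("lit", "A", True), ("lit", "B", True), ("lit", "C", True)
structured = ("and", [("or", [a, b]), c])      # splits {A, B} from {C}
unstructured = ("and", [("and", [a, c]), b])   # decomposable, but splits {A, C} from {B}
```

The second circuit is decomposable yet does not respect this vtree, which illustrates that structured decomposability is a stronger requirement.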

Determinism • Every pair of children of an or-node are inconsistent (mutually exclusive) • Decomposability + determinism = d-DNNF • Structured decomposability + determinism = d-SDNNF [Figure: and/or circuit over A, B, C, D]
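Determinism is what makes model counting tractable: counts multiply at and-nodes and add at or-nodes. The following is a minimal sketch on a hypothetical tuple encoding (`("lit", name, sign)`, `("and", children)`, `("or", children)`); it assumes the input really is decomposable and deterministic and does not check either property. Counts are taken over the variables that appear below each node.

```python
def node_vars(node):
    """Variables mentioned below a node."""
    if node[0] == "lit":
        return frozenset([node[1]])
    result = frozenset()
    for child in node[1]:
        result |= node_vars(child)
    return result

def model_count(node):
    """Number of models over node_vars(node), assuming a d-DNNF input."""
    if node[0] == "lit":
        return 1
    children = node[1]
    if node[0] == "and":
        # Decomposability: children share no variables, so counts multiply.
        count = 1
        for child in children:
            count *= model_count(child)
        return count
    # Determinism: children are mutually exclusive, so counts add up once
    # each child is extended to the variables it does not mention.
    all_vars = node_vars(node)
    return sum(
        model_count(child) * 2 ** len(all_vars - node_vars(child))
        for child in children
    )

# (A and B) or (not A and C): the two branches disagree on A
example = ("or", [
    ("and", [("lit", "A", True), ("lit", "B", True)]),
    ("and", [("lit", "A", False), ("lit", "C", True)]),
])
```

For the example, each branch fixes two of the three variables and leaves one free, so each contributes two models.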

Ordered Binary Decision Diagram (OBDD) • OBDD is a special type of SDNNF [Figure: an OBDD over Q, R, S, T, U, V in traditional form, with high child (Q=true) and low child (Q=false), next to the same OBDD in NNF form and its linear vtree]

Supported Operations [Chap. 7, AAAI-08] • Polytime operations compared across DNNF, SDNNF, d-SDNNF, and OBDD • Queries: satisfiability, validity, clausal entailment, implicant testing, sentential entailment, equivalence testing, model counting, model enumeration • Transformations: conditioning, existential quantification, conjoin (bounded), disjoin (bounded), negate [Table: one row per operation and one column per language; three entries are marked “?”]

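Succinctness [Figure: succinctness diagram, from more succinct to less succinct, over DNNF, d-DNNF, SDNNF, d-SDNNF, FBDD, and OBDD; legend: an edge between L1 and L2 indicates that L1 is strictly more succinct than L2; separating functions: circular bit-shift function, indirect storage access function] • Results from Chap. 7 [AAAI-08]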

Practical Implications • Structured DNNF – Max-SAT – Symbolic SAT – Model-based diagnosis • Deterministic Structured DNNF – Bottom-up compilation – Model checking – Planning

Towards a Practical SDNNF Compiler • A good vtree is important for building a practical compiler for SDNNF/d-SDNNF – Good vtree → small SDNNF – Bad vtree → large SDNNF

Variable Ordering & OBDD • f = (A X) (B Y) (C Z) (D W) [Figure: OBDDs of f under the variable orders A, X, B, Y, C, Z, D, W and A, B, C, D, X, Y, Z, W]
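The effect of the variable order can be checked by brute force. The sketch below assumes f = (A ∧ X) ∨ (B ∧ Y) ∨ (C ∧ Z) ∨ (D ∧ W); the connectives are an assumption, since they are not legible in the slide text. It counts the internal nodes of the reduced OBDD by counting, at each level, the distinct subfunctions that depend on that level's variable. The function names are hypothetical.

```python
from itertools import product

def f(a):
    """Assumed function: (A and X) or (B and Y) or (C and Z) or (D and W)."""
    return ((a["A"] and a["X"]) or (a["B"] and a["Y"])
            or (a["C"] and a["Z"]) or (a["D"] and a["W"]))

def obdd_internal_nodes(func, order):
    """Count internal nodes of the reduced OBDD of func under the given order."""
    total = 0
    for i, var in enumerate(order):
        rest = order[i + 1:]
        subfunctions = set()
        for prefix in product([False, True], repeat=i):
            fixed = dict(zip(order[:i], prefix))
            rows = []
            for tail in product([False, True], repeat=len(rest)):
                env = dict(fixed, **dict(zip(rest, tail)))
                low = func(dict(env, **{var: False}))
                high = func(dict(env, **{var: True}))
                rows.append((low, high))
            # Keep the subfunction only if it really depends on var.
            if any(low != high for low, high in rows):
                subfunctions.add(tuple(rows))
        total += len(subfunctions)
    return total

interleaved = obdd_internal_nodes(f, ["A", "X", "B", "Y", "C", "Z", "D", "W"])
separated = obdd_internal_nodes(f, ["A", "B", "C", "D", "X", "Y", "Z", "W"])
```

Under this assumption the interleaved order needs 8 internal nodes and the separated order needs 30, which shows the gap between a linear and an exponential OBDD on just eight variables.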

Lower and Upper Bounds for OBDD [Sieling & Wegener ’93] • f = (A X) (B Y) (C Z) (D W) • Variable order: A, B, C, D, X, Y, Z, W • Subfunctions after assigning A, B, C: f|ABC, f|AB¬C, f|A¬BC, …, f|¬A¬B¬C

Decomposition of Boolean Functions • (X, Y)-decomposition of f: f(X, Y) = f1 ∨ f2 ∨ f3 ∨ … ∨ fm, where each fi = gi(X) ∧ hi(Y) • Example: f = (X1 ∧ Y1) ∨ (X2 ∧ Y2) ∨ (X2 ∧ Y3), with X = {X1, X2}, Y = {Y1, Y2, Y3} – Size = 2: (X1 ∧ Y1) ∨ (X2 ∧ (Y2 ∨ Y3)) – Size = 3: (X1 ∧ Y1) ∨ ((¬X1 ∧ X2) ∧ (Y2 ∨ Y3)) ∨ ((X1 ∧ X2) ∧ (¬Y1 ∧ (Y2 ∨ Y3)))
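The two example decompositions can be verified by enumeration. The sketch assumes the connectives are as follows (they are not legible in the slide text): f = (X1 ∧ Y1) ∨ (X2 ∧ Y2) ∨ (X2 ∧ Y3), and a decomposition is a disjunction of terms g(X) ∧ h(Y). Each decomposition is written as a list of (g, h) pairs; the helper names are hypothetical.

```python
from itertools import product

def f(x1, x2, y1, y2, y3):
    """Assumed example function."""
    return (x1 and y1) or (x2 and y2) or (x2 and y3)

# Size 2: (X1 and Y1) or (X2 and (Y2 or Y3))
size_2 = [
    (lambda x1, x2: x1, lambda y1, y2, y3: y1),
    (lambda x1, x2: x2, lambda y1, y2, y3: y2 or y3),
]

# Size 3: the same function with mutually exclusive terms
size_3 = [
    (lambda x1, x2: x1, lambda y1, y2, y3: y1),
    (lambda x1, x2: (not x1) and x2, lambda y1, y2, y3: y2 or y3),
    (lambda x1, x2: x1 and x2, lambda y1, y2, y3: (not y1) and (y2 or y3)),
]

def terms_true(pairs, x1, x2, y1, y2, y3):
    """Number of terms g(X) and h(Y) that hold under one assignment."""
    return sum(1 for g, h in pairs if g(x1, x2) and h(y1, y2, y3))

def is_decomposition_of(func, pairs):
    """True if the disjunction of the terms equals func on every input."""
    return all(
        (terms_true(pairs, *values) > 0) == bool(func(*values))
        for values in product([False, True], repeat=5)
    )

def is_deterministic(pairs):
    """True if at most one term holds under every assignment."""
    return all(
        terms_true(pairs, *values) <= 1
        for values in product([False, True], repeat=5)
    )
```

Both lists decompose the same function; only the size-3 one has mutually exclusive terms, which is the kind of decomposition a deterministic language would need.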

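Lower Bound for Structured DNNF • Setting: a DNNF of f that respects vtree T, and a vtree node v with Vars(v) = X [Figure: vtree T with node v separating X from Y, next to an and/or circuit over A, B, C, D] • The DNNF respecting T induces an (X, Y)-decomposition of f • Lower bound: DNNF size ≥ size of min. (X, Y)-decomposition of f [Chap. 8, AAAI-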

Example Application [Chap. 8, AAAI-10] • f = (X1 Y1) (X2 Y2) (X3 Y3) … (Xn Yn) • Lemma: For X = {X1, X2, …, Xn}, a minimal (X, Y)-decomposition of f has size Ω(2^n) • Vtree T with a node v such that Vars(v) = X • Any DNNF of f that respects T must have Ω(2^n) nodes

An Algorithm for Constructing SDNNF [Figure: for vtree T with node v separating X from Y, f(X, Y) is written as g1(X)h1(Y) or g2(X)h2(Y) or g3(X)h3(Y); each gi(X) and hi(Y) is then decomposed recursively at the child v’, which separates X’ from Y’, for example into g’1(X’)h’1(Y’) or g’2(X’)h’2(Y’)] • Does not commit to a specific type of decomposition

Upper Bound for Structured DNNF [Chap. 8, ECAI-10] [Figure: the recursive construction, with each or-node and and-node labeled by the vtree node (v, v’, v’’) at which it is created] • k_v = # recursive calls at v • m_v = max size of decompositions at v • # nodes at v ≤ k_v · m_v

Upper Bounds for Well-Known Functions • Odd/even parity functions – O(n) upper bound for ANY vtree • Threshold functions (true if at least k inputs are true) – O(nk^2) upper bound for ANY vtree • Totally symmetric functions – O(n^2) upper bound for ANY vtree

Relationship to OBDD Bounds • Lower bound: DNNF size ≥ size of min. X-decomposition of f • Upper bound: # nodes at v ≤ k_v · m_v • Both bounds become identical to Sieling & Wegener’s bounds when OBDD is considered

Conclusion • SAT – Model of state-of-the-art SAT algorithm – Proof-theoretic characterization of the algorithm – 2 techniques for improving the algorithm • Knowledge Compilation – New compilation language (includes OBDD as a special case) – Fundamental properties • Supported operations • Compactness (relative to other languages) • Size lower and upper bounds

Publications
• Knot Pipatsrisawat and Adnan Darwiche: On the Use of Logical Interactions for Establishing Decomposability. To appear in ECAI-10, Lisbon, Portugal.
• Knot Pipatsrisawat and Adnan Darwiche: Top-Down Algorithms for Constructing Structured DNNF: Theoretical and Practical Implications. To appear in ECAI-10, Lisbon, Portugal.
• Knot Pipatsrisawat and Adnan Darwiche: A Lower Bound on the Size of Decomposable Negation Normal Form. To appear in AAAI-10, Atlanta, Georgia, USA.
• Dan He, Arthur Choi, Knot Pipatsrisawat, Adnan Darwiche, Eleazar Eskin: Optimal Algorithms for Haplotype Assembly From Whole-Genome Sequence Data. To appear in ISMB 2010, Boston, MA, USA.
• Knot Pipatsrisawat and Adnan Darwiche: On Modern Clause-Learning Satisfiability Solvers. Journal of Automated Reasoning, Vol. 44, No. 3, pages 277-301, 2010.
• Knot Pipatsrisawat and Adnan Darwiche: On the Power of Clause-Learning SAT Solvers with Restarts. In Proceedings of CP'09 (best student paper award).
• Knot Pipatsrisawat and Adnan Darwiche: A New d-DNNF-Based Bound Computation Algorithm for Functional EMAJSAT. In Proceedings of IJCAI'09.
• Knot Pipatsrisawat and Adnan Darwiche: Width-Based Restart Policies for Clause-Learning Satisfiability Solvers. In Proceedings of SAT'09.
• Adnan Darwiche and Knot Pipatsrisawat: Complete Algorithms. In Handbook of Satisfiability. Armin Biere, Hans van Maaren, and Toby Walsh (eds.). IOS Press.
• Arthur Choi, Noah Zaitlen, Buhm Hahn, Knot Pipatsrisawat, Adnan Darwiche, and Eleazar Eskin: Efficient Genome Wide Tagging by Reduction to SAT. In Proceedings of the 8th Workshop on Algorithms in Bioinformatics (WABI), Universität Karlsruhe, Germany, September 2008, pages 135-147.
• Knot Pipatsrisawat and Adnan Darwiche: A New Clause Learning Scheme for Efficient Unsatisfiability Proofs. In Proceedings of the Twenty-third AAAI Conference on Artificial Intelligence, Chicago, USA, July 2008, pages 1481-1484.
• Knot Pipatsrisawat and Adnan Darwiche: New Compilation Languages Based on Structured Decomposability. In Proceedings of the Twenty-third AAAI Conference on Artificial Intelligence, Chicago, USA, July 2008, pages 517-522.
• Knot Pipatsrisawat, Akop Palyan, Mark Chavira, Arthur Choi and Adnan Darwiche: Solving Weighted Max-SAT in a Reduced Search Space: A Performance Analysis. Journal on Satisfiability, Boolean Modeling and Computation (JSAT), Volume 4 (2008), pages 191-217.
• Knot Pipatsrisawat and Adnan Darwiche: Clone: Solving Weighted Max-SAT in a Reduced Search Space. In Proceedings of the Twentieth Australian Joint Conference on Artificial Intelligence (AI 07), Queensland, Australia, December 2007, pages 223-233.
• Knot Pipatsrisawat and Adnan Darwiche: A Lightweight Component Caching Scheme for Satisfiability Solvers. In Proceedings of the Tenth International Conference on Theory and Applications of Satisfiability Testing (SAT), Lisbon, Portugal, May 2007, pages 294-299.

Acknowledgments • Ph.D. adviser: Adnan Darwiche • Committee: Richard Korf, Rupak Majumdar, Chih-Kong Ken Yang • Arthur Choi: general advice • Amarin Phaosawasdi: transportation • Ertan Dogrultan: setting up refreshments • Terry Valai: room & equipment

Thank You