"Report on Formalised Bit Vector Decision Procedures"

Written By Atticus Kuhn
Tags: "blog", "public", "project"
:PROPERTIES: :ID: aff79137-b7f4-4795-8838-9fe332a3fb9b :mtime: 20240902015919 20240816162042 20240816145043 20240816133252 20240816104252 20240815172443 :ctime: 20240815172442 :END: #+title: Report on Formalised Bit Vector Decision Procedures #+filetags: :blog:public:project: #+author: Atticus M. Kuhn #+OPTIONS: broken-links:mark * Introduction In this report, I will survey three [[id:f883fe8c-a2db-4033-8c01-93c9be7c20dc][decision procedure]]s on [[id:ecf57e78-a8a2-4af0-9746-bb84525fd8a5][bit vector]]s. #+include: ~/org/roam/20240815172704-decision_procedure.org :lines "8-" I will explain these [[id:f883fe8c-a2db-4033-8c01-93c9be7c20dc][decision procedure]]s, how they work, and also benchmark their rate of success on real-life benchmarks. * Tactics ** Bitwise Tactic The Bitwise tactic is based on the idea of [[id:5ea0735e-03b7-4183-88df-1c098c8b7fcb][bit blasting.]] The following [[id:9db1a2e1-0330-4905-892c-8ed3b088e6d0][Backus-Naur form]] grammar represents the terms that the bitwise tactic accepts #+BEGIN_SRC bnf bitwise_term ::= expr "=" expr expr :: = literal | expr "&&&" expr | expr "|||" expr | expr "^^^" expr | "~~~" expr #+END_SRC This is key: the bitwise tactic only works over equalities on a fixed/constant bit-width. Here is an example of an equality where the bitwise tactic would work, in Lean: #+BEGIN_SRC lean4 example (x y : BitVec 64) : (x ||| x) &&& y = y &&& x := by bv_decide #+END_SRC The bitwise tactic works by bit-blasting the expression into constraints for each bit, and then passing these constraints to a SAT solver. In this case, we have a verified SAT solver in Lean called Lean-SAT. ** Ring Tactic The Ring tactic works by putting both sides of the equality in ring-normal-form. Here is the [[id:9db1a2e1-0330-4905-892c-8ed3b088e6d0][Backus-Naur form]] of expressions that the ring tactic can handle: #+BEGIN_SRC bnf ring_term ::= expr "=" expr expr :: = literal | expr "+" expr | expr "-" expr | expr "*" expr | "-" expr | expr "<<<" literal -- x <<< c = x * 2 ^ c | "~~~" expr --- ~~~ x = - x - 1 #+END_SRC Here is an example of an equality that can be handled by the ring tactic in Lean: #+BEGIN_SRC lean4 example (w : Nat) (x y : BitVec w) : ~~~ (x + y) = ~~~ x + ~~~ y + 1 := by bv_ring #+END_SRC Note that the ring tactic can handle expressions over an arbitrary width. ** Automata Tactic The automata tactic may be compared by analogy to Presburger Arithmetic over the Integers. The automata tactic works by only considering operations that can be computed using a constant amount of memory scratch space. How, exactly, the automata tactic works is a complicated subject. Below is the [[id:9db1a2e1-0330-4905-892c-8ed3b088e6d0][Backus-Naur form]] of expressions accepted by the automata procedure: #+BEGIN_SRC bnf automata_term ::= expr "=" expr expr :: = literal | expr "+" expr | expr "-" expr | "-" expr | expr "<<<" literal -- x <<< c = x + x + x + .... + x | expr "*" literal -- x * c = x + x + x + .... + x | literal "*" expr -- c * x = x + x + x + .... + x | expr "&&&" expr | expr "|||" expr | expr "^^^" expr | "~~~" expr #+END_SRC For example, negation may be computed with one extra bit of scratch space, so it is handled by the automata. However, multiplication may take up scratch space dependent on the size of its inputs, so it is not handled in the general case by the automata tactic (although multiplication by a constant can just be re-written as repeated addition) * Benchmarks ** Hacker's Delight Benchmark #+include: ~/org/roam/20240816133158-hacker_s_delight.org :lines "8-" Some of the bit-twiddling theorems from [[id:adb55e19-89c4-4c5b-9988-3ad5be323083][Hacker's Delight]] were manually translated into Lean Theorems. These theorems came from chapters two and three of the book. We then collected data using our tactics. | | # | Not Equalities | Unsupported Operations | Succeeded | |-----------------+----+-----------------+------------------------+-----------| | Ring | 86 | 53 | 30 | 3 | | Bitwise | 86 | 53 | 29 | 4 | | Automata | 86 | 53 | 9 | 24 | | Ring + Automata | 86 | 53 | 9 | 24 | ** LLVM InstCombine Benchmark Instruction Combination, or InstCombine, is the largest part of the LLVM optimisations. In InstCombine, instructions are locally re-written in thousands of cheap re-writes. For example, InstCombie may re-write #+BEGIN_SRC c // before InstCombine y = 2 * x; // after InstCombine y = x + x; #+END_SRC Such an optimisation carries the implicit proof that ~2 * x = x + x~. In the InstCombine benchmark, the number of files is: #+BEGIN_SRC bash :exports results ls ~/coding/llvm-project-main/llvm/test/Transforms/InstCombine | wc -l #+END_SRC #+RESULTS: : 1473 Within those files, the number of functions is #+BEGIN_SRC bash :exports results grep "define" -R ~/coding/llvm-project-main/llvm/test/Transforms/InstCombine | wc -l #+END_SRC #+RESULTS: : 33558 We optimise these files using the shell command ~opt -passes=instcombine -S~. Using xDSL, we filter only those functions which contain the operations we are interested in (arithmetic and bitwise operations). From that, the number of Lean-understandable theorems we extract is: #+BEGIN_SRC bash :exports results grep "BEGIN" -R ~/coding/lean-mlir/SSA/Projects/InstCombine/tests/LLVM | wc -l #+END_SRC #+RESULTS: : 951 Of these, the number of theorems automatically solved by unfolding the definition is: #+BEGIN_SRC bash :exports results grep "try extract_goal" -R ~/coding/lean-mlir/SSA/Projects/InstCombine/tests/LLVM | wc -l #+END_SRC #+RESULTS: : 207 This means that the number of non-trivial theorems in this benchmark is: #+BEGIN_SRC bash :exports results grep "apply" -R ~/coding/lean-mlir/SSA/Projects/InstCombine/tests/LLVM | wc -l #+END_SRC #+RESULTS: : 744 Of those, we exclude some theorems because they take too long to typecheck. This gives us the final comparison: | | # | Not Equalities | Unsupported Operations | Succeeded | |-----------------+-----+-----------------+------------------------+-----------| | Ring | 717 | 179 | 478 | 58 | | Bitwise | 717 | 179 | 332 | 206 | | Automata | 717 | 179 | 293 | 245 | | Ring + Automata | 717 | 179 | 246 | 290 | Thus, on the InstCombine Benchmark, we see that ~automata~ solves the most theorems.

See Also

decision procedurebit vectordecision procedurebit blastingBackus-Naur formBackus-Naur formBackus-Naur formHacker's Delight

Leave your Feedback in the Comments Section