11:00 | First-Order Subsumption via SAT Solving ABSTRACT. Automated reasoners, such as SAT/SMT solvers and first-order provers, are becoming the backbone of formal-methods applications, for example in automating deductive verification, program synthesis, and security analysis. Automation in these domains crucially depends on the efficiency of the underlying reasoners in finding proofs and/or counterexamples for the task at hand. To gain efficiency, automated reasoners use dedicated proof rules to keep proof search tractable. Subsumption is one of the most important such rules, used by automated reasoners ranging from SAT solvers to first-order theorem provers and beyond. It is common for millions of subsumption checks to be performed during proof search, necessitating efficient implementations. However, in contrast to propositional subsumption, which SAT solvers implement using sophisticated polynomial algorithms, first-order subsumption in first-order theorem provers involves NP-complete search queries, making its efficient use a significant practical burden. In this paper we argue that integrating a dedicated SAT solver provides a remedy towards an efficient implementation of first-order subsumption and related rules, further increasing the scalability of first-order theorem proving towards applications of formal methods. Our experimental results demonstrate that, by using a tailored SAT solver within first-order reasoning, we gain a large speed-up on state-of-the-art benchmarks.
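As background for readers less familiar with the rule: a clause C subsumes a clause D if some substitution θ maps every literal of C onto a literal of D. The brute-force Python sketch below (illustrative only; the names are invented and this is not the paper's SAT-based approach) shows the check and why it is a search problem — the candidate literal mappings grow exponentially with the size of C.

```python
from itertools import product

def is_var(t):
    # Convention for this sketch: capitalized names are variables,
    # lowercase names are constants.
    return isinstance(t, str) and t[:1].isupper()

def match_lit(cl, dl, theta):
    """Extend substitution theta so that cl under theta equals dl, or return None."""
    if cl[0] != dl[0] or len(cl[1]) != len(dl[1]):
        return None
    theta = dict(theta)
    for s, t in zip(cl[1], dl[1]):
        if is_var(s):
            if theta.get(s, t) != t:   # conflicting binding for the same variable
                return None
            theta[s] = t
        elif s != t:                   # constants must match exactly
            return None
    return theta

def subsumes(C, D):
    """True iff some substitution maps every literal of C into D.
    Tries all |D|^|C| literal mappings -- the NP-complete search."""
    for targets in product(D, repeat=len(C)):
        theta = {}
        for cl, dl in zip(C, targets):
            theta = match_lit(cl, dl, theta)
            if theta is None:
                break
        if theta is not None:
            return True
    return False

# p(X, b) subsumes p(a, b) v q(a) via X -> a:
print(subsumes([("p", ("X", "b"))],
               [("p", ("a", "b")), ("q", ("a",))]))  # True
```

The exponential enumeration over literal mappings is exactly the search that the paper proposes to delegate to a SAT solver.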
11:25 | BaxMC: a CEGAR approach to MAX#SAT PRESENTER: Thomas Vigouroux ABSTRACT. The MAX#SAT problem is an important problem with multiple applications in security and program synthesis, and it is provably hard to solve. MAX#SAT is defined as follows: given a parameterized propositional formula, with an optional existential prefix, compute parameters such that the number of models of the formula is maximal. We propose a CEGAR-based algorithm and refinements thereof, relying on both exact and approximate model counting, and we provide formal guarantees about its correctness in both cases. Our experiments show that this algorithm has much better effective complexity than the state of the art. As another contribution, we give a simple proof of hardness of the exact problem and generalizations thereof (counting hierarchy).
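The definition above can be made concrete with a minimal brute-force sketch (illustrative only; BaxMC's CEGAR algorithm is far more sophisticated): enumerate parameter assignments and, for each, count the models over the remaining variables.

```python
from itertools import product

def max_count_sat(formula, params, count_vars):
    """Brute-force MAX#SAT: find the parameter assignment that maximizes
    the number of models over the counting variables.
    `formula` is a predicate over a dict of Boolean variable assignments."""
    best_count, best_params = -1, None
    for p_vals in product([False, True], repeat=len(params)):
        assign = dict(zip(params, p_vals))
        count = sum(
            formula({**assign, **dict(zip(count_vars, x_vals))})
            for x_vals in product([False, True], repeat=len(count_vars))
        )
        if count > best_count:
            best_count, best_params = count, assign
    return best_params, best_count

# phi = (p -> x1) and (x1 or x2): with p = False there are 3 models over
# (x1, x2); with p = True, x1 is forced and only 2 models remain.
phi = lambda v: (not v["p"] or v["x1"]) and (v["x1"] or v["x2"])
print(max_count_sat(phi, ["p"], ["x1", "x2"]))  # ({'p': False}, 3)
```

The doubly exponential enumeration is what makes the problem hard; CEGAR replaces it with counterexample-guided refinement of candidate parameter assignments.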
11:50 | INC: A Scalable Incremental Weighted Sampler ABSTRACT. The fundamental problem of weighted sampling involves sampling satisfying assignments of Boolean formulas, which specify sampling sets, according to distributions defined by pre-specified weight functions. The tight integration of sampling routines into various applications has highlighted the need for samplers to be incremental, i.e., samplers are expected to handle updates to weight functions. The primary contribution of this work is an efficient knowledge compilation-based weighted sampler, INC, designed for incremental sampling. INC builds on top of the recently proposed knowledge compilation language OBDD[AND] and is accompanied by rigorous theoretical guarantees. Our extensive experiments demonstrate that INC is faster than the state-of-the-art approach in the majority of the evaluations. In particular, we observed a median 1.69× runtime improvement over the prior state-of-the-art approach.
GAMBIT: An Interactive Playground for Concurrent Programs Under Relaxed Memory Models PRESENTER: Pankaj Kumar Kalita ABSTRACT. Concurrency bugs are challenging to diagnose. Further, the relaxed memory models implemented by ``real'' architectures and compilers make concurrency bugs still more mysterious. GAMBIT was designed as an interactive environment for debugging and repairing concurrency bugs under multiple relaxed memory models. GAMBIT analyzes supervised executions of concurrent programs using a theorem prover to answer queries regarding program behaviors under alternate thread schedulings or instruction reorderings on other relaxed memory models. Further, it provides repair capabilities to automatically synthesize required fences and atomic blocks. Our ongoing work attempts to retarget the symbolic encodings used above for pedagogical applications. Graduate-level courses like multiprocessor architecture, parallel programming, and advanced compilers discuss thread interleavings and relaxed memory models. However, such courses tend to be challenging for both students and instructors. We believe that students would appreciate a ``concurrency playground'' that allows them to explore various concurrency constructs in a controlled setting, examining them both in isolation and in composition. Instructors face the challenge of setting questions that test assimilation of certain topics, and of grading the solutions. Our efforts apply the power of modern theorem provers to the symbolic encodings of concurrent programs under relaxed memory models (already developed for our debug-and-repair tool) to attack the above problems.
Commutativity in Concurrent Program Verification ABSTRACT. Verification algorithms for concurrent programs can take advantage of commutativity between statements: verification of a representative subset of interleavings often succeeds with simpler proofs and with less time and memory consumption. The precise notion of commutativity as well as the selection of representatives are two crucial aspects of such verification algorithms. We develop verification algorithms that are parametric in these aspects, and theoretically analyse as well as empirically evaluate the impact of the chosen parameters on the verification.
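The notion of commutativity the abstract builds on can be illustrated by a tiny semantic check (a toy sketch with invented names, not the paper's algorithm): two statements commute if executing them in either order from any state yields the same state, so only one of the two orders needs to be explored.

```python
from itertools import product

def commute(stmt_a, stmt_b, states):
    """Check that two (deterministic, state-to-state) statements commute
    on every state in the given finite state space."""
    return all(stmt_a(stmt_b(s)) == stmt_b(stmt_a(s)) for s in states)

# Statements as pure functions on a state dict:
inc_x = lambda s: {**s, "x": s["x"] + 1}   # x := x + 1
inc_y = lambda s: {**s, "y": s["y"] + 1}   # y := y + 1
set_x = lambda s: {**s, "x": 0}            # x := 0

states = [{"x": x, "y": y} for x, y in product(range(3), repeat=2)]
print(commute(inc_x, inc_y, states))  # True: they write disjoint variables
print(commute(inc_x, set_x, states))  # False: both write x
```

Coarser or finer notions (e.g., commutativity only on reachable states) trade precision against the cost of the check — the kind of parameter the paper analyses.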
SEAURCHIN: Bounded Model Checking for Rust ABSTRACT. Rust is a systems programming language with similar applications to C and C++. The Rust compiler ensures that programs written in the “safe” subset of Rust possess certain safety properties, e.g., memory safety. At the same time, practical Rust programs contain “unsafe” code for performance or legacy reasons, for example when delegating to existing C libraries. The Rust compiler, rustc, does not type check unsafe code. Thus, Rust programs using unsafe code may contain memory safety issues in addition to the usual functional correctness problems. These errors are not restricted to the syntactically unsafe parts of the program but can manifest even in safe code that calls unsafe code. Bounded model checking is an effective way to enable whole-program analysis of Rust programs. Since Rust compiles to LLVM IR and SEAHORN provides a bounded model checker (SEABMC) for LLVM IR, it becomes feasible to use SEAHORN to verify Rust programs. We build a pre-processing pipeline for SEAHORN called SEAURCHIN and use it to ascertain conditions under which two standard library components, String and RefCell, do not panic at runtime. Interestingly, we find an invariant missing from the RefCell documentation whose absence leads to an overflow bug. Currently SEAURCHIN does not exploit ownership semantics (borrowed, owned, shared) from Rust to scale verification; some ideas for doing so are discussed. We conclude by comparing our approach to those taken by other state-of-the-art verification tools for Rust.
Proof Production for Neural Network Verification ABSTRACT. Deep neural networks (DNNs) are increasingly being employed in safety-critical systems, and there is an urgent need to guarantee their correctness. Consequently, the verification community has devised multiple techniques and tools for verifying DNNs. When DNN verifiers discover an input that triggers an error, that error is easy to confirm; but when they report that no error exists, there is no way to ensure that the verification tool itself is not flawed. As multiple errors have already been observed in DNN verification tools, this calls the applicability of DNN verification into question. In this work, we present a novel mechanism for enhancing Simplex-based DNN verifiers with proof production capabilities: the generation of an easy-to-check witness of unsatisfiability, which attests to the absence of errors. Our proof production is based on an efficient adaptation of the well-known Farkas’ lemma, combined with mechanisms for handling piecewise-linear functions and numerical precision errors. As a proof of concept, we implemented our technique on top of the Marabou DNN verifier. Our evaluation on a safety-critical system for airborne collision avoidance shows that proof production succeeds in almost all cases and requires only minimal overhead.
Trading Accuracy For Smaller Cardinality Constraints ABSTRACT. We propose a new approximate Boolean cardinality constraint encoding that trades accuracy for size. With the right trade-off, it is smaller than any existing encoding while maintaining generalized arc consistency through unit propagation. The accuracy, and with it the size of the encoding, can be adjusted.
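For context, the classical exact binomial encoding of an at-most-k constraint illustrates the size problem that smaller (and approximate) encodings address: it forbids every set of k+1 variables from being simultaneously true, requiring C(n, k+1) clauses. A minimal sketch using DIMACS-style integer literals (illustrative background, not the paper's encoding):

```python
from itertools import combinations

def at_most_k(variables, k):
    """Exact binomial encoding of sum(variables) <= k as CNF clauses.
    Literals are nonzero ints (DIMACS style); a negative literal means
    the variable is negated. Size: C(n, k+1) clauses."""
    return [[-v for v in subset] for subset in combinations(variables, k + 1)]

# "At most 2 of {x1, x2, x3}" needs a single clause forbidding all three:
print(at_most_k([1, 2, 3], 2))  # [[-1, -2, -3]]
```

For n = 100 and k = 50 this encoding already needs about 10^29 clauses, which is why compact (and, as in this paper, deliberately approximate) encodings matter.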
Formal Approach to Identifying Genes and Microbes Significant to Inflammatory Bowel Disease ABSTRACT. Research into inflammatory bowel disease (IBD) has increased exponentially due to the influx of new methods for data analysis. Previous work mainly focused on investigating either the microbiome or transcriptomic data. However, there may be additional benefit to a multi-omics approach that combines microbiome and gene expression data. We present a formal method that uses Boolean implication analysis as a CAD tool to identify microbes and genes important to IBD and its progression. Having previously studied microbiomes and gene expression individually, we describe a new approach to analyzing the combined data. This analysis may yield novel microbes and genes that can be used for disease diagnosis and drug development, with increased accuracy and efficiency due to the formal-method approach.
Understanding Trust Assumptions for Attestation in Confidential Computing ABSTRACT. Despite its critical role, remote attestation in Intel Software Guard Extensions (SGX) and Trust Domain Extensions (TDX) is poorly specified by Intel, with some obvious flaws. We believe this is part of Intel's strategic policy of resisting disclosure of the trust assumptions underlying the process.
On Optimizing Back-Substitution Methods for Neural Network Verification ABSTRACT. With the increasing application of deep learning in mission-critical systems, there is a growing need to obtain formal guarantees about the behaviors of neural networks. Indeed, many approaches for verifying neural networks have recently been proposed, but these generally struggle with limited scalability or insufficient accuracy. A key component in many state-of-the-art verification schemes is computing lower and upper bounds on the values that neurons in the network can obtain for a specific input domain — and the tighter these bounds, the more likely the verification is to succeed. Many common algorithms for computing these bounds are variations of the symbolic-bound propagation method; and among these, approaches that utilize a process called back-substitution are particularly successful. In this work, we present an approach for making back-substitution produce tighter bounds. To achieve this, we formulate and then minimize the imprecision errors incurred during back-substitution. We implement our approach as a proof-of-concept tool, and present favorable results compared to state-of-the-art verifiers that perform back-substitution.
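The benefit of back-substitution can be seen already for purely linear layers (a toy sketch with invented names, ignoring the ReLU handling that real verifiers need): composing the symbolic expressions through the network before concretizing to intervals yields tighter bounds than propagating intervals layer by layer.

```python
import numpy as np

def linear_bounds(W, b, l, u):
    """Concrete interval bounds for y = W @ x + b with x componentwise in [l, u]."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    lower = Wp @ l + Wn @ u + b
    upper = Wp @ u + Wn @ l + b
    return lower, upper

W1 = np.array([[1.0, 1.0], [1.0, -1.0]])
W2 = np.array([[1.0, 1.0]])
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])

# Layer-by-layer interval propagation loses the correlation between neurons:
l1, u1 = linear_bounds(W1, np.zeros(2), l, u)
loose = linear_bounds(W2, np.zeros(1), l1, u1)   # [-4, 4]

# Back-substituting the symbolic map W2 @ W1 = [2, 0] keeps it:
tight = linear_bounds(W2 @ W1, np.zeros(1), l, u)  # [-2, 2]
print(loose, tight)
```

Here the composed map depends only on x1, so back-substitution halves the bound width; the paper's contribution is minimizing the extra imprecision introduced when piecewise-linear activations force approximation during this process.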
Incremental Weighted Sampling ABSTRACT. The fundamental problem of weighted sampling involves sampling satisfying assignments of Boolean formulas, which specify sampling sets, according to distributions defined by pre-specified weight functions. The tight integration of sampling routines into various applications has highlighted the need for samplers to be incremental, i.e., samplers are expected to handle updates to weight functions. The primary focus of this line of work is an efficient knowledge compilation-based weighted sampler, INC, designed for incremental sampling. INC builds on top of the recently proposed knowledge compilation language OBDD[AND] and is accompanied by rigorous theoretical guarantees. Our extensive experiments demonstrate that INC is faster than the state-of-the-art approach in the majority of the evaluations. We discuss potential novel extensions of INC.
16:00 | Reconciling Verified-Circuit Development and Verilog Development ABSTRACT. In software development, verified compilers like the CompCert compiler and the CakeML compiler enable a methodology for software development and verification that allows software developers to establish program-correctness properties at the verified compiler's target level. Inspired by verified compilers for software development, the verified Verilog synthesis tool Lutsig enables the same methodology for Verilog hardware development. In this paper, we address how Verilog features that must be understood as hardware constructs, rather than as software constructs, fit into hardware development methodologies such as Lutsig's, which are inspired by the development methodology enabled by verified software compilers. We explore this issue by extending the subset of Verilog supported by Lutsig with one such feature: always_comb blocks. In extending Lutsig's Verilog support with this seemingly minor feature, we are, perhaps surprisingly, required to revisit Lutsig's methodology for circuit development and verification; this revisit, it turns out, requires reconciling traditional Verilog development with the traditional program-verification methodology offered by verified software compilers. All development for this paper has been carried out in the HOL4 theorem prover.
16:25 | Timed Causal Fanin Analysis for Symbolic Circuit Simulation ABSTRACT. Symbolic circuit simulation has been the main vehicle for formal verification of Intel Core processor execution engines for over twenty years. It extends traditional simulation by allowing symbolic variables in the stimulus, covering the circuit behavior for all possible values simultaneously. A distinguishing feature of symbolic simulation is that it gives the human verifier clear visibility into the progress of the computation during the verification of an individual operation, and fine-grained control over the simulation to focus only on the datapath for that operation while abstracting away the rest of the circuit behavior. In this paper we describe an automated simulation complexity reduction method called timed causal fanin analysis that can be used to carve out the minimal circuit logic needed for verification of an operation on a cycle-by-cycle basis. The method has been a key component of Intel's large-scale execution engine verification efforts, enabling closed-box verification of most operations at the interface level. As a specific application, we discuss the formal verification of Intel’s new half-precision floating-point FP16 micro-instruction set. Thanks to the ability of timed causal fanin analysis to separate the half-precision datapaths from full-width ones, we were able to verify all these instructions closed box, including the most complex ones like fused multiply-add and division. This led to early detection of several deep datapath bugs.
16:50 | Divider Verification Using Symbolic Computer Algebra and Delayed Don’t Care Optimization PRESENTER: Alexander Konrad ABSTRACT. Recent methods based on Symbolic Computer Algebra (SCA) have shown great success in formal verification of multipliers and -- more recently -- of dividers as well. In this paper we enhance known approaches by computing satisfiability don’t cares for so-called Extended Atomic Blocks (EABs) and by Delayed Don’t Care Optimization (DDCO) for optimizing polynomials during backward rewriting. Using these novel methods we are able to extend the applicability of SCA-based methods to further divider architectures, which could not be handled by previous approaches. We successfully apply the approach to the fully automatic formal verification of large dividers (with bit widths up to 512).
Social dinner in a beautiful historic winery in Trentino.