FMCAD 2022: FORMAL METHODS IN COMPUTER-AIDED DESIGN
PROGRAM FOR WEDNESDAY, OCTOBER 19TH

09:00-10:00 Session 11: Invited Talk 1

June Andronick. The seL4 verification journey: how have the challenges and opportunities evolved

Abstract. The formal verification journey of the seL4 microkernel is nearing two decades, and still has a busy roadmap for the years ahead. It started as a research project aiming at a highly challenging problem with the potential for significant impact. Today, a whole ecosystem of developers, researchers, adopters and supporters is part of the seL4 community. With increasing uptake and adoption, seL4 is evolving, supporting more platforms, architectures, configurations, and features. This creates both opportunities and challenges: verification is what makes seL4 unique; as the seL4 code evolves, so must its formal proofs. With more than a million lines of formal, machine-checked proofs, seL4 is the most highly assured OS kernel, with proofs of an increasing number of properties (functional correctness, binary correctness, security –integrity and confidentiality– and system initialisation) and for an increasing number of hardware architectures: Arm (32-bit), x86 (64-bit) and RISC-V (64-bit), with proofs now starting for Arm (64-bit). In this talk we will reflect on the evolution of the challenges and opportunities that seL4 verification has faced along its long, and continuing, journey.

10:30-12:00 Session 13: ML Verification
10:30
Proving Robustness of KNNs Against Adversarial Data Poisoning

ABSTRACT. We propose a method for verifying data-poisoning robustness of the k-nearest neighbors (KNN) algorithm, which is a widely-used supervised learning technique. Data poisoning aims to corrupt a machine learning model and change its inference result by adding polluted elements into its training set. The inference result is considered n-poisoning robust if it cannot be changed by up to n polluted elements. Our method verifies n-poisoning robustness by soundly overapproximating the KNN algorithm to consider all possible scenarios in which polluted elements may affect the inference result. Unlike existing methods, which only analyze the inference phase but not the significantly more complex learning phase, our method is capable of verifying the entire KNN algorithm. Our experimental evaluation shows that the proposed method is significantly more accurate than existing methods, and is able to prove robustness of KNNs generated from popular supervised-learning datasets.
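To make the robustness notion concrete, the following Python sketch conservatively checks n-poisoning robustness of a single prediction, covering only the inference phase (the paper's method additionally verifies the far more complex learning phase); the function name and the worst-case argument are illustrative, not the paper's algorithm:

    from collections import Counter

    def robust_prediction(neighbours, k, n):
        """Soundly check n-poisoning robustness of one k-NN prediction
        (inference phase only). neighbours: (distance, label) pairs of
        the clean training set. An adversary may inject up to n points
        arbitrarily close to the query, displacing the n farthest of
        the k clean nearest neighbours and voting for any rival label.
        Returns (prediction, is_robust); True is a guarantee, False may
        be a false alarm (over-approximation)."""
        ranked = sorted(neighbours)                  # nearest first
        prediction = Counter(l for _, l in ranked[:k]).most_common(1)[0][0]
        if n >= k:                                   # adversary controls every vote
            return prediction, False
        surviving = Counter(l for _, l in ranked[:k - n])
        best_rival = max((c for l, c in surviving.items() if l != prediction),
                         default=0)                  # a fresh rival label starts at 0
        # Robust iff no rival can reach a tie using its n injected votes.
        return prediction, best_rival + n < surviving[prediction]

    points = [(0.1, 'A'), (0.2, 'A'), (0.3, 'A'), (0.4, 'B'), (0.5, 'B')]
    print(robust_prediction(points, k=5, n=2))   # ('A', True)
    print(robust_prediction(points, k=5, n=3))   # ('A', False)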

10:55
On Optimizing Back-Substitution Methods for Neural Network Verification

ABSTRACT. In recent years, deep neural networks are increasingly being incorporated into mission-critical systems, highlighting the importance of verifying them. Indeed, many approaches for neural network verification have recently been proposed, but these generally struggle with limited scalability or insufficient accuracy. A key component in many state-of-the-art verification schemes is computing lower and upper bounds on the values that neurons in the network can obtain for a specific input domain --- the tighter these bounds, the smaller the search space that the verifier has to traverse in seeking a counter-example. Many commonly used algorithms for computing these bounds are variations of the symbolic-bound propagation method; and among these, approaches that utilize a process called back-substitution are particularly successful. In this paper, we present an approach for improving back-substitution. The approach is aimed at improving precision, by formulating, and then minimizing, the imprecision errors incurred during the run of back-substitution. We explore two variations of the approach, corresponding to different tightness/run-time trade-offs. The technique can be easily integrated into many existing, symbolic-bound propagation techniques, with only minor changes. We implement our approach as a proof-of-concept tool, and present highly favorable results compared to state-of-the-art verifiers.
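The precision gap that back-substitution closes can already be seen on a two-layer affine network: propagating concrete intervals layer by layer discards the correlations between neurons, while substituting the symbolic expression of the first layer into the second before concretizing does not. A minimal numpy sketch (illustrative only; the paper's contribution concerns minimizing the relaxation error that arises once ReLUs are involved):

    import numpy as np

    def affine_bounds(W, b, l, u):
        """Interval bounds of W x + b for x in the box [l, u]."""
        Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
        return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
    W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
    l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])

    # Naive layer-by-layer propagation: bound layer 1, then layer 2.
    l1, u1 = affine_bounds(W1, b1, l, u)
    l_naive, u_naive = affine_bounds(W2, b2, l1, u1)

    # Back-substitution: substitute the symbolic expression of layer 1
    # into layer 2 and only then concretize over the input box.
    l_back, u_back = affine_bounds(W2 @ W1, W2 @ b1 + b2, l, u)

    print(l_naive, u_naive)   # looser
    print(l_back, u_back)     # at least as tight (exact for affine layers)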

11:20
Verification-Aided Deep Ensemble Selection

ABSTRACT. Deep neural networks (DNNs) have become the technology of choice for realizing a variety of complex tasks. However, as highlighted by many recent studies, even an imperceptible perturbation to a correctly classified input can lead to misclassification by a DNN. This renders DNNs vulnerable to strategic input manipulations by attackers, and also oversensitive to environmental noise.

To mitigate this phenomenon, practitioners apply joint classification by an *ensemble* of DNNs. By aggregating the classification outputs of different individual DNNs for the same input, ensemble-based classification reduces the risk of misclassifications due to the specific realization of the stochastic training process of any single DNN. However, the effectiveness of a DNN ensemble is highly dependent on its members *not simultaneously erring* on many different inputs.

In this case study, we harness recent advances in DNN verification to devise a methodology for identifying ensemble compositions that are less prone to simultaneous errors, even when the input is adversarially perturbed --- resulting in more robustly-accurate ensemble-based classification.

Our proposed framework uses a DNN verifier as a backend, and includes heuristics that help reduce the high complexity of directly verifying ensembles. More broadly, our work puts forth a novel universal objective for formal verification that can potentially improve the robustness of real-world, deep-learning-based systems across a variety of application domains.

11:35
Neural Network Verification with Proof Production

ABSTRACT. Deep neural networks (DNNs) are increasingly being employed in safety-critical systems, and there is an urgent need to guarantee their correctness. Consequently, the verification community has devised multiple techniques and tools for verifying DNNs. When DNN verifiers discover an input that triggers an error, that is easy to confirm; but when they report that no error exists, there is no way to ensure that the verification tool itself is not flawed. As multiple errors have already been observed in DNN verification tools, this calls the applicability of DNN verification into question. In this work, we present a novel mechanism for enhancing Simplex-based DNN verifiers with proof production capabilities: the generation of an easy-to-check witness of unsatisfiability, which attests to the absence of errors. Our proof production is based on an efficient adaptation of the well-known Farkas’ lemma, combined with mechanisms for handling piecewise-linear functions and numerical precision errors. As a proof of concept, we implemented our technique on top of the Marabou DNN verifier. Our evaluation on a safety-critical system for airborne collision avoidance shows that proof production succeeds in almost all cases, and requires only minimal overhead.
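The certificate idea rests on Farkas' lemma: if a linear system Ax <= b is infeasible, there exists y >= 0 with y^T A = 0 and y^T b < 0, and checking such a witness takes only a few matrix operations. A minimal checker sketch (illustrative; the actual proof format, as the abstract notes, additionally handles piecewise-linear case splits and numerical error):

    import numpy as np

    def check_farkas_certificate(A, b, y, tol=1e-9):
        """Check that y witnesses infeasibility of {x : A x <= b}.
        y^T (A x) = 0 for every x, yet A x <= b would force
        y^T (A x) <= y^T b < 0, a contradiction."""
        y = np.asarray(y, dtype=float)
        return ((y >= -tol).all()
                and np.allclose(y @ A, 0, atol=tol)
                and y @ b < -tol)

    # x <= 1 together with -x <= -2 (i.e. x >= 2) is unsatisfiable;
    # y = (1, 1) combines the two rows into the absurdity 0 <= -1.
    A = np.array([[1.0], [-1.0]])
    b = np.array([1.0, -2.0])
    print(check_farkas_certificate(A, b, [1.0, 1.0]))   # True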

12:00-12:40 Session 14: Student forum talks - Part 1
Partial Order Reduction for Abstraction-Based Verification of Concurrent Software

ABSTRACT. Concurrency introduces a new level of complexity into software verification due to the great number of possible thread interleavings. Partial order reduction is an effective approach to handle concurrency in model checking. In this work, we present ways to integrate a dynamic partial order reduction algorithm into an abstraction-based verification process. The focus of my ongoing research is to exploit information about the applied abstraction when building a dependency relation on the operations of a program. If the source of dependency between certain operations is abstracted away, they need not be considered dependent. By weakening the dependency relation in the model, we hope to increase the reduction achieved by partial order reduction.
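A minimal sketch of the central observation, with illustrative names: two operations that conflict only on variables the current abstraction does not track can be treated as independent in the abstract model:

    def dependent(op1, op2, tracked):
        """op = (reads, writes): sets of variable names.
        tracked: variables in the current abstraction precision.
        Write-write and read-write conflicts on an untracked variable
        are invisible in the abstract model, so the operations may
        commute there."""
        r1, w1 = op1
        r2, w2 = op2
        conflict = (w1 & w2) | (w1 & r2) | (r1 & w2)
        return bool(conflict & tracked)

    # 'x' is abstracted away, so these writes are independent in the
    # abstract model and one of the interleavings may be pruned.
    print(dependent((set(), {'x'}), (set(), {'x'}), tracked={'y'}))  # False
    print(dependent((set(), {'x'}), (set(), {'x'}), tracked={'x'}))  # True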

Axiomatic Analysis of Distributed Systems

ABSTRACT. Safety-critical embedded software requires comprehensive proofs of safety. A mathematically precise way of ensuring freedom from bugs is formal verification, but it is often too complex to be applied to entire systems. However, embedded components seldom exist on their own, but rather as parts of hardware-software systems. Communication protocols used among such components have often been a source of insuperable complexity. This paper proposes a novel solution to this verification problem by extending concepts of weak memory modeling to communication protocols.

Lazy Abstraction for Probabilistic Systems

ABSTRACT. We present the idea of applying lazy abstraction-refinement to the verification of Markov Decision Processes. We combine an existing lazy abstraction-refinement framework that uses adaptive simulation graphs originally formulated for non-probabilistic systems with the Bounded Real-Time Dynamic Programming algorithm used for MDP analysis. Our goal with the proposed algorithm is to make the analysis of more complex probabilistic models possible.

A Formalization of Heisenbugs and Their Causes

ABSTRACT. Heisenbugs are bugs that disappear or change their behavior under observation and are hence notoriously difficult to detect and fix. They can be caused by various sources of non-determinism on different system levels, many of which developers and testers might not even be aware of. My ongoing research focuses on the development of a formal framework to identify, categorize, and reason about the causes of Heisenbugs. In the future, I would like to use this as a basis for developing tools for automated causal analysis of Heisenbugs.

Fine-Grained Reconstruction of cvc5 Proofs in Isabelle/HOL

ABSTRACT. The proof assistant Isabelle/HOL can call external solvers to automate proof search, which is crucial for using it more effectively. Some solvers also provide proofs that are reconstructed internally into native Isabelle/HOL proofs. cvc5 is an efficient satisfiability modulo theories (SMT) solver that is currently only indirectly used by Isabelle. Leveraging prior work on the reconstruction of SMT proofs in the Alethe proof format, we present a translation from cvc5's internal proof format into Alethe. Moreover, we extend the existing reconstruction to handle proofs in the theory of bit-vectors, which is frequently used in program verification.

Tuning the Learning of Circuit-Based Classifiers

ABSTRACT. Modern predictive systems often come as large models with high computational costs. Circuit-based models could therefore be a step toward efficiency. As part of the author's Master's thesis, we present our efforts to construct classifiers that work internally with binary variables.

Verification-Driven Ensemble Selection

ABSTRACT. Deep neural networks (DNNs) have become the technology of choice for realizing a variety of complex tasks. However, as highlighted by many recent studies, even an imperceptible perturbation to a correctly classified input can lead to misclassification by a DNN. This renders DNNs vulnerable to strategic input manipulations by attackers, and also oversensitive to environmental noise.

To mitigate this phenomenon, practitioners apply joint classification by an *ensemble* of DNNs. By aggregating the classification outputs of different individual DNNs for the same input, ensemble-based classification reduces the risk of misclassifications due to the specific realization of the stochastic training process of any single DNN. However, the effectiveness of a DNN ensemble is highly dependent on its members *not simultaneously erring* on many different inputs.

In this case study, we harness recent advances in DNN verification to devise a methodology for identifying ensemble compositions that are less prone to simultaneous errors, even when the input is adversarially perturbed --- resulting in more robustly-accurate ensemble-based classification.

Our proposed framework uses a DNN verifier as a backend, and includes heuristics that help reduce the high complexity of directly verifying ensembles. More broadly, our work puts forth a novel universal objective for formal verification that can potentially improve the robustness of real-world, deep-learning-based systems across a variety of application domains.

Formal Verification of Algebraic Effects

ABSTRACT. With the upcoming release of OCaml 5.0, which will see the addition of algebraic effects, we propose an extension of the Cameleer verification tool that can prove programs with these effects, using protocols and defunctionalization.

Lazy abstraction for time in eager CEGAR

ABSTRACT. Abstraction-based model checking, especially Counterexample-Guided Abstraction Refinement (CEGAR), has a prominent role in verifying software-based systems. However, traditional eager CEGAR has limitations: timed systems need efficient refinement and fine-grained precision to compute good abstractions, and visible-variable-based abstractions have proved too coarse. In this paper, we propose to combine the fine-grained precision of lazy abstraction for time with efficient data abstraction in the eager CEGAR loop.

Symbolic Coloured Model Checking for HCTL

ABSTRACT. Hybrid computation tree logic (HCTL) extends the branching-time temporal logic CTL. This extended expressiveness of HCTL allows specifying properties that play a crucial role in analysing biological systems, such as the existence of oscillations or multiple steady states within the system's dynamics. Usually, not all interactions in such systems are precisely known. This work presents a novel, entirely symbolic algorithm for model checking HCTL on systems with partially defined dynamics, suitable for the needs of systems biology.
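For instance, the existence of two distinct steady states can be expressed by binding two state variables and jumping to each of them (notation varies across HCTL presentations; this rendering is illustrative):

    ∃x. ∃y. @x (AX x) ∧ @y (AX y ∧ ¬x)

Here @x (AX x) says that every successor of the state bound to x is that state itself, i.e. x names a sink, and ¬x at y ensures the two sinks differ.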

Strategies for Parallel SMT Solving

ABSTRACT. SMT solving has not been effectively parallelized such that solving subproblems in parallel reliably outperforms the best single-threaded execution strategies. In order to effectively utilize the vast amount of computing resources available and solve previously intractable problems, we must partition SMT problems. In this extended abstract, we discuss partitioning algorithms and parallel execution strategies for SMT solving.
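As one concrete shape such a partitioning can take (a cube-and-conquer style split, sketched here with the z3 Python bindings; the extended abstract's own algorithms may differ): fixing the polarities of k Boolean variables yields 2^k mutually independent subproblems whose disjunction is equisatisfiable with the original formula.

    from itertools import product
    from z3 import And, Bool, Int, Not, Or, Solver, sat

    def partition(formula, split_vars):
        """Split a formula into 2^k independent subproblems by fixing
        the polarity of k Boolean variables (one cube per polarity
        vector). The original formula is satisfiable iff some cube is;
        in a real portfolio the cubes go to parallel workers."""
        for polarity in product([True, False], repeat=len(split_vars)):
            cube = [v if bit else Not(v) for v, bit in zip(split_vars, polarity)]
            s = Solver()
            s.add(formula, *cube)
            yield cube, s.check()

    x, p, q = Int('x'), Bool('p'), Bool('q')
    f = And(Or(p, q), Or(Not(p), x > 3), Or(Not(q), x < 0))
    results = list(partition(f, [p, q]))
    for cube, res in results:
        print(cube, res)
    print('overall:', 'sat' if any(r == sat for _, r in results) else 'unsat')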

14:00-15:30 Session 16: Hardware and RTL - Part 1
14:00
Formally Verified Isolation of DMA

ABSTRACT. Every computer with a network, USB or disk controller has a Direct Memory Access Controller (DMAC), which is configured by a driver to transfer data between the I/O device and main memory. A wrongly configured DMAC can therefore potentially leak sensitive data and overwrite critical memory to take over the system. Since DMAC drivers tend to be buggy (due to their complexity), these attacks are a serious threat.

This paper presents a general formal framework for modeling DMACs and verifying under which conditions they are isolated. These conditions can later be used as a specification for guaranteeing that a driver configures the DMAC correctly. Our framework establishes general isolation theorems that are common to all DMACs, leaving to the user only the task of verifying proof obligations that are specific to the DMAC at hand. This provides a reusable verification infrastructure that reduces the verification effort for different DMACs.

We demonstrate our framework by instantiating it with a USB DMAC.
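For flavor, an isolation condition of the kind such a framework would turn into proof obligations might state that every buffer descriptor handed to the DMAC stays inside the device's assigned memory window (a toy sketch, not the paper's model):

    def dma_isolated(descriptors, allowed_lo, allowed_hi):
        """Toy isolation condition: the DMAC only reads/writes memory
        inside [allowed_lo, allowed_hi). descriptors: (buffer_address,
        length) pairs a driver handed to the controller. A framework
        like the paper's discharges this per-DMAC as a proof
        obligation rather than a runtime check."""
        return all(
            allowed_lo <= addr and addr + length <= allowed_hi
            for addr, length in descriptors
        )

    # A descriptor reaching past the device's window violates isolation.
    print(dma_isolated([(0x1000, 0x100), (0x2000, 0x200)], 0x1000, 0x3000))  # True
    print(dma_isolated([(0x2F00, 0x200)], 0x1000, 0x3000))                   # False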

14:25
Foundations and Tools in HOL4 for Analysis of Microarchitectural Out-of-Order Execution

ABSTRACT. Program analyses based on Instruction Set Architecture (ISA) abstractions can be circumvented using microarchitectural vulnerabilities, permitting unwanted program information flows even when ISA-level properties rule them out. However, the low abstraction levels found below ISAs, e.g., in microarchitectures defined in hardware description languages, may obscure information flow and hinder analysis tool development. We present a machine-checked formalization in the HOL4 theorem prover of a language, MIL, that abstractly describes microarchitectural in-order and out-of-order program execution and enables reasoning about low-level program information flows. In particular, MIL programs can exhibit information flow side channels when executed out-of-order, as compared to a reference in-order execution. We prove memory consistency between MIL's out-of-order and in-order dynamic semantics in HOL4, and define a notion of conditional noninterference for MIL programs which rules out side channels. We then demonstrate how to establish conditional noninterference for programs via a novel semi-automated bisimulation based verification strategy inside HOL4, which we apply to several examples. Based on our results, we believe MIL is suitable as a translation target for ISA code to enable information flow analyses.

14:50
Synthesizing Instruction Selection Rewrite Rules from RTL using SMT

ABSTRACT. Creating a compiler for an instruction set architecture (ISA) requires a set of rewrite rules describing how to translate from the compiler’s intermediate representation (IR) to the ISA. We address this challenge by synthesizing rewrite rules from a register-transfer level (RTL) description of the target architecture (with minimal annotations about its state and the ISA format), together with formal IR semantics, by constructing SMT queries where solutions represent valid rewrite rules. We evaluate our approach on multiple architectures, supporting both integer and floating-point operations. We synthesize both integer and floating-point rewrite rules from an intermediate representation to various reconfigurable array architectures in under 1.2 seconds per rule. We also synthesize integer rewrite rules from WebAssembly to RISC-V with both standard and custom extensions in under 4 seconds per rule, and we synthesize floating-point rewrite rules in under 8 seconds per rule.
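The shape of these queries is exists-forall synthesis: a model assigns the rule's parameters so that the ISA instruction matches the IR semantics on every input. A toy instance with the z3 Python bindings (illustrative; the paper extracts the semantics from RTL rather than writing them by hand):

    from z3 import BitVec, BitVecVal, ForAll, Solver, sat

    # Exists-forall query: find an instruction parameter (the shift
    # amount s) such that the ISA-side term "x << s" implements the
    # IR-side term "x * 8" for every 32-bit input. A model of the
    # query is a valid rewrite rule.
    x = BitVec('x', 32)
    s = BitVec('s', 32)

    solver = Solver()
    solver.add(ForAll([x], x * BitVecVal(8, 32) == x << s))

    if solver.check() == sat:
        print(solver.model()[s])   # 3, i.e. the rule (mul x, 8) -> (shl x, 3)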

15:15
Error Correction Code Algorithm and Implementation Verification using Symbolic Representations

ABSTRACT. Error-correction codes (ECCs) are becoming a de rigueur feature in modern memory subsystems, as it becomes increasingly important to safeguard data against random bit corruption. ECC architecture constantly evolves towards designs that leverage complex mathematics to minimize check bits and maximize the number of data bits protected, as a result of which subtle bugs may be introduced into the design. These algorithms traverse a vast data space and are subject to corner case bugs which are hard to catch through constraint-based randomized testing. This necessitates formal verification of ECC designs to assure correctness of the algorithm and its hardware implementation. In this paper we present a technique for representing various ECC algorithm outputs as Boolean equations in the form of Binary Decision Diagrams (BDDs) to facilitate reasoning about the algorithms. We also discuss the counting and generation of examples from the BDD representations and how it aids in tuning ECC algorithms for performance and security. Additionally, we demonstrate the use of Symbolic Trajectory Evaluation (STE) to prove the correctness of register transfer level (RTL) implementations of these algorithms. We discuss the scaling up of this verification methodology, using different complexity and convergence techniques. We apply these techniques to a number of complex ECC designs at Intel and showcase their efficacy on several categories of bugs.
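As a toy stand-in for the designs discussed (the paper's codes and BDD tooling are far more involved), the check bits of a Hamming(7,4) code are exactly the kind of Boolean (XOR) equations one would build as BDDs in order to reason over all data words symbolically instead of one vector at a time:

    from itertools import product

    def hamming74_encode(d):
        """Hamming(7,4): three check bits as XOR equations over the
        four data bits; codeword layout p1 p2 d1 p3 d2 d3 d4."""
        d1, d2, d3, d4 = d
        p1 = d1 ^ d2 ^ d4
        p2 = d1 ^ d3 ^ d4
        p3 = d2 ^ d3 ^ d4
        return [p1, p2, d1, p3, d2, d3, d4]

    def correct_single_error(codeword):
        """Recompute the syndrome; a nonzero value is the 1-based
        position of the flipped bit."""
        c = list(codeword)
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 + 2 * s2 + 4 * s3
        if syndrome:
            c[syndrome - 1] ^= 1
        return c

    # Exhaustively check: every single-bit error in every codeword is fixed.
    for data in product([0, 1], repeat=4):
        cw = hamming74_encode(data)
        for i in range(7):
            flipped = cw.copy()
            flipped[i] ^= 1
            assert correct_single_error(flipped) == cw
    print("all single-bit errors corrected")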

16:00-17:20 Session 18: SAT and SMT - Part 1
16:00
Compact Symmetry Breaking for Tournaments

ABSTRACT. Isolators are a useful tool for reducing the computation needed to solve graph existence problems via SAT. We extend techniques for creating isolators for undirected graphs to the tournament (complete, directed) case, noting several parallels in properties of isolators for the two classes. We further present an algorithm for constructing n-vertex tournament isolators with O(n log n) unit clauses. Finally, we show the utility of our new isolators in computations of tournament Ramsey numbers.
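For intuition (the paper's O(n log n)-unit-clause construction is not reproduced here), a tournament existence problem needs only one Boolean variable per vertex pair, and even a very simple symmetry argument already yields sound unit clauses:

    def tournament_vars(n):
        """One DIMACS variable per unordered pair {i, j} with i < j;
        true means the edge is oriented i -> j. Every assignment to
        these variables is a tournament, so no well-formedness
        clauses are needed."""
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
        return {pair: k + 1 for k, pair in enumerate(pairs)}

    n = 6
    var = tournament_vars(n)

    # A sound but weak unit-clause isolator: transposing the two labels
    # of a disjoint pair flips only the edge inside that pair, so
    # without loss of generality that edge points from the smaller
    # label to the larger. This removes a factor of 2^(n//2) symmetric
    # assignments; the paper's isolators use O(n log n) unit clauses to
    # break far more of the symmetry.
    units = [[var[(i, i + 1)]] for i in range(0, n - 1, 2)]
    print(units)   # [[1], [10], [15]] -- DIMACS-style unit clauses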

16:25
Enumerative Data Types with Constraints

ABSTRACT. Many verification and validation activities involve reasoning about constraints over complex, hierarchical data types. For example, distributed protocols are often defined using state machines that govern the behavior of processes communicating with messages, which are hierarchical data types with state-dependent constraints and dependencies between component fields. Fuzzing, analyzing and evaluating implementations of such protocols requires solving complex queries that pose challenges to current SMT solvers. Generating fields that satisfy type constraints is one of the challenges, and it can be tackled using enumerative data types: types that come with an enumerator, an efficiently computable function from natural numbers to elements of the type. Enumerative data types were introduced in ACL2s as a key component of counterexample generation, but they do not handle constraints such as dependencies between types. We extend enumerative data types with constraints and show how this extension enables applications such as hardware-in-the-loop fuzzing of complex distributed protocols.
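A minimal example of the enumerator idea with an illustrative constraint: rather than generating arbitrary pairs and filtering, an enumerator for the type of ordered pairs builds the dependency between the fields into the construction, so every natural number maps to a valid element:

    from math import isqrt

    def unpair(n):
        """Inverse Cantor pairing: a bijection from naturals to pairs
        of naturals."""
        w = (isqrt(8 * n + 1) - 1) // 2
        t = w * (w + 1) // 2
        y = n - t
        return w - y, y

    def enum_ordered_pair(n):
        """Enumerator for the constrained type {(a, b) : a <= b}:
        instead of filtering arbitrary pairs, encode b as a plus a
        slack d >= 0, so the constraint holds by construction."""
        a, d = unpair(n)
        return a, a + d

    # Deterministic, efficiently computable, every output satisfies the
    # constraint -- suitable for generating protocol message fields.
    print([enum_ordered_pair(n) for n in range(6)])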

16:50
Reducing NEXP-complete problems to DQBF

ABSTRACT. We present an alternative proof of the NEXP-hardness of the satisfiability of "Dependency Quantified Boolean Formulas" (DQBF). Besides being simple, our proof also gives us a general method to reduce NEXP-complete problems to DQBF. We demonstrate its utility by presenting explicit reductions from a wide variety of NEXP-complete problems to DQBF, such as (succinctly represented) 3-colorability, Hamiltonian cycle, set packing and subset-sum, as well as NEXP-complete logics such as the Bernays-Schönfinkel-Ramsey class, the two-variable logic and the monadic class. Our results show the wide applicability of DQBF solvers, which have recently gathered a lot of attention among researchers.
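For reference, a DQBF makes the dependency sets of its existential variables explicit:

    ∀x1 ... ∀xn ∃y1(S1) ... ∃ym(Sm). φ,   where each Si ⊆ {x1, ..., xn}

Such a formula is satisfiable iff there are Skolem functions f_i, each reading only the variables in its S_i, whose substitution for the y_i makes φ valid. This freedom to restrict dependency sets is what raises satisfiability from PSPACE-complete (QBF) to NEXP-complete.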

17:05
Bounded Model Checking for LLVM

ABSTRACT. Bounded Model Checking (BMC) is an effective and precise static analysis technique that reduces program verification to satisfiability (SAT) solving. In this paper, we present the design and implementation of a new BMC engine (SEABMC) in the SEAHORN verification framework for LLVM. SEABMC precisely models arithmetic, pointer, and memory operations of LLVM. Our key design innovation is to structure verification condition generation around a series of transformations, starting with a custom IR (called SEA-IR) that explicitly purifies all memory operations by explicating dependencies between them. This transformation-based approach enables supporting many different styles of verification conditions. To support memory safety checking, we extend our base approach with fat pointers and shadow bits of memory to keep track of metadata, such as the size of a pointed-to object. To evaluate SEABMC, we have used it to verify the AWS-C-COMMON library from AWS. We report on the effect of different encoding options with different SMT solvers, and also compare with CBMC, SMACK, KLEE and SYMBIOTIC. We show that SEABMC is capable of providing an order-of-magnitude improvement compared with the state of the art.
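To illustrate what purifying memory operations buys (a sketch in the SMT theory of arrays via the z3 Python bindings, not SEA-IR itself): each store produces a fresh memory value, so the dependency of every load on prior stores becomes explicit in the term structure, and verification conditions fall out directly.

    from z3 import Array, BitVec, BitVecSort, Select, Solver, Store

    # Impure LLVM-style code:  store p 7; store q 5; x = load p
    # Purified form: every store yields a fresh memory value, making
    # the load's dependency on both stores explicit.
    Addr, Word = BitVecSort(64), BitVecSort(32)
    mem0 = Array('mem0', Addr, Word)
    p, q = BitVec('p', 64), BitVec('q', 64)

    mem1 = Store(mem0, p, 7)        # mem1 = mem0[p := 7]
    mem2 = Store(mem1, q, 5)        # mem2 = mem1[q := 5]
    x = Select(mem2, p)             # x = mem2[p]

    # Verification condition for "x == 7": it can only fail if p and q
    # alias, which the solver reports as a counterexample.
    s = Solver()
    s.add(x != 7)
    print(s.check())                # sat
    print(s.model())                # a model where p == q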