previous day
next day
all days

View: session overviewtalk overview

09:00-10:00 Session 21: Invited Talk (SAT 2015)


Pragmatic Approach to Formal Verification

ABSTRACT. After more than two decades of hard work by researchers in academia and industry, formal methods have been accepted as a viable part of the hardware design and validation process. We now have a better understanding of what is a cost-effective use of formal methods, and companies even set aside some resources for the further development of formal tools. Spreading formal methods into industrial scale software verification broadened the user population and increased motivation for development of such tools.

Centaur Technology is one of the companies that adopted formal verification (FV) as a part of their production flow. Our company designs Intel compatible x86-64 microprocessors. It does it with a relatively small team. To assure the quality of the design, a lot of effort is spent in the process of validation. Since new additions to x86-64 architecture widened the data on which instructions are performed, classic simulation provides even less coverage with respect to all possible inputs to the system than a few years ago.  The FV team at Centaur Technology was created as a reaction to this trend as well as the fact that the capacity of formal tools has reached a level where they can be successful even on industrial scale designs. There are also more publicly available off-shelf formal point tools (SAT, SMT, Model-checkers, etc.) that can be incorporated into more complex validation framework.  A pilot project that discovered a corner case bug in floating-point arithmetic was convincing enough to justify investing in a small FV team.

Our verification framework is built on top of the ACL2 theorem prover. There are many decision procedures built in the logic of ACL2 and proved correct within this logic.  For example, packages exist, defined inside the logic, for Binary Decision Diagrams (BDD) and for And-Inverter Graph (AIG) manipulation . It has a symbolic simulator called GL that can automate the proof of theorems over finite domains.  GL can be combined with word-level symbolic simulation of a hardware model to relate that model to its specification.

While we strive for rigorous analysis, our verification approach is very pragmatic.  We connected some ``trusted'' tools to ACL2, for instance various Satisfiability solvers (e.g., Glucose, Penelope, Lingeling, Riss3G), or the ABC model/equivalence checking tool.  The results from these tools are tagged as ``trusted'' (unverified) by ACL2.  However, for tools that provide a proof trace, in some cases, we can verify the correctness of those results within ACL2.  We prefer this approach to blindly trusting the tools.  However, when we exhaust approaches that have verifiable results, we are willing to use unverified methods as well.

Our team has worked on a variety of design and verification problems including microcode verification and transistor-level validation. Our main focus remains the verification of Register-transfer-level (RTL) designs written in Verilog or System Verilog. These designs are translated to a formal model with well defined semantics in ACL2 which allows for a rigorous analysis. The formal model is represented using our specialized word-level expression language called SVEX. An SVEX model can be symbolically executed and compared to the symbolically evaluated specification using GL. This process includes a word-level rewriting before it is bit-blasted into AIG (and later into Conjunctive Normal Form -- CNF), or BDD representation.  A SAT solver or BDD package is used to check equivalences. It is worthwhile to mention that the translation steps from SVEX to AIGs and from AIGs to CNFs or BDDs are proved correct -- in fact, the whole process creates an ACL2 theorem, where the only unverified portion is typically due to the use of a SAT solver.  Such theorems usually prove properties of microoperations as executed by the RTL design of the Centaur microprocessor. This is similar to the Symbolic Trajectory Evaluation based approach taken by the Intel FV team. However, the main difference is that our approach allows for the richer reasoning provided by ACL2 including proofs about specifications, composition of GL theorems, etc.

The specifications of the microoperations used for the verification of RTL can be used to define operational semantics of the machine that executes microcode.  This allows for a seamless transition between two different verification domains -- hardware and microcode.  We also use these microoperation specifications as the basis of a formal executable model of a subset of the x86 ISA that can be validated by running against Intel, AMD or Centaur's existing microprocessors, running a few hundred thousand tests per second. This increases our confidence in our specifications and clarifies inconsistencies in the ISA manual. Overall, our framework offers a more holistic approach to microprocesor validation.

In this talk we give you a top-down overview of our verification methodology. We point out problems that can be solved using automatic methods, and describe the aspects of our work that cannot be easily automated. We hope for feedback that may help us to be more efficient in use of your tools (SAT solvers in particular).  We also invite anybody who would like to collaborate on improving the AIG to CNF encoding or SVEX to AIG encoding.

10:00-10:15Coffee Break
10:15-12:15 Session 22: Parallel SAT and #SAT Solving (SAT 2015)
Projected Model Counting
SPEAKER: unknown

ABSTRACT. Model counting is the task of computing the number of assignments to variables V that satisfy a given propositional theory F. Model counting is an essential tool in probabilistic reasoning. In this paper, we introduce the problem of model counting projected on a subset P of original variables that we call 'priority' variables. The task is to compute the number of assignments to P such that there exists an extension to 'non-priority' variables V\P that satisfies F. Projected model counting arises when some parts of the model are irrelevant to the counts, in particular when we require additional variables to model the problem we are counting in SAT. We discuss three different approaches to projected model counting (two of which are novel), and compare their performance on different benchmark problems.

Search-Space Partitioning for Parallelizing SMT Solvers
SPEAKER: unknown

ABSTRACT. This paper studies how parallel computing can be used to reduce the time required to solve instances of the Satisfiability Modulo Theories problem (SMT). We address the problem in two orthogonal ways: (i) by distributing the computation using algorithm portfolios, search space partitioning techniques, and their combinations; and (ii) by studying the effect of partitioning heuristics, and in particular the look-ahead heuristic, to the efficiency of the partitioning. We implemented the approaches in the OpenSMT2 solver and experimented with the QF_UF theory on a computing cloud. The results show a consistent speed-up on hard instances with up to an order of magnitude run time reduction and several more instances being solved within the timeout.

HordeSat: A Massively Parallel Portfolio SAT Solver
SPEAKER: Tomáš Balyo

ABSTRACT. A simple yet successful approach to parallel satisfiability (SAT) solving is to run several different (a portfolio of) SAT solvers on the input problem at the same time until one solver finds a solution. The SAT solvers in the portfolio can be instances of a single solver with different configuration settings. Additionally the solvers can exchange information usually in the form of clauses. In this paper we investigate whether this approach is applicable in the case of massively parallel SAT solving. Our solver is intended to run on clusters with hundreds of processors, hence the name HordeSat. HordeSat is a fully distributed portfolio-based SAT solver with a modular design that allows it to use any SAT solver that implements a given interface. HordeSat has a decentralized design and features hierarchical parallelism with interleaved communication and search. We experimentally evaluated it using all the benchmark problems from the application track of the 2014 International SAT Competition. The experiments demonstrate that HordeSat is scalable up to at least 512 processors with superlinear average speedup.

Laissez-Faire Caching for Parallel #SAT Solving
SPEAKER: Jan Burchard

ABSTRACT. The problem of counting the number of satisfying assignments of a propositional formula (#SAT) can be considered to be the big brother of the well known SAT problem. However, the higher computational complexity and a lack of fast solvers currently limit its usability for real world problems.

Similar to SAT, utilizing the parallel computation power of modern CPUs could greatly increase the solving speed in the realm of #SAT. However, in comparison to SAT there is an additional obstacle for the parallelization of #SAT that is caused by the usage of conflict learning together with the #SAT specific techniques of component caching and sub-formula decomposition. The combination can result in an incorrect final result being computed due to incorrect values in the formula cache. This problem is easily resolvable in a sequential solver with a depth-first node order but requires additional care and handling in a parallel one. In this paper we introduce laissez-faire caching which allows for an arbitrary node computation order in both a sequential and parallel solver while ensuring a correct final result. Additionally, we apply this new caching approach to build countAntom, the world's first parallel #SAT-solver.

Our experimental results clearly show that countAntom achieves considerable speedups through the parallel computation while maintaining correct results on a large variety of benchmarks coming from different real-world applications. Moreover, our analysis indicates that laissez-faire caching only adds a small computational overhead.

12:15-13:45Lunch Break
13:45-15:15 Session 23: CDCL Solving (SAT 2015)
Hints Revealed

ABSTRACT. We propose a notion of hints, clauses that are not necessarily consistent with the input formula. The goal of adding hints is to speed up the SAT-solving process. For this purpose, we provide an efficient general mechanism for hint addition and removal, accomplished by partial resolution. When a hint is determined to be inconsistent, the partial resolution-graph of an unsatisfiable-core is used to reduce the search space. The suggested mechanism is used to boost performance by adding generated hints to the input formula. We describe two specific hint-suggestion methods, one of which increases performance by 30% on satisfiable SAT '13 competition instances and solves 9 instances not solved by the baseline solver.

Between SAT and UNSAT: The Fundamental Difference in CDCL SAT
SPEAKER: Chanseok Oh

ABSTRACT. The way CDCL SAT solvers find a satisfying assignment is very different from the way they prove unsatisfiability.  We propose an explanation to the difference by identifying direct connections to the workings of some of the most important elements in CDCL solvers: the effects of restarts and VSIDS, and the roles of learned clauses.  We give a wide range of concrete evidence that highlights the varying effects and roles of these elements.  As a result, this paper also sheds a new light on the internal workings of CDCL.  Based on our reasoning on the difference in solver behaviors, we present several ideas for optimizing SAT solvers for either SAT or UNSAT instances.  We then show that we can achieve improvements on both SAT and UNSAT at the same time by judiciously exploiting the difference.  We have implemented a hybrid idea mixing two different restart strategies on top of our new solver COMiniSatPS and observed substantial performance improvement.

Evaluating CDCL Variable Scoring Schemes
SPEAKER: unknown

ABSTRACT. On application benchmarks, the invention of the variable decision heuristic VSIDS (variable state indepedent decaying sum) in the context of the CDCL (conflict-driven clause learning) SAT solver Chaff is considered a major mile stone in improving the efficiency of modern SAT solvers. In this paper, we propose a new variant ACIDS (average conflict-index decision score) and relate it to the original implementation of VSIDS, its popular modern implementation EVSIDS (exponential VSIDS), the VMTF (variable move-to-front) scheme, and other related decision heuristics. The main goal is to provide an empirical evaluation to serve as a starting point for trying to understand the reason for the efficiency of these decision heuristics. In our experiments, it turns out that VMTF, ACIDS, EVSIDS behave very similarly, if implemented carefully.

15:15-15:30Coffee Break
15:30-16:15 Session 24: Competitions and Evaluations (SAT 2015)
SAT Race 2015
SPEAKER: Tomas Balyo

ABSTRACT. SAT-Race 2015 is a competitive event for solvers of the Boolean Satisfiability (SAT) problem. It is organized as a satellite event to the 18th International Conference on Theory and Applications of Satisfiability Testing, September 24-27, 2015, Austin, Texas, USA and stands in the tradition of the yearly SAT Competitions and SAT-Races / Challenges. In contrast to the SAT Competitions, the focus of SAT-Race is on application benchmarks only.

The area of SAT Solving has seen tremendous progress over the last years. Many problems (e.g. in hardware and software verification) that seemed to be completely out of reach a decade ago can now be handled routinely. Besides new algorithms and better heuristics, refined implementation techniques turned out to be vital for this success.

To keep up the driving force in improving SAT solvers, we want to motivate implementors to present their work to a broader audience and to compare it with that of others.

Pseudo-Boolean Evaluation 2015

ABSTRACT. The pseudo-Boolean evaluation 2015 analyzes the current state of the art of pseudo-Boolean solving technology in order to determine the progress that has been made since the last evaluation. Therefore, we invite solver developers to submit the latest version of their tools. Furthermore, we will include selected solvers of the previous pseudo-Boolean competition 2012.

Additionally, we would like to see whether there is a need for novel solving technology. This should reflect in novel pseudo-Boolean formulas both from academia and industry.

This evaluation tries to show the current state of the art as well as possible. Hence, we will evaluate both open source reasoners, as well as closed-source solutions. We will make all collected data public afterwards.

Max-SAT Evaluation 2015

ABSTRACT. The Tenth Evaluation of Max-SAT Solvers (Max-SAT-2015) is organized as an affiliated event of the 18th International Conference on Theory and Applications of Satisfiability Testing (SAT-2015).

The objective of the evaluation is assessing the state of the art in the field of Max-SAT solvers, as well as creating a collection of publicly available Max-SAT benchmark instances.

The evaluation allows the submission of incomplete solvers in a Special Track, with the same three categories than for complete solvers, but with a reduced number of instances.