Towards Robust Artificial Intelligence via Spec-consistent Machine Learning

ABSTRACT. Deep learning has driven rapid progress in machine learning and artificial intelligence, dramatically improving solutions to many challenging problems such as image understanding, speech recognition, and control systems. Despite these remarkable successes, researchers have observed some intriguing and troubling aspects of the behaviour of these models. A case in point is the presence of adversarial examples, which make learning-based systems fail in unexpected ways. Such behaviour, together with the difficulty of interpreting the behaviour of neural networks, is a serious hindrance to deploying these models in safety-critical applications. In this talk, I will review the challenges in developing models that are robust and explainable, introduce the framework of spec-consistent machine learning, and discuss opportunities for collaboration between the formal methods and machine learning communities.

Bio:

Pushmeet Kohli is a Principal Scientist and Team Leader at DeepMind. Before joining DeepMind, Pushmeet was Director of Research of the Cognition group at Microsoft. During his 10 years at Microsoft, he worked in Microsoft's labs in Seattle, Cambridge and Bangalore, holding a number of roles including technical advisor to Rick Rashid, Microsoft's Chief Research Officer. Pushmeet's research revolves around intelligent systems and computational sciences, and he publishes in the fields of machine learning, computer vision, information retrieval, and game theory. His current research interests include 3D reconstruction and rendering, probabilistic programming, and interpretable and verifiable knowledge representations from deep models. He is also interested in conversational agents for task completion, machine learning systems for healthcare, and 3D rendering and interaction for augmented and virtual reality.

Pushmeet has won a number of awards and prizes for his research. His PhD thesis, "Minimizing Dynamic and Higher Order Energy Functions using Graph Cuts", won the British Machine Vision Association's Sullivan Doctoral Thesis Award and was a runner-up for the British Computer Society's Distinguished Dissertation Award. His papers have appeared at computer vision (ICCV, CVPR, ECCV, PAMI, IJCV, CVIU, BMVC, DAGM), machine learning, robotics and AI (NIPS, ICML, AISTATS, AAAI, AAMAS, UAI, ISMAR), computer graphics (SIGGRAPH, Eurographics), and HCI (CHI, UIST) venues, and have won awards at ICVGIP 2006 and 2010, ECCV 2010, ISMAR 2011, TVX 2014, CHI 2014, WWW 2014 and CVPR 2015. His research has also been the subject of a number of articles in popular media outlets such as Forbes, Wired, BBC, New Scientist and MIT Technology Review. Pushmeet is part of the Association for Computing Machinery's (ACM) Distinguished Speaker Program.

Local Reasoning About the Presence of Bugs: Incorrectness Separation Logic

ABSTRACT. There has been a great deal of work on local reasoning about state for proving the *absence* of bugs, but none for proving the *presence* of bugs. In this paper, we present a new formal framework for local reasoning about the presence of bugs, building on two complementary foundations: (1) separation logic and (2) the recently introduced "incorrectness logic". In addition to exploring the theory of this new *incorrectness separation logic* (ISL), we show how it can be used to derive a begin-anywhere, intra-procedural forward symbolic execution analysis that has no false positives *by construction*. In so doing, we take a step towards transferring modular, scalable techniques from the world of program verification to the world of bug catching.
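For readers unfamiliar with under-approximate triples, the semantics and a minimal example can be stated as follows (notation follows incorrectness logic; the arithmetic example is our own illustration, not drawn from the paper):

```latex
% An incorrectness triple is valid iff the result q under-approximates
% the reachable post-states, so every state it describes witnesses a
% real execution -- which is why reported bugs are true positives:
\models [p]\; C\; [\epsilon : q]
  \;\iff\; q \subseteq \mathit{post}(C, \epsilon)(p)
% Example (normal exit): every state with x = 2 is reachable from
% some state with x = 1:
[x = 1]\;\; x := x + 1 \;\;[\mathit{ok} : x = 2]
```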

Reasoning over Permissions Regions in Concurrent Separation Logic

ABSTRACT. We propose an extension of separation logic with *fractional permissions*, aimed at reasoning about concurrent programs that share arbitrary regions or data structures in memory. In existing formalisms, such reasoning typically either fails or is subject to stringent side conditions on formulas (notably *precision*) that significantly impair automation. We suggest two formal syntactic additions that collectively remove the need for such side conditions: first, the use of both "weak" and "strong" forms of separating conjunction, and second, the use of nominal labels from hybrid logic. We contend that our suggested alterations bring formal reasoning with fractional permissions in separation logic considerably closer to common pen-and-paper intuition, while imposing only a modest bureaucratic overhead.
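In the standard fractional-permissions model, the pen-and-paper intuition referred to above includes entailments such as the following (a textbook Boyland-style example in our own notation, not the paper's syntax):

```latex
% Fractional points-to assertions: permission 1 grants write access,
% any 0 < \pi \le 1 grants read access, and permissions on the same
% cell add when heaps compose:
x \overset{0.5}{\mapsto} v \;*\; x \overset{0.5}{\mapsto} v
  \;\dashv\vdash\; x \overset{1}{\mapsto} v
```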

Approximate Counting of Minimal Unsatisfiable Subsets

ABSTRACT. Given an unsatisfiable formula F in CNF, i.e., a set of clauses, the problem of Minimal Unsatisfiable Subsets (MUSes) seeks to identify minimal subsets of clauses N ⊆ F such that N is unsatisfiable. The emerging viewpoint of MUSes as the root causes of unsatisfiability has led MUSes to find applications in a wide variety of diagnostic approaches. Recent advances in the finding and enumeration of MUSes have motivated researchers to discover applications that can benefit from rich information about the set of MUSes. One such extension is counting the number of MUSes, which has been shown to characterize inconsistency metrics for general propositional knowledge bases. The current best approach to MUS counting is to employ a MUS enumeration algorithm, which often does not scale to instances with a reasonably large number of MUSes. Motivated by the success of hashing-based techniques in the context of model counting, we design the first approximate MUS counting procedure with (ε,δ) guarantees, called AMUSIC. Our approach avoids exhaustive MUS enumeration by combining the classical technique of universal hashing with advances in QBF solvers, along with a novel use of unions and intersections of MUSes to achieve runtime efficiency. Our prototype implementation of AMUSIC is shown to scale to instances that were clearly beyond the reach of enumeration-based approaches.
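To make the counted objects concrete, the following toy Python sketch extracts a single MUS by deletion-based shrinking over a brute-force SAT check. This is our illustration of what a MUS is; AMUSIC itself counts MUSes via universal hashing and QBF solving rather than this naive procedure.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force SAT check for a tiny CNF. Clauses are lists of
    non-zero ints; a positive literal i means variable i is true."""
    vars_ = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vars_)):
        assign = dict(zip(vars_, bits))
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return not clauses  # the empty CNF is satisfiable

def shrink_to_mus(clauses):
    """Deletion-based MUS extraction: drop each clause in turn and
    keep the removal whenever the remainder is still unsatisfiable."""
    assert not satisfiable(clauses)
    mus = list(clauses)
    for c in list(mus):
        trial = [d for d in mus if d is not c]
        if not satisfiable(trial):
            mus = trial
    return mus

# F = (x1) ∧ (¬x1) ∧ (x2) ∧ (¬x2) has two disjoint MUSes of size 2;
# deletion-based shrinking returns one of them.
F = [[1], [-1], [2], [-2]]
print(shrink_to_mus(F))  # [[2], [-2]]
```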

ABSTRACT. Craig interpolant generation for non-linear theories and their combination with other theories is still in its infancy, although interpolation-based techniques have become popular in the verification of programs and hybrid systems, where non-linear expressions are very common. In this paper, we first prove that a polynomial interpolant of the form h(x) > 0 exists for two mutually contradictory polynomial formulas φ(x,y) and ψ(x,z) of the form f1 ≥ 0 ∧ … ∧ fn ≥ 0, where the fi are polynomials in x,y or x,z and the quadratic module generated by the fi is Archimedean. Then, we show that synthesizing such an interpolant can be reduced to solving a semi-definite programming (SDP) problem. In addition, we propose a verification approach to assure the validity of the synthesized interpolant and consequently avoid the unsoundness caused by numerical error in SDP solving. Besides, we discuss how to generalize our approach to general semi-algebraic formulas. Finally, as an application, we demonstrate how to apply our approach to invariant generation in program verification.

ABSTRACT. SPARK is both a deductive verification tool for the Ada language and the subset of Ada on which it operates. In this paper, we present a recent extension of the SPARK language and toolset to support pointers. This extension is based on an ownership policy inspired by Rust to enforce non-aliasing through a move semantics of assignment. In particular, we consider pointer-based recursive data structures, and discuss how they are supported in SPARK. We explain how iteration over these structures can be handled using a restricted form of aliasing called local borrowing. To avoid introducing a memory model and to stay in the first-order logic background of SPARK, the relation between the iterator and the underlying structure is encoded as a predicate which is maintained throughout the program control flow. Special first-order contracts, called pledges, can be used to describe this relation. Finally, we give examples of programs that can be verified using this framework.

Unbounded-Time Safety Verification of Stochastic Differential Dynamics

ABSTRACT. In this paper, we propose a method for bounding the probability that a stochastic differential equation (SDE) system violates a safety specification over the infinite time horizon. SDEs are mathematical models of stochastic processes that capture how states evolve continuously in time. They are widely used in numerous applications such as engineered systems (e.g., modeling how pedestrians move in an intersection), computational finance (e.g., modeling stock option prices), and ecological processes (e.g., population change over time). Previously, the safety verification problem has been tackled over finite and infinite time horizons using a diverse set of approaches. The approach in this paper attempts to connect the two views by first identifying a finite time bound, beyond which the probability of a safety violation can be bounded by a negligibly small number. This is achieved by discovering an exponential barrier certificate that proves exponentially converging bounds on the probability of safety violations over time. Once the finite time interval is found, a finite-time verification approach is used to bound the probability of violation over this interval. We demonstrate our approach over a collection of interesting examples from the literature, wherein our approach can be used to find tight bounds on the violation probability of safety properties over the infinite time horizon.
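The exponential barrier argument can be sketched as follows (notation and regularity assumptions are ours; see the paper for the precise conditions):

```latex
% If a certificate B \ge 0 satisfies, for the generator \mathcal{L}
% of the SDE,
\mathcal{L}B(x) \;\le\; -\lambda B(x), \qquad \lambda > 0,
% then Dynkin's formula yields the exponentially decaying bound
% \mathbb{E}[B(X_t)] \le B(x_0)\, e^{-\lambda t}, and if B \ge 1 on
% the unsafe set, Markov's inequality bounds the violation
% probability at time t by
\mathbb{P}\big(X_t \in \mathit{Unsafe}\big)
  \;\le\; B(x_0)\, e^{-\lambda t},
% which becomes negligible beyond a computable finite horizon T.
```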

Widest Paths and Global Propagation in Bounded Value Iteration for Stochastic Games

ABSTRACT. Solving \emph{stochastic games} with the reachability objective is a fundamental problem, especially in quantitative verification and synthesis. For this purpose, \emph{bounded value iteration (BVI)} attracts attention as an efficient iterative method. However, BVI's performance is often impeded by costly \emph{end component (EC) computation} that is needed to ensure convergence. Our contribution is a novel BVI algorithm that conducts, in addition to local propagation by the Bellman update that is typical of BVI, \emph{global} propagation of upper bounds that is not hindered by ECs. To conduct global propagation in a computationally tractable manner, we construct a weighted graph and solve the \emph{widest path problem} in it. Our experiments show the algorithm's performance advantage over previous BVI algorithms that rely on EC computation.
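The widest-path subroutine at the heart of the global propagation step can be sketched as a Dijkstra-style search (a generic illustration on a hypothetical weighted graph, not the paper's full BVI algorithm):

```python
import heapq

def widest_path(graph, src, dst):
    """Widest (maximum-bottleneck) path: graph maps node -> {neighbor:
    edge_width}. Returns the largest w such that some src->dst path
    uses only edges of width >= w (0.0 if dst is unreachable)."""
    best = {src: float('inf')}
    heap = [(-best[src], src)]
    while heap:
        neg_w, u = heapq.heappop(heap)
        w = -neg_w
        if w < best.get(u, 0):
            continue  # stale heap entry
        if u == dst:
            return w
        for v, width in graph[u].items():
            cand = min(w, width)  # bottleneck along the extended path
            if cand > best.get(v, 0):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return 0.0

# Edge widths play the role of upper bounds being propagated.
g = {'a': {'b': 0.9, 'c': 0.5}, 'b': {'d': 0.4}, 'c': {'d': 0.5}, 'd': {}}
print(widest_path(g, 'a', 'd'))  # 0.5, via a-c-d
```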

Global PAC Bounds for Learning Discrete Time Markov Chains

ABSTRACT. Learning models from observations of a system is a powerful tool with many applications. In this paper, we consider learning Discrete Time Markov Chains (DTMC), with different methods such as {\em frequency estimation} or {\em Laplace smoothing}. While models learnt with such methods converge asymptotically towards the exact system, a more practical question in the realm of trusted machine learning is how accurate a model learnt with a limited time budget is.
Existing approaches provide bounds on how close the model is to the original system in terms of bounds on {\em local} (transition) probabilities, which have unclear implications for the {\em global} behavior.

In this work, we provide {\em global bounds on the error} made by such a learning process, in terms of global behaviors formalized using {\em temporal logic}.
More precisely, we propose a learning process ensuring a bound on the error in the probabilities of these properties. While such a learning process cannot exist for the full LTL logic, we provide one ensuring a bound that is uniform over all the formulas of CTL. Further, given one time-to-failure property, we provide an improved learning algorithm. Interestingly, frequency estimation is sufficient for the latter, while Laplace smoothing is needed to ensure non-trivial uniform bounds for the full CTL logic.
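The two estimators named above can be sketched as follows (a minimal illustration of frequency estimation vs. Laplace smoothing on a single observed trace; the function name and interface are ours):

```python
from collections import Counter

def learn_dtmc(trace, states, alpha=0.0):
    """Estimate DTMC transition probabilities from one observed trace.
    alpha=0 gives frequency estimation; alpha=1 gives Laplace
    smoothing, which keeps every transition probability non-zero."""
    pairs = Counter(zip(trace, trace[1:]))   # observed transitions
    outgoing = Counter(trace[:-1])           # visits with a successor
    P = {}
    for s in states:
        denom = outgoing[s] + alpha * len(states)
        # Fall back to the uniform distribution for unvisited states.
        P[s] = {t: (pairs[(s, t)] + alpha) / denom if denom
                   else 1 / len(states)
                for t in states}
    return P

trace = ['a', 'a', 'b', 'a', 'b', 'b', 'a']
print(learn_dtmc(trace, ['a', 'b']))             # frequency estimates
print(learn_dtmc(trace, ['a', 'b'], alpha=1.0))  # Laplace-smoothed
```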

STMC: Statistical Model Checker with Stratified and Antithetic Sampling

ABSTRACT. STMC is a statistical model checker that uses antithetic and stratified sampling techniques to reduce the number of samples and, hence, the amount of time required before making a decision. The tool is capable of statistically verifying any black-box probabilistic system that PRISM can simulate, against probabilistic bounds on any property that PRISM can evaluate over individual executions of the system. We have evaluated our tool on many examples and compared it with both symbolic and statistical algorithms. When the number of strata is large, our algorithms reduced the number of samples by more than a factor of 3 on average. Furthermore, being a statistical model checker makes STMC able to verify models that are well beyond the reach of current symbolic model checkers. On large systems (up to 10^14 states) STMC was able to check 100% of benchmark systems, compared to existing methods, which only succeeded on 13% of systems. The tool, installation instructions, benchmarks, and scripts for running the benchmarks are all available online as open source.
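Antithetic sampling can be illustrated on a toy probability (a generic sketch of the variance-reduction idea, not of STMC's integration with PRISM):

```python
import random

def plain_mc(n, seed=0):
    """Crude Monte Carlo estimate of p = P(U1 + U2 > 1.5), U ~ Unif(0,1)."""
    rng = random.Random(seed)
    hits = sum((rng.random() + rng.random()) > 1.5 for _ in range(n))
    return hits / n

def antithetic_mc(n, seed=0):
    """Antithetic sampling: evaluate each draw u together with its
    mirror 1 - u, which negatively correlates the paired estimates
    and lowers the variance of their average."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n // 2):
        u1, u2 = rng.random(), rng.random()
        total += ((u1 + u2) > 1.5) + (((1 - u1) + (1 - u2)) > 1.5)
    return total / n

# The true probability is 1/8 = 0.125; both estimators converge to it.
print(plain_mc(100000), antithetic_mc(100000))
```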

Certifying Certainty and Uncertainty in Approximate Membership Query Structures

ABSTRACT. Approximate Membership Query structures (AMQs) rely on randomisation for time- and space-efficiency, while introducing a possibility of false positive and false negative answers. Correctness proofs of such structures involve subtle reasoning about bounds on probabilities of getting certain outcomes. Because of these subtleties, a number of unsound arguments in such proofs have been made over the years.

In this work, we address the challenge of building rigorous and reusable computer-assisted proofs about probabilistic specifications of AMQs. We describe the framework for systematic decomposition of AMQs and their properties into a series of interfaces and reusable components. We implement our framework as a library in the Coq proof assistant and showcase it by encoding in it a number of non-trivial AMQs, such as Bloom filters, counting filters, quotient filters and blocked constructions, and mechanising the proofs of their probabilistic specifications.

We demonstrate how AMQs encoded in our framework guarantee the absence of false negatives by construction. We also show how the proofs about probabilities of false positives for complex AMQs can be obtained by means of verified reduction to the implementations of their simpler counterparts. Finally, we provide a library of domain-specific theorems and tactics that allow a high degree of automation in probabilistic proofs.
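The no-false-negatives guarantee that such proofs mechanise can be seen concretely in a minimal Bloom filter (a plain Python illustration; the paper's actual development is carried out in Coq):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: query() may return false positives, but
    never false negatives, by construction -- every bit set during
    add(item) is still set when query(item) later checks it."""
    def __init__(self, m=1024, k=3):
        self.bits = [False] * m
        self.m, self.k = m, k

    def _indexes(self, item):
        # k derived hash functions via salted SHA-256.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for ix in self._indexes(item):
            self.bits[ix] = True

    def query(self, item):
        return all(self.bits[ix] for ix in self._indexes(item))

bf = BloomFilter()
bf.add("cav2020")
print(bf.query("cav2020"))  # True: inserted items are always found
```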

ABSTRACT. We study the expressiveness and reactive synthesis problem of HyperQPTL, a logic that specifies omega-regular hyperproperties. HyperQPTL is an extension of linear-time temporal logic (LTL) with explicit trace and propositional quantification and therefore truly combines trace relations and omega-regularity. As such, HyperQPTL can express promptness, which states that there is a common bound on the number of steps within which an event must happen. We demonstrate how the HyperQPTL formulation of promptness differs from the type of promptness expressible in the logic PROMPT-LTL. Furthermore, we study the realizability problem of HyperQPTL by identifying decidable fragments, where one decidable fragment contains formulas for promptness. We show that, in contrast to the satisfiability problem of HyperQPTL, propositional quantification has an immediate impact on the decidability of the realizability problem. We present a reduction to the realizability problem of HyperLTL, which immediately yields a bounded synthesis procedure. We implemented the synthesis procedure for HyperQPTL in the bounded synthesis tool BoSy. Our experimental results show that an arbiter satisfying promptness, as well as a protocol for the knowledge-based formulation of the dining cryptographers problem, can be synthesized.

AMYTISS: Parallelized Automated Controller Synthesis for Large-Scale Stochastic Systems

ABSTRACT. Large-scale stochastic systems have recently received significant attention due to their broad applications in various safety-critical systems such as traffic networks and self-driving cars. In this paper, we propose a software tool, called AMYTISS, implemented in C++/OpenCL, for designing correct-by-construction controllers for large-scale discrete-time stochastic systems. This tool is employed to (i) build finite Markov decision processes (MDPs) as finite abstractions of given original systems, and (ii) synthesize controllers for the constructed finite MDPs satisfying bounded-time safety and reach-avoid specifications. In AMYTISS, scalable parallel algorithms are designed such that they support parallel execution within CPUs, GPUs and hardware accelerators (HWAs). Unlike all existing tools for stochastic systems, AMYTISS can utilize high-performance computing (HPC) platforms and cloud-computing services to mitigate the effects of the state-explosion problem, which is always present in analyzing large-scale stochastic systems. We benchmark AMYTISS against the most recent tools in the literature using several physical case studies, including robot examples, room temperature regulation and road traffic networks. We also apply our algorithms to a 3-dimensional autonomous vehicle and a 7-dimensional nonlinear model of a BMW 320i car by synthesizing an autonomous parking controller.

Code2Inv: A Deep Learning Framework for Program Verification

ABSTRACT. We propose a general end-to-end deep learning framework Code2Inv, which takes a verification task and a proof checker as input, and automatically learns a valid proof for the verification task by interacting with the given checker. Code2Inv is parameterized with an embedding module and a grammar: the former encodes the verification task into numeric vectors while the latter describes the format of solutions Code2Inv should produce. We demonstrate the flexibility of Code2Inv by means of two small-scale yet expressive instances: a loop invariant synthesizer for C programs, and a Constrained Horn Clause (CHC) solver.
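The checker's side of this interaction can be sketched by brute-force validation of a candidate invariant for a toy loop (our illustration of the proof-checking role only; Code2Inv's actual instances use SMT-based checkers):

```python
def check_invariant(inv, n_range=range(0, 20)):
    """Brute-force check of the three Hoare conditions for the loop
        x = 0; while x < n: x += 1; assert x == n    (with n >= 0)
    over a small finite range -- the role a proof checker plays for
    invariant candidates proposed by a learner."""
    for n in n_range:
        if not inv(0, n):                       # initiation
            return False
        for x in range(0, n):                   # consecution (x < n)
            if inv(x, n) and not inv(x + 1, n):
                return False
        for x in n_range:                       # safety at exit (x >= n)
            if inv(x, n) and x >= n and x != n:
                return False
    return True

print(check_invariant(lambda x, n: x <= n))  # True: inductive and safe
print(check_invariant(lambda x, n: True))    # False: too weak for safety
```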

Maximum Causal Entropy Specification Inference from Demonstrations

ABSTRACT. In many settings (e.g., robotics) demonstrations provide a natural way to specify tasks; however, most methods for learning from demonstrations either do not provide guarantees that the learned artifacts (e.g., rewards) can be safely composed or do not explicitly capture temporal properties. Motivated by this deficit, recent works have proposed learning Boolean task specifications, a class of Boolean non-Markovian rewards which admit well-defined composition and explicitly handle historical dependencies. This work continues this line of research by adapting maximum causal entropy inverse reinforcement learning to estimate the posterior probability of a specification given a multi-set of demonstrations. The key algorithmic insight is to leverage the extensive literature and tooling on reduced ordered binary decision diagrams to efficiently encode a time-unrolled Markov Decision Process. This enables transforming a naive algorithm, with running time exponential in the episode length, into a polynomial-time algorithm.

ABSTRACT. Computer science class enrollments have risen rapidly in the past decade. At current class sizes, standard approaches to grading and providing personalized feedback are no longer possible, and new techniques become both feasible and necessary. In this paper, we present the third version of Automata Tutor, a tool for helping teachers and students in large courses on automata and formal languages. The second version of Automata Tutor supported automatic grading and feedback for finite-automata constructions and has already been used by thousands of users in dozens of countries. This new version of Automata Tutor supports automated grading and feedback generation for a greatly extended variety of new problems, including problems that ask students to create regular expressions, context-free grammars, pushdown automata and Turing machines corresponding to a given description, and problems about converting between equivalent models, e.g., from regular expressions to nondeterministic finite automata. Moreover, for several problems, this new version also enables teachers and students to automatically generate new problem instances. We also present the results of a survey run on a class of 950 students, which shows very positive results about the usability and usefulness of the tool.
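The core check behind such automatic grading, comparing a student's construction against a reference solution, can be sketched as a product-automaton equivalence test (a generic illustration in our own encoding; Automata Tutor's actual grader also generates feedback and partial credit):

```python
from collections import deque

def dfa_equivalent(d1, d2, alphabet):
    """Check language equivalence of two complete DFAs by BFS over the
    product automaton. A DFA is a triple (start, accepting_set, delta)
    with delta[(state, symbol)] -> state."""
    (s1, acc1, t1), (s2, acc2, t2) = d1, d2
    seen, queue = {(s1, s2)}, deque([(s1, s2)])
    while queue:
        q1, q2 = queue.popleft()
        if (q1 in acc1) != (q2 in acc2):
            return False  # a distinguishing word reaches this pair
        for a in alphabet:
            nxt = (t1[(q1, a)], t2[(q2, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Two DFAs over {0,1}: "even number of 1s" vs. "odd number of 1s".
even = (0, {0}, {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0})
odd  = (0, {1}, {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0})
print(dfa_equivalent(even, even, '01'))  # True
print(dfa_equivalent(even, odd, '01'))   # False
```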

Manthan: A Data-Driven Approach for Boolean Function Synthesis

ABSTRACT. Boolean functional synthesis is a fundamental problem in computer science with wide-ranging applications and has witnessed a surge of interest resulting in progressively improved techniques over the past decade. Despite intense algorithmic development, a large number of problems remain beyond the reach of the state-of-the-art techniques. Motivated by the progress in machine learning, we propose Manthan, a novel data-driven approach to Boolean functional synthesis. Manthan views functional synthesis as a classification problem, relying on advances in constrained sampling for data generation, and advances in automated reasoning for a novel proof-guided refinement and provable verification. In an extensive and rigorous evaluation over 609 benchmarks, we demonstrate that Manthan significantly improves upon the current state of the art, solving 356 benchmarks in comparison to 280, which is the most solved by a state-of-the-art technique; thereby, we demonstrate an increase of 76 benchmarks over the current state of the art. Furthermore, Manthan solves 60 benchmarks that none of the current state-of-the-art techniques could solve. The significant performance improvements, along with our detailed analysis, highlight several interesting avenues of future work at the intersection of machine learning, constrained sampling, and automated reasoning.
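The learn-then-verify view of functional synthesis can be caricatured in a few lines (a heavily simplified, exhaustive toy for a single output variable; Manthan itself uses constrained sampling, learned classifiers, and proof-guided repair rather than table enumeration):

```python
from itertools import product

def synthesize(F, n_inputs):
    """Toy functional synthesis for one Boolean output y: build f(x)
    as a lookup table from the points of F(x, y), then verify that f
    realizes a valid y wherever one exists."""
    table = {}
    for x in product([0, 1], repeat=n_inputs):
        # "Data generation": pick any y making F true, defaulting to 0.
        table[x] = next((y for y in (0, 1) if F(*x, y)), 0)
    # "Verification": whenever some y works, the learned f must work.
    for x in product([0, 1], repeat=n_inputs):
        if any(F(*x, y) for y in (0, 1)):
            assert F(*x, table[x])
    return table

# F(x1, x2, y): y must equal x1 XOR x2; the synthesized f computes XOR.
F = lambda x1, x2, y: y == (x1 ^ x2)
f = synthesize(F, 2)
print([f[x] for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```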