Branch & Learn with Post-hoc Correction for Predict+Optimize with Unknown Parameters in Constraints
ABSTRACT. Combining machine learning and constrained optimization, Predict+Optimize tackles optimization problems containing parameters that are unknown at the time of solving. The idea is to first predict the unknown parameters using historical data, and then solve the optimization problem using the predicted parameters. Prior works focus on cases with unknowns only in the objectives. A new framework was recently proposed to cater also for unknowns in constraints by introducing a loss function, called Post-hoc Regret, that takes into account the cost of correcting an unsatisfiable prediction. Since Post-hoc Regret is non-differentiable, the previous work computes only an approximation of it. While the notion of Post-hoc Regret is general, its specific implementation is applicable only to packing and covering linear programming problems. In this paper, we first show how to compute Post-hoc Regret exactly for any optimization problem solvable by a recursive algorithm satisfying simple conditions. Experimentation demonstrates substantial improvement in the quality of solutions as compared to the earlier approximation approach. Furthermore, we study experimentally the empirical behavior of different combinations of correction and penalty functions used in the Post-hoc Regret on the same benchmarks. The results provide insights for defining the appropriate Post-hoc Regret in different application scenarios.
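As a concrete illustration of the loss, the following is a minimal sketch of evaluating Post-hoc Regret exactly for a maximization problem; `solve`, `objective`, `correct`, and `penalty` are user-supplied placeholders standing in for the problem-specific components, not the paper's implementation.

```python
# A minimal sketch of exact Post-hoc Regret evaluation, assuming a
# maximization problem; all four callables are placeholders the user supplies.

def post_hoc_regret(theta_true, theta_pred, solve, objective, correct, penalty):
    """Post-hoc regret of acting on predicted parameters theta_pred.

    solve(theta)            -> optimal solution under parameters theta
    objective(sol, theta)   -> objective value of sol under theta
    correct(sol, theta)     -> feasible solution obtained by repairing sol
                               once the true parameters theta are revealed
    penalty(sol, corrected) -> nonnegative cost of that correction
    """
    sol_star = solve(theta_true)              # clairvoyant optimum
    sol_pred = solve(theta_pred)              # act on the prediction
    sol_corr = correct(sol_pred, theta_true)  # repair infeasibility post hoc
    return (objective(sol_star, theta_true)
            - objective(sol_corr, theta_true)
            + penalty(sol_pred, sol_corr))
```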
NER4OPT: Named Entity Recognition for Optimization Modelling from Natural Language
ABSTRACT. Solving combinatorial optimization problems involves a two-stage process that follows the model-and-run approach. First, a user is responsible for formulating the problem at hand as an optimization model, and then, given the model, a solver is responsible for finding the solution. While optimization technology has enjoyed tremendous theoretical and practical advances, the overall process has remained the same for decades. To date, transforming problem descriptions into optimization models remains the main barrier to entry. To relieve users of the cognitive task of modeling, we study named entity recognition to capture components of optimization models such as the objective, variables, and constraints from free-form natural language text, a task we coin NER4OPT. We show how to solve NER4OPT using classical techniques based on morphological and grammatical properties and modern methods leveraging pre-trained large language models and fine-tuning transformer architectures on optimization-specific corpora. For best performance, we present their hybridization combined with feature engineering and data augmentation to exploit the language of optimization problems. We improve over the state-of-the-art for annotated linear programming word problems, identify several next steps and discuss important open problems toward automated modeling.
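A hedged sketch of the transformer side of such a pipeline is shown below using Hugging Face's token-classification pipeline; the model name is a placeholder (a model fine-tuned on optimization corpora would be substituted), and the entity labels (e.g. VAR, OBJ_NAME, CONST_DIR) are illustrative assumptions, not the authors' released artifacts.

```python
# A hedged sketch of NER over optimization word problems with a
# token-classification pipeline; the model name is a placeholder.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="bert-base-cased",       # placeholder; fine-tune on optimization text
    aggregation_strategy="simple"  # merge word-piece tokens into full spans
)

text = ("A bakery wants to maximize profit by deciding how many cakes "
        "and pies to bake, using at most 40 hours of oven time.")
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```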
ABSTRACT. Extensive research has been conducted, over recent years, on various ways of enhancing heuristic search for combinatorial optimization problems with machine learning algorithms. In this study, we investigate the use of predictions from graph neural networks (GNNs) in the form of heatmaps to improve the Hybrid Genetic Search (HGS), a state-of-the-art algorithm for the capacitated vehicle routing problem.
The crossover and local-search components of HGS are instrumental in finding improved solutions, yet these components essentially rely on simple greedy or random choices. It seems intuitive to attempt to incorporate additional knowledge at these levels.
Through a vast experimental campaign on more than 10,000 problem instances, we show that exploiting more sophisticated strategies based on measures of node relatedness (heatmaps, or simply distance) within these algorithmic components can significantly enhance performance. However, contrary to initial expectations, we also observe that heatmaps do not present significant advantages over simpler distance measures for these purposes. We therefore faced a common (though rarely documented) situation of overkill: GNNs can indeed improve performance on an important optimization task, but an ablation analysis demonstrates that simpler alternatives perform equally well.
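The following is a minimal sketch of the idea under test: restrict local-search and crossover choices to each node's most "related" peers, where relatedness is either plain proximity or a GNN heatmap score. All names and the random data are illustrative.

```python
# Candidate-list construction from a generic relatedness matrix; swapping the
# matrix switches between distance-based and heatmap-based variants.
import numpy as np

def candidate_lists(relatedness, k=10):
    """relatedness[i, j]: higher means i and j are more related.
    Returns, for each node, its k most related other nodes."""
    np.fill_diagonal(relatedness, -np.inf)       # never pair a node with itself
    return np.argsort(-relatedness, axis=1)[:, :k]

# Variant 1: geometric relatedness (negated distances).
dist = np.random.rand(50, 50); dist = (dist + dist.T) / 2
by_distance = candidate_lists(-dist)

# Variant 2: learned relatedness (stand-in for a GNN heatmap of edge scores).
heatmap = np.random.rand(50, 50)
by_heatmap = candidate_lists(heatmap.copy())
```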
Getting away with more network pruning: From sparsity to geometry and linear regions
ABSTRACT. One surprising trait of neural networks is the extent to which their connections can be pruned with little to no effect on accuracy. But when we cross a critical level of parameter sparsity, pruning any further leads to a sudden drop in accuracy. This drop plausibly reflects a loss in model complexity, which we aim to avoid. In this work, we explore how sparsity also affects the geometry of the linear regions defined by a neural network and, consequently, reduces the expected maximum number of linear regions attainable by the architecture. We observe that pruning affects accuracy similarly to how sparsity affects the number of linear regions and our proposed bound on their maximum number. Conversely, we find that selecting per-layer sparsities to maximize our bound very often improves accuracy compared to pruning with the same overall sparsity uniformly across layers, thereby providing guidance on where to prune.
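A hedged sketch of the pruning primitive this builds on is below: layer-wise magnitude pruning in PyTorch. In the paper's proposal the per-layer sparsities would be chosen to maximize the linear-region bound; here they are just an illustrative input.

```python
# Layer-wise magnitude pruning: zero out the smallest-magnitude weights of
# each linear layer, with a (possibly nonuniform) sparsity per layer.
import torch

def prune_layerwise(model, sparsity_per_layer):
    linear_layers = [m for m in model.modules() if isinstance(m, torch.nn.Linear)]
    for layer, sparsity in zip(linear_layers, sparsity_per_layer):
        w = layer.weight.data
        k = int(sparsity * w.numel())            # number of weights to remove
        if k == 0:
            continue
        threshold = w.abs().flatten().kthvalue(k).values
        w[w.abs() <= threshold] = 0.0

model = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 10))
prune_layerwise(model, sparsity_per_layer=[0.8, 0.5])  # nonuniform across layers
```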
Interpretable Clustering via Soft Clustering Trees
ABSTRACT. Clustering is a popular unsupervised learning task that consists of finding a partition of the data points that groups similar points together. Despite its popularity, most state-of-the-art algorithms do not provide any explanation of the obtained partition, making it hard to interpret. In recent years, several works have considered using decision trees to construct clusters that are inherently interpretable. However, these approaches do not scale to large datasets, do not account for uncertainty in results, and do not support advanced clustering objectives such as spectral clustering. In this work, we present soft clustering trees, an interpretable clustering approach that is based on soft decision trees that provide probabilistic cluster membership. We model soft clustering trees as a continuous optimization problem that is amenable to efficient optimization techniques. Our approach is designed to output highly sparse decision trees to increase interpretability and is the first approach that supports tree-based spectral clustering. Extensive experiments show that our approach can produce clustering trees of significantly higher quality compared to the state-of-the-art and scale to large datasets.
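A minimal sketch of the soft-routing mechanism is shown next for a depth-2 tree: each internal node splits on a sigmoid of a (sparse) linear function of the features, and a point's cluster membership is the product of branch probabilities along each root-to-leaf path. Dimensions and parameters are illustrative.

```python
# Soft routing in a depth-2 clustering tree with 3 internal nodes, 4 leaves.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_memberships(X, W, b):
    """Internal nodes: 0 (root), 1 (left child), 2 (right child);
    leaves are the 4 clusters. Returns an (n, 4) membership matrix."""
    r = sigmoid(X @ W.T + b)       # (n, 3): prob. of branching right at each node
    l = 1.0 - r
    return np.column_stack([
        l[:, 0] * l[:, 1],         # left,  left  -> cluster 0
        l[:, 0] * r[:, 1],         # left,  right -> cluster 1
        r[:, 0] * l[:, 2],         # right, left  -> cluster 2
        r[:, 0] * r[:, 2],         # right, right -> cluster 3
    ])

X = np.random.randn(100, 5)
W, b = np.random.randn(3, 5), np.random.randn(3)
P = leaf_memberships(X, W, b)      # rows sum to 1: soft cluster assignments
assert np.allclose(P.sum(axis=1), 1.0)
```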
Scalable and Near-Optimal epsilon-tube Clusterwise Regression
ABSTRACT. Clusterwise Regression (CLR) methods that jointly optimize clustering and regression tasks are useful for partitioning data into disjoint subsets with distinct regression trends. Given the inherent difficulty of simultaneously optimizing clustering and regression objectives, it is not surprising that existing optimal CLR approaches do not scale beyond hundreds of data points. In an effort to provide more scalable and optimal CLR methods, we propose a novel formulation of the problem that takes inspiration from epsilon-tubes in Support Vector Regression (SVR). The advantage of this novel formulation, which aims to assign data points to clusters so as to minimize the largest epsilon-tube that encapsulates the regressed data, is that it admits an optimal MILP formulation. Furthermore, given that each constraint in our formulation corresponds to a single data point, we propose an efficient row generation solution that can converge optimally for the full dataset while only requiring optimization over a subset of the data. Our results on a variety of synthetic and benchmark real datasets show that our Clusterwise Regression MILP formulation provides near-optimal solutions for up to 100,000 data points and the smallest data-encapsulating epsilon-tubes among CLR alternatives.
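A hedged sketch of such an epsilon-tube clusterwise-regression MILP is given below in PuLP, for 1-D inputs and K clusters: assign each point to one cluster so that the width eps of the widest tube around the cluster regression lines is minimal. The big-M constant and toy data are illustrative assumptions, and the row-generation layer is omitted.

```python
import pulp

X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [0.1, 1.0, 2.2, 0.5, 0.0]
K, M, n = 2, 100.0, len(X)

prob = pulp.LpProblem("eps_tube_clr", pulp.LpMinimize)
eps = pulp.LpVariable("eps", lowBound=0)
a = [pulp.LpVariable(f"a_{k}") for k in range(K)]           # slopes
b = [pulp.LpVariable(f"b_{k}") for k in range(K)]           # intercepts
z = pulp.LpVariable.dicts("z", (range(n), range(K)), cat="Binary")

prob += eps                                                 # minimize tube width
for i in range(n):
    prob += pulp.lpSum(z[i][k] for k in range(K)) == 1      # one cluster per point
    for k in range(K):
        resid = Y[i] - (a[k] * X[i] + b[k])                 # linear: X[i] is data
        prob += resid <= eps + M * (1 - z[i][k])            # binds only if assigned
        prob += -resid <= eps + M * (1 - z[i][k])

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("eps* =", eps.value())
```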
Predicting the Optimal Period for Cyclic Hoist Scheduling Problems
ABSTRACT. Since combinatorial scheduling problems are usually NP-hard, this paper investigates whether machine learning (ML) can accelerate the exact solving of a problem instance. We adopt supervised learning on a corpus of problem instances to acquire a function that predicts the optimal objective value for a given instance. The learned predictor is invariant to the instance size, as it uses statistics of instance attributes. We provide this prediction to a solving algorithm in the form of bounds on the objective function. Specifically, this approach is applied to the well-studied Cyclic Hoist Scheduling Problem (CHSP). The goal for a CHSP instance is to find a feasible schedule, with minimal cycle period, for a hoist moving objects between tanks in an electroplating facility. Taking an existing Constraint Programming (CP) model for this problem and an exact CP-SAT solver, we implement a Deep Neural Network, a Random Forest and Gradient Boosted Trees to predict the optimal period $p$. Experimental results find that, first, ML models (in particular DNNs) can be good predictors of the optimal $p$; and, second, providing tight bounds for $p$ around the predicted value to a CP solver significantly reduces the solving time without compromising the optimality of the solutions.
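A hedged sketch of the pipeline follows: learn a size-invariant predictor of the optimal period from instance statistics, then hand the CP solver bounds around the prediction. The feature choices, the synthetic corpus, and the plus/minus 10% band are illustrative assumptions.

```python
# Size-invariant statistics -> regressor -> objective bounds for the solver.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def features(instance):
    """Size-invariant statistics of an instance's processing-time windows."""
    lo, hi = np.array(instance["min_times"]), np.array(instance["max_times"])
    return [lo.mean(), lo.std(), hi.mean(), hi.std(), (hi - lo).mean()]

rng = np.random.default_rng(0)         # synthetic stand-in for a solved corpus
corpus = [{"min_times": rng.integers(10, 50, 8),
           "max_times": rng.integers(50, 120, 8),
           "optimal_p": float(rng.integers(100, 300))}
          for _ in range(200)]

model = GradientBoostingRegressor().fit(
    [features(inst) for inst in corpus],
    [inst["optimal_p"] for inst in corpus])

p_hat = model.predict([features(corpus[0])])[0]
p_lb, p_ub = 0.9 * p_hat, 1.1 * p_hat  # bounds handed to the CP-SAT model;
                                       # widen and re-solve if proven infeasible
```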
Robustness, Privacy, Fairness, and Credibility? Pushing the Boundaries of Economic Design with Deep Learning
ABSTRACT. The design of revenue-maximizing auctions with strong incentive guarantees is a core concern of economic theory. Computational auctions enable online advertising, sourcing, spectrum allocation, and myriad financial markets. Analytic progress in this space is notoriously difficult; since Myerson's 1981 work characterizing single-item optimal auctions, there has been limited progress outside of restricted settings. Motivated by this analytic difficulty, we will instead discuss differentiable economic design: using deep learning techniques to approximate optimal auctions. Differentiable economic design introduces a fresh host of opportunities for research at the intersection of operations research, AI/ML, and microeconomics. We discuss recent progress in some of these open areas, including:
* fairness while maintaining high revenue and strong incentive guarantees, including learning fairness from human preferences;
* certified robustness, that is, verification of claimed strategyproofness of deep learned auctions; and
* expressiveness via different demand functions and other constraints.
Throughout, we highlight areas where the CPAIOR community's expertise could accelerate progress, or open up entirely new research directions.
Local Branching Relaxation Heuristics for Integer Linear Programs
ABSTRACT. Large Neighborhood Search (LNS) is a heuristic algorithm for combinatorial optimization problems (COPs). It starts with an initial solution to the problem and iteratively improves it by searching a large neighborhood around the current solution. LNS relies on destroy heuristics to select the neighborhoods to search in. In this paper, we focus on designing effective and efficient destroy heuristics in LNS for integer linear programs (ILPs), since a wide range of COPs can be represented as ILPs. Local Branching (LB) is a destroy heuristic that computes the subset of variables whose destruction and re-optimization in one iteration of LNS leads to the largest improvement over the current solution. LB is often slow, since it needs to solve an ILP of the same size as the input. Our proposed destroy heuristics, LB-RELAX and its variants, use the linear programming relaxation of LB to select neighborhoods. Empirically, LB-RELAX and its variants compute neighborhoods as effective as those of LB but run faster. They achieve better anytime performance than several baseline destroy heuristics and the Branch-and-Bound algorithm on ILP benchmarks.
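A minimal sketch of the LB-RELAX idea is given below: instead of solving the (expensive) Local Branching ILP, solve its LP relaxation and destroy the variables whose LP values move farthest from the incumbent. `solve_lp_relaxation` is a placeholder for solving the ILP's LP relaxation with the LB constraint added.

```python
import numpy as np

def lb_relax_neighborhood(x_incumbent, solve_lp_relaxation, k):
    """Pick the k binary variables to destroy (free up) in this LNS iteration."""
    x_lp = solve_lp_relaxation(x_incumbent, k)   # LP optimum near the incumbent
    shift = np.abs(np.asarray(x_lp) - np.asarray(x_incumbent))
    return np.argsort(-shift)[:k]                # largest LP-vs-incumbent change
```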
ABSTRACT. In this paper, we present a hybrid optimization architecture that combines incomplete search processes, which are often used to quickly find good-quality solutions to large problems, with clause generation techniques, which are known to be effective at boosting systematic search. In this architecture, clauses are generated once a locally optimal solution is found. We introduce a generic component to store these clauses generated step by step. This component is able to prune neighborhoods by answering queries formulated by the incomplete search process. We define three versions of this clause basis manager and then experiment on an Operations Research problem known as the Orienteering Problem with Time Windows (OPTW) to show the efficiency of the approach.
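A minimal sketch of such a clause basis manager, assuming neighborhoods encoded as Boolean assignments: clauses learned at local optima are stored, and the incomplete search queries whether a candidate move violates any of them. The interface names are illustrative.

```python
class ClauseBasis:
    def __init__(self):
        self.clauses = []                 # each clause: set of literals (var, val)

    def add(self, clause):
        self.clauses.append(set(clause))

    def allows(self, assignment):
        """A clause {(v1, b1), ...} is satisfied if some (v, b) in it holds
        under `assignment` (a dict var -> bool). All clauses must hold."""
        return all(any(assignment.get(v) == b for v, b in c)
                   for c in self.clauses)

basis = ClauseBasis()
basis.add([("x1", True), ("x2", False)])          # learned: x1 or not x2
print(basis.allows({"x1": False, "x2": True}))    # False: clause violated
print(basis.allows({"x1": True,  "x2": True}))    # True
```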
Optimization Bounds from Decision Diagrams in Haddock
ABSTRACT. We study the automatic generation of primal and dual bounds from decision diagrams in constraint programming. In particular, we expand the functionality of the Haddock system to optimization problems, by extending its specification language to include an objective function. We describe how restricted decision diagrams can be compiled in Haddock, analogously to the existing relaxed decision diagrams. Together, they provide primal and dual bounds on the objective function, which can be seamlessly integrated into the constraint programming search. We evaluate our method on the sequential ordering problem and compare the performance of Haddock to a dedicated decision diagram approach.
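A hedged sketch of width-limited decision-diagram compilation, in the spirit of what this entails: a restricted diagram drops nodes beyond the width limit (feasible solutions only, hence a primal bound), while a relaxed diagram merges them (a superset of solutions, hence a dual bound). `expand` and `merge` are problem-specific placeholders, not Haddock's API.

```python
def compile_dd(root, expand, merge, width, mode, n_layers):
    """Width-limited top-down compilation for a maximization objective.
    Each node is a (state, value) pair; expand(node) yields its children;
    merge(nodes) fuses nodes into one whose state over-approximates all of
    theirs (relaxed mode only)."""
    layer = [root]
    for _ in range(n_layers):
        children = [child for node in layer for child in expand(node)]
        children.sort(key=lambda n: n[1], reverse=True)    # best value first
        if len(children) > width:
            if mode == "restricted":
                children = children[:width]                # drop -> primal bound
            else:
                keep, rest = children[:width - 1], children[width - 1:]
                children = keep + [merge(rest)]            # merge -> dual bound
        layer = children
    return max(value for _, value in layer)    # bound on the optimal objective
```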
A Neural Network-Based Distributional Constraint Learning Methodology for Mixed-Integer Stochastic Optimization
ABSTRACT. The use of machine learning methods helps to improve decision-making in different fields. In particular, the idea of bridging predictions (machine learning models) and prescriptions (optimization problems) is gaining attention within the scientific community. One of the main ideas for bridging the two is the so-called Constraint Learning (CL) methodology, where the structure of the machine learning model is treated as a set of constraints to be embedded within the optimization problem, establishing the relationship between a direct decision variable $x$ and a response variable $y$. However, most CL approaches have focused on making point predictions for a specific variable, not taking into account the statistical and external uncertainty faced in the modeling process. In this paper, we extend the CL methodology to deal with uncertainty in the response variable $y$. The novel Distributional Constraint Learning (DCL) methodology makes use of a piecewise-linearizable neural network-based model to estimate the parameters of the conditional distribution of $y$ (dependent on decisions $x$ and contextual information), which can be embedded within mixed-integer optimization problems. In particular, we formulate a stochastic optimization problem by sampling random values from the estimated distribution using a linear set of constraints. In this sense, DCL combines both the high predictive performance of the neural network method and the possibility of generating scenarios to account for uncertainty within a tractable optimization model. The behavior of the proposed methodology is tested on a real-world problem in the context of electricity systems, where a Virtual Power Plant seeks to optimize its operation, subject to different forms of uncertainty, and with price-responsive consumers.
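A hedged sketch of the distributional step follows: a small ReLU network maps decisions and context to the mean and scale of the conditional distribution of $y$, from which scenarios are sampled for the stochastic program. The pure ReLU structure keeps the network piecewise-linearizable, so it can later be embedded in a MILP via standard big-M encodings (not shown); dimensions, training, and data are illustrative.

```python
import torch

class DistNet(torch.nn.Module):
    """ReLU network mapping (x, context) to the mean and scale of y's
    conditional distribution; pure ReLU keeps it piecewise-linearizable."""
    def __init__(self, d_in, hidden=32):
        super().__init__()
        self.body = torch.nn.Sequential(
            torch.nn.Linear(d_in, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 2))            # outputs: (mu, raw_sigma)

    def forward(self, xc):
        mu, raw = self.body(xc).unbind(dim=-1)
        return mu, torch.relu(raw) + 1e-3  # sigma > 0, still piecewise linear

net = DistNet(d_in=4)                  # training on historical data omitted
xc = torch.randn(8, 4)                 # decisions x concatenated with context
mu, sigma = net(xc)
scenarios = mu.unsqueeze(1) + sigma.unsqueeze(1) * torch.randn(8, 100)
# scenarios[i] supplies 100 sampled y-values for the stochastic program's
# linear scenario constraints.
```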
Graphs, Constraints, and Search for the Abstraction and Reasoning Corpus
ABSTRACT. The Abstraction and Reasoning Corpus (ARC) aims at benchmarking the performance of general artificial intelligence algorithms. The ARC's focus on broad generalization and few-shot learning has made it difficult to solve using pure machine learning. A more promising approach has been to perform program synthesis within an appropriately designed Domain Specific Language (DSL). However, these too have seen limited success. We propose Abstract Reasoning with Graph Abstractions (ARGA), a new object-centric framework that first represents images using graphs and then performs a search for a correct program in a DSL that is based on the abstracted graph space. The complexity of this combinatorial search is tamed through the use of constraint acquisition, state hashing, and Tabu search. An extensive set of experiments demonstrates the promise of ARGA in tackling some of the complicated object-centric tasks of the ARC rather efficiently, producing programs that are correct and easy to understand.
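A minimal sketch of the graph-abstraction step: an ARC grid is turned into a graph whose nodes are same-color connected components, over which DSL operations (recolor, move, and so on) and the program search operate. The 4-neighbour abstraction shown here is just one plausible choice.

```python
# Decompose a colored grid into same-color 4-connected components via BFS.
from collections import deque

def components(grid):
    """Yield (color, set_of_cells) for each same-color connected component."""
    h, w, seen = len(grid), len(grid[0]), set()
    for sr in range(h):
        for sc in range(w):
            if (sr, sc) in seen:
                continue
            color, cells, todo = grid[sr][sc], set(), deque([(sr, sc)])
            while todo:
                r, c = todo.popleft()
                if (r, c) in seen or not (0 <= r < h and 0 <= c < w):
                    continue
                if grid[r][c] != color:
                    continue
                seen.add((r, c)); cells.add((r, c))
                todo.extend([(r+1, c), (r-1, c), (r, c+1), (r, c-1)])
            yield color, cells

grid = [[0, 0, 3],
        [0, 3, 3],
        [0, 0, 0]]
for color, cells in components(grid):
    print(color, sorted(cells))        # two nodes: a 0-region and a 3-region
```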
Optimal Shapelets Tree for Time Series Interpretable Classification
ABSTRACT. Time series shapelets underpin a state-of-the-art data mining technique for supervised time series classification tasks.
Shapelets are defined as subsequences that retain the most discriminating power contained in time series.
The main advantage of shapelet-based methods lies in their high interpretability.
Indeed, shapelets can provide the end-user with very helpful insights about the most interesting subsequences.
In this paper, we propose a novel mixed-integer programming model to optimize shapelet discovery based on optimal binary decision trees.
Our formulation provides a flexible and adaptable classification framework that is interpretable with respect to both the mathematical model and the final output.
Computational results for a large class of datasets show that our approach achieves performance comparable with state-of-the-art shapelet-based classification methods.
To the best of our knowledge, our model is the first approach based on optimal decision tree induction for time series classification.
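A minimal sketch of the quantity such a tree branches on: the distance between a time series and a shapelet, i.e. the best match over all sliding windows. A tree node then routes a series left or right by comparing this distance to a learned threshold; the data below is illustrative.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between `shapelet` and any equal-length
    subsequence (sliding window) of `series`."""
    L = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, L)
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

series = np.sin(np.linspace(0, 6 * np.pi, 100))
shapelet = series[40:50] + 0.01              # a near-perfect match exists
print(shapelet_distance(series, shapelet))   # close to 0

# At a tree node: go left if shapelet_distance(series, s) <= threshold.
```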
20:00-23:59 Banquet
Banquet at Le Galet, located at 3 Promenade des Anglais