IOS2022: INFORMS OPTIMIZATION SOCIETY CONFERENCE 2022
PROGRAM FOR SUNDAY, MARCH 13TH


08:30-10:00 Session SA1: Tutorial: From Reinforcement Learning to Sequential Decision Analytics
08:30
Tutorial: A Unified Framework for Sequential Decisions Under Uncertainty

ABSTRACT. Sequential decision problems are universal, spanning engineering, the sciences, e-commerce, transportation, finance, health, … They include dynamic resource allocation problems, control problems, stochastic search, simulation-optimization, active learning (bandit) problems, adversarial games, and multiagent problems (any problem that involves the sequence decision, information, decision, information, …). These problems have been addressed by over 15 different communities (stochastic programming and dynamic programming are just two). I will present a universal framework that covers all of these problems and all of the fields of research that have emerged to address them. The point of departure from most communities is the focus on optimizing over policies. I will then describe four (meta)classes of policies that I claim cover every method that has ever been proposed in the academic literature or used in industry practice. This framework increases the visibility of all fields, since none of the four classes is a panacea. Furthermore, the idea of designing hybrids is especially powerful. I will highlight a particularly powerful hybrid for multistage stochastic programs using a parameterized deterministic lookahead, which represents an important emerging area of research for the optimization community.
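For reference, the framework optimizes over policies mapping states to decisions; a minimal sketch of the canonical objective, in generic notation not drawn from the abstract (state $S_t$, decision $X^{\pi}(S_t)$, exogenous information $W_{t+1}$, contribution $C$, transition function $S^M$), is

\[
\max_{\pi}\; \mathbb{E}\left[\sum_{t=0}^{T} C\bigl(S_t, X^{\pi}(S_t)\bigr)\,\Big|\,S_0\right],
\qquad
S_{t+1} = S^{M}\bigl(S_t, X^{\pi}(S_t), W_{t+1}\bigr),
\]

and the four (meta)classes of policies correspond to different ways of constructing the mapping $X^{\pi}$.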

08:30-10:00 Session SA2: Advances in DFO I
Location: Gold Ballroom
08:30
A Trust Region Method for the Optimization of Noisy Functions
PRESENTER: Shigeng Sun

ABSTRACT. The classical convergence analysis of trust region methods assumes that the function, gradient, and Hessian employed at each iteration are exact. In this talk, we consider the case when there are bounded errors in all of these computations and propose a minimal modification of the trust region method that is able to cope with noise. We establish a global convergence result for smooth nonconvex functions: the iterates enter a neighborhood of stationary points and, once they do, cannot stray too far from it. Numerical results are presented.
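To fix ideas, the sketch below shows a generic trust-region loop whose ratio test is relaxed by a bound on the evaluation noise so that bounded errors cannot systematically reject good steps. It is an illustrative sketch under my own simplifications (Cauchy step only, fixed noise bound eps_f), not the algorithm analyzed in the talk.

```python
import numpy as np

def noisy_trust_region(f, grad, x0, delta0=1.0, eps_f=1e-3,
                       eta=0.1, max_iter=200):
    """Generic trust-region loop with a noise-relaxed acceptance test.

    f and grad return noisy function/gradient values; eps_f bounds the
    function-evaluation noise.  Illustrative sketch only.
    """
    x, delta = np.asarray(x0, dtype=float), delta0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < 1e-8:
            break
        # Cauchy step: minimize the linear model within the trust region.
        s = -delta * g / np.linalg.norm(g)
        pred = -g @ s                       # predicted decrease (linear model)
        ared = f(x) - f(x + s)              # actual (noisy) decrease
        # Relax the ratio test by the noise level.
        rho = (ared + 2 * eps_f) / (pred + 2 * eps_f)
        if rho >= eta:
            x, delta = x + s, 2.0 * delta   # accept and expand
        else:
            delta *= 0.5                    # reject and shrink
    return x
```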

08:52
Complexity Analysis of Trust-Region Optimization with Adaptive Sampling for Zeroth-Order Stochastic Oracles
PRESENTER: Yunsoo Ha

ABSTRACT. We present an enhanced stochastic trust-region optimization method with adaptive sampling (ASTRO-DF), in which the search is guided by optimizing an iteratively constructed local model built from objective-value estimates with stochastically chosen sample sizes. A notable feature is that the underdetermined quadratic model with a diagonal Hessian requires fewer function evaluations, which is particularly useful in high dimensions. This paper describes the enhanced algorithm in detail, gives several theoretical results, including iteration complexity, and provides almost-sure convergence guarantees. Numerical experiments using the SimOpt library demonstrate the finite-time superiority of the enhanced ASTRO-DF over the state of the art.

09:14
Analysis of Line Search and Trust Region Methods with Noise
PRESENTER: Albert Berahas

ABSTRACT. In this talk, we present and analyze general nonlinear optimization methods in the presence of noise. Specifically, we analyze a generic line search method and a generic trust region method in the presence of bounded noise in the objective function and an absence of any derivative information. Our theoretical results match those of the deterministic methods, in terms of the dependence on the desired convergence accuracy. Finally, we present some numerical results showing the practical performance of the methods.

09:36
Efficient Implementation of Second-Order (Newton-like) Stochastic Optimization with Zeroth-Order or First-Order Information

ABSTRACT. The talk will discuss several key extensions to the basic simultaneous perturbation stochastic approximation (SPSA) algorithm for optimization in a stochastic setting. In particular, it is known that a stochastic analogue of the well-known (deterministic) Newton-Raphson algorithm provides an asymptotically optimal or near-optimal form of search for optimization (or root-finding). However, directly determining the required Hessian matrix for optimization has often been difficult or impossible in practice. We review state-of-the-art methods for solving the optimization problem of interest while simultaneously estimating the Hessian matrix. Among all SA algorithms, the second-order SPSA (2SPSA) and the second-order stochastic gradient (2SG) are particularly efficient in handling high-dimensional problems, covering both the gradient-free and gradient-based settings. We discuss enhancements to the basic method based on: (i) feedback to reduce the error in the Hessian estimate, (ii) use of a symmetric indefinite matrix factorization to reduce the FLOPs cost from O(p^3) to O(p^2) (p = problem dimension; FLOPs cost is distinct from the classical SPSA-based per-iteration O(1) cost in terms of the number of noisy function evaluations), and (iii) estimation of only the diagonal entries in the Hessian. For all of the enhancements, we discuss the implications for efficiency and asymptotic error bounds.
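As background for the gradient-free setting, the sketch below shows the classical SPSA gradient estimate, which uses two noisy evaluations per estimate regardless of the problem dimension; the constants are illustrative, and the second-order enhancements (i)-(iii) discussed above are not implemented here.

```python
import numpy as np

def spsa_gradient(f, x, c=1e-2, rng=np.random.default_rng()):
    """One simultaneous-perturbation gradient estimate.

    Two noisy evaluations of f suffice for any dimension p, which is the
    classical O(1) per-iteration measurement cost of SPSA.  Sketch only.
    """
    p = x.size
    delta = rng.choice([-1.0, 1.0], size=p)      # Rademacher perturbation
    y_plus, y_minus = f(x + c * delta), f(x - c * delta)
    return (y_plus - y_minus) / (2.0 * c * delta)

def spsa_step(f, x, a=1e-2, c=1e-2, rng=np.random.default_rng()):
    """Basic first-order SPSA update: x_{k+1} = x_k - a * ghat(x_k)."""
    return x - a * spsa_gradient(f, x, c, rng)
```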

08:30-10:00 Session SA3: Convex and Non-Convex Optimization for Machine Learning
Location: Card Room
08:30
Stochastic First-Order Algorithms for a Class of Non-Lipschitz Variational Inequalities

ABSTRACT. We propose new algorithms for computing the weak solutions of a class of deterministic and stochastic monotone and non-Lipschitz variational inequalities (VIs). The main idea of our algorithm is to incorporate a scaling scheme into the gradient projection method. For deterministic VIs, the proposed algorithm computes weak solutions with a convergence rate of O(\log(T)/T), which improves upon the optimal O(1/\sqrt{T}) convergence rate for general monotone and non-Lipschitz VIs. For stochastic VIs, the proposed algorithm computes weak solutions with a convergence rate of O(\log(T)/T + \sigma/\sqrt{T}). We also provide an efficient implementation that exploits the sparse structure of VIs, and numerical results demonstrate the effectiveness of our algorithms.
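For context, using standard definitions and generic notation not taken from the talk: given an operator $F$ and a closed convex feasible set $X$, a weak (Minty) solution is a point $x^\star \in X$ satisfying

\[
\langle F(x),\, x - x^\star \rangle \;\ge\; 0 \qquad \text{for all } x \in X,
\]

and the unscaled gradient projection iteration underlying the proposed scaled variant is $x_{k+1} = \Pi_X\bigl(x_k - \gamma_k F(x_k)\bigr)$, where $\Pi_X$ denotes Euclidean projection onto $X$.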

08:52
Adaptivity and Optimism in Acceleration

ABSTRACT. Optimization algorithms used in machine learning need to ensure strong performance even when employed by users who do not have the time or expertise to select and tune an algorithm for their particular task. We can address this need with adaptive optimization algorithms that automatically adjust their dynamics to the statistics and geometry of the objective encountered, without user intervention. In this talk, I will describe recent developments in "adaptive" or "universal" algorithms that achieve optimal accelerated convergence rates across a variety of settings without prior knowledge of problem parameters such as smoothness or variance. These algorithms are produced by leveraging techniques for adaptive online linear optimization, yielding simple and modular analyses.

09:14
Distributed stochastic inertial-accelerated methods with delayed derivatives for nonconvex problems
PRESENTER: Yangyang Xu

ABSTRACT. Stochastic gradient methods (SGMs) are predominant approaches for stochastic optimization. On smooth nonconvex problems, acceleration techniques have been applied to improve the convergence of SGMs. However, little exploration has been made on applying acceleration to a stochastic subgradient method (SsGM) for nonsmooth nonconvex problems. Also, few efforts have been made to analyze an (accelerated) SsGM with delayed derivatives. The information delay naturally happens in a distributed system.

In this talk, I will present an inertial proximal SsGM for nonsmooth nonconvex stochastic optimization. Our method has guaranteed convergence even with delayed derivatives in a distributed environment. Convergence rate results are established for three problem classes: weakly convex nonsmooth problems with a convex regularizer, composite nonconvex problems with a nonsmooth convex regularizer, and smooth nonconvex problems. In a distributed environment, the convergence rate of the proposed method is slowed down by the information delay. Nevertheless, the slow-down effect decays with the number of iterations for the latter two problem classes. We test the proposed method on three applications. The numerical results clearly demonstrate the advantages of using inertial-based acceleration. Furthermore, we observe higher parallelization speed-up in asynchronous updates over the synchronous counterpart, even though the former uses delayed derivatives.

08:30-10:00 Session SA4: Novel Solution Techniques in Discrete Optimization
08:30
Solving cut-generating linear programs via machine learning
PRESENTER: Danial Davarnia

ABSTRACT. Cut-generating linear programs (CGLPs) play a key role as a separation oracle to produce valid inequalities for the feasible region of optimization problems. Running CGLPs at nodes of the branch-and-bound tree, however, is computationally cumbersome due to the large number of node candidates and the lack of a priori knowledge of which nodes admit useful cutting planes. In this paper, we propose a novel framework based on machine learning to approximate the optimal value of the CGLP, which is the deciding factor in generating cutting planes. Viewing the CGLP as an indicator function of its objective function vector, we show that it can be approximated through conventional data classification techniques. We conduct computational experiments using classification methods such as neural networks. The results suggest that the approximate CGLP obtained from classification achieves a high accuracy rate in a significantly smaller amount of time than modern LP solvers. Our proposed framework can be efficiently applied to a large number of nodes in the branch-and-bound tree to identify the best candidates for applying the CGLP.

08:52
Proximity Results In Mixed-integer Second Order Conic Programming
PRESENTER: Diego Moran

ABSTRACT. We study the distance between the optimal solutions (resp. optimal values) of mixed-integer second order conic programs (SOCPs) and the optimal solutions (resp. optimal values) of their continuous relaxations. In the case of simple SOCPs whose feasible regions are defined by a single Lorentz cone constraint, we show that this distance can be upper bounded in terms of the objective function vector and the matrix defining the SOCP, independently of the r.h.s. In contrast, if the feasible region of the SOCP is defined by two or more Lorentz cone constraints, then we show that, in general, the distance cannot be bounded independently of the r.h.s. defining the SOCP, not even when the feasible region is compact or one of the Lorentz cone constraints defines a polyhedral set.

09:14
On The Structure Of DD-representable MIPs With Application To Unit Commitment

ABSTRACT. Over the past decade, a powerful solution framework called Decision Diagrams (DDs) was introduced and successfully employed to solve integer programs. However, the question of whether DDs can be extended to model mixed-integer programs (MIPs) has remained unanswered. In this talk, we first address this question by providing both necessary and sufficient conditions for a general MIP to be modeled by DDs, and then present a DD-based method to model and solve general MIPs. To show the practicality of our framework, we apply it to a stochastic variant of the unit commitment problem. Computational experiments show that the proposed method improves solution times compared with modern solvers.

09:36
Second-Order Conic and Polyhedral Approximations of the Exponential Cone: Application to Mixed-Integer Exponential Conic Programs
PRESENTER: Weijun Xie

ABSTRACT. Exponential and logarithmic terms appear in many important applications such as logistic regression, maximum likelihood, relative entropy, and so on. Since the exponential cone can be viewed as the epigraph of the perspective of the natural exponential function or the hypograph of the perspective of the natural logarithm function, many mixed-integer nonlinear convex programs involving exponential or logarithm functions can be recast as mixed-integer exponential conic programs (MIECPs). To harvest the past efforts on MILPs and MISOCPs, this paper presents second-order conic (SOC) and polyhedral approximation schemes for the exponential cone with application to MIECPs. To do so, we first extend and generalize existing SOC approximation approaches in the extended space, propose new scaling and shifting methods, prove approximation accuracies, and derive lower bounds of approximations. We then study the polyhedral outer approximation of the exponential cones in the original space using gradient inequalities, show its approximation accuracy, and derive a lower bound of the approximation. When implementing SOC approximations, we suggest testing smaller cases to learn the approximation pattern and then applying it to large-scale cases; for the polyhedral approximation, we suggest using the cutting plane method when solving the continuous ECP and a branch-and-cut method for MIECPs.
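For reference, under one common convention (included for context rather than taken from the abstract), the exponential cone is the closed convex set

\[
K_{\exp} \;=\; \operatorname{cl}\Bigl\{ (x_1,x_2,x_3)\in\mathbb{R}^3 \;:\; x_2>0,\;\; x_2\,e^{x_1/x_2}\le x_3 \Bigr\},
\]

i.e., the closure of the epigraph of the perspective of $e^{x}$; setting $x_2=1$ recovers the ordinary epigraph $e^{x_1}\le x_3$.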

08:30-10:00 Session SA5: Optimization Models and Methods for Disaster Relief and Emergency Response
Location: Furman Room
08:30
Using OR to Prioritize Transportation Network Resilience Investments

ABSTRACT. Transportation systems provide fundamental infrastructure and services for evacuation, disaster response, and disaster recovery, as well as everyday activities. In evacuation, the focus of transportation is outbound from the hazardous/threatened area. During humanitarian relief operations, the focus is on bringing supplies into the area. While repairing damage and restoring services, transportation systems need to connect supplies, equipment, and people to affected areas and infrastructures. Given the different roles of the transportation systems during different periods, prioritizing investments for improving transportation resilience can be challenging. General categories of investments and resilience improvement strategies for transportation networks include, but are not limited to, retrofitting network elements and designing them to higher standards, designing the network with extra capacity and redundancy, expanding the network through new paths and/or capacity, responding to the disruption with flexible uses of the network, and having repair resources that can be accessed quickly. However, funding resilience improvements is a challenge since there are few dedicated funding streams and many competing needs exist. This presentation provides an overview of different roles operations research (OR) plays in addressing this complex issue.

08:52
Shared-Mobility-Based Evacuation Planning under Demand Uncertainty
PRESENTER: Kati Moug

ABSTRACT. Many emergency-evacuation plans in the United States rely on the use of personal vehicles, which can lead to congestion on main travel routes and exclude carless populations. Ridesharing reduces the number of vehicles leaving an area, yet it requires extensive planning and coordination at a time when information about an emergency is not yet available. In this work, we employ two-stage stochastic integer programming formulations for optimizing shared-mobility-based evacuation under uncertainty in geographic and temporal demand for rides. Our models optimize the planning decisions – in particular, the volume and spatial distribution of volunteer drivers recruited – with respect to either expectation-based or chance-constrained criteria, to make efficient decisions that mitigate the risk of unserved demand. We implement our models using real-time traffic data collected during an evacuation for Hurricane Florence in Charleston County, SC, demonstrating approaches to handle large-scale, practical instances. We discuss the benefits and disadvantages of each criterion and implications for practitioners.

09:14
Drone-Network Queuing Optimization Model for Emergencies
PRESENTER: Miguel Lejeune

ABSTRACT. We propose a queueing optimization model for the design of a drone network delivering automated external defibrillators in response to out-of-hospital cardiac arrests (OHCA). The network is modeled as a collection of queues with uncertain arrival of OHCAs and drone service times. The model takes the form of an integer nonlinear model with fractional and bilinear terms and minimizes the average response time which is conducive to maximizing the chance of survival of OHCA patients. We present a reformulation and a modular solution method. In particular, we propose optimality-based bound tightening methods to derive multiple variable bounds at once. We use real cardiac arrest data for Virginia Beach to ascertain the computational efficiency and scalability of the algorithmic method, to validate the modeling approach, and to analyze the impact of delivery mode, network capacity, and drone technology on response time and probability of survival.

09:36
Determining Optimal Fuel Strategies Under Uncertainty
PRESENTER: Faith Idoko

ABSTRACT. In preparation for a hurricane, affected individuals are often required to evacuate. Evacuations can cause sharp increases in the demand for gasoline. The purpose of this research is to determine alternate delivery locations and times for fuel vessels scheduled to arrive at a port that is impacted by an approaching hurricane. Motivated by the impact of Hurricane Irma in Florida, we develop a multi-period stochastic scheduling model that incorporates hurricane advisories, scheduled fuel deliveries, maritime port storage capacity, and vessel capacity in a maritime port network. Our model determines the best delivery schedule based on multiple objectives: the first minimizes total unmet demand at each port, and the second captures inequity in unmet demand across the port network. We conduct a numerical study based on fuel delivery data from maritime ports in Florida. The results show we are able to determine alternative delivery schedules for each vessel. One of the key findings is that port availability is the driving factor in determining the vessel schedules.

08:30-10:00 Session SA6: Network Interdiction
Location: Palmetto Room
08:30
Optimizing Cooperative Interdependent Attack Graphs to compromise cyberinfrastructures
PRESENTER: Joseph Alameda

ABSTRACT. As infrastructures in our society become increasingly interdependent on one another, the possibility of attacks on these infrastructures utilizing these interdependencies also increases. In this paper, we consider attacks from multiple attackers working semi-cooperatively to reach their goals. Moreover, we model the possibility that more than one exploit may have to be used by an attacker to reach their objective. We provide a mixed-integer program for this attacker model, and we provide a bounding algorithm that iteratively improves a series of upper and lower bounds. Furthermore, we provide preliminary analysis showing the computational effectiveness of our algorithm when compared to the MIP.

08:52
Multi-period Max Flow Network Interdiction with Restructuring for Disrupting Domestic Sex Trafficking Networks
PRESENTER: Daniel Kosmas

ABSTRACT. We introduce a new class of max flow network interdiction problems with restructuring, where interdictions and restructuring are implemented over time. We advance the modeling of sex trafficking networks from a max flow perspective. We also introduce a model with a second interdictor that has limited ability to disrupt the network. We use column and constraint generation with partial information to solve this problem. We test our model on synthetic sex trafficking networks and discuss policy recommendations based on model prescriptions.

09:14
Best Response Intersection: An Optimal Algorithm for Interdiction Defense
PRESENTER: Andrew Mastin

ABSTRACT. We define the interdiction defense problem as a game over a set of targets with three stages: a first stage where the defender protects a subset of targets, a second stage where the attacker observes the defense decision and attacks a subset of targets, and a third stage where the defender optimizes a system using only the surviving targets. We present a novel algorithm for optimally solving such problems that uses repeated calls to an attacker's best response oracle. For cases where the defender can defend at most k targets and the attacker can attack at most z targets, we prove that the algorithm makes at most ${ k + z \choose k }$ calls to the oracle. In application to the direct current optimal power flow problem, we present a new mixed integer programming formulation with bounded big-M values to function as a best response oracle. We use this oracle along with the algorithm to solve a defender-attacker-defender version of the optimal power flow problem. On standard test instances, we find solutions with larger values of k and z than shown in previous studies, and with runtimes that are an order of magnitude faster than column and constraint generation.

09:36
Network Interdiction Games with Heterogeneous Transportation and Interdiction Costs
PRESENTER: Mathieu Dahan

ABSTRACT. In this talk, we consider a generic strategic interdiction game on a capacitated flow network. The game involves a routing entity that sends its flow through the network while facing heterogeneous path transportation costs, and an interdictor who simultaneously interdicts one or more edges while facing heterogeneous edge interdiction costs. Both players can randomize over their sets of actions. To solve this simultaneous game, we prove the existence of a probability distribution over subsets of a finite partially ordered set (poset) that satisfies a set of constraints involving marginal probabilities of the poset's elements and maximal chains. Using this existence result on posets and strict complementary slackness in linear programming, we show that the Nash equilibria of our network interdiction game can be fully described using primal and dual solutions of a minimum-cost circulation problem. Our analysis leads to a polynomial-time approach for equilibrium computation and provides a new characterization of critical components in strategic flow interdiction problems.

10:30-11:30 Session Plenary 1: Machine Learning and Optimization for Engineering
Location: Gold Ballroom
10:30
Machine Learning and Optimization for Engineering

ABSTRACT. This talk reviews some progress at the intersection of machine learning and optimization for engineering applications. It reviews a number of new paradigms where their combination brings unique benefits, including optimization proxies, end-to-end learning, learning to optimize, and end-to-end optimization. The methodological advances are illustrated on applications in power systems, mobility, manufacturing, and ethical decision making.

13:00-14:30 Session SB1: Applications and Theory of Chance Constrained Optimization
13:00
ALSO-X and ALSO-X+: Better Convex Approximations for Chance Constrained Programs
PRESENTER: Nan Jiang

ABSTRACT. The chance constrained program (CCP) is often nonconvex and difficult to solve to optimality, so much effort has been devoted to developing convex inner approximations for a CCP, among which the conditional value-at-risk (CVaR) approximation has been known to be the best for more than a decade. This paper studies and generalizes ALSO-X, originally proposed by Ahmed, Luedtke, SOng, and Xie (2017), for solving a CCP. We first show that ALSO-X resembles a bilevel optimization, where the upper-level problem is to find the best objective function value and enforce the feasibility of the CCP for a given decision from the lower-level problem, and the lower-level problem is to minimize the expectation of constraint violations subject to the upper bound on the objective function value provided by the upper-level problem. This interpretation motivates us to prove that when uncertain constraints are convex in the decision variables, ALSO-X always outperforms CVaR. We further show (i) sufficient conditions under which ALSO-X can recover an optimal solution to a CCP; (ii) an equivalent bilinear programming formulation of a CCP, inspiring us to enhance ALSO-X with a convergent alternating minimization method; (iii) an extension to distributionally robust chance constrained programs under the \infty-Wasserstein ambiguity set.
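In generic notation of my own (not taken from the abstract), for a CCP of the form $\min\{c^\top x : \mathbb{P}[g(x,\xi)\le 0]\ge 1-\epsilon,\ x\in X\}$, the bilevel reading described above can be sketched as

\[
\min_{t}\; t \quad \text{s.t.} \quad \mathbb{P}\bigl[g(x(t),\xi)\le 0\bigr]\ge 1-\epsilon,
\qquad
x(t)\in\arg\min_{x\in X}\Bigl\{\,\mathbb{E}\bigl[\max\{g(x,\xi),0\}\bigr] \;:\; c^\top x\le t\,\Bigr\},
\]

with the scalar bound $t$ typically located by bisection on the objective value.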

13:22
Optimization under rare chance constraints

ABSTRACT. Chance constraints provide a principled framework to mitigate the risk of high-impact extreme events by modifying the controllable properties of a system. The low probability and rare occurrence of such events, however, impose severe sampling and computational requirements on classical solution methods that render them impractical. This work proposes a novel sampling-free method for solving rare chance constrained optimization problems affected by uncertainties that follow general Gaussian mixture distributions. By integrating modern developments in large deviation theory with tools from convex analysis and bilevel optimization, we propose tractable formulations that can be solved by off-the-shelf solvers. Our formulations enjoy several advantages compared to classical methods: their size and complexity is independent of event rarity, they do not require linearity or convexity assumptions on system constraints, and under easily verifiable conditions, serve as safe conservative approximations or asymptotically exact reformulations of the true problem. Computational experiments on linear, nonlinear and PDE-constrained problems from applications in portfolio management, structural engineering and fluid dynamics illustrate the broad applicability of our method and its advantages over classical sampling-based approaches in terms of both accuracy and efficiency.

13:44
A risk-averse and chance-constrained two-stage stochastic programming model for pharmaceutical supply chain management under demand uncertainty
PRESENTER: Hamed Rahimian

ABSTRACT. We study an inventory management problem for a pharmaceutical supply chain under demand uncertainty. To model the problem, we propose a risk-averse chance-constrained two-stage stochastic program, where a convex combination of the expected value and the conditional value-at-risk (CVaR) of the cost is included in the objective function. Moreover, we enforce the demand satisfaction with a chance constraint. To solve the resulting model, we develop various decomposition algorithms, capable of handling the CVaR and the chance constraint. We conduct extensive numerical experiments to compare the computational efficiency of different decomposition algorithms. Moreover, we provide insights on the practical implications of our model at different risk levels.

14:06
Stochastic lot-sizing with substitution and service level constraints
PRESENTER: Jim Luedtke

ABSTRACT. We consider a multi-stage stochastic lot-sizing problem with service level constraints and product substitution. A firm has multiple products and when meeting demand has the option to meet demand from substitutable products at a cost. Considering the uncertainty in future demands, the firm wishes to make ordering decisions in every period such that the probability that all demands can be met in the next period is at least a given target. We propose a rolling horizon policy in which a two-stage joint chance-constrained stochastic program is solved to make decisions in each time period. We demonstrate how to effectively solve this formulation, and compare the resulting policy to policies based on deterministic approximations, demonstrating that the proposed policy can achieve the service levels more reliably and at lower cost. We also explore the value of product substitution in this model, demonstrating that allowing substitution allows achieving service levels at significantly reduced costs, and that the majority of the benefit can be obtained with limited levels of substitution allowed.

13:00-14:30 Session SB2: Advances in DFO II
Location: Gold Ballroom
13:00
Derivative-Free Optimization via Adaptive Sampling Strategies
PRESENTER: Cem Karamanli

ABSTRACT. In this talk, we analyze the performance of derivative-free methods for solving stochastic optimization problems. We consider different estimators to approximate the first derivatives using only stochastic function evaluations and employ adaptive sampling mechanisms to control the accuracy in stochastic approximations. The methods we consider in this study are finite difference methods, smoothing techniques, randomized coordinate descent, and randomized subspace methods. We provide theoretical convergence results to a neighborhood of the solution and compare the worst-case complexity results of these methods. We also make an empirical comparison of these methods on logistic regression and nonlinear least-squares problems.
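For concreteness, the sketch below shows two of the generic estimator families named above (forward finite differences and Gaussian smoothing) built from function evaluations only; the sample sizes, smoothing radii, and step sizes are placeholders rather than the adaptive choices analyzed in the talk.

```python
import numpy as np

def fd_gradient(f, x, h=1e-3):
    """Forward finite-difference gradient estimate from (noisy) evaluations.

    Uses p + 1 function evaluations for a p-dimensional x.  Sketch only.
    """
    p = x.size
    f0 = f(x)
    g = np.empty(p)
    for i in range(p):
        e = np.zeros(p)
        e[i] = 1.0
        g[i] = (f(x + h * e) - f0) / h
    return g

def gaussian_smoothing_gradient(f, x, mu=1e-2, num_samples=20,
                                rng=np.random.default_rng()):
    """Gaussian-smoothing gradient estimate averaged over random directions."""
    p = x.size
    g = np.zeros(p)
    for _ in range(num_samples):
        u = rng.standard_normal(p)
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / num_samples
```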

13:22
A Derivative-Free Algorithm for Least-Squares that Exploits Prior Simulation Outputs
PRESENTER: Shima Dezfulian

ABSTRACT. We propose a model-based algorithm for solving black-box least-squares problems where the aim is to find parameters of an expensive simulation procedure that best match known observations. We assume that outputs from prior simulations are available and exploit them to reduce the number of simulations required by our algorithm. Using this prior information, we approximate simulation outputs when possible in order to build local models of the objective function. We discuss the convergence properties of our algorithm and provide the results of numerical experiments.

13:44
Stochastic Zeroth-order Riemannian Derivative Estimation and Optimization
PRESENTER: Jiaxiang Li

ABSTRACT. We consider stochastic zeroth-order optimization over Riemannian submanifolds embedded in Euclidean space, where the task is to solve Riemannian optimization problems with only noisy objective function evaluations. Towards this, our main contribution is to propose estimators of the Riemannian gradient and Hessian from noisy objective function evaluations, based on a Riemannian version of the Gaussian smoothing technique. The proposed estimators overcome the difficulty of non-linearity of the manifold constraint and issues that arise in using Euclidean Gaussian smoothing techniques when the function is defined only over the manifold. We use the proposed estimators to solve Riemannian optimization problems in the following settings for the objective function: (i) stochastic and gradient-Lipschitz (in both nonconvex and geodesic convex settings), (ii) sum of gradient-Lipschitz and non-smooth functions, and (iii) Hessian-Lipschitz. For these settings, we analyze the oracle complexity of our algorithms to obtain appropriately defined notions of $\epsilon$-stationary point or $\epsilon$-approximate local minimizer. Notably, our complexities are independent of the dimension of the ambient Euclidean space and depend only on the intrinsic dimension of the manifold under consideration. We demonstrate the applicability of our algorithms by simulation results and real-world applications on black-box stiffness control for robotics and black-box attacks to neural networks.

13:00-14:30 Session SB3: Algorithmic Dynamics in Learning Problems
Location: Card Room
13:00
SGD in the Large: Exact Dynamics, Average-case Analysis, and Stepsize Criticality

ABSTRACT. In this talk, I will present a framework, inspired by random matrix theory, for analyzing the dynamics of stochastic algorithms (e.g., stochastic gradient descent (SGD), and momentum) when both the number of samples and dimensions are large. Using this new framework, we show that the dynamics of SGD on a least squares problem with random data become deterministic in the large sample and dimensional limit. Moreover, the limiting dynamics for stochastic algorithms are governed by a Volterra integral equation. This model predicts that SGD undergoes a phase transition at an explicitly given critical stepsize that ultimately affects its convergence rate, which we also verify experimentally. Finally, when input data is isotropic, we provide explicit expressions for the dynamics and average-case convergence rates. These rates show significant improvement over the worst-case complexities.

13:22
Sharp convergence guarantees for iterative algorithms in random optimization problems
PRESENTER: Ashwin Pananjady

ABSTRACT. Iterative algorithms are the workhorses of modern statistical learning, and are widely used to fit high-dimensional models to random data. We develop a framework that produces sharp, iterate-by-iterate characterizations of solution quality for a large family of algorithms run on nonconvex problems with Gaussian data, enabling rigorous comparisons between algorithms in terms of their convergence behavior. I will present the framework and use it to show (in particular) that it can yield sharp global convergence guarantees for higher-order algorithms in some nonconvex model-fitting problems.

13:44
Overparameterized learning beyond the lazy regime

ABSTRACT. Modern learning models are typically trained in an over-parameterized regime where the parameters of the model far exceed the size of the training data. Due to over-parameterization, these models in principle have the capacity to (over)fit any set of labels, including pure noise. Despite this high fitting capacity, somewhat paradoxically, these models trained via first-order methods continue to predict well on yet unseen test data. In this talk I aim to demystify this phenomenon by focusing on overparameterized learning in the context of low-rank reconstruction from a few measurements. For this problem I will show that despite the presence of many global optima, gradient descent from small random initialization converges to a generalizable solution and finds the underlying low-rank matrix. Notably, this analysis is not in the “lazy” training regime and is based on an intriguing phenomenon uncovering the critical role of small random initialization: a few iterations of gradient descent behave akin to popular spectral methods. We also show that this implicit spectral bias from small random initialization, which is provably more prominent for overparameterized models, puts the gradient descent iterations on a particular trajectory towards solutions that are not only globally optimal but also generalize well.

14:06
Algorithmic Regularization in Model-free Overparametrized Matrix Factorization
PRESENTER: Lijun Ding

ABSTRACT. We study the asymmetric matrix factorization problem under a natural nonconvex formulation with arbitrary overparametrization. We consider the model-free setting with no further assumption on the rank or singular values of the observed matrix, where the global optima provably overfit. We show that vanilla gradient descent with small random initialization and early stopping produces the best low-rank approximation of the observed matrix, without any additional regularization. We provide a sharp analysis of the relationship between the iteration complexity, initialization size, stepsize, and final error. In particular, our complexity bound is almost dimension-free and depends logarithmically on the final error, and our results have more lenient requirements on the stepsize and initialization, which improves upon existing work and agrees with numerical experiments.
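A minimal sketch of the training regime described above: vanilla gradient descent on the unregularized factorization objective with small random initialization and a fixed iteration budget standing in for early stopping. All constants here are illustrative, not the paper's.

```python
import numpy as np

def factorize_gd(Y, r, step=0.05, init_scale=1e-6, num_iters=2000,
                 rng=np.random.default_rng(0)):
    """Vanilla gradient descent on 0.5 * ||Y - L @ R.T||_F^2.

    Small random initialization (init_scale) and a fixed iteration budget
    (an early-stopping proxy) follow the regime in the abstract; r may be
    chosen larger than the target rank (overparametrization).
    """
    m, n = Y.shape
    L = init_scale * rng.standard_normal((m, r))
    R = init_scale * rng.standard_normal((n, r))
    for _ in range(num_iters):
        E = L @ R.T - Y                                   # residual
        L, R = L - step * (E @ R), R - step * (E.T @ L)   # simultaneous update
    return L, R
```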

13:00-14:30 Session SB4: Recent Advances in Integer and Matrix Optimization
13:00
A new perspective on low-rank optimization
PRESENTER: Ryan Cory-Wright

ABSTRACT. A key question in many low-rank problems throughout optimization, machine learning, and statistics is to characterize the convex hulls of simple low-rank sets and judiciously apply these convex hulls to obtain strong yet computationally tractable convex relaxations. We invoke the matrix perspective function — the matrix analog of the perspective function — and characterize explicitly the convex hull of epigraphs of convex quadratic, matrix exponential, and matrix power functions under low-rank constraints. Further, we exploit these characterizations to develop strong relaxations for a variety of low-rank problems including reduced rank regression, non-negative matrix factorization, and factor analysis. We establish that these relaxations can be modeled via semidefinite and matrix power cone constraints, and thus optimized over tractably. The proposed approach parallels and generalizes the perspective reformulation technique in mixed-integer optimization, and leads to new relaxations for a broad class of problems.

13:22
A Graph-based Decomposition Method for Convex Quadratic Optimization with Indicators
PRESENTER: Peijing Liu

ABSTRACT. We study convex quadratic optimization problems arising in sparse regression, where additionally the model matrix defining the quadratic term in the objective is sparse. We first use a graphical representation of the support of the matrix, and show that if this graph is a path, then we can solve the associated problem in polynomial time. This enables us to construct a compact extended formulation for the closure of the convex hull of the epigraph of the mixed-integer convex problem. Furthermore, we propose a novel decomposition method using Fenchel duality for general sparse matrices, which leverages the efficient algorithm for the path case and produces high-quality solutions in a fraction of the time required to solve the problems using state-of-the-art mixed-integer optimization solvers.

13:44
The backbone method for ultra-high dimensional sparse machine learning

ABSTRACT. We present the backbone method, a general framework that enables sparse and interpretable supervised machine learning methods to scale to ultra-high dimensional problems. We solve sparse regression problems with 10^7 features in minutes and 10^8 features in hours, as well as decision tree problems with 10^5 features in minutes. The proposed method operates in two phases: we first determine the backbone set, consisting of potentially relevant features, by solving a number of tractable subproblems; then, we solve a reduced problem, considering only the backbone features. For the sparse regression problem, our theoretical analysis shows that, under certain assumptions and with high probability, the backbone set consists of the truly relevant features. Numerical experiments on both synthetic and real-world datasets demonstrate that our method outperforms or competes with state-of-the-art methods in ultra-high dimensional problems, and competes with optimal solutions in problems where exact methods scale, both in terms of recovering the truly relevant features and in its out-of-sample predictive performance.
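To illustrate the two-phase structure, the sketch below uses an l1-regularized regression as a generic stand-in for the tractable subproblems; the paper's own subproblems and reduced problem are its sparse formulations, and the `Lasso` stand-in, subproblem sizes, and regularization level here are my assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def backbone_features(X, y, num_subproblems=20, features_per_subproblem=1000,
                      alpha=0.1, rng=np.random.default_rng(0)):
    """Phase 1 of a backbone-style method (illustrative sketch).

    Each tractable subproblem is solved on a random subset of columns;
    the backbone is the union of features selected by any subproblem.
    """
    n, p = X.shape
    backbone = set()
    for _ in range(num_subproblems):
        cols = rng.choice(p, size=min(features_per_subproblem, p),
                          replace=False)
        model = Lasso(alpha=alpha).fit(X[:, cols], y)
        backbone.update(cols[np.flatnonzero(model.coef_)])
    return np.array(sorted(backbone))

# Phase 2: solve the (much smaller) sparse problem restricted to
# X[:, backbone_features(X, y)].
```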

13:00-14:30 Session SB5: Emerging Applications in Urban Transportation
Location: Furman Room
13:00
Distributionally Robust Optimization Approaches for a Stochastic Mobile Facility Routing and Scheduling Problem

ABSTRACT. We present two distributionally robust optimization (DRO) models for a mobile facility (MF) fleet sizing, routing, and scheduling problem (MFRSP) with random demand. These models seek to minimize the fixed cost of establishing the MF fleet and the worst-case expectation/mean-conditional value-at-risk penalty costs of transportation and unmet demand. We develop a decomposition-based algorithm to solve the proposed DRO models and derive valid lower bound inequalities to strengthen the master problem and speed up convergence. In addition, we derive two families of symmetry-breaking inequalities, which break symmetries in the solution space of routing and scheduling decisions and thus improve the solvability of the models. Finally, we present extensive computational experiments comparing the operational and computational performance of the proposed DRO models and a stochastic programming model and drive insights into the MFRSP.

13:22
Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle Relocation and Pricing Decisions

ABSTRACT. Large-scale ride-hailing systems often combine real-time routing at the individual request level with a macroscopic Model Predictive Control (MPC) optimization for dynamic pricing and vehicle relocation. The MPC relies on a demand forecast and optimizes over a longer time horizon to compensate for the myopic nature of the routing optimization. However, the longer horizon increases computational complexity and forces the MPC to operate at coarser spatial-temporal granularity, degrading the quality of its decisions. This paper addresses these computational challenges by learning the MPC optimization. The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions. This makes it possible to use the MPC at higher spatial-temporal fidelity, since the optimizations can be solved and learned offline. Experimental results show that the proposed approach improves quality of service on challenging instances from the New York City dataset.

13:44
Optimal Transportation Mode Selection and Capacity Allocation under Uncertainty
PRESENTER: Avnish Malde

ABSTRACT. We consider an overseas manufacturing supply chain with long lead times and multiple transportation modes, where orders are placed using forecasted demand. The forecast error, the difference between the forecasted and actual demand quantity, is considered an uncertain parameter. We also assume that excess inventory in the warehouse at the beginning of each period is uncertain. Order quantities of a part for each transportation mode must be determined. We model this problem using a two-stage stochastic programming approach to minimize the overall expected order procurement, inventory holding, and backorder costs under demand and inventory uncertainty. We use the Sample Average Approximation (SAA) method to solve our two-stage stochastic program. We run our experiments using simple random sampling as well as the stratified sampling technique called Latin Hypercube Sampling (LHS) to generate random samples from a continuous distribution in each replication of the SAA. We compare the results obtained using the simple random sampling and stratified sampling methods. Further, we use a scenario decomposition-based method, the Progressive Hedging Algorithm (PHA), to solve the extensive form of the two-stage stochastic programming problem generated in each replication of the SAA. We evaluate the performance of our solution algorithm at different levels of forecast error and inventory uncertainty.

14:06
Scheduling Mobile Parcel Lockers Under Customer Choice
PRESENTER: Yiling Zhang

ABSTRACT. Movable units equipped with sets of parcel lockers have recently been developed as a new mode to improve the efficiency of last-mile delivery. Unlike the permanent facilities placed in a given region in traditional facility location problems, mobile parcel lockers can be relocated at any time in response to demand rates that change from location to location over time. At each parking location, the demand is aggregated from the customers residing nearby, and each customer's choice depends on the delivery options available to them. In this work, we schedule mobile parcel lockers under customer choice models. The problem is formulated as a mixed-integer non-convex program. Using integer programming techniques, we derive an equivalent mixed-integer second-order cone programming (MISOCP) reformulation. To further improve the computation, we exploit the problem structure and derive two classes of valid inequalities: one based on a submodular underestimation and the other based on a convex underestimation. The cuts are implemented in a branch-and-cut framework to find a globally optimal solution. Preliminary computational studies suggest the efficiency and effectiveness of the proposed model and solution approaches.

13:00-14:30 Session SB6: Optimization in Communication Networks
Location: Palmetto Room
13:00
An Exact Optimization Approach for Relay Node Location in Wireless Sensor Networks
PRESENTER: Ankan Mitra

ABSTRACT. Wireless Sensor Networks (WSNs) consist of a group of sensors located in a space with the purpose of collecting physical or environmental data of interest, which is then transmitted wirelessly from sensor to sensor until reaching a base station (or sink) for processing. We study the problem of locating relay nodes to improve the connectivity of a set of already deployed sensors in a WSN. In this problem, one or more nodes serve as the collection point of all the information captured by the sensors. Sensors have limited transmission range and hence signals are transmitted through multi-hop routing. As a result, the WSN is connected if there exists a path of transmission from each sensor to the data collection points. The goal is to locate a given number of relay nodes anywhere in the continuous space to maximize the number of sensors connected. We propose an integer programming based solution approach as well as some strengthening techniques and a decomposition algorithm to accelerate the problem solution. We also present an extension for the case where obstacles block the sensors’ transmission ranges. The proposed method achieves solution times many times faster than an alternative (and intuitive) nonlinear formulation.

13:22
Formulations of the max k-cut problem on classical and quantum computers
PRESENTER: Hamidreza Validi

ABSTRACT. Recent claims on "solving" combinatorial optimization problems via quantum computers have attracted many researchers to work on quantum algorithms. The max k-cut problem is a challenging combinatorial optimization problem with multiple notorious mixed integer linear optimization (MILO) formulations. Motivated by recent progress of classical solvers in handling quadratic optimization problems, we revisit the binary quadratic optimization (BQO) formulation of Carlson and Nemhauser (Operations Research, 1966) and provide theoretical and computational comparisons between different mixed integer optimization formulations of the max k-cut problem. While no claim of "quantum advantage" is made, we introduce quadratic unconstrained binary optimization (QUBO) formulations with tight penalty coefficients. We also design quantum circuits for finding feasible solutions of the problem with a quantum approximate optimization algorithm (QAOA). To accelerate the solution process and to handle large-scale instances of the problem, we propose a preprocessing procedure that can be employed in both classical and quantum contexts. Finally, we test the performance of our classical and quantum-inspired formulations on an extensive set of instances. Codes and data are available on GitHub.
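For context, one standard (though not necessarily tight-penalty) QUBO encoding of max k-cut introduces one-hot binaries $x_{vc}$ indicating that vertex $v$ receives color $c$ and, for a penalty weight $P>0$, solves

\[
\min_{x\in\{0,1\}^{|V|\times k}} \;\; \sum_{(u,v)\in E} w_{uv}\sum_{c=1}^{k} x_{uc}\,x_{vc}
\;+\; P\sum_{v\in V}\Bigl(\sum_{c=1}^{k} x_{vc}-1\Bigr)^{2}.
\]

Minimizing the first term is equivalent, up to the constant $\sum_{(u,v)\in E} w_{uv}$, to maximizing the weight of the cut, and the quadratic penalty enforces that each vertex receives exactly one color; how tightly $P$ can be chosen is exactly the kind of question the tight-penalty formulations in the talk address.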

13:44
Maximizing Node Activation Network Flow Problems under Non-simultaneous Flow Assumptions
PRESENTER: Robert Curry

ABSTRACT. In this work, we study a non-simultaneous network flow problem, in which resources are transported through a network of nodes via arcs in order to maximize the value of activated nodes. A node is defined as activated whenever some positive number of resources is allocated or transported to that node during some time period. We assume the network consumes some non-negative units of a finite budget for moving such resources through the network. These problems have applications in disaster relief, wireless sensor network optimization, and defense settings. We present two mixed-integer programs (MIPs) for solving this problem as well as a computationally effective branch-and-price algorithm for its solution. Furthermore, we present a series of preliminary results to display the efficacy of our approach compared to the MIPs.

15:00-16:30 Session SC1: Stochastic and Robust Optimization
15:00
On Data-Driven Prescriptive Analytics with Side Information: A Regularized Nadaraya-Watson Approach

ABSTRACT. We consider generic stochastic optimization problems in the presence of side information which enables a more insightful decision. The side information constitutes observable exogenous covariates that alter the conditional probability distribution of the random problem parameters. A decision-maker who adapts her decisions according to the observed side information solves a stochastic optimization problem where the objective function is specified by the conditional expectation of the random cost. If the joint probability distribution is unknown, then the conditional expectation can be approximated in a data-driven manner using the Nadaraya-Watson (NW) kernel regression. While the emerging approximation scheme has found successful applications in diverse decision problems under uncertainty, it is largely unknown whether the scheme can provide any reasonable out-of-sample performance guarantees. In this talk, we establish guarantees for the generic problems by leveraging techniques from moderate deviations theory. Our analysis motivates the use of a variance-based regularization scheme which, in general, leads to a non-convex optimization problem. We adopt ideas from distributionally robust optimization to obtain tractable formulations. We present numerical experiments for inventory management and wind energy commitment problems to highlight the effectiveness of our regularization scheme.
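For reference, the Nadaraya-Watson approximation of the conditional expectation described above, written in generic notation (historical covariate/uncertainty pairs $(z_i,\xi_i)$, kernel $K$, bandwidth $h$), is

\[
\widehat{\mathbb{E}}\bigl[c(x,\xi)\mid Z=z\bigr] \;=\; \sum_{i=1}^{N}
\frac{K\!\bigl((z-z_i)/h\bigr)}{\sum_{j=1}^{N} K\!\bigl((z-z_j)/h\bigr)}\; c(x,\xi_i),
\]

so the data-driven decision minimizes this kernel-weighted sample average over $x$, with the variance-based regularization discussed in the talk added on top of this estimate.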

15:22
Convex Chance-Constrained Programs with Wasserstein Ambiguity
PRESENTER: Haoming Shen

ABSTRACT. Chance constraints yield non-convex feasible regions in general. In particular, when the uncertain parameters are modeled by a Wasserstein ball, arXiv:1806.07418 and arXiv:1809.00210 showed that the distributionally robust (pessimistic) chance constraint admits a mixed-integer conic representation. This paper identifies sufficient conditions that lead to convex feasible regions of chance constraints with Wasserstein ambiguity. First, when uncertainty arises from the left-hand side of a pessimistic individual chance constraint, we derive a convex and conic representation if the Wasserstein ball is centered around a Gaussian distribution. Second, when uncertainty arises from the right-hand side of a pessimistic joint chance constraint, we show that the ensuing feasible region is convex if the Wasserstein ball is centered around a log-concave distribution (or, more generally, an $\alpha$-concave distribution with $\alpha \geq -1$). In addition, we propose a block coordinate ascent algorithm for this class of chance constraints and prove its convergence to global optimum. Furthermore, we extend the convexity results and conic representation to optimistic chance constraints.

15:44
Markov chain-based policies for multi-stage stochastic integer linear programming with an application to disaster relief logistics
PRESENTER: Yongjia Song

ABSTRACT. We present a multi-stage stochastic programming methodology for the contingency modality activation problem. We consider two contingency modality policies, namely, here-and-now and wait-and-activate, and apply two methods to solve the associated optimization problems: a two-stage linear decision rule approximation and a stochastic dual dynamic programming algorithm embedded in a branch-and-cut framework. Our empirical evaluation on a disaster relief logistics application compares the effectiveness of these methodologies and analyzes the trade-offs between policy flexibility, solution quality, and computational effort.

16:06
Reward-Risk Ratio Optimization under Copula-Based Wasserstein Ambiguity
PRESENTER: Ran Ji

ABSTRACT. We investigate a class of distributionally robust fractional optimization problems under copula-based Wasserstein ambiguity, with applications to reward-risk ratio portfolio optimization. In particular, the copula-based Wasserstein ambiguity set is constructed from a metric-based ambiguity set on the uncertain variables, with an additional constraint on the dependence structure between the marginal distributions of the uncertain variables expressed via copulas. Leveraging conic duality, we provide a finite convex program reformulation for the generic problem with copula-based Wasserstein ambiguity. We derive reformulations for the STARR and Omega ratios in portfolio optimization problems and solve them efficiently via an iterative bisection algorithm. We conduct various numerical experiments to evaluate the efficiency of the proposed solution framework and showcase the robustness and out-of-sample performance of portfolios obtained with the copula-based ambiguity.

15:00-16:30 Session SC2: Stochastic Bilevel Optimization Methods
Location: Gold Ballroom
15:00
On a bilevel optimization approach to fair classification
PRESENTER: Kangwook Lee

ABSTRACT. Bilevel optimization has been widely used in various machine learning areas such as meta-learning, hyperparameter optimization, and adversarial machine learning. In our recent work, we have proposed a bilevel optimization-based approach to fair machine learning. Our proposed framework, which we named FairBatch, keeps the standard training algorithm intact, viewing it as an inner optimizer. It then iteratively updates the weights of the training samples, where the update rules are dictated by a simple iterative solution to the underlying bilevel optimization problem. FairBatch's design is in stark contrast to the existing machine learning applications of bilevel optimization. While bilevel formulations are almost inherent in the existing applications, FairBatch's bilevel formulation looks rather artificial. Surprisingly, we show that this bilevel structure, introduced by design, allows us to tackle challenging problems, thanks to its compositional structure. We will share three recent applications of FairBatch: (1) fair training with corrupted data, (2) fair training with decentralized data, and (3) black-box fair training.

15:22
Bilevel stochastic methods for optimization and machine learning: Bilevel stochastic descent and DARTS

ABSTRACT. Two-level stochastic optimization formulations have become instrumental in a number of machine learning contexts such as neural architecture search, continual learning, adversarial learning, and hyperparameter tuning. Practical stochastic bilevel optimization problems become challenging in optimization or learning scenarios where the number of variables is high or there are constraints.

The goal of this paper is twofold. First, we aim at promoting the use of bilevel optimization in large-scale learning and we introduce a practical bilevel stochastic gradient method (BSG-1) that requires neither lower level second-order derivatives nor system solves (and dismisses any matrix-vector products). Our BSG-1 method is close to first-order principles, which allows it to achieve a performance better than those that are not, such as DARTS. Second, we develop bilevel stochastic gradient descent for bilevel problems with lower level constraints, and we introduce a convergence theory that covers the unconstrained and constrained cases and abstracts as much as possible from the specifics of the bilevel gradient calculation.

15:44
Bi-level Optimization for Robust and Efficient Learning
PRESENTER: Rishabh Iyer

ABSTRACT. In the first part of this talk, we will first formulate the robust learning problem (robust to label noise, imbalance, and out-of-distribution data) as a bi-level optimization problem, and study this in the supervised learning, semi-supervised learning, and meta learning scenarios. We will propose an efficient one-step gradient-based approach and also study theoretical properties like convergence bounds. Finally, we will empirically demonstrate the utility of the approach for robust semi-supervised and robust meta learning problems. In the second part of this talk, we will look at a bi-level subset selection problem called GLISTER, which is a mixed discrete-continuous optimization problem. We will show the utility of this approach for robust and efficient learning, and propose a combinatorial optimization approach for this bi-level problem. Specifically, we will show close connections to submodularity, and also study theoretical properties of the algorithm. We will show that GLISTER can achieve significant speedups (of a magnitude of 5x to 7x) in training times while not significantly losing accuracy.

15:00-16:30 Session SC3: Nonconvex Approaches to Statistical Estimation
Location: Card Room
15:00
Bridging Convex and Nonconvex Optimization in Robust PCA: Noise, Outliers, and Missing Data
PRESENTER: Yuling Yan

ABSTRACT. This paper delivers improved theoretical guarantees for the convex programming approach in low-rank matrix estimation, in the presence of (1) random noise, (2) gross sparse outliers, and (3) missing data. This problem, often dubbed as robust principal component analysis (robust PCA), finds applications in various domains. Despite the wide applicability of convex relaxation, the available statistical support (particularly the stability analysis vis-à-vis random noise) remains highly suboptimal, which we strengthen in this paper. When the unknown matrix is well-conditioned, incoherent, and of constant rank, we demonstrate that a principled convex program achieves near-optimal statistical accuracy, in terms of both the Euclidean loss and the entrywise loss. All of this happens even when nearly a constant fraction of observations are corrupted by outliers with arbitrary magnitudes. The key analysis idea lies in bridging the convex program in use and an auxiliary nonconvex optimization algorithm, and hence the title of this paper.

15:22
Structured Local Solutions in Some Nonconvex Problems

ABSTRACT. This talk focuses on a family of nonconvex problems --- the decomposition of a matrix $Y$ into the product of matrix factors $A$ and $X$: $Y=AX$. Different applications impose different priors on the factors $A$ and $X$, and as the number and intricacy of these priors grow, the nonconvex optimization landscape becomes more complicated. We will discuss a few specific examples, including blind deconvolution, sparse PCA, and clustering, where local solutions provably exist. We describe the geometric structure of local solutions in these problems as well as the algorithmic implications for guaranteed recovery of the global solution, or the ground truth.

15:44
Clustering a mixture of Gaussians with unknown covariance
PRESENTER: Mateo Diaz

ABSTRACT. Clustering is a fundamental data scientific task with broad application. This talk investigates a simple clustering problem with data from a mixture of Gaussians that share a common but unknown, and potentially ill-conditioned, covariance matrix. We start by considering Gaussian mixtures with two equally-sized components and derive a Max-Cut integer program based on maximum likelihood estimation. We show that its solutions achieve the optimal misclassification rate when the number of samples grows linearly in the dimension, up to a logarithmic factor. However, solving the Max-Cut problem appears to be computationally intractable. To overcome this, we develop an efficient spectral algorithm that attains the optimal rate but requires a quadratic sample size. Although this sample complexity is worse than that of the Max-Cut problem, we conjecture that no polynomial-time method can perform better. Furthermore, we present numerical and theoretical evidence that supports the existence of a statistical-computational gap. Finally, we generalize the Max-Cut program to a k-means program that handles multi-component mixtures with possibly unequal weights and has similar guarantees.

16:06
A superlinearly convergent subgradient method for sharp semismooth problems

ABSTRACT. Subgradient methods comprise a fundamental class of nonsmooth optimization algorithms. Classical results show that certain subgradient methods converge sublinearly for general Lipschitz convex functions and converge linearly for convex functions that grow sharply away from solutions. Recent work has moreover extended these results to certain nonconvex problems. In this talk, we ask: is it possible to design a superlinearly convergent subgradient method? We provide a positive answer to this question for a broad class of sharp semismooth functions.

15:00-16:30 Session SC4: Algorithms for Discrete Optimization Problems
15:00
Comparing Solution Paths of Sparse Quadratic Minimization with a Stieltjes Matrix
PRESENTER: Ziyu He

ABSTRACT. This paper studies several solution paths of sparse quadratic minimization problems as a function of the weighting parameter of the bi-objective of estimation loss versus solution sparsity. Three such paths are considered: the l0-path, where the discontinuous l0-function provides the exact sparsity count; the l1-path, where the l1-function provides a convex surrogate of the sparsity count; and the capped l1-path, where the nonconvex nondifferentiable capped l1-function aims to enhance the l1-approximation. Our results deepen the understanding of (old and new) properties of the associated paths, highlight the pros, cons, and tradeoffs of these sparse optimization models, and provide numerical evidence to support the practical superiority of the capped l1-path. Our study of the capped l1-path is the first of its kind, as the path pertains to computable directionally stationary (= strongly locally minimizing in this context, as opposed to globally optimal) solutions of a parametric nonconvex nondifferentiable optimization problem. Motivated by classical parametric quadratic programming theory and reinforced by modern statistical learning studies, both of which cast an exponential perspective in fully describing such solution paths, we also aim to address the question of whether some of these paths can be fully traced in strongly polynomial time in the problem dimensions.
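In our notation (a sketch of the setting rather than the paper's exact formulation), the three paths trace minimizers of the weighted bi-objective

\[
\min_{x}\;\tfrac{1}{2}\,x^{\top}Qx+c^{\top}x+\gamma\,\rho(x),\qquad\gamma\ge 0,
\]

with $Q$ a Stieltjes matrix and $\rho(x)=\|x\|_0$ (the l0-path), $\rho(x)=\|x\|_1$ (the l1-path), or a capped l1-function such as $\rho(x)=\sum_i\min(|x_i|,\delta)/\delta$ for some threshold $\delta>0$ (the capped l1-path).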

15:22
An Abstract Model for Branch-and-Cut

ABSTRACT. Branch-and-cut is the dominant paradigm for solving a wide range of mathematical programming problems -- linear or nonlinear -- combining intelligent search (via branch-and-bound) and relaxation-tightening procedures (via cutting planes, or cuts). While there is a wealth of computational experience behind existing cutting strategies, there is simultaneously a relative lack of theoretical explanations for these choices, and for the tradeoffs involved therein. Recent papers have explored abstract models for branching and for comparing cuts with branch-and-bound. However, to model practice, it is crucial to understand the impact of jointly considering branching and cutting decisions. In this paper, we provide a framework for analyzing how cuts affect the size of branch-and-cut trees, as well as their impact on solution time. Our abstract model captures some of the key characteristics of real-world phenomena in branch-and-cut experiments, regarding whether to generate cuts only at the root or throughout the tree, how many rounds of cuts to add before starting to branch, and why cuts seem to exhibit nonmonotonic effects on the solution process.

15:44
Computational Comparison of Exact Solution Methods for 0-1 Quadratic Programs: Recommendations for Practitioners

ABSTRACT. Binary quadratic programs (BQPs) are among the most well-studied classes of nonlinear integer optimization problems because of their wide variety of applications. While a number of different solution approaches have been proposed for tackling BQPs, practitioners need techniques that are both efficient and easy to implement. We revisit two of the most widely used linearization strategies for BQPs and examine the effectiveness of enhancements to these formulations that have been suggested in the literature. We perform a detailed large-scale computational study over five different classes of BQPs to compare these two linearizations with a more recent linear reformulation and direct submission of the nonlinear integer program to an optimization solver. The goal is to provide practitioners with guidance on how to best approach solving BQPs in an effective and easily implemented manner.
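For reference, the most common linearization underlying such comparisons replaces each product of binary variables with an auxiliary variable constrained by McCormick-type inequalities; this is a generic sketch, not one of the specific enhanced formulations studied in the talk:

\[
y_{ij}\le x_i,\qquad y_{ij}\le x_j,\qquad y_{ij}\ge x_i+x_j-1,\qquad y_{ij}\ge 0,
\]

which forces $y_{ij}=x_ix_j$ whenever $x\in\{0,1\}^n$ and turns the quadratic objective into a linear function of $(x,y)$.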

16:06
Multi-Module Capacitated Lot-Sizing Problem, and its Generalizations with Two-Echelons and Piecewise Concave Production Costs
PRESENTER: Kartik Kulkarni

ABSTRACT. We study new generalizations of the classical capacitated lot-sizing problem with concave cost functions in which the total production (or transportation) capacity in each time period is the summation of capacities of a subset of n available machines (or vehicles) of different capacities. We refer to this problem as Multi-module Capacitated Lot-Sizing Problem without or with Subcontracting, and denote it by MCLS or MCLS-S, respectively. We present exact fixed-parameter tractable (polynomial) algorithms that solve MCLS and MCLS-S in O(T^{2n+3}) time for a given n≥2. We also present exact algorithms for two generalizations of the MCLS and MCLS-S: (a) a lot-sizing problem with piecewise concave production cost functions that takes O(T^{2m+3}) time, where m is the number of breakpoints in these functions, and (b) two-echelon MCLS that takes O(T^{4n+4}) time. Our computational results show that our algorithms are efficient and stable. Furthermore, our algorithm for MCLS-S also addresses another open question related to the existence of a polynomial time algorithm for optimizing a linear function over n-mixing set (a generalization of the well-known 1-mixing set).

15:00-16:30 Session SC5: Learning under Uncertainty
Location: Furman Room
15:00
Online Learning and Pricing for Service Systems with Reusable Resources
PRESENTER: Huiwen Jia

ABSTRACT. We consider a price-based revenue management problem with finite reusable resources over a finite time horizon. The stochastic arrival and service rates of customers all depend on the price but are unknown to the firm. We propose two new multi-armed bandit based learning algorithms for finding near-optimal pricing policies. The salient difficulties of this problem lie in (i) the unknown rate-and-price mapping information, (ii) the dynamic nature of reusable resources being committed over time, (iii) the transient behavior of the service system when the price changes, and (iv) unbounded and heavy-tailed distributions of observed random variables. Our algorithms separate the learning horizon into successive operational batches and select a price from a prescribed set in each batch using past sales collected in previous batches. We prove that the cumulative regret is O(√(PT log T)), where T is the total number of time periods and P is the cardinality of the feasible price set; this result matches the lower bound up to a logarithmic factor. As an intermediate step, we also develop a coupling analysis for analyzing the time for a queue to reach the steady state from an empty state or from a steady state under another set of system parameters.

15:22
Online Stochastic Optimization with Wasserstein Based Non-stationarity
PRESENTER: Jiashuo Jiang

ABSTRACT. We consider a general online stochastic optimization problem with multiple budget constraints over a horizon of finite time periods. In each time period, a reward function and multiple cost functions are revealed, and the decision-maker needs to specify an action to collect the reward and consume the budget. In each period, the reward and cost functions are drawn from an unknown distribution, which is non-stationary across time. The objective of the decision-maker is to maximize the cumulative reward subject to budget constraints. We consider two settings: (i) a data-driven setting where the true distribution is unknown but an estimate is available; (ii) an uninformative setting where the true distribution is completely unknown. We propose a unified Wasserstein-distance based measure to quantify the inaccuracy of the estimates in setting (i) and the non-stationarity of the system in setting (ii). For setting (i), we propose a new primal-dual algorithm, which integrates the prior information of the underlying distributions into an online gradient descent procedure. The algorithm also naturally extends to the uninformative setting (ii). Under both settings, we show the corresponding algorithm achieves a regret of optimal order.

15:44
Optimal Robust Classification Trees
PRESENTER: Nathan Justin

ABSTRACT. In many high-stakes domains, the data used to drive machine learning algorithms is noisy (due to, e.g., the sensitive nature of the data being collected, limited resources available to validate the data, etc.). This may cause a distribution shift, where the distribution of the training data does not match the distribution of the testing data. In the presence of distribution shifts, any trained model can perform poorly in the testing phase. In this paper, motivated by the need for interpretability and robustness, we propose a mixed-integer optimization formulation and a tailored solution algorithm for learning optimal classification trees that are robust to adversarial perturbations in the data features. We evaluate the performance of our approach on numerous publicly available datasets and compare it to a regularized, non-robust optimal tree. We show an increase of up to 14.16 percent in worst-case accuracy and an increase of up to 4.72 percent in average-case accuracy across several data sets and distribution shifts from using our robust solution in comparison to the non-robust solution.

15:00-16:30 Session SC6: Advances in Linear and Conic Optimization
Location: Palmetto Room
15:00
Convex Hull Results on Quadratic Programs with Non-Intersecting Constraints
PRESENTER: Alexander Joyce

ABSTRACT. Let F be a nonempty closed set. Understanding the structure of the closed convex hull C(F) in the lifted space is crucial for solving quadratically constrained quadratic programs related to F. In this talk, we discuss the relationship between C(F) and C(G), where G results from adding non-intersecting quadratic constraints to F. We prove that C(G) can be represented as the intersection of C(F) and half-spaces defined by the added constraints. The proof relies on a complete description of the asymptotic cones of sets defined by a single quadratic equality and a partial characterization of the recession cone of C(F). Our proof generalizes an existing result for bounded F with non-intersecting quadratic constraints, as well as several results on C(G) for G defined by non-intersecting constraints.

15:22
Micro-grid Planning With Aggregator's Role In The Renewable Inclusive Prosumer Market
PRESENTER: Saurav Sharma

ABSTRACT. The utility market is continuously evolving, and the electrical power landscape is transforming at an unprecedented rate. Traditional grid management is characterized by one-directional power flows from producers to consumers, whereas the current landscape is moving toward greater decentralization and multidirectional, more volatile flows, with production units becoming decentralized down to the grid level. Microgrid architectures play a significant role in meeting these changes, and prosumers are gaining more freedom and rights in how they use energy. This paper discusses a model for a microgrid architecture that incorporates the role of aggregators and renewable sources on the prosumer side, working together to optimize configurations and operations. The final model takes the form of a mixed-integer linear program, which is solved with the CPLEX solver via GAMS on a consistent data set.

15:44
Faster First-Order Primal-Dual Methods for Linear Programming using Restarts and Sharpness
PRESENTER: Oliver Hinder

ABSTRACT. First-order primal-dual methods are often slow at finding high accuracy solutions, which creates a barrier to their use for linear programming (LP). Our work exploits the sharpness of primal-dual formulations of LP instances to achieve linear convergence using restarts in a general setting that applies to ADMM (alternating direction method of multipliers), PDHG (primal-dual hybrid gradient method) and EGM (extragradient method).

In the special case of PDHG, without restarts we show a lower bound of $\Omega(\kappa^2 \log(1/\epsilon))$, while with restarts we show an upper bound of $O(\kappa \log(1/\epsilon))$, where $\kappa$ is a condition number and $\epsilon$ is the desired accuracy. Moreover, the upper bound is optimal for a wide class of primal-dual methods, and applies to the strictly more general class of sharp primal-dual problems. We develop an adaptive restart scheme and verify that restarts significantly improve the ability of PDHG to find high accuracy solutions to LP problems.
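As a minimal illustration (our own sketch, with a fixed-frequency restart from the running average standing in for the paper's adaptive restart criterion), restarted PDHG for an LP in standard form min c^T x subject to Ax = b, x >= 0 can be written as follows:

import numpy as np

def restarted_pdhg_lp(c, A, b, iters=5000, restart_every=200):
    # Sketch of PDHG with periodic restarts from the running average,
    # for min c^T x subject to A x = b, x >= 0 (saddle point in (x, y)).
    m, n = A.shape
    step = 0.9 / np.linalg.norm(A, 2)          # tau = sigma, with tau*sigma*||A||^2 < 1
    x, y = np.zeros(n), np.zeros(m)
    x_sum, y_sum, count = np.zeros(n), np.zeros(m), 0
    for _ in range(iters):
        x_new = np.maximum(0.0, x - step * (c - A.T @ y))   # projected primal step
        y = y + step * (b - A @ (2.0 * x_new - x))          # dual step with extrapolation
        x = x_new
        x_sum += x; y_sum += y; count += 1
        if count == restart_every:                          # restart from the average
            x, y = x_sum / count, y_sum / count
            x_sum, y_sum, count = np.zeros(n), np.zeros(m), 0
    return x, y

Roughly speaking, restarting from averaged iterates is the mechanism through which sharpness of the primal-dual formulation is exploited to recover linear convergence; the adaptive scheme in the paper chooses the restart times automatically.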

16:06
Copositive Duality for Discrete Markets and Games
PRESENTER: Cheng Guo

ABSTRACT. Optimization problems with discrete decisions are nonconvex and thus lack strong duality, which limits the usefulness of tools such as shadow prices and the KKT conditions. It was shown in Burer (2009) that mixed-binary quadratic programs can be written as completely positive programs, which are convex. Completely positive reformulations of discrete optimization problems therefore have strong duality if a constraint qualification is satisfied. We apply this perspective in two ways. First, we write unit commitment in power systems as a completely positive program, and use the dual copositive program to design a new pricing mechanism. Second, we reformulate integer programming games in terms of completely positive programming, and use the KKT conditions to solve for pure-strategy Nash equilibria. To facilitate implementation, we also design a cutting plane algorithm for solving copositive programs exactly.
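In the spirit of Burer (2009), and sketched here in our notation (omitting the constraint qualification details), a mixed-binary quadratic program

\[
\min_{x}\;x^{\top}Qx+c^{\top}x
\quad\text{s.t.}\quad
a_i^{\top}x=b_i\ (i=1,\dots,m),\ \ x\ge 0,\ \ x_j\in\{0,1\}\ (j\in B)
\]

admits a completely positive reformulation of the form

\[
\min_{x,X}\;\langle Q,X\rangle+c^{\top}x
\quad\text{s.t.}\quad
a_i^{\top}x=b_i,\ \ a_i^{\top}Xa_i=b_i^{2},\ \ X_{jj}=x_j\ (j\in B),\ \
\begin{pmatrix}1 & x^{\top}\\ x & X\end{pmatrix}\in\mathcal{CP},
\]

where $\mathcal{CP}$ denotes the cone of completely positive matrices; the dual program over the copositive cone is what supplies the prices and KKT conditions used in the talk.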

16:45-18:15 Session SD1: Recent Advances in Stochastic Optimization under Distributional Ambiguity
16:45
Distributionally risk-receptive and risk-averse network interdiction problems with general ambiguity set
PRESENTER: Sumin Kang

ABSTRACT. We introduce generalizations of the stochastic network interdiction problem with distributional ambiguity. In particular, we consider a distributionally risk-averse (or robust) network interdiction problem (DRA-NIP) and a distributionally risk-receptive network interdiction problem (DRR-NIP), where a leader maximizes a follower's minimal expected objective value for either the worst-case or the best-case, respectively, probability distribution belonging to a given set of distributions (referred to as the ambiguity set). The DRA-NIP arises in applications where a risk-averse leader is the main protagonist who interdicts a follower (opponent or evader) to cause delays in their supply convoy. In contrast, the DRR-NIP is applicable for network vulnerability analysis, where a network user (or follower) also seeks to identify vulnerabilities in the network against potential disruptions by an adversary (or leader) who is receptive to risk for improving the expected objective values. We present exact and approximation algorithms for solving DRA-NIP and DRR-NIP with a general ambiguity set. We also provide conditions and families of ambiguity sets for which these approaches are finitely convergent. To evaluate the effectiveness and efficiency of the approaches for solving DRA-NIP and DRR-NIP, we provide results of our extensive computational experiments performed on instances known in the literature for (risk-neutral) stochastic NIP.

17:07
Distributionally Robust Two-Stage Convex Quadratic Programming
PRESENTER: Nazlican Arslan

ABSTRACT. Distributionally robust optimization is a framework to model stochastic optimization problems where the distribution of the random parameters is uncertain. The uncertain distribution is assumed to belong to a set of plausible distributions, called an ambiguity set. The goal is to compute decisions that hedge against the worst possible distribution in a given ambiguity set. We study a two-stage stochastic program with convex quadratic recourse under distributional ambiguity with uncertainty only in the right-hand side vector. We consider data-driven distributional ambiguity sets based on the Wasserstein distance and based on an optimal quadratic transport distance. For the latter, we derive a cutting-plane algorithm, and show computational results.

17:29
Effective scenarios in Two-stage DRO: properties and acceleration of decomposition algorithms
PRESENTER: Chennan Zhou

ABSTRACT. Effective scenarios are critical to Distributionally Robust Optimization (DRO) in the sense that, if removed, the optimal value will be changed. Earlier, effective scenarios were examined for convex DRO with Total Variation distance. In this work, we study them for a class of DRO with Wasserstein distance and phi-divergences. We present several properties and discuss how they can be used to accelerate Benders’ decomposition algorithms to solve this class of problems.

16:45-18:15 Session SD2: Advances in Optimization and Learning from Data
Location: Gold Ballroom
16:45
Two-stage stochastic programs with nonconvex recourse

ABSTRACT. We study decomposition methods for solving a class of nonconvex and nonsmooth two-stage stochastic programs, where both the objective and constraints of the second-stage problem are nonlinearly parameterized by the first-stage variable. Due to the failure of the Clarke-regularity of the resulting nonconvex recourse function, classical decomposition approaches such as Benders decomposition and (augmented) Lagrangian-based algorithms cannot be directly generalized to solve such models. By exploring an implicitly convex-concave structure of the recourse function, we introduce a novel surrogate decomposition framework based on the so-called partial Moreau envelope. Convergence is established both for fixed scenarios and for an interior sampling strategy. Numerical experiments are conducted to demonstrate the effectiveness of the proposed algorithm.

17:07
Algorithms for DC Programs based on Difference-of-Moreau-Envelopes Smoothing
PRESENTER: Kaizhao Sun

ABSTRACT. We consider minimization of a difference-of-convex (DC) function with and without linear constraints. We first study a smooth approximation of a DC function termed the difference of Moreau envelopes (DME), which captures geometric properties of the original function and enjoys certain growth conditions. We then show that the gradient descent method and an inexact variant converge on the DME and deliver a stationary solution of the original DC function. Furthermore, when the minimization is constrained to an affine subspace, we propose two variants of the augmented Lagrangian method based on the DME, which, in contrast to the literature, allow a nonsmooth concave component in the objective.
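For reference, the Moreau envelope of a convex function $\phi$ and the resulting smoothing of a DC function $f=g-h$ take (roughly, in our notation) the form

\[
e_{\lambda}\phi(x):=\min_{u}\Big\{\phi(u)+\tfrac{1}{2\lambda}\|x-u\|^{2}\Big\},
\qquad
f_{\lambda,\mu}(x):=e_{\lambda}g(x)-e_{\mu}h(x),
\]

which is differentiable whenever $g$ and $h$ are convex, with gradients expressed through the corresponding proximal points.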

17:29
Generalized ADMM Framework for Non-overlapping Group Selection
PRESENTER: Chengyu Ke

ABSTRACT. The paper introduces an ADMM-based algorithm to solve a statistical learning optimization problem with a pre-determined group structure. To promote sparsity across groups, we propose a generalized formulation suitable for convex loss functions, such as the mean squared error, logistic loss, and Poisson loss, and folded concave regularization terms, including the L1 penalty, SCAD, MCP, the transformed L1 penalty, and the L1/2 penalty. For d(irectional)-stationary solutions to the problem, we establish global optimality conditions and bound the distance between the solution and the theoretical ground truth under a restricted strong convexity assumption. We show that the algorithm converges to a d-stationary point of the original problem. Finally, we conduct in-depth numerical experiments to test our algorithm under several scenarios with different penalty functions, providing suggestions for choosing penalty functions in specific tasks.
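As a point of reference only (a generic sketch of an ADMM splitting for group-sparse least squares with the convex group penalty, not the authors' generalized algorithm), one such iteration looks as follows:

import numpy as np

def group_sparse_admm(X, y, groups, lam, rho=1.0, iters=200):
    # Generic ADMM for min 0.5*||X b - y||^2 + lam * sum_g ||b_g||_2, splitting b = z;
    # a folded concave penalty would change only the z-update (the proximal step).
    n, p = X.shape
    beta, z, u = np.zeros(p), np.zeros(p), np.zeros(p)
    lhs = X.T @ X + rho * np.eye(p)
    Xty = X.T @ y
    for _ in range(iters):
        beta = np.linalg.solve(lhs, Xty + rho * (z - u))   # smooth-loss update
        v = beta + u
        for g in groups:                                   # blockwise soft-thresholding
            nrm = np.linalg.norm(v[g])
            z[g] = max(0.0, 1.0 - (lam / rho) / nrm) * v[g] if nrm > 0 else 0.0
        u = u + beta - z                                   # dual update
    return z

Here groups is a list of index arrays defining the non-overlapping groups; swapping the proximal map in the z-update is where SCAD- or MCP-type penalties would enter.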

16:45-18:15 Session SD3: Frontiers in Stochastic and Nonlinear Optimization
Location: Card Room
16:45
Asymptotics of Stochastic Gradient Descent for Nonconvex Objectives with General Noise Models

ABSTRACT. In this talk, we will review recent advances in the asymptotic convergence theory of stochastic gradient descent (SGD) for nonconvex objective functions with general noise models.

17:07
On Arbitrary Compression for Decentralized Consensus and Stochastic Optimization over Directed Networks
PRESENTER: M. Taha Toghani

ABSTRACT. We study the decentralized consensus and stochastic optimization problems under compressed message sharing over a fixed directed graph. Our goal is to minimize the sum of n functions, each accessible to a single node, where local communications are subject to a directed network. In this work, we propose an iterative push-sum algorithm with compressed message sharing to reduce the communication overhead on the network. Contrary to existing literature, we allow for arbitrary compression ratios in the communicated messages. We provide explicit convergence rates for the stochastic optimization problem on smooth functions that are either (i) strongly convex, (ii) convex, or (iii) non-convex. Finally, we provide numerical experiments to illustrate the arbitrary compression and communication efficiency of our algorithm.

17:29
Fast and Exact Matrix Factorization Updates for Nonlinear Programming

ABSTRACT. LU and Cholesky factorization algorithms are core subroutines used to solve systems of linear equations (SLEs) encountered while solving an optimization problem. Standard factorization algorithms are highly efficient but remain susceptible to the accumulation of roundoff errors, which can lead solvers to return feasibility and optimality certificates that are actually invalid. This talk introduces a novel approach for solving sequences of closely related SLEs encountered in nonlinear programming efficiently and without roundoff errors. Specifically, it introduces rank-one update algorithms for the roundoff-error-free (REF) factorization framework, a toolset built on integer-preserving arithmetic that has led to the development and implementation of fail-proof SLE solution subroutines for linear programming. The guarantees of the proposed algorithms are formally established through the derivation of theoretical insights. Their computational advantages are supported by experiments, which demonstrate improvements of more than 75x over exact factorization run-times on fully dense matrices with over one million entries. A significant advantage of the proposed methodology is that the length of any coefficient calculated via the associated algorithms is bounded polynomially in the size of the inputs, without having to resort to greatest common divisor operations, which are required by, and thereby hinder an efficient implementation of, exact rational arithmetic approaches.

16:45-18:15 Session SD4: Nonconvexity in DC & AC Power Flow Models
16:45
A Linear Outer Approximation of Line Losses for DC-based Optimal Power Flow Problems

ABSTRACT. This paper proposes a novel and simple linear model to capture line losses for use in linearized DC models, such as optimal power flow (DC-OPF) and security-constrained economic dispatch (SCED). The Line Loss Outer Approximation (LLOA) model implements an outer approximation of the line losses lazily and typically terminates in a small number of iterations. Experiments on large-scale power systems demonstrate the accuracy and computational efficiency of LLOA and contrast it with classical line loss approaches. The results seem to indicate that LLOA is a practical and useful model for real-world applications, providing a good tradeoff between accuracy, computational efficiency, and implementation simplicity. In particular, the LLOA method may have significant advantages compared to the traditional loss factor formulation for multi-period, stochastic optimization problems where good reference points may not be available. The paper also provides a comprehensive overview and evaluation of approximation methods for line losses.
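While the precise LLOA cuts are given in the paper, the basic idea of lazily outer-approximating a convex quadratic loss term can be sketched as follows, in our notation: modeling the loss on a line with resistance $r$ and flow $f$ as $\ell\approx r f^{2}$, each trial point $\hat f$ generates the tangent cut

\[
\ell\;\ge\;r\hat f^{\,2}+2r\hat f\,(f-\hat f),
\]

which underestimates the quadratic everywhere and is added only when violated at the current solution, so the procedure typically terminates after a small number of rounds.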

17:07
Joint Chance-Constrained Predictive Generation and Line Maintenance Planning with Operations Scheduling under Sensor-Driven Failure Uncertainty

ABSTRACT. This paper studies a short-term condition-based integrated maintenance planning with operations scheduling problem while considering the unexpected failure possibilities of generators as well as transmission lines. We formulate this problem as a two-stage stochastic mixed-integer program with failure scenarios sampled from the sensor-driven remaining lifetime distributions of the system elements whereas a joint chance-constraint consisting of Poisson Binomial random variables is introduced to account for failure risks. We develop a cutting-plane method to obtain an exact reformulation of the joint chance-constraint by proposing a separation subroutine and deriving stronger cuts as part of this procedure. Alternatively, we derive a second-order cone programming based safe approximation of this constraint. Furthermore, we propose a decomposition-based algorithm for solving the resulting stochastic program, which exploits the features of the integer L-shaped method and the special structure of the maintenance and operations scheduling problem to derive stronger optimality cuts. We further present preprocessing steps over transmission line flow constraints to identify redundancies. Our computational results on a weekly plan with daily maintenance and hourly operational decisions involving detailed unit commitment subproblems on various IEEE instances demonstrate the significant computational efficiency of the proposed approach with reliable and cost effective maintenance and operational schedules.

17:29
Tightening Quadratic Convex Relaxations for the AC Optimal Transmission Switching Problem
PRESENTER: Cheng Guo

ABSTRACT. The Alternating Current Optimal Transmission Switching (ACOTS) problem incorporates line switching decisions into the fundamental AC optimal power flow (ACOPF) problem. The advantages of the ACOTS problem are well-known in terms of reducing the operational cost and improving system reliability. ACOTS optimization models contain discrete variables and nonlinear structures, which make it difficult to solve. We derive strengthened quadratic convex (QC) relaxations for ACOTS by combining several methodologies recently developed in the ACOPF literature. First, we relax the ACOTS model with the QC relaxation, which has been empirically observed to be both tight and computationally efficient in approximating the ACOPF problem. Further, we tighten the QC relaxation by using strong linearization with extreme-point representation, and by adding several types of valid inequalities, with some of those inequalities being novel. In particular, we derive a novel kind of on/off cycle-based polynomial constraints, by taking advantage of the network structure. Those constraints are linearized with convex-hull representations. Our extensive numerical experiments on PGLib instances show that, compared with the standard method, our strengthening techniques are able to improve the bounds for ACOTS relaxations on a subset of the PGLib instances, with some of those improvements substantial.

17:51
Improving Microgrid Energy Management with Storage via Stochastic Optimization and Machine Learning

ABSTRACT. Storage is widely recognized as a critical element for increasing the adoption of variable and intermittent energy sources. Managing stored energy adds a new layer of complexity to energy management but can yield important gains in efficiency. This work introduces two multi-stage stochastic programming models for energy management on a microgrid, with battery storage and energy generation via solar panels, under uncertain demand and solar radiation. One model targets residential use with purchase prices that vary throughout the day, whereas the other targets an industrial setting with a fixed price but an annual charge on peak usage. Using historical demand and climatological data, we trained several recurrent neural network models, in particular Long Short-Term Memory (LSTM) networks with MC dropout, to generate demand and generation predictions in a stage-wise independent scenario setting. We use Stochastic Dual Dynamic Programming (SDDP) to compute solutions to the energy management problem using these predictions. The resulting policies provide optimal energy flows between the network, battery, panels, and demand. Numerical experiments on Chilean data showed that these policies can improve profit by up to 15% over baseline policies.

16:45-18:15 Session SD5: Two-Stage and Multi-Stage Stochastic Programming
Location: Furman Room
16:45
On the Value of Multistage Stochastic Facility Location with Risk Aversion
PRESENTER: Xian Yu

ABSTRACT. We consider a multi-period capacitated facility location problem over a finite time horizon under uncertain demand in each period. We formulate a multistage stochastic integer program using a scenario tree representation of the uncertainty and compare it with a two-stage approach that decides which facilities to open for all periods upfront. In the multistage model, in each stage, after realizing the demand, a decision-maker optimizes facility locations and recourse flows from open facilities to demand sites, with the goal of minimizing certain measures of the future random cost associated with facility location and demand satisfaction. Using expected conditional risk measures, we provide tight lower bounds for the gaps between optimal objective values of risk-averse multistage models and their two-stage counterparts. Moreover, two approximation algorithms are proposed to efficiently solve risk-averse two-stage and multistage stochastic programs, whose approximation ratios are independent of the underlying stochastic process, structure of the scenario tree, number of stages, and risk-preference-related parameters. We conduct numerical studies using randomly generated and real-world network instances with diverse sizes to demonstrate the tightness of the analytical bounds and computational efficacy of the proposed approximation algorithms. We also provide managerial insights based on different parameter and uncertainty settings.

17:07
An Active-set Method For Two-Stage Stochastic Quadratic Programming
PRESENTER: Niloofar Fadavi

ABSTRACT. In this talk, we examine two-stage stochastic quadratic programming problems, where the objective function of the first and second stages are quadratic functions and the constraints are linear. The uncertainty is associated with the second-stage right-hand sides. One of the main challenges we face is solving the second-stage problem, when the random variable associated with the right-hand side has a continuous support or a large number of scenarios in the discrete case. To deal with this issue, we develop the dual active-set strategy to obtain an approximate solution for the second-stage problem. This approximation is used to design a computationally efficient solution algorithm to solve stochastic quadratic programs. We will present the convergence and numerical analysis of the algorithm.

17:29
On Finite Adaptability in Two-stage Distributionally Robust Optimization
PRESENTER: Eojin Han

ABSTRACT. In many applications, practitioners prefer policies that are interpretable and easy to implement. This tendency is magnified in sequential decision making settings. In this paper, we leverage the concept of finite adaptability to construct policies for two-stage optimization problems. More specifically, we focus on the general setting of distributional uncertainties affecting the right-hand sides of constraints, since in a broad range of applications uncertainties do not affect the objective function and recourse matrix. The aim is to construct policies that have provable performance bounds. This is done by partitioning the uncertainty realization and assigning a contingent decision to each piece. We first show that the optimal partitioning can be characterized by translated orthants, which only require the problem structure and, hence, are free of modeling assumptions. We then prove that finding the optimal partitioning is hard and propose a specific partitioning scheme with orthants, allowing the efficient computation of orthant-based policies via solving a mixed-integer optimization problem of a moderate size. By leveraging the geometry of this partitioning, we provide performance bounds of the orthant-based policies, which also generalize the existing bounds in the literature. We also assess suboptimality in more general settings and provide techniques to obtain lower bounds.

17:51
LP-based Approximations for Disjoint Bilinear and Two-Stage Adjustable Robust Optimization
PRESENTER: Ayoub Foussoul

ABSTRACT. We consider the class of disjoint bilinear programs $\max\{x^\top y \mid x \in X, y \in Y\}$ where X and Y are packing polytopes. We present an $O(\frac{\log \log m_1}{\log m_1} \frac{\log \log m_2}{\log m_2})$-approximation algorithm for this problem, where $m_1$ and $m_2$ are the number of packing constraints in X and Y respectively. In particular, we show that there exists a near-optimal solution $(x,y)$ such that x and y are "near-integral". We give an LP relaxation of this problem from which we obtain the near-optimal near-integral solution via randomized rounding. As an application of our techniques, we present a tight approximation for the two-stage adjustable robust problem with covering constraints and right-hand side uncertainty, where the separation problem is a bilinear optimization problem. In particular, based on the ideas above, we give an LP restriction of the two-stage problem that is an $O(\frac{\log n}{\log \log n} \frac{\log L}{\log \log L})$-approximation, where $L$ is the number of constraints in the uncertainty set. This significantly improves over state-of-the-art approximation bounds known for this problem.

16:45-18:15 Session SD6: Mixed Integer Nonlinear Programming
Location: Palmetto Room
16:45
On computing with some convex relaxations for the maximum-entropy sampling problem

ABSTRACT. Based on a factorization of an input covariance matrix, we define a mild generalization of a factorization bound of Nikolov (2015) and Li and Xie (2020) for the NP-Hard maximum-entropy sampling problem (MESP). We demonstrate that the bound is invariant under scaling and also independent of the particular factorization chosen. We give a variable-fixing methodology that can be used in a branch-and-bound scheme based on the factorization bound for exact solution of MESP. We report on successful experiments with the commercial software Knitro. We demonstrate that the known “mixing technique” can be used to combine the factorization bound with the factorization bound of the complementary MESP, and also with Anstreicher's "linx bound", for improved bounds. This is joint work with M. Fampa and J. Lee.

17:07
Tridiagonal Maximum-Entropy Sampling and Tridiagonal Masks

ABSTRACT. The Maximum-Entropy Sampling problem (MESP) is an NP-Hard problem that aims to find a sample $S \subset N$ maximizing the log-determinant of the principal submatrix of a given positive semidefinite matrix $C$ indexed by $S$. We provide a dynamic-programming algorithm that solves the Maximum-Entropy Sampling problem when $C$ is tridiagonal, and we extend this algorithm to the case where the support graph of $C$ is a spider graph. A mask $M$ is a correlation matrix that is used to impose a tridiagonal structure on $C$ via the Hadamard product $C \circ M$, which yields an upper bound for the MESP. Taking advantage of the fact that we can solve the tridiagonal case quickly, we analyze and develop a local search algorithm on the mask $M$ to minimize the upper bound for the input $C$.

17:29
Gaining or Losing Perspective

ABSTRACT. We consider mixed-integer nonlinear formulations of a situation in which a binary variable, z, determines whether a second variable, x, is 0 or lies within a nonnegative operating range, [a,b]. A convex cost, f(x), is captured by a third variable y via the inequality f(x) <= y. We use volume as a metric to compare various continuous relaxations of this disjunction to its convex hull, which is achieved via the "perspective reformulation".
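For concreteness, the perspective reformulation of this disjunction takes the following standard form (written in our notation):

\[
y \;\ge\; z\,f(x/z),\qquad a\,z \le x \le b\,z,\qquad 0 \le z \le 1,
\]

with the convention that $z\,f(x/z)=0$ at $(x,z)=(0,0)$; for binary z this recovers the original disjunction, and its continuous relaxation gives the convex hull against which the weaker relaxations are compared.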

17:51
Gaining or Losing Perspective for Convex Multivariate Functions

ABSTRACT. MINLO (mixed-integer nonlinear optimization) formulations of the disjunction between the origin and a polytope via a binary indicator variable are broadly used in nonlinear combinatorial optimization, such as in portfolio optimization. The perspective relaxation, which is the convex hull, is often used to solve such models to optimality in a branch-and-bound context. But such a relaxation typically requires conic solvers like MOSEK, and may be incompatible with general-purpose NLP software which can accommodate other classes of constraints. We study the various weaker relaxations to investigate when a simpler non-conic-programming relaxation may be adequate. Comparing the volume of the relaxations, we lift some of the results related to univariate functions to the multivariate case.