IOS2020: INFORMS OPTIMIZATION SOCIETY CONFERENCE 2020
PROGRAM FOR SUNDAY, MARCH 15TH

08:30-10:00 Session SA1: Algorithmic Advances in Stochastic Programming I
08:30
Advances in Scalable Open Source Software for Parallel Stochastic Programming
PRESENTER: Jean-Paul Watson

ABSTRACT. We discuss advances in parallel implementations of scenario-based decomposition strategies for stochastic programming, including Progressive Hedging and Asynchronous Projective Hedging. The talk will focus on strategies for the elimination of key performance bottlenecks in the PySP open source library for stochastic programming, and features and architecture of a soon-to-be-released successor library. Advanced features, including FW-PH lower bounding and cross-scenario cut strategies, will be detailed.

08:52
Biased Stochastic Approximation for Conditional Stochastic Optimization
PRESENTER: Yifan Hu

ABSTRACT. In this paper, we propose a biased stochastic gradient descent (BSGD) algorithm for Conditional Stochastic Optimization (CSO). CSO finds many applications in areas like robust supervised learning, meta learning, and causal inference. More specifically, we demonstrate that Model-Agnostic Meta-Learning (MAML) and some other problems can be formulated as a CSO problem and further solved by the proposed algorithm. BSGD, to the best of our knowledge, is the first stochastic approximation algorithm for solving CSO. We analyze the sample complexity required by BSGD to converge for convex and non-convex objective functions. For convex CSO problems, to find an $\epsilon$-optimal solution, we show that the sample complexity of BSGD improves from $\mathcal{O}(\epsilon^{-4})$ to $\mathcal{O}(\epsilon^{-2})$ when assuming smoothness and strong convexity. For non-convex CSO, BSGD achieves $\mathcal{O}(\epsilon^{-8})$ sample complexity to attain stationarity, which can be reduced to $\mathcal{O}(\epsilon^{-6})$ when assuming smoothness. We also provide a lower-bound analysis of first-order oracle stochastic approximation for general convex CSO problems, which matches the upper bound results mentioned above. Extensive numerical experiments further support our theoretical results and illustrate the performance of BSGD.
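
To make the bias mechanism concrete, here is a minimal numpy sketch of a BSGD-style iteration on a toy CSO instance (the instance, step size, and sample counts are illustrative choices, not taken from the paper): the inner conditional expectation is replaced by an average of m samples, so the chain-rule gradient is biased, with bias shrinking as m grows.

    import numpy as np

    rng = np.random.default_rng(0)

    def bsgd(grad_f, g, grad_g, sample_xi, sample_eta, x0,
             n_iters=2000, inner_m=16, step=0.05):
        """Biased SGD sketch for min_x E_xi[ f( E_{eta|xi}[ g_eta(x, xi) ] ) ]."""
        x = x0.astype(float)
        for _ in range(n_iters):
            xi = sample_xi()
            etas = [sample_eta(xi) for _ in range(inner_m)]
            g_bar = np.mean([g(x, xi, e) for e in etas], axis=0)       # inner average
            J_bar = np.mean([grad_g(x, xi, e) for e in etas], axis=0)  # avg Jacobian
            x -= step * J_bar.T @ grad_f(g_bar)                        # biased gradient
        return x

    # Toy instance: f(u) = ||u||^2, g_eta(x, xi) = x + xi + eta; minimizer x = -E[xi].
    d = 3
    x_hat = bsgd(
        grad_f=lambda u: 2 * u,
        g=lambda x, xi, e: x + xi + e,
        grad_g=lambda x, xi, e: np.eye(d),
        sample_xi=lambda: rng.normal(0.5, 1.0, d),
        sample_eta=lambda xi: rng.normal(0.0, 1.0, d),
        x0=np.zeros(d),
    )
    print(x_hat)  # roughly [-0.5, -0.5, -0.5]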

09:14
Benders Cut Classification via Support Vector Machines for Solving Two-stage Stochastic Programs
PRESENTER: Siqian Shen

ABSTRACT. We consider Benders decomposition for solving two-stage stochastic programs with complete recourse and finite samples of uncertain parameters. We define as valuable cuts those Benders cuts that are binding at the final optimal solution or that significantly improve bounds over iterations. We propose a learning-enhanced Benders decomposition (LearnBD) algorithm, which adds a cut classification step in each iteration to learn cuts that are more likely to be valuable. The LearnBD algorithm includes three phases: (i) cut sampling, (ii) cut classifier construction using support vector machines (SVM), and (iii) cut classification. We run the LearnBD algorithm on instances of capacitated facility location problems under uncertain demand. Our results show that the SVM cut classifier works effectively for identifying valuable cuts and that the LearnBD algorithm improves the overall computational efficiency on randomly generated instances with various sizes and complexities.
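
The three phases map naturally onto a standard classification pipeline. A minimal sklearn sketch of the classifier component (the cut features, labels, and dimensions below are hypothetical placeholders; the actual features and labeling rule are those defined in the talk):

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(1)

    # Phase (i): cut sampling -- feature vectors of sampled Benders cuts
    # (e.g., cut coefficients, violation at the current master solution),
    # labeled 1 if "valuable" (binding at optimum / bound-improving).
    X_sampled = rng.normal(size=(200, 5))
    y_valuable = (X_sampled @ np.array([1.0, -0.5, 0.0, 2.0, 0.3]) > 0).astype(int)

    # Phase (ii): cut classifier construction via SVM.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X_sampled, y_valuable)

    # Phase (iii): cut classification -- keep only cuts predicted valuable
    # before adding them to the Benders master problem.
    X_candidates = rng.normal(size=(50, 5))
    keep = clf.predict(X_candidates) == 1
    print(f"adding {keep.sum()} of {len(keep)} candidate cuts")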

08:30-10:00 Session SA2: Clinical Decision-Making Under Uncertainty
Location: Gold Ballroom
08:30
Dynamic Tuberculosis Screening for Healthcare Employees
PRESENTER: Mahsa Kiani

ABSTRACT. Regular tuberculosis (TB) screening is required for healthcare employees since they may be in contact with infected patients. TB is a potentially serious contagious disease; thus, early detection, even when the disease is latent, prevents its spread and aids treatment. There are two types of TB diagnostic tests: a skin test and a blood test. The blood test costs much more than the skin test, but the skin test is more likely to return a false positive or false negative result, especially for persons with certain characteristics, which may cause extra costs. In this study, we categorize employees into multiple groups based on where they work, the job they do, and their birth country. We create a Markov decision process (MDP) model to decide which TB test each employee group should take to minimize the total cost and infection rate. The model also specifies the optimal time between consecutive tests. Due to the curse of dimensionality, we use approximate dynamic programming (ADP) to approximate the solution. Based on the results obtained by solving the MDP, we propose a benchmark policy for the healthcare facility that is easy to implement.

08:52
Equity in Genetic Newborn Screening
PRESENTER: Hussein El Hajj

ABSTRACT. State-level newborn screening allows for early treatment of genetic disorders, which can substantially improve health outcomes for newborns. As the cost of genetic testing decreases, it is becoming an essential part of newborn screening. A genetic disorder can be caused by many mutation variants; therefore, an important decision is to determine which variants to search for (i.e., the panel design), under a testing budget. The frequency of variants that cause a disorder and the incidence of the disorder vary by racial/ethnic group. Consequently, it is important to consider equity issues in panel design, so as to reduce disparities among different groups. We study the panel design problem using cystic fibrosis (CF) as a model disorder, considering the trade-offs between equity and accuracy, under a limited budget. Most states use a genetic test in their CF screening protocol, but panel designs vary, and, due to cost, no state's panel includes all CF-causing variants. We develop models that design equitable genetic testing panels, and compare them with panels that maximize sensitivity in the general population. Our case study, based on realistic CF data, highlights the value of equitable panels and provides important insight for newborn screening practices.

09:14
Mean-Risk Two-Stage Vaccine Allocation Under Uncertainty
PRESENTER: Amira Hijazi

ABSTRACT. The CDC estimated that 25 million people in the U.S. became infected during the 2015-2016 flu season. While most of these cases were mild, there were approximately 12,000 flu-related deaths and 310,000 related hospitalizations. The economic burden of influenza on the U.S. economy is estimated to be $10.4 billion annually in direct medical costs, and $16.3 billion in indirect costs such as lost earnings. Vaccination is the most effective way of preventing influenza. In this study, we consider a risk-averse two-stage stochastic vaccine allocation model, where we specify the conditional value-at-risk (CVaR) as the risk measure. We apply the sample average approximation (SAA) method to distribute flu vaccines to the counties of the State of North Carolina over two phases.

08:30-10:00 Session SA3: Optimization for Data Analytics and Machine Learning
Location: Card Room
08:30
Complexity of stochastic dual dynamic programming

ABSTRACT. Stochastic dual dynamic programming is a cutting-plane-type algorithm for multi-stage stochastic optimization that originated about 30 years ago. In spite of its popularity in practice, no analysis of its convergence rate previously existed. In this paper, we first establish the number of iterations, i.e., the iteration complexity, required by a basic dynamic cutting plane method for solving relatively simple multi-stage optimization problems, by introducing novel mathematical tools including the saturation of search points. We then refine these basic tools and establish the iteration complexity of both deterministic and stochastic dual dynamic programming methods for solving more general multi-stage stochastic optimization problems under the standard stage-wise independence assumption. Our results indicate that the complexity of these methods grows only mildly with the number of stages T (in fact, linearly in T for discounted problems). Therefore, they are efficient for strategic decision making that involves a large number of stages but a relatively small number of decision variables in each stage. Without explicitly discretizing the state and action spaces, these methods might also be pertinent to the related reinforcement learning and stochastic control areas.

08:52
Learning to optimize with hidden constraints
PRESENTER: Rafid Mahmood

ABSTRACT. We consider a data-driven framework for learning to solve instances of continuous optimization problems where the feasible set varies with an instance-specific auxiliary input in a way that cannot be formally described. We use a data set of inputs and feasible solutions, as well as a feasibility oracle, to iteratively train two machine learning models. The first model is a binary classifier of feasibility, which then serves as a barrier function to train the second model via an interior point method. We develop a theory and optimality guarantees for interior point methods when given a barrier for a relaxation of the feasible set, and extend these results to obtain in-sample and out-of-sample guarantees for our classification and generative models. Finally, we implement our method on a radiation therapy treatment planning problem to predict personalized doses for patients with head-and-neck cancer. Our automated planning system outperforms current machine learning models that predict dose and, in addition, demonstrates an ability to transfer to constraints in new clinics where data may not be readily available.
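
The classifier-as-barrier idea can be illustrated in a few lines. A rough numpy/sklearn sketch on a toy problem of my construction (the feasible set, features, and step sizes are all illustrative assumptions; the talk's method and guarantees are more general): train a feasibility classifier, then minimize a linear objective plus a weighted $-\log \hat{p}(x)$ barrier while shrinking the weight.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(2)

    def feats(X):
        # Quadratic features so a linear classifier can learn a disk boundary.
        return np.column_stack([X, X**2])

    # Labeled points from a feasibility oracle for the unknown set
    # {x : ||x||_2 <= 1} (a stand-in for a set with no formal description).
    X = rng.uniform(-2, 2, size=(4000, 2))
    y = (np.linalg.norm(X, axis=1) <= 1.0).astype(int)

    clf = LogisticRegression(C=100.0, max_iter=1000).fit(feats(X), y)
    w, b = clf.coef_[0], clf.intercept_[0]

    def barrier_grad(x):
        # Gradient of phi(x) = -log p_hat(x): the classifier as a soft barrier.
        z = w[:2] @ x + w[2:] @ x**2 + b
        p = 1.0 / (1.0 + np.exp(-z))
        return -(1.0 - p) * (w[:2] + 2.0 * w[2:] * x)

    c = np.array([1.0, 1.0])            # objective: minimize c @ x
    x, mu = np.zeros(2), 0.5
    for _ in range(10):                 # interior-point-style outer loop
        for _ in range(300):            # inner descent on c @ x + mu * phi(x)
            x -= 0.01 * (c + mu * barrier_grad(x))
        mu *= 0.8
    print(x)  # near the true minimizer over the disk, roughly (-0.71, -0.71),
              # up to the accuracy of the learned boundary (untuned sketch)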

09:14
Approximation Algorithms for the Maximum Entropy Sampling Problem
PRESENTER: Yongchun Li

ABSTRACT. This paper studies the classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix of a given size out of a covariance matrix from an observable system. MESP has broad applications in many domains, including experimental design, sensor placement, data compression, information theory, etc. Albeit important, MESP has been proven to be NP-hard. By concavifying the objective function, this paper derives a novel concave integer programming formulation for MESP and shows that its continuous relaxation is close to optimality. This result motivates us to propose an efficient sampling algorithm and develop its approximation bound for MESP, which improves the best-known bound in the literature. We show that the proposed sampling algorithm has an efficient deterministic implementation with the same approximation bound. By analyzing the Lagrangian dual of the proposed concave integer program, we investigate the widely used local search algorithm and prove its first-known approximation bound for MESP. The proof techniques further inspire us to design an efficient implementation of the local search algorithm. Our numerical experiments demonstrate that the proposed algorithms can efficiently solve large-scale instances to near-optimality. Finally, we extend the analyses to the A-optimal MESP, where the objective is to find the smallest trace of the inverse of the selected principal submatrix.

08:30-10:00 Session SA4: Semidefinite Programming: Theory and Algorithms
08:30
On Sufficient and Necessary Conditions for Rank-One Generated Cones

ABSTRACT. A convex cone K that is a subset of the positive semidefinite (PSD) cone is rank-one generated (ROG) if all of its extreme rays are generated by rank-one matrices. The ROG property is closely related to characterizations of exactness of SDP relaxations, e.g., of nonconvex quadratic programs. We consider the case when K is obtained as the intersection of the PSD cone with (potentially infinitely many) linear (or conic) matrix inequalities, and identify sufficient conditions that guarantee that K is ROG. In the case of two linear matrix inequalities, we also establish the necessity of our sufficient condition.

08:52
Weak infeasibility in semidefinite programming: a complete characterization and generating all instances
PRESENTER: Aleksandr Touzov

ABSTRACT. In this work, we analyze weakly infeasible semidefinite programs (SDPs). These SDPs are infeasible, but an arbitrarily small perturbation can make them feasible.

Weakly infeasible SDPs appear in many guises, some classical and some more recent: 1) as asymptotes of the semidefinite cone; 2) as difficult SDPs, which are often mistaken for feasible ones by even the best solvers; and 3) as infeasible SDPs within zero distance to ill-posedness.

We first describe a very simple combinatorial characterization of weak infeasibility in SDPs. The characterization uses elementary operations (inherited from Gaussian elimination) and reformulates the original SDP into a form that makes the weak infeasibility trivial to see. We then introduce two simple combinatorial algorithms to generate any weakly infeasible SDP: with suitable starting data, any such SDP is obtained as an output of our algorithms. We conclude with a computational study.
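
For readers unfamiliar with the phenomenon, a standard two-by-two example (a textbook illustration, not necessarily one from the talk): the SDP "find $X \succeq 0$ subject to $X_{11} = 0$ and $X_{12} = 1$" is infeasible, since $X \succeq 0$ together with $X_{11} = 0$ forces $X_{12} = 0$. Yet it is only weakly so: the PSD matrices $X(t) = \begin{pmatrix} 1/t & 1 \\ 1 & t \end{pmatrix}$ satisfy $X_{12} = 1$ and violate $X_{11} = 0$ by only $1/t \to 0$ as $t \to \infty$, so an arbitrarily small perturbation of the data makes the problem feasible.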

09:14
A Decomposition Interior-Point Algorithm for Stochastic Linear Optimization in a Hilbert Space

ABSTRACT. In this talk, we present a decomposition-based interior-point algorithm for two-stage stochastic linear optimization problems with recourse in a Hilbert space. We give the fundamental properties of the logarithmic barrier associated with the recourse function of our problem setting. We then show that, by exploiting these properties in our problem setting, the iteration complexity results obtained for the short- and long-step algorithms coincide with the best-known estimates in the finite-dimensional case.

08:30-10:00 Session SA5: Optimization in Modern Data Science Applications
Location: Furman Room
08:30
Approximating L1-Norm Best-Fit Lines and Extensions
PRESENTER: Paul Brooks

ABSTRACT. Fitting lines to data can be the foundation for many data analysis tasks, including regression, ranking, recommendation, background subtraction in video analysis, text mining, and principal component analysis. Measuring error using the L1-norm rather than the traditional squared L2-norm error criterion can provide insensitivity to outlier observations. However, the L1-norm version of the optimization problem is NP-hard.

We provide sufficient conditions for a deterministic algorithm for finding a best-fit line. In doing so, we derive fundamental properties of the L1-norm projection onto a line and establish an equivalence between the algorithm and independently derived algorithms based on linear programming. The proposed algorithm is based on finding a series of sortings. The equivalence implies that it is a 2-factor approximation algorithm, which achieves the best-known factor among deterministic algorithms. Among known 2-factor approximation algorithms, the proposed algorithm has the smallest worst-case computational requirements.

We explain how the method can be extended to provide estimates of L1-norm best-fit subspaces of any dimension. We then propose a framework for distributed estimation of L1-norm best-fit lines and subspaces.
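
The sorting-based subproblem above can be made concrete: the L1-norm projection of a point onto a line reduces to a weighted median over coordinate ratios. A short numpy sketch of that basic subproblem (my illustration of the projection step only, not the full best-fit-line algorithm from the talk):

    import numpy as np

    def l1_project_onto_line(x, v):
        """L1-norm projection of x onto the line {t * v}: solves
        min_t ||x - t v||_1 = sum_j |v_j| * |x_j / v_j - t|,
        a weighted-median problem over the ratios x_j / v_j."""
        nz = v != 0                                      # zero coords are constant terms
        ratios, weights = x[nz] / v[nz], np.abs(v[nz])
        order = np.argsort(ratios)                       # the "sorting" step
        ratios, weights = ratios[order], weights[order]
        cum = np.cumsum(weights)
        t = ratios[np.searchsorted(cum, 0.5 * cum[-1])]  # weighted median
        return t * v

    x = np.array([3.0, -1.0, 2.0, 0.5])
    v = np.array([1.0, 1.0, 2.0, 0.0])
    print(l1_project_onto_line(x, v))  # -> [1. 1. 2. 0.]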

08:52
Engineering fast multilevel support vector machines

ABSTRACT. The computational complexity of solving nonlinear support vector machines (SVM) is prohibitive on large-scale data. In particular, this issue becomes very sensitive when the data exhibits additional difficulties such as highly imbalanced class sizes. Typically, nonlinear kernels produce significantly higher classification quality than linear kernels but introduce extra kernel and model parameters, which require computationally expensive parameter fitting; this increases quality but also reduces performance dramatically. We introduce a generalized fast multilevel framework for regular and weighted SVM and discuss several versions of its algorithmic components that lead to a good trade-off between quality and time. Our framework is implemented using PETSc, which allows easy integration with scientific computing tasks. The experimental results demonstrate significant speedup compared to the state-of-the-art nonlinear SVM libraries. Reproducibility: our source code, documentation, and parameters are available at https://github.com/esadr/mlsvm.

09:14
Partition Hypergraphs with Embeddings
PRESENTER: Justin Sybrandt

ABSTRACT. The problem of placing circuits on a chip or distributing sparse matrix operations can be modeled as the hypergraph partitioning problem. A hypergraph is a generalization of the traditional graph wherein each "hyperedge" may connect any number of nodes. Hypergraph partitioning, therefore, is the NP-hard problem of dividing nodes into k similarly sized disjoint sets while minimizing the number of hyperedges that span multiple partitions. Due to this problem's complexity, many partitioners leverage the multilevel heuristic of iteratively "coarsening" their input to a smaller approximation until an inefficient algorithm becomes feasible. The initial solution is then propagated back to the original hypergraph, which produces a reasonably accurate result provided the coarse representation preserves structural properties of the original. Multilevel hypergraph partitioners are today considered state-of-the-art solvers that achieve an excellent quality/running-time trade-off on practical large-scale instances of different types. In order to improve the quality of multilevel hypergraph partitioners, we propose leveraging graph embeddings to better capture structural properties during the coarsening process. Our approach prioritizes dense subspaces found in the embedding, and contracts nodes according to both traditional and embedding-based similarity measures. Reproducibility: All source code, plots, and experimental data are available at https://sybrandt.com/publications/partition_with_embeddings/.

Full text: https://arxiv.org/abs/1909.04016

08:30-10:00 Session SA6: Modeling and Optimization
Location: Palmetto Room
08:30
Optimizing Revenue while Showing Relevant Recommendations
PRESENTER: Deeksha Sinha

ABSTRACT. We revisit the problem of large-scale assortment optimization under the multinomial logit choice model without any assumptions on the structure of the feasible assortments. Scalable real-time assortment optimization has become essential in e-commerce operations due to the need for personalization and the availability of a large variety of items. While this can be done when there are simplistic assortment choices to be made, not imposing any constraints on the collection of feasible assortments gives more flexibility to incorporate insights of store managers and historically well-performing assortments. We design fast and flexible algorithms based on variations of binary search that find the revenue of the (approximately) optimal assortment. We speed up the comparison steps using novel vector space embeddings based on advances in the information retrieval literature. For an arbitrary collection of assortments, our algorithms can find a solution in time that is sub-linear in the number of assortments; for the simpler case of cardinality constraints, the running time is linear in the number of items (existing methods are quadratic or worse). Empirical validations using the Billion Prices dataset and several retail transaction datasets show that our algorithms are competitive even when the number of items is $\sim 10^5$ (100x larger instances than previously studied).
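
For intuition, the classical binary-search idea fits in a few lines for the unconstrained special case (a simplified sketch using the standard MNL fixed-point property; the talk's contribution concerns general assortment collections and embedding-based comparison oracles, which this does not capture):

    import numpy as np

    def mnl_optimal_revenue(r, v, v0, tol=1e-8):
        """Binary search for the optimal expected MNL revenue, unconstrained case.

        Uses the fixed-point property: the optimal revenue is the largest lam
        with max_S sum_{i in S} v_i * (r_i - lam) >= lam * v0; for unconstrained
        S, the inner max keeps exactly the items with r_i > lam."""
        lo, hi = 0.0, max(r)
        while hi - lo > tol:
            lam = 0.5 * (lo + hi)
            best = np.sum(np.maximum(v * (r - lam), 0.0))  # greedy inner max
            if best >= lam * v0:
                lo = lam       # some assortment achieves revenue >= lam
            else:
                hi = lam
        return lo

    r = np.array([10.0, 8.0, 6.0, 4.0])   # item revenues
    v = np.array([1.0, 2.0, 3.0, 4.0])    # MNL preference weights
    print(mnl_optimal_revenue(r, v, v0=1.0))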

08:52
Maximizing the total pleasure from a personalized day tour

ABSTRACT. This paper studies a personalized day tour planning problem to maximize the total pleasure gained from the tour. The problem focuses on finding the tourist spots to be visited, designing the routing sequence, and determining the travel time and the time to be spent in each spot. We propose to utilize the concept of utility theory to achieve this goal. Considering uniform and marginal pleasure intensities, we propose two zero-one mixed integer nonlinear programming models to obtain the corresponding tour plans. The uniform pleasure intensity assumes that the pleasure remains constant, whereas the marginal pleasure intensity assumes that the pleasure decreases after spending a certain duration in a tourist spot. The proposed models accommodate visits to specific tourist spots, preferred time slots for visiting some spots, and diversity of spots. We solve the models using a branch-and-bound algorithm. A case example demonstrates the usefulness of the proposed models. The tour plan obtained using the diminishing marginal pleasure intensity offers improved total pleasure.

09:14
Solving Multi-Objective Combinatorial Cell Formation Problem: A Two-Step Approach Using A Distance Function and A Genetic Algorithm

ABSTRACT. A primary concern in cellular manufacturing is solving the cell formation problem, where a similar part family needs to be processed in a machine group with the objective of minimizing cell load variation and exceptional elements. This research undertakes an approach for solving multi-objective combinatorial cell formation problems in a dynamic environment. It uniquely combines the distance function approach and a genetic algorithm in two stages. In the first stage, the designed genetic algorithm finds the ideal and negative-ideal solutions using the individual objectives as fitness functions. These solutions serve as the inputs for the second stage. In the second stage, the genetic algorithm finds the optimum machine-part cell formation so that both cell load variation and exceptional elements are minimized. The fitness function of this stage is constructed from the solutions obtained in the first stage. The proposed approach is numerically illustrated using an example problem and then implemented in a GUI-based software toolkit to evaluate its performance on a set of 30 benchmark test datasets; for the majority of the problems, it provides greater or equal group efficacy compared with the latest available approaches.

08:30-10:00 Session SA7: Optimization Approaches for Complex Power System Network Problems
Location: Alexander Room
08:30
Distributed Algorithms for Multi-Scale Capacity Expansion Problems in Power Systems
PRESENTER: Run Chen

ABSTRACT. With more variable-output renewable resources being integrated into power systems, the systems' reliable operation requires not only adequate generation capacity but also sufficient ramping capability. Such needs are not considered in traditional capacity expansion models, as they do not have the detailed temporal resolution to account for minute-level ramping and dispatch decisions over multiple decades. Capacity expansion models of such fine temporal scales, however, result in optimization problems of extreme sizes that may not be solved by off-the-shelf solvers directly. To address both the modeling and computational issues, in this talk we first present a capacity expansion model with hourly and sub-hourly economic dispatch decisions and ramping constraints; we then study the numerical performance of distributed optimization algorithms (including ADMM and PCPM) on the resulting extreme-scale models, focusing on the impacts of different decomposition schemes, parallelization, and speed-up schemes. We also compare the results between traditional expansion models and the multi-scale model to emphasize the differences in solutions and their implications for future capacity needs.

08:52
Integrated Learning and Optimization for AC-OPF

ABSTRACT. The Optimal Power Flow (OPF) problem is a fundamental building block for the optimization of electrical power systems. It is nonlinear and nonconvex and computes the generator setpoints for power and voltage, given a set of load demands. It is often solved repeatedly under various conditions, either in real time or in large-scale studies. This need is further exacerbated by the increasing stochasticity of power systems due to renewable energy sources in front of and behind the meter. To address these challenges, this paper presents a deep learning approach to the OPF. The learning model exploits a dual Lagrangian method to satisfy the physical and engineering constraints present in the OPF. The proposed model is evaluated on a large collection of realistic medium-sized power systems. The experimental results show that its predictions are highly accurate, with average errors as low as 0.2%.
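
The dual Lagrangian training idea can be sketched generically: penalize constraint violations in the training loss and update the multipliers by dual ascent. A minimal PyTorch sketch with synthetic data and a stand-in constraint (the network, data, constraint, and step sizes are placeholders of my choosing, not the paper's model):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Stand-in data: inputs = loads, targets = generator setpoints (synthetic).
    X = torch.rand(512, 10)
    Y = torch.rand(512, 4)

    net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 4))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    lam, rho = 0.0, 0.01   # Lagrange multiplier and dual step size

    def violation(pred):
        # Hypothetical engineering constraint: predicted setpoints sum to <= 2.
        return torch.relu(pred.sum(dim=1) - 2.0).mean()

    for epoch in range(200):
        pred = net(X)
        # Primal step: task loss plus multiplier-weighted constraint violation.
        loss = nn.functional.mse_loss(pred, Y) + lam * violation(pred)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():          # dual step: lam <- lam + rho * violation
            lam += rho * violation(net(X)).item()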

09:14
Path-based Valid Inequalities for DC Power Systems with Discrete Transmission Considerations
PRESENTER: Kyle Skolfield

ABSTRACT. The stable and efficient operation of the transmission network is fundamental to the power system's ability to deliver electricity reliably and cheaply. One factor core to the network's functionality is its flexibility. Recent advances in transmission technology and control systems (e.g., transmission switching, line hardening) have begun to enable not only planning for expanding the network topology, but also the ability to control certain network characteristics. However, modeling these control choices exactly requires many discrete variables, rendering exact models intractable. Recent research has begun exploring how to exploit certain structural insights on these network characteristics to improve solution times for these intractable problems. This work proposes advanced optimization techniques that incorporate these insights into valid inequalities (VIs), or cutting planes, to improve solution times, as demonstrated in several exact models considering various forms of flexible transmission technologies. The validity of these VIs is theoretically proved and their effectiveness is demonstrated computationally herein.

10:30-12:00 Session SB1: Algorithmic Advances in Stochastic Programming II
10:30
Subtree Decomposition Method For Multistage Stochastic Programs

ABSTRACT. We develop a bounding approach for multi-stage stochastic programs based on scenario tree decomposition. The proposed methodology divides the scenario tree into subtrees that can be solved in parallel. This framework generalizes scenario decomposition by allowing any subtree of the original scenario tree. We present numerical experiments with large-scale instances of the multistage capacity acquisition problem.

10:52
An Algorithm for Parametric Quadratically Constrained Quadratic Programs
PRESENTER: Andrew Pangia

ABSTRACT. Quadratic programs with quadratic constraints appear in nonlinear network flow problems such as pooling. Optimization models of pooling problems frequently involve data that is uncertain or unknown and is therefore represented by parameters, converting the models into parametric programs. The advantage of parametric optimization is that the optimal solutions and optimal objective value come in the form of functions of the parameters over the parameter space. Using parametric programming, the programs can be solved before the solutions for a specific data scenario are actually needed, and, in time-sensitive situations, the only computations required are function evaluations for the specific parameter values stemming from the scenario. The disadvantage is that computing the solution and optimal value functions results in even slower optimization processes.

A recently developed multiparametric subgradient algorithm is implemented for solving multiparametric quadratically constrained quadratic programs (mp-QCQPs). The subgradient algorithm is implemented in MATLAB and makes use of Lagrangian relaxation and a multiparametric simplex approximation algorithm. The algorithm is tested on examples constructed deterministically and randomly. To our knowledge, mp-QCQPs have not been solved before, so there are no benchmarks to which the obtained results could be compared.

11:14
Sample Complexity of Sample Average Approximation for Conditional Stochastic Optimization
PRESENTER: Yifan Hu

ABSTRACT. In this paper, we study a class of stochastic optimization problems, referred to as Conditional Stochastic Optimization (CSO), of the form $\min_{x \in \mathcal{X}} \mathbb{E}_{\xi}\big[f_\xi\big(\mathbb{E}_{\eta|\xi}[g_\eta(x,\xi)]\big)\big]$. CSO finds a wide spectrum of applications, including portfolio selection, reinforcement learning, and robust learning. We establish the sample complexity of the sample average approximation (SAA) for CSO under a variety of structural assumptions, such as Lipschitz continuity, smoothness, and error bound conditions. We show that the total sample complexity improves from $\mathcal{O}(d/\epsilon^4)$ to $\mathcal{O}(d/\epsilon^3)$ when assuming smoothness of the outer function, and further to $\mathcal{O}(1/\epsilon^2)$ when the empirical function satisfies the quadratic growth condition. We also establish the sample complexity of a modified SAA when $\xi$ and $\eta$ are independent. Several numerical experiments further support our theoretical findings.
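
For concreteness, a tiny numpy sketch of the nested SAA estimator on a toy instance (the instance and sample sizes are my illustrative choices, not the paper's): the inner conditional expectation is replaced by an average of m_inner samples for each outer sample, which is exactly what drives the bias and the complexities above.

    import numpy as np

    rng = np.random.default_rng(3)

    def saa_objective(x, n_outer=500, m_inner=20):
        """Nested SAA of E_xi[ f( E_{eta|xi}[ g_eta(x, xi) ] ) ] for the toy
        instance f(u) = ||u||^2, g_eta(x, xi) = x + xi + eta."""
        total = 0.0
        for _ in range(n_outer):
            xi = rng.normal(0.5, 1.0, x.shape)
            g_bar = np.mean([x + xi + rng.normal(0.0, 1.0, x.shape)
                             for _ in range(m_inner)], axis=0)
            total += np.sum(g_bar ** 2)        # f applied to the inner average
        return total / n_outer

    x = np.zeros(3)
    # Roughly 3.9: the true value 3.75 plus an upward bias of about
    # d / m_inner = 3/20 from the inner sample averaging.
    print(saa_objective(x))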

11:36
Comparing model-free and model-based methods for multistage stochastic programming
PRESENTER: Simon Risanger

ABSTRACT. We compare model-based and model-free techniques for a prominent multistage stochastic problem, namely hydropower scheduling. Stochastic dual dynamic programming (SDDP) and deep deterministic policy gradients (DDPG) represent state-of-the-art techniques of the two paradigms. Both methods solve the scheduling problem under equal conditions using industry data. In a convex setting, SDDP outperforms DDPG because it explicitly takes time stages into consideration and has an exact representation of the piecewise linear efficiency curve to exploit optimal operating points. The neural network of DDPG is not able to uncover the efficiency curve. However, SDDP cannot handle non-convex settings, which pose no problem for DDPG. We show how model-free approaches are useful in non-convex cases, which occur in hydropower applications through binary dispatch decisions and efficiency curves that depend on the reservoir level. Consequently, we can uncover some notion of the cost of assuming a convex system. DDPG is a flexible algorithm, and we discuss how to improve its out-of-the-box performance.

10:30-12:00 Session SB2: Discrete Stochastic Programming
Location: Gold Ballroom
10:30
Variance Reduction in Stochastic Programming and Applications in Logistics
PRESENTER: Suvrajeet Sen

ABSTRACT. Stochastic Programming (SP) is a fundamental approach to modeling uncertainty in optimization models. However, SP poses several challenges. These include striking a balance among competing dimensions such as model size, solution quality, and solution time, and, last but not least, developing data resources that lead to timely and effective decision models. Unlike most other optimization paradigms, developing SP algorithms requires optimization methodology to draw upon other disciplines such as simulation (including variance reduction and simulation optimization), computer science, statistical learning (e.g., model selection), and others. In this talk, we will illustrate variance reduction ideas in stochastic MIPs and illustrate them with applications in logistics.

10:52
Robust Optimization with Decision-Dependent Information Discovery
PRESENTER: Phebe Vayanos

ABSTRACT. Robust optimization is a popular paradigm for modeling and solving two- and multi-stage decision-making problems affected by uncertainty. Most approaches assume that the uncertain parameters can be observed for free and that the sequence in which they are revealed is independent of the decision-maker's actions. Yet these assumptions fail to hold in many real-world applications where the time of information discovery is decision-dependent and the uncertain parameters only become observable after an often costly investment. To fill this gap, we consider two- and multi-stage robust optimization problems in which part of the decision variables control the time of information discovery. Thus, the information available at any given time is decision-dependent and can be discovered (at least in part) by making strategic exploratory investments in previous stages. We propose a novel dynamic formulation of the problem and prove its correctness. We leverage our model to provide a solution method inspired by the K-adaptability approximation, whereby K candidate strategies for each decision stage are chosen here-and-now and, at the beginning of each period, the best of these strategies is selected after the uncertain parameters that were chosen to be observed are revealed. We reformulate the problem as a mixed-binary linear program solvable with off-the-shelf solvers. We generalize our approach to the minimization of piecewise linear convex functions. We demonstrate the effectiveness of our approach on synthetic and real data instances of the active preference elicitation problem used to recommend policies that meet the needs of policy-makers at the Los Angeles Homeless Services Authority.

11:14
Joint chance-constrained programs and the intersection of mixing sets through a submodularity lens
PRESENTER: Dabeen Lee

ABSTRACT. The intersection of mixing sets with common binary variables arises when modeling joint linear chance-constrained programs with random right-hand sides and a finite sample space. In this talk, we first establish a strong and previously unrecognized connection of mixing sets to submodularity. This viewpoint enables us to unify and extend existing results on polyhedral structures of mixing sets. We then study the intersection of mixing sets with common binary variables, together with a linking constraint lower bounding a linear function of the continuous variables. We propose a new class of valid inequalities and characterize when this new class, along with the mixing inequalities, is sufficient to describe the convex hull.

11:36
Sample Average Approximation for Stochastic Nonconvex Mixed Integer Nonlinear Programming via Outer Approximation
PRESENTER: Can Li

ABSTRACT. Stochastic mixed-integer nonlinear programming (MINLP) is a very challenging class of problems. Although there have been recent advances in developing decomposition algorithms to solve stochastic MINLPs, none of the existing algorithms can address stochastic MINLPs with continuous distributions. We propose a sample average approximation-based outer approximation algorithm (SAAOA) that can address nonconvex two-stage stochastic programs (SPs) with any continuous or discrete probability distributions. The SAAOA algorithm does internal sampling within a nonconvex outer approximation algorithm, where we iterate between a mixed-integer linear programming (MILP) master problem and a nonconvex nonlinear programming (NLP) subproblem. We prove that the optimal solutions and optimal value obtained by the SAAOA algorithm converge to the optimal solutions and optimal value of the true SP as the sample size goes to infinity. The convergence rate is also given to estimate the sample size. However, the theoretical sample size estimate is too conservative in practice. Therefore, we propose an SAAOA algorithm with confidence intervals for the upper and lower bounds at each iteration. Two policies are proposed to update the sample sizes dynamically within this algorithm. The proposed algorithm works well for the special case of pure binary first-stage variables and continuous second-stage variables, since in this case the nonconvex NLPs can be solved for each scenario independently. The proposed algorithm is tested on a stochastic pooling problem and is shown to outperform the external sampling approach, where large-scale MINLPs need to be solved.

10:30-12:00 Session SB3: Integer Programming and Data-Driven Optimization
Chair: Andres Gomez
Location: Card Room
10:30
Robust Assortment Optimization under Product Unavailability
PRESENTER: Saharnaz Mehrani

ABSTRACT. The assortment planning problem is a central piece in the revenue management strategy of any company in the retail industry. In this paper, we formulate an assortment planning problem for substitutable products under a partial ranking-based choice model, where customers make decisions based on their ordered preference list of products. After realizing that their highly preferred products are unavailable, customers either choose to make no purchase or select a product from the available assortment. Our modeling approach incorporates the cumulative effect of finding multiple unavailable products on the customers' purchase decisions. We present a bi-level optimization approach to maximize the expected revenue, which is also robust with respect to the worst-case order of products in the preference lists of the customers. We provide a polynomial-time algorithm that optimally solves a special case of the unconstrained assortment planning problem under our choice model. For the general constrained version of the problem, we devise a solution procedure that includes a single-level reformulation and a cutting-plane approach to iteratively tighten the solution space. We also provide a greedy algorithm that can quickly solve large instances with small optimality gaps.

10:52
Precision Heparin Dosing using Mixed Integer Optimization
PRESENTER: Yonatan Mintz

ABSTRACT. Finding optimal dose levels for time-sensitive drugs such as heparin is crucial for avoiding adverse health effects. Computing these heparin doses for patients is particularly challenging since the concentration of heparin in the blood cannot be measured directly and the rates at which patients metabolize it vary greatly between individuals. In this talk we design a personalized framework for optimizing heparin doses. Using a pharmacokinetic model, we infer the current concentration of heparin and predict future therapeutic effects. We validate the predictive capabilities of our model against existing machine learning techniques and show how it can be used to optimize patient dosing.

11:14
Strong formulations for sparse regression
PRESENTER: Andres Gomez

ABSTRACT. In this talk we focus on sparse regression problems, that is, regression problems with an L0-norm component. The talk discusses the derivation of strong formulations for such problems that substantially improve upon existing formulations based on big-M constraints or the perspective reformulation. Although the continuous relaxations of the formulations are conic, they may not be directly amenable to use with existing branch-and-bound solvers. Thus, in this talk, we also outline some approaches to efficiently solve or exploit the resulting conic relaxations, paving the way for an integration with branch-and-bound methods.
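
For reference, the two baseline formulations mentioned above, in their standard textbook forms (a sketch only; the talk's strengthened formulations go beyond both): for $\min_\beta \|y - X\beta\|_2^2 + \mu\|\beta\|_2^2 + \lambda\|\beta\|_0$, the big-M formulation introduces indicators $z \in \{0,1\}^p$ with $-Mz_j \le \beta_j \le Mz_j$ and objective term $\lambda \sum_j z_j$, while the perspective reformulation replaces each ridge term $\beta_j^2$ with an auxiliary $s_j$ subject to the rotated second-order-cone constraints $\beta_j^2 \le s_j z_j$, which yields a substantially tighter relaxation whenever a separable quadratic term is present.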

11:36
A mixed-integer optimization approach for exhaustive cross-validated model selection

ABSTRACT. We consider a linear regression model for which we assume that many of the observed regressors are irrelevant for the prediction. In order to avoid overfitting, we want to conduct variable selection and only include the true predictors in the least-squares fitting. Best subset selection has gained a lot of interest in recent years for addressing this objective. In this method, a mixed-integer optimization problem is solved that finds the subset, no larger than a given natural number k, that is optimal with respect to the in-sample error. In practice, a best subset selection is computed for each k, and the ideal k is then chosen via validation. We argue that this notion of best subset selection might be misaligned with the statistical intention: only the sparsity is selected via validation, whereas the best cardinality-constrained subset is selected according to the training error. We address this issue by proposing a discrete optimization formulation that conducts an in-model cross-validation. The proposed program is only allowed to fit coefficients to training data, but it can choose to switch variables on and off in order to minimize the validation error of the cross-validation. Moreover, novel coefficient bounds are presented, which are also valid for the best subset selection problem. We show that these bounds improve the computational efficiency significantly in many settings. Finally, we conduct a simulation study and provide evidence that the novel mixed-integer formulation provides more accurate predictions than best subset selection and the lasso method.
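
For orientation, the classical best-subset baseline that the abstract contrasts against can be written as a small mixed-integer quadratic program. A cvxpy sketch on synthetic data (the big-M value and data are illustrative choices; this is the baseline, not the proposed cross-validated formulation):

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(4)

    n, p, k, M = 60, 12, 3, 10.0   # M: a (hypothetical) valid coefficient bound
    X = rng.normal(size=(n, p))
    beta_true = np.zeros(p)
    beta_true[:k] = [2.0, -3.0, 1.5]
    y = X @ beta_true + 0.1 * rng.normal(size=n)

    beta = cp.Variable(p)
    z = cp.Variable(p, boolean=True)
    prob = cp.Problem(
        cp.Minimize(cp.sum_squares(y - X @ beta)),   # in-sample least squares
        [beta <= M * z, beta >= -M * z,              # big-M on/off switches
         cp.sum(z) <= k],                            # cardinality constraint
    )
    prob.solve()   # needs a MIQP-capable solver, e.g. SCIP, GUROBI, or MOSEK
    print(np.round(beta.value, 2))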

10:30-12:00 Session SB4: Conic Optimization and Nonnegative Polynomials
10:30
On the tightness of SDP relaxations of QCQPs
PRESENTER: Alex Wang

ABSTRACT. Quadratically constrained quadratic programs (QCQPs) are a fundamental class of optimization problems well-known to be NP-hard in general. In this paper we study conditions under which the standard semidefinite program (SDP) relaxation of a QCQP is tight. We begin by outlining a general framework for proving such sufficient conditions. Then using this framework, we show that the SDP relaxation is tight whenever the quadratic eigenvalue multiplicity, a parameter capturing the amount of symmetry present in a given problem, is large enough. We present similar sufficient conditions under which the projected epigraph of the SDP gives the convex hull of the epigraph in the original QCQP. Our results also imply new sufficient conditions for the tightness (as well as convex hull exactness) of a second order cone program relaxation of simultaneously diagonalizable QCQPs.
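
For reference, the standard (Shor) SDP relaxation studied here, in its textbook form (a sketch of the standard construction): a QCQP $\min_x x^\top A_0 x + 2b_0^\top x + c_0$ subject to $x^\top A_i x + 2b_i^\top x + c_i \le 0$ is lifted by introducing a matrix variable $X$ in place of $xx^\top$, giving the SDP $\min_{x,X} \langle A_0, X\rangle + 2b_0^\top x + c_0$ subject to $\langle A_i, X\rangle + 2b_i^\top x + c_i \le 0$ and $\begin{pmatrix} 1 & x^\top \\ x & X \end{pmatrix} \succeq 0$. The relaxation is tight when this SDP admits a rank-one optimal solution, i.e., one with $X = xx^\top$.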

10:52
Sparse PCA under a conic optimization lens: MISDP reformulations and scalable SOCP bounds
PRESENTER: Ryan Cory-Wright

ABSTRACT. We study the problem of selecting the leading sparse principal component of a positive semidefinite matrix. Existing state-of-the-art approaches do not supply certifiably optimal principal components with more than n=100 covariates. By reformulating the sparse PCA problem as a convex mixed-integer semidefinite optimization problem, we propose a cutting-plane method which solves the problem to certifiable optimality at a larger scale. We also propose a scalable technique for obtaining high-quality dual bounds, namely taking an appropriate second-order cone relaxation and adding a small number of cuts. We illustrate our approach's effectiveness by engineering interpretable principal components for real-world datasets which arise in medicine and finance.

11:14
On Local Minima of Cubic Polynomials
PRESENTER: Jeffrey Zhang

ABSTRACT. We study local minima of cubic polynomials. We first give a characterization of local minima, and show that this characterization can be checked in polynomial time. We then give an SDP approach for finding local minima of cubic polynomials.

10:30-12:00 Session SB5: Interplay between Optimization and Statistics I
Location: Furman Room
10:30
Bridging Convex and Nonconvex Optimization in Robust PCA: Noise, Outliers, and Missing Data
PRESENTER: Yuxin Chen

ABSTRACT. This paper delivers improved stability guarantees for the convex programming approach in low-rank matrix estimation, in the presence of (1) random noise, (2) gross sparse outliers, and (3) missing data. This problem, often dubbed robust principal component analysis (robust PCA), finds applications in various domains. Despite the wide applicability of convex relaxation, the available theoretical support (particularly the stability analysis vis-à-vis random noise) remains highly suboptimal, which we strengthen in this paper. When the unknown matrix is well-conditioned, incoherent, and of constant rank, we demonstrate that a principled convex program achieves near-optimal stability guarantees, in terms of both the Euclidean loss and the entrywise loss. All of this happens even when nearly a constant fraction of observations are corrupted by outliers with arbitrary magnitudes. The key analysis idea lies in bridging the convex program in use and an auxiliary nonconvex optimization algorithm, and hence the title of this paper.
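
In its simplest (fully observed, noiseless) form, the convex program in question is nuclear-norm-plus-entrywise-L1 minimization. A cvxpy sketch on synthetic data (the trade-off weight $1/\sqrt{n}$ is the usual theoretical choice; the paper analyzes richer variants with noise and missing entries):

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(5)

    # Synthetic low-rank + sparse data.
    n, r = 30, 2
    L0 = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))
    S0 = np.zeros((n, n))
    mask = rng.random((n, n)) < 0.05                 # ~5% gross outliers
    S0[mask] = rng.normal(scale=10.0, size=mask.sum())
    M = L0 + S0

    L, S = cp.Variable((n, n)), cp.Variable((n, n))
    lam = 1.0 / np.sqrt(n)
    prob = cp.Problem(cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S))),
                      [L + S == M])
    prob.solve()
    # Relative recovery error of the low-rank part (should be small).
    print(np.linalg.norm(L.value - L0) / np.linalg.norm(L0))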

10:52
Learning Dynamical Systems with Side Information
PRESENTER: Bachir El Khadir

ABSTRACT. We present a mathematical formalism and a computational framework for the problem of learning a dynamical system from noisy observations of a few trajectories and subject to side information (e.g., physical laws or contextual knowledge). This is motivated by safety-critical applications where an unknown dynamical system needs to be controlled after a very short learning phase where a few of its trajectories are observed. (Imagine, e.g., the task of autonomously landing a passenger airplane that has gone through sudden wing damage.) We identify six classes of side information which can be imposed by semidefinite programming and that arise naturally in many applications. We demonstrate their value on two examples from epidemiology and physics. Some density results on polynomial dynamical systems that either exactly or approximately satisfy side information are also presented.

11:14
Composite optimization for blind deconvolution

ABSTRACT. The blind deconvolution problem seeks to recover a pair of vectors from a set of rank-one bilinear measurements. This problem appears in an array of different fields, including image processing, coding theory, and wireless communications. In this talk I will present a natural nonsmooth formulation of the problem, study its nonsmooth landscape, and show that, under standard statistical assumptions, it satisfies nice regularity properties (even when a constant fraction of the measurements are arbitrarily corrupted). These properties cause standard algorithms, such as the subgradient and prox-linear methods, to converge at a rapid dimension-independent rate when initialized within constant relative error of the solution. Furthermore, I will introduce an initialization strategy complementing the local search algorithms; this procedure is both provably efficient and robust to outlying measurements. This is joint work with V. Charisopoulos, D. Davis, and D. Drusvyatskiy.

11:36
Structured local minima in some nonconvex optimization problems

ABSTRACT. This talk focuses on the structure of local minima of some nonconvex problems. We consider 1) short and sparse blind deconvolution and 2) the K-means clustering problem. The short and sparse blind deconvolution problem aims to recover a short signature pattern from its repeated occurrences, and arises in several important applications, including microscopy image analysis, neural spike sorting, and image deblurring. The K-means problem aims to partition observations into k clusters such that the within-cluster variation is minimized. Both problems are nonconvex, and our research shows that, under certain conditions, all local solutions are structured. Investigating the structure of these local solutions yields reliable algorithms for inferring the global solution from a local one.

10:30-12:00 Session SB6: Renewable Energy
Location: Palmetto Room
10:30
Data driven stochastic optimization applied to renewable energy penetration in power systems

ABSTRACT. Data-driven stochastic optimization plays an important role in today's energy planning. The challenge of incorporating renewable sources into existing systems has completely changed the industry paradigm, from the analysis of two or three scenarios (pessimistic, intermediate, and optimistic) to dozens, hundreds, or even thousands of scenarios. Although systems with relevant penetration have been studied for a few decades with the multi-scenario paradigm in mind, the new trend of variable renewable energy sources, such as wind and solar, has been pushing thermal-dominated systems toward this new model. System details that were once precisely known by planners and operators become more uncertain every day. Tools from statistics and stochastic and integer optimization are the key to understanding and planning these new systems, leveraging historical and sometimes limited data. We will show how state-of-the-art algorithms and methodologies are applied to today's problems concerning renewable energy integration in different regions of the globe. We highlight how these tools can be used to improve the planning, expansion, and operation of power systems, with real-world-sized case studies. Moreover, we highlight how computational tools, ranging from computing languages to parallel computing environments, are crucial to solving the large-scale problems of real life.

10:52
Exploiting almost identical generators in unit commitment
PRESENTER: Jonathan Schrock

ABSTRACT. It is common to find similar, but not identical power generators in real power generator scheduling problems. We propose a strategy to exploit this "near symmetry" that improves computational performance.

11:14
Spatio-temporal forecasting for wind energy and its impact on wind farm operations

ABSTRACT. Several operational decisions in wind farms are contingent on short-term wind speed and power predictions. In this talk, I will present a probabilistic spatio-temporal forecasting model in which the specification of certain parameters is made in light of wind propagation patterns on a farm. Tested on one year of turbine-specific wind farm data, double-digit percentage improvements in terms of both wind speed and power prediction accuracy were achieved relative to a set of widely used forecasting methods. Connections to how these probabilistic predictions can be optimally used in wind farm operations will be demonstrated.

10:30-12:00 Session SB7: Advances in Computational Optimization
Location: Alexander Room
10:30
Understanding Limitation of Two Symmetrized Orders by Worst-case Complexity
PRESENTER: Peijun Xiao

ABSTRACT. It was recently found that the standard version of multi-block cyclic ADMM diverges. Interestingly, Gaussian Back Substitution ADMM (GBS-ADMM) and symmetric Gauss-Seidel ADMM (sGS-ADMM) do not have the divergence issue. Therefore, it seems that symmetrization can improve the performance of the classical cyclic order. In another recent work, cyclic CD (coordinate descent) was shown to be $\mathcal{O}(n^2)$ times slower than randomized versions in the worst case. A natural question arises: can the symmetrized orders achieve a faster convergence rate than the cyclic order, or even get close to the randomized versions? In this paper, we give a negative answer to this question. We show that both the Gaussian Back Substitution and symmetric Gauss-Seidel orders suffer from the same slow convergence issue as the cyclic order in the worst case. In particular, we prove that for unconstrained problems they can be $\mathcal{O}(n^2)$ times slower than R-CD. For linearly constrained problems with quadratic objectives, we empirically show that the convergence speed of GBS-ADMM and sGS-ADMM can be roughly $\mathcal{O}(n^2)$ times slower than randomly permuted ADMM.
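
The cyclic-versus-randomized gap is easy to observe numerically. A small numpy experiment in the spirit of the known worst-case instances, a quadratic with equal off-diagonal coupling (my construction for illustration, not the paper's exact instance):

    import numpy as np

    rng = np.random.default_rng(6)

    def cd(A, b, order_fn, n_epochs=50):
        """Exact coordinate descent on f(x) = 0.5 x'Ax - b'x (A pos. def.):
        each step minimizes f exactly in one coordinate."""
        x = np.zeros(len(b))
        for _ in range(n_epochs):
            for i in order_fn(len(b)):
                x[i] += (b[i] - A[i] @ x) / A[i, i]   # exact 1-D minimization
        return x

    # Strongly coupled instance (equal off-diagonals), where cyclic orders
    # are known to lag far behind randomized ones.
    n = 100
    A = np.full((n, n), 0.9) + 0.1 * np.eye(n)
    b = rng.normal(size=n)
    x_star = np.linalg.solve(A, b)

    cyclic = cd(A, b, lambda n: range(n))
    random_perm = cd(A, b, lambda n: rng.permutation(n))
    for name, x in [("cyclic", cyclic), ("random", random_perm)]:
        print(name, np.linalg.norm(x - x_star))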

10:52
Surrogate Optimization of Deep Neural Networks for Groundwater Predictions
PRESENTER: Juliane Mueller

ABSTRACT. Sustainable management of groundwater resources under changing climatic conditions requires reliable and accurate predictions of groundwater levels. Mechanistic multi-scale, multi-physics simulation models are often too hard to use for this purpose, especially for groundwater managers who do not have access to complex compute resources and data. Therefore, we analyze the applicability and performance of four modern deep learning computational models for predicting groundwater levels. We compare three methods for optimizing the models' hyperparameters, including two surrogate model-based algorithms and a random sampling method. We test the models by comparing their predictions with groundwater level data in Butte County, California, USA, taking into account the temporal variability of streamflow, precipitation, and ambient temperature. Our numerical study shows that the optimization of the hyperparameters can lead to reasonably accurate performance of all models, but the "simplest" network, namely a multilayer perceptron (MLP), performs overall better than the more advanced long short-term memory or convolutional neural networks for learning and predicting groundwater data, in terms of both prediction accuracy and time-to-solution. Thus, the MLP is a suitable candidate for groundwater prediction and for integration in decision support tools for groundwater management.

11:14
New Bregman proximal type algorithms for solving DC optimization problems
PRESENTER: Shota Takahashi

ABSTRACT. Difference-of-convex (DC) optimization problems have objective functions that are differences of two convex functions. The proximal DC algorithm is a representative algorithm for solving large-scale DC problems; however, it requires the objective functions to have globally Lipschitz continuous gradients. Using Bregman distances, which measure the proximity between two points, we propose an algorithm that does not require Lipschitz continuity of the gradients, called the Bregman Proximal DC Algorithm (BPDCA). In addition, we accelerate it by an extrapolation technique, obtaining the Bregman Proximal DC Algorithm with extrapolation (BPDCAe). Assuming the Kurdyka-Łojasiewicz property of the objective function, we establish the global convergence of the iterates of both algorithms to a stationary point of the problem. We apply our algorithms to phase retrieval, a nonconvex problem that can be described as a DC problem; this formulation allows larger step sizes than treating it as a general nonconvex problem. In numerical experiments, we show that the BPDCAe is faster for this application.

11:36
New Multi-Step Conjugate Gradient Method for Optimization

ABSTRACT. Multi-step methods are secant-like techniques of the quasi-Newton type that, unlike the classical methods, construct nonlinear alternatives to the quantities in the so-called secant equation. Multi-step methods utilize data available from the m most recent iterations and thus create an alternative to the secant equation, with the intention of producing better Hessian approximations that induce faster convergence to the minimizer of the objective function f. Based on numerical results reported in several research papers on the subject, these methods introduce substantial savings in both iteration and function-evaluation counts. Encouraged by the successful performance of the methods, we explore in this paper using them to develop a new conjugate gradient (CG) algorithm. CG methods are popular for large problems and in situations where memory resources are scarce. The numerical experimentation on the new method is encouraging and opens an avenue for further investigation of such techniques to explore their merits in a multitude of applications.

13:45-14:45 Session Plenary I: Learning to Optimize and Optimizing Learning (Speaker: Pascal Van Hentenryck)

Abstract:

The boundaries between machine learning and optimization are continuously blurring as machine learning can be used to approximate optimization problems and optimization can expand the scope and applicability of learning systems. Moreover, the tight integration of machine learning and optimization opens new possibilities for both fields. This talk reviews some interesting developments in this space through applications in power systems and mobility.

Bio: 

Pascal Van Hentenryck is the A. Russell Chandler III Chair and Professor in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology, the Associate Chair for Innovation and Entrepreneurship, and the director of the Socially Aware Mobility lab. Van Hentenryck is an INFORMS Fellow and a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), and the recipient of two honorary doctoral degrees. Several of his optimization systems, including the CHIP and OPL systems, have been in commercial use for more than 20 years. His current research focuses on machine learning, optimization, and privacy with applications in mobility, energy, and resilience.

Location: Gold Ballroom
15:15-16:45 Session SC1: Advances in Approximate Dynamic Programming and Reinforcement Learning
15:15
Self-guided Approximate Linear Programs
PRESENTER: Parshan Pakiman

ABSTRACT. Approximate linear programs (ALPs) are well-known models based on value function approximations (VFAs) to obtain heuristic policies and lower bounds on the optimal policy cost of Markov decision processes (MDPs). The ALP VFA is a linear combination of predefined basis functions that are chosen using domain knowledge and updated heuristically if the ALP optimality gap is large. We side-step the need for such basis function engineering in ALP – an implementation bottleneck – by proposing a sequence of ALPs that embed increasing numbers of random basis functions obtained via inexpensive sampling. We provide a sampling guarantee and show that the VFAs from this sequence of models converge to the exact value function. Nevertheless, the performance of the ALP policy can fluctuate significantly as more basis functions are sampled. To mitigate these fluctuations, we "self-guide" our convergent sequence of ALPs using past VFA information such that a worst-case measure of policy performance is improved. We perform numerical experiments on perishable inventory control and generalized joint replenishment applications, which, respectively, give rise to challenging discounted-cost MDPs and average-cost semi-MDPs. We find that self-guided ALPs (i) significantly reduce policy fluctuations and improve the optimality gaps from an ALP approach that employs basis functions tailored to the former application, and (ii) deliver optimality gaps that are comparable to a known adaptive basis function generation approach targeting the latter application. More broadly, our methodology provides application-agnostic policies and lower bounds to benchmark approaches that exploit application structure.

15:37
A Data Efficient and Feasible Level Set Method for Stochastic Convex Optimization with Expectation Constraints

ABSTRACT. Stochastic convex optimization problems with expectation constraints (SOECs) are encountered in several application domains. In data-rich environments, the SOEC objective and constraints contain expectations defined with respect to large datasets. Therefore, efficient algorithms for solving such SOECs need to limit the fraction of data points that they use, which we refer to as algorithmic data complexity. Recent stochastic first order methods exhibit low data complexity when handling SOECs but guarantee near-feasibility and near-optimality only at convergence. These methods may thus return highly infeasible solutions when heuristically terminated, as is often the case, due to theoretical convergence criteria being highly conservative. This issue limits the use of first order methods in several applications where the SOEC constraints encode implementation requirements. We design a stochastic feasible level set method (SFLS) for SOECs that has low data complexity and emphasizes feasibility before convergence. We establish that SFLS maintains a high-probability feasible solution at each level-set iteration and exhibits favorable iteration complexity compared to state-of-the-art deterministic feasible level set and stochastic subgradient methods. Numerical experiments on three applications validate SFLS’s low data complexity relative to the former approach and highlight how SFLS finds feasible solutions with small optimality gaps faster than the latter method.
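
As background, feasible level-set methods of this type are built around the level-set function of the problem $\min\{ f(x) : g(x) \le 0 \}$ (a schematic form, not the paper's exact construction):
$$ H(r) \;=\; \min_{x} \ \max\{\, f(x) - r, \ g(x) \,\}, $$
which is non-increasing in $r$ and, under suitable regularity, has root $r^* = f^*$. Any $x$ attaining $H(r) \le 0$ satisfies $g(x) \le 0$, so a root-finding scheme on $r$ that maintains $H(r) \le 0$ always carries a feasible incumbent.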

15:59
Adaptive Approximate Linear Programs based on Index Networks for Solving Weakly-coupled Dynamic Programs

ABSTRACT. We introduce a new class of approximate linear programs (ALPs) for weakly-coupled dynamic programs by using networks to model the combinatorial structure embedded in the indices of the standard ALP formulation. An interesting feature of the resulting model is that it adapts to the level of coupling introduced by the constraints linking the component dynamic programs. Intuitively, the size of the resulting formulation depends on the degree to which the dynamic programs are linked; small models result when the linking constraints are either very mild or overly restrictive. We establish that our adaptive ALP formulation is of polynomial size for known, as well as new, linking constraints connecting the dynamic programs. In addition, even when our adaptive formulation is not polynomially sized, it can still be exponentially smaller than the standard ALP formulation. Numerical results and policy performance are presented for difficult combinatorial versions of multi-armed bandit problems, demonstrating orders-of-magnitude speed-ups in the solution time to obtain the ALP VFA in comparison to classical formulations.

16:21
Lookahead-bounded Q-learning
PRESENTER: Ibrahim El Shar

ABSTRACT. We introduce the lookahead-bounded Q-learning (LBQL) algorithm, a new, provably convergent variant of Q-learning that seeks to address the poor performance of the Q-learning algorithm in stochastic environments through the use of “lookahead” upper and lower bounds. The poor performance stems from the well-known overestimation problem and from the fact that Q-learning requires a large amount of experience to converge; reducing this requirement is especially appealing in real-world problems that involve expensive simulations or real-world interactions. To do this, LBQL employs previously collected experience and each iteration's state-action values as dual feasible penalties to construct a sequence of sampled information relaxation problems. The solutions to these problems provide estimated upper and lower bounds on the optimal value, which we track via stochastic approximation. These quantities are then used to enhance the performance of standard Q-learning by constraining the iterates to stay within the bounds at every iteration. We show that the three interrelated stochastic processes, representing the Q-iterates, upper bounds, and lower bounds, each converge to the optimal state-action values with probability one. Numerical experiments confirm the fast convergence of LBQL as compared to the standard Q-learning algorithm and several related techniques, such as double Q-learning, speedy Q-learning, and bias-corrected Q-learning.
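
To make the bound-projection idea concrete, here is a hypothetical Julia sketch of a single tabular update (the names and the bound arrays L and U are illustrative; the actual LBQL construction of bounds via information relaxation penalties is not shown):

    # Q, L, U: S×A arrays of Q-iterates and lookahead lower/upper bound estimates.
    # (s, a, r, s′): observed transition; α: step size; γ: discount factor.
    function bounded_q_update!(Q, L, U, s, a, r, s′, α, γ)
        target = r + γ * maximum(Q[s′, :])      # standard one-step Q-learning target
        q = (1 - α) * Q[s, a] + α * target      # stochastic-approximation update
        Q[s, a] = clamp(q, L[s, a], U[s, a])    # project the iterate into [lower, upper]
        return Q[s, a]
    end

Constraining the iterate in this way is what distinguishes the update from plain Q-learning.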

15:15-16:45 Session SC2: Global Optimization I
Location: Gold Ballroom
15:15
Dualization and Bilevel optimization in JuMP
PRESENTER: Guilherme Bodin

ABSTRACT. In this talk, we present two extensions to the JuMP environment for mathematical programming in Julia. The first extension is Dualization.jl, a package that helps users rapidly exploit different formulations of their optimization problems. The functionality can dualize any conic convex optimization problem, and the dual formulation can be used to better match the description of the optimization problem to the format expected by a conic solver. Moreover, automatic dualization can be used to model bilevel problems by automatically building KKT conditions, which leads us to the second extension: BilevelJuMP.jl, a package that makes it straightforward for users to write bilevel problems, just as JuMP made it easy to write optimization problems. BilevelJuMP.jl uses Dualization.jl to generate the dual constraints of the KKT conditions and offers multiple formulations for the complementarity constraints, such as SOS1, Fortuny-Amat, quadratic programming, and actual complementarity constraints. The linear problems can be solved with mixed-integer or nonlinear solvers, while conic problems require similar solvers that are aware of conic constraints. In this talk, we show how the packages were developed on top of Julia and the JuMP infrastructure, and we present some examples of automatic dualization and bilevel conic modeling.
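
A minimal usage sketch in Julia (assuming the packages' documented entry points; the tiny models and the solver choice are illustrative only):

    using JuMP, Dualization, BilevelJuMP, Ipopt

    # Dualization.jl: automatically build the conic dual of a JuMP model.
    primal = Model()
    @variable(primal, x >= 0)
    @constraint(primal, 2x >= 1)
    @objective(primal, Min, x)
    dual_model = dualize(primal)

    # BilevelJuMP.jl: a small bilevel model; the lower level is replaced by its
    # KKT conditions, with complementarity handled here via ProductMode
    # (SOS1 and Fortuny-Amat big-M reformulations are alternatives).
    model = BilevelModel(Ipopt.Optimizer, mode = BilevelJuMP.ProductMode(1e-9))
    @variable(Upper(model), y >= 0)
    @variable(Lower(model), z >= 0)
    @objective(Upper(model), Min, 3z + y)
    @constraint(Upper(model), y <= 8)
    @objective(Lower(model), Min, -z)
    @constraint(Lower(model), z + y <= 8)
    optimize!(model)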

15:37
Disjunctive cuts for Mixed-Integer Conic Optimization
PRESENTER: Mathieu Tanneau

ABSTRACT. Motivated by the recent success of conic formulations for Mixed-Integer Convex Optimization, we investigate the impact of disjunctive cutting planes for Mixed-Integer Conic Optimization. We show that conic strong duality, guaranteed by a careful selection of a novel normalization in the conic separation problem, together with the numerical maturity of interior-point methods for conic optimization, resolves the theoretical and numerical issues encountered by many authors since the late 90s. As a result, the proposed approach allows algorithmic flexibility in how the conic separation problem is solved, and the resulting cuts are shown to be computationally effective at closing a significant amount of gap for a large collection of instances.

15:59
Hypatia: Generic Nonsymmetric Conic Optimization in Julia
PRESENTER: Chris Coey

ABSTRACT. Hypatia is an open-source optimization solver in Julia and is accessible through a native interface and through JuMP/MathOptInterface. Hypatia makes it easy to model and solve primal-dual conic problems involving general convex cones for which appropriate primal or dual barrier functions are known. We introduce Hypatia's interfaces and algorithms and offer a taste of "natural" nonsymmetric conic formulations and applications.
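
An illustrative JuMP call (a sketch assuming the standard MathOptInterface wrapper; Hypatia's native interface and user-defined cones are not shown):

    using JuMP, Hypatia

    model = Model(Hypatia.Optimizer)
    @variable(model, x)
    # (u, v, w) in the exponential cone means v * exp(u / v) <= w with v > 0;
    # fixing v = 1 and w = e forces x <= 1 at the optimum.
    @constraint(model, [x, 1.0, exp(1.0)] in MOI.ExponentialCone())
    @objective(model, Max, x)
    optimize!(model)    # objective_value(model) ≈ 1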

16:21
Hypatia: formulations, linear algebra, and preliminary computational results
PRESENTER: Lea Kapelevich

ABSTRACT. We present Hypatia.jl, an open-source and highly-configurable conic interior point solver written in Julia. We demonstrate the generality of Hypatia's novel primal-dual algorithms for nonsymmetric cone optimization and discuss algorithmic features such as various linear system solvers. We present computational results using a set of example problems involving nonsymmetric cones.

15:15-16:45 Session SC3: In Memory of Dave Shanno I
Location: Card Room
15:15
In memory of Dave Shanno: Revisiting Quasi-Newton Methods

ABSTRACT. Over the last decade, Dave and I had been working on using regularization in nonlinear programming algorithms to solve difficult problems and to remedy classic, known shortcomings of interior-point methods and quasi-Newton methods. This talk will focus on our most recent efforts: the application of cubic regularization in the context of the symmetric rank-one and conjugate gradient methods for nonlinear programming.

15:37
In Memory of Dave Shanno: Objective Bounds from ADMM Algorithms

ABSTRACT. Dave Shanno recruited me to Rutgers and was my mentor for my first ten years there. I will start with some reminiscences about Dave. I will also describe a simple result about obtaining optimal solution value bounds from ADMM algorithms, with some LASSO problem examples.

15:59
Solving the Problem of Portfolio VAR Minimization as a Nonlinear Program

ABSTRACT. Minimizing the Value at Risk (VAR) of a financial portfolio is challenging, as it is a non-convex global optimization problem. In previous work the problem was formulated as an MPEC (mathematical program with equilibrium constraints) that was solved using branch-and-bound techniques. We show that the same MPEC can be solved effectively as a nonlinear program, specifically by use of interior-point methods, and that this is a flexible approach that can easily incorporate additional constraints on the optimal portfolio. Promising numerical results are presented that demonstrate that this approach robustly solves the problem of finding a local minimizer, and can also be adapted to the global problem.

16:21
In memory of Dave Shanno: from optimization to machine learning, it’s all the same data

ABSTRACT. In market basket data there is a row for each basket, and the row contains the items that were placed in the basket. There is a natural graph that represents a set of baskets. This graph has a node for each item and arcs connecting items that have appeared together in at least one basket. This graph can be used as the basis for optimization or for machine learning. If the items belong to categories, then they can be organized into columns. The columns are features and the items are options. An optimization problem is then to decide which variants of a configurable product should be built in advance to capture as much of the expected demand as possible. Regression starts with a table of rows and columns. The usual practice is to convert categorical variables into pseudo-numeric ones. But numeric variables can instead be converted into categorical ones by banding. If all the variables are categorical, then we can do prediction by moving around a graph, rather than moving around in a metric space. Predicting the value of one special variable by using the values of other variables is now known as machine learning.

15:15-16:45 Session SC4: The Interface of Optimization and Machine Learning
15:15
First-order methods for large-scale linear programming
PRESENTER: Miles Lubin

ABSTRACT. With the goal of solving linear programming (LP) problems 100x-1000x larger than the limits of state-of-the-art solvers based on the simplex and interior-point methods, we report on our experiments developing first-order methods for LP that require only matrix-vector products rather than matrix factorizations. We describe our refinements to the primal-dual hybrid gradient (Zhu and Chan (2008); Chambolle and Pock (2011)) and mirror prox (Nemirovski (2004)) algorithms that have improved both theoretical and empirical performance. Our approach compares favorably with SCS, an open-source ADMM-based solver, and has solved LPs with over 1B variables and 6B nonzero coefficients.
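
A textbook sketch of the primal-dual hybrid gradient iteration for min c'x s.t. Ax = b, x >= 0 (illustrative only; the refinements described in the talk are not included):

    using LinearAlgebra

    # A is dense here for simplicity; only matrix-vector products are needed.
    function pdhg(c, A, b; iters = 10_000)
        τ = σ = 0.9 / opnorm(A)                 # step sizes with τ*σ*‖A‖² < 1
        x = zeros(length(c)); y = zeros(length(b))
        x_prev = copy(x)
        for _ in 1:iters
            x_prev .= x
            x .= max.(x .- τ .* (c .- A' * y), 0.0)    # projected primal step
            y .+= σ .* (b .- A * (2 .* x .- x_prev))   # dual step at extrapolated point
        end
        return x, y
    end

For instance, pdhg([1.0, 2.0], [1.0 1.0], [1.0]) converges toward the optimal vertex (1, 0).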

15:37
Strong mixed-integer programming formulations for trained neural networks
PRESENTER: Joey Huchette

ABSTRACT. We present mixed-integer programming (MIP) formulations for high-dimensional piecewise linear functions that correspond to trained neural networks. These formulations can be used for a number of important tasks, such as: 1) verifying that an image classification network is robust to adversarial inputs, 2) designing DNA sequences that exhibit desirable therapeutic properties, 3) producing good candidate policies as a subroutine in deep reinforcement learning algorithms, and 4) solving decision problems with machine learning models embedded inside (i.e. the "predict, then optimize" paradigm). We provide formulations for networks with many of the most popular nonlinear operations (e.g. ReLU and max pooling) that are strictly stronger than other approaches from the literature. We corroborate this computationally on image classification verification tasks, where we show that our formulations are able to solve to optimality in orders of magnitude less time than existing methods.
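
For reference, the common baseline that stronger formulations improve upon is the big-M model of a single ReLU unit $y = \max(0, w^\top x + b)$ with known pre-activation bounds $L \le w^\top x + b \le U$ (with $L < 0 < U$):
$$ y \ge w^\top x + b, \qquad y \ge 0, \qquad y \le w^\top x + b - L(1 - z), \qquad y \le U z, \qquad z \in \{0, 1\}, $$
whose LP relaxation is generally weak; the formulations in this talk are strictly tighter than such baselines.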

15:59
Risk Guarantees For End-to-end Prediction and Optimization Processes
PRESENTER: Nam Ho-Nguyen

ABSTRACT. In real-world decision-making, prediction models are often used to estimate parameters of optimization models. While it is usually believed that good prediction performance should result in good subsequent optimization performance, formal theoretical guarantees on this are lacking entirely. We provide two sufficient conditions on the prediction loss function which guarantee that good prediction results in good optimization, and derive explicit relations between the two. In a preliminary computational study on fractional knapsack problems, we compare the optimization performance of using several Fisher consistent prediction loss functions with a provably inconsistent loss function.

15:15-16:45 Session SC5: Social Good I
Location: Furman Room
15:15
Exploring Algorithmic Fairness in Robust Graph Covering Problems

ABSTRACT. Algorithms are increasingly being deployed in settings with unanticipated challenges and complex social effects. Motivated by real-world deployment of social-network based suicide prevention and landslide risk management interventions, this paper focuses on robust graph covering problems subject to group fairness constraints. We show that, in the absence of fairness constraints, state-of-the-art algorithms for the robust graph covering problem result in biased node coverage: they tend to discriminate against individuals (nodes) based on membership in traditionally marginalized groups. To mitigate this issue, we propose a novel formulation of the robust graph covering problem with group fairness constraints and a tractable approximation scheme applicable to real-world instances. We provide a formal analysis of the price of group fairness (PoF). Finally, we demonstrate the effectiveness of our approach on several real-world social networks. Our method yields competitive node coverage while significantly improving group fairness relative to state-of-the-art methods.

15:37
Optimization Under Uncertainty in Kidney Exchange
PRESENTER: Duncan Mcelfresh

ABSTRACT. Kidney exchange is a real-world barter market in which patients with end-stage renal disease swap their paired willing living donors. Exchanges account for roughly 10% of living donations in the US, and their use is growing rapidly worldwide. Computationally, kidney exchange is a set packing problem on a directed graph, consisting of both cyclic swaps and one-directional "chains" initiated by an unpaired donor. In general this problem is NP-hard; however, many efficient mixed-integer linear formulations exist and are used in fielded exchanges. As with many real-world problems, kidney exchange is plagued with uncertainty; we focus on two types: uncertainty in the quality and uncertainty in the existence of a particular kidney transplant. We cover recent work by the authors addressing these forms of uncertainty using approaches from robust optimization, stochastic optimization, and conditional value at risk. We briefly relate this work to two areas of ongoing research in kidney exchange: online algorithms for kidney exchange (accounting for future arrivals and departures of patients and donors), and incorporating various human value judgments (which may conflict with each other, or with medical priorities).

15:59
Robust two echelon vehicle and drone routing for humanitarian relief operations

ABSTRACT. In this talk, we investigate the problem of using uncrewed autonomous vehicles for detecting demand and routing first-aid items after a disaster. We present new formulations for a two-echelon capacitated vehicle routing problem as well as an efficient decomposition approach that splits the problem into manageable subproblems. Our work proposes two approaches to handle demand uncertainty. In the first, we employ UAVs to provide communications capabilities to trapped populations before sketching any routes; to solve this problem, we provide a decomposition and route-generation approach. In the second, we devise a two-stage robust two-echelon capacitated vehicle routing problem, which is then solved through a combination of outer-level column-and-constraint generation with an inner-level column-generation method. Finally, we present computational results from a real-life setting: we use instances generated from the publicly available data from Hurricane Maria in Puerto Rico in 2017.

15:15-16:45 Session SC6: Decision Diagrams for Optimization
Location: Palmetto Room
15:15
Optimization Algorithms for the Bin Packing Problem with Minimum Color Fragmentation
PRESENTER: Saharnaz Mehrani

ABSTRACT. In the bin packing problem with minimum color fragmentation (BPPMCF), we are given a collection of items, each associated with a size and a color, and a bounded number of bins of fixed capacity, and our goal is to pack items with a common color in as few bins as possible. This problem has several real-world applications that we also explore in areas as diverse as group event seating and scheduling of operating rooms. We investigate theoretical aspects of the BPPMCF and present several exact optimization algorithms to solve the problem. Our key contributions include a new direct integer programming formulation, a branch-and-price algorithm, and alternative integer programming formulations based on two recursive decompositions that utilize decision diagrams. Using the results from an extensive computational evaluation on synthetic instances, we trained a decision tree algorithm that predicts which algorithm should be chosen to solve a given instance of the problem based on a collection of derived features. The structure of this decision tree allows us to extract data-driven insights about structural properties of the problem related to the weaknesses and strengths of the proposed algorithms. Our insights are also validated through experiments on the aforementioned applications on real-world data.

15:37
Data-Driven Optimization for Team Formation
PRESENTER: David Bergman

ABSTRACT. Allocating individuals to teams is a widely studied problem in management, both in the organizational-behavior and the operations-research literature. The former focuses on understanding team performance with little focus on forming teams, whereas the latter typically assumes team formation is based on pairwise utility that individuals receive for being matched with one another. This paper seeks to remedy the lack of cohesion in the literature by studying an end-to-end integrated decision-making framework for composing teams. Using data from a sample of teams competing in a longitudinal business simulation, we apply linear regression and neural network techniques to derive predictive models of team performance based on members' characteristics. We then study how to partition a collection of individuals into teams with the best expected performance based on the predictive models we train. We design various computational approaches that are specifically tailored to the predictive models developed in this paper. The optimization algorithms fall into the broader context of optimization with objective functions learned through predictive modeling, which can be applied in a myriad of contexts. We exhibit the efficiency of the proposed algorithms and the capacity of the framework to compose higher-functioning teams than those naturally occurring in practice.

15:59
On parametric decision diagrams
PRESENTER: Leonardo Lozano

ABSTRACT. We examine decision diagrams for which the arc cost and/or the space of solutions represented are a function of an external parameter. We explore exact and relaxed constructions for such decision diagrams and show how to optimize over the parametric diagrams via a labeling algorithm for constrained shortest path problems. We present three application problems: a scheduling problem with negotiable due dates, a robust traveling salesman problem with time windows, and a robust knapsack problem for which the uncertainty set depends on the knapsack decision variables. Computational experiments show that the use of parametric decision diagrams for these problems results in a considerable reduction in computational time when compared to a standard decomposition approach based on integer programs.

16:21
Last mile scheduling under uncertainty

ABSTRACT. Shared mobility is revolutionizing urban transportation and has sparked interest in optimizing the joint schedule of passengers in public transit and last-mile services. Scheduling algorithms must anticipate future requests and provision flexibility in order to be adopted in practice. In this work, we consider a two-stage stochastic programming formulation for scheduling a set of known passengers and sets of uncertain passengers that are realized from a finite set of scenarios. We present an optimization approach based on decision diagrams. We obtain, in minutes, schedules for a thousand passengers that are robust and optimized with respect to scenarios involving up to a hundred additional passengers.

15:15-16:45 Session SC7: MIP: valid inequalities, acceleration and application
Chair:
Location: Alexander Room
15:15
Scenario grouping and decomposition algorithms for chance-constrained programs
PRESENTER: Siqian Shen

ABSTRACT. A lower bound for a finite-scenario-based chance-constrained program is the quantile value corresponding to the sorted optimal objective values of scenario subproblems. This quantile bound can be improved by grouping subsets of scenarios at the expense of solving larger subproblems. The quality of the bound depends on how the scenarios are grouped. In this paper, we formulate a mixed-integer bilevel program that optimally groups scenarios to tighten the quantile bounds. For general chance-constrained programs, we propose a branch-and-cut algorithm to optimize the bilevel program, and for chance-constrained linear programs, a mixed-integer linear programming reformulation is derived. We also propose several heuristics for grouping similar or dissimilar scenarios. Our computational results show that optimal-grouping bounds are much tighter than heuristic bounds, resulting in smaller root-node gaps and better performance of scenario decomposition for solving chance-constrained 0-1 programs. Moreover, the optimal grouping bounds can be greatly strengthened using larger group sizes.
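
For intuition, with $N$ equiprobable scenarios, risk level $\epsilon$, and single-scenario groups (written here for a minimization objective $c^\top x$): if $v_{(1)} \le \dots \le v_{(N)}$ are the sorted optimal values of the scenario subproblems, any feasible $x$ must satisfy at least $\lceil (1-\epsilon) N \rceil$ scenarios, so
$$ c^\top x \;\ge\; v_{(\lceil (1-\epsilon) N \rceil)}. $$
Grouping scenarios replaces the $v_{(k)}$ with optimal values of larger grouped subproblems, which tightens this quantile bound, as described above.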

15:37
Exploiting Multiple Optimal Solutions in Independent Set
PRESENTER: Chris Muir

ABSTRACT. The presence of multiple (near) optimal solutions can slow solution times for integer programming problems. We will discuss this phenomenon with respect to independent set problems. We will show how this problem can be avoided by the addition of dominance-type constraints that help remove many of these (near) optimal solutions.

15:59
Learning-Enhanced Integer Programming: A Hybrid Approach for Routinely Solved Discrete Optimization

ABSTRACT. In many practical settings, discrete optimization problems are solved repeatedly, sometimes daily or even more frequently, with only relatively minor changes in input data. Examples include the Unit Commitment Problem, solved multiple times per day to schedule energy production, and the Vehicle Routing Problem, solved daily to construct optimal routes. In this talk, we explore the use of Machine Learning (ML) to accelerate the performance of Mixed-Integer Programming (MIP) solvers under these conditions. Based on statistical data collected during the solution of a representative set of training instances, we predict good initial feasible solutions, redundant constraints in the formulation, and other information that helps the MIP solver to process similar (yet unseen) instances faster. Computational experiments are presented for discrete optimization problems in different domains.
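
A schematic example of one such usage, seeding a solver with a predicted solution (the toy knapsack instance and the predicted vector x̂ are stand-ins, not from the talk):

    using JuMP, HiGHS

    # A toy knapsack standing in for a routinely re-solved instance.
    w, p, cap = [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], 10
    model = Model(HiGHS.Optimizer)
    @variable(model, x[1:5], Bin)
    @constraint(model, w' * x <= cap)
    @objective(model, Max, p' * x)

    # x̂ plays the role of a feasible solution predicted by an ML model
    # trained on similar past instances; the solver uses it as a MIP start.
    x̂ = [1, 1, 1, 0, 0]
    set_start_value.(x, x̂)
    optimize!(model)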

16:21
Cardinality Constrained Multilinear Sets
PRESENTER: Rui Chen

ABSTRACT. Motivated by a machine learning application, we study the problem of minimizing a multilinear function of binary variables with lower and upper bounds on the number of nonzero variables. We call the set of solutions of the standard linearization of this problem a cardinality constrained multilinear set. We give a complete polyhedral description of its convex hull when the multilinear terms in the problem have a nested structure. Some valid inequalities are generalized to the non-nested case.
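
For reference, the standard linearization mentioned above replaces each multilinear term $\prod_{i \in T} x_i$ with a variable $\delta_T$ linked to the binaries by
$$ \delta_T \le x_i \ \ \forall i \in T, \qquad \delta_T \ge \sum_{i \in T} x_i - |T| + 1, \qquad \delta_T \ge 0; $$
the set of $(x, \delta)$ satisfying these constraints together with the cardinality bounds $l \le \sum_i x_i \le u$ is the cardinality constrained multilinear set whose convex hull is characterized.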

17:00-18:30 Session SD1: Recent Advances in Multi-stage Stochastic Programming
17:00
Stochastic Dynamic Linear Programming and its Application

ABSTRACT. In this talk we present {\it stochastic dynamic linear programming} (SDLP), a sequential sampling algorithm for multistage stochastic linear programs with stagewise independent uncertainty. Algorithms designed to address multistage stochastic linear programming (MSLP) problems often rely upon scenario trees to represent the underlying stochastic process. When this process exhibits stagewise independence, sampling-based techniques, particularly the stochastic dual dynamic programming (SDDP) algorithm, have received wide acceptance. However, these sampling-based methods still operate with a deterministic representation of the problem which uses the so-called sample average approximation. In contrast, the sequential sampling nature of SDLP allows the decision process to be optimized while concurrently assimilating new uncertainty information through sampling. A principal component of SDLP is a piecewise-affine policy for identifying incumbent solutions used in the quadratic regularization terms at non-terminal stages. We will present the main results establishing the asymptotic convergence, with probability one, of the value function approximations generated by the algorithm and of the solution mapping to an optimal policy. We will also present computational experiments that compare the SDLP and SDDP algorithms on instances of a distributed storage control problem in a power system with renewable generation resources.

17:22
Adaptive Partition-based Stochastic Dual Dynamic Programming Algorithms For Multistage Stochastic Linear Programs
PRESENTER: Murwan Siddig

ABSTRACT. We extend the adaptive partition-based approach for solving two-stage stochastic programs with fixed recourse to the multistage setting. The proposed algorithm integrates the adaptive partition-based strategy with a popular approach for solving multistage stochastic programs, Stochastic Dual Dynamic Programming (SDDP), via different tree-traversal strategies in order to enhance its computational efficiency. Our numerical experiments on a hydro-thermal power generation planning problem show the effectiveness of the proposed approach.

17:44
A Data-Driven Approach for a Class of Stochastic Dynamic Optimization Problems

ABSTRACT. Dynamic stochastic optimization models provide a powerful tool to represent sequential decision-making processes. Typically, these models use statistical predictive methods to capture the structure of the underlying stochastic process without taking into consideration estimation errors and model misspecification. In this context, we propose a data-driven prescriptive analytics framework aiming to integrate the machine learning and dynamic optimization machinery in a consistent and efficient way to build a bridge from data to decisions. The proposed framework tackles a relevant class of dynamic decision problems comprising many important practical applications. The basic building blocks of our proposed framework are: (i) a Hidden Markov Model, as a predictive (machine learning) method to represent uncertainty; and (ii) a distributionally robust dynamic optimization model, as a prescriptive method that takes into account estimation errors associated with the predictive model and allows for control of the risk associated with decisions. Moreover, we present an evaluation framework to assess out-of-sample performance in rolling horizon schemes. A complete case study on dynamic asset allocation illustrates the proposed framework, showing superior out-of-sample performance against selected benchmarks. The numerical results show the practical importance and applicability of the proposed framework, since it extracts valuable information from data to obtain robustified decisions with an empirical certificate of out-of-sample performance.

18:06
Distributionally Robust Stochastic Dual Dynamic Programming
PRESENTER: Daniel Duque

ABSTRACT. We consider a multi-stage stochastic linear program that lends itself to solution by stochastic dual dynamic programming (SDDP). In this context, we consider a distributionally robust variant of the model with a finite number of realizations at each stage. Distributional robustness is with respect to the probability mass function governing these realizations. We describe a computationally tractable variant of SDDP to handle this model using the Wasserstein distance to characterize distributional uncertainty. The full paper is available at http://www.optimization-online.org/DB_FILE/2019/12/7539.pdf
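
Schematically, at each stage the expected cost-to-go in the SDDP recursion is replaced by a worst case over an ambiguity set of probability mass functions (a generic form; see the linked paper for the precise model):
$$ \sup_{p \in \mathcal{P}} \ \sum_{k=1}^{K} p_k \, Q(x, \xi^k), \qquad \mathcal{P} = \Big\{ p \ge 0 : \sum_{k=1}^{K} p_k = 1, \ W(p, \hat{p}) \le \rho \Big\}, $$
where $\hat{p}$ is the nominal pmf over the $K$ stage realizations, $W$ is the Wasserstein distance, and $\rho$ controls the degree of robustness.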

17:00-18:30 Session SD2: Global Optimization II
Chair:
Location: Gold Ballroom
17:00
Spectral relaxations and branching for MIQPs

ABSTRACT. Spectral relaxations are derived by convexifying the quadratic term in the objective. In this work, we propose two new relaxations in which the convexification parameter is derived using the objective and the equality constraints of the problem. Both relaxations are convex quadratic programs and remain in the original space of the problem. Hence, each relaxation retains the sparsity properties of the original formulation and is computationally cheap enough to be incorporated in a branch-and-bound solver. We also show that spectral relaxations are semidefinite programming (SDP) relaxations. Finally, we present a spectral branching strategy for nonconvex binary quadratic programs that is in the same spirit as strong branching but is computationally cheap. The proposed algorithm has been implemented in BARON and outperforms state-of-the-art solvers.

17:22
Aspects of maximum-entropy sampling

ABSTRACT. Maximum-entropy sampling is an NP-hard nonlinear combinatorial-optimization problem arising in spatial statistics. Exact solution methods are based on branch-and-bound, using sophisticated methods for calculating upper bounds. I will survey some of these methods, and discuss two new contributions to the area. One is a 'mixing' method for combining upper bounds that are based on convex relaxation. The other is a new R package for instantiating instances of the problem from raw environmental-monitoring data.

17:44
Volumetric Analysis of Perspective Reformulations
PRESENTER: Daphne Skipper

ABSTRACT. We study MINLO (mixed-integer nonlinear optimization) formulations of the disjunction $x\in\{0\}\cup[l,u]$, where $z$ is a binary indicator of $x\in[l,u]$, and $y$ ``captures'' $f(x)$, which is convex on $[l,u]$. This model is useful when activities have operating ranges, we pay a fixed cost for carrying out each activity, and costs on the levels of activities are convex.

Using volume as a measure to compare convex bodies, we investigate a variety of continuous relaxations of this model, one of which is the convex-hull, achieved via the ``perspective reformulation'' inequality $y \geq zf(x/z)$. We compare this to various weaker relaxations, studying when they may be considered as viable alternatives. In the important special case when $f(x) := x^p$, for $p>1$, relaxations utilizing the inequality $yz^q \geq x^p$, for $q \in [0,p-1]$, are higher-dimensional power-cone representable, and hence tractable in theory. One well-known concrete application (with $f(x) := x^2$) is mean-variance optimization (in the style of Markowitz), and we carry out some experiments to illustrate our theory on this application.
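
A concrete instance of the tractability claim: with $f(x) = x^2$ (so $p = 2$ and $q = 1$), the inequality $y z \ge x^2$ with $y, z \ge 0$ is a rotated second-order cone constraint, representable as
$$ \left\| \begin{pmatrix} 2x \\ y - z \end{pmatrix} \right\|_2 \le y + z, $$
which standard conic solvers handle directly.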

18:06
Decomposing Optimization-Based Bounds Tightening Problems Via Graph Partitioning
PRESENTER: Bernard Knueven

ABSTRACT. Bounds tightening (a.k.a., domain reduction) is a critical refinement technique used in global optimization algorithms for mixed-integer nonlinear programming problems. Bounds tightening can strengthen convex relaxations and reduce the size of branch and bounds trees. An effective but expensive domain reduction technique is optimization-based bounds tightening (OBBT) where, typically, each variable is minimized and maximized subject to a convex relaxation of the original problem. We present a scalable bounds tightening algorithm that decomposes the majority of the bounds tightening problems into much smaller problems by graph partitioning. Our numerical results on a set of optimal power flow test problems and MINLPLib demonstrate that the proposed algorithm can be nearly as effective as traditional OBBT in terms of domain reduction. Furthermore, the algorithm is significantly more computationally efficient and scales much better with problem size.

17:00-18:30 Session SD3: In Memory of Dave Shanno II
Location: Card Room
17:00
Remembering Dave Shanno: Twenty Years of Collaboration

ABSTRACT. Dave Shanno and I became friends when we first met in the late 1980's. In the late 1990's, he and I and my student Hande Benson started meeting on a regular basis to discuss ongoing advances in generalizing interior point methods to solve nonlinear optimization problems. In the talk I will describe some of those collaborative engagements both from a mathematical and a social perspective.

17:22
In Memory of Dave Shanno: the History of OB1

ABSTRACT. In 1989, I began collaborating with Dave Shanno on primal-dual infeasible interior point methods for linear programming, which led to Dave, Roy Marsten, and me being awarded the Beale-Orchard-Hays prize from the Mathematical Programming Society in 1991. In this talk, I will review the development of OB1 and primal-dual infeasible interior point methods, which have become foundational in all commercial linear programming packages.

17:44
In Memory of Dave Shanno: making an interior-point method for NLP converge

ABSTRACT. In this talk we describe a few challenges in the theoretical convergence analysis of an interior point method for nonlinear programming and discuss some ways of dealing with them. The talk presents the results of joint work with Hande Benson, David Shanno, and Robert Vanderbei performed in the early 2000s.

18:06
Reflections on Dave Shanno
PRESENTER: Ariela Sofer

ABSTRACT. Throughout his career, Dave Shanno made significant contributions to both linear and nonlinear optimization. This presentation is a tribute to Dave and his life's work. The session organizers will also read remarks from Don Goldfarb.

17:00-18:30 Session SD4: Recent Advances in First-Order Methods
17:00
Radial Duality

ABSTRACT. Recently Renegar introduced a radial transformation which turns generic convex optimization problems into unconstrained, uniformly Lipschitz continuous ones. We generalize this transformation to avoid any reliance on convexity and thus apply to a variety of nonconvex functions. Our alternative construction shows this transformation is dual (i.e., self-inverse), which gives structural insights relating primal and radially dual problems. This radial duality provides a strong foundation for analyzing radial variations of the subgradient and accelerated methods for both convex and nonconvex problems that do not require Lipschitz continuity or orthogonal projection.

17:22
Coordinate Descent Without Coordinates: Tangent Subspace Descent on Riemannian Manifolds
PRESENTER: David Gutman

ABSTRACT. Coordinate descent is a well-studied variant of the gradient descent algorithm which offers significant per-iteration computational savings for large-scale problems defined on Euclidean domains. On the other hand, optimization on a general manifold domain can capture a variety of constraints encountered in many problems that are difficult to model with the usual Euclidean geometry. In this talk, we present an extension of coordinate descent to general manifold domains and provide a convergence rate analysis for geodesically convex and non-convex smooth objective functions. Our key insight is to draw an analogy between coordinates in Euclidean space and tangent subspaces of a manifold; hence our method is called tangent subspace descent. We study the effect that the non-trivial curvature of a manifold has on the convergence rate and on the method's computational properties. We find that on a product manifold, we can achieve per-iteration computational savings analogous to the Euclidean setting. On such product manifolds, we further extend our method to handle non-smooth composite objectives and provide the corresponding convergence rates.

17:44
Complexity of proximal augmented Lagrangian for nonconvex optimization with nonlinear equality constraints
PRESENTER: Yue Xie

ABSTRACT. We analyze the worst-case complexity of a proximal augmented Lagrangian (proximal AL) framework for nonconvex optimization with nonlinear equality constraints. When a first-order (second-order) optimal point is obtained in the subproblem, an $\epsilon$ first-order (second-order) optimal point for the original problem can be guaranteed within $O(1/\epsilon^{2-\eta})$ outer iterations (where $\eta$ is a user-defined parameter with $\eta \in [0, 2]$ for the first-order result and $\eta \in [1, 2]$ for the second-order result) when the proximal term coefficient $\beta$ and penalty parameter $\rho$ satisfy $\beta = O(\epsilon^\eta)$ and $\rho = O(1/\epsilon^\eta)$, respectively. Further, when the subproblems are solved inexactly, the same order of complexity can be recovered by imposing certain verifiable conditions on the error sequence. We also investigate the total iteration complexity and operation complexity when a Newton-conjugate-gradient algorithm is used to solve the subproblems.
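
For orientation, a generic proximal AL subproblem consistent with these parameters (not necessarily the paper's exact form) is
$$ x_{k+1} \in \arg\min_{x} \ f(x) + \lambda_k^\top c(x) + \frac{\rho}{2} \| c(x) \|^2 + \frac{\beta}{2} \| x - x_k \|^2, $$
followed by the multiplier update $\lambda_{k+1} = \lambda_k + \rho \, c(x_{k+1})$, where $c(x) = 0$ collects the nonlinear equality constraints.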

18:06
A Dual Approach for Optimal Algorithms in Distributed Optimization over Networks
PRESENTER: Cesar A. Uribe

ABSTRACT. We study the optimal convergence rates for distributed convex optimization problems over networks, where the objective is to minimize a sum of local functions of the nodes in the network. We provide optimal complexity bounds for four different cases, namely: the case when each function is strongly convex and smooth, the cases when it is either strongly convex or smooth, and the case when it is convex but neither strongly convex nor smooth. Our approach is based on the dual of an appropriately formulated primal problem, which includes the underlying static graph that models the communication restrictions. Our results show distributed algorithms that achieve the same optimal rates as their centralized counterparts (up to constant and logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix that captures the local communications of the nodes in the network. Application examples are shown for the problem of distributed computation of Wasserstein barycenters. Extensions to time-varying graphs will be discussed as well.

17:00-18:30 Session SD5: Social Good II
Location: Furman Room
17:00
Integer Programming Methods to Alleviate Human Trafficking in Maritime Settings
PRESENTER: Robert Curry

ABSTRACT. Human trafficking (HT) involves the use of coercion, fraud, or force to obtain some form of labor or commercial sex act. One type of HT involves the forceful transportation of human beings across international borders. In this work, we study the illegal trafficking of people across borders in maritime settings. Identifying trafficking networks in this setting is complicated, because differentiating between individuals who are willfully migrating and those being trafficked is difficult. Trafficking victims are hard to identify due to language barriers and coercion from traffickers. Therefore, we focus on minimizing the illegal transportation of individuals into the United States across international waters by helping to make decisions about how to patrol highly trafficked areas. To do so, we present an integer linear programming model to determine the areas and schedules for patrol ships used to identify and interdict trafficking vessels. We present a small case study to display the efficacy of our methods. Due to the size of the ocean, this model can be quite large and difficult to solve even for some small networks. Thus, we conclude by discussing techniques to improve the solvability of our model in order to extend it to larger networks.

17:22
Impact of Bias on Hiring Decisions
PRESENTER: Jad Salem

ABSTRACT. Optimization and machine learning algorithms often work with real-world data that has been generated through complex socio-economic and behavioral processes. This data, however, is noisy, at times unbalanced, and naturally encodes a variety of unquantifiable systemic biases. In this work, we focus on automated hiring decisions, where evaluations of candidates often incorporate implicit bias dependent on their demographic groups or the socio-economic opportunities available to them in the past. Motivated by this, we present the first mathematical analysis of the impact of bias in evaluations on the performance of online secretary algorithms. We assume that the candidates belong to disjoint groups, for instance defined by race, gender, nationality, age, etc. Each candidate has a “true” nonnegative utility $Z \in \mathbb{R}$ and an “observed” utility $Z/\beta$, where $\beta$ is unknown and group-dependent. We show that in the biased setting, group-agnostic algorithms for the online secretary problem are suboptimal, often causing starvation of jobs for groups with $\beta > 1$. We propose multiple group-aware algorithms that achieve demographic parity (i.e., an equal proportion of people hired from each group) by design, while obtaining a competitive ratio for maximizing the true utility of hired candidates in a variety of settings.

17:44
Optimal Active Preference Elicitation via Adjustable Robust Optimization
PRESENTER: Phebe Vayanos

ABSTRACT. We consider the problem faced by a recommender system which seeks to offer a user with unknown preferences their favorite item among a potentially uncountable or countably infinite collection. Before making a recommendation, the system has the opportunity to elicit the user's preferences by making a moderate number of queries. Each query corresponds to a pairwise comparison, in the spirit of choice-based conjoint analysis. We take the point of view of risk- and regret-averse recommender systems which only possess limited, set-based information on the user utility function. We investigate two settings: a) an offline elicitation setting, where all queries are made at once, and b) an online elicitation setting, where queries are selected sequentially over time in an online fashion. We propose exact robust optimization formulations of these problems which integrate the preference elicitation and recommendation phases and study the complexity of these problems. For the offline case, where the active preference elicitation problem takes the form of a two-stage robust optimization problem with decision-dependent information discovery, we provide an enumeration-based algorithm and also an equivalent reformulation in the form of a mixed-binary linear program which we solve via column-and-constraint generation. For the online setting, where the active preference learning problem takes the form of a multi-stage robust optimization problem with decision-dependent information discovery, we propose a provably optimal rolling horizon algorithm. We evaluate the performance of our approach on both synthetic and real data from the Los Angeles Homeless Services Authority. We elicit the preferences of policy-makers in terms of characteristics of housing allocation policies (measures of fairness, efficiency, and interpretability) to better match individuals experiencing homelessness to scarce housing resources. Our framework is shown to outperform the state-of-the-art techniques from the literature.

17:00-18:30 Session SD6: Algorithms for Graph Segmentation and Applications
Location: Palmetto Room
17:00
On Polyhedra of Two Influence Maximization Problems in Social Networks
PRESENTER: Cheng-Lung Chen

ABSTRACT. The influence maximization problem in social networks has attracted increasing attention recently because of the rise of online social media platforms. Various studies have focused on improving approximation algorithms that find optimal seed locations so as to maximize the total number of activated nodes. In this study, we investigate the polyhedral structure of the maximum active set formulation and its extension. We derive valid inequalities for both single and dual bound thresholds on nodes and demonstrate the validity and strength of the proposed inequalities.

17:22
A Branch-and-Cut Approach For Simple Graph Partitioning on Sparse Graphs
PRESENTER: Jose L. Walteros

ABSTRACT. The problem of simple graph partitioning is to partition the vertices of an edge-weighted graph into mutually node-disjoint subgraphs, such that each subgraph contains no more than r nodes and the total weight of the edges connecting these subgraphs is minimized. We will present two dynamic programs for solving this problem on trees and series-parallel graphs. Furthermore, we will show how to use these algorithms as separation subroutines within a branch-and-cut framework to tackle the problem on general graphs.

17:44
Centralities for Networks with Consumable Resources

ABSTRACT. Identification of influential nodes is an important step in understanding and controlling the dynamics of information, traffic, and spreading processes in networks. As a result, a number of centrality measures have been proposed and used across different application domains. At the heart of many of these measures lies an assumption describing the manner in which traffic (of information, social actors, particles, etc.) flows through the network. For example, some measures only count the shortest paths while others consider random walks. This paper considers a spreading process in which a resource necessary for transit is partially consumed along the way while being refilled at special nodes on the network. Examples include fuel consumption of vehicles together with refueling stations, information loss during dissemination with error-correcting nodes, and consumption of ammunition of military troops while moving. We propose generalizations of the well-known measures of betweenness, random walk betweenness, and Katz centralities to take such a spreading process with consumable resources into account. To validate the results, experiments on real-world networks are carried out with simulations based on well-known models such as Susceptible-Infected-Recovered and congestion with respect to particle hopping from vehicular flow theory. The simulation-based models are shown to be highly correlated with the proposed centrality measures.

17:00-18:30 Session SD7: Applications of discrete optimization in network, constraint, and stochastic programs
Location: Alexander Room
17:00
Achieving Consistency with Cutting Planes
PRESENTER: Danial Davarnia

ABSTRACT. Cutting planes accelerate branch-and-bound search primarily by cutting off fractional solutions of the linear programming (LP) relaxation, resulting in tighter bounds for pruning the search tree. Yet cutting planes can also reduce backtracking by excluding inconsistent partial assignments that occur in the course of branching. We define concepts of consistency that are useful in a branch-and-bound context through the lens of integer programming. We present a theoretical framework for studying these concepts, their connection with the convex hull and cutting planes, and their power to exclude infeasible partial assignments.

17:22
Network Models for Multiobjective Discrete Optimization
PRESENTER: Carlos Cardonha

ABSTRACT. This work provides a novel framework for solving multiobjective discrete optimization problems with an arbitrary number of objectives. Our framework formulates these problems as network models, in that enumerating the Pareto frontier amounts to solving a multicriteria shortest path problem in an auxiliary network. We design techniques for exploiting the network model in order to accelerate the identification of the Pareto frontier, most notably a number of operations to simplify the network by removing nodes and arcs while preserving the set of nondominated solutions. We show that the proposed framework yields orders-of-magnitude performance improvements over existing state-of-the-art algorithms on five problem classes containing both linear and nonlinear objective functions.

17:44
A stochastic mixed integer linear program for agricultural land use optimization
PRESENTER: Sarah Ryan

ABSTRACT. Optimizing land use decisions over a watershed for agricultural profitability, while limiting the nutrient pollution in waterways, involves a long study horizon and nonlinear relationships among decisions, precipitation levels, yields, and nutrient losses. We employ binary decision variables to linearize the relationships and exploit the cyclical nature of solutions under Markovian precipitation levels to justify substituting a short time horizon. Solutions to the resulting approximating stochastic mixed integer linear program reveal appropriate compensation to landowners for taking socially optimal land use decisions.