
08:30-09:00 Welcome coffee
09:10-10:00 Session 9: Plenary 3: Laura Palagi
Location: Auditorium
Ease-controlled Random Reshuffling Gradient Algorithms for nonconvex finite sum optimization

ABSTRACT. We consider minimizing the sum of a large number of smooth and non-convex functions, a typical problem encountered when training deep neural networks on huge datasets. We propose ease-controlled modifications of the traditional online gradient schemes, either incremental gradient (IG) or random reshuffling (RR) methods, which converge to stationary points under weak and basic assumptions. Indeed, besides the compactness of level sets, we require only the Lipschitz continuity of the gradients of the component functions. The algorithmic schemes control the IG/RR iteration by using a watchdog rule and a derivative-free line search that activates only sporadically to guarantee convergence. The schemes also allow controlling the update of the learning rate used in the main IG/RR iteration, avoiding preset rules and thus overcoming another tricky aspect of implementing online methods. We also propose a variant that further reduces the need for objective function evaluations. We performed an extensive computational test using different deep network architectures and a benchmark of large datasets of varying sizes, and we compared performance to state-of-the-art optimization methods for ML. The tests show that the schemes are efficient, in particular when dealing with ultra-deep architectures.

Joint paper with Corrado Coppola, Giampaolo Liuzzi and Ruggiero Seccia.
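The random-reshuffling backbone that the ease-controlled schemes build on can be sketched in a few lines. This is a minimal illustration only (plain RR with a fixed learning rate on a toy least-squares sum), not the paper's controlled scheme with watchdog rule and line search:

```python
import random

def rr_epoch(w, component_grads, lr, rng):
    """One random-reshuffling (RR) epoch: visit every component gradient
    exactly once, in a freshly shuffled order."""
    order = list(range(len(component_grads)))
    rng.shuffle(order)
    for i in order:
        w = w - lr * component_grads[i](w)
    return w

# Toy finite sum: f(w) = sum_i (w - a_i)^2 / 2, with component gradients (w - a_i).
a = [1.0, 2.0, 3.0]
grads = [lambda w, ai=ai: w - ai for ai in a]
w, rng = 0.0, random.Random(0)
for _ in range(200):
    w = rr_epoch(w, grads, lr=0.1, rng=rng)
# w settles near the minimizer (the mean of a, here 2.0), up to a bias of order lr
```

With a fixed learning rate the iterates only reach a neighborhood of the stationary point; the sporadic line search in the talk's schemes is precisely what upgrades this to provable convergence.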

10:00-10:30 Coffee Break
10:30-12:00 Session 10A: Invited session: Mean field games and control, part 1, organized by Guilherme Mazanti and Laurent Pfeiffer
Location: Auditorium
Mean-field games among teams
PRESENTER: Aditya Mahajan

ABSTRACT. In this paper, we present a model of a game among teams. Each team consists of a homogeneous population of agents. Agents within a team are cooperative while the teams compete with other teams. The dynamics and the costs are coupled through the empirical distribution (or the mean field) of the state of agents in each team. This mean-field is assumed to be observed by all agents. Agents have asymmetric information (also called a non-classical information structure). We propose a mean-field based refinement of the Team-Nash equilibrium of the game, which we call mean-field Markov perfect equilibrium (MF-MPE). We identify a dynamic programming decomposition to characterize MF-MPE. We then consider the case where each team has a large number of players and present a mean-field approximation which approximates the game among large-population teams as a game among infinite-population teams. We show that MF-MPE of the game among teams of infinite population is easier to compute and is an $\varepsilon$-approximate MF-MPE of the game among teams of finite population.

Existence and Lipschitz regularity of the trajectories minimizing the total variation in a congested setting
PRESENTER: Annette Dumas

ABSTRACT. The problem I will present is motivated by the study of a Mean Field Game model, whose theory was introduced simultaneously by Lasry and Lions and by Caines, Huang and Malhamé in 2006. The model consists of studying a population in a city where each agent jumps to move from one place to another. Each inhabitant minimizes a cost composed of the number of jumps and an increasing function of the density of the population. The solution to this problem is a probability measure on the trajectories which is a Nash equilibrium.

The probability measure Q on the trajectories can be seen as a trajectory of the density of the population, which leads us to a variational problem that depends on the $L^1$-norm of the speed of the density (linked to the number of jumps) and on an additional cost associated with the increasing function of the density. Density constraints are also added to the problem. We will see that the solution exists, is unique and is Lipschitz in time, despite the discontinuous trajectories taken by each agent. With additional hypotheses on the data, boundedness or continuity in space can be obtained with Dirichlet conditions in time.

The shape of the solutions is given by the Euler-Lagrange equations, which show that in space the solution is either constant or follows the critical points of the cost. Numerical simulations are carried out on a simple example using the fast dual proximal gradient method of Beck, and they validate the theoretical framework.

A Mean Field Game Model for Renewable Investment under Long-Term Uncertainty and Risk Aversion
PRESENTER: Celia Escribe

ABSTRACT. We consider a stylized model for investment into renewable power plants under long-term uncertainty. We model risk-averse agents facing heterogeneous weather conditions and a common noise, including uncertainty on demand trends, future fuel prices and the average national weather conditions. The objective of each agent is to maximize multistage profit by controlling investment in discrete time steps. We analyze this model in a noncooperative game setting with N players, where the interaction among agents occurs through the spot price mechanism. Our model extends to a mean field game with common noise when the number of agents is infinite. We prove that the N-player game admits a Nash equilibrium. Moreover, we prove that, under proper assumptions, any sequence of Nash equilibria of the N-player game converges to the unique solution of the mean field game. Finally, our numerical experiments highlight the impact of the risk aversion parameter and emphasize the difference between our model, which captures heterogeneity, and representative agent models.

10:30-12:30 Session 10B: Stochastic optimization
Location: A1.140
An entropy penalized approach for stochastic optimal control problems
PRESENTER: Thibaut Bourdais

ABSTRACT. We propose an alternative technique to dynamic programming for solving stochastic control problems. We consider a weak formulation that is written as an optimization (minimization) problem on the space of probabilities. We then propose a regularized version of this problem obtained by splitting the minimization variables and penalizing the entropy between the two probabilities to be optimized. We show that the regularized problem provides a good approximation of the original problem when the weight of the entropy regularization term is large enough. Moreover, the regularized problem has the advantage of giving rise to optimization problems that are easy to solve in each of the two optimization variables when the other is fixed. We take advantage of this property to propose an alternating optimization algorithm whose convergence to the infimum of the regularized problem is shown. The relevance of this approach is illustrated by solving a high-dimensional stochastic control problem aimed at controlling consumption in electrical systems.
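The split-and-penalize idea can be illustrated on a toy finite problem. This is only a sketch of the mechanism (alternating closed-form updates with an entropy penalty on a finite simplex), not the authors' algorithm for stochastic control:

```python
import math

def alternating_entropy_min(c, eps, iters):
    """Alternately minimize J(p, q) = <p, c> + eps * KL(p || q) in p and q:
    the p-step is a softmax tilt of q, the q-step simply sets q = p."""
    n = len(c)
    q = [1.0 / n] * n
    for _ in range(iters):
        p = [qi * math.exp(-ci / eps) for qi, ci in zip(q, c)]
        s = sum(p)
        p = [pi / s for pi in p]  # normalize the tilted distribution
        q = p                     # KL(p || q) is minimized at q = p
    return p

costs = [3.0, 1.0, 2.0]
p = alternating_entropy_min(costs, eps=1.0, iters=50)
# the iterates concentrate on the index of the smallest cost (index 1)
```

Each partial minimization is cheap and explicit, which is exactly the property the abstract exploits to build an alternating optimization algorithm.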

A control problem with random state constraints in probabilistic and almost-sure form
PRESENTER: Rene Henrion

ABSTRACT. We consider a simple PDE-constrained optimization problem under uncertainty, governed by Poisson's equation with Dirichlet boundary conditions. The right-hand side is defined by a control plus a random source term. The resulting random states are supposed to stay within certain limits (uniformly over the domain) with a given high probability. We provide an exact formula for the Clarke subdifferential of the associated probability function, which assigns to each control the probability of satisfying the state constraints. This formula, in particular its reduction in the differentiable case, can be used both for algorithmic purposes and for the derivation of necessary and sufficient optimality conditions. The latter require some additional results on convexity. Numerical results in one and two dimensions are illustrated. Special emphasis is laid on the consideration of almost-sure constraints (the extreme situation where the probability equals one).

Improving stochastic control by ignoring randomness
PRESENTER: Andy Philpott

ABSTRACT. It is widely regarded as a cardinal sin in optimization under uncertainty to replace random quantities by their expectations before applying some solution technique. In some settings taking expectations is justified through a certainty equivalence principle, but it is easy to show that this never makes solutions strictly better than those obtained by solving the stochastic optimization problem. However, in nearly every practical setting, a probability distribution must be estimated from data, assumed to be a sample drawn from the distribution. Here, optimizing with an approximate distribution based on the sample can lead to strictly worse outcomes than using a model based only on the average of the sample values. We describe this phenomenon and illustrate it using a simple inventory control example.


Duality of upper bounds in stochastic dynamic programming
PRESENTER: Vincent Leclere

ABSTRACT. For multistage stochastic programming problems with stagewise independent uncertainty, dynamic programming algorithms compute approximations of the value functions at each stage. The Stochastic Dual Dynamic Programming (SDDP) algorithm provides polyhedral lower bounds, in the spirit of the Nested Benders algorithm. On the other hand, non-statistical upper bounds took a longer time to appear. One approach uses the primal dynamic programming recursion to build inner approximations of the value functions, while another builds lower approximations of the conjugates of the value functions. Unfortunately, the dynamic programming recursion for the conjugate value functions does not decompose over uncertainties, which suggests a Lagrangian relaxation. We prove that this Lagrangian relaxation corresponds exactly to the primal upper bounds for a natural choice of multipliers.

10:30-12:30 Session 10C: Machine learning-based methods in combinatorial optimization
Location: A1.116
Decision-Aware Learning with REINFORCE enables new Architectures and Applications

ABSTRACT. Using Combinatorial Optimization (CO) layers in Machine Learning (ML) pipelines is a recent area of interest (Kotary et al. 2021). These new hybrid methods provide fast differentiable heuristics for difficult CO problems. For instance, they can be used to tackle hard stochastic transportation problems, such as the stochastic vehicle scheduling or dynamic vehicle routing problems.

However, current state-of-the-art methods face several learning limitations: most of them only support CO layers with a linear objective, or need (near-)optimal solutions to imitate in a supervised fashion. This greatly limits the possible applications and pipeline architectures for these methods. We introduce what we call perturbed oracles, which can transform any CO oracle f into a distribution on the feasible solution space. This is done by perturbing the input θ of f using a distribution p(η|θ). It then turns the learning problem into a moment problem with non-zero gradients. Meaningful gradients can be computed using the log-derivative/REINFORCE trick, a well-known technique in ML and Reinforcement Learning.

This flexible approach unlocks new applications and architectures, such as learning constraints for the two-stage knapsack, and learning to sparsify graphs to scale the stochastic vehicle scheduling problem on large instances. Our numerical experiments show the efficiency of the approach.
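The log-derivative trick invoked above is standard; here is a minimal sketch, with a trivial argmax "oracle" standing in for a real CO solver (all names and numbers are illustrative, not the authors' pipeline):

```python
import random

def reinforce_grad(theta, reward, oracle, sigma, n_samples, rng):
    """Score-function (REINFORCE) estimate of d/dtheta E_eta[reward(oracle(theta + eta))]
    for Gaussian perturbations eta ~ N(0, sigma^2 I): grad = E[reward * eta / sigma^2]."""
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_samples):
        eta = [rng.gauss(0.0, sigma) for _ in range(dim)]
        sol = oracle([t + e for t, e in zip(theta, eta)])
        r = reward(sol)
        for j in range(dim):
            grad[j] += r * eta[j] / (sigma ** 2 * n_samples)
    return grad

# Toy "CO oracle": pick the argmax coordinate; the true utilities reward index 1.
oracle = lambda scores: max(range(len(scores)), key=lambda j: scores[j])
reward = lambda sol: [0.0, 1.0][sol]
g = reinforce_grad([0.0, 0.0], reward, oracle, sigma=1.0, n_samples=2000,
                   rng=random.Random(0))
# the estimated gradient pushes theta[1] up relative to theta[0]
```

The oracle is only ever called as a black box, which is what makes the approach applicable to arbitrary CO layers rather than linear-objective ones only.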

Hyperbolic deep reinforcement learning for repeated exact combinatorial optimization
PRESENTER: Paul Strang

ABSTRACT. Mixed-integer linear programs, or MILPs, are NP-hard combinatorial optimization problems. Mixed-integer programming solvers developed over the last decades have relied on the Branch and Bound (B&B) algorithm to efficiently explore the space of solutions while guaranteeing the optimality of the returned solution. These solvers are based on complex heuristics fine-tuned by experts on large MIP datasets to obtain the best average performance. In the context of real-world applications, in which similar instances with slightly varying inputs are solved on a regular basis, there is a huge incentive to reduce the solving time by learning efficient tailor-made heuristics. Directly extending the work of Marc Etheve, this thesis proposes to discover new B&B heuristics outperforming existing solvers using reinforcement learning (RL) algorithms.

A Machine Learning based Branch-Cut-and-Benders for Dock Assignment and Truck Scheduling Problem in Cross-Docks

ABSTRACT. In this work, we study the dock-door assignment and truck scheduling problem in cross-docks with an intraday planning horizon. This timeframe allows us to view the problem as a repeating operation with a frequency of 24 hours. In practice, this repeating process follows a certain distribution, which is largely sustained even if we extend the horizon. While several modeling approaches and efficient solution algorithms have been proposed for various problem variations, the use of decomposition techniques in exact mathematical programming methods has been the most effective. Surprisingly, none of these techniques has taken advantage of the repeating patterns inherent in the problem. We start with a recently proposed compact model that is well designed and can be exploited in a primal (Benders) decomposition technique, although it cannot be directly used to solve a practical-sized problem. We show that its modeling deficiencies can be fixed, and we propose a Benders decomposition framework together with several Alternative Objective Functions (AOFs) to generate customized Benders cuts, along with other valid inequalities that can be identified and separated. A classifier is trained to identify the most efficient AOF to use at different stages of the Benders iterations, helping avoid saturation of the master problem with dominated Benders cuts. Our extensive computational experiments confirm a significant performance improvement over the comparable decomposition framework.

Neural Benders Decomposition for Mixed Integer Programming
PRESENTER: Shahin Gelareh

ABSTRACT. In this study, we propose an imitation learning framework designed to enhance the Benders decomposition method. Our primary focus is addressing degeneracy in subproblems with multiple dual optima, among which the Magnanti-Wong technique identifies the non-dominated solution. We develop two policies. In the first policy, we replicate the Magnanti-Wong method and learn from each iteration. In the second policy, our objective is to determine a trajectory that expedites the attainment of the final subproblem dual solution. We train and assess these two policies through extensive computational experiments on two real cases, confirming that such learned policies significantly enhance the efficiency of the decomposition process.

10:30-12:30 Session 10D: Invited session: Nonlinear Algebra and its Application, organized by Matías Bender and Elias Tsigaridas
Location: A1.133
Topology of spatiotemporal trajectories

ABSTRACT. In recent years, multiparameter persistent homology (MPH) has been developed to address the limitations of traditional single-parameter persistence. While much progress has been made on the theoretical aspects of MPH, practical applications have mostly been limited to the special case of two parameters. Building on recent developments in the computation of MPH in more than two parameters, we propose and implement a new Gröbner basis algorithm for the computation of multiparameter persistence landscapes in this setting. We apply this work to compute landscapes of three-parameter persistence modules built on spatiotemporal data arising in biology.

Semidefinite Games

ABSTRACT. We introduce and study the class of semidefinite games, which generalizes bimatrix games and finite N-person games, by replacing the simplex of the mixed strategies for each player by a slice of the positive semidefinite cone in the space of real symmetric matrices.

For semidefinite two-player zero-sum games, we show that the optimal strategies can be computed by semidefinite programming. Furthermore, we show that two-player semidefinite zero-sum games are almost equivalent to semidefinite programming, generalizing Dantzig's result on the almost equivalence of bimatrix games and linear programming.

For general two-player semidefinite games, we prove a spectrahedral characterization of the Nash equilibria. Moreover, we give constructions of semidefinite games with many Nash equilibria. In particular, we give a construction of semidefinite games whose number of connected components of Nash equilibria exceeds the long-standing best known construction for many Nash equilibria in bimatrix games, presented by von Stengel in 1999.

Minimal Sparsity for Scalable Moment-SOS Relaxations of the AC-OPF Problem
PRESENTER: Adrien Le Franc

ABSTRACT. AC-OPF (Alternating Current Optimal Power Flow) aims at minimizing the operating costs of an AC power grid. It is well known to be a difficult optimization problem in general, as it reformulates as a nonconvex QCQP (Quadratically Constrained Quadratic Program). The moment-SOS (Sums-Of-Squares) hierarchy has proved relevant for solving AC-OPF instances to global optimality. However, obtaining the convergence of the hierarchy may require going beyond the first step of the involved sequence of SDP (Semidefinite Programming) relaxations (e.g. up to the third step in some reference test cases), and thus solving semidefinite programs whose size grows drastically at each step of the hierarchy. Thus, the application of such relaxations to large-scale AC-OPF problems (with thousands of variables) remains a challenging computing task.

Large polynomial optimization problems can be tackled efficiently if they are sufficiently sparse. In this talk, we present a new sparsity pattern, which we call minimal sparsity, inspired by the specific structure of the AC-OPF problem. We show that minimal sparsity enables the computation of second-order moment-SOS relaxations on large-scale AC-OPF instances with far fewer computing resources (i.e. RAM and time) than the standard correlative sparsity pattern. Experimentally, we observe that it also provides tight lower bounds to certify the global optimality of AC-OPF solutions.

10:30-12:00 Session 10E: Invited session: Robust Optimization and Applications, organized by Julien Grand-Clément
Location: Amphi II
Differential Privacy via Distributionally Robust Optimization

ABSTRACT. In recent years, differential privacy has emerged as the de facto standard for sharing statistics of datasets while limiting the disclosure of private information about the involved individuals. This is achieved by randomly perturbing the statistics to be published, which in turn leads to a privacy-accuracy trade-off: larger perturbations provide stronger privacy guarantees, but they result in less accurate statistics that offer lower utility to the recipients. Of particular interest are therefore optimal mechanisms that provide the highest accuracy for a pre-selected level of privacy. To date, work in this area has focused on specifying families of perturbations a priori and subsequently proving their asymptotic and/or best-in-class optimality.

In this work, we develop a class of mechanisms that enjoy non-asymptotic and unconditional optimality guarantees. To this end, we formulate the mechanism design problem as an infinite-dimensional distributionally robust optimization problem. We show that the problem affords a strong dual, and we exploit this duality to develop converging hierarchies of finite-dimensional upper and lower bounding problems. Our upper (primal) bounds correspond to implementable perturbations whose suboptimality can be bounded by our lower (dual) bounds. Both bounding problems can be solved within seconds via cutting plane techniques that exploit the inherent problem structure. Our numerical experiments demonstrate that our perturbations can outperform the previously best results from the literature on artificial as well as standard benchmark problems.

Randomized Assortment Optimization
PRESENTER: Zhengchao Wang

ABSTRACT. When a firm selects an assortment of products to offer to customers, it uses a choice model to anticipate their probability of purchasing each product. In practice, the estimation of these models is subject to statistical errors, which may lead to significantly suboptimal assortment decisions. Recent work has addressed this issue using robust optimization, where the true parameter values are assumed unknown and the firm chooses an assortment that maximizes its worst-case expected revenues over an uncertainty set of likely parameter values, thus mitigating estimation errors. In this paper, we introduce the concept of \emph{randomization} into the robust assortment optimization literature. We show that the standard approach of deterministically selecting a single assortment to offer is not always optimal in the robust assortment optimization problem. Instead, the firm can improve its worst-case expected revenues by selecting an assortment randomly according to a prudently designed probability distribution. We demonstrate this potential benefit of randomization both theoretically in an abstract problem formulation as well as empirically across three popular choice models: the multinomial logit model, the Markov chain model, and the preference ranking model. We show how an optimal randomization strategy can be determined exactly and heuristically. Besides the superior in-sample performance of randomized assortments, we demonstrate improved out-of-sample performance in a data-driven setting that combines estimation with optimization. Our results suggest that more general versions of the assortment optimization problem---incorporating business constraints, more flexible choice models and/or more general uncertainty sets---tend to be more receptive to the benefits of randomization.
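The core observation that randomizing can strictly improve the robust (max-min) value is easy to see on a toy two-scenario instance; the numbers below are purely illustrative and unrelated to the paper's choice models:

```python
def worst_case(revenues, probs):
    """Worst-case expected revenue, over parameter scenarios, of a mixed
    strategy `probs` over the candidate assortments (rows of `revenues`)."""
    n_scenarios = len(revenues[0])
    return min(sum(p * revenues[a][s] for a, p in enumerate(probs))
               for s in range(n_scenarios))

# Rows: two candidate assortments; columns: two parameter scenarios.
R = [[1.0, 0.0],
     [0.0, 1.0]]
deterministic = max(worst_case(R, [1.0, 0.0]), worst_case(R, [0.0, 1.0]))
randomized = worst_case(R, [0.5, 0.5])
# the deterministic worst case is 0.0, while randomizing 50/50 raises it to 0.5
```

This is the same reason mixed strategies help in zero-sum games: the adversarial scenario cannot target a single assortment when the firm's choice is randomized.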

Decarbonizing OCP

ABSTRACT. I will present our collaboration with OCP, one of the world's largest fertilizer producers, to decarbonize its electricity supply through the installation of solar panels and batteries. We develop an optimization framework that minimizes OCP's time-discounted investment cost plus the cost of satisfying its remaining demand via the national grid. We immunize our model against deviations between forecast and realized solar generation output via a combination of robust and distributionally robust optimization, to account, respectively, for variability in daily solar generation and in seasonal weather patterns induced by climate change. OCP is currently designing its renewable investment strategy, using insights gleaned from our optimization model, and has pledged to invest over 12.3 billion USD in its green initiative by 2027, a subset of which involves decarbonization. Under a 2 billion USD investment, the proposed methodology reduces the carbon emissions which arise from OCP's electricity needs by 95% while generating a net present value of 200 million USD over 20 years. The methodology could be applied to decarbonize other industrial consumers in a financially sustainable fashion, which is imperative in the developing world.

10:30-12:00 Session 10F: Invited session: Optimization in telecom networks, part 1, organized by Eric Gourdin, David Coudert, Sébastien Martin and Dritan Nace
Location: Amphi I
Integer linear formulations for the maximum flow blocker problem
PRESENTER: Isma Bentoumi

ABSTRACT. Given a directed graph with capacities and blocker costs associated with its arcs, we study the maximum flow blocker problem (MFBP), which asks to find a minimum-cost subset of arcs to be removed from the graph in such a way that the remaining maximum-flow value does not exceed a given threshold. The MFBP has applications in telecommunication networks and in the monitoring of civil infrastructures, among others. To the best of our knowledge, the MFBP has not been addressed in the literature, but a closely related problem, called the maximum flow interdiction problem (MFIP), has been largely studied. This problem consists in finding a subset of arcs to be removed from the graph such that the remaining maximum flow is minimized. The set of interdicted arcs is limited by a budget. We propose two integer linear programming (ILP) formulations for the MFBP, which are used to derive the first exact algorithms to optimally solve the problem. The first one relies on the bilevel structure of the MFBP. It has an exponential number of constraints and accordingly it is solved via a Branch-and-Cut algorithm. The second one has a polynomial number of variables and constraints and is solved via an ILP solver. This formulation is obtained thanks to a property that we have demonstrated, which establishes a relationship between the MFBP and the MFIP. The two formulations are compared through an extensive computational study. Our tests aim at evaluating the performance of the exact algorithms and at determining the features of the instances which can be solved to proven optimality.
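On tiny instances the MFBP can be checked by brute force against a max-flow routine, which makes the problem statement concrete. This enumeration is only a sanity-check sketch, not one of the ILP formulations proposed in the talk:

```python
from collections import deque
from itertools import combinations

def max_flow(n, arcs, s, t):
    """Edmonds-Karp max flow on nodes 0..n-1; arcs is a list of (u, v, capacity)."""
    cap = [[0.0] * n for _ in range(n)]
    for u, v, c in arcs:
        cap[u][v] += c
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:  # BFS for a shortest augmenting path
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow
        v, push = t, float("inf")
        while v != s:  # bottleneck capacity along the path
            push = min(push, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:  # update the residual graph
            cap[parent[v]][v] -= push
            cap[v][parent[v]] += push
            v = parent[v]
        flow += push

def blocker_brute_force(n, arcs, costs, s, t, threshold):
    """Cheapest arc subset whose removal brings the max flow down to <= threshold."""
    best_cost, best_subset = float("inf"), None
    for k in range(len(arcs) + 1):
        for subset in combinations(range(len(arcs)), k):
            cost = sum(costs[i] for i in subset)
            kept = [a for i, a in enumerate(arcs) if i not in subset]
            if cost < best_cost and max_flow(n, kept, s, t) <= threshold:
                best_cost, best_subset = cost, subset
    return best_cost, best_subset

# Diamond network 0 -> {1, 2} -> 3 with unit blocker costs: the max flow is 4,
# and removing any single arc already caps the remaining flow at the threshold 2.
arcs = [(0, 1, 2.0), (0, 2, 2.0), (1, 3, 2.0), (2, 3, 2.0)]
cost, removed = blocker_brute_force(4, arcs, [1.0] * 4, 0, 3, 2.0)
```

The enumeration is exponential in the number of arcs, which is precisely why the exact ILP formulations of the talk are needed for realistic instances.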

The controller placement problem

ABSTRACT. This study addresses the critical issue of controller placement in networks to fortify them against cyber-attacks. Node-to-node connections, essential conduits for inter-node communication, form the backbone of network functionality. Controllers, strategically positioned at select nodes, play a pivotal role in facilitating the establishment of these connections. Node attacks can take various forms, one of the most frequent being Denial-of-Service (DoS) attacks. The research centers on the deployment of controllers to safeguard the entire network. The objective is to minimize communication loss in the event of an attack. This problem falls within the Maximal Coverage Location Problem (MCLP) category, acknowledged for its NP-hard nature. The study extends this framework to the Equitable Controller Location Problem, aiming to ensure equitable protection for all nodes under limited controller availability. A novel mathematical model is introduced for controller placement, leveraging lexicographic optimization principles tailored to this context. The study introduces and compares both lexicographic and proportional fairness approaches for controller deployment.

Latency-Sensitive Service Chaining with Security Isolation Constraints
PRESENTER: Yannick Carlinet

ABSTRACT. Multi-access edge computing (MEC) is a distributed computing architecture that brings computing and storage resources closer to the network edge. It enables applications and services to be hosted and processed at the network edge, rather than in centralized data centers or cloud platforms. By bringing computing closer to the end-users, MEC can reduce network latency, improve application performance, and support emerging use cases that require low latency and high bandwidth, such as augmented reality, virtual reality, or autonomous vehicles. MEC can also enable new business models and revenue streams for network operators and service providers, by providing value-added services, such as content caching, video optimization, or network slicing.

However, there are also some challenges that need to be addressed, including:

  • Resource Constraints: Edge nodes typically have limited computing and storage resources compared to centralized data centers, which can limit the types and scale of applications that can be deployed on edge nodes.
  • Security: MEC introduces new security challenges, such as securing distributed computing resources and protecting sensitive data at the edge of the network. More specifically, an exploit can enable a rogue (or compromised) application to access non-authorized parts of RAM, allowing it to steal or poison the data of the other applications running on the same physical host. In a cloud environment, applications with greater security requirements would run on dedicated hardware. However, this is no longer possible in MEC because of the scarcity of resources.
  • Orchestration: MEC requires the management and orchestration of resources across multiple edge nodes and clusters, while meeting the requirements of service chains, including security and end-to-end latency constraints.

We focus on the orchestration challenges in MEC. In particular, this work brings the following contributions:

  • Description and modelling of an optimization problem, called SLSCP (Security and Latency-aware Service Chaining Problem);
  • Algorithms that provide solutions for SLSCP;
  • Set-up of a platform that emulates a realistic Edge Computing environment, and integration of the designed algorithms.

10:30-12:30 Session 10G: Numerical methods for optimal control and games
Location: A1.134
Neural networks for deterministic two-player games

ABSTRACT. We consider the value of a deterministic optimal control problem for two-player games. We propose time-marching schemes and deep neural network approximations, using the Lagrangian approach of \cite{bok-pro-war-22}, which was developed in the context of one-player optimal control problems. Our analysis aims to give error estimates in an average norm for the considered value. Some academic numerical examples related to reachability will be considered, where level-set functions are used to represent the reachable sets, and SGDA-type algorithms are utilized for min/max optimization (see e.g. \cite{raz-22}).

Semiconcave Dual Dynamic Programming and Its Application to N-body Problems
PRESENTER: Shanqing Liu

ABSTRACT. We introduce a novel algorithm for numerically finding the value function, along with the optimal trajectory, for a class of finite-horizon deterministic optimal control problems of the form \begin{equation}\label{problem} \begin{aligned} &\max \ \int_0^T \ell(x(s),u(s))\,ds + \phi(x(T)) \\ &\text{s.t. } \left\{ \begin{aligned} & \dot{x}(s) = f(x(s),u(s)) \ , \\ & x(s) \in X, \ u(s) \in U, \ \forall \ s \in [0,T] \ , \\ \end{aligned} \right. \end{aligned} \end{equation} where $x(0)=x_0$ is given, and the reward function $\ell$ is only required to be semiconcave w.r.t. $x$.

We look for a tight approximation of the value function along the optimal trajectories starting from a given initial point $x_0$. Our approach relies on approximating the value function from above, in a given time horizon, by a minimum of quadratic ``basis'' functions. However, the evolutionary semigroup associated with the maximization problem is max-plus linear instead of min-plus linear. We show that we can propagate the quadratic ``basis functions'' by solving a dual problem, which gives a new upper approximation of the value function. The trajectory is then updated to an optimal trajectory derived from the current approximate value function.

This method is inspired by, and can be thought of as, a generalization of the Dual Dynamic Programming algorithm~\cite{ref1,ref2}, adapted to {\em semiconcave} problems. We show that our method converges to the global maximum under certain regularity assumptions. We present numerical benchmarks on $N$-body problems.

Convergence of a Two-Player Version of Macqueen's k-means Algorithm

ABSTRACT. We study pure perfect Bayesian equilibria (PBE) in sender-receiver games with finitely many types for the sender. PBE in sender-receiver games are characterized by partitions of the set of sender's types, in which no type prefers to be pooled with types it is not currently pooled with, given the decisions of the receiver, optimally chosen at each cell of the partition. We propose a family of algorithms that converge to a PBE, in a standard class of sender-receiver games, reminiscent of Macqueen's version of the k-means algorithm.

In Macqueen's algorithm, at every step: (i) a single data point is moved from its cell to a cell with a closer mean, and (ii) cell-contingent means are readjusted accordingly. In our sender-receiver adaptation, the sender is in charge of moving a type from its cell to a cell it prefers (i), while the receiver readjusts the cell-contingent decisions accordingly (ii). Thus, a stopping partition amounts to a PBE.

Since players may have different objective functions, the simple convergence argument invoked in the k-means context (monotonicity of stages (i) and (ii) with regard to a single objective function) no longer holds. Instead, we employ a combinatorial argument to achieve convergence, for every initial partition and every path taken by the algorithm.

We do this for a standard family of utility functions for the players. The family covers, but goes much beyond, the usual univariate k-means Euclidean case. However, our argument crucially relies on the order on types and actions.
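For reference, the classical single-objective Macqueen step that the algorithm adapts can be sketched as follows. This is a generic one-dimensional online k-means pass, not the two-player variant of the talk; all function and variable names are ours:

```python
def macqueen_kmeans(xs, assign):
    """One-dimensional Macqueen-style k-means: repeatedly (i) move a
    single point to a cell whose mean is closer, then (ii) let the
    cell means readjust, until the partition is stable."""
    k = max(assign) + 1

    def mean(c):
        cell = [x for x, a in zip(xs, assign) if a == c]
        return sum(cell) / len(cell) if cell else float("inf")

    changed = True
    while changed:
        changed = False
        for i, x in enumerate(xs):
            best = min(range(k), key=lambda c: abs(x - mean(c)))
            if best != assign[i]:
                assign[i] = best   # step (i): move one point
                changed = True
                break              # step (ii): means readjust via mean()
    return assign
```

Each move strictly decreases the within-cell squared error, which is the monotonicity argument the abstract contrasts with its combinatorial one.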

Maximum of Reachable Values of Discrete-Time Systems

ABSTRACT. In this presentation, we are interested in finding a reachable value of a discrete-time dynamical system that maximizes a real-valued function. This type of maximization problem arises in the analysis of the robustness or the performance of the system, in inverse problems, or even in the formal verification of systems. The optimal value in this context can be viewed as the value which penalizes the system the most with respect to some criteria. The difficulty of the problem lies in the fact that optimal solutions are composed of a vector and an integer. Standard mathematical assumptions simplify the study of the optimal vector solution. The integer part is more difficult to handle, but an upper bound can be computed from a Lyapunov function of the system when one exists. The presentation includes an illustration of the result on affine systems.

10:30-12:30 Session 10H: Optimization and statistics
Location: A1.122
Statistical Empirical Risk Minimization: The Pursuit of Generalization

ABSTRACT. In this talk, the problem of empirical risk minimization (ERM) is studied using tools from statistics, measure theory, and information theory to establish generalization guarantees for machine learning algorithms. More specifically, the ERM with regularization by f-divergences, often referred to as statistical ERM, is studied assuming that priors might be in the form of a σ-finite measure (the reference measure), and not necessarily a probability measure. Under this assumption, which leads to a generalization of the statistical ERM problem allowing a larger degree of flexibility for incorporating prior knowledge, numerous relevant properties are stated. Among these properties, the solution to the statistical ERM, if it exists, is shown to be a unique probability measure for several choices of the function f. In particular, it is shown that the solution is often mutually absolutely continuous with the reference measure, which introduces a strong inductive bias that dominates the evidence provided by the training data [1, 2]. Interestingly, such a solution exhibits a probably-approximately-correct guarantee for the ERM (without regularization) independently of whether such a problem possesses a solution. In the case in which f(x) = x log(x), for a fixed training dataset, the empirical risk is shown to be a sub-Gaussian random variable when the models are sampled from the solution to the statistical ERM problem. The generalization capabilities of the solution to this problem (the Gibbs algorithm) are studied via the sensitivity of the expected empirical risk to deviations from such a solution towards alternative probability measures. Finally, an interesting connection between sensitivity, generalization error, and lautum information is established. The talk ends by highlighting the impact of the asymmetry of f-divergences in statistical ERM. In particular, regularization by relative entropy (KL divergence) is studied in both directions.
These regularizations are coined Type-I and Type-II, respectively. Finally, it is proved that Type-II regularization is equivalent to Type-I regularization with an appropriate transformation of the empirical risk function [3].
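For the relative-entropy case mentioned above (f(x) = x log(x), reference measure Q, regularization parameter λ > 0, empirical risk L), the solution takes the familiar Gibbs form. This is a hedged sketch in our own notation, not necessarily the authors' exact statement:

```latex
P^{\star}(\mathrm{d}\theta)
  \;=\;
  \frac{\exp\!\big(-\mathsf{L}(\theta)/\lambda\big)}
       {\displaystyle\int \exp\!\big(-\mathsf{L}(\nu)/\lambda\big)\, Q(\mathrm{d}\nu)}
  \; Q(\mathrm{d}\theta),
```

valid provided the normalizing integral is finite, which is the point at which a merely σ-finite reference measure Q, rather than a probability measure, enters the analysis.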

Outlier Detection in Regression: Conic Quadratic Formulations

ABSTRACT. In many applications, when building regression models, it is important to account for the presence of outliers, i.e., corrupted input data points resulting, e.g., from recording errors. Problems consisting in simultaneously performing linear regression and discarding outliers may be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature typically rely on the linearization of the cubic terms using big-M constraints. However, the latter lead to weak relaxations and poor performance in practice. In this work, we derive stronger second-order conic relaxations not involving big-M constraints. Our computational experiments indicate that the proposed formulations are several orders of magnitude faster than existing big-M formulations in the literature for this problem.
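A typical big-M linearization of the joint regression/outlier-removal problem, of the kind the conic relaxations above aim to strengthen, can be written as follows (notation ours; z_i = 1 flags observation i as an outlier whose residual is not penalized, and at most k outliers are allowed):

```latex
\begin{aligned}
\min_{\beta,\, e,\, z}\quad & \sum_{i=1}^{n} e_i^2 \\
\text{s.t.}\quad & -M z_i \;\le\; y_i - x_i^{\top}\beta - e_i \;\le\; M z_i,
  && i = 1,\dots,n,\\
& \textstyle\sum_{i=1}^{n} z_i \le k, \qquad z \in \{0,1\}^n.
\end{aligned}
```

When z_i = 0 the constraint forces e_i to equal the residual of observation i; when z_i = 1 the residual may vanish, effectively discarding the point. M must bound the largest possible residual, and a large M weakens the continuous relaxation, which is exactly the drawback discussed in the abstract.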

Variational inference with λ-exponential families
PRESENTER: Thomas Guilmeau

ABSTRACT. Variational inference methods aim at approximating difficult probability distributions by minimizing a statistical divergence over a family of parametric distributions. Exponential families, which include Gaussian distributions, are often chosen as the approximating family in such problems. This is because when the statistical divergence is the Kullback-Leibler divergence, it is possible to describe the optimal solutions using convex analysis techniques. However, it is often necessary to go beyond exponential families. For instance, heavy-tailed distributions cannot be represented as an exponential family.

Recently, λ-exponential families have been proposed to extend exponential families [1]. They include in particular Student and Cauchy distributions. The authors also introduced the theory of λ-duality to study λ-exponential families. This theory is a generalization of the convex duality theory using a suitable non-linear coupling instead of the standard scalar product. This is related to recent works in optimization [2]. Still, not much is known about these families, and there is a lack of results indicating how to use them to solve statistical tasks.

We study variational inference problems over λ-exponential families. To this end, we propose a theoretical framework based on λ-duality and extend the approach of [1]. This enables us to recover known variational inference results for standard exponential families and generalize them. In particular, we calculate generalized subgradients with our non-linear couplings for some functions of interest. This allows us to give novel sufficient optimality conditions for variational inference problems on λ-exponential families. We also propose a new proximal-like algorithm to address these problems. We illustrate numerically the usefulness of these results on problems involving heavy-tailed distributions.

[1] T.-K. L. Wong and J. Zhang, Tsallis and Rényi Deformations Linked via a New λ-Duality, IEEE Transactions on Information Theory, 68(8):5353-5373, 2022.

[2] A. Le Franc and J.-P. Chancelier and M. De Lara, The Capra-subdifferential of the ℓ_0 pseudonorm, Optimization, 1-23, 2022.

Using regression splines to approximate mixed integer nonlinear optimization models
PRESENTER: Vanesa Guerrero

ABSTRACT. Complex phenomena can be accurately described by means of data-driven mathematical models. However, being able to integrate these models within a mathematical optimization framework can be, in general, very challenging. In fact, many of these data-driven models are `black-box', in the sense that they do not have an explicit mathematical formula which describes them. In other cases, even if an explicit expression exists, including it into a mathematical optimization model may make solving the problem computationally intractable. We propose to use a special kind of surrogate models, regression splines, to deal with functions of this kind which appear in Mixed Integer Nonlinear Programming (MINLP) problems. The choice of spline functions is not arbitrary. On one hand, they offer a good compromise between accuracy and complexity. On the other hand, their functional form allows us to exploit separability and approximate general non-convex MINLPs by a more tractable subclass of problems.

10:30-12:30 Session 10I: Energy transition and decarbonization
Location: A1.128
A Planning Game for Global Decarbonization Strategies

ABSTRACT. The ongoing concerns surrounding climate change and its irreversible impacts necessitate a deeper understanding of the commitment levels and behavioral nuances of governments toward decarbonization. In our previous work, we established the dynamics of government players’ actions regarding carbon emission levels, emphasizing the interplay between GDP benefits and the repercussions of climate change.

In this year’s presentation, we further enrich our study by introducing two decision-making methodologies: a planning method and a receding horizon method. This allows us to delve deeper into the strategic decisions made by governments and ascertain the critical trade-off between global temperature rise and GDP. Such a rigorous examination provides valuable insights into certain behavioral tendencies that appear invariant to the choice of the damage function.

Furthermore, we augment our study with the introduction of new damage functions, offering a broader and more comprehensive perspective on the potential consequences of emissions. By examining the utilities, we aim to better understand the optimal agreeing time for both the planning and receding horizon methods, in favor of both government and climate.

In the culmination of our analysis, we establish conditions on the benefits and damage functions that guarantee the uniqueness of a pure Nash equilibrium. This is instrumental in not only characterizing the strategic interactions among governments but also in identifying scenarios where cooperative outcomes are more likely.

A Two-Timescale Decision-Hazard-Decision Formulation for Storage Usage Values Calculation in Energy Systems Under Uncertainty

ABSTRACT. The penetration of renewable energies requires additional storage to deal with the system's intermittency. Accordingly, there is growing interest in evaluating the opportunity cost associated with stored energy (usage value), which can be obtained by solving a multistage stochastic optimization problem. Today, to compute usage values under uncertainties, a resource adequacy problem is solved using stochastic dynamic programming assuming a hazard-decision information structure. This modelling assumes complete knowledge of the coming week's uncertainties, which is not adapted to the system operation, as the intermittency occurs at a smaller timescale. Moreover, we also assume full flexibility of thermal units. But in the case of unpredictable outages, it takes time to switch other thermal units on. What happens if we take this temporal rigidity into account? We introduce a two-timescale problem [1] with a new information structure considering planning and recourse decisions: decision-hazard-decision. This structure is used to decompose the multistage decision-making process into a non-anticipative planning step, in which switch-on decisions for thermal units are made, and a recourse step, in which power modulation decisions are made once the uncertainties have been disclosed. We address the usage value computation problem for a uninodal system composed of a cluster of thermal units, a single storage unit and a load. The uncertainties are the residual demand of the system and the availability of thermal units. We observe that the choice of the information structure for usage values computation can change the merit order in the system, i.e. the order of the storage usage value with respect to the thermal unit prices. The policy induced by the weekly fully anticipative hazard-decision information modelling leads to an overestimation of the thermal flexibility and, as a result, a lower use of the storage.
When using the policy induced by the decision-hazard-decision Bellman functions, we take into account some rigidity in the thermal units (especially switch-on and switch-off decisions) and the storage is used more.
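Schematically, the decision-hazard-decision structure leads to Bellman functions of the following form (a sketch in our own notation, not the authors' exact formulation), where the planning decision u^plan is taken before the uncertainty ξ_t is disclosed and the recourse decision u^rec after:

```latex
V_t(x) \;=\; \min_{u^{\mathrm{plan}}}\;
\mathbb{E}_{\xi_t}\!\left[\,
  \min_{u^{\mathrm{rec}}}\;
  c_t\big(x, u^{\mathrm{plan}}, u^{\mathrm{rec}}, \xi_t\big)
  + V_{t+1}\big(f_t(x, u^{\mathrm{plan}}, u^{\mathrm{rec}}, \xi_t)\big)
\right].
```

In the hazard-decision structure, by contrast, a single minimization sits entirely inside the expectation, which amounts to full anticipation of the week's uncertainty.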

Coordinated cross-border capacity calculation through the FARAO open-source toolbox
PRESENTER: Alexandre Marié

ABSTRACT. Cross-border exchanges play a major role in the European strategy to achieve climate goals. The European Commission set a target of 15\% interconnections in 2030, meaning that each country should have the physical capability to export at least 15\% of its production. Increasing exchanges makes short-term planning more complex.

In this context, the French TSO (RTE) released an open-source toolbox, FARAO, to perform Coordinated Capacity Calculation (CCC) and ensure the security of supply. This involves running simulations (load flow calculations) to detect violations on the network and then determining the preventive or curative actions to be implemented to resolve contingencies at the lowest possible cost. This second step is called Remedial Actions Optimisation (RAO) and consists in solving an Optimal Power Flow (OPF) problem.

Artelys is a consultancy expert in power systems optimization and carries out various projects around TSO operational coordination in Europe. FARAO performs the optimization of both preventive and curative remedial actions, including HVDC lines, phase-shifter transformers and counter-trading but also topological actions. It is operationally used for the exchanges between Italy and its northern neighbors as well as between France, Spain and Portugal.

Artelys will present the algorithms of the FARAO toolbox and how they are actually used to enable greater operational coordination amongst the countries.

A Mean-Field Game Model for Energy Transition
PRESENTER: Quentin Petit

ABSTRACT. We introduce a mean-field game model in which agents can invest in two types of capital: green capital and polluting capital. The productivity of each type of capital depends on a pollution variable subject to noise and common to all agents. We establish the existence of equilibria by solving a classical FBSDE. Numerical experiments are also carried out: we observe the energy transition.

10:30-12:30 Session 10J: Optimization under sparsity constraints
Location: A1.139
The l0 Bregman Relaxation
PRESENTER: M'Hamed Essafri

ABSTRACT. In this talk, we will present the l0 Bregman relaxation for sparse regression. It leads to exact relaxations of l0-penalized criteria in the sense that it preserves global minimizers while removing some local (not global) minimizers. As such, the l0 Bregman relaxation extends and generalizes previous relaxations proposed only for the least-squares (quadratic) case. Then, we deploy a Bregman proximal gradient algorithm to minimize the relaxed problem. Finally, numerical results for problems involving widely-used non-quadratic data terms (Kullback-Leibler divergence, logistic regression) will be presented.
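The Bregman proximal gradient iteration deployed here has the following generic form (a standard scheme, written in our own notation; h is a kernel adapted to the data term f, γ a step size, and g the relaxed penalty):

```latex
x^{k+1} \in \operatorname*{arg\,min}_{x}\;
  \big\langle \nabla f(x^k),\, x - x^k \big\rangle
  + g(x)
  + \tfrac{1}{\gamma}\, D_h(x, x^k),
\qquad
D_h(x, y) = h(x) - h(y) - \big\langle \nabla h(y),\, x - y \big\rangle.
```

Choosing h = (1/2)‖·‖² recovers the usual proximal gradient method; non-quadratic kernels are what allow data terms such as the Kullback-Leibler divergence to be handled.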

Safe Peeling for L0-Penalized Problems
PRESENTER: Théo Guyard

ABSTRACT. In many application fields such as machine learning, signal processing or inverse problems, it is of paramount interest to solve problems of the form \min f(x) + \lambda \|x\|_0 where f(•) is a loss to be minimized, \|•\|_0 is the L0-norm that promotes sparsity in the optimizers by counting the number of non-zeros in its argument and \lambda>0 is a hyperparameter. This problem can be solved effectively via Branch-and-Bound (BnB) algorithms by adding a bound constraint l <= x <= u to reformulate the L0-norm using a binary variable through Big-M relations. In practice, the choice of the values l and u must respect two conflicting imperatives. On the one hand, the bound constraint must not modify the solutions of the initial problem. Hence, the interval [l,u] should be chosen wide enough to contain all the optimizers of the problem. On the other hand, the quality of the relaxations constructed across the BnB tree degrades with the spread of [l,u]. Hence, setting a too-wide interval is likely to increase the number of nodes explored during the BnB algorithm and penalize the solution time.

We propose a new strategy, dubbed peeling, to address this issue. By leveraging a duality property linking consecutive nodes, we locally tighten the interval [l,u] which results in a strengthening of the relaxations constructed. This procedure is carried out in a safe manner, i.e., we guarantee the validity of the BnB procedure and the optimality of the solutions obtained. Peeling can be seen as a generalization of the screening methodology recently introduced for L0-penalized problems. Through different numerical experiments, we show that peeling allows significant gains in the number of nodes explored and in the solution time.

Screen & Relax for Sparse Support Identification
PRESENTER: Théo Guyard

ABSTRACT. Sparse problems arise in many application fields such as machine learning, signal processing and inverse problems, among others. They usually take the form of a convex optimization problem x^{\star} \in \arg\min f (x) + \lambda g(x), where f (·) is the loss to be minimized, g(·) is a sparsity-inducing regularization and \lambda > 0 is a tradeoff parameter. Several first-order optimization methods can address this kind of problem, such as Proximal Gradient (PG), Coordinate Descent (CD) or the Alternating Direction Method of Multipliers (ADMM). A popular strategy to exploit the problem structure is to enhance them with screening tests to detect the position of zeros in x^{\star}. This allows shrinking the problem dimensionality and saving computations during the solution process. We propose a complementary strategy, dubbed relaxing tests, to detect the position of non-zeros in x^{\star}. We use this knowledge to perform second-order iterations instead of first-order ones. By implementing screening and relaxing tests in a solution method, we expect both a reduction of its computational burden and an acceleration of its convergence speed.

Improving Sparse Penalties with Nonconvexity and Coefficient Clustering
PRESENTER: Anne Gagneux

ABSTRACT. In this talk, we address the following question: how can we enforce parsimony and structure in a regularized regression model? Specifically, we investigate strategies that promote not only sparsity but also clustering of correlated features. In fact, we aim to recover these groups without prior knowledge about the clusters. For such problems, the Sorted L1 norm (SLOPE, OWL) is already widely used. Yet, SLOPE relies on the L1 penalty. It therefore shares a major drawback with the Lasso: it is biased and tends to shrink non-null coefficients.

In the same way nonconvex penalties have been introduced as unbiased L1 alternatives, we deal here with sorted nonconvex penalties as unbiased SLOPE's counterparts. Feng and Zhang (2019) have already proposed an approach to address a family of sorted nonconvex penalties: they use a "majorize-minimization" method that generates a sequence of easier-to-solve convex surrogate problems.

Instead, we show that under suitable conditions, we can exactly compute, with linear complexity, the prox of these penalties and apply proximal gradient algorithms to solve the penalized regression. We introduce a new expression for the prox of sorted penalties which only involves the scalar prox of the penalty (for which we usually have an explicit expression) and averaged values of parts of the input vector and the regularization sequence.

We experimentally show that sorted nonconvex penalties improve both the unbiasedness of the solution and automatic feature grouping, which validates our initial hypothesis. We also compare our method with the "majorize-minimization" approach and demonstrate a significant gain in both minimization and computation time.
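For the convex SLOPE baseline mentioned above, the prox reduces to an isotonic regression computable by a pool-adjacent-violators pass; the averaging over merged blocks is exactly the kind of "averaged values" ingredient the abstract generalizes. A minimal sketch of that baseline (not the authors' new nonconvex prox; names are ours):

```python
import numpy as np

def prox_slope(v, lam):
    """Prox of the sorted-L1 (SLOPE) penalty, lam nonincreasing >= 0,
    via a pool-adjacent-violators (PAVA) pass."""
    v = np.asarray(v, dtype=float)
    lam = np.asarray(lam, dtype=float)
    sign = np.sign(v)
    order = np.argsort(-np.abs(v))          # sort |v| in decreasing order
    z = np.abs(v)[order] - lam
    # Isotonic regression of z onto the nonincreasing cone:
    # merge adjacent blocks whenever the monotonicity is violated.
    sums, counts = [], []
    for zi in z:
        s, c = zi, 1
        while sums and sums[-1] / counts[-1] <= s / c:
            s += sums.pop()
            c += counts.pop()
        sums.append(s)
        counts.append(c)
    x = np.concatenate(
        [np.full(c, max(s / c, 0.0)) for s, c in zip(sums, counts)]
    )
    out = np.empty_like(x)
    out[order] = x                          # undo the sorting
    return sign * out
```

Tied or merged entries receive the same value, which is precisely SLOPE's coefficient-clustering effect.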

12:00-12:30 Session 11A: Mean field games and control
Location: Auditorium
On the Optimal Rate for the Convergence Problem in Mean-Field Control
PRESENTER: Samuel Daudin

ABSTRACT. The goal of this work is to obtain optimal rates for the convergence problem in mean field control. Our analysis covers cases where the solutions to the limiting problem may be neither unique nor stable. Equivalently the value function of the limiting problem might not be differentiable on the entire space. Our main result is then to derive sharp rates of convergence in two distinct regimes. When the data is sufficiently regular, we obtain rates proportional to $N^{-1/2}$, with $N$ being the number of particles. When the data is merely Lipschitz and semi-concave with respect to the first Wasserstein distance, we obtain rates proportional to $N^{-2/(3d+6)}$. Noticeably, the exponent $2/(3d+6)$ is close to $1/d$, which is the optimal rate of convergence for uncontrolled particle systems driven by data with a similar regularity. The key argument in our approach consists in mollifying the value function of the limiting problem in order to produce functions that are almost classical sub-solutions to the limiting Hamilton-Jacobi equation (which is a PDE set on the space of probability measures). These sub-solutions can be projected onto finite dimensional spaces and then compared with the value functions associated with the particle systems. In the end, this comparison is used to prove the most demanding bound in the estimates. The key challenge therein is thus to exhibit an appropriate form of mollification. We do so by employing sup-convolution within a convenient functional Hilbert space. To simplify the analysis, we limit ourselves to the periodic setting. We also provide some examples to show that our results are sharp up to some extent.

12:00-12:30 Session 11B: Robust optimization and applications
Location: Amphi I
An Extension of the Bertsimas & Sim Result for Discrete, Linear, and Gamma-Robust Min-Max Problems
PRESENTER: Ivana Ljubic

ABSTRACT. Due to their nested structure, bilevel problems are intrinsically hard to solve - even if all variables are continuous and all parameters of the problem are exactly known. In this paper, we study mixed-integer linear bilevel problems with lower-level objective uncertainty, which we address using the notion of Gamma-robustness. We provide an extension of the famous result by Bertsimas and Sim for Gamma-robust single-level problems stating that the Gamma-robust counterpart of a min-max bilevel problem can be solved by solving a polynomial number of min-max problems of the nominal type. Moreover, we discuss the situation in which the problems of the nominal type are not solved exactly but in which either an alpha-approximation algorithm or a primal heuristic together with a lower bounding scheme is used. For general Gamma-robust bilevel problems, however, we illustrate that the previous ideas cannot be carried over completely. Nevertheless, we present a primal heuristic that is based on the solution of a polynomial number of problems of the nominal type. To assess the performance of the presented methods, we perform a computational study on 560 instances for both settings. For the primal heuristic, we observe that the optimality gap is closed for a significant portion of the considered instances. If our result for the min-max setting is exploited, we report speed-up factors exceeding 400 and 32 in the mean and the median, respectively, when compared to recently published problem-tailored solution approaches.

12:30-14:00Lunch Break
15:00-15:30Coffee Break
15:30-17:00 Session 13A: Invited session: Mean field games and control, part 2, organized by Guilherme Mazanti and Laurent Pfeiffer
Location: Auditorium
Approximation of deterministic mean field games

ABSTRACT. In this talk we study the numerical approximation of deterministic Mean Field Games where the dynamics of a typical agent is non-linear with respect to the state variable and affine with respect to the control variable. Particular instances of the problem considered here are mean field games where the typical player controls its acceleration. Our main result is the convergence of solutions of this approximation towards MFG equilibria. If time permits, we will also discuss the case of deterministic mean field games with state constraints.

Based on joint works with J. Gianatti (CIFASIS-CONICET, Universidad Nacional de Rosario).

Hybrid games in route planning for sailing vessels and their mean field limit

ABSTRACT. We discuss the extension of the classic dynamic programming approach to hybrid control systems, where a continuous controller can switch between a finite collection of states, paying a cost of switching. This approach requires the resolution of a system of quasi-variational Hamilton-Jacobi inequalities that we propose to approximate via a semi-Lagrangian scheme obtained by the direct discretization of the dynamic programming principle. We discuss the application of such a framework to model a sailing boat navigation problem for the optimization of the strategic choices on a race course.

The model can be extended to non-zero-sum games, naturally including the presence of three or more players. As the number of players goes to infinity, it is possible to introduce a new model as its mean-field limit. This is the case of very crowded races, where the interaction between many players would be impossible to model correctly with other techniques.

Kolmogorov equations on spaces of measures associated to nonlinear filtering processes

ABSTRACT. Backward Kolmogorov equations are partial differential equations (PDEs) of parabolic type with a given final condition. The relation between them and certain stochastic processes has been intensively investigated in both finite and infinite dimensional cases. Moreover, their comprehension represents a preliminary step in the study of the Hamilton-Jacobi-Bellman equations, which are nonlinear partial differential equations strongly connected to stochastic optimal control problems.

This talk aims to present a class of backward Kolmogorov equations on spaces of probability and positive measures, associated to measure-valued stochastic processes arising in the context of nonlinear filtering. Indeed, in the filtering framework, one can formulate two stochastic differential equations, called Zakai and Kushner-Stratonovich equations, that are satisfied by a positive measure and a probability measure-valued process respectively. Thus, one can study the associated backward Kolmogorov equations, which are partial differential equations of parabolic type on the space of measures.

In the literature, a common approach is to assume that the measure-valued processes admit a density, to exploit stochastic calculus techniques in Hilbert spaces. Here the existence of a density is not assumed and everything is done directly in the context of measures. Due to this, the obtained equations will resemble the ones usually introduced in the context of mean field games and mean field control.

The study of these PDEs is a preparatory step to address the control problems with partial observation. Indeed, these control problems can be rephrased as optimal control problems with full observation of a measure-valued state which solves a filtering equation. It follows that it is natural to formulate the associated Hamilton-Jacobi-Bellman equations on the spaces of measures, as a nonlinear version of the aforementioned backward equations.

In the talk, we will introduce the Kolmogorov equations and then present a well-posedness result for classical solutions. If time remains, we will discuss possible future applications and developments.

15:30-17:00 Session 13B: Invited session: Optimal transport and applications, organized by Luca Nenna
Location: Amphi II
An ODE characterisation of Entropic Multi-Marginal Optimal Transport

ABSTRACT. The purpose of this talk is to introduce a new numerical method to solve multi-marginal optimal transport problems with pairwise interaction costs. The complexity of multi-marginal optimal transport generally scales exponentially in the number of marginals $m$. We introduce a one-parameter family of cost functions that interpolates between the original cost and a special cost function for which the problem's complexity scales linearly in $m$. We then show that the solution to the original problem can be recovered by solving an ordinary differential equation in the parameter $\eta$, whose initial condition corresponds to the solution for the special cost function mentioned above. We then present some simulations, using both explicit Euler and explicit higher-order Runge-Kutta schemes, to compute solutions to the ODE and, as a result, to the multi-marginal optimal transport problem.
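The continuation idea, integrating a solution path in the parameter η from a known initial solution, can be sketched generically. Below is a toy scalar version with explicit Euler and classical RK4 steps; the actual right-hand side in the talk comes from the interpolated cost family, and all names here are ours:

```python
def euler_path(g, x0, etas):
    """Explicit Euler continuation of dx/deta = g(eta, x), x(etas[0]) = x0."""
    x = x0
    for e0, e1 in zip(etas, etas[1:]):
        x = x + (e1 - e0) * g(e0, x)
    return x

def rk4_path(g, x0, etas):
    """Classical 4th-order Runge-Kutta continuation of the same path."""
    x = x0
    for e0, e1 in zip(etas, etas[1:]):
        h = e1 - e0
        k1 = g(e0, x)
        k2 = g(e0 + h / 2, x + h / 2 * k1)
        k3 = g(e0 + h / 2, x + h / 2 * k2)
        k4 = g(e1, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x
```

On a path with known solution, RK4 tracks it far more accurately than Euler for the same step count, which is why higher-order schemes are worth the extra right-hand-side evaluations here.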

Convergence rate of entropy-regularized multi-marginal optimal transport costs

ABSTRACT. We investigate the convergence rate of multi-marginal optimal transport costs that are regularized with the Boltzmann-Shannon entropy, as the noise parameter ε tends to 0. We establish lower and upper bounds on the difference with the unregularized cost of the form Cε log(1/ε) + O(ε) for some explicit dimensional constants C depending on the marginals and on the ground cost, but not on the optimal transport plans themselves. Upper bounds are obtained for Lipschitz costs or semi-concave costs (for a finer estimate), and lower bounds for C² costs satisfying some signature condition on the mixed second derivatives that may include degenerate costs, thus generalizing results previously obtained with Carlier and Tamanini, and by Eckstein and Nutz. We obtain in particular matching bounds in some typical situations where the optimal plan is deterministic, like in the case of Wasserstein barycenters. This is a joint work with Luca Nenna.
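For two marginals, the entropy-regularized cost in question can be computed with the standard Sinkhorn iterations. A minimal sketch illustrating the regularization (not the multi-marginal setting of the talk; names are ours):

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps, n_iter=500):
    """Entropy-regularized optimal transport between histograms a and b
    with cost matrix C and noise parameter eps; returns the coupling P."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iter):              # alternating marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]
```

As eps tends to 0, the regularized cost np.sum(P * C) approaches the unregularized one, at the rate quantified in the abstract.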

Displacement smoothness of entropic optimal transport and applications

ABSTRACT. Entropic optimal transport (EOT) has received a lot of attention in recent years due to its numerical tractability (Sinkhorn algorithm) as well as its connections to the Schrödinger bridge problem. In this talk, I will first give some Lipschitz dependence results for the potentials of EOT (in Ck norm) with respect to the marginals (for the Wasserstein metric). As a first application, well-posedness for Wasserstein gradient flows (with possibly several species) of energies involving EOT terms will follow. Finally, I will describe another application which concerns the so-called semi-geostrophic equation, which has a Hamiltonian structure; well-posedness and convergence of its entropic approximation will be explained (work in progress with Jean-David Benamou and Hugo Malamut).

15:30-17:00 Session 13C: Genetic optimization and discrepancy
Location: A1.116
Optimal sets for the L-infinity star discrepancy

ABSTRACT. Discrepancy measures are metrics designed to quantify how well-spread a point set is in a given space. Among the different measures, the L-infinity star discrepancy is the most important, measuring the largest deviation between the volume of any box [0, q), q<1, and the proportion of points that are inside this box. Point sets of low discrepancy are used in many contexts, including Quasi-Monte Carlo integration, financial mathematics and one-shot optimization. Despite their widespread use, few people have directly tried to find optimal sets for this measure. In dimension 1, the optimal sets are known for all n [1], but already in dimension 2 results are limited to n<7 by White [2]. More recent works have focused either on very few points in higher dimensions ([3],[4]) or on other measures [5].

In this talk, we introduce two non-linear programming formulations to find optimal L-infinity star discrepancy sets. With these, we are able to compute optimal sets for n<21 in dimension 2 and n<9 in dimension 3. These sets present a very different structure to that of known low-discrepancy sets and sequences such as the Sobol' or Kronecker sequences, possibly suggesting a new construction method. Furthermore, our formulations can be easily adapted to related problems such as adding one or many points to an existing set, finding optimal lattices, or finding optimal sets for the 4-corner star discrepancy, which considers all boxes anchored in one of the corners of [0,1]^d.

[1]: H. Niederreiter, Discrepancy and convex programming, Ann. Mat. Pura Appl., 93:89-97, 1972. [2]: B.E. White, On optimal extreme-discrepancy point sets in the square, Numer. Math., 27:157-164, 1976/1977. [3]: Tim Pillards, Bart Vandewoestyne, and Ronald Cools, Minimizing the L_2 and L-infinity star discrepancies of a single point in the unit hypercube, Journal of Computational and Applied Mathematics, 197:282-285, 2006. [4]: Gerhard Larcher and Friedrich Pillichshammer, A note on optimal point distributions in [0,1)^s, Journal of Computational and Applied Mathematics, 206(2):977-985, 2007. [5]: Aicke Hinrichs and Jens Oettershagen, Optimal point sets for quasi-Monte Carlo integration of bivariate periodic functions with bounded mixed derivatives, Monte Carlo and Quasi-Monte Carlo Methods, 2014.
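For intuition about the quantity being optimized, the L-infinity star discrepancy of a small 2D point set can be evaluated by naive enumeration over the grid spanned by the point coordinates (the standard critical-box argument: overshoot is checked with closed boxes, undershoot with open ones). This brute-force sketch is ours, not the talk's formulations:

```python
def star_discrepancy_2d(pts):
    """Naive exact L-infinity star discrepancy in dimension 2 (O(n^3)).
    The supremum over boxes [0, q) is attained on the grid spanned by the
    point coordinates and 1: closed boxes bound the 'too many points' side,
    open boxes the 'too few points' side."""
    n = len(pts)
    xs = sorted({p[0] for p in pts} | {1.0})
    ys = sorted({p[1] for p in pts} | {1.0})
    best = 0.0
    for x in xs:
        for y in ys:
            closed = sum(1 for p in pts if p[0] <= x and p[1] <= y) / n
            opened = sum(1 for p in pts if p[0] < x and p[1] < y) / n
            best = max(best, closed - x * y, x * y - opened)
    return best
```

For example, the single point (0.5, 0.5) has star discrepancy 0.75, driven by the closed box [0, 0.5]^2 of volume 0.25 containing the whole point set.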

An Improved Tie-Breaking Rule for the Non-Dominated Sorting Genetic Algorithm II

ABSTRACT. The \emph{non-dominated sorting genetic algorithm II} (NSGA-II) is the most intensively used multi-objective evolutionary algorithm (MOEA) in real-world applications. While theoretical research on the NSGA-II has only started very recently, this first work was quickly followed up by around a dozen research papers. The lower bounds proven by Doerr and Qu show that the expected run time of the NSGA-II often scales linearly with the population size~$N$ of the algorithm. Consequently, different from many other evolutionary algorithms, the NSGA-II does not profit from larger population sizes (as long as the population size is large enough for the NSGA-II to be effective). This shows that choosing the right value of this parameter is particularly critical for the NSGA-II.

Analyzing the lower bound proofs, we find that this undesirable scaling is caused by the way the NSGA-II breaks ties between solutions of the same quality (in the sense of non-dominated sorting and crowding distance). We propose a simple modification of the tie-breaking rule of the NSGA-II that takes into account the number of individuals with the same objective value. We investigate the performance of the NSGA-II with this \emph{balanced} tie-breaking on the two benchmark functions OneMinMax and OneJumpZeroJump. We prove upper bounds that remove the dependence of the expected run time on the population size~$N$ for decent ranges of this parameter. Hence, in these regimes, the choice of the population size is not critical.
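The abstract does not spell out the exact rule, so the following is only a plausible sketch of a "balanced" tie-breaking in survival selection: individuals are kept by non-domination rank, then crowding distance, and remaining ties are broken in favour of individuals whose objective value is shared by fewer population members (all names and signatures are illustrative):

```python
from collections import Counter

def select_survivors(pop, rank, crowding, f, N):
    """Keep N individuals: lower non-domination rank first, larger crowding
    distance next, and among exact ties prefer individuals whose objective
    value occurs less often in the population ('balanced' tie-breaking)."""
    counts = Counter(f(x) for x in pop)
    ordered = sorted(pop, key=lambda x: (rank[x], -crowding[x], counts[f(x)]))
    return ordered[:N]
```

With the standard rule, an individual whose objective value is unique can be discarded in favour of one of many duplicates; the extra sort key above makes that event impossible when everything else ties.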

This work is part of the PGMO-funded project \emph{Mathematical Analysis of Complex Randomized Search Heuristics} (PI: Benjamin Doerr).

Fast Biased Random Key Genetic Algorithm in Target Set Selection
PRESENTER: Hoang Nguyen Vu

ABSTRACT. The biased random key genetic algorithm (BRKGA) is a genetic algorithm used to find optimal or suboptimal solutions to hard combinatorial optimization problems. It has recently proven highly effective in addressing the J-MIN target set selection (TSS) problem, a combinatorial optimization problem based on graph diffusion models. While the BRKGA performs well on the target set selection problem, it demands meticulous tuning of several parameters, which is troublesome and complicates its use. To simplify this, we adopt an approach that randomly selects parameters from a power-law distribution; this approach has recently been proven mathematically to be favourable for certain problems. In this talk, we apply it to propose a fast biased random key genetic algorithm (fastBRKGA), together with a new greedy heuristic, the reverse Maximum-Degree Heuristic, for solving the TSS problem in large-scale social networks. Our experimental investigation finds that the new algorithm yields results comparable to traditional parameter tuning, and that the new greedy heuristic improves the results significantly. This work is part of the PGMO-funded project Mathematical Analysis of Complex Randomized Search Heuristics (PI: Benjamin Doerr).
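The abstract does not state which power law is used; assuming the common choice of a discrete power law on {1, ..., n}, randomizing a parameter instead of tuning it can be sketched as:

```python
import random

def power_law_sample(n, beta=1.5, rng=random):
    """Draw k in {1, ..., n} with probability proportional to k**(-beta):
    small values are frequent, but large values keep non-trivial mass,
    so a parameter drawn this way needs no manual tuning."""
    weights = [k ** (-beta) for k in range(1, n + 1)]
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for k, w in enumerate(weights, start=1):
        acc += w
        if r <= acc:
            return k
    return n  # guard against floating-point rounding
```

Here `beta` and the range are illustrative; the heavy tail is the point, since it occasionally tries aggressive parameter values that fixed tuning would rule out.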

15:30-17:30 Session 13D: Invited session: New roads to optimality, organized by Sorin-Mihai Grad
Location: A1.139
Nonlinear Sharp Minimum and the Stability of a Local Minimum on Metric Spaces
PRESENTER: Michel Thera

ABSTRACT. In this talk we put the emphasis on nonlinear (weak) sharp minimizers with respect to a gauge function. This concept plays an important role in both theoretical and numerical aspects of optimization. In the last part of the contribution we study the stability (in some appropriate sense) of local/global minimizers of an objective function $f$ perturbed to $ f + g$ by a function $g$ belonging to a suitable class of Lipschitz functions defined on metric spaces.

Minimizing a Separable Sum Coupled by a Difference of Functions and Linear Constraints

ABSTRACT. In this work, we develop a splitting algorithm for solving a broad class of linearly constrained composite optimization problems, whose objective function is the separable sum of possibly nonconvex nonsmooth functions and a smooth function, coupled by a difference of functions. This structure encapsulates numerous significant nonconvex and nonsmooth optimization problems in the current literature including the linearly constrained difference-of-convex problems. Relying on the successive linearization and alternating direction method of multipliers, the proposed algorithm exhibits the global subsequential convergence to a stationary point of the underlying problem. We also establish the convergence of the full sequence generated by our algorithm under the Kurdyka--{\L}ojasiewicz property and some mild assumptions. The efficiency of the proposed algorithm is tested on a robust principal component analysis problem and a nonconvex optimal power flow problem.

First order inertial optimization algorithms with threshold effects associated with dry friction

ABSTRACT. In a Hilbert space setting, we consider new first order optimization algorithms which are obtained by temporal discretization of a damped inertial autonomous dynamic involving dry friction. The function to be minimized is assumed to be differentiable (not necessarily convex). The dry friction potential function, which has a sharp minimum at the origin, enters the algorithm via its proximal mapping, which acts as a soft thresholding operator on the sum of the velocity and the gradient terms. After a finite number of steps, the structure of the algorithm changes, losing its inertial character to become the steepest descent method. The geometric damping driven by the Hessian of the objective function makes it possible to control and attenuate the oscillations. The algorithm generates convergent sequences in the convex case, and in the nonconvex case provided that the Kurdyka-Lojasiewicz property is satisfied. The convergence results are robust with respect to numerical errors, and perturbations. The study is then extended to the case of a nonsmooth convex function, in which case the algorithm involves the proximal operators of the objective function and the dry friction potential separately. Applications are given to the Lasso problem and nonsmooth d.c. programming.
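As an illustrative sketch in our own notation (the talk's algorithms are richer, e.g. with Hessian-driven damping and a nonsmooth objective variant), the dry friction potential lam*|.| enters through its proximal mapping, the classical soft-thresholding operator, applied to a momentum-plus-gradient combination; once the thresholded quantity stays below the threshold, the velocity is exactly zero and the iterate freezes:

```python
def soft_threshold(z, lam):
    """Proximal map of lam*|.|: shrink toward 0, exactly 0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def dry_friction_step(x, v, grad, h=0.1, lam=0.05, gamma=0.9):
    """One inertial step (1D sketch): the velocity update is thresholded,
    so small velocity-plus-gradient combinations snap to exactly zero,
    which produces the finite-time stopping (threshold) effect."""
    v_new = soft_threshold(gamma * v - h * grad(x), lam)
    return x + h * v_new, v_new
```

On f(x) = x^2 this iteration comes to rest at a point where the gradient is below the friction threshold (|2x| <= lam/h, i.e. |x| <= 0.25 with the values above), illustrating why dry friction yields approximate rather than exact minimizers.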

Beyond the Fermat Optimality Rules
PRESENTER: Sorin-Mihai Grad

ABSTRACT. We propose a general framework for analyzing the behavior at its extrema of an extended real-valued function that can lack convexity or differentiability, and for which the classical Fermat rules of optimality do not apply. To this end, we employ the notions of sup-subdifferential, recently introduced by two of the authors together with A. Kruger in [1], and partial sup-subdifferentials. The sup-subdifferential is a nonempty enlargement of the Moreau-Rockafellar subdifferential, satisfying most of its fundamental properties and enjoying certain calculus rules. The partial sup-subdifferentials are obtained by breaking down the sup-subdifferential into one-dimensional components through the elements of a Hamel basis and play the same role as the partial derivatives in the Fermat optimality rules.

[1] M. Abbasi, A.K. Kruger, M. Théra, Enlargements of the Moreau-Rockafellar subdifferential, Set-Valued Var Anal 29(3):701--719, 2021

15:30-16:30 Session 13E: Invited session: Optimization in telecom networks, part 2, organized by Eric Gourdin, David Coudert, Sébastien Martin and Dritan Nace
Location: Amphi I
The Virtual Network Function Reconfiguration Problem
PRESENTER: Kristina Kumbria

ABSTRACT. One of the challenges in the administration of 5G networks revolves around the need to reconfigure network slices. This entails reconfiguring and relocating Virtual Network Functions (VNFs) to align with the specific demands of the network slices and the resource capacities within data centers. We consider the problem of allocating VNFs in 5G networks, treating it as a multiple multidimensional knapsack reconfiguration problem where items represent VNFs and knapsacks represent the servers on which the VNFs are able to run. When provided with both the starting and ending allocations of items to knapsacks, the challenge involves defining a sequence of steps for transferring items between source and destination knapsacks. The ultimate goal is to reduce the overall count of steps during which items reside outside the knapsacks, thereby minimizing the total downtime of the VNFs. We define two exact, compact Integer Linear Programming (ILP) formulations of the VNF reconfiguration problem and derive an approximate solution algorithm based on the Column Generation (CG) method. Numerical tests and comparisons illustrate the performance of the algorithms and the quality of the provided solutions.

Learning Sparsified Networks in Column Generation: Applications to Multi-Commodity Network Flow Problems
PRESENTER: Alexandre Schulz

ABSTRACT. Column Generation (CG) is a well-known method suited to solving linear optimisation problems that have an exponential number of variables, such as those stemming from Dantzig-Wolfe decomposition. CG consists in solving the Restricted Master Problem (RMP), i.e. the original problem restricted to a subset of variables, and iteratively adding promising variables found by solving a pricing problem. The algorithm stops as soon as no such variables exist.

CG is notably used for solving the Multi-Commodity Network Flow problem (MCNF)~\cite{CGMCNF}. The latter consists, given a capacitated network and a set of demands, in routing the demands from their sources to their targets at minimum cost while respecting the network capacities. In the MCNF path-flow formulation, a variable represents a path that can be used to route a demand, and new variables are found by solving a shortest-path problem for each demand, which can be computationally expensive for large instances. In telecommunication networks, MCNFs are repeatedly solved over the same network with demands varying from instance to instance. We propose to leverage this information to devise an effective heuristic. Our approach relies on training a graph neural network \cite{GNN} to identify, for each demand, the regions of interest of the graph, drastically reducing its size and improving the performance of the pricing algorithm. The improvements are illustrated by applying the method to a set of realistic MCNF instances, showing a significant reduction in computation time with only a slight degradation of the objective.
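To make the pricing step concrete, here is a hypothetical sketch for the path-flow formulation: each arc cost is augmented by the dual of its capacity constraint, and a column (path) is added only if its reduced cost beats the demand's convexity-constraint dual, called `sigma` below. All names and the toy data are illustrative, not from the talk:

```python
import heapq

def shortest_path(arcs, source, target):
    """Dijkstra; arcs maps node -> list of (successor, cost)."""
    dist, prev = {source: 0.0}, {}
    pq = [(0.0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        if u == target:
            break
        for v, c in arcs.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    path, node = [], target
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return dist[target], path[::-1]

def price_out(arcs, duals, demand):
    """Pricing for one demand: reduced arc cost = cost + capacity dual.
    A new column (path) enters iff its reduced cost is below sigma."""
    reduced = {u: [(v, c + duals.get((u, v), 0.0)) for v, c in lst]
               for u, lst in arcs.items()}
    cost, path = shortest_path(reduced, demand["src"], demand["dst"])
    return (cost, path) if cost < demand["sigma"] else None
```

The sparsification heuristic described above would shrink `arcs` to the learned region of interest before this call, which is where the speed-up comes from.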

15:30-16:30 Session 13F: Optimal control and epidemiology
Location: A1.122
Optimal Vaccination Strategies for Metropolitan Areas
PRESENTER: Lucas Moschen

ABSTRACT. This study presents a mathematical model to determine optimal vaccination strategies in metropolitan areas, considering commuting patterns. The epidemiological model employed is a compartmental SEIR model, incorporating a vaccination rate as a control function for each city. The commuting patterns are incorporated using a weighted adjacency matrix and a parameter that weighs day and night periods. The optimal control (OC) problem is formulated to minimize a balance between the number of hospitalizations and vaccines, including a weekly availability cap and a limit on the number of vaccine doses applied per time unit, leading to mixed control-state and pure-state constraints, which are challenging to handle. Additionally, our study addresses a control-affine problem, which, while practical, poses mathematical difficulties. The key findings are bounds for the basic reproduction number, particularly in the case of a metropolitan area, and the analysis of the OC problem. Theoretical analysis and numerical simulations provide insights into disease dynamics and the effectiveness of control measures. The research highlights the importance of prioritizing vaccination in the capital to reduce disease spread faster, as depicted in our numerical simulations, especially when the capital has higher contact rates. Future research will refine this strategy and explore theoretical aspects of constrained control-affine problems.
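Stripped of the commuting structure and the control constraints, the underlying compartmental dynamics can be sketched as a single-city explicit Euler step, with the vaccination rate u acting as the control (parameter names and values are illustrative, not the paper's calibration):

```python
def seir_vax_step(s, e, i, r, v, beta, sigma, gamma, u, dt):
    """One explicit Euler step of an SEIR model with vaccination rate u
    moving susceptibles directly into a vaccinated group v.
    beta: transmission, sigma: incubation, gamma: recovery rate."""
    n = s + e + i + r + v
    ds = -beta * s * i / n - u * s
    de = beta * s * i / n - sigma * e
    di = sigma * e - gamma * i
    dr = gamma * i
    dv = u * s
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr, v + dt * dv)
```

The derivatives sum to zero, so total population is conserved step by step; in the OC problem the scalar u becomes a time-dependent control per city, coupled through the commuting matrix.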

Optimal Control for General Impulsive VS-EIAR Epidemics Models with Application to COVID-19

ABSTRACT. In this work, we are interested in a VS-EIAR epidemiological model considering vaccinated individuals $\{V_i : i=1,\dots,n\}$, where $n\in \mathbb{N}^{*}$. The dynamics of the VS-EIAR model involve several ordinary differential equations that describe the changes in the vaccinated, susceptible, infected, exposed, asymptomatic, and deceased population groups. Our aim is to reduce the number of susceptible, exposed, infected, and asymptomatic individuals by administering vaccination doses to susceptible individuals and treatment to infected individuals. To achieve this, we utilize optimal control theory to regulate the dynamics of our considered epidemic model within a terminal optimal time $\tau^{*}$. Pontryagin's maximum principle (PMP) will be employed to establish the existence of an optimal control pair $(v^{*}(t), u^{*}(t))$. We also incorporate an impulsive VS-EIAR epidemic model, with special attention given to immigration or the travel of certain population groups. Finally, we provide a numerical simulation to demonstrate the practical implementation of the theoretical findings.

15:30-17:30 Session 13G: Conic Optimization
Location: A1.133
Semidefinite programming by projective-cutting-planes

ABSTRACT. Around ten years ago I started to publish work on the idea of upgrading the separation sub-problem to a projection (or intersection) one: given an infeasible solution d and a polytope P, what is the maximum step length t so that t*d stays in P? This amounts to projecting from 0 towards d. Around five years ago I introduced Projective Cutting Planes because I realized that in some problems it may be possible to project from any feasible point instead of zero. I now think (SDP) is such a problem. To make Projective Cutting Planes work, we have to be able to rapidly solve t = max {t : S + tD is SDP} for a given SDP matrix S and an arbitrary matrix D. This projection sub-problem is quite simple if S is strictly SDP.
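For intuition, here is a minimal sketch of this projection sub-problem in the 2x2 case, assuming S is strictly SDP (positive definite): the smallest eigenvalue of S + tD is concave in t, so the feasible step lengths form an interval containing 0 and the maximum step can be found by bisection. This generic approach is our illustration; the talk's method for the sub-problem may differ:

```python
import math

def lam_min_2x2(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b ** 2)

def max_step(S, D, t_hi=1e6, tol=1e-10):
    """Largest t with S + t*D semidefinite positive (2x2 case).
    lam_min(S + t*D) is concave in t, hence {t >= 0 : SDP} is an
    interval containing 0 and plain bisection applies."""
    def feasible(t):
        a = S[0][0] + t * D[0][0]
        b = S[0][1] + t * D[0][1]
        c = S[1][1] + t * D[1][1]
        return lam_min_2x2(a, b, c) >= -tol
    if feasible(t_hi):
        return math.inf  # no blocking direction up to t_hi
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

For larger matrices the same bisection works with any symmetric eigenvalue routine, and for strictly SDP S the answer is also available in closed form from the smallest eigenvalue of S^{-1/2} D S^{-1/2}.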

QPLayer: efficient differentiation of convex quadratic optimization
PRESENTER: Antoine Bambade

ABSTRACT. Optimization layers within neural network architectures have become increasingly popular for their ability to solve a wide range of machine learning tasks and to model domain-specific knowledge. However, designing optimization layers requires careful consideration, as the underlying optimization problems might be infeasible during training. Motivated by applications in learning and control, this work focuses on convex quadratic programming (QP) layers. The specific structure of this type of optimization layer can be efficiently exploited for faster computations while still allowing rich modeling capabilities. We leverage primal-dual augmented Lagrangian techniques for computing derivatives of both feasible and infeasible QP solutions. More precisely, we propose a unified approach which tackles the differentiability of the closest feasible QP solutions. The obtained Jacobian covers, for feasible QPs, the traditional implicit differentiation when it is valid, and a weaker notion (i.e., a conservative Jacobian) when the QP is infeasible. We then harness this approach to enrich the expressive capabilities of existing QP layers: we show how differentiating through infeasible QPs during training enables a new range of QP layers to be driven towards feasibility at test time. These layers notably demonstrate superior predictive performance in some conventional learning tasks. Additionally, we present alternative formulations that enhance numerical robustness, speed, and accuracy for training such layers. The effectiveness of our approach is demonstrated in a few standard learning experiments, obtaining three to ten times faster computations than alternative state-of-the-art methods while being more accurate and numerically robust. Along with these contributions, we provide an open-source C++ software package called QPLayer for efficiently differentiating convex QPs, which can be interfaced with modern learning frameworks.

A new interior point solver for HiGHS
PRESENTER: Julian Hall

ABSTRACT. HiGHS is open-source software for large-scale linear optimization, and has established itself as offering the best benchmark performance thanks to its innovative interior point method (IPM) solver for LP and its powerful MIP solver. In particular, the IPM solver has been hailed as a game-changer for the field of open-source energy system modelling. However, for some instances, the IPM solver is still prohibitively expensive. This talk will discuss the development of a new IPM solver for HiGHS.

Interior point methods in optimal control

ABSTRACT. This presentation deals with Interior Point Methods (IPMs) for Optimal Control Problems (OCPs) with pure state and mixed constraints. It establishes a complete proof of convergence of IPMs for a general class of OCPs. Convergence results are proved for the primal variables, namely the state and control variables, and for the dual variables, namely the adjoint state and the constraint multipliers. In addition, the presented convergence result does not rely on a strong convexity assumption. Interestingly, interior point methods in optimal control allow pure state constraints and mixed state-control constraints to be treated in the same way, i.e. using logarithmic penalties in each case. However, the difference in nature between these two types of constraints lies in the convergence topology of the associated dual variables. Finally, this presentation extends results presented last year at PGMO, where convergence was proved under a strong convexity assumption and for control constraints of the form a <= u(t) <= b.
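As a one-dimensional illustration of the logarithmic penalty idea (a toy sketch, far from the OCP setting of the talk), minimizing f(u) subject to a <= u <= b can be approached by minimizing f(u) - mu*(log(u-a) + log(b-u)) for a decreasing sequence of barrier parameters mu:

```python
import math

def barrier_obj(f, u, a, b, mu):
    """Objective plus logarithmic barrier for the box constraint a < u < b."""
    return f(u) - mu * (math.log(u - a) + math.log(b - u))

def barrier_minimize(f, df, a, b, mu0=1.0, shrink=0.5, rounds=25, steps=100):
    """Log-barrier method: for each mu, run gradient descent with Armijo
    backtracking (kept strictly inside (a, b)), then shrink mu."""
    u, mu = (a + b) / 2, mu0
    for _ in range(rounds):
        for _ in range(steps):
            g = df(u) - mu / (u - a) + mu / (b - u)
            step = 1.0
            # backtrack: stay strictly interior and ensure sufficient decrease
            while (not (a < u - step * g < b)
                   or barrier_obj(f, u - step * g, a, b, mu)
                   > barrier_obj(f, u, a, b, mu) - 0.5 * step * g * g):
                step /= 2
                if step < 1e-16:
                    break
            if step < 1e-16:
                break
            u -= step * g
        mu *= shrink
    return u
```

As mu decreases, the barrier minimizers trace the central path toward the constrained solution; in the OCP setting the same penalties are applied along the whole trajectory, for both pure state and mixed constraints.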

15:30-17:30 Session 13H: Real world impact of optimization
Location: A1.128
Fisher Market Model based Resource Allocation for 5G Network Slicing
PRESENTER: Mandar Datar

ABSTRACT. Network slicing (NS) is a potential technology to support a higher degree of heterogeneity and flexibility required by next-generation services in 5G networks. One of the essential elements in network slicing is resource allocation, which needs to ensure slice tenants the protection of their service level agreements (SLAs) while optimizing network efficiency. We propose a resource-sharing scheme based on the Fisher market model and the Trading post mechanism that can achieve efficient resource utilization while catering to multi-level fairness, dynamic load conditions and SLA protection. In the proposed scheme, each service provider (SP) is given a budget representing its infrastructure share or purchasing power in the market. The SPs acquire different resources by spending their budgets to offer the service to different classes of users, which are classified according to their service needs and priorities. We assume that SPs employ the $\alpha$ fairness criteria while delivering the service to their subscribers. The proposed allocation scheme aims to find a market equilibrium (ME) that provides allocation and resource pricing whereby each SP's needs are met, and resources are fully utilized. We show that the ME solution problem can be formulated as a convex optimization problem whose primal and dual optimal solutions provide the equilibrium allocation and pricing. We build a privacy-preserving learning algorithm enabling SPs to reach the ME in a decentralized fashion. We theoretically evaluate the proposed allocation scheme's performance by comparing it with the Social Optimal and Static Proportional sharing schemes. Finally, we run extensive numerical simulations to assess the fairness and efficiency properties of the proposed scheme.
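The authors' learning algorithm is not spelled out in the abstract; as a self-contained illustration of computing a market equilibrium in a linear Fisher market, one can use proportional response dynamics, a classical decentralized iteration (a swapped-in textbook method, not the paper's scheme):

```python
def proportional_response(B, U, iters=500):
    """Fisher market with linear utilities: buyer i has budget B[i] and
    utility sum_j U[i][j]*x[i][j].  Each round, every buyer re-splits its
    budget in proportion to the utility each good contributed; prices are
    the total bids on each good.  Converges to the market equilibrium
    for linear utilities."""
    n, m = len(B), len(U[0])
    bids = [[B[i] / m] * m for i in range(n)]
    for _ in range(iters):
        prices = [sum(bids[i][j] for i in range(n)) for j in range(m)]
        alloc = [[bids[i][j] / prices[j] if prices[j] > 0 else 0.0
                  for j in range(m)] for i in range(n)]
        for i in range(n):
            gains = [U[i][j] * alloc[i][j] for j in range(m)]
            total = sum(gains)
            if total > 0:
                bids[i] = [B[i] * g / total for g in gains]
    prices = [sum(bids[i][j] for i in range(n)) for j in range(m)]
    return prices, bids
```

Each update only needs a buyer's own bids and utilities plus the public prices, which is the same decentralization property the proposed privacy-preserving algorithm aims for.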

Exact Algorithm for Decoupled Vehicle-to-Grid Problem

ABSTRACT. We study a Vehicle-to-Grid (V2G) smart charging optimization problem. The problem examines the perspective of an aggregator wishing to maximize profits by using the battery of a single electric vehicle to trade energy and ancillary services on a day-ahead market. The problem can be obtained as a relaxation of a more complex and operationally realistic multiple-vehicle problem.

Our work focuses on the development of an original, exact algorithm for optimally solving the considered problem in polynomial time. We present both mathematically rigorous proofs of optimality and an in-depth time-complexity analysis. The solution approach leverages a variant of forward dynamic programming.
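The exact algorithm itself is not given in the abstract; as a toy stand-in, a forward dynamic program over a discretized battery state of charge already captures the energy-arbitrage structure (the levels, power limit, and prices below are made-up example data, and ancillary services are omitted):

```python
def v2g_profit(prices, levels, p_max, soc0=0):
    """Forward DP over battery state of charge (discrete levels 0..levels-1):
    value[s] = best profit so far when ending at level s; each hour the
    level moves by at most p_max, selling energy (cash in) when it drops
    and buying (cash out) when it rises."""
    neg_inf = float("-inf")
    value = [neg_inf] * levels
    value[soc0] = 0.0
    for price in prices:
        new = [neg_inf] * levels
        for s, val in enumerate(value):
            if val == neg_inf:
                continue
            for s2 in range(max(0, s - p_max), min(levels, s + p_max + 1)):
                cash = (s - s2) * price  # sell if s2 < s, buy if s2 > s
                new[s2] = max(new[s2], val + cash)
        value = new
    return max(value)
```

With two hours priced 1 then 3 and a battery of two free levels, the DP buys low and sells high for a profit of 4; the polynomial running time comes from the state space being (hours) x (levels).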

The work was performed as part of an M2 internship in collaboration between EDF and École des Mines Saint Étienne, and was funded by PGMO. It is part of the PGMO Vehicle-to-Grid project.

Multidimensional analysis for the techno-economic study of the CSP plant
PRESENTER: Daniel Lasluisa

ABSTRACT. The profitability of a concentrated solar plant (CSP) depends on the type of storage system and its daily operation. In addition, this profitability can be evaluated using different economic criteria, such as the minimum time to recover the initial investment, the total profit at the end of the project, and others. Given a solar field and a power block, our aim in this work is to determine an optimal design of the storage system for the CSP. We consider two types of storage technology, namely a thermochemical reactor and a two-tank molten salt system; two different production strategies; four different economic criteria; and three price scenarios for the electricity market. For each such combination (called a profile), our model determines the optimal design of the CSP. However, contrary to classical economic evaluations in the literature, in our approach an optimal design consists not only of the optimal values of the physical design of the storage system but also of the optimal operations of the plant (production, storage, discharge). Thanks to this model and the obtained results, we are able to exhibit conditions under which thermochemical storage is economically more pertinent than two-tank storage, as well as situations ensuring an economic equilibrium of the CSP without subsidy.

Moving Horizon Estimation for Anesthesia dynamics
PRESENTER: Kaouther Moussa

ABSTRACT. During surgery, the anesthesiologists use different types of anesthetic agents in order to induce and maintain the depth of unconsciousness, the absence of movement and the absence of pain [1]. The level of hypnosis is usually measured via processed variables such as the Bispectral Index (BIS) which takes values in the interval [0, 100]. The range [40, 60] characterizes a moderate hypnosis state and the value 50 is generally considered as suitable for surgery. In order to monitor the BIS signal, different models are used, involving pharmacokinetic (PK) and pharmacodynamic (PD) parameters. Contrary to PK parameters, the PD parameters do not have standard expressions in terms of patient information. According to [2], the uncertainty introduced by the PD parameters shows a more significant influence on the measurable outputs than the one introduced by the PK parameters.

Therefore, this presentation will focus on an extended moving horizon observer [4] to estimate both the states and the pharmacodynamic (PD) parameters of an anesthesia model, based on real data. The inputs of this model are the injection rates of Propofol and Remifentanil. The states represent the concentration of the anesthetic agents in different compartments of the human body (muscles, fat, blood) and in the effect site. The considered output is the Bispectral index (BIS), which is derived from the electroencephalogram (EEG). The observer is designed such that the parameters are estimated during the anesthesia induction phase, and then almost frozen for the rest of the surgery. The estimator is validated on real data extracted from the VitalDB database [3].

[1] Ionescu, C. M., Neckebroek, M., Ghita, M., and Copot, D. An open source patient simulator for design and evaluation of computer based multiple drug dosing control for anesthetic and hemodynamic variables. IEEE Access 9 (2021), 8680-8694.

[2] Krieger, A., Panoskaltsis, N., Mantalaris, A., Georgiadis, M. C., and Pistikopoulos, E. N. Modeling and analysis of individualized pharmacokinetics and pharmacodynamics for volatile anesthesia. IEEE Transactions on Biomedical Engineering 61, 1 (2014), 25-34.

[3] Lee, H. C., Park, Y., Yoon, S. B., Yang, S. M., Park, D., and Jung, C. W. VitalDB, a high-fidelity multi-parameter vital signs database in surgical patients. Sci Data 9, 1 (Jun 2022), 279.

[4] Moussa, K., Aubouin-Pairault, B., Alamir, M., and Dang, T. Data-based extended moving horizon estimation for MISO anesthesia dynamics. IEEE Control Systems Letters 7 (2023), 3054-3059.