PGMODAYS 2024
PROGRAM FOR TUESDAY, NOVEMBER 19TH


08:30-09:00 Coffee Break
09:20-10:10 Session 2: Plenary I: Edouard Grave
09:20
Moshi: a real-time spoken dialogue model

ABSTRACT. In this talk, I will present Moshi, a joint speech-text foundation model and full-duplex spoken dialogue system.

Current systems for spoken dialogue rely on pipelines of independent components, namely voice activity detection, speech recognition, textual dialogue and text-to-speech. Such frameworks cannot emulate the experience of real conversations. First, their complexity induces a latency of several seconds between interactions. Second, text being the intermediate modality for dialogue, non-linguistic information that modifies meaning -- such as emotion or non-speech sounds -- is lost in the interaction. Finally, they rely on a segmentation into speaker turns, which does not take into account overlapping speech, interruptions and interjections.

Moshi solves these independent issues altogether by casting spoken dialogue as speech-to-speech generation. Starting from a text language model, Moshi generates speech as tokens from the quantizer of a neural audio codec, and separately models its own speech and that of the user into parallel streams. This allows for the removal of explicit speaker turns, and the modeling of arbitrary conversational dynamics. We extend the hierarchical semantic-to-acoustic token generation of previous work, by predicting time-aligned text tokens as a prefix to audio tokens. Our resulting model is the first real-time full-duplex spoken large language model, with a latency of around 200 ms in practice.

10:20-10:50 Session 4: PGMO Phd Prize, part I: Antoine Oustry
10:20
Global optimization of nonlinear semi-infinite programming problems: Applications in power systems and control

ABSTRACT. Computing global optima of nonlinear semi-infinite programming problems is difficult due to the infinite number of constraints and their (potential) nonconvexity. This thesis makes theoretical and practical contributions to address this challenge, focusing on applications to the optimization of electrical grids and the control of dynamical systems. The first part is devoted to convex semi-infinite programming. First, we exhibit a convergence rate for the cutting-plane algorithm when the semi-infinite constraints are linear and the objective function is strongly convex. Second, we address a limitation of the cutting-plane algorithm: feasibility is achieved only asymptotically. Focusing on semi-infinite programs with a quadratic separation problem, we propose an iterative inner-outer approximation algorithm that generates a globally convergent sequence of feasible points. The third chapter presents an application of convex semi-infinite programming to minimal-time control. The second part deals with the global optimization of finite and semi-infinite nonconvex optimization problems related to the dispatch of electricity. We tackle the standard AC optimal power flow problem, a quadratically constrained quadratic programming problem with a finite number of constraints. Our global optimization algorithm, based on a strengthened semidefinite programming relaxation and piecewise linear approximations, achieves state-of-the-art performance on a reference benchmark. We address the numerical issues raised by this semidefinite programming relaxation for instances at a larger scale, by solving an unconstrained dual formulation with a bundle method to obtain certified lower bounds. Our numerical experiments on large-scale instances show significant improvement of the dual bounds.
Finally, we consider some uncertainties in the AC power flow model and solve the resulting semi-infinite programming problem with an adaptive discretization algorithm with global convergence guarantees.

10:50-11:20 Coffee Break
11:20-12:50 Session 5A: Invited session: Quantum Computing for Combinatorial Optimization
11:20
Analog Quantum Computing for Combinatorial Optimization Problems

ABSTRACT. Coherent and programmable arrays of neutral atoms provide a natural platform for solving combinatorial optimization problems, as they can be precisely manipulated using lasers and external fields, allowing for fine control over their quantum states. In the context of graph-structured problems, the atoms can represent the nodes, interactions between atoms encode the cost functions, and the system's natural evolution can drive it toward a quantum state that represents the (near-)optimal solution.

Different pairs of electronic levels can effectively encode a two-level system (or qubit). Here, we focus on the case where these two states are the electronic ground state $\ket{g}\equiv\ket{0}$ and an $s$-Rydberg level $\ket{r}\equiv\ket{1}$. In this setup, atoms can be arbitrarily positioned in space, such that the effective Hamiltonian of the system at time $t$ is expressed as: $$ H(t) = \Omega(t)\sum_{u=1}^{|V|}\hat{\sigma}^x_u - \Delta(t)\sum_{u=1}^{|V|}\hat{n}_u +\sum_{u<v=1}^{|V|} U_{uv} \hat{n}_u \hat{n}_v, $$ where the amplitude (giving the Rabi frequency) $\Omega(t)$ and detuning $\Delta(t)$ of the laser can be controlled over time, and the interaction strength $U_{uv}\propto\left|{\bf r}_u-{\bf r}_v\right|^{-6}$ is a function of the distance between atom~$u$ and atom~$v$. This choice of energy levels has already been employed in encoding Quadratic Unconstrained Binary Optimization (QUBO) problems, proving successful in several cases.
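For a handful of atoms, a Hamiltonian of this form is straightforward to instantiate numerically. The sketch below builds its dense matrix in the computational basis; the function name, the unit interaction constant `C6`, and the two-atom example are illustrative choices of ours, not part of the talk.

```python
import itertools, math

def rydberg_hamiltonian(positions, omega, delta, C6=1.0):
    """Dense matrix of H = Omega*sum_u sigma^x_u - Delta*sum_u n_u
    + sum_{u<v} U_uv n_u n_v in the computational basis, with
    U_uv = C6 / |r_u - r_v|^6. Illustrative sketch for small N only."""
    N = len(positions)
    dim = 2 ** N
    H = [[0.0] * dim for _ in range(dim)]
    for state in range(dim):
        bits = [(state >> u) & 1 for u in range(N)]
        # diagonal part: detuning plus van der Waals interaction
        diag = -delta * sum(bits)
        for u in range(N):
            for v in range(u + 1, N):
                if bits[u] and bits[v]:
                    diag += C6 / math.dist(positions[u], positions[v]) ** 6
        H[state][state] = diag
        # off-diagonal part: sigma^x flips one atom at a time
        for u in range(N):
            H[state][state ^ (1 << u)] += omega
    return H

# Two atoms at unit distance: both-excited state pays the interaction cost.
H = rydberg_hamiltonian([(0.0, 0.0), (1.0, 0.0)], omega=0.5, delta=1.0)
```

Embedding a QUBO then amounts to choosing the positions (hence the $U_{uv}$) and the detuning to match the quadratic cost.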

In recent years, the neutral atom technology has demonstrated significant theoretical and experimental advancements, positioning itself as a front-runner platform for running quantum algorithms. One unique advantage of this technology is the ability to arbitrarily reconfigure the geometry of the qubit register from shot to shot, enabling native embedding of graph-structured problems at the hardware level. This capability has profound implications for solving complex optimization and machine learning tasks. By driving qubits, quantum states are generated that retain the complex properties of graphs, providing either direct solutions or serving as resources in hybrid quantum-classical schemes.

Here, we review recent advancements in quantum algorithms for graph problems running on neutral atom Quantum Processing Units (QPUs). We also discuss newly introduced embedding and problem-solving techniques and outline ongoing hardware improvements, with a focus on increasing the scalability, controllability, and computation repetition rate of neutral atom QPUs.

11:50
Quantum Computing for Integer Quadratic Programming

ABSTRACT. Quantum Computing for Integer Quadratic Programming

12:20
Resolution of a 9-node graph coloring instance using a QAOA-based approach
PRESENTER: Philippe Lacomme

ABSTRACT. The graph coloring problem involves assigning a color to each vertex in a given graph G(V,E) so that no two adjacent vertices share the same color, while minimizing the total number of colors used. Graph coloring is one of the oldest topics in graph theory. Such an approach has been successfully applied to the TSP [1] and the Flow-Shop [2] using the subexceedant function [3], which gives a decomposition in the factorial base. Here, we introduce a rank representation that can be converted into a color assignment for the nodes, resulting in a valid graph coloring; qubit strings are used to model the rank.
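To fix intuition on rank-to-coloring decoding, here is a toy sketch: a subexceedant-style vector (rank[i] in {0,...,i}) selects, at each node, one of the colors still allowed by already-colored neighbors. This is a guess at the flavor of the encoding, not the authors' exact construction.

```python
def rank_to_coloring(rank, adjacency):
    """Decode a subexceedant-style rank vector into a proper coloring:
    each entry picks among the colors not used by already-colored
    neighbors. Illustrative sketch, not the paper's exact encoding."""
    colors = {}
    for v, r in enumerate(rank):
        forbidden = {colors[u] for u in adjacency[v] if u in colors}
        allowed = [c for c in range(len(rank)) if c not in forbidden]
        colors[v] = allowed[r % len(allowed)]
    return colors

# Triangle K3: every rank vector decodes to a proper coloring.
coloring = rank_to_coloring([0, 0, 0], {0: [1, 2], 1: [0, 2], 2: [0, 1]})
```

Any qubit string measured from the QAOA circuit can then be interpreted as such a rank vector, so every measurement yields a feasible coloring.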

11:20-12:50 Session 5B: Invited session: New trends in bilevel optimization
11:20
BOBILib: Bilevel Optimization (Benchmark) Instance Library
PRESENTER: Martin Schmidt

ABSTRACT. In this talk, we present the BOBILib, a collection of more than 2500 instances of mixed-integer linear bilevel optimization problems. The goal of this library is to make a large and well-curated set of test instances freely available for the research community so that new and existing algorithms in bilevel optimization can be tested and compared in a standardized way. The library is sub-divided into instances of different types and also contains a benchmark instance set. Moreover, we present a new data format for mixed-integer linear bilevel problems that is less error-prone compared to an older format that will now be deprecated. We provide numerical results for all instances of the library using available bilevel solvers. Based on these numerical results, we select the benchmark instance set, which provides a meaningful basis for experimental comparisons of solution methods in a moderate time. Each instance, together with a solution file if a feasible point or an optimal solution is known, can be downloaded at https://bobilib.org. This is joint work with Johannes Thürauf, Thomas Kleinert, Ivana Ljubic, and Ted Ralphs.

11:50
Random Cell Exploration and its applications to linear bilevel programming
PRESENTER: David Salas

ABSTRACT. In this talk, we present a new heuristic to deal with piecewise linear problems, which we call the Random Cell Exploration method (RCEM). The method is based on the principle that the piecewise structure is in general hard to compute, but that, for (almost) any fixed point, the region containing the point where the problem is linear may be tractable to determine. This work is motivated by a previous contribution (Muñoz, Salas & Svensson, 2024), where piecewise linearity was exploited for stochastic linear bilevel programming. A preliminary form of RCEM was developed there, by regarding the stochastic linear bilevel problem as a particular case of the Bayesian approach of (Salas & Svensson, 2023), first introduced by Mallozzi and Morgan in (Mallozzi, Morgan 1996). This work has been developed together with A. Svensson (Universidad de O'Higgins), G. Muñoz (Universidad de Chile), and Gonzalo Contador (Universidad Técnica Federico Santa María).

12:20
SDP Relaxations for polynomial Single-Leader-Multi-Follower Games
PRESENTER: Didier Aussel

ABSTRACT. A single-leader-multi-follower game (SLMFG) is an extension of a bilevel optimization problem, in which the interaction between only one leader and one follower is considered. In this work we show that the powerful reformulation technique of describing SLMFGs as equivalent so-called Mathematical Programs with Complementarity Constraints (MPCC) extends more generally to SLMFGs where the constraint functions are neither convex nor even quasiconvex, and present an equivalent first-order reformulation in terms of concatenated Karush-Kuhn-Tucker systems as well as an MPCC reformulation. We provide simple examples illustrating the significance of our assumptions and reformulations. In the case where the functions involved in the games are polynomials, we establish that the global optimal value of SLMFGs is the limiting optimal value of a sequence of semidefinite program relaxations under suitable conditions. This is obtained by equivalently transforming the follower games into systems of polynomial constraints and employing the celebrated Putinar Positivstellensatz to derive semidefinite program relaxations. We also show by examples how polynomial SLMFGs can be solved via semidefinite programming methods.

11:20-12:50 Session 5C: Invited session: New trends in Optimal Transport and Applications, Part I
11:20
Particle Methods for High-dimensional Multi-marginal Optimal transport

ABSTRACT. Optimal transport offers a convenient way to model some risk estimation problems, the goal of which is to find a joint probability measure maximizing a certain cost that belongs to the class of spectral risk measures. These costs are (highly) nonlinear, since the quantity of interest is not the mean risk, but rather a weighted average giving greater weight to the highest dangers. We focus on a specific spectral risk measure called expected shortfall, or conditional value at risk, namely the average danger over the worst fraction q of cases, with 0<q<1. For this cost, the problem reads as a partial multi-marginal optimal transport problem, which we study in the case of unidimensional, compactly supported, absolutely continuous marginals. A key result in transport theory is that, under certain conditions, solutions are supported on a low-dimensional set (typically, the graph of a function). This greatly motivates the idea of numerically solving high-dimensional transport problems using particle methods. Roughly speaking, instead of looking for an approximate solution supported on a fixed grid, where the weights of the various points are to be optimized, we restrict ourselves to measures that are linear combinations of a fixed number of Dirac masses, the positions of which we optimize. In the former method, most Dirac masses get zero weight due to the low-dimensional support of the solution, considerably reducing the quality of the discretization for a given grid size. On the other hand, the Lagrangian approach of the latter method ensures that the vast majority of Dirac masses actively contribute to the approximate solution. The particle-method discretized problem can be shown to Gamma-converge to the original problem as the number of Dirac masses tends to infinity. We also address the issue of dealing with the constraints on the marginals within the framework of this particle method.
As we have a partial transport problem, these are inequality constraints, and we take them into account by adding a penalization term to the objective function. The latter term consists of the sum of the squared 1-dimensional partial Wasserstein transport costs from each marginal to the (fixed) probability density by which it should be bounded in the original problem. We briefly explain why a 2-dimensional regularization seems to be necessary to numerically compute these 1-dimensional partial transport costs.
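A toy 1-D version of this Lagrangian discretization can be written in a few lines: equal-weight Dirac masses whose positions follow the cost gradient plus a penalty pulling the empirical measure toward a target sample (here via the 1-D sorted matching, which is the optimal coupling). This is a drastic simplification of the talk's scheme (balanced, equal weights, one marginal), meant only to illustrate the particle idea.

```python
def w2_sq_empirical(xs, ys):
    """Squared 2-Wasserstein distance between two equally weighted 1-D
    samples of the same size: sort both and match in order."""
    xs, ys = sorted(xs), sorted(ys)
    return sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)

def particle_step(xs, target, cost_grad, lr=0.05, pen=10.0):
    """One Lagrangian descent step on particle positions: cost gradient
    plus the gradient of a squared-W2 penalty toward the target sample.
    Illustrative sketch (single marginal, equal weights)."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    tsorted = sorted(target)
    grads = [0.0] * n
    for rank, i in enumerate(order):
        grads[i] = cost_grad(xs[i]) + pen * 2.0 * (xs[i] - tsorted[rank]) / n
    return [x - lr * g for x, g in zip(xs, grads)]
```

Iterating `particle_step` with a zero cost gradient simply transports the particles onto the target sample, which is the mechanism the penalty term exploits.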

11:50
Asymptotic log-Sobolev constants and the Polyak-Łojasiewicz gradient domination condition
PRESENTER: Austin Stromme

ABSTRACT. The Polyak-Łojasiewicz (PL) constant for a given function exactly characterizes the rate of convergence of gradient flow uniformly over initializations, and has been of intense recent interest in optimization and machine learning because it is strictly weaker than strong convexity yet implies many of the same results. In the world of sampling, the log-Sobolev inequality plays a similar role, governing the convergence of Langevin dynamics from arbitrary initialization in Kullback-Leibler divergence. In this talk, we present ongoing work which establishes a new connection between optimization and sampling by showing that the PL constant is exactly the small temperature limit of the log-Sobolev constant, under mild assumptions. In the course of our proof, we also prove some non-asymptotic estimates for the log-Sobolev and Poincaré constants of PL functions which may be of independent interest.
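For orientation, the two inequalities being compared can be written in one common normalization; the constants below may differ from the speakers' convention by fixed factors, and the limit statement is paraphrased from the abstract.

```latex
% Polyak-Lojasiewicz (PL) inequality with constant \mu > 0:
\frac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x)-\min f\bigr)
\qquad \text{for all } x.
%
% Log-Sobolev inequality (LSI) with constant \rho > 0 for the Gibbs
% measure \pi_{\varepsilon} \propto e^{-f/\varepsilon}:
\mathrm{KL}(\nu \,\|\, \pi_{\varepsilon})
\;\le\; \frac{1}{2\rho}\, I(\nu \,\|\, \pi_{\varepsilon})
\qquad \text{for all probability measures } \nu,
% where I(\cdot\,\|\,\cdot) denotes the relative Fisher information.
```

The talk's result identifies the PL constant of $f$ as the small-temperature ($\varepsilon \to 0$) limit of the (suitably rescaled) log-Sobolev constant of $\pi_{\varepsilon}$, under mild assumptions.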

12:20
Slicing Unbalanced Optimal Transport

ABSTRACT. Optimal transport (OT) is a powerful framework to compare probability measures, a fundamental task in many statistical and machine learning problems. Substantial advances have been made in designing OT variants which are either computationally and statistically more efficient or robust. Among them, sliced OT distances have been extensively used to mitigate optimal transport's cubic algorithmic complexity and curse of dimensionality. In parallel, unbalanced OT was designed to allow comparisons of more general positive measures, while being more robust to outliers.

In this talk, we bridge the gap between those two concepts and develop a general framework for efficiently comparing positive measures. We notably formulate two different versions of sliced unbalanced OT, and study the associated topology and statistical properties. We then develop a GPU-friendly Frank-Wolfe like algorithm to compute the corresponding loss functions, and show that the resulting methodology is modular as it encompasses and extends prior related work. We finally present an empirical analysis of our loss functions and methodology on both synthetic and real datasets, to illustrate their computational efficiency, relevance and applicability to real-world scenarios including geophysical data.
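The slicing mechanism underlying this line of work reduces a d-dimensional comparison to many 1-D ones, each solvable by sorting. The sketch below is the plain balanced sliced squared-W2 for equal-size 2-D point clouds; the unbalanced variants discussed in the talk additionally penalize mass creation and destruction, which is not modeled here.

```python
import math, random

def sliced_w2_sq(X, Y, n_proj=64, seed=0):
    """Monte-Carlo estimate of the sliced squared-W2 between two
    equal-size 2-D point clouds: project on random directions and
    average the 1-D sorted-matching cost. Balanced case only."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.uniform(0.0, math.pi)
        d = (math.cos(theta), math.sin(theta))
        px = sorted(x[0] * d[0] + x[1] * d[1] for x in X)
        py = sorted(y[0] * d[0] + y[1] * d[1] for y in Y)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(X)
    return total / n_proj
```

Each projection costs O(n log n), which is the source of the scalability the abstract mentions relative to the cubic cost of generic OT solvers.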

11:20-12:50 Session 5D: Optimal control
11:20
Continuous-Time Optimal Control Problem under Signal Temporal Logic Constraints
PRESENTER: En Lai

ABSTRACT. Signal temporal logic (STL) is a formal language specifying time-dependent properties of continuous signals in dynamic systems, such as sequentially triggered events, safety and stability guarantees. It allows expressing complex mission requirements, such as temporal constraints, in optimal control problems for robotics, self-driving cars, and cyber-physical systems. Current methods discretize and approximate these problems and solve them using mixed-integer linear programming or successive convexification algorithms. None of these frameworks, however, provides a rigorous justification of the convergence of the discretized, approximate problem towards the original one, specified by the non-local and non-smooth STL constraints. In this talk, we introduce a novel approach for solving continuous-time optimal control problems under STL constraints that guarantees robustness. Specifically, our approach introduces auxiliary variables according to the STL constraints, which are used to augment the system's state and extend its dynamics. Based on these auxiliary variables, auxiliary functions are defined to construct state constraints, so that the original system's behavior, as specified by the STL, is accurately represented through an augmented system with state constraints. We therefore directly solve the optimal control problem of the augmented system with state constraints, instead of the original problem with STL constraints. In addition, we theoretically and numerically demonstrate the feasibility of this strategy, as well as the convergence of its solutions to those of the original problem.
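As background, the quantitative (robustness) semantics of the two basic temporal operators can be sketched on a sampled signal: "always" takes the worst margin over the time window, "eventually" the best. This is one standard discrete-sample semantics, not the continuous-time construction of the talk.

```python
def rho_always(signal, a, b, margin):
    """Robustness of G_[a,b](margin >= 0) on a sampled signal
    [(t, x), ...]: the worst predicate margin over the window."""
    return min(margin(x) for t, x in signal if a <= t <= b)

def rho_eventually(signal, a, b, margin):
    """Robustness of F_[a,b](margin >= 0): the best margin
    over the window."""
    return max(margin(x) for t, x in signal if a <= t <= b)

sig = [(0.0, 0.2), (0.5, 0.8), (1.0, 1.5)]
# "stay below 2 on [0,1]" and "reach at least 1 on [0,1]" both hold
# with robustness margin 0.5 on this signal.
```

The min/max structure is exactly what makes STL constraints non-smooth and non-local, which the augmented-state construction in the talk is designed to handle.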

11:50
Extensions of KL class and Lyapunov Functions for the Maximization of Discrete-time Dynamical System States

ABSTRACT. We present new approaches to solve a maximization problem whose constraint set is the reachable value set of a discrete-time dynamical system. The new approaches adapt tools from dynamical-systems stability theory: the KL class of functions (part of the family of comparison functions) and Lyapunov functions.

12:20
Continuous-time Optimal Control for Trajectory Planning of Autonomous Vehicles under Uncertainty
PRESENTER: Ange Valli

ABSTRACT. In this paper, we present a continuous-time optimal control framework for generating reference trajectories for autonomous vehicles. The method involves chance constraints for modelling uncertainty. A previous work presented such a model in discrete time, designed for urban driving scenarios only. We extend those results to continuous time: our method generates reference trajectories for urban driving scenarios with faster computation and a better capacity to capture uncertainty. Our model is less likely to violate the problem's constraints in risky scenarios, and is also robust for optimal control over long horizons on national roads and highways.

11:20-12:50 Session 5E: Aspects in multi-objective and bi-level optimization
11:20
Block coordination of nonlinear networks and discrete optimization

ABSTRACT. We propose to exploit the strong duality property of nonlinear network flows to manage the synchronization constraints in discrete bi-level optimization problems such as pump scheduling in water distribution or network design in traffic planning.

11:50
On the Kalai-Smorodinsky solutions for Bi-objective Spanning Tree Problem

ABSTRACT. This work explores the Bi-objective Spanning Tree Problem (BSTP), an extension of the classic Minimum Spanning Tree problem with a variety of practical applications. In the BSTP, each edge of a given graph has two distinct weights, and the goal is to find a spanning tree that minimizes both total weights simultaneously. Since BSTP is a specific case of bi-objective optimization, one common interest is enumerating all Pareto-optimal solutions. However, as BSTP is known to be NP-hard, no existing methods can enumerate these solutions within polynomial time.

In this study, we introduce a novel approach for identifying preferred Pareto-optimal solutions. Our method seeks to minimize the worst ratio of the two objective values relative to their respective maximum possible values. This approach draws on the well-established Kalai-Smorodinsky (KS) solutions concept from cooperative game theory. Specifically, we extend this concept to the discrete case, where solutions can be obtained by optimizing convex combinations of the two objectives.

We begin by introducing the concept of KS solutions for the BSTP and characterizing its properties. We then propose a weakly polynomial time algorithm to compute these KS solutions for the BSTP. Finally, we present computational results on several instances and discuss the findings. Beyond its application to the BSTP, this work provides an efficient and interpretable method for solving bi-objective combinatorial optimization problems.
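On a tiny instance, the KS-style selection rule just described can be checked by brute force: enumerate spanning trees, normalize each objective by its worst value over all trees, and keep the tree minimizing the larger normalized objective. The normalization and tie-breaking below are our illustrative reading of the rule, not the authors' algorithm (which is polynomial, unlike this enumeration).

```python
import itertools

def spanning_trees(n, edges):
    """Enumerate spanning trees of a small graph by brute force:
    try every (n-1)-edge subset and keep the acyclic connected ones
    (union-find check). edges[i] = (u, v, w1, w2)."""
    for subset in itertools.combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        ok = True
        for i in subset:
            ru, rv = find(edges[i][0]), find(edges[i][1])
            if ru == rv:
                ok = False
                break
            parent[ru] = rv
        if ok:
            yield subset

def ks_tree(n, edges):
    """Pick the tree minimizing max_k w_k(T)/W_k, where W_k is the
    largest value of objective k over all trees (a KS-style rule;
    illustrative normalization)."""
    trees = list(spanning_trees(n, edges))
    totals = [(sum(edges[i][2] for i in T), sum(edges[i][3] for i in T))
              for T in trees]
    W1 = max(t[0] for t in totals)
    W2 = max(t[1] for t in totals)
    best = min(range(len(trees)),
               key=lambda j: max(totals[j][0] / W1, totals[j][1] / W2))
    return trees[best], totals[best]
```

On a triangle with weight pairs (1,3), (3,1), (2,2), the rule selects the balanced tree with totals (4,4) over the two lopsided alternatives.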

12:20
Sampling and Estimating the Set of Pareto Optimal Solutions in Stochastic Multi-Objective Optimization
PRESENTER: Zachary Jones

ABSTRACT. The framework of stochastic multi-objective programming allows for the inclusion of uncertainties in multi-objective optimization problems, at the cost of transforming the set of objectives into a set of expectations of random quantities. Within the family of Robbins-Monro algorithms, the stochastic multi-gradient algorithm (SMGDA) solves these problems without ever having to calculate the expected values of the objectives or their gradients [1]. However, a bias in the algorithm causes it to converge to only a subset of the whole Pareto front, limiting its use. To sample along the whole Pareto front, we reduce the bias of the stochastic multi-gradient calculation using an exponential smoothing technique, and promote the exploration of the Pareto front by adding non-vanishing noise tangential to the front.

Then, given a set of points which are potentially located on the Pareto front of a stochastic multi-objective optimization program how can we assess dominance? Working only with previously collected minima, the primary challenges to overcome are to perform probabilistic comparison between points with no knowledge of the true value of the expected quantity of interest and to infer the design points that correspond to likely undominated points in objective space. We solve this problem through a dual modeling approach, building a nearest neighbor model to approximate the objectives and a probabilistic model in design space to assess their likely dominance.

We prove that our algorithm, Stochastic Tangential Pareto Dynamics (STPD), generates samples in a concentrated set containing the whole Pareto front, and we analyze the performance of our postprocessing method as the sample size, noise level, and dimension of the underlying stochastic multi-objective program change in toy problems.
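The smoothing idea can be illustrated in one dimension with two objectives, where the min-norm common descent direction has a closed form. The sketch below is our simplification (no tangential exploration noise, illustrative parameters), not the STPD algorithm itself.

```python
import random

def smoothed_smg(grad1, grad2, x0, steps=3000, lr=0.01, beta=0.1,
                 noise=0.05, seed=1):
    """1-D sketch of a stochastic multi-gradient step with exponential
    smoothing of the common descent direction. The closed-form min-norm
    convex combination below is specific to two objectives."""
    rng = random.Random(seed)
    x, m = x0, 0.0
    for _ in range(steps):
        g1 = grad1(x) + rng.gauss(0.0, noise)
        g2 = grad2(x) + rng.gauss(0.0, noise)
        denom = (g1 - g2) ** 2
        a = 0.5 if denom == 0.0 else min(1.0, max(0.0, g2 * (g2 - g1) / denom))
        d = a * g1 + (1.0 - a) * g2        # min-norm convex combination
        m = (1.0 - beta) * m + beta * d    # exponential smoothing
        x -= lr * m
    return x

# Pareto set of f1(x) = x^2 and f2(x) = (x-1)^2 is the interval [0, 1];
# the iterate settles inside (or very near) that interval.
x = smoothed_smg(lambda x: 2 * x, lambda x: 2 * (x - 1), x0=3.0)
```

Inside the Pareto set the two noisy gradients have opposite signs, the min-norm combination is near zero, and the smoothed direction damps the residual noise, which is the bias-reduction mechanism the abstract refers to.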

11:20-12:50 Session 5F: Scenarios methods in stochastic control and applications
11:20
Scenario Tree Reduction via Wasserstein Barycenters
PRESENTER: Daniel Mimouni

ABSTRACT. Scenario tree reduction techniques are essential for achieving a balance between an accurate representation of uncertainties and computational complexity for solving multistage stochastic programming problems. In the realm of available techniques, the Kovacevic and Pichler algorithm (Ann. Oper. Res., 2015) stands out for employing the nested distance, a metric suited for comparing multistage scenario trees. However, dealing with large-scale scenario trees can lead to a prohibitive computational burden due to the algorithm's requirement of solving a large-scale linear problem per stage and iteration. This study concentrates on efficient approaches to solving such linear problems, recognizing that their solutions are Wasserstein barycenters of the tree nodes' probabilities on a given stage. We leverage advanced optimal transport techniques to compute Wasserstein barycenters and significantly improve the computational performance of the Kovacevic and Pichler algorithm. Our boosted variants of this algorithm are benchmarked on several multistage scenario trees. Our experiments show that compared to the original scenario tree reduction algorithm, our variants can be eight times faster for reducing scenario trees with 4 stages, 78125 scenarios, and 97656 nodes.
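In one dimension with equal-size, equally weighted empirical measures, the W2 barycenter has a closed form: average the quantile functions, i.e. the sorted samples. The general discrete barycenters inside the Kovacevic-Pichler algorithm need OT solvers, but this special case conveys the object being computed.

```python
def w2_barycenter_1d(samples_list, weights):
    """2-Wasserstein barycenter of equal-size, equally weighted 1-D
    empirical measures: weighted average of the sorted samples
    (quantile functions). weights must sum to 1."""
    sorted_lists = [sorted(s) for s in samples_list]
    n = len(sorted_lists[0])
    return [sum(w * s[i] for w, s in zip(weights, sorted_lists))
            for i in range(n)]

# Barycenter of {0,1,2} and {2,3,4}: the midpoint distribution {1,2,3}.
bary = w2_barycenter_1d([[0, 1, 2], [2, 3, 4]], [0.5, 0.5])
```

Sorting first is essential: the barycenter averages quantiles, not raw sample orderings.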

11:50
On problem-based scenario reductions
PRESENTER: Benoît Tran

ABSTRACT. In stochastic optimization, a random variable is often modeled as a discrete probability distribution with n atoms. The number of atoms is a common bottleneck for solving an associated stochastic optimization problem.

Scenario reduction aims to construct an approximate discrete probability distribution with m << n atoms. With the downstream optimization problem in mind, scenario reduction differs from usual clustering methods, like k-means, in that it aims to keep the value of the stochastic problem with the reduced distribution close to the original optimal value.

Recent works (Bertsimas and Mundru 2023, Keutchayan et al. 2023) build problem-based metrics to evaluate distances between scenarios. In these works, costly problem-based metrics are built by solving every deterministic problem obtained by fixing the realization of the random variable. Scenario reduction is then done using these metrics. These works have shown that problem-based metrics allow scenario reduction methods to provide better reduced discrete distributions.

We propose to iteratively compute and refine problem-based metrics while solving more and more relevant stochastic problems with reduced distributions. Thus, we alleviate the cost of computing such metrics while retaining the quality of the reduced distributions after scenario reduction.

We will present these old and new methods and their stability properties, along with extensive numerical comparisons.
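As a baseline for comparison, here is the classic distance-based (not problem-based) reduction heuristic: greedily keep m atoms minimizing the probability-weighted distance to the rest, then reassign each dropped atom's mass to its nearest kept atom. The point of the talk is precisely to replace the Euclidean metric in this sketch with problem-based ones.

```python
def reduce_scenarios(atoms, probs, m):
    """Greedy forward reduction of a 1-D discrete distribution to m
    atoms: pick kept atoms minimizing the weighted distance of all
    scenarios to the kept set, then push each dropped atom's
    probability to its nearest kept atom. Distance-based baseline."""
    kept, remaining = [], list(range(len(atoms)))
    for _ in range(m):
        best, best_cost = None, None
        for c in remaining:
            trial = kept + [c]
            cost = sum(probs[i] * min(abs(atoms[i] - atoms[k]) for k in trial)
                       for i in remaining)
            if best_cost is None or cost < best_cost:
                best, best_cost = c, cost
        kept.append(best)
        remaining.remove(best)
    new_probs = {k: 0.0 for k in kept}
    for i in range(len(atoms)):
        nearest = min(kept, key=lambda k: abs(atoms[i] - atoms[k]))
        new_probs[nearest] += probs[i]
    return [(atoms[k], new_probs[k]) for k in kept]

# Two clusters of scenarios collapse to one representative each.
reduced = reduce_scenarios([0.0, 0.1, 5.0, 5.1], [0.25] * 4, 2)
```

A problem-based variant would swap `abs(atoms[i] - atoms[k])` for the difference of deterministic problem values at the two scenarios.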

12:20
Scenario-Based Decomposition for Optimal Power Curtailment with Priority-Level Producers
PRESENTER: Ksenia Syrtseva

ABSTRACT. The use of renewable energy sources (RES) is becoming increasingly important as the world transitions to a more sustainable and environmentally friendly energy system. However, the variability and intermittency of RES present significant challenges to the efficient and reliable operation of power systems. One mechanism for addressing these challenges is power curtailment, which is initiated by Distribution System Operators (DSOs) based on the priority levels assigned to producers through their contract type. In this work, we propose a novel formulation of the power curtailment problem incorporating producers' curtailment priority levels and associated rules, modeled with the use of binary variables. Uncertainties related to RES production and consumption are addressed through a probability maximization problem, leading to stochastic alternating-current power flow equations. The resulting optimization problem features two types of nonconvexity: one related to binary variables and the other arising from alternating-current power flow equations. By employing a scenario decomposition method, we can separate these issues and break down the stochastic component into deterministic alternating-current optimal power flow (AC-OPF) problems, one per scenario. This approach results in a mixed-integer linear programming (MILP) problem that captures prioritization rules and parallelizable AC-OPFs that ensure technical feasibility. The proposed methodology is tested in a 33-bus distribution network for several dual update rules (ADMM, proximal Lagrangian) aimed at ensuring consensus for different scenarios.

11:20-12:50 Session 5G: Invited Session : Dynamical systems and algorithms with applications in optimization, sampling and variational analysis - part I
11:20
Global Optimization Algorithm through High-Resolution Sampling
PRESENTER: Daniel Cortild

ABSTRACT. We present an optimization algorithm that can identify a global minimum of a potentially nonconvex function with high probability, assuming the Gibbs measure of the function satisfies a logarithmic Sobolev inequality. Our global optimization method is built on a sampling algorithm, drawing inspiration from both overdamped and underdamped Langevin dynamics, as well as from the high-resolution differential equation known for its acceleration in deterministic settings. While the focus of the work is primarily theoretical, we demonstrate the effectiveness of our algorithm on the Rastrigin function, where it outperforms recent approaches.
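The plain-vanilla version of this sampling-for-optimization idea is the overdamped Langevin iteration, sketched below on a simple double well (not the talk's high-resolution scheme, and a simpler test function than Rastrigin; all parameters are illustrative).

```python
import math, random

def langevin_best(grad, f, x0, steps=20000, lr=1e-3, temp=0.3, seed=0):
    """Overdamped Langevin iterates x <- x - lr*grad(x) + sqrt(2*lr*temp)*xi
    with xi ~ N(0,1), keeping the best point seen. The injected noise lets
    the chain escape local minima with positive probability."""
    rng = random.Random(seed)
    x, best, fbest = x0, x0, f(x0)
    sigma = math.sqrt(2.0 * lr * temp)
    for _ in range(steps):
        x = x - lr * grad(x) + sigma * rng.gauss(0.0, 1.0)
        if f(x) < fbest:
            best, fbest = x, f(x)
    return best, fbest

# Double well f(x) = (x^2-1)^2 + 0.2x: local minimum near +1, global near -1.
f = lambda x: (x * x - 1) ** 2 + 0.2 * x
grad = lambda x: 4 * x * (x * x - 1) + 0.2
best, fbest = langevin_best(grad, f, x0=1.0)
```

The log-Sobolev assumption in the talk is what turns this "escapes with positive probability" heuristic into a quantitative high-probability guarantee.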

11:50
A Speed Restart Scheme for an Inertial System with Hessian-Driven Damping and Constant Coefficients
PRESENTER: Huiyuan Guo

ABSTRACT. We study a speed restart scheme in order to improve the convergence rate of accelerated gradient schemes. We propose a second order differential equation with constant coefficients \[\ddot{x}(t)+\alpha\dot{x}(t)+\beta\nabla^{2} f(x(t))\dot{x}(t)+\gamma\nabla f(x(t))=0,\] where $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ is a convex function of class $\mathcal{C}^{2}$, which attains its minimum value, $\alpha>0, \beta\geq 0$ and $\gamma>0$ are three parameters. The gradient of function $f$ is Lipschitz continuous with constant $L>0$.

It is shown that suitable choices of $\alpha, \beta$ and $\gamma$ provide exponential decay of the function values when $f$ is $\mathcal{C}^{2}$ strongly convex with Lipschitz gradient. In our case, we establish a linear convergence rate for the function values along the restarted trajectories without assuming strong convexity of the objective function, and this holds for any value of $\alpha>0, \beta\geq 0$ and $\gamma>0$. We also report numerical experiments showing that the dynamical system combined with the speed restart scheme improves the performance of the dynamics.
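The restart principle is easiest to see on a discretization: run an accelerated (momentum) scheme and reset the momentum whenever the step length, a discrete proxy for the speed $|\dot{x}(t)|$, starts to decrease. The sketch below uses a plain Nesterov-style iteration rather than the talk's Hessian-damped dynamics, and is illustrative only.

```python
def speed_restart_gd(grad, x0, lr=0.05, steps=300):
    """Nesterov-style momentum iteration with a speed restart: the
    momentum counter k resets whenever the step length shrinks.
    1-D sketch of the restart principle, not the talk's dynamics."""
    x_prev, x, k = x0, x0, 1
    for _ in range(steps):
        y = x + (k - 1) / (k + 2) * (x - x_prev)
        x_next = y - lr * grad(y)
        if abs(x_next - x) < abs(x - x_prev):
            k = 1          # restart: kill the accumulated momentum
        else:
            k += 1
        x_prev, x = x, x_next
    return x

# Quadratic f(x) = x^2/2: the restarted iteration drives x toward 0.
x = speed_restart_gd(lambda x: x, x0=5.0)
```

Resetting on speed decrease prevents the overshooting oscillations of un-restarted momentum, which is what yields linear rates without strong convexity assumptions.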

12:20
Tikhonov Regularized Exterior Penalty Methods For Hierarchical Variational Inequalities
PRESENTER: Meggie Marschner

ABSTRACT. We consider nested variational inequalities, consisting of an (upper-level) variational inequality whose feasible set is given by the solution set of another (lower-level) variational inequality. This family of hierarchical optimization problems is notoriously difficult to solve but at the same time very rich in terms of concrete applications in economics, machine learning, engineering and signal processing. In particular, purely hierarchical convex bilevel optimization problems and certain multi-follower games are particular instances of nested variational inequalities of the above form. Working within a real Hilbert space setting, we develop a prox-penalization algorithm with strong convergence guarantees towards a solution of the nested VI problem. In the special case of a bounded domain in the lower-level problem, we also prove rates of convergence towards a solution of the lower-level problem. Our proof method builds on general Krasnoselskii-Mann iterations involved in a double-loop architecture, and is thus very flexible. We present various applications that fit into our framework, as well as some preliminary numerical results.
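The Krasnoselskii-Mann building block mentioned above is, in its simplest scalar form, an averaged fixed-point iteration; the averaging is what guarantees convergence for merely nonexpansive maps. A minimal sketch (the talk's double-loop prox-penalization scheme wraps iterations of this type):

```python
import math

def krasnoselskii_mann(T, x0, lam=0.5, steps=200):
    """Krasnoselskii-Mann iteration x <- (1-lam)*x + lam*T(x) for a
    nonexpansive map T, lam in (0,1): the averaged updates converge
    to a fixed point of T. Scalar illustrative sketch."""
    x = x0
    for _ in range(steps):
        x = (1.0 - lam) * x + lam * T(x)
    return x

# cos is nonexpansive on R; the iteration finds its unique fixed point
# (the Dottie number, approximately 0.739).
x = krasnoselskii_mann(math.cos, x0=0.0)
```

Replacing `T` by a resolvent or projected-penalized operator gives the inner loop of prox-penalization schemes of the kind described in the talk.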

11:20-12:50 Session 5H: Shape Optimization and Optimal Control
11:20
Stochastic Approximation with the Two-Norm Discrepancy

ABSTRACT. In this talk, we discuss the convergence of the projected stochastic gradient method in function spaces. We highlight applications and challenges in this setting and summarize existing results. A particular focus will be on a well-known phenomenon in optimal control called the two-norm discrepancy whereby coercivity of an objective can only be shown in a weaker norm than that in which it is differentiable. This was the topic of our recent paper, where the convergence of the projected stochastic gradient method with respect to the appropriate norm was proven. As an application, we consider Bernoulli's exterior free boundary problem with a random interior boundary. We recast this problem into a shape optimization problem by means of the minimization of the expected Dirichlet energy. By restricting ourselves to the class of convex, sufficiently smooth domains of bounded curvature, the shape optimization problem becomes strongly convex with respect to the weaker norm. The theoretical findings are supported and validated by numerical experiments.
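In finite dimensions, the method under discussion reduces to the familiar projected stochastic gradient recursion with diminishing steps. The sketch below is that finite-dimensional caricature (Robbins-Monro step sizes lr0/k, a clamp as the projection); the talk's setting is function spaces with the two-norm subtlety, which this toy necessarily ignores.

```python
import random

def projected_sgd(grad_sample, project, x0, lr0=0.5, steps=2000, seed=0):
    """Projected stochastic gradient with diminishing steps lr0/k:
    x <- P_C(x - lr_k * g), where g is an unbiased sample of the
    gradient. Finite-dimensional illustrative sketch."""
    rng = random.Random(seed)
    x = x0
    for k in range(1, steps + 1):
        g = grad_sample(x, rng)
        x = project(x - (lr0 / k) * g)
    return x

# Minimize E[(x - Z)^2], Z ~ N(1,1), over C = [0, 0.5]: the solution is
# the projection of the unconstrained minimizer (the mean, 1) onto C.
x = projected_sgd(lambda x, rng: 2.0 * (x - rng.gauss(1.0, 1.0)),
                  lambda y: min(0.5, max(0.0, y)), x0=0.0)
```

In the two-norm setting of the talk, the subtlety is that the coercivity driving this convergence holds only in a weaker norm than the one in which the gradient is defined.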

11:50
Stabilization and Optimal Control of an Interconnected SDE – Scalar PDE System
PRESENTER: Gabriel Velho

ABSTRACT. Interconnected systems are frequently advocated to model complex systems such as energy transportation networks. In this context, Stochastic Differential Equations (SDE) are key for capturing hazardous uncertain fluctuations that impact nominal dynamics. At the same time, Partial Differential Equations (PDE) play a crucial role in modeling potentially degrading second-order physical effects. Recently, there has been a notable surge of interest in integrating these two approaches to derive very accurate dynamics models, yielding sophisticated interconnected SDE--PDE systems. In this talk, we will introduce novel, so far unexplored methodologies we have recently developed to control such complex systems. Specifically, we will focus on solving optimal control problems subject to interconnected SDE--PDE dynamics. The objective consists of steering the SDE to a desired target area in expectation, while minimizing the variance of the SDE state by controlling the PDE at the boundary. For this, we will start by introducing the problem of boundary control of PDE via backstepping-type methods. These methods are particularly effective for designing stabilizing boundary controllers. Then, we will show that, for a certain, yet fairly large class of interconnected SDE--PDE systems, backstepping enables transforming the problem into a more manageable input-delayed SDE. Finally, we will present strategies to control input-delayed SDE with additive noise, detailing methods that guarantee variance reduction.

12:20
Parametric Shape Optimization of Flagellated Micro-Swimmers Using Bayesian Techniques
PRESENTER: Lucas Palazzolo

ABSTRACT. Micro-swimming is emerging as a significant research area due to its potential applications, particularly in the medical field for tasks like cargo transport and drug delivery. A key factor in the performance of micro-robots is the optimization of their shape, which directly influences their motion.

This talk investigates shape optimization of micro-swimmers with one or two helical flagella. The Boundary Element Method (BEM) is used to simulate the dynamics. Due to the complexity of our model, we focus on parametric shape optimization using Bayesian Optimization (BO), which avoids gradient computation, handles constraints, and reduces the number of costly function evaluations inherent in swimmer dynamics. Additionally, we employ the Free-Form Deformation (FFD) technique, which provides a sufficiently rich admissible shape space. This is integrated with the Scalable Constrained Bayesian Optimization (SCBO) method, well suited to high-dimensional constrained optimization problems.

The optimized designs are compared to biological swimmers, revealing a wide variety of efficient swimming strategies, including both pushers and pullers.

11:20-12:50 Session 5I: Invited Session: Theory and Algorithms in Sparse Optimization
11:20
H_0 Couplings and the l_0 Pseudonorm
PRESENTER: Michel De Lara

ABSTRACT. In recent work, we introduced the class of Capra-couplings, where Capra stands for Constant Along Primal RAys. Any Capra-coupling is defined by means of a norm. When this norm is orthant-strictly monotonic, a notion introduced in our work, we were able to prove that the l0 pseudonorm is Capra-convex, thus extending our original result established for the Euclidean norm. Little by little, we realized that our approach, initially designed for the l0 pseudonorm, could be extended to 0-homogeneous functions, studied through couplings that are themselves 0-homogeneous in the first variable and linear in the second one.

In this talk, we introduce a new class of (additive) H0-couplings and study the induced conjugacy acting on 0-homogeneous functions. We do the same for multiplicative H0-couplings and the induced polarity. With this, we are able to explore the best lower approximation of 0-homogeneous functions by closed convex functions and by positively 1-homogeneous closed convex functions. Finally, we establish new results of hidden convexity for the l0 pseudonorm.

11:50
The Geometry of Sparsity-Inducing Balls

ABSTRACT. One formulation in sparse optimization consists in seeking an optimal solution among vectors with at most k nonzero coordinates. This constraint is hard to handle, and a strategy to overcome that difficulty amounts to adding a norm penalty term to the objective function. The most widely used penalty is based on the l1-norm which is recognized as the archetype of sparsity-inducing norms.

In this talk, we present k-support norms, generated from a given source norm, and show how they contribute to induce sparsity via support identification. In case the source norms are the l1- and the l2-norms, we analyze the faces and normal cones of the unit balls for the associated k-support norms and their dual top-k norms.

12:20
Capra-Cutting Plane Method Applied to Sparse Optimization

ABSTRACT. In 1960, Kelley proposed a cutting-plane method to minimize a continuous convex objective function over a compact set. This iterative method consists, at each step, in minimizing a polyhedral approximation of the objective function before improving the approximation by a valid affine cut. In generalized convexity, the affine functions giving the cuts are replaced by some family of base functions. These base functions are chosen so that the objective function is their supremum, making it abstract convex. In this framework, Kelley's algorithm has been generalized to continuous abstract convex functions. We continue the generalization of the cutting-plane method by providing a convergence result that applies to lower semicontinuous objective functions. This convergence result is motivated by the Capra-convexity results on the l0 pseudonorm, which is lower semicontinuous. As explicit formulas for the Capra-subdifferential of the l0 pseudonorm have been calculated, we can now implement Capra-cutting plane methods for the sparse problem of minimizing l0 over a compact set. We present numerical results of this method on small instances of the problem of minimizing l0 over an affine space without the origin.
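For concreteness, here is a minimal sketch of the classical Kelley scheme with affine cuts (not the Capra cuts discussed in the talk), for a hypothetical smooth convex objective on an interval; for simplicity, the polyhedral model is minimized on a fine grid.

```python
import numpy as np

def f(x):
    return x**2 + np.exp(x)   # illustrative smooth convex objective

def df(x):
    return 2.0*x + np.exp(x)  # its derivative, used to build affine cuts

grid = np.linspace(-2.0, 2.0, 20001)  # compact set K = [-2, 2], discretized
cuts = []                             # (slope, intercept) pairs: valid cuts f(y) >= a*y + b
x = 2.0                               # initial point in K
for _ in range(30):
    cuts.append((df(x), f(x) - df(x) * x))                   # tangent cut at the current point
    model = np.max([a * grid + b for a, b in cuts], axis=0)  # polyhedral lower model
    x = grid[np.argmin(model)]                               # minimize the model over K

print(round(float(x), 3))  # near the true minimizer, where 2x + exp(x) = 0
```

Each iteration tightens the lower model at the current candidate, and the model minimizers converge to the true minimizer.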

11:20-12:50 Session 5J: Optimal charging
11:20
Some Preliminary Insights on Project GAALACTIC: (Provably) Good Approximation ALgorithms for Administrating Curtailments in the Telecommunications Industry Context
PRESENTER: Cédric Bentz

ABSTRACT. Our purpose here is to present a newly funded project, called GAALACTIC. The main topic of this project is to study, from a theoretical point of view, approximation algorithms for the problem of optimizing the use of a battery installed in a telecommunications network, provided that this battery can be used to perform so-called load curtailments (or simply curtailments).

Such a battery is typically installed in a telecommunications network as backup, in order to prevent customers from being impacted by potential power outages. However, other uses of batteries have also emerged in the past few years, in order to make them economically more appealing. As an example, a network operator may also wish to use the batteries of its own network to perform curtailments: in this case, in exchange for a given reward, this network operator agrees to avoid buying (and hence consuming) a given amount of electricity from the grid for a given amount of time. In order to do so, the network operator then has to use its own batteries, and at the same time to ensure some safety usage rules, as well as some market rules.

Recently, a series of papers have addressed this problem (or similar ones) with one or more batteries. In particular, a MILP formulation, as well as a graph-based heuristic, were proposed in [Isaias F. Silva, Cédric Bentz, Mustapha Bouhtou, Matthieu Chardy, Safia Kedad-Sidhoum. Optimization of battery management in telecommunications networks under energy market incentives. Annals of Operations Research 332 (2024) 949-988] for the case with only one battery, and compared experimentally. However, while this heuristic can be parameterized experimentally to provide higher-quality solutions at the cost of a longer running time, no theoretical study of its performance has been done yet. In this project, our main goal is thus to determine whether such a heuristic could be turned into a proper approximation algorithm that computes solutions with guaranteed quality.

This may involve determining optimal values for some of its parameters and/or slightly modifying some of its steps, but also writing down a formal analysis. We will review several preliminary results, namely some useful structural properties of optimal solutions, a first (but for now incomplete) rounding scheme having some nice characteristics, and some insights on which features the final algorithm should have, before concluding on what remains to be done.

11:50
Optimal Operation of a Battery Energy Storage System Considering Cyclic Degradation and Revamping Strategies: a comparison of two formulations
PRESENTER: Asja Alic

ABSTRACT. Battery Energy Storage Systems (BESS) are subject to inevitable capacity degradation due to frequent charging/discharging actions, whose dynamics are typically non-linear and difficult to integrate into optimization problems. Nevertheless, neglecting this aspect may lead to an overestimation of the net revenues and to high-cost penalties for the inability of the BESS to provide the contracted services. Within this context, we propose two deterministic scheduling models that co-optimize the short-term arbitrage operation (hourly) in the Wholesale Energy Market (WEM) and the long-term revamping actions (months) to restore the degraded capacity and comply with the requirements of the contracted services. Both models consider the objective function in (1), where the first term accounts for the revenues from selling and buying power [MW] ($P_t^d$ and $P_t^c$ respectively) at the market price $\lambda_t$ [€/MWh], while the second term accounts for the revamping cost of buying new energy capacity $R_t$ [MWh] at the market price $\theta_t$ [€/MWh]:

$\max_{P_t^c, P_t^d, S_t} \; f(\lambda_t, P_t^c, P_t^d) - g(\theta_t, R_t)$  (1)

The dynamic evolution over time of the energy capacity $E_t$ [MWh] is expressed by

$E_{t+1} = E_t - \gamma_t + R_t$  (2)

where the energy capacity degradation $\gamma_t$ [MWh] due to cyclic actions is defined as

$\gamma_t = \frac{\hat{E}}{2} \left| \frac{1}{\bar{N}(d_t)} - \frac{1}{\bar{N}(d_{t+1})} \right|$  (3)

Here $\bar{N}(z)$ denotes the maximum number of full cycles that a BESS can perform at a certain Depth of Discharge (DoD) $z$ before the capacity drops below 20% of the designed energy capacity $\hat{E}$ [MWh]. Even though both models consider the same objective function (1), they rely on different mathematical frameworks and on different reformulation techniques to account for the above-mentioned constraints. The first model is the Greedy Optimal Revamping Algorithm (GORA), based on a Dynamic Programming (DP) algorithm. This methodology directly integrates the non-convex cyclic degradation (3) and can therefore handle different cycle-life curves. The feasibility of the contracted service is ensured by the instantaneous replacement of the degraded cells with brand-new ones. The second model is the Optimal Capacity Revamping Algorithm (OCRA), formulated as a Mixed Integer Linear Programming (MILP) problem. Here, the cyclic degradation (3) is linearized with ad-hoc reformulation techniques, which restricts it to specific families of cycle-life curves. Unlike GORA, however, additional constraints allow for a more flexible revamping policy in which the system may periodically (e.g. every six months) purchase some extra (racks of) cells to increase its capacity. Both models have been developed in the Julia 1.7.3 programming environment, and the simulations have been performed on an Intel® Core™ i5-52000 processor with 8.0 GB RAM. For OCRA, the optimization problem has been solved with the Gurobi 10.0.2 solver with a MIPGap of 1%. In both cases, the results show that accounting for cyclic degradation while optimizing the system's operation leads to an increase in the net revenues (around 15%) at the end of the planning horizon. Indeed, the models acknowledge the capacity degradation and therefore limit the cyclic action of the BESS during arbitrage operation in the WEM. This strategy, although decreasing the revenues from arbitrage, reduces the overall cost of the revamping actions.
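A minimal sketch of the capacity dynamics (2) and cyclic degradation (3) above, assuming a hypothetical power-law cycle-life curve N̄(z); the curve and all numerical values are illustrative, not taken from the models in the abstract.

```python
E_hat = 10.0                      # designed energy capacity [MWh]

def N_bar(z, N0=3000.0, k=1.5):
    # Assumed power-law cycle-life curve: max full cycles at DoD z (hypothetical values)
    return N0 * z**(-k)

def degradation(d_t, d_next):
    # Equation (3): capacity lost between two consecutive DoD levels
    return E_hat / 2.0 * abs(1.0 / N_bar(d_t) - 1.0 / N_bar(d_next))

# Equation (2): capacity dynamics with revamping purchases R_t
E = E_hat
dods = [0.2, 0.5, 0.8, 0.4]       # example DoD trajectory
revamp = [0.0, 0.0, 0.05]         # revamped capacity purchased [MWh] at each step
for t in range(len(dods) - 1):
    gamma = degradation(dods[t], dods[t + 1])
    E = E - gamma + revamp[t]
print(round(E, 4))
```

Deeper discharge cycles consume more capacity per cycle, which is exactly the effect the two scheduling models trade off against arbitrage revenues.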

12:20
Online Moment Constrained Optimal Transport applied to Electric Vehicle Charging
PRESENTER: Thomas Le Corre

ABSTRACT. The widespread adoption of electric vehicles (EVs) necessitates their efficient charging management to minimize impact on the power grid and optimize electricity consumption. We propose a mean-field optimization approach to handle a large population of EVs and address challenges like stochastic arrival and departure patterns. A Mean Field Control problem is formulated as an optimal transport problem, including both hard constraints and grid objectives. We introduce an online version of this moment-constrained optimal transport problem to handle real-time EV charging station management scenarios. An algorithm is proposed to solve this problem sequentially and we discuss how the optimal solution at the previous time step offers a good initialization for the current one. The ElaadNL dataset containing 10,000 EV charging transactions in the Netherlands is used for empirical validation, demonstrating the effectiveness of the proposed model.

12:50-14:10Lunch Break
12:50-13:20 Session 6: Semidefinite Programming
12:50
Restoring definiteness by advancing towards an inner SDP solution (at the edge of the SDP cone)

ABSTRACT. What does restoring definiteness mean? Given a non-SDP matrix S, we need an estimated SDP matrix that is (very) closely related to S; if S is almost SDP, this essentially means removing some error from a matrix S that should have been SDP in theory. The context for this problem was provided by [1], where we find that “in practice the matrices obtained may lack definiteness due to missing or asynchronous observations or by the nature of their construction, such as in stress testing in finance or when correlations are determined within groups and then the groups are joined together. To ensure the validity of the subsequent analysis the indefinite approximation needs to be replaced by a valid covariance or correlation (SDP) matrix, which we call a replacement matrix. This is needed in a very wide variety of applications including modeling public health [8] and dietary intakes [36], determination of insurance premiums for crops [12], simulation of wireless links in vehicular networks [37], reservoir modeling [26], oceanography [32], and horse breeding [35].” I needed to solve this problem inside the optimization algorithm Projective Cutting Planes, presented at PGMODAYS 2023 (preprint CoRR abs/2311.09365).

14:10-15:00 Session 7: Plenary II: Huseyin Topaloglu
14:10
Incorporating Discrete Choice Models into Revenue Management Decisions

ABSTRACT. Over the last couple of decades, there has been enormous progress in using discrete choice models to understand how customers choose and substitute among products, and in incorporating this understanding into operational models to decide which assortment of products to offer to customers or what prices to charge. We owe some of this progress to increases in computational power, which allow us to build and solve more detailed operational models, but perhaps most of it is due to the fact that online sales channels started providing fine-grained data on how customers browse the products. In this talk, we will go over fundamental discrete choice models that have been used in building operational assortment optimization and pricing models, overview the main algorithmic approaches that have been developed to solve the operational models, and identify research prospects. The focus will be on both static models that make one-shot assortment optimization or pricing decisions, as well as dynamic models that explicitly capture the evolution of demand and inventories over time.

15:00-15:30 Session 8: PGMO Phd Prize, part II: Tam Le
15:00
Nonsmooth calculus and optimization for machine learning: first-order sampling and implicit differentiation

ABSTRACT. Machine learning problems are often formulated as the minimization of a nonsmooth and nonconvex expectation. Gradient-based methods are favored in this context due to their flexibility and scalability. To implement these methods, we observe a common practice of replacing classical derivatives with automatic differentiation, but without rigorous theoretical justification in the nonsmooth setting. A notion of gradient called the “conservative gradient” justifies this practice by extending simple calculus rules, such as the chain rule, to nonsmooth functions. In this thesis, we propose two extensions of conservative calculus. The first result is a rule for differentiating nonsmooth functions under the integral operation, which justifies gradient sampling on nonsmooth functions. The second result is an implicit differentiation formula motivated by bi-level problems such as hyperparameter selection or implicit layers. Thanks to these new calculus rules, we propose a theoretical framework for analyzing algorithms that is, moreover, compatible with practical implementations. A descent-along-curves property allows us to obtain the convergence of two popular optimization algorithms, the stochastic subgradient method and its heavy-ball variant. Our approach involves a notion of stationarity that may depend on the calculus rules used in practical implementations and may generate absurd limit points. For both algorithms, we finally show that these artifacts are generically avoided.

15:30-16:00Coffee Break
16:00-17:30 Session 9A: Invited session: Auctions: algorithms and application in power and telecommunication markets
16:00
Optimal Strategy against Straightforward Bidding in Clock Auctions
PRESENTER: Jad Zeroual

ABSTRACT. We study a model of auction representative of the 5G auction in France. We determine the optimal strategy of a bidder, assuming that the valuations of competitors are unknown to this bidder and that competitors adopt the straightforward bidding strategy. Our model is based on a Partially Observable Markov Decision Process (POMDP). We show in particular that this special POMDP admits a concise statistic, avoiding the solution of a dynamic programming equation in the space of beliefs. We illustrate our results by numerical experiments, comparing the value of the bidder with that of a perfectly informed one.

16:30
Learning in multi-unit uniform price auctions
PRESENTER: Marius Potfer

ABSTRACT. Motivated by the strategic participation of electricity producers in the electricity day-ahead market, we study the problem of online learning in repeated multi-unit uniform price auctions. The main contribution of this paper is the introduction of a new modelling of the action space. We prove that a learning algorithm leveraging the structure of this problem achieves improved regret rates under bandit feedback, compared to previously known techniques. This improved regret rate is tight up to logarithmic terms. Inspired by electricity reserve markets, we further introduce a different feedback model under which all winning bids are revealed. This feedback interpolates between the full-information and bandit scenarios depending on the auctions' results.

17:00
Optimizing the coalition gain in Online Auctions with Greedy Structured Bandits

ABSTRACT. Motivated by online display advertising, this work considers repeated second-price auctions, where agents sample their value from an unknown distribution with cumulative distribution function F. In each auction t, a decision-maker bound by limited observations selects nt agents from a coalition of N to compete for a prize with p other agents, aiming to maximize the cumulative reward of the coalition across all auctions. The problem is framed as an N-armed structured bandit, each number n of players sent being an arm, with expected reward r(n) fully characterized by F and p + n. We present two algorithms, Local-Greedy (LG) and Greedy-Grid (GG), both achieving constant problem-dependent regret. This relies on three key ingredients: (1) an estimator of r(n) from feedback collected from any arm k; (2) concentration bounds on these estimates for k within an estimation neighborhood of n; and (3) the unimodality of r under standard assumptions on F. Additionally, GG exhibits problem-independent guarantees on top of best problem-dependent guarantees. However, by avoiding reliance on confidence intervals, LG outperforms GG in practice, as well as standard unimodal bandit algorithms such as OSUB or multi-armed bandit algorithms.

16:00-17:30 Session 9B: Invited Session: Differential equations in machine learning
16:00
Stable Parameters of Mean-Field Neural ODEs
PRESENTER: Samuel Daudin

ABSTRACT. Performing regression tasks with deep neural networks can be modeled as an optimal control problem for an ordinary differential equation. We investigate a relaxation of this problem where controls are taken to be probability measures over the parameter space and the cost involves an additional entropy penalization. We are particularly interested in the stability of the optimal parameters, where stability is understood in terms of unique solvability of a certain linearized system of PDEs. We show that, for a large set of initial data (in terms of the initial distribution of the features), there is actually a unique stable global minimizer in the control problem. Moreover, we prove that stable minimizers satisfy a local Polyak-Lojasiewicz condition and that the (continuous analog of the) gradient descent with backpropagation converges exponentially fast when initialized near a stable minimizer.

16:30
Universal Approximation of Dynamical Systems by Semi-Autonomous Neural ODEs
PRESENTER: Kang Liu

ABSTRACT. In this presentation, we introduce semi-autonomous neural ordinary differential equations (SA-NODEs), a variation of the classical NODEs employing fewer parameters. We investigate the universal approximation properties of SA-NODEs for dynamical systems from both a theoretical and a numerical perspective. Under the assumption of a finite time horizon and general hypotheses, we establish an asymptotic approximation result, demonstrating that the error vanishes as the number of parameters goes to infinity. Under additional regularity assumptions, we further quantify this convergence rate in terms of the number of parameters, using quantitative approximation results in the Barron space. Based on the previous result, we prove an approximation rate for transport equations by their neural counterparts. Our numerical experiments validate the effectiveness of SA-NODEs in capturing the dynamics of various ODE systems and transport equations. Additionally, we compare SA-NODEs with classical NODEs, highlighting the superior performance and reduced complexity of our approach.

17:00
Incremental Learning in Diagonal Linear Networks

ABSTRACT. Diagonal linear networks (DLNs) are a toy simplification of artificial neural networks; they consist of a quadratic reparametrization of linear regression that induces a sparse implicit regularization. In this talk, I will describe the trajectory of the gradient flow of DLNs in the limit of small initialization. I will show that incremental learning is effectively performed in the limit: coordinates are successively activated, while the iterate is the minimizer of the loss constrained to have support on the active coordinates only. This shows that the sparse implicit regularization of DLNs decreases with time.
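A toy sketch of the incremental-learning effect: gradient descent on the diagonal-linear-network reparametrization β = u ⊙ v with small initialization. The data, initialization scale, and step size are illustrative assumptions.

```python
import numpy as np

# Diagonal linear network: beta = u * v, loss = 0.5 * ||X beta - y||^2
X = np.eye(3)                     # trivial design matrix, for a separable toy example
y = np.array([3.0, 1.0, 0.0])     # target; the zero coordinate should stay inactive

alpha = 1e-3                      # small initialization scale
u = alpha * np.ones(3)
v = alpha * np.ones(3)
lr = 0.01
for _ in range(20000):
    r = X @ (u * v) - y           # residual
    g = X.T @ r                   # gradient with respect to beta
    u, v = u - lr * g * v, v - lr * g * u   # gradient step on (u, v)

print(np.round(u * v, 3))         # close to y; larger coordinates activate earlier
```

Coordinates with larger targets escape the near-zero saddle first, so the trajectory activates them one after another, mirroring the incremental learning described in the abstract.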

16:00-17:30 Session 9C: Invited session: New trends in Optimal Transport and Applications, Part II
16:00
The dual charge method for the multimarginal optimal transport with Coulomb cost

ABSTRACT. The multimarginal optimal transport problem occurs in statistical physics as well as in quantum chemistry, where it is used to describe strongly correlated systems of electrons. In recent years, efforts have been put into fashioning efficient numerical methods to solve this problem, which is notoriously hard from a computational point of view. In this talk, I will introduce a specific discretization to tackle this problem numerically in the case of Coulomb interaction, which can be interpreted as the dual version of the so-called Moment-Constrained Optimal Transport (MCOT) introduced by Alfonsi, Coyaud, Ehrlacher and Lombardi.

16:30
Regularized Optimal Transport: disentangling suboptimality and entropy
PRESENTER: Hugo Malamut

ABSTRACT. This presentation explores the convergence behavior of regularized optimal transport solutions as the regularization parameter vanishes. Focusing on entropy-regularized optimal transport with continuous marginals, we analyze the convergence of transport plans and costs. Under mild assumptions, we establish that the Wasserstein distance between the regularized and unregularized solutions scales as \( \sqrt{\epsilon} \), while the suboptimality decreases linearly with \( \epsilon \). Additionally, further refinements are provided when the marginals have finite Fisher information. These results demonstrate how suboptimality and entropy contributions can be disentangled to yield sharp convergence rates.
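A discrete illustration of the vanishing-regularization regime (the talk treats continuous marginals): Sinkhorn iterations on sampled 1D marginals, comparing the transport cost of the entropy-regularized plan to the unregularized optimum as ε shrinks. All parameters are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
n = 50
xs = np.sort(rng.random(n))
ys = np.sort(rng.random(n))
C = (xs[:, None] - ys[None, :])**2      # squared-distance cost matrix
a = b = np.full(n, 1.0 / n)             # uniform marginals

def sinkhorn_cost(eps, iters=10000):
    # Sinkhorn fixed-point iterations for the entropy-regularized plan
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]
    return float(np.sum(P * C))          # transport cost of the regularized plan

exact = C[linear_sum_assignment(C)].sum() / n   # unregularized optimal cost
gaps = [sinkhorn_cost(e) - exact for e in (0.1, 0.02)]
print(gaps)  # the suboptimality shrinks as eps decreases
```

The gap between the regularized and unregularized costs visibly shrinks with ε, consistent with the linear-in-ε suboptimality established in the talk.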

17:00
Demographic parity in regression and classification within the unawareness framework
PRESENTER: Vincent Divol

ABSTRACT. Statistical fairness seeks to ensure an equitable distribution of predictions or algorithmic decisions across different sensitive groups. Among the fairness criteria under consideration, demographic parity is arguably the most conceptually straightforward: it simply requires that the distribution of outcomes is identical across all sensitive groups.

We focus on the unawareness framework, where the prediction function cannot make direct use of the sensitive attribute (thus, individuals are not treated differently based on discriminating factors). In a regression setting, we solve the problem of finding the optimal unaware prediction function satisfying the demographic parity constraint. Interestingly enough, this problem boils down to solving a two-dimensional optimal transport problem for some degenerate cost. We study this optimal transport problem and exhibit relevant properties of the associated prediction function.

16:00-17:30 Session 9D: Stochastic optimization
16:00
A Derivative-Free Method for Chance-Constrained Problems with Right-Hand Side Uncertainty

ABSTRACT. This work addresses (mixed-integer) joint chance-constrained optimization problems in which the only uncertain parameter corresponds to the right-hand side coefficients in an inequality system. It exploits one-dimensional marginals to construct an upper model for the underlying joint probability function, which need not be differentiable nor satisfy generalized concavity properties. Based on this model, a global optimization derivative-free method is proposed. The approach iteratively approximates the probability function within an outer approximation algorithm that is shown to compute, under mild assumptions, an approximate global solution to the underlying nonconvex chance-constrained problem. When the problem's data is linear, the outer approximation algorithm requires solving (approximately) a mixed-integer linear programming problem per iteration. Numerical experiments on academic chance-constrained problems highlight the approach's potential and limitations.

16:30
Comparison between Robust Optimization, Stochastic Programming and Distributionally Robust Optimization for Unit Commitment under uncertainty
PRESENTER: Mathis Azéma

ABSTRACT. The Unit Commitment (UC) problem, a key challenge for energy producers and network operators, involves determining the optimal production schedule for electricity generation units, such as thermal and hydro units, while accounting for both technical and economic constraints. A significant source of complexity in UC arises from uncertainty, particularly in forecasting energy demand and wind power generation. In this study, we focus on demand uncertainty and adopt a two-stage approach, with commitment decisions made in the first stage. Traditional optimization techniques, including robust optimization and stochastic programming, have been widely applied to address uncertainty in UC. However, a newer approach, Distributionally Robust Optimization (DRO), has gained attention, particularly DRO methods based on divergences and the Wasserstein distance. Among these, DRO based on the 2-norm Wasserstein distance remains unexplored in the UC context.

In this talk, we first provide new algorithms to address this new DRO case. We then examine and compare stochastic, risk-averse, robust and DRO approaches for UC under uncertainty, evaluating their potential advantages in terms of cost, robustness and reliability of solutions.

17:00
Distributionally Robust Standard Quadratic Optimization with Wasserstein Ambiguity

ABSTRACT. The standard quadratic optimization problem (StQP) consists of minimizing a quadratic form over the standard simplex. If the quadratic form is neither convex nor concave, the StQP is NP-hard. This problem has many interesting applications ranging from portfolio optimization to machine learning.

Sometimes, the data matrix is uncertain but some information about its distribution can be inferred, e.g. the first two moments or else a reference distribution (typically, the empirical distribution after sampling). In distributionally robust optimization, the goal is to optimize against the worst case over all possible distributions in an ambiguity set defined by the above-mentioned characteristics. We explore the StQP under distributionally robust chance constraints and an ambiguity set based upon maximal Wasserstein distance to the sampled empirical distribution.

16:00-17:30 Session 9E: Black-box and Global Optimization
16:00
Linear convergence of CMA-ES
PRESENTER: Armand Gissler

ABSTRACT. Evolution strategies (ES) are a class of derivative-free optimization algorithms. Among them, the covariance matrix adaptation evolution strategy (CMA-ES) is the state-of-the-art method of this class and has demonstrated good empirical performance for nonconvex, multi-modal, ill-conditioned or noisy problems. However, establishing a mathematical proof of its convergence is a challenging problem. I will present recent works in which we prove the linear convergence of CMA-ES on ellipsoidal problems (minimizing functions with ellipsoidal level sets). We also show that the algorithm approximates the inverse Hessian of convex-quadratic functions. The proof relies on the analysis of an underlying Markov chain, which we prove to be geometrically ergodic.

16:30
Runtime Analysis of the SMS-EMOA for Many-Objective Optimization
PRESENTER: Benjamin Doerr

ABSTRACT. The widely used multiobjective optimizer NSGA-II was recently proven to have considerable difficulties in many-objective optimization. In contrast, experimental results in the literature show a good performance of the SMS-EMOA, which can be seen as a steady-state version of the NSGA-II that uses the hypervolume contribution instead of the crowding distance as the second selection criterion.

This work conducts the first rigorous runtime analysis of the SMS-EMOA for many-objective optimization. To this aim, we first propose a many-objective counterpart of the popular bi-objective OneJumpZeroJump benchmark. We prove that SMS-EMOA computes the full Pareto front of this benchmark in an expected number of $O(\mu M n^k)$ iterations, where $n$ denotes the problem size (length of the bit-string representation), $k$ the gap size (a difficulty parameter of the problem), $M=(2n/m-2k+3)^{m/2}$ the size of the Pareto front, and $\mu$ the population size (at least the same size as the largest incomparable set). This result together with the existing negative result for the original NSGA-II shows that, in principle, the general approach of the NSGA-II is suitable for many-objective optimization, but the crowding distance as tie-breaker has deficiencies.

Our main technical insight, a general condition ensuring that the SMS-EMOA does not lose Pareto-optimal objective values, promises to be useful also in other runtime analyses of this algorithm.
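For concreteness, the bi-objective OneJumpZeroJump benchmark that the talk's many-objective benchmark generalizes can be sketched as below (following the bi-objective definition from the literature, recalled here from memory; the many-objective counterpart proposed in the talk replaces the two objectives by m of them):

```python
def one_jump_zero_jump(x, k):
    # Bi-objective OneJumpZeroJump (to be maximized): both objectives
    # are Jump-like, one rewarding ones and one rewarding zeros, with
    # a fitness valley ("gap") of width k before each extreme point.
    n = len(x)
    ones = sum(x)
    zeros = n - ones
    f1 = k + ones if ones <= n - k or ones == n else n - ones
    f2 = k + zeros if zeros <= n - k or zeros == n else n - zeros
    return f1, f2
```

The parameter k above plays the role of the gap size (the difficulty parameter) appearing in the stated runtime bound: reaching an extreme Pareto-optimal point requires jumping over the k-wide valley.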

This work is part of the PGMO-funded project "Mathematical Analysis of State-of-the-Art Multi-Objective Evolutionary Algorithms" (PI: Benjamin Doerr).

17:00
Evolutionary Algorithms Are Robust to Noise out of the Box
PRESENTER: Denis Antipov

ABSTRACT. Randomized search heuristics (RSHs) are generally believed to be robust to noise. However, almost all mathematical analyses of how RSHs cope with noisy access to the objective function assume that each solution is re-evaluated whenever it is compared to others. This is unfortunate, both because it wastes computational resources and because it requires the user to foresee that noise is present (in a noise-free setting, one would never re-evaluate solutions).

In this work, we show that the need for re-evaluations may be overestimated and that re-evaluations can in fact be detrimental. For the classic benchmark problem of how the $(1+1)$ evolutionary algorithm optimizes the LeadingOnes benchmark, we show that without re-evaluations noise rates up to a constant can be tolerated, much more than the $O(n^{-2} \log n)$ noise rates that can be tolerated when re-evaluating solutions.

This first runtime analysis of an evolutionary algorithm solving a single-objective noisy problem without re-evaluations could indicate that such algorithms cope with noise much better than previously thought, and without the need to foresee the presence of noise.
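A minimal sketch of the setting, assuming a one-bit prior-noise model (the talk's exact noise model may differ): the (1+1) EA below stores the first, possibly noisy, fitness value of each solution and never re-evaluates it.

```python
import random

random.seed(1)

def leading_ones(x):
    # Number of leading one-bits (the true, noise-free fitness)
    c = 0
    for b in x:
        if b == 0:
            break
        c += 1
    return c

def noisy_eval(x, q):
    # One-bit prior noise: with probability q the fitness is computed
    # on a copy of x with one uniformly chosen bit flipped.
    y = list(x)
    if random.random() < q:
        i = random.randrange(len(y))
        y[i] = 1 - y[i]
    return leading_ones(y)

def one_plus_one_ea(n=30, q=0.1, budget=300_000):
    x = [random.randint(0, 1) for _ in range(n)]
    fx = noisy_eval(x, q)                 # evaluated once, never re-evaluated
    for _ in range(budget):
        y = [1 - b if random.random() < 1.0 / n else b for b in x]
        fy = noisy_eval(y, q)
        if fy >= fx:
            x, fx = y, fy                 # keep the stored noisy value
        if leading_ones(x) == n:          # true value used for stopping only
            break
    return x

solution = one_plus_one_ea()
```

Note that the stored value fx may be wrong due to noise; the point of the analysis is that, on LeadingOnes, the algorithm nevertheless makes progress even at a constant noise rate q.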

16:00-17:30 Session 9F: Decomposition and related optimization methods
16:00
On the Duality between Frank-Wolfe and Cutting-Plane Algorithms
PRESENTER: Thibault Moquet

ABSTRACT. The Frank-Wolfe Algorithm, or Conditional Gradient Algorithm, is a well-known convex optimization algorithm. It is used to find an approximate minimizer of a convex and continuously differentiable function f on a closed convex bounded subset K of a Hilbert space H, where we only assume the ability to minimize inner products of the form <µ,x> over x in K, for any µ in H. The Cutting-Plane Algorithm allows for the minimization of a convex differentiable function g on a closed convex bounded subset D of H, where we assume the ability to minimize a convex piecewise-affine lower approximation of g. I will present results obtained during the first two years of my PhD, which establish the duality, in the sense of Fenchel-Rockafellar, between the Level-Set Method, a more robust variant of the Cutting-Plane Method, and an algorithm which can be seen as an extension of the Frank-Wolfe Algorithm for functions of the form f+h, where f is a convex and continuously differentiable function and h is a nonsmooth convex function. I also give theoretical guarantees obtained for these methods, as well as a result on their convergence speeds.
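A minimal numerical sketch of the oracle model described above: the only access to K is a linear minimization oracle returning a minimizer of <µ,x> over K. Here K is the probability simplex and the objective is a simple quadratic; both are illustrative choices, not the talk's setting.

```python
import numpy as np

def lmo_simplex(mu):
    # Linear minimization oracle for the probability simplex:
    # argmin_{x in simplex} <mu, x> is the vertex with smallest mu_i
    v = np.zeros_like(mu)
    v[int(np.argmin(mu))] = 1.0
    return v

def frank_wolfe(grad, lmo, x0, iters=10_000):
    x = np.asarray(x0, float)
    for k in range(iters):
        s = lmo(grad(x))            # call the linear minimization oracle
        gamma = 2.0 / (k + 2.0)     # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

# Example: minimize f(x) = ||x - c||^2 over the simplex (c is feasible)
c = np.array([0.2, 0.5, 0.3])
x = frank_wolfe(lambda z: 2 * (z - c), lmo_simplex, np.ones(3) / 3)
```

Each iterate is a convex combination of oracle outputs, so feasibility is maintained without any projection, which is the practical appeal of the oracle model.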

16:30
Computing Usage Values for Prospective Studies in Energy Systems Using Spatial Decomposition

ABSTRACT. The growing penetration of renewable energy requires additional storage to manage system intermittency. As a result, there is increasing interest in evaluating the opportunity cost of stored energy, or usage values, that can be derived by solving a multistage stochastic optimization problem. We aim to compute these usage values for each country in the interconnected European electrical system, mathematically modeled as an oriented graph, where nodes represent countries and arcs represent interconnection links. In large energy systems, spatial complexity adds to the temporal complexity, making classic stochastic dynamic programming techniques inadequate. To address this, we apply Dual Approximate Dynamic Programming (DADP), which decomposes the global multistage stochastic optimization problem into smaller nodal problems and a transport problem.

The global multistage stochastic optimization problem is stated as the aggregation of nodal problems linked with a transport problem. The nodal problems allocate local production to meet local demand and handle energy imports/exports, while the transport problem manages energy flows across interconnections. The coupling constraint (Kirchhoff's law) links the two. DADP provides a lower bound on the global problem by dualizing this constraint, allowing the nodal problems to be solved individually (locally) using stochastic dynamic programming techniques for a given dual value of the coupling constraint (the decomposition price). The transport problem, on the other hand, is solved analytically since it lacks temporal coupling.

This approach produces nodal usage values that depend solely on the nodal state, independent of other nodes. To validate this, we present a case study that models three countries - France, Germany, and Switzerland - each with a representative production mix and the interconnection capacities between them. By solving the global problem, we study the dependency of usage values across nodes. Additionally, we demonstrate that the decomposition price corresponds to the nodal marginal price, which is the dual variable of the nodal energy balance. This insight helps develop an initialization of the algorithm specifically suited to the system.
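The price-decomposition idea behind DADP can be sketched on a static toy problem: each node chooses its net export q_i to minimize a local quadratic cost, and a price lambda on the coupling constraint q_1 + q_2 = 0 is updated by a subgradient step. All costs and step sizes are invented for illustration; the actual method does this with multistage stochastic nodal problems solved by dynamic programming.

```python
import numpy as np

def nodal_best_response(lam, a, b):
    # Each node solves min_q 0.5*a*q**2 + b*q - lam*q, which has the
    # closed-form solution q = (lam - b) / a (local, given the price)
    return (lam - b) / a

a = np.array([1.0, 2.0])   # toy local cost curvatures
b = np.array([3.0, -1.0])  # toy local cost slopes
lam, step = 0.0, 0.5
for _ in range(200):
    q = nodal_best_response(lam, a, b)
    lam -= step * q.sum()  # subgradient step on the coupling constraint
q = nodal_best_response(lam, a, b)
```

At convergence the price lam clears the coupling constraint (q.sum() is zero), mirroring the interpretation in the abstract of the decomposition price as a nodal marginal price.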

17:00
A column generation approach for the routing of electricity technicians
PRESENTER: Elise Bangerter

ABSTRACT. The maintenance of an electricity distribution network involves numerous daily technical interventions. In this problem, we are given a set of interventions, each with associated time windows, location, necessary skills and duration, as well as a set of teams of technicians, each with an associated set of skills. We need to find feasible routes on the interventions for each team, considering the time windows and skills, and ensure that each team returns to its departure depot before the end of the day. The primary objective is to maximize the total duration of completed interventions; as a secondary objective, we aim to minimize the overall routing cost. This problem can be formulated as a capacitated vehicle routing problem with time windows. Due to the large number of teams and interventions, this results in a large-scale optimization problem, and its operational nature limits the time available for exact solving. Here, we propose a column generation approach where one subproblem per vehicle has to be solved and each potential route of a vehicle is considered as a new column in the master problem. To generate these routes, we rely on dynamic programming. Real-world instances from EDF (Electricité de France) of historical technicians' interventions will be used to evaluate the effectiveness of the proposed methods.

16:00-17:30 Session 9G: Invited Session: Optimization and heuristics in battery management
16:00
Management of a Battery on Short-Term Electricity Markets
PRESENTER: Lionel Rolland

ABSTRACT. Our work aims to quantify the benefit of storage flexibilities such as a battery on several short-term electricity markets. We especially focus on two different markets, the intraday market (ID) and the activation market of the automatic Frequency Restoration Reserve (aFRR), also known as the secondary reserve. We propose algorithms to optimize the management of a small battery on these markets. In this talk, we first present the modeling of the problem, then we show some theoretical results and numerical simulations.

16:30
Operating a battery at minimum cost under reserve commitment constraints
PRESENTER: Rose Sossou Edou

ABSTRACT. Consider an electric battery whose owner has committed to a transmission system operator (e.g., RTE in France) to put at its disposal a "reserve" of electricity over a given day: every hour, a random quantity will be discharged from or charged to the battery; the commitment is on the range of this random variable. To best meet this commitment, the owner of the battery can buy or sell electricity every hour on the "intraday" market, at prices following a random process. Moreover, there is a maximal energy level, a bound on the maximal variation of this level, and an extra final cost anticipating the future; the exact value of both the reserve activation and the price is only revealed at the last moment. The problem consists in finding a policy minimizing the expected cost of ensuring satisfaction of the commitment. To the authors' knowledge, this problem has not been addressed yet.

We propose an algorithm that computes good solutions when the problem is feasible. This algorithm, based on dynamic programming and discretization of the state space, relies on a preliminary lemma characterizing beforehand the feasible values of the energy level for every hour. The quality of the solutions is evaluated thanks to another theoretical contribution that provides closed-form expressions for the optimal policies when the energy variation is not bounded. This result is also interesting in its own right, since there are real-world situations where this condition is met. Our work also shows, via toy examples and simulations, that a good knowledge of the distribution of reserve activation makes it possible to dramatically decrease the cost of operating the battery.
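A toy backward dynamic program over a discretized energy level, in the spirit of the algorithm described above. All parameters (grid, prices, activation distribution, feasibility rule) are illustrative placeholders, not the paper's model.

```python
import numpy as np

T = 24                                    # hours in the day
levels = np.linspace(0.0, 10.0, 41)       # discretized energy level (MWh)
actions = np.linspace(-2.0, 2.0, 9)       # intraday buy (>0) / sell (<0)
price = 50 + 10 * np.sin(np.arange(T))    # toy deterministic price path
activations = np.array([-1.0, 0.0, 1.0])  # random reserve discharge/charge
probs = np.array([0.25, 0.5, 0.25])       # toy activation distribution

V = np.zeros((T + 1, len(levels)))        # terminal cost = 0 (toy choice)
policy = np.zeros((T, len(levels)))

for t in range(T - 1, -1, -1):
    for i, e in enumerate(levels):
        best, best_a = np.inf, 0.0
        for a in actions:
            # Expected cost over reserve activations; infeasible next
            # states are clipped to the grid (a crude feasibility rule,
            # standing in for the paper's feasibility lemma)
            nxt = np.clip(e + a + activations, levels[0], levels[-1])
            j = np.clip(np.searchsorted(levels, nxt), 0, len(levels) - 1)
            cost = price[t] * a + probs @ V[t + 1, j]
            if cost < best:
                best, best_a = cost, a
        V[t, i] = best
        policy[t, i] = best_a
```

The grid is chosen so that every reachable level falls exactly on a grid point; a finer discretization, or interpolation of V between grid points, would be needed in general.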

17:00
Valuation of aFRR Activation in Germany for a Standalone Battery

ABSTRACT. Our work aims to decompose the sources of revenue for a battery operating in various markets in Germany and to identify the most profitable ones. This involves backtesting an optimization strategy using historical data. Our approach simulates consecutive optimizations in the capacity market, the spot market, the intraday market, and the aFRR activation market within a deterministic framework. We demonstrate that, although the intraday and aFRR activation markets are significant sources of revenue during the period considered (mid-2022 to mid-2023), the additional cycling caused by aFRR activation should be managed with an appropriate bidding strategy.

16:00-17:30 Session 9H: Invited Session : Accelerated algorithms
16:00
A doubly nonlinear evolution system with threshold effects associated with dry friction
PRESENTER: Manh Hung Le

ABSTRACT. We investigate the asymptotic behavior of inertial dynamics with dry friction in convex optimization. We first focus on a doubly nonlinear first-order evolution inclusion involving two potentials: the differentiable objective function and the nonsmooth dry friction potential. The dry friction term acts on a combination of the velocity vector and the gradient of the objective function, ensuring that any stationary point corresponds to a critical point. To facilitate the convergence of the objective's gradient, we explore the dual formulation, which possesses a Riemannian gradient structure. Using the acceleration technique of Attouch, Bot, and Nguyen, we derive, from the first-order dynamics, second-order systems involving dry friction, vanishing viscous damping, and implicit Hessian-driven damping, which exhibit fast convergence properties.

16:30
Tikhonov Regularization in Continuous and Discrete Time Optimization

ABSTRACT. In a Hilbert setting, we study a second-order-in-time differential equation combining viscous and Hessian-driven damping and a Tikhonov regularization term. The dynamical system is related to the problem of minimizing a nonsmooth convex function. In the formulation of the problem, as well as in our analysis, we use the Moreau envelope of the objective function and its gradient, and heavily rely on their properties. We show that such systems preserve the well-known fast convergence properties of the function and Moreau envelope values along the trajectories and that, moreover, strong convergence of the trajectories to the minimal-norm element of the set of minimizers of the objective can be obtained. To address the discrete counterpart, numerical algorithms featuring Tikhonov regularization are also considered, and the same set of results is obtained. The talk will be a brief overview of the well-known results in this area, concluding with the presentation of our contribution to this topic, including several numerical examples.

17:00
A Search-Free O(1/k^{3/2}) Homotopy Inexact Proximal-Newton Extragradient Algorithm for Monotone Variational Inequalities
PRESENTER: M. Marques Alves

ABSTRACT. We present and study the iteration-complexity of a relative-error inexact proximal-Newton extragradient algorithm for solving smooth monotone variational inequality problems in real Hilbert spaces. We remove the search procedure of Monteiro and Svaiter (2012) by introducing a novel approach based on homotopy, which requires the resolution (at each iteration) of a single strongly monotone linear variational inequality. For a given tolerance $\rho>0$, our main algorithm exhibits pointwise $O\left(\frac{1}{\rho}\right)$ and ergodic $O\left(\frac{1}{\rho^{2/3}}\right)$ iteration-complexities.

16:00-17:30 Session 9I: Invited Session : Dynamical systems and algorithms with applications in optimization, sampling and variational analysis - part II
16:00
Making Wasserstein gradient flows noisy with Stochastic Moment Dynamic

ABSTRACT. In finite dimension, the Langevin diffusion is a noisy version of gradient descent which can be used to minimize a non-convex function with many local minima. Following this idea for Wasserstein gradient flows (i.e., gradient descent in the space of probability measures) is not straightforward, as there is no clear analogue of Brownian motion in this setting. We propose a simple process, the Stochastic Moment Dynamics, which is designed so that some desired moments (e.g. mean, variance, etc.) follow some given diffusion (e.g. Brownian motion). This provides controllability properties, enabling escape from local minima. The resulting process is a random flow over the space of probability measures, which can be described as a conditioned McKean-Vlasov diffusion or a stochastic PDE. In its basic form, it may explode in finite time (for instance, when trying to force the variance, which is non-negative, to be a Brownian motion). In practice it is approximated by interacting particles, influenced by a common noise. We show the convergence of the particle system toward the non-linear process up to the explosion time, explain how to remove the well-posedness issue while preserving the controllability properties, and illustrate the process on simple examples.

16:30
Stochastic Localization via Iterative Posterior Sampling

ABSTRACT. Building upon score-based learning, new interest in stochastic localization techniques has recently emerged. In these models, one seeks to noise a sample from the data distribution through a stochastic process, called the observation process, and progressively learns a denoiser associated with these dynamics. Apart from specific applications, the use of stochastic localization for the problem of sampling from an unnormalized target density has not been explored extensively. This work contributes to filling this gap. We consider a general stochastic localization framework and introduce an explicit class of observation processes, associated with flexible denoising schedules. We provide a complete methodology, Stochastic Localization via Iterative Posterior Sampling (SLIPS), to obtain approximate samples of these dynamics and, as a by-product, samples from the target distribution. Our scheme is based on a Markov chain Monte Carlo estimation of the denoiser and comes with detailed practical guidelines. We illustrate the benefits and applicability of SLIPS on several benchmarks of multi-modal distributions, including Gaussian mixtures in increasing dimensions, Bayesian logistic regression and a high-dimensional field system from statistical mechanics.

17:00
Inertial Dynamics and Accelerated Gradient Methods for Strongly Convex Functions
PRESENTER: Zepeng Wang

ABSTRACT. We introduce an inertial system with Hessian-driven damping, which plays an important role in fast convex optimization. Two energy functions are provided for the analysis of accelerated gradient methods for strongly convex functions. By doing so, we recover existing convergence rate results. Through the analyses, one can gain an intuitive understanding of the acceleration mechanism behind accelerated gradient methods. These analyses can also be extended to convex functions satisfying a quadratic functional growth condition, for which new convergence rate results are derived.
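For reference, a sketch of the constant-momentum accelerated gradient method for an L-smooth, mu-strongly convex function, whose (1 - sqrt(mu/L)) rate is the kind of result such energy-function analyses recover; the quadratic test function and all parameters are illustrative.

```python
import numpy as np

def accelerated_gradient(grad, x0, L, mu, iters=500):
    # Nesterov's method with constant momentum for strongly convex
    # objectives: gradient step from the extrapolated point y,
    # followed by momentum extrapolation with beta fixed by L and mu.
    kappa = L / mu
    beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
    x = np.asarray(x0, float)
    y = x.copy()
    for _ in range(iters):
        x_new = y - grad(y) / L          # gradient step at the extrapolated point
        y = x_new + beta * (x_new - x)   # momentum extrapolation
        x = x_new
    return x

# Example: f(x) = 0.5 x^T A x with eigenvalues in [mu, L] = [1, 100];
# the unique minimizer is the origin.
A = np.diag([1.0, 10.0, 100.0])
x = accelerated_gradient(lambda z: A @ z, np.ones(3), L=100.0, mu=1.0)
```

Compared with plain gradient descent, whose rate degrades with the condition number kappa, the momentum term improves the dependence to sqrt(kappa), which is the acceleration mechanism the energy-function analyses make explicit.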

16:00-17:30 Session 9J: Games
16:00
Chance-constrained stochastic zero-sum games
PRESENTER: Lucas Osmani

ABSTRACT. We consider two-person zero-sum stochastic dynamic games with random rewards. We model such games with a pair of chance-constrained optimization problems. The aim of each player is to get the maximum payoff he can guarantee with a given probability p in [0, 1], against the worst possible move from his opponent. We propose equivalent minimax reformulations when the rewards follow a multivariate Gaussian distribution. A player has a risk-seeking attitude if his confidence level p is less than 0.5; otherwise he is risk-averse. Both cases are discussed separately. We prove that the risk-seeking problem is equivalent to the constrained optimization of parameterized zero-sum stochastic games, which can be solved by a fixed-point iteration algorithm. We rewrite the risk-averse problem as a discrete minimax problem, which can be solved using a linearization method. We prove some convergence results for our algorithms, and illustrate them with numerical experiments.

16:30
Representation results for differential games and related neural-network approximations

ABSTRACT. We give new representation formulas for the value of differential games in a finite-horizon context. This leads to improved error estimates for some semi-discrete approximations. These representations can then be used to look for non-anticipative strategies in feedback form and to develop mesh-free approximation schemes such as neural-network schemes.

17:00
Escape Rate Games and Competitive Spectral Radii
PRESENTER: Loïc Marchesini

ABSTRACT. We introduce a new class of perfect-information repeated zero-sum games in which the payoff of one player is the escape rate of a dynamical system which evolves according to nonexpansive nonlinear operators depending on the actions of both players. Considering a closed convex cone in R^n equipped with the Funk metric (the one-sided version of Hilbert's projective metric), we obtain subclasses of matrix multiplication games, which provide a two-player extension of the joint spectral radius. We prove the existence of a uniform value for our game. We characterize the value by a two-player generalization of Mañé's lemma, involving a non-linear Ruelle-Perron-Frobenius operator S acting on the set of distance-like functions. Additionally, we provide a dual minimax characterization of the value when we consider the Funk metric, or when the dynamics are almost isometries.