2024 ALLERTON: 60TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING
PROGRAM FOR FRIDAY, SEPTEMBER 27TH

08:30-10:10 Session 7A: Electric Power Systems I

Invited Session

Location: 1030
08:30
Braess’ Paradoxes in Electrified Transportation

ABSTRACT. Transportation electrification introduces strong coupling between the power and transportation systems. In this work, we generalize the classical notion of Braess’ paradox to coupled power and transportation systems, and examine how the cross-system coupling induces new types of Braess’ paradoxes. To this end, we model the power and transportation networks as graphs, coupled with charging points connecting to nodes in both graphs. The power system operation is characterized by the economic dispatch optimization, while the transportation system user equilibrium models travelers’ route and charging choices. By analyzing simple coupled systems, we demonstrate that capacity expansion in either transportation or power system can deteriorate the performance of both systems, and uncover the fundamental mechanisms for such new Braess’ paradoxes to occur, leading to managerial insights for infrastructure planners. For general networks, through characterizing the generalized user equilibrium of the coupled systems, we develop efficient algorithms to detect Braess’ paradoxes and novel charging pricing policies to mitigate them.

08:50
Information Structures in AC/DC Grids

ABSTRACT. The converters in an AC/DC grid form actuated boundaries between the AC and DC subgrids. In both simple linear and balanced dq-frame models, the states on either side of these boundaries are coupled only by control inputs. In this paper, we show how this topological property imparts all AC/DC grids with poset-causal information structures. A practical benefit is that certain decentralized control problems that are hard in general are tractable for poset-causal systems. We also show that special cases like multi-terminal DC grids can have coordinated and leader-follower information structures.

09:10
Fair-Over-Time Distributed Energy Resource Coordination

ABSTRACT. There are numerous strategies to coordinate distributed energy resources (DERs) to provide a variety of services to the power grid. DER coordination can affect resources and participants unequally, for example, by excessively degrading or curtailing particular DERs more than others. However, few DER coordination strategies explicitly take into account fairness, equity, or justice. In this paper, we explore fairness metrics and their applicability to the DER coordination problem. In particular, we investigate metrics from machine learning to identify metrics that could be incorporated into DER coordination problems and we summarize fairness metrics that have been used in the power systems literature. A key challenge is that most existing fairness metrics are static -- ensuring fairness at a point in time. DER coordination problems are inherently dynamic and we often care about fairness over time, not at each time. The machine learning literature offers some ways to think about fairness over time and, more generally, how to incorporate fairness into dynamic power systems problems. We use a specific DER coordination problem -- the problem of computing dynamic operating envelopes -- to demonstrate how incorporation of a fair-over-time metric changes DER coordination solutions, and highlight the trade-offs that arise.

09:30
Transferable Learning of GCN Sampling Graph Data Clusters from Different Power Systems
PRESENTER: Tong Wu

ABSTRACT. Contemporary neural network (NN) detectors for power systems face two primary challenges. First, each power system requires individual training of NN detectors to accommodate its unique configuration and base demands. Second, significant changes within the power system, such as the introduction of new substations or new generators, necessitate retraining. To overcome these issues, we introduce a novel architecture, the Nodal Graph Convolutional Neural Network (NGCN), which utilizes graph convolutions at each bus and its neighborhood. This approach allows the training process to encompass multiple power systems and include all buses, thereby enhancing the transferability of the method across different power systems. The NGCN is particularly effective for detection tasks, such as cyber-attacks on smart inverters and false data injection attacks. Our tests demonstrate that the NGCN significantly improves performance over traditional NNs, boosting detection accuracy from approximately 85% to around 97% for the aforementioned task. Furthermore, the transferable NGCN, which is trained on samples from multiple power systems, performs considerably better in evaluations than the NGCN trained on a single power system.

09:50
Acknowledgment Feedback in Directed Graphs via Deterministic and Probabilistic Strategies

ABSTRACT. In networked systems where communication links are unreliable and error-prone, acknowledgement feedback is used to let the transmitting node know whether a packet has been successfully decoded at the receiving node. More specifically, an ACK (successful reception) or a NACK (unsuccessful reception) is sent from the receiving node to the transmitting node after each use of a particular communication link. Acknowledgment feedback is crucial in many distributed control applications, including distributed average consensus where it can be used to allow the nodes to reach asymptotic average consensus, while maintaining low values in local buffers and improving the convergence rate, even when information is exchanged over error-prone networks. The challenge in directed networks is that acknowledgement feedback requires a reverse channel (from the receiving node to the transmitting node) that is not necessarily available. In this paper, we propose and analyze deterministic and probabilistic strategies for sending acknowledgement feedback in directed graphs (digraphs) using directed paths from the receiving node to the transmitting node. Such directed paths may involve multiple communication links but are guaranteed to exist in strongly connected digraphs, as those commonly assumed in distributed averaging schemes. We discuss the impact of such acknowledgment feedback strategies on distributed average consensus over digraphs with unreliable communication links.

08:30-10:10 Session 7B: Distributed Machine Learning

Invited Session

Location: Auditorium
08:30
The Sample-Communication Complexity Trade-off in Federated Q-Learning

ABSTRACT. We consider the problem of federated Q-learning, where M agents aim to collaboratively learn the optimal Q-function of an unknown infinite-horizon Markov decision process with finite state and action spaces. We investigate the trade-off between sample and communication complexity for the widely used class of intermittent communication algorithms. We first establish the converse result, where we show that any federated Q-learning that offers a linear speedup with respect to the number of agents in the sample complexity per agent needs to incur a communication cost of at least an order of $\frac{1}{1-\gamma}$ up to logarithmic factors, where $\gamma$ is the discount factor. We also propose a new federated Q-learning algorithm, called Fed-DVR-Q, which is the first federated Q-learning algorithm to simultaneously achieve order-optimal sample and communication complexities. Thus, together these results provide a complete characterization of the sample-communication complexity trade-off in federated Q-learning.

08:50
Incentive Mechanism Design for Federated Learning with Unstateful Clients

ABSTRACT. Federated learning (FL) is an attractive distributed machine learning paradigm that enables numerous clients to collaboratively train a model under the coordination of a central server, while keeping the training data private. However, without sufficient incentive, clients may be reluctant to participate in FL due to the associated training costs and their varying availability. Moreover, clients' availability can be inconsistent, leading to periodic or random participation, a phenomenon known as unstatefulness. This variability renders existing incentive mechanisms, designed for full or partial client participation throughout the entire training process, ineffective.

In this work, we propose a game-theoretic incentive mechanism for FL with randomized client participation, where the server adopts a customized pricing strategy to motivate clients to participate at different levels (probabilities). Each client responds to the server’s monetary incentive by choosing its optimal participation level to maximize its profit, considering both the incurred local cost and its intrinsic value for the model. We show that intrinsic value (internal motivation) introduces the intriguing possibility of bidirectional payments between the server and clients, leading to a more efficient pricing strategy and enhanced model performance.

09:10
How Do System Dynamics Affect the Convergence of Federated Learning?

ABSTRACT. In federated learning (FL), the dynamic and unpredictable participation of clients poses significant challenges to algorithm convergence and performance. This talk delves into advanced analytical methodologies that integrate system statistics with convergence analysis, demonstrating how strategies such as adapting aggregation weights based on clients' participation history can ensure optimal convergence even when participation rates are unknown. Our theoretical framework introduces a novel methodology that connects estimation error with convergence, revealing critical insights into the performance of FL algorithms. By simplifying the impact of partial client participation into a single analytical term, we enable the derivation of comprehensive convergence bounds across diverse participation patterns. These theoretical advancements provide a deeper understanding of the efficiency and reliability of FL algorithms, ensuring their effective operation despite the inherent variability in the system.

09:30
Collaborative Exploration of Unknown Environments

ABSTRACT. In this talk we propose strategies based on potential function methods for multiple agents to collaboratively explore unknown environments with improved worst-case overhead and competitive ratio.

09:50
Personalized Federated Learning with Shared Linear Representation: Statistically Optimal Rates

ABSTRACT. Personalized federated learning enables clients and a central server to collaboratively learn a shared model while allowing individual clients to fine-tune and adapt the model to their specific needs privately. Current literature primarily focuses on algorithmic advancements and assumes that the local data volume far exceeds the model dimension, leaving the optimal sample complexity unresolved. In this paper, we determine the optimal sample complexity when clients share a common low-dimensional linear representation. We characterize statistically optimal rates for both unsupervised and supervised learning settings. Specifically, we establish information-theoretic lower bounds on estimation errors and design a federated method-of-moments estimator with local averaging to achieve these bounds. Our results demonstrate that the optimal error rates match those in centralized learning with a moderate number of agents and local dataset sizes. However, a statistical penalty arises due to distributed computing when there are many agents with insufficiently large or heterogeneous local datasets.

08:30-10:10 Session 7C: Information Theory, Statistics, and Data Science IV

Invited Session

Location: 1040
08:30
A Robust Testing Approach to Measuring Model Variability

ABSTRACT. Training deep neural networks generally uses stochastic optimization, which means each run produces a different model. Despite this variability, models are often deemed to be "equally good" based on their performance on the test set. A variety of measures for model variability have been proposed, usually based on the decisions made by classification models. In this work we propose using robust hypothesis testing to measure the difference between models via the marginal distribution of pre-thresholded predictions. In particular, we measure a trimming level between models and argue that it provides a more nuanced view of model variability. We apply this to understand the benefits of model ensembling.

08:50
An Information Geometric Analysis of Noise Contrastive Estimation

ABSTRACT. We study noise-contrastive estimation (NCE) through the lens of local information geometry. In particular, paralleling recent work [1], [2], we demonstrate that the optimal noise distribution for NCE is not the same as those typically chosen based on practical heuristics. Furthermore, we show that an important information geometric construct, the angle between the subspaces of distributions, dictates the asymptotic performance of an extension of NCE proposed in [3]. This analysis provides an information geometric interpretation of the method, encouraging further study of NCE and its variants using information geometric methods.

09:10
Impossibility of latent inner product recovery via rate distortion
PRESENTER: Shenduo Zhang

ABSTRACT. In a random geometric graph model or latent space model, the observed graph $A$ on $n$ vertices with average edge density $p$ is assumed to be generated from latent locations $z_1, \dots, z_n$ in $\mathbb{R}^d$ associated with the $n$ vertices. Given the graph $A$, it is of interest to estimate the inner products $\langle z_i, z_j \rangle$, which represent the geometry of the latent locations. In this note, assuming that the latent locations are Gaussian or spherical points, we show an impossibility result for inner product recovery when $d \gtrsim n h(p)$, where $h(p)$ is the binary entropy function. This matches the condition required for positive results on inner product recovery in the literature. The main technical ingredient of this work is a lower bound on the rate-distortion function of the Wishart distribution, which is interesting in its own right.

09:30
OpAMP: Linear Regressions with Erasures

ABSTRACT. Approximate message passing (AMP) is a powerful framework for the precise characterization of iterative inference algorithms. In recent work, we proposed linear operator approximate message passing (OpAMP) as a generalization of AMP that permits the data matrix to be passed through a linear operator at each iteration. In this paper, we apply this framework to large-scale linear regression subject to erasures. These erasures can be used to model scheduled partial computations as well as stragglers in distributed computing.

09:50
Chernoff Information as a Privacy Constraint for Adversarial Classification
PRESENTER: Ayse Unsal

ABSTRACT. This work inspects a privacy metric based on Chernoff information, \textit{Chernoff differential privacy}, due to its significance in characterizing the optimal classifier's performance. Adversarial classification, like any other classification problem, is built around minimizing the (average or correct detection) probability of error in deciding between the classes in binary classification. Unlike the classical hypothesis testing problem, where the false alarm and mis-detection probabilities are handled separately, resulting in an asymmetric behavior of the best error exponent, in this work we focus on the Bayesian setting and characterize the relationship between the best error exponent of the average error probability and $\varepsilon\textrm{-}$differential privacy. Accordingly, we re-derive Chernoff differential privacy in terms of $\varepsilon\textrm{-}$differential privacy using the Radon-Nikodym derivative and show that it satisfies the composition property for sequential composition. Subsequently, we present numerical evaluation results, which demonstrate that Chernoff information outperforms Kullback-Leibler divergence as a function of the privacy parameter $\varepsilon$, the impact of the adversary's attack, and global sensitivity for the problem of adversarial classification in Laplace mechanisms.

08:30-10:10 Session 7D: Wireless Networks
Location: 3000
08:30
Rendering Wireless Environments Useful for Gradient Estimators: A Zero-Order Stochastic Federated Learning Method
PRESENTER: Elissa Mhanna

ABSTRACT. Cross-device federated learning (FL) is a growing machine learning setting whereby multiple edge devices collaborate to train a model without disclosing their raw data. As ever more mobile devices participate in FL applications over the wireless environment, their limited uplink capacity becomes a critical bottleneck for practical implementation. In this work, we propose a novel doubly communication-efficient zero-order (ZO) method with a one-point gradient estimator that replaces communicating long vectors with scalar values and harnesses the nature of the wireless communication channel, overcoming the need to know the channel state coefficient. It is the first method to include the wireless channel in the learning algorithm itself instead of wasting resources to analyze it and remove its impact. We then offer a thorough analysis of the proposed zero-order federated learning (ZOFL) framework and prove that our method converges \textit{almost surely}, which is a novel result in nonconvex ZO optimization. We further prove a convergence rate of $O(\frac{1}{\sqrt[3]{K}})$ in the nonconvex setting. We finally demonstrate the potential of our algorithm with experimental results.

08:50
Mobility in Age-Based Gossip Networks

ABSTRACT. We consider a gossiping network where a source forwards updates to a set of $n$ gossiping nodes that are placed in an arbitrary graph structure and gossip with their neighbors. In this paper, we analyze how mobility of nodes affects the freshness of nodes in the gossiping network. To model mobility, we let nodes randomly exchange positions with other nodes in the network. The position of the node determines how the node interacts with the rest of the network. In order to quantify information freshness, we use the version age of information metric. We use the stochastic hybrid system (SHS) framework to derive recursive equations to find the version age for a set of positions in the network in terms of version ages of sets of positions that are one larger or of the same size. We use these recursive equations to find an upper bound for the average version age of a node in two example networks. We perform numerical simulations to analyze how mobility affects version age of different positions in the network and also show that the upper bounds obtained for version age of the example networks are tight.

09:10
Linear-time Scheduling for Time-varying Wireless Networks via Randomization

ABSTRACT. We study the problem of designing an efficient throughput-optimal joint routing and scheduling policy for time-varying networks, which have seen renewed attention with the surge in wireless edge-device systems. The only known solutions to the optimal control of dynamic networks require solving the NP-hard Max-Weight scheduling problem, hindering their practicality. Even in the simplified setting of time-invariant networks, previous work only considered efficient scheduling without routing, where any direct extension would only be able to incorporate unicast routing, which is restrictive in practice.

In this paper, we propose the Randomized Universal Max-Weight (RUMW) policy, which supports the generalized routing paradigm and relies on an incremental sampling and updating rule to find good scheduling decisions, resulting in a linear-time scheduling algorithm. We further show that the RUMW algorithm is throughput-optimal in the time-varying network setting. Of independent interest, the algorithmic structure of RUMW is directly congruent with the emerging Software-Defined Network (SDN) architecture.

09:30
Traffic-driven Spectrum and Power Allocation via Scalable Multi-agent Reinforcement Learning

ABSTRACT. This paper introduces a novel traffic-driven approach to radio resource allocation in cellular networks by leveraging a fully scalable multi-agent reinforcement learning (MARL) framework. The objective is to minimize the packet delay of links under stochastic arrivals, where access points (APs) make spectrum and power allocation decisions based on limited local information. Formulating this as a distributed learning problem, we implement the multi-agent proximal policy optimization (MAPPO) algorithm with recurrent neural networks and queueing dynamics to train flexible policies that map dynamic traffic and channel state information (CSI) to allocation decisions. The proposed MARL-based solution enables decentralized training and execution, ensuring scalability to large networks. Extensive simulations demonstrate that the proposed methods achieve packet delay performance comparable to genie-aided centralized algorithms while using only local information and reducing execution time. The trained policies also show scalability and robustness across various network sizes and traffic conditions.

10:30-12:30 Session 8A: Robust Inference and Learning

Invited Session

Location: 1030
10:30
Value Mirror Descent For Reinforcement Learning

ABSTRACT. We present a new class of mirror-descent-type methods for reinforcement learning. We show that these methods have the following desirable properties: 1) they possess strong and computable termination criteria; 2) they can achieve optimal sample complexity; and 3) they allow the application of function approximation.

10:50
Robust Conditional Value-at-Risk Based Risk-Sensitive Reinforcement Learning

ABSTRACT. Most existing studies on Robust Markov Decision Processes (RMDPs) focus on risk-neutral reinforcement learning (RL), with the goal of minimizing expected total discounted costs. In this paper, we analyze the robustness of Conditional Value-at-Risk (CVaR)-based risk-sensitive RL, which aims to minimize the CVaR risk measure. First, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity; thus, techniques in risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve these, we define a new risk measure named NCVaR and establish the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.

11:10
Characterization of Slow Convergence in Word-of-Mouth Social Learning

ABSTRACT. We consider word-of-mouth social learning involving M Kalman filter agents that operate sequentially. The first Kalman filter receives the raw observations, while each subsequent Kalman filter receives a noisy measurement of the conditional mean of the previous Kalman filter. The prior is updated by the M-th filter. When M=2, and the observations are noisy measurements of a Gaussian random variable, a remarkable result shown in the economics literature is that the covariance goes to zero as N^{-1/3} for N observations, instead of O(N^{-1}) as in the standard Kalman filter. In this paper we prove that for M agents, the covariance goes to zero as N^{-1/(2^M-1)}, i.e., the learning slows down exponentially with the number of agents. We also show that by artificially weighting the prior at each time, the learning rate can be restored to the optimal N^{-1}. The implication is that in word-of-mouth social learning, artificially re-weighting the prior can improve the learning rate. The result holds regardless of the variance of the noise that corrupts the conditional mean, illustrating the robustness of the formulation. Finally, numerical studies show that a similar slowdown in the learning rate occurs when multiple deep learning agents interact according to the above protocol.

11:30
Adaptive Text Watermark for Large Language Models

ABSTRACT. The advancement of Large Language Models (LLMs) has led to increasing concerns about the misuse of AI-generated text, and watermarking LLM-generated text has emerged as a potential solution. However, it is challenging to generate high-quality watermarked text while maintaining robustness, security, and the ability to detect watermarks without prior knowledge of the prompt and model. This paper proposes an adaptive text watermarking strategy to address this challenge. To improve text quality and maintain robustness, we adaptively add watermarking to token distributions with high entropy measured by an auxiliary model and keep the low-entropy token distributions untouched. For the sake of security and to further minimize the watermark's impact on text quality, instead of using a fixed green/red list generated from a random secret key, which can be vulnerable to decryption and forgery, we adaptively scale up the output logits based on the semantic embedding of previously generated text using a well-designed semantic mapping model. Our experiments involving various LLMs demonstrate that our approach achieves robustness performance comparable to existing watermark methods. Additionally, the text generated by our method has perplexity comparable to that of un-watermarked LLMs while maintaining sufficient security.

11:50
Achieving the Asymptotically Optimal Sample Complexity of Offline Reinforcement Learning: A DRO-Based Approach

ABSTRACT. Offline reinforcement learning aims to learn from pre-collected datasets without active exploration. This problem faces significant challenges, including limited data availability and distributional shifts. Existing approaches adopt a pessimistic stance towards uncertainty by penalizing rewards of under-explored state-action pairs to estimate value functions conservatively. In this paper, we show that the distributionally robust optimization (DRO) based approach can also address these challenges and is asymptotically minimax optimal. Specifically, we directly model the uncertainty in the transition kernel and construct an uncertainty set of statistically plausible transition kernels. We then show that the policy that optimizes the worst-case performance over this uncertainty set has a near-optimal performance in the underlying problem. We first design a distribution-based uncertainty set such that with high probability the true transition kernel is in this set. We prove that to achieve a sub-optimality gap of $\epsilon$, the sample complexity is $\mathcal{O}(S^2C^{\pi^*}\epsilon^{-2}(1-\gamma)^{-4})$, where $\gamma$ is the discount factor, $S$ is the number of states, and $C^{\pi^*}$ is the single-policy clipped concentrability coefficient which quantifies the distribution shift. To achieve the optimal sample complexity, we further propose a less conservative value-function-based uncertainty set, which, however, does not necessarily include the true transition kernel. We show that an improved sample complexity of $\mathcal{O}(SC^{\pi^*}\epsilon^{-2}(1-\gamma)^{-3})$ can be obtained, which asymptotically matches the minimax lower bound for offline reinforcement learning, and thus is asymptotically minimax optimal.

12:10
Distributionally Robust Optimal Transport for Improved Model Robustness

ABSTRACT. Distributionally Robust Optimization enhances robustness to model uncertainty by optimizing over an ambiguity set of distributions for uncertain model parameters. However, this method can result in excessive conservatism when the training and testing samples come from significantly different distributions. In this paper, we introduce a novel approach called Distributionally Robust Optimal Transport (DROT). In DROT, we minimize the worst-case performance over an uncertainty set constructed around transported samples. This approach reduces the conservatism by effectively shrinking the ambiguity set, thereby achieving robustness without excessive regularization. Our numerical experiments demonstrate the superiority of DROT over existing robust optimization methods.

10:30-12:10 Session 8B: Electric Power Systems II

Invited Session

Location: Auditorium
10:30
Balancing Fairness and Efficiency in Power Systems with Large and Small Users

ABSTRACT. The rise of large AI computing loads and distributed energy resources has introduced more diverse heterogeneous user groups in power systems. Without explicitly bringing fairness to their interactions, the disparity between the size of the users often leads to disproportionate payments and allocations for certain groups. We address this challenge by formalizing the problem of fair energy resource allocation. Our approach selects allocation schemes that lie on the Pareto front, balancing fairness and efficiency in resource allocation strategies.

10:50
Decentralized Equitable Energy Access in Energy Communities

ABSTRACT. We consider equitable access to energy in an energy community consisting of members from different socioeconomic backgrounds, such as unequal income levels and unequal abilities to access energy resources such as solar, storage, etc. Although optimal scheduling of energy consumption is well studied, incorporating equity measures in optimal decentralized energy access in real-time—the so-called ex-post equitable energy access—is not well understood. This paper formulates the decentralized equitable energy access problem as a bi-level optimization, where a non-profit operator designs a community pricing policy to maximize community welfare subject to an equitable access constraint, and the community members optimize individual consumption for the given prices. An optimal pricing policy and related properties are presented.

11:10
Stability-Constrained Voltage Control in Distribution Grids

ABSTRACT. Motivated by the deployment of distributed energy resources (DERs) in power distribution grids, this talk presents a data-driven framework to design local Volt/Var controllers capable of steering a power distribution network towards efficient network configurations. Our approach employs machine learning techniques to learn local surrogates that map voltages and reactive powers to ideal reactive power setpoints approximating solutions to the optimal reactive power flow problem. Equipped with these, we propose control update schemes and identify conditions on the surrogates and control parameters that guarantee that the resulting reactive power point globally asymptotically converges. For decentralized control, where controllers only have access to local measurements, we show the key role played by the monotonicity of the surrogates in ensuring asymptotic stability. We extend this idea to the case where the controllers can take advantage of an arbitrary communication infrastructure on top of the physical network. This allows the controllers to incorporate information beyond their local bus, covering the decentralized control case as a special case, and leading to less conservative constraints on the controller design. We train neural networks so that, by design, they meet the conditions on surrogates and illustrate the performance of the resulting control scheme. Our simulations show that the proposed framework guarantees voltage stability while significantly reducing the operation cost, and highlight the role of communication in improving control performance.

11:30
Exploring Feasible Transition from Local to Global Optima in Power Systems

ABSTRACT. The optimal power flow (OPF) problem is non-convex and will often yield multiple local optimal solutions. To be practically feasible in operation, a solution must be both dynamically stable and reachable from other operating points. In this paper we examine a 5-bus system that exhibits two local, stable OPF solutions, one of which is the global optimum. The two solutions of interest lie in separate regions of the disjoint feasible space of the OPF, and we explore the question of whether it is possible to move between the solutions without violating limits. We demonstrate via detailed simulation that a reactive power limit is violated when the voltage and power setpoints are suddenly changed, though the final operating point is eventually reached. As a main result, we show that there exists a path of setpoint adjustments that does enable the change of operating point, from the local optimum to the global optimum, without violating limits, and we explore practical control strategies to achieve this transition.

11:50
Reverse Kron Reduction on Trees

ABSTRACT. The Kron reduction (Schur complement) of the network admittance matrix can be estimated from partial measurements on an unbalanced three-phase distribution system. The identification problem is then to reverse the Kron reduction over a three-phase tree graph. We present a method to do so. The key idea is to decompose the Kron reduction into a sequence of iterations that maintain an invariance structure, and to exploit this structure to reverse each step of the iterative Kron reduction.
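For reference, the forward Kron reduction being inverted here is a Schur complement of the admittance matrix. A minimal numerical sketch on a hypothetical 3-bus single-phase network (the paper's setting is three-phase and tree-structured):

```python
import numpy as np

def kron_reduce(Y, keep, elim):
    """Schur complement of Y that eliminates the 'elim' buses:
    Y_red = Y_kk - Y_ke @ inv(Y_ee) @ Y_ek."""
    kk = np.ix_(keep, keep)
    ke = np.ix_(keep, elim)
    ek = np.ix_(elim, keep)
    ee = np.ix_(elim, elim)
    return Y[kk] - Y[ke] @ np.linalg.solve(Y[ee], Y[ek])

# 3-bus path 0--1--2 with line admittances 2 and 3; eliminate interior bus 1
Y = np.array([[ 2.0, -2.0,  0.0],
              [-2.0,  5.0, -3.0],
              [ 0.0, -3.0,  3.0]])
Y_red = kron_reduce(Y, keep=[0, 2], elim=[1])
# Effective series admittance between buses 0 and 2: 1/(1/2 + 1/3) = 1.2
```

Row sums of the reduced matrix remain zero, the invariance that makes Kron reduction of a Laplacian-like admittance matrix well behaved on trees.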

10:30-12:30 Session 8C: Game Theory
Location: 2000N
10:30
Linearly Solvable General-Sum Markov Games

ABSTRACT. Linearly solvable Markov decision processes (LSMDPs) are a special class of Markov decision processes (MDPs) in which the optimal value function under an exponential transformation satisfies a linear equation. This model was previously extended to a class of two-player, zero-sum Markov games in which the game's equilibrium value function can similarly be derived from a linear equation. In this work, a new class of linearly solvable n-player, general-sum Markov games is proposed. We show this class of games has a unique Nash equilibrium, and the equilibrium value functions can be derived from a single linear equation. We demonstrate how to approximate arbitrary discrete-state, discrete-action Markov games outside this class through an embedding process analogous to the way LSMDPs approximate standard MDPs. An off-policy reinforcement learning algorithm for general-sum Markov games, which we call Nash-Z learning, is presented. We empirically demonstrate that Nash-Z learning finds equilibrium policies with low regret and that it finds these solutions orders of magnitude faster than the classic Nash-Q learning algorithm.
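For context, the "linearly solvable" structure inherited from single-agent LSMDPs can be sketched concretely: under the exponential transform z(s) = exp(-v(s)), the optimal desirability solves a linear system. A toy first-exit example with hypothetical numbers (the paper's n-player construction generalizes this idea):

```python
import numpy as np

# Toy first-exit LSMDP: q are state costs, P is the passive transition
# matrix, and state 2 is an absorbing zero-cost goal. The transformed
# value z = exp(-v) satisfies the LINEAR equation z = diag(exp(-q)) P z.
q = np.array([0.1, 0.5, 0.0])
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.0, 1.0]])
G = np.diag(np.exp(-q)) @ P

interior, goal = [0, 1], [2]
z = np.ones(3)                                   # z = 1 at the zero-cost goal
A = np.eye(2) - G[np.ix_(interior, interior)]
z[interior] = np.linalg.solve(A, G[np.ix_(interior, goal)] @ z[goal])
v = -np.log(z)                                   # optimal value function
```

No iteration over actions is needed: one linear solve recovers the optimal value function, which is the property the proposed class of general-sum games preserves.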

10:50
Constrained Correlated Equilibria

ABSTRACT. This paper introduces constrained correlated equilibrium, a solution concept combining correlation and coupled constraints in finite non-cooperative games. In a correlated equilibrium, players coordinate their actions based on common random signals, yielding outcomes that can be more efficient than Nash equilibria but may lack feasibility when considering certain constraints. We present the formal definition of the concept, illustrating its relevance through examples. Furthermore, we analyze its fundamental properties, including relations to correlated equilibria in the general case of arbitrary constraints. In the particular case of constraints induced by a feasible set of probability distributions over action profiles, we show that canonical correlation devices suffice to characterize the set of constrained correlated equilibrium distributions and provide a sufficient condition for the existence of a constrained correlated equilibrium. Finally, we illustrate these results through numerical examples.

11:10
Fairness of Equilibrium in Proportional Sharing Mechanisms with Bid Costs

ABSTRACT. This paper considers the proportional sharing mechanism from the lens of fairness. Deviating from prior work, we consider the net utility to users by incorporating a weighted cost of bidding; one example is the use of power, incurred at user-dependent costs, to competitively bid for throughput in wireless systems. In this context, we analyze the price of anarchy (PoA) of the equilibrium when the social welfare is measured both by a fair-sharing utility measure and by the sum of utility measures. In the context of power bids by users, we consider both continuous and discrete bids. The analysis shows that the PoA can be zero: even for a uniform linear utility function under the sum-total measure, with distinct bid costs the PoA is asymptotically zero as the number of users increases, indicating that a number of users/clients were starved of resources. This illustrates that accounting for the cost of bids can substantially change the structure of the equilibrium as compared to uniform costs. We also investigate the number of such starved users experimentally.

In order to determine equilibria, we give convex program formulations as well as approximation methods that compute a (1 + ϵ)-approximate equilibrium via an efficient algorithm. We also design methods to compute the optimal solution. Optimizing the social utility in the general case, where multiple resources can be chosen by users, is shown to be NP-hard. Experimental results illustrate the asymptotic behavior of the PoA and of the number of users that are starved of resources as the number of users (competition) increases.
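A toy sketch of the setting under study: proportional allocation with user-specific bid costs, where iterated best responses over a bid grid (a crude stand-in for the paper's algorithms, with hypothetical linear utilities) already exhibit a starved user at equilibrium:

```python
import numpy as np

# Capacity C is split in proportion to bids, x_i = C * b_i / sum(b);
# user i's net utility subtracts a user-specific bid cost c_i:
#     U_i(b) = a_i * x_i - c_i * b_i
C = 1.0
a = np.array([1.0, 1.0, 1.0])      # uniform linear utility slopes
c = np.array([0.5, 1.0, 2.0])      # distinct per-user bid costs

def net_utility(b, i):
    s = b.sum()
    x_i = C * b[i] / s if s > 0 else 0.0
    return a[i] * x_i - c[i] * b[i]

# Crude equilibrium search: iterated best responses over a bid grid
grid = np.linspace(0.0, 1.0, 2001)
b = np.full(3, 0.1)
for _ in range(60):
    for i in range(3):
        trial = b.copy()
        utils = []
        for g in grid:
            trial[i] = g
            utils.append(net_utility(trial, i))
        b[i] = grid[int(np.argmax(utils))]
# At equilibrium the high-cost user (c = 2) bids 0 and receives nothing.
```

With these numbers the low-cost users settle near bids 4/9 and 2/9, while the user with cost 2 is starved, the qualitative effect the abstract attributes to heterogeneous bid costs.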

11:30
Price of Anarchy of a Censorship Policy removing Misinformation in a Social Network

ABSTRACT. The proliferation of misinformation on social media has become a significant concern, particularly in the realms of political discourse and public health. Censorship policies have emerged as a solution to limit the spread of misinformation. However, although censorship reduces the proportion of misinformation disseminated, it also creates an implied truth effect which skews the perception of less reliable information. This paper investigates the impact of censorship policies in an online social network model where agents sequentially observe an article and decide whether or not to share it with others. We measure the impact of censorship on the virality of articles containing misinformation and observe that while censorship can effectively reduce the spread of misinformation, it also allows less reliable articles to spread over the network. Specifically, we quantify the “price of anarchy” associated with these censorship policies using a formal model that incorporates agents’ beliefs, network structure, and content reliability. Unlike the usual frameworks of resource allocation games in communication networks, we show that the price of anarchy is unbounded, and we exhibit minimal limiting-case scenarios.

11:50
Suppressing Overestimation in Q-Learning through Adversarial Behaviors
PRESENTER: Hyeann Lee

ABSTRACT. The goal of this paper is to propose a new Q-learning algorithm with a dummy adversarial player, called dummy adversarial Q-learning (DAQ), that can effectively regulate the overestimation bias in standard Q-learning. With the dummy player, the learning can be formulated as a two-player zero-sum game. The proposed DAQ unifies several Q-learning variations for controlling overestimation biases, such as maxmin Q-learning and minmax Q-learning (proposed in this paper), in a single framework. DAQ is a simple but effective way to suppress the overestimation bias through dummy adversarial behaviors and can be easily applied to off-the-shelf value-based reinforcement learning algorithms to improve their performance. A finite-time convergence analysis of DAQ is given from an integrated perspective by adapting an adversarial Q-learning analysis. The performance of the proposed DAQ is empirically demonstrated in various benchmark environments.
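The overestimation bias that DAQ targets can be illustrated numerically (a toy demonstration, not the paper's algorithm): with equally good actions and noisy value estimates, the max operator in standard Q-learning targets is biased upward, while a pessimistic min over estimates, as in the maxmin-style schemes DAQ unifies, suppresses it.

```python
import numpy as np

# 5 actions, all with true Q = 0; two independent noisy estimators.
rng = np.random.default_rng(0)
noisy = rng.normal(0.0, 1.0, size=(10000, 2, 5))   # trials x estimators x actions

# Standard target: max over one noisy estimate (biased above the true 0)
single_max = noisy[:, 0].max(axis=1).mean()

# Adversarial min over the estimator ensemble first, then max
maxmin = noisy.min(axis=1).max(axis=1).mean()
```

Here `single_max` sits far above the true value of 0, while `maxmin` is pulled back toward it by the adversarial minimization.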

12:10
Non-Monotone Variational Inequalities

ABSTRACT. In this paper, we focus on deriving sufficient conditions for the existence of solutions to non-monotone Variational Inequalities (VIs) based on inverse mapping theory. We obtain several widely applicable sufficient conditions for this problem and introduce a sufficient condition for the existence of a Minty solution. We show that the extra-gradient method converges to a solution of the VI in the presence of a Minty solution. Additionally, we show that, under an additional assumption, the algorithm is efficient and approaches a particular type of Minty solution. Interpreting these results in an equivalent game-theoretic problem, we obtain weak coupling conditions, stating that if the players' cost functions are sufficiently weakly coupled, the game has a pure quasi-Nash equilibrium. Moreover, under the additional assumption of the existence of Minty solutions, a pure Nash equilibrium exists for the corresponding game.
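For intuition on the extra-gradient method referenced above, a minimal sketch on the classic rotational operator F(x) = (x2, -x1) over a box (a monotone example chosen to show the mechanics; the paper's contribution concerns the non-monotone case). Plain forward steps x <- x - eta*F(x) spiral outward here, while extra-gradient converges to the solution x* = 0:

```python
import numpy as np

def F(x):
    # Skew-symmetric (rotational) operator: monotone but not strongly monotone
    return np.array([x[1], -x[0]])

def project(x):
    # Projection onto the feasible box [-1, 1]^2
    return np.clip(x, -1.0, 1.0)

x = np.array([0.9, -0.7])
eta = 0.1
for _ in range(500):
    y = project(x - eta * F(x))      # extrapolation (look-ahead) step
    x = project(x - eta * F(y))      # update using the operator at y
```

The look-ahead evaluation of F at y is what damps the rotation and yields convergence where a single projected step fails.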

10:30-12:10 Session 8D: Large Language Models

Invited Session

Chairs:
Location: 3000
10:30
From Large Language Models to Large Agent Models: Planning and Reasoning with Physical World Knowledge

ABSTRACT. While Large Language Models excel in language processing, Large Agent Models are designed to interact with the environment. This transition poses significant challenges in understanding lower-level visual details, and long-horizon reasoning for effective goal interpretation and decision-making. Despite the impressive performance of LLMs/VLMs on various benchmarks, these models perceive images as bags of words (semantic concepts). In detail, they use semantic understanding as a shortcut but lack the ability to recognize geometric structures or solve spatial problems such as mazes. To interact with the physical world, we focus on two dimensions: (1) From high-level semantic to low-level geometric understanding: We introduce a low-level visual description language that serves as geometric tokens, allowing the abstraction of multimodal low-level geometric structures. (2) From fast-thinking to slow-thinking: We propose to quantify long-horizon reasoning by incorporating Markov Decision Process (MDP) based decision-making. The key difference between language models and agent models lies in their decision-making capabilities. This fundamental difference necessitates a shift in how we approach the development of large agent models, focusing on both geometric understanding and long-term planning to create more capable embodied AI agents.

Bio: Manling Li is an Assistant Professor at Northwestern University. She was a postdoc at Stanford University and obtained her PhD degree in Computer Science at the University of Illinois Urbana-Champaign in 2023. She works on the intersection of language, vision, and robotics. Her work on multimodal knowledge extraction won the ACL'20 Best Demo Paper Award, and her work on scientific information extraction from COVID literature won the NAACL'21 Best Demo Paper Award. She was a recipient of a Microsoft Research PhD Fellowship in 2021, an EECS Rising Star in 2022, a DARPA Riser in 2022, etc. She served on the Organizing Committees of ACL 2025 and EMNLP 2024, and delivered tutorials about multimodal knowledge at IJCAI'23, CVPR'23, NAACL'22, AAAI'21, ACL'21, etc. Additional information is available at https://limanling.github.io/.

10:50
Mitigating Backdoor Threats to Large Language Models

ABSTRACT. Despite the success of Large Language Models (LLMs), their increasingly scaled sizes, as well as their growing deployment in systems, services, and scientific studies, are bringing along more and more emergent issues in security and privacy. One significant area of security concern for LLMs is their susceptibility during the training phase. Adversaries can exploit this vulnerability by strategically contaminating a small fraction of the training data, leading to the introduction of backdoors or a significant degradation in model performance. In this introductory talk, we will begin by discussing training-time threats, delving into various attack types, including sample-agnostic attacks such as word- or sentence-level trigger attacks, and sample-dependent attacks such as syntactic, paraphrasing, and back-translation attacks. Subsequently, turning to the emergent LLM development processes of instruction tuning and RLHF, we will discuss how attackers may capitalize on these processes, injecting tailored instruction-following examples or manipulating ranking scores to purposefully alter the model’s behavior. We will also shed light on the far-reaching consequences of training-time attacks across diverse LLM applications. Moving forward, we will introduce threat mitigation strategies for the training stage, where defenders can measure and counteract the influence of poisoned data within the training process, and for the inference stage, where defenders can detect and eliminate poisoned data given the compromised model.

11:10
Foundation Models for Robotic Manipulation: Opportunities and Challenges

ABSTRACT. Foundation models, such as GPT-4 Vision, have achieved significant milestones in the fields of natural language processing and vision, demonstrating exceptional abilities to adapt to new tasks and scenarios. However, physical interactions—such as cooking, cleaning, or caregiving—remain a frontier where foundation models and robotic systems have yet to achieve the desired level of adaptability and generalization. In this talk, I will explore the opportunities for incorporating foundation models into traditional robotic pipelines to endow robots with capabilities beyond those achievable with conventional robotic tools. The core idea behind this research is to introduce novel representations and integrate structural priors into robot learning systems, leveraging the commonsense knowledge learned from foundation models to achieve the best of both worlds. I will demonstrate how such integration enables robots to interpret instructions given in free-form natural language and allows for finer-grained robotic manipulation, considering both spatial and temporal constraints. The performance of the system is showcased through a wide range of real-world manipulation tasks involving rigid, articulated, and deformable objects. Towards the end of the talk, I will discuss the limitations of current foundation models, the challenges that still lie ahead, and potential avenues to address these challenges.

11:30
Dependence Induced Representations
PRESENTER: Xiangxiang Xu

ABSTRACT. We study the problem of learning feature representations from a pair of random variables, where we focus on the representations that are induced by their dependence. We provide sufficient and necessary conditions for such dependence induced representations, and illustrate their connections to Hirschfeld--Gebelein--R\'{e}nyi (HGR) maximal correlation functions and minimal sufficient statistics. We characterize a large family of loss functions that can learn dependence induced representations, including cross entropy, hinge loss, and their regularized variants. In particular, we show that the features learned from this family can be expressed as the composition of a loss-dependent function and the maximal correlation function, which reveals a key connection between representations learned from different losses. Our development also gives a statistical interpretation of the neural collapse phenomenon observed in deep classifiers. Finally, we present a learning design based on feature separation, which allows hyperparameter tuning during inference.
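For finite alphabets, the HGR maximal correlation that the abstract connects to has a simple spectral characterization: it is the second-largest singular value of the matrix B[x, y] = P(x, y) / sqrt(P(x) P(y)). A small sketch with a hypothetical joint distribution:

```python
import numpy as np

P = np.array([[0.30, 0.10],
              [0.10, 0.50]])           # hypothetical joint distribution P(x, y)
px, py = P.sum(axis=1), P.sum(axis=0)
B = P / np.sqrt(np.outer(px, py))      # canonical dependence matrix

svals = np.linalg.svd(B, compute_uv=False)
# The top singular value is always 1 (carried by constant functions);
# the second one is the HGR maximal correlation. Here it equals 7/12.
hgr = svals[1]
```

For binary variables, this value coincides with the ordinary Pearson correlation of the two indicators, which offers a quick sanity check.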

11:50
Efficient LLM Serving via Lossy Computation

ABSTRACT. Large language models (LLMs) have exhibited human-like conversation ability. Yet scaling LLMs to longer contexts, such as extracting information from lengthy articles, one of the most fundamental tasks in healthcare applications, poses significant challenges. The primary issues are their inability to handle contexts beyond pre-training lengths and the system constraints that make deployment difficult, as memory requirements for inference increase with context length. The key idea to overcome these issues is that LLMs are extremely robust to the noise from lossy computation, like low-precision computation. Following this insight, we will discuss some recent results in serving LLMs at scale, especially in handling longer contexts. To deal with the algorithmic challenge, I will share our recent work on algorithmic support to extend the context length of LLMs to at least 8X longer by coarsening the positional information of distant tokens. To deal with the system challenge, I'll discuss our recent efforts in quantizing the intermediate states of past tokens to 2-bit numbers, which leads to a 3.5X wall-clock time speedup without harming performance. Finally, I will discuss our latest projects applying LLMs in healthcare applications, especially how we utilize retrieval techniques for long contexts to reduce the hallucination problem of healthcare chatbots.
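The 2-bit quantization of past-token states mentioned above can be illustrated with a generic uniform quantizer (a simplified stand-in, not the actual KV-cache scheme described in the talk):

```python
import numpy as np

def quant2bit(x):
    """Uniform 2-bit quantization: map x onto 4 levels between min and max."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 3.0                      # 2 bits -> 4 levels (0..3)
    q = np.clip(np.round((x - lo) / scale), 0, 3).astype(np.uint8)
    return q, lo, scale

def dequant(q, lo, scale):
    return q.astype(np.float64) * scale + lo

x = np.linspace(-1.0, 1.0, 8)                    # stand-in for a cached tensor
q, lo, scale = quant2bit(x)
x_hat = dequant(q, lo, scale)                    # error is at most scale / 2
```

Storing `q` plus the pair (lo, scale) replaces 32- or 16-bit floats with 2-bit codes, which is the memory saving that makes long-context serving tractable, at the cost of the bounded reconstruction noise the models turn out to tolerate.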

10:30-12:10 Session 8E: Distributed Control, Optimization, and Learning II

Invited Session

Location: 1040
10:30
On the role of network structure in learning to coordinate with bounded rationality

ABSTRACT. Many socioeconomic phenomena, such as technology adoption, collaborative problem-solving, and content engagement, involve a collection of agents coordinating to take a common action, aligning their decisions to maximize their individual goals. We consider a model for networked interactions where agents learn to coordinate their binary actions under a strict bound on their rationality. We first prove that our model is a potential game and that the optimal action profile always aims to achieve perfect alignment at one of the two possible actions, regardless of the network structure. Using a stochastic learning algorithm known as Log Linear Learning, where agents have the same finite rationality parameter, we show that the probability of agents successfully agreeing on the correct decision monotonically increases in the number of network links. Therefore, more connectivity improves the accuracy of collective decision-making, as predicted by the phenomenon known as ``Wisdom of Crowds.'' Finally, we show that for a fixed number of links, a regular network maximizes the probability of success. We conclude that when using a network of irrational agents, promoting more homogeneous connectivity improves the accuracy of collective decision-making.
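A minimal simulation sketch of the Log Linear Learning dynamics described above, on a complete graph with hypothetical parameters (the paper's results cover general network structures):

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta, h, T = 6, 2.0, 0.1, 20000    # agents, rationality, bonus for action 1
a = rng.integers(0, 2, size=n)        # binary actions on a complete graph

consensus_steps = 0
for t in range(T):
    i = rng.integers(n)               # a uniformly chosen agent revises
    u = np.zeros(2)
    for x in (0, 1):
        # Payoff: number of matching neighbors, plus a bonus for the
        # "correct" action 1
        u[x] = sum(a[j] == x for j in range(n) if j != i) + (h if x == 1 else 0.0)
    p1 = 1.0 / (1.0 + np.exp(-beta * (u[1] - u[0])))   # log-linear (logit) choice
    a[i] = int(rng.random() < p1)
    if t >= T // 2 and (a == a[0]).all():
        consensus_steps += 1

frac = consensus_steps / (T - T // 2)  # fraction of late steps in consensus
```

With a finite rationality parameter beta, agents occasionally deviate, but the dynamics spend almost all of their time at a consensus profile, consistent with the optimal profiles being perfectly aligned.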

10:50
Regularity and Approximations for Partially Observed Stochastic Control and Applications to Learning

ABSTRACT. Partially observed optimal stochastic control presents challenging mathematical problems with significant practical modeling relevance. This paper focuses on partially observed Markov decision processes (POMDPs), where the controller has access to noisy versions of a hidden Markov process. The standard approach for POMDPs is to lift the problem to a probability measure-valued space using so-called {\it belief states}, which are the posterior distributions of the hidden state variable given the available information to the controller. The belief states form sufficient information for optimality, allowing the use of standard methods for fully observed stochastic control problems to approach the POMDPs. However, the resulting measure-valued control problem is generally not easy to work with analytically or computationally. One needs to approximate the problem and establish regularity properties of the POMDPs to ensure the consistency of the approximations.

In this work, we first focus on the regularity properties of the belief MDPs by providing sufficient conditions for the {\it weak Feller} property of the dynamics. These conditions, in turn, are shown to be sufficient for the consistency of a certain class of approximations in the belief space.

The construction of the belief states requires model knowledge, and therefore, they may not be a good compression scheme for POMDPs when the model is not available. To this end, we also introduce an alternative reduction of the POMDPs by introducing a new sufficient information structure based on so-called {\it finite memory history variables}. We then show that this alternative formulation can be used to construct approximation methods using finite-length feedback variables. Furthermore, we demonstrate that the approximation performance of the finite memory controllers is closely related to the notion of controlled filter stability. Finally, we show that one can construct finite memory Q-learning methods to learn the solution of this approximate model.
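The belief-state construction at the heart of this approach is the Bayes filter; a minimal two-state sketch with hypothetical transition and observation matrices:

```python
import numpy as np

T = np.array([[0.9, 0.1],      # T[x, x']: transition kernel for a fixed action
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],      # O[x', y]: observation likelihoods
              [0.3, 0.7]])

def belief_update(b, y):
    """One filter step: b'(x') is proportional to O[x', y] * sum_x T[x, x'] b(x)."""
    pred = b @ T               # prediction through the dynamics
    post = pred * O[:, y]      # correction by the observation likelihood
    return post / post.sum()

b = np.array([0.5, 0.5])       # prior belief over the hidden state
b = belief_update(b, y=0)      # observing y = 0 shifts mass toward state 0
```

As the passage notes, running this update requires knowing T and O; the finite-memory alternative replaces the belief with a window of recent observations and actions, avoiding that model dependence.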

11:10
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies

ABSTRACT. Many multi-agent systems in practice are decentralized and have dynamically varying dependencies. There have been few attempts in the literature to analyze these systems theoretically. In this paper, we propose and theoretically analyze a decentralized model with dynamically varying dependencies called the Locally Interdependent Multi-Agent MDP. This model can represent problems in many disparate domains such as cooperative navigation, obstacle avoidance, and formation control. Despite the intractability that general partially observable multi-agent systems suffer from, we propose three closed-form policies that are theoretically near-optimal in this setting and are scalable to compute and store. Consequently, we reveal a fundamental property of Locally Interdependent Multi-Agent MDPs: the partially observable decentralized solution is exponentially close to the fully observable solution with respect to the visibility radius. We then discuss extensions of our closed-form policies to further improve tractability. We conclude by providing simulations to investigate some long-horizon behaviors of our closed-form policies.

11:30
Do LLM Agents Have Regret? A Case Study in Online Learning and Games

ABSTRACT. Large language models (LLMs) have been increasingly employed for (interactive) decision-making, via the development of LLM-based autonomous agents. Despite their emerging successes, the performance of LLM agents in decision-making has not been fully investigated through quantitative metrics, especially in the multi-agent setting when they interact with each other, a typical scenario in real-world LLM-agent applications. To better understand the limits of LLM agents in these interactive environments, we propose to study their interactions in benchmark decision-making settings in online learning and game theory, through the performance metric of \emph{regret}. We first empirically study the {no-regret} behaviors of LLMs in canonical (non-stationary) online learning problems, as well as the emergence of equilibria when LLM agents interact through playing repeated games. We then provide some theoretical insights into the no-regret behaviors of LLM agents, under certain assumptions on the supervised pre-training and the rationality model of human decision-makers who generate the data. Notably, we also identify (simple) cases where advanced LLMs such as GPT-4 fail to be no-regret. To promote the no-regret behaviors, we propose a novel \emph{unsupervised} training loss of \emph{regret-loss}, which, in contrast to the supervised pre-training loss, does not require the labels of (optimal) actions. We then establish the statistical guarantee of generalization bound for regret-loss minimization, followed by the optimization guarantee that minimizing such a loss may automatically lead to known no-regret learning algorithms. Our further experiments demonstrate the effectiveness of our regret-loss, especially in addressing the above ``regrettable'' cases.
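For concreteness, the external-regret metric used as the performance measure above, together with Hedge (multiplicative weights), a classic no-regret algorithm of the kind the paper's regret-loss connects to, can be sketched as follows (toy random rewards and hypothetical parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
T, K, eta = 2000, 3, 0.05
rewards = rng.uniform(0.0, 1.0, size=(T, K))   # reward of each action per round

w = np.ones(K)                                 # Hedge weights
payoff = 0.0
for t in range(T):
    p = w / w.sum()                            # play the normalized weights
    payoff += p @ rewards[t]                   # expected reward this round
    w *= np.exp(eta * rewards[t])              # multiplicative update

# External regret: best fixed action in hindsight minus what was earned
regret = rewards.sum(axis=0).max() - payoff
avg_regret = regret / T                        # vanishes as T grows (no-regret)
```

The average regret shrinking toward zero is exactly the no-regret property the paper tests LLM agents for, and that its regret-loss training is designed to promote.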