next day
all days

View: session overviewtalk overview

08:30-10:10 Session WeA1: Intersection of Learning, Optimization, and Control I

Organizer: Bin Hu

Location: Library
Mauricio C. de Oliveira (University of California San Diego, United States)
Yang Zheng (University of California San Diego, United States)
Convex Parameterization of Stabilizing Controllers and its LMI-based Computation via Filtering

ABSTRACT. Various new implicit parameterizations for stabilizing controllers that allow one to impose structural constraints on the controller have been proposed lately. They are convex but infinite-dimensional, formulated in the frequency domain with no available efficient methods for computation. In this talk, we introduce a kernel version of the Youla parameterization to characterize the set of stabilizing controllers. It features a single affine constraint, which allows us to recast the controller parameterization as a novel robust filtering problem. This makes it possible to derive the first efficient Linear Matrix Inequality (LMI) implicit parametrization of stabilizing controllers. Our LMI characterization not only admits efficient numerical computation but also guarantees a full-order stabilizing dynamical controller that is efficient for practical deployment. Numerical experiments demonstrate that our LMI can be orders of magnitude faster to solve than the existing closed-loop parameterizations.

Thinh Doan (Virginia Tech, United States)
Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games

ABSTRACT. We study the problem of finding the Nash equilibrium in a two-player zero-sum Markov game. Due to its formulation as a minimax optimization program, a natural approach to solve the problem is to perform gradient descent/ascent with respect to each player in an alternating fashion. However, due to the non-convexity/non-concavity of the underlying objective function, theoretical understandings of this method are limited. In our paper, we consider solving an entropy-regularized variant of the Markov game. The regularization introduces structure into the optimization landscape that make the solutions more identifiable and allow the problem to be solved more efficiently. Our main contribution is to show that under proper choices of the regularization parameter, the gradient descent ascent algorithm converges to the Nash equilibrium of the original unregularized problem. We explicitly characterize the finite-time performance of the last iterate of our algorithm, which vastly improves over the existing convergence bound of the gradient descent ascent algorithm without regularization. Finally, we complement the analysis with numerical simulations that illustrate the accelerated convergence of the algorithm.

Daniel Pfrommer (University of Pennsylvania, United States)
Thomas Zhang (University of Pennsylvania, United States)
Stephen Tu (Google, United States)
Nikolai Matni (University of Pennsylvania, United States)
TaSIL: Taylor Series Imitation Learning

ABSTRACT. We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control. TaSIL penalizes deviations in the higher-order Taylor series terms between the learned and expert policies. We show that experts satisfying a notion of incremental input-to-state stability are easy to learn, in the sense that a small TaSIL-augmented imitation loss over expert trajectories guarantees a small imitation loss over trajectories generated by the learned policy. We provide sample-complexity bounds for TaSIL that scale as Õ(1/n) in the realizable setting, for n the number of expert demonstrations. Finally, we demonstrate experimentally the relationship between the robustness of the expert policy and the order of Taylor expansion required in TaSIL, and compare standard Behavior Cloning, DART, and DAgger with TaSIL-loss-augmented variants. In all cases, we show significant improvement over baselines across a variety of MuJoCo tasks.

Alireza Fallah (MIT, United States)
Ali Makhdoumi (Duke University, United States)
Azarakhsh Malekian (University of Toronto, United States)
Asu Ozdaglar (MIT, United States)
Optimal Private Data Acquisition: Central and Local Differential Privacy

ABSTRACT. We study the design of optimal Bayesian data acquisition mechanisms for a platform interested in estimating the mean of a distribution by collecting data from privacy-conscious users with heterogeneous privacy demands. In particular, we consider two different types of privacy losses that users suffer from. The first one is the privacy loss of a user when she shares her data (only) with the platform, and the second one is the privacy loss through the released estimate (to the public) by the platform. The users share their data in exchange for a payment (e.g., through monetary transfers or services) that compensates for their privacy losses. The platform does not know the privacy sensitivity of users and must design a mechanism to solicit their preferences and then deliver both local and central privacy guarantees while minimizing the estimation error plus the expected payment to users. We first establish minimax lower bounds for the estimation error, given a vector of privacy guarantees, and show that a linear estimator is (near) optimal. We then turn to our main goal: designing an optimal data acquisition mechanism. We establish that the design of such mechanisms in a Bayesian setting (where the platform knows the distribution of users' sensitivities and not their realizations) can be cast as a nonconvex optimization problem. Additionally, for the class of linear estimators, we prove that finding the optimal mechanism admits a Polynomial Time Approximation Scheme.

08:30-10:10 Session WeA2: High Dimensional Statistics and Learning I

Organizers: Bruce Hajek, Jiaming Xu, Yihong Wu

Yuguo Chen (University of Illinois at Urbana-Champaign, United States)
Location: Solarium
Anderson Ye Zhang (University of Pennsylvania, United States)
Harrison H Zhou (Yale University, United States)
Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering

ABSTRACT. The singular subspaces perturbation theory is of fundamental importance in probability and statistics. It has various applications across different fields. We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one and establish a novel perturbation upper bound for the distance between two corresponding singular subspaces. It is well-suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds such as Wedin's Theorem. Powered by this leave-one-out perturbation theory, we provide a deterministic entrywise analysis for the performance of the spectral clustering under mixture models. Our analysis leads to an explicit exponential error rate for the clustering of sub-Gaussian mixture models.

Nived Rajaraman (University of California, Berkeley, United States)
Yanjun Han (University of California, Berkeley, United States)
Lin F. Yang (University of California, Los Angeles, United States)
Kannan Ramchandran (University of California, Berkeley, United States)
Jiantao Jiao (University of California, Berkeley, United States)
Near-optimal algorithms for Imitation Learning

ABSTRACT. We study the fundamental limits and efficient algorithms for imitation learning in Markov decision processes (MDP). It is known that Behavior Cloning (BC) achieves suboptimality that grows quadratically with the horizon in imitation learning, which is termed error compounding in the literature. However, the only existing lower bound for BC is about optimization error, and it is not clear whether error compounding is a fundamental phenomenon related to sample inefficiency. We show that when the MDP transition function is unknown, all algorithms have to suffer a suboptimality that grows quadratically with the horizon even if the optimization error is zero and the algorithm can interactively query the expert such as in the setting of DAGGER. This negative result suggests that overcoming error compounding requires imposing more assumptions in imitation learning. We show two approaches to mitigate the quadratic error compounding: active query with recoverable mistakes and known transition. First, we establish that if the expert knows how to recover from a temporary mistake and continue on the optimal path, then a carefully designed no-regret learning algorithm that actively queries the expert can provably beat the quadratic error compounding and achieve linear suboptimality growth in the horizon. Second, we transform the intuition that we should always utilize transition function knowledge to only visit states the expert has visited into a precise optimization problem named Mimic-MD, and introduce a novel technique named artificial trajectory synthesis to provably break the quadratic dependence on the horizon and improve the exponent to 3/2. Last, we show that under the additional assumption that the expert is optimal for the true reward function in the known transition setting, one can in fact further improve the exponent from 3/2 to 1 via two new algorithms in some special cases of MDP. One of the two new algorithms is intimately connected to inverse reinforcement learning and shows the provable benefit of running inverse reinforcement learning for imitation learning. We implement multiple instantiations of our approach on several continuous control tasks and find that we are able to significantly improve policy performance across a variety of dataset sizes.

Yucheng Liu (UC Davis, United States)
Xiaodong Li (UC Davis, United States)
Selecting the Number of Communities in Count-Weighted Networks

ABSTRACT. In this work we propose a stepwise method in selecting the number of communities in general count-weighted networks, which are common in real-world network data. In each step with a candidate number of communities $m=2,3,\ldots$, we first cluster the nodes into $m$ groups with SCORE, and then estimate the mean parameters of a generic degree-corrected stochastic block models (DCSBM). Then we normalize the adjacency matrix, where the normalizing factors are determined by the estimated degree-correction parameters as well as scaling factors that are obtained by solving a small-scale matrix scaling problem. The eigenvalues of the resultant normalized adjacency matrix are truncated with the threshold $(2+\epsilon)\sqrt{n}$ in magnitude, where $\epsilon$ is a prespecified small constant, such as $0.05$ or $0.01$. The stepwise procedure continues if the the number of remaining significant eigenvalues is greater than $m$, and stops otherwise.

A prominent feature of our method is that it is derived from a general underdispersed DCSBM, and can be applied to general count-weighted networks. Our method is shown to be consistent in estimating the true number of communities for the underdispersed DCSBM with certain assumptions, which hold for examples such as the Poisson and Bernoulli cases. An extension of the nonsplitting property of SCORE to count-weighted networks, and recent results concerning spectral radii of inhomogeneous random graphs, play essential roles in the analyses. Extensive numerical experiments on simulated and real data have also been conducted to demonstrate the empirical behaviors of our stepwise method.

08:30-10:10 Session WeA3: Learning I
Location: Butternut
Shachar Shayovitz (Tel Aviv University, Israel)
Meir Feder (Tel Aviv University, Israel)
Active Learning for Individual Data via Minimal Stochastic Complexity

ABSTRACT. Modern machine learning systems require massive amounts of labeled training data in order to achieve high accuracy rates. Active learning uses feedback to label the most informative data points and significantly reduce the labeling effort. Many heuristics for selecting data points have been developed in recent years which are usually tailored to a specific task and a general unified framework is lacking.

In this work, the individual data setting is considered and an active learning criterion is proposed. In this setting the features and labels, both in the training and the test, are specific individual, deterministic quantities. Motivated by connections between source coding and minimax learning, the proposed criterion attempts to find data points which minimize the average Predictive Normalized Maximum Likelihood (pNML) on the unlabeled test set. It is shown using a real data set that the proposed criterion performs better than other active learning criteria.

Jayanth Regatti (The Ohio State University, United States)
Hao Chen (The Ohio State University, United States)
Abhishek Gupta (The Ohio State University, United States)
Byzantine Resilience With Reputation Scores

ABSTRACT. Training distributed machine learning algorithms is prone to Byzantine attacks where the adversarial workers send corrupted model updates to derail the training. In this paper, we propose a reputation score based gradient aggregation as a possible solution. We introduce a class of novel stochastic gradient descent algorithms, ByGARS (Byzantine Gradient Aggregation using Reputation Scores) that involve computing reputation scores (of workers) using an auxiliary dataset at the server. These reputation scores are then used for aggregating the gradients (model updates) at the server. Under reasonable assumptions, we show that using these reputation scores is robust to any number of adversaries and prove the convergence of a representative algorithm, ByGARS++ for strongly convex objective functions using results from two-timescale stochastic approximation theory. The computational complexity of ByGARS++ is the same as the usual distributed stochastic gradient descent method with only an additional inner product computation in every iteration. We also demonstrate the effectiveness of the algorithms for non-convex learning problems using MNIST and CIFAR-10 datasets against almost all state-of-the-art Byzantine attacks.

Tomer Gafni (Ben-Gurion university of the Negev, Israel)
Kobi Cohen (Ben-Gurion university of the Negev, Israel)
Yonina Eldar (Weizmann Institute of Science, Israel)
CoBAAF: Controlled Bayesian Air Aggregation Federated Learning from Heterogeneous Data

ABSTRACT. Federated learning (FL) is an emerging machine learning paradigm for training models across multiple edge devices holding local data sets. Due to communications overhead, complexity, and privacy requirements, the global training at the server is done without receiving the entire original data from the edge devices, but only outputs of their local processing and training. A major challenge in FL is to reduce the bandwidth and energy consumption due to the repeated transmissions of large volumes of data by a large number of users over the wireless channel. Recently, over-the-air (OTA) FL has been suggested to achieve this goal. In this setting, all users transmit their data signal simultaneously over a Multiple Access Channel (MAC), and the computation is done over the wireless channel via air aggregation. However, existing methods suffer from performance degradation when handling statistical heterogeneity of users data. In this paper we address this challenge. We present a novel algorithm, dubbed Controlled Bayesian Air Aggregation Federated-learning (CoBAAF), that handles statistical heterogeneity in noisy networks using a joint design of three main steps in FL: Model distribution, local training, and global aggregation. Specifically, CoBAAF controls the drift in local updates using a correction term, and allows users to transmit their data signal simultaneously over MAC. Second, it adopts a Bayesian approach to average properly the channel output, thus mitigating the effect of the noise and fading induced by the channel. We analyze the convergence of CoBAAF to the loss minimizing model theoretically, showing its ability to achieve a convergence rate similar to that achieved over error-free channels. Extensive simulation results demonstrate the improved convergence of CoBAAF for training in machine learning problems.

Bryce Ferguson (University of California, Santa Barbara, United States)
Daigo Shishika (George Mason University, United States)
Jason Marden (University of California, Santa Barbara, United States)
Ensuring the Defense of Paths and Perimeters in Dynamic Defender-Attacker Blotto Games (dDAB) on Graphs
PRESENTER: Bryce Ferguson

ABSTRACT. When can we defend a region of space from being breached by an adversary? To answer this, we consider the Dynamic Defender-Attacker Blotto Game (dDAB) played on a graph, in which a fleet of defensive assets must be maneuvered through a network with the objective of defending a specified set of nodes from an adversarial fleet. In this framework, we seek to understand how the defender's objective (which nodes to defend in the graph) affects the amount of resources required to provide a safety guarantee. Particularly, we are interested in understanding how to defend paths-through and perimeters-of regions of space. Using the dDAB framework, we show the necessary and sufficient size of the defender's fleet needed to guarantee the defense of line graphs and cycle graphs, respectively. Then we show how these results can be viewed through a spatial interpretation to give insights on defense strategies in 2 Dimensional spaces. We find that more defender assets are needed for paths that are less direct (longer than the shortest path) and perimeters that have nooks and crevasses (non-convex).

Jialing Liao (Linköping University, Sweden)
Zheng Chen (Linköping University, Sweden)
Erik G. Larsson (Linköping University, Sweden)
Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations
PRESENTER: Erik G. Larsson

ABSTRACT. In this paper, we consider privacy aspects of wireless federated learning (FL) with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server. By exploiting the waveform superposition property of multiple access channels, OtA FL enables the users to transmit their updates simultaneously with only linear processing techniques, which improves resource efficiency. However, this setting is vulnerable to privacy leakage since an adversary node can hear directly the uncoded message. Traditional perturbation-based methods provide privacy protection while sacrificing the training accuracy due to the reduced signal-to-noise ratio. In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server at the same time. More explicitly, spatially correlated perturbations are added to the gradients at the users before transmission. Utilizing the zero-sum property of the correlated perturbations, the side effect of added perturbation on the aggregated gradient update at the edge server can be minimized. In the meanwhile, the added perturbation will not be canceled out at the adversary due to the heterogeneity of wireless channels and misaligned power scaling at the adversary, which prevents privacy leakage. Theoretical analysis on the perturbation covariance matrix, differential privacy, and model convergence are provided, based on which an optimization problem is formulated to jointly design the covariance matrix and the amplitude scaling factor to balance between privacy protection and convergence performance.

08:30-10:10 Session WeA4: Information Theory and Coding I
Location: Pine
Shuche Wang (National University of Singapore, Singapore)
Yuanyuan Tang (University of Virginia, United States)
Ryan Gabrys (University of California-San Diego, United States)
Farzad Farnoud (University of Virginia, United States)
Permutation Codes for Correcting a Burst of at Most $t$ Deletions
PRESENTER: Yuanyuan Tang

ABSTRACT. Codes over permutations have received significant attention due to their applications in memory devices and powerline transmission systems. In this paper, we construct a family of permutation codes that are capable of correcting a burst of at most $t$ deletions with $\log n+O(\log\log n)$ bits of redundancy where $t$ is a constant, which significantly improves upon the state-of-the-art construction with redundancy $2t\log n$ bits.

Krishna Gopal Benerjee (Indian Institute of Technology Kanpur, India)
Adrish Banerjee (Indian Institute of Technology Kanpur, India)
Bounds on Reversible, Complement, Reversible-Complement, Constant Weight Sum Codes

ABSTRACT. For an even-length alphabet, a code is called Reversible, Complement, Reversible-Complement, Constant Weight Sum code (RCCWS code) if the code is closed under reverse, complement, and reverse-complement sequences, and for given symbols in the alphabet, the total number of occurrence of those symbols in each sequence is identical. In this work, we have obtained an upper bound (Sphere Packing bound) and a lower bound (Gilbert-Varshamov bound) on the size of an RCCWS code defined over an even-size alphabet. As a special case of these two bounds, we also have obtained an upper and lower bound on the size of DNA codes that satisfies Reverse, Complement, Reverse-complement, and GC-content constraints.

Nicolas Charpenay (IRISA, France)
Maël Le Treust (ETIS UMR 8051, CY Université, ENSEA, CNRS, France)
Aline Roumy (INRIA, France)
Zero-error Coding for Computing with Encoder Side-information

ABSTRACT. We study the zero-error source coding problem in which an encoder with side-information (SI) $g(Y)$ transmits source symbols $X$ to a decoder. The decoder has SI $Y$ and wants to recover $f(X,Y)$ where $f,g$ are deterministic. We exhibit a condition on the source distribution and $g$ that we call ``pairwise shared SI'', such that the optimal rate has a single-letter expression. This condition is satisfied if every pair of source symbols ``share" at least one SI symbol for all output of $g$. It has a practical interpretation, as $Y$ models a request made by the encoder on an image $X$, and $g(Y)$ corresponds to the type of request. It also has a graph-theoretical interpretation: under ``pairwise shared SI'' the characteristic graph can be written as a disjoint union of OR products. In the case where the source distribution is full-support, we provide an analytic expression for the optimal rate. We develop two examples, one under ``pairwise shared SI'', the other one with full-support source distribution; and we show that the optimal coding scheme outperforms trivial strategies in both examples.

Keerthana Gurushankar (Carnegie Mellon University, United States)
Praveen Venkatesh (Allen Institute, United States)
Pulkit Grover (Carnegie Mellon University, United States)
Extracting Unique Information Through Markov Relations

ABSTRACT. We propose two new measures for extracting the unique information in $X$ and not $Y$ about a message $M$, when $X, Y$ and $M$ are joint random variables with a given joint distribution. We take a Markov based approach, motivated by questions in fair machine learning, and inspired by similar Markov-based optimization problems that have been used in the Information Bottleneck and Common Information frameworks. We obtain a complete characterization of our definitions in the Gaussian case (namely, when $X, Y$ and $M$ are jointly Gaussian), under the assumption of Gaussian optimality. We also examine the consistency of our definitions with the partial information decomposition (PID) framework, and show that these Markov based definitions achieve non-negativity, but not symmetry, within the PID framework.

Rony Bou Rouphael (ETIS UMR 8051, ENSEA, CNRS, CY Cergy Paris Université, France)
Maël Le Treust (ETIS UMR 8051, ENSEA, CNRS, CY Cergy Paris Université, France)
Strategic Communication via Cascade Multiple-Description Network

ABSTRACT. In decentralised decision-making problems, agents choose their actions based on locally available information and knowledge about decision rules or strategies of other agents. We consider a three-node cascade network with an encoder, a relay and a decoder, having distinct objectives captured by cost functions. In such a cascade network, agents choose their respective strategies sequentially, as a response to the former agent's strategy and in a way to influence the decision of the latter agent in the network. We assume the encoder commits to a strategy before the communication takes place. Upon revelation of the encoding strategy, the relay commits to a strategy and reveals it. The communication starts, the source sequence is drawn and processed by the encoder and relay. Then, the decoder observes a sequences of symbols, updates its Bayesian posterior beliefs accordingly, and takes the optimal action. This is an extension of the Bayesian persuasion problem in the Game Theory literature. In this work, we provide an information-theoretic approach to study the fundamental limit of the strategic communication via three-node cascade network. Our goal is to characterize the optimal strategies of the encoder, the relay and the decoder, and study the asymptotic behavior of the encoder's minimal long-run cost function.

10:30-12:30 Session WeB1: Intersection of Learning, Optimization, and Control II

Organizer: Bin Hu

Location: Library
Jorge I. Poveda (University of Colorado, Boulder, United States)
Momentum-Based Learning and Optimization Dynamics with Stochastic Restarting

ABSTRACT. We study a class of algorithms that incorporate continuous time dynamics with dynamic momentum, and discrete-time stochastic resets. The resets serve as restarting techniques used to guarantee stability and exponential convergence in a general class of optimization and equilibrium seeking problems. The momentum allows to achieve acceleration under suitable strong monotonicity properties. The stability and convergence features of the algorithm are studied using tools from stochastic hybrid dynamical system's theory. The advantages and limitations of the proposed approach are illustrated via numerical examples. To the best of our knowledge this is the first work that uses stochastic hybrid dynamical systems theory to provide stability and convergence guarantees in accelerated dynamics with stochastic restarting.

Yuanyuan Shi (UCSD, United States)
Stability and Safety Constrained Reinforcement Learning for Voltage Control

ABSTRACT. To achieve high penetration of distributed renewable generations at the distribution grid level, maintaining the voltage levels within the safe limit has been increasingly challenging. Recently, significant efforts have been put into the design of real-time feedback controllers for voltage control purposes. Despite the progress, most existing work has only been able to optimize the steady-state cost. In the meanwhile, transient performance is of equal importance. For example, once a voltage violation happens, an important goal is to bring the voltage profile back to the safe limit as soon as possible, or in other words, to minimize the voltage recovery time. However, optimizing the transient cost like the voltage recovery time has long been challenging as this is a nonlinear control problem. The challenge has further been complicated by the fact that the exact topology and the parameters of distribution networks are often unknown or inexact.

Reinforcement Learning (RL), is a promising tool to address the above challenges. Despite the promise, major hurdles remain before the deployment of RL in power systems is possible. Especially, stability and safety guarantees are crucial, e.g. the ability to maintain a safe operating point in the face of disturbances. Failure to maintain stability and safety can lead to violation of regulatory requirements or even catastrophic consequences, e.g., blackouts. However, off-the-shelf RL algorithms usually focus on optimizing certain performance metrics (e.g., reward) and do not explicitly consider stability and safety. In this talk, I would like to introduce two recent works that we developed to ensure provable stability and safety guarantees of reinforcement learning for real-time voltage control in distribution systems.

In the first part of the talk, I will talk about a stability-constrained RL framework that learns a control policy to optimize transient cost for voltage control with provable stability guarantees. The key idea underlying our approach is that we show strict monotonicity of the policy is sufficient to formally guarantee stability. In particular, we consider the Krasovskii's method for Lyapunov function construction. A sufficient condition for the system to be stable is that the derivative of the Lyapunov function is negative definite along the system trajectories. For the voltage control application, we proved that the stability condition boils down to that the voltage controller should be a monotonically decreasing function that goes through the origin.

With this stability result, we propose a stable RL approach that integrates the monotone policy constraint with policy optimization. We parameterize the controllers by a single hidden layer neural network and restrict the weights and the biases of the neural networks such that the resulting function is monotonically decreasing. The proposed method enables us to leverage the power of RL to improve the transient performance of voltage control without knowing the underlying model parameters, and in the meanwhile provably guarantee stability during and after the training. To the best of our knowledge, this is the first RL approach that learns nonlinear policies with stability guarantees for voltage control.

Besides stability, it is also crucial to guarantee system safety when deploying learning-based controllers for power systems. To clarify the difference between safety and stability requirements, safety requires the system states to stay within a certain range (e.g., $\pm 5$\% of the nominal voltage value), while stability requires the system to converge to a certain point (e.g. restoring to the nominal value). In the second part of the talk, I will introduce SAVER, a novel safety projection layer that projects the reactive power injection outputted by the trained RL policy into a safety set of nodal voltage magnitudes. The proposed method can fast compute the safe reactive power injections in terms of voltage constraints with guarantees. Moreover, the safety learning framework can be embedded as a lightweight plug-and-play module into our stability-constrained RL framework, thus obtaining stability and safety guarantees at the same time.

We demonstrate the effectiveness of the approach in the simulation of the Southern California Edison 56 bus distribution system with high penetration of photovoltaic (PV) generations. The proposed method can reduce the transient control cost by more than 30% and shorten the response time by a third compared to a widely used linear policy, while always achieving voltage stability and safety. In contrast, standard RL methods often fail to maintain voltage stability and safety.

Yuan Cheng (University of Science and Technology of China, China)
Jing Yang (Pennsylvania State University, United States)
Yingbin Liang (The Ohio State University, United States)
Representation-assisted Reinforcement Learning under Distribution Shift

ABSTRACT. This work focuses on the reinforcement leaning (RL) setting under distribution shift. The transition distribution of the Markov decision process (MDP) is modeled via a low-rank model, where the representation remains the same for both upstream and downstream tasks, whereas the linear weights are different capturing the distribution shift. For upstream, we design a novel algorithm that learns a near accurate representation via reward-free RL exploration. We then theoretically characterize the closeness between the learned and the ground truth representations using a notion of representation divergence proposed in Du et al. (2021). Then for the downstream task, given an offline dataset, by leveraging the learned representation from upstream, we propose a sample efficient algorithm, which we show to require provably fewer data samples to find near optimal policy.

David Fridovich-Keil (University of Texas at Austin, United States)
Mixed Strategies with Finite Support in Continuous Spaces: A Case Study in Trajectory Games

ABSTRACT. As autonomous systems become more widely deployed, their performance depends strongly upon interactions with human beings and other autonomous actors. This talk presents a game theoretic view, and focuses on a key technical challenge which emerges. That is, what is the right space of strategies to consider when competing with other agents? We shall discuss recent results (which appeared at this year’s Robotics: Science & Systems conference) in which we see that, by jointly reasoning about continuous actions and mixing probabilities, agents can achieve superior performance in noncooperative trajectory games.

10:30-12:30 Session WeB2: Sequential Methods I

Organizer: G. Fellouris and V. Veeravalli

Location: Solarium
Fei Sha (University of Nebraska-Lincoln, United States)
Ruizhi Zhang (University of Georgia, United States)
Quickest Detection of the Change of Community via Stochastic Block Models

ABSTRACT. Community detection is a fundamental problem in network analysis and has important applications in sensor networks and social networks. In many cases, the community structure of the network may change at some unknown time and thus it is desirable to come up with efficient monitoring procedures that can detect the change as quickly as possible. In this work, we use the Erdos-Renyi model and the bisection stochastic block model (SBM) to model the pre-change and post-change distributions of the network, respectively. That is, initially, we assume there is no community in the network. However, at some unknown time, a change occurs, and two communities are formed in the network. We then propose an efficient monitoring procedure by using the number of $k$-cycles in the graph. The asymptotic detection properties of our proposed procedure are derived when all parameters are known. A generalized likelihood ratio (GLR) type detection procedure and an adaptive CUSUM type detection procedure are constructed to address the problem when parameters are unknown.

Topi Halme (Aalto university, Finland)
Eyal Nitzan (KIOXIA America, Israel)
Visa Koivunen (Aalto university, Finland)
Spatio-temporal Multiple Change-point Detection in Sensor Networks

ABSTRACT. Rapid detection of spatial events that propagate across a sensor network is of wide interest in many modern applications. In particular, in communications, radar, IoT, environmental monitoring, and biosurveillance, we may observe propagating fields or particles. In this presentation, we propose Bayesian sequential single and multiple change-point detection procedures for the rapid detection of such phenomena. The multi-sensor system configuration may be arbitrary and sensors may be mobile. We derive the structure of the optimal single-event quickest detection procedure, which minimizes the average detection delay (ADD) subject to a false alarm probability upper bound.In the rare event regime, the optimal procedure converges to a more practical threshold test on the posterior probability of the change point. A convenient recursive computation of this posterior probability is derived. The proposed procedure is extended for detecting multiple propagating spatial events in parallel using the multiple hypothesis testing (MHT) framework. A method that provides strict false discovery rate (FDR) control is proposed. The performance of the methods is studied analytically and in simulations.

Tim Brucks (University of Texas at San Antonio, United States)
Taposh Banerjee (University of Pittsburgh, United States)
Rahul Mishra (ISRO, India)
Quickest Detection of a Threat to an Impending Disaster

ABSTRACT. The problem of detecting the presence of a signal that can lead to a disaster is studied. A decision-maker collects data sequentially over time. At some point in time, called the change point, the distribution of the data changes due to an event or a sudden arrival of an object. If not detected quickly, this change has the potential to cause a major disaster. Optimal and approximately optimal algorithms are obtained to solve this problem. The problem has applications in infrastructure monitoring and satellite safety.

Rui Zhang (Georgia Institute of Technology, United States)
Yao Xie (Georgia Institute of Technology, United States)
Rui Yao (Google, United States)
Feng Qiu (Argonne National Lab, United States)
Online detection of cascading change-points using diffusion networks

ABSTRACT. We propose an online detection procedure for cascading failures in the network from sequential data, which can be modeled as multiple correlated change-points happening during a short period. We consider a temporal diffusion network model to capture the temporal dynamic structure of multiple change-points and develop a sequential Shewhart procedure based on the generalized likelihood ratio statistics based on the diffusion network model assuming unknown post-change distribution parameters. We also tackle the computational complexity posed by the unknown propagation. Numerical experiments demonstrate good performance for detecting cascade failures.

Liyan Xie (The Chinese University of Hong Kong, Shenzhen, China)
George Moustakides (University of Patras, Greece)
Yao Xie (Georgia Institute of Technology, United States)
Window-Limited CUSUM for Sequential Change Detection

ABSTRACT. We study the parametric online changepoint detection problem, where the underlying distribution of the streaming data changes from a known distribution to an alternative that is of a known parametric form but with unknown parameters. We propose a joint detection/estimation scheme, which we call Window-Limited CUSUM, that combines the cumulative sum (CUSUM) test with a sliding window-based consistent estimate of the post-change parameters. We characterize the optimal choice of the window size and show that the Window-Limited CUSUM enjoys first-order asymptotic optimality. Compared to existing schemes with similar asymptotic optimality properties, our test is far simpler in implementation because it can recursively update the CUSUM statistic by employing the estimate of the post-change parameters. A parallel variant is also proposed that facilitates the practical implementation of the test. Numerical simulations corroborate our theoretical findings.

10:30-12:30 Session WeB3: Networks, Learning, and Algorithms I

Organizers: Bruce Hajek and R. Srikant

Thinh Doan (Virginia Tech, United States)
Location: Butternut
Hsu Kao (University of Michigan Ann Arbor, United States)
Chen-Yu Wei (University of Southern California, United States)
Vijay Subramanian (University of Michigan, United States)
Decentralized Cooperative Reinforcement Learning with Hierarchical Information Structure

ABSTRACT. Multi-agent reinforcement learning (MARL) problems are challenging due to information asymmetry. To overcome this challenge, existing methods often require high level of coordination or communication between the agents. We consider two-agent multi-armed bandits (MABs) and Markov decision processes (MDPs) with a hierarchical information structure arising in applications, which we exploit to propose simpler and more efficient algorithms that require no coordination or communication. In the structure, in each step the “leader” chooses her action first, and then the “follower” decides his action after observing the leader’s action. The two agents observe the same reward (and the same state transition in the MDP setting) that depends on their joint action. For the bandit setting, we propose a hierarchical bandit algorithm that achieves a near-optimal gap-independent regret of \tilde{O}(\sqrt{ABT}) and a near-optimal gap-dependent regret of O(\log(T)), where A and B are the numbers of actions of the leader and the follower, respectively, and T is the number of steps. We further extend to the case of multiple followers and the case with a deep hierarchy, where we both obtain near-optimal regret bounds. For the MDP setting, we obtain \tilde{O}(\sqrt{H^7S^2ABT}) regret, where H is the number of steps per episode, S is the number of states, T is the number of episodes. This matches the existing lower bound in terms of A, B, and T.

Anand Kalvit (Columbia University, United States)
Assaf Zeevi (Columbia University, United States)
Bandits with Dynamic Arm-acquisition Costs

ABSTRACT. We consider a bandit problem where at any time, the decision maker can add new arms to her consideration set. A new arm is queried at a cost from an arm-reservoir containing finitely many arm-types, each characterized by a distinct mean reward. The cost of query reflects in a diminishing probability of the returned arm being optimal, unbeknown to the decision maker; this feature encapsulates defining characteristics of a broad class of operations-inspired online learning problems, e.g., those arising in markets with churn, or those involving allocations subject to costly resource acquisition. The decision maker’s goal is to maximize her cumulative expected payoffs over a sequence of pulls, oblivious to the statistical properties of the arms as well as their types. We study two natural modes of endogeneity in the reservoir distribution and characterize (tight) necessary conditions for achievability of sublinear regret in the problem. We also provide a granular analysis of the effects of endogeneity on the performance of algorithms that are agnostic to the reservoir distribution. We believe our algorithm design principles and theoretical findings may be of broader interest and guide future work in the area.

Jiacheng Tang (GE Global Research, United States)
Abhishek Gupta (The Ohio State University, United States)
A Survey of Dynamic Watermarking Algorithms and Some Recent Developments

ABSTRACT. Dynamic watermarking is an active intrusion detection technique that can potentially detect replay attacks, spoofing attacks, and deception attacks in the feedback channel in a networked control systems. In this paper, we provide a survey of dynamic watermarking algorithms for linear systems, nonlinear systems, and Markov decision processes. We demonstrate the effectiveness of the proposed dynamic watermarking algorithm by detecting a spoofing attack in a sensor network system. We further provide some directions for the future research.

Divyansh Jhunjhunwala (Carnegie Mellon University, United States)
Ankur Mallick (Carnegie Mellon University, United States)
Gauri Joshi (Carnegie Mellon University, United States)
Leveraging Spatial and Temporal Correlations in Distributed Learning

ABSTRACT. We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be prohibitive, and it may be imperative for them to use sparsification techniques. While most existing work on sparsified mean estimation is agnostic to the characteristics of the data vectors, in many practical applications such as federated learning, there may be spatial correlations (similarities in the vectors sent by different nodes) or temporal correlations (similarities in the data sent by a single node over different iterations of the algorithm) in the data vectors. We leverage these correlations by simply modifying the decoding method used by the server to estimate the mean. We provide an analysis of the resulting estimation error as well as experiments for PCA, K-Means and Logistic Regression, which show that our estimators consistently outperform more sophisticated and expensive sparsification methods.

10:30-12:30 Session WeB4: Information Theory and Coding II
Location: Pine
Linghui Zhou (KTH Royal Institute of Technology, Sweden)
Tobias Oechtering (KTH Royal Institute of Technology, Sweden)
Mikael Skoglund (KTH Royal Institute of Technology, Sweden)
Uncertainty in Biometric Identification and Authentication Systems with Strong Secrecy

ABSTRACT. In this work, we derive fundamental bounds of biometric identification and authentication systems under uncertainty considerations. The uncertainty lies in two aspects, i.e., the probability mass function (p.m.f.) of users' biometric information and the observation channel. For the p.m.f. of users' biometrics, it is assumed that the users are from different groups and the distribution of each group is determined by the state. For the observation channel, it is assumed that there is a set of channels and the actual observation channel is from this set. It is shown that our work is more general and can be reduced to the setup without uncertainty studied in the literature. Moreover, we show that strong secrecy can be achieved without the loss of the other performance.

Junghoon Kim (Purdue University, United States)
Seyyedali Hosseinalipour (Purdue University, United States)
Taejoon Kim (University of Kansas, United States)
David Love (Purdue University, United States)
Christopher Brinton (Purdue University, United States)
Linear Coding for Gaussian Two-Way Channels
PRESENTER: Junghoon Kim

ABSTRACT. We consider linear coding for Gaussian two-way channels (GTWCs), in which each user generates the transmit symbols by linearly encoding both its message and the past received symbols (i.e., the feedback information) from the other user. In Gaussian one-way channels (GOWCs), Butman has proposed a well-developed model for linear encoding that encapsulates feedback information into transmit signals. However, such a model for GTWCs has not been well studied since the coupling of the encoding processes at the users in GTWCs renders the encoding design non-trivial and challenging. In this paper, we aim to fill this gap in the literature by extending the existing signal models in GOWCs to GTWCs. With our developed signal model for GTWCs, we formulate an optimization problem to jointly design the encoding/decoding schemes for both the users, aiming to minimize the weighted sum of their transmit powers under signal-to-noise ratio constraints. Further, we provide new insights on the encoding design for GTWCs. In particular, we show that it is optimal that one of the users (i) does not transmit the feedback information to the other user at the last channel use, and (ii) transmits its message only over the last channel use. With these solution behaviors, we further simplify the problem and solve it via an iterative two-way optimization scheme. We numerically demonstrate that our proposed scheme for GTWCs achieves a better performance in terms of the transmit power compared to the existing counterparts, such as the non-feedback scheme and one-way optimization scheme.

Alejandro Lancho (Massachusetts Institute of Technology, United States)
Alexander Fengler (Massachusetts Institute of Technology, United States)
Yury Polyanskiy (Massachusetts Institute of Technology, United States)
Finite-Blocklength Results for the A-channel: Applications to Unsourced Random Access and Group Testing

ABSTRACT. We present finite-blocklength achievability bounds for the unsourced A-channel. In this multiple-access channel, users noiselessly transmit codewords picked from a common codebook with entries generated from a q-ary alphabet. At each channel use, the receiver observes the set of different transmitted symbols but not their multiplicity. We show that the A-channel finds applications in unsourced random-access (URA) and group testing. Leveraging the insights provided by the finite-blocklength bounds and the connection between URA and non-adaptive group testing through the A-channel, we propose improved decoding methods for state-of-the-art A-channel codes and we showcase how A-channel codes provide a new class of structured group testing matrices. The developed bounds allow to evaluate the achievable error probabilities of group testing matrices based on random A-channel codes for arbitrary numbers of tests, items and defectives. We show that such a construction asymptotically achieves the optimal number of tests. In addition, every efficiently decodable A-channel code can be used to construct a group testing matrix with sub-linear recovery time.

Francisco Pernice (Stanford University, United States)
On the Symmetries of the Deletion Channel

ABSTRACT. In this short note, we consider a class of symmetry groups associated to communication channels, which can informally be viewed as the transformations of the set of inputs that “commute” with the action of the channel. These groups were first studied by Polyanskiy in (IEEEToIT 2012). We show the simple result that the input distribution that attains the maximum mutual information for a given channel is a “fixed point” of its group. We conjecture (and give empirical evidence) that the channel group of the deletion channel is extremely small (it contains a number of elements constant in the blocklength). We prove a special case of this conjecture. This serves as some formal justification for why the analysis of the binary deletion channel has proved much more difficult than its memoryless counterparts.

13:30-15:10 Session WeC1: Networks, Learning, and Algorithms II

Organizers: Bruce Hajek and R. Srikant

Assaf Zeevi (Columbia University, United States)
Location: Library
Andrew Wagenmaker (University of Washington, United States)
Max Simchowitz (UC Berkeley, United States)
Kevin Jamieson (University of Washington, United States)
Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

ABSTRACT. The theory of reinforcement learning has focused on two fundamental problems: achieving low regret, and identifying -optimal policies. While a simple reduction allows one to apply a low-regret algorithm to obtain an -optimal policy and achieve the worst-case optimal rate, it is unknown whether low-regret algorithms can obtain the instance-optimal rate for policy identification. We show that this is not possible—there exists a fundamental tradeoff between achieving low regret and identifying an -optimal policy at the instance-optimal rate. Motivated by our negative finding, we propose a new measure of instance-dependent sample complexity for PAC tabular reinforcement learning which explicitly accounts for the attainable state visitation distributions in the underlying MDP. We then propose and analyze a novel, planning-based algorithm which attains this sample complexity—yielding a complexity which scales with the suboptimality gaps and the “reachability” of a state. We show that our algorithm is nearly minimax optimal, and on several examples that our instance-dependent sample complexity offers significant improvements over worst-case bounds.

Michael J. Neely (University of Southern California, United States)
Repeated Games, Optimal Channel Capture, and Open Problems for Slotted Multiple Access

ABSTRACT. This paper revisits a classical problem of slotted multiple access with success, idle, and collision events on each slot. First, results of a 2-user multiple access game are reported. The game was conducted at the University of Southern California over multiple semesters and involved competitions between student-designed algorithms. An algorithm called 4-State was a consistent winner. This algorithm is analyzed and shown to have an optimal expected score when competing against an independent version of itself. The structure of 4-State motivates exploration of the open question of how to minimize the expected time to capture the channel for a $n$-user situation. It is assumed that the system delivers perfect feedback on the number of users who transmitted at the end of each slot. An efficient algorithm is developed and conjectured to have an optimal expected capture time for all positive integers $n$. Optimality is proven in the special cases $n \in \{1, 2, 3, 4, 6\}$ using a novel analytical technique that introduces virtual users with enhanced capabilities.

Siddhartha Banerjee (Cornell University, United States)
The Multisecretary Problem: Regret, Approximation and Learning

ABSTRACT. The multisecretary problem --- where a principal can hire up to B out of T candidates, with i.i.d. values drawn from some known distribution F --- has long served as an important baseline for understanding online decision-making problems. Recent works have uncovered the regret landscape (i.e., the additive gap between any online algorithm and the optimal hindsight policy) of this problem across a variety of different distribution classes. We provide a simple framework that unites all these results, and moreover, shows that the guarantees are essentially preserved even when F is unknown, but in some well-behaved class of distributions. Our approach illuminates the relation between regret, measure concentration, and the geometry of the multisecretary problem, and we discuss how these ideas can translate to more complex settings.

Vivek Farias (MIT, United States)
Markovian Interference in Experiments

ABSTRACT. We consider experiments in dynamical systems where interventions on some experimental units impact other units through a limiting constraint (such as a limited inventory). Despite outsize practical importance, the best estimators for this `Markovian' interference problem are largely heuristic in nature, and their bias is not well understood. We formalize the problem of inference in such experiments as one of policy evaluation. Off-policy estimators, while unbiased, apparently incur a large penalty in variance relative to state-of-the-art heuristics. We introduce an on-policy estimator: the Differences-In-Q's (DQ) estimator. We show that the DQ estimator can in general have exponentially smaller variance than off-policy evaluation. At the same time, its bias is second order in the impact of the intervention. This yields a striking bias-variance tradeoff so that the DQ estimator effectively dominates state-of-the-art alternatives. From a theoretical perspective, we introduce three separate novel techniques that are of independent interest in the theory of Reinforcement Learning (RL). Our empirical evaluation includes a set of experiments on a city-scale ride-hailing simulator. Joint work with Andrew Li (CMU), Tianyi Peng (MIT) and Andy Zheng (MIT).

13:30-15:10 Session WeC2: Sequential Methods II

Organizers: G. Fellouris and V. Veeravalli

Location: Solarium
Arpan Mukherjee (RPI, United States)
Ali Tajer (RPI, United States)
Efficient SPRT-based Best Arm Identification in Stochastic Bandits

ABSTRACT. This paper investigates the problem of best arm identification (BAI) in stochastic multi-armed bandits in the fixed confidence setting. A novel formulation based on sequential hypothesis testing is provided, and an algorithm for BAI is proposed that, in spirit, follows the structure of the canonical sequential probability ratio test (SPRT). The algorithm has three features: (1) its sample complexity is asymptotically optimal, (2) it is guaranteed to be $\delta$-PAC, and (3) it addresses the computational challenge of the state-of-the-art approaches. Specifically, the existing approaches rely on Thompson sampling for dynamically identifying the best arm and a challenger. This paper shows that identifying the challenger can be computationally expensive and demonstrates that the SPRT-based approach addresses that computational weakness.

Yuyang Shi (Georgia Institute of Technology, United States)
Yajun Mei (Georgia Institute of Technology, United States)
Efficient sequential UCB-based Hungarian algorithm for assignment problems

ABSTRACT. The assignment problem has many real-world applications such as allocations of agents and tasks for optimal utility gain. While it has been well-studied in the optimization literature when the underlying utility between every pair of agent and task is known, research is limited when the utilities are unknown and need to be learned from data on the fly. In this work, motivated by the mentor-mentee matching application in U.S. universities, we develop an efficient sequential assignment algorithm, with the objective of nearly maximizing the overall utility simultaneously for each time. Our proposed algorithm is to use stochastic binary bandit feedback to estimate the unknown utilities through the logistic regression, and then to combine the Upper Confidence Bound (UCB) algorithm in the multi-armed bandit problem with the Hungarian algorithm in the assignment problem. We derive the theoretical bounds of our algorithm for both the estimation error and the total regret, and numerical studies are also conducted to illustrate the usefulness of our algorithm.

Mehmet Necip Kurt (Columbia University, United States)
Yasin Yilmaz (University of South Florida, United States)
Xiaodong Wang (Columbia University, United States)
Pieter J. Mosterman (MathWorks, United States)
Data-Driven Sequential Change Detection in Privacy-Sensitive Networks
PRESENTER: Yasin Yilmaz

ABSTRACT. We study privacy-preserving sequential change detection in a setting where the data are distributed over a network and locally sensitive to each node, and probabilistic data model is unknown. We design and analyze a data-driven solution scheme where each node observes a high-dimensional data stream for which it computes a local outlierness score, perturbs, encrypts, and sends it to a network operator. The network operator then decrypts an aggregate statistic over the network and performs online network change detection via the proposed generalized cumulative sum (CUSUM) algorithm. We derive an asymptotic lower bound and an asymptotic approximation for the average false alarm period of the proposed algorithm. Additionally, we derive an asymptotic upper bound and asymptotic approximation for the average detection delay of the proposed algorithm under a certain anomaly. We show the analytical tradeoff between the change detection performance and the differential privacy level, controlled via the local perturbation noise. Experiments illustrate that the proposed algorithm offers a good tradeoff between privacy and quick change detection against the UDP flooding and spam attacks in a real Internet of Things (IoT) network.

Hadar Szostak (Ben-Gurion University of the Negev, Israel)
Kobi Cohen (Ben-gurion University Of The Negev, Israel)
Decentralized Anomaly Detection via Deep Multi-Agent Reinforcement Learning

ABSTRACT. We consider a decentralized anomaly detection problem, where multiple agents collaborate to localize a single anomalous process among a finite number M of processes. At each time, a subset of the processes can be observed by each agent, and the observations from each chosen process follow two different distributions, depending on whether the process is normal or abnormal. The communication channel between agents is rate-limited. The objective is a sequential search strategy that minimizes the Bayes risk, consisting of the sampling cost, and the joint terminal cost among the agents. This problem generalizes previous studies that considered anomaly detection by a single detector. We develop a novel algorithm based on deep multi-agent reinforcement learning optimization to minimize the Bayes risk. Numerical experiments demonstrate the ability of the algorithm to learn good policies in this challenging problem, and improve the single-agent performance by applying the proposed multi-agent collaborative learning method.

13:30-15:10 Session WeC3: Representation, Learning, and Inference
Location: Butternut
Pradeep Kr. Banerjee (Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany)
Kedar Karhadkar (UCLA, United States)
Yu Guang Wang (Shanghai Jiao Tong University, China)
Uri Alon (CMU, United States)
Guido Montufar (UCLA and MPI MiS, United States)
Oversquashing in GNNs through the lens of information contraction and graph expansion
PRESENTER: Kedar Karhadkar

ABSTRACT. The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called ``oversquashing''. We present a framework for analyzing oversquashing based on information contraction. Our analysis is guided by a model of reliable computation due to von Neumann that lends a new insight into oversquashing as signal quenching in noisy computation graphs. Building on this, we propose a graph rewiring algorithm aimed at alleviating oversquashing. Our algorithm employs a random local edge flip primitive motivated by an expander graph construction. We compare the spectral expansion properties of our algorithm with that of an existing curvature-based non-local rewiring strategy. Synthetic experiments show that while our algorithm in general has a slower rate of expansion, it is overall computationally cheaper, preserves the node degrees exactly and never disconnects the graph.

Ben Kizaric (University of Wisconsin-Madison, United States)
Daniel Pimentel-Alarcon (University of Wisconsin-Madison, United States)
Classifying Incomplete Data with a Mixture of Subspace Experts

ABSTRACT. Performing inference on data with missing values is a highly common yet challenging task in machine learning, and is often addressed by imputing missing values before performing inference on the filled dataset. However, the two parts of this paradigm may be at odds with one another in that optimal imputation may not yield data that is optimal for inference. In this paper, we propose an ensemble approach to classification with missing values where base learners are trained on embeddings of the data, not on imputed versions of the data. To accomplish this, MoSEs first uses a novel subspace clustering algorithm which characterizes an input space as a hierarchy of localized affine subspaces, each of which produces a low-dimensional embedding of samples with missing data. Then, an expert for each subspace is trained on the associated embedding, with final predictions being a weighted sum of all expert predictions. The degree to which experts are localized to a subspace of data, as well as the degree to which final predictions are influenced by the nearest subspaces are controlled by a pair of tunable hyper-parameters. We perform experiments demonstrating the state-of-the-art performance of this method across a variety of reference datasets.

Xiangxiang Xu (MIT, United States)
Lizhong Zheng (MIT, United States)
Multivariate Feature Extraction
PRESENTER: Xiangxiang Xu

ABSTRACT. We propose a general framework to extract features for learning problems involving multiple variables. First, we decompose multivariate dependence into different components to obtain the piece relevant to the learning task. Then, we establish a modal decomposition approach to represent the component as informative features. We further demonstrate the algorithm design for extracting such features from real data, which can incorporate and utilize existing deep feature extractors. We also present an application of our framework in learning tasks with side information.

Xinyi Tong (Tsinghua University, China)
Xiangxiang Xu (MIT, United States)
Shao-Lun Huang (Tsinghua-Berkeley Shenzhen Institute, China)
On Sample Complexity of Learning Shared Representations: The Asymptotic Regime
PRESENTER: Xiangxiang Xu

ABSTRACT. Federated learning and multi-task learning have achieved great successes in recent works, which exploit the knowledge contained in different tasks to alleviate the influence of lacking training samples. In order to understand the knowledge utilization mechanism, we study the sample complexity of the federated learning algorithms by computing their expected population risks, where the asymptotic regime is considered that the number of tasks is large. Specifically, we focus on several typical algorithms, including (i) training personalized models merely by local data, (ii) jointly training a single model by all the data of devices, and (iii) jointly training the shared feature representation and learning individual classifiers of different devices, where the latter one is recently widely-adopted. The results reveal that the validity of the shared representation algorithm is affected by the task similarities, the model dimensionality, and the sample size, which quantify the sample size range where such algorithm is preferable. Moreover, we develop a new algorithm in consideration of collaborating the classifiers of different devices under the shared representations, which can have lower sample complexity than conventional methods under the asymptotic regime of large sample size. This provides theoretical guidance for practical algorithm designs.

Diego Benalcazar (University of Central Florida, United States)
Chinwendu Enyioha (University of Central Florida, United States)
Distributed Consensus-based Kalman Filter under Limited Communication

ABSTRACT. In this work, we consider a distributed estimation problem in a communication-constrained environment. To address the limited communication challenge, we present a fully distributed Kalman filtering algorithm in which each agent shares a compressed version of its estimated state information with its neighboring nodes. In the proposed algorithm, we explicitly compute the estimation error covariances of each node in a distributed manner based on the consensus filter using the compressed estimates. An intuitive finding is that for a specific mid-tread quantization function, compared with the uncompressed distributed Kalman consensus filter, the state estimates obtained with the quantized Kalman consensus filter are significantly similar; however, the estimation error covariances are noticeably different. We validate the theoretical results using simulations.

13:30-15:10 Session WeC4: Learning II
Location: Pine
Yigit Ege Bayiz (University of Texas at Austin, United States)
Ufuk Topcu (University of Texas at Austin, United States)
Decentralized Online Influence Maximization

ABSTRACT. We consider the problem of finding the maximally influential node in random networks where each node influences every other node with constant yet unknown probability. We develop an online algorithm that learns the relative influences of the nodes. It relaxes the assumption in the existing literature that a central observer can monitor the influence spread globally. The proposed algorithm delegates the online updates to the nodes on the network; hence requires only local observations at the nodes. We show that using an explore-then-commit learning strategy, the cumulative regret accumulated by the algorithm over horizon $T$ approaches $O(T^{2/3})$ for a network with a large number of nodes. Additionally, we show that, for fixed $T$, the worst case-regret grows linearly with the number $n$ of nodes in the graph. Numerical experiments illustrate this linear dependence for Chung-Lu models. The experiments also demonstrate that $\varepsilon$-greedy learning strategies can achieve similar performance to the explore-then-commit strategy on Chung-Lu models.

Yili Zhang (University of Michigan, United States)
Asaf Cohen (University of Michigan, United States)
Vijay Subramanian (University of Michigan, United States)
Learning-based Optimal Admission Control in a Single Server Queuing System

ABSTRACT. We consider a social welfare maximizing admission control problem in an M/M/1 queuing system with a known arrival rate but an unknown service rate. With a fixed reward collected upon service completion and a cost per unit of time enforced on customers waiting in the queue, a dispatcher decides upon arrivals whether to admit the arriving customer or not based on the full history observation of the queue-length of the system. Naor [1] showed that if all the parameters of the model are known, then it is optimal to use a threshold policy – admit if the queue-length is less than the threshold and otherwise not. We propose a learning-based dispatch algorithm and characterize its regret with respect to the full information model from [1]. We show that the algorithm achieves an O(1) regret when the optimal threshold with full information is non-zero, and achieves a O(ln2(N )) regret in the case the optimal threshold with full information is 0 (i.e., the optimal policy is to reject all arrivals), where N is the number of arrivals.

Sajjad Bahrami (University of California, Riverside, United States)
Ertem Tuncel (University of California, Riverside, United States)
Trimmed Minimum Error Entropy for Robust Online Regression

ABSTRACT. In this paper, online linear regression in environments corrupted by non-Gaussian noise is addressed. In such environments, the error between the system output and the label also does not follow a Gaussian distribution and there might exist abnormally large error samples (or outliers) which mislead the learning process. The main challenge is how to keep the supervised learning problem least affected by these unwanted and misleading outliers. In recent years, an information theoretic algorithm based on Renyi's entropy, called minimum error entropy (MEE), has been employed to take on this issue. However, this minimization might not result in a desired estimator inasmuch as entropy is shift-invariant, i.e., by minimizing the error entropy, error samples may not be necessarily concentrated around zero. In this paper, a quantization technique is utilized by which not only aforementioned need of setting errors around the origin in MEE is addressed, but also major outliers are rejected from MEE-based learning and MEE performance is improved from convergence rate and steady state misalignment points of view.

Yulan Zhang (Yale University, United States)
Anna Gilbert (Yale University, United States)
Stefan Steinerberger (University of Washington, United States)
May the force be with you

ABSTRACT. Modern methods in dimensionality reduction are dominated by nonlinear attraction-repulsion force-based methods (this includes t-SNE, UMAP, ForceAtlas2, LargeVis, and many more). The purpose of this paper is to demonstrate that all such methods, by design, come with an additional feature that is being automatically computed along the way, namely the vector field associated with these forces. We show how this vector field gives additional high-quality information and propose a general refinement strategy based on ideas from Morse theory. The efficiency of these ideas is illustrated specifically using t-SNE on synthetic and real-life data sets.

Ivan Perez Avellaneda (University of Vermont, United States)
Luis Duffaut Espinosa (University of Vermont, United States)
Reachability of Chen-Fliess series: A Gradient Descent Approach

ABSTRACT. Common procedures for maintaining systems in a safe region of operation are based on the computation of reachable sets. The objective is to find the tightest set containing all outputs of a system with respect to a set of inputs. Most methods for computing reachable sets require the knowledge of a state-space model, however, input-output approaches are more appealing due to model complexities and uncertainties. Additionally, these days abundance of data gives support to the development of data-driven approaches. In this paper, reachable sets are computed for of nonlinear systems that can be described as Chen-Fliess series. This approach relies on representing systems as weighted sums of iterated integrals studied using noncommutative formal powers series. Specifically, a gradient descent algorithm based on the Chen-Fliess formalism is proposed for computing output reachable sets. This algorithm requires extending the notion of G\^ateaux derivatives to this domain together with conditions for its existence. The gradient descent algorithm based on this derivative is then shown to be well-posed. Illustrative examples are provided in last section. These are compared against reachable sets computed using the mixed-monotonicity methodology.

15:30-17:10 Session WeD1: Networks, Learning, and Algorithms III

Organizers: Bruce Hajek and R. Srikant

Richard Sowers (University of Illinois at Urbana-Champaign, United States)
Location: Library
Rohan Deb (Indian Institute of Science, India)
Meet Gandhi (Indian Institute of Science, India)
Shalabh Bhatnagar (Indian Institute of Science, India)
Schedule Based Temporal Difference Algorithms

ABSTRACT. Learning the value function of a given policy from data samples is an important problem in Reinforcement Learning. TD(λ) is a popular class of algorithms to solve this problem. However, the weights assigned to different n-step returns in TD(λ), controlled by the parameter λ, decrease exponentially with increasing n. In this paper, we present a λ-schedule procedure that generalizes the TD(λ) algorithm to the case when the parameter λ could vary with time-step. This allows flexibility in weight assignment, i.e., the user can specify the weights assigned to different n-step returns by choosing a sequence {λt}t≥1. Based on this procedure, we propose an on-policy algorithm – TD(λ)- schedule, and two off-policy algorithms – GTD(λ)-schedule and TDC(λ)-schedule, respectively. We provide proofs of almost sure convergence for all three algorithms under a general Markov noise framework.

Harsh Dolhare (Indian Institute of Technology Bombay, India)
Vivek Borkar (Indian Institute of Technology Bombay, India)
A concentration bound for distributed stochastic approximation

ABSTRACT. We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with concensus. The main result is an analysis of this scheme using the `ODE' (for `Ordinary Differential Equations') approach to stochastic approximation, leading to a high probability bound for the tracking error between suitably interpolated iterates and the limiting differential equation. Several future directions will also be highlighted.

Lili Su (Northeastern University, United States)
Jiaming Xu (Duke University, United States)
Pengkun Yang (Tsinghua University, China)
Global Convergence of Federated Learning for Mixed Regression

ABSTRACT. This paper studies the problem of model training under Federated Learning when clients exhibit cluster structure. We contextualize this problem in mixed regression, where each client has limited local data generated from one of $k$ unknown regression models. We design an algorithm that achieves global convergence from any initialization, and works even when local data volume is highly unbalanced -- there could exist clients that contain $O(1)$ data points only. Our algorithm first runs moment descent on a few anchor clients (each with $\tilde{\Omega}(k)$ data points) to obtain coarse model estimates. Then each client alternately estimates its cluster labels and refines the model estimates based on FedAvg or FedProx. A key innovation in our analysis is a uniform estimate on the clustering errors, which we prove by bounding the VC dimension of general polynomial concept classes based on the theory of algebraic geometry.

Zixian Yang (University of Michigan, Ann Arbor, United States)
Xin Liu (ShanghaiTech University, China)
Lei Ying (The University of Michigan, Ann Arbor, United States)
Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment

ABSTRACT. Multi-armed bandit (MAB) is a classic model for understanding the exploration-exploitation trade-off. The traditional MAB model for recommendation systems assumes the user stays in the system for the entire learning horizon. In new online education platforms such as ALEKS or new video recommendation systems such as TikTok and YouTube Shorts, the amount of time a user spends on the app depends on how engaging the recommended contents are. Users may temporarily leave the system if the recommended items cannot engage the users. To understand the exploration, exploitation, and engagement in these systems, we propose a new model, called MAB-A where "A" stands for abandonment and the abandonment probability depends on the current recommended item and the user's past experience (called state). We propose two algorithms, ULCB and KL-ULCB, both of which do more exploration (being optimistic) when the user likes the previous recommended item and less exploration (being pessimistic) when the user does not like the previous item. We prove that both ULCB and KL-ULCB achieve logarithmic regret. Furthermore, the regret bound under KL-ULCB is asymptotically sharp.

Liam Collins (The University of Texas at Austin, United States)
Aryan Mokhtari (The University of Texas at Austin, United States)
Sewoong Oh (University of Washington at Seattle, United States)
Sanjay Shakkottai (The University of Texas at Austin, United States)
On representation learning with Model-Agnostic Meta-Learning

ABSTRACT. We consider a linear multi-task learning setting, where each task is a noisy linear regression with the optimal solution lying in a shared k-dimensional subspace. In this setting, we show that Model-Agnostic Meta-Learning (MAML) is capable of representation learning and recovers the ground-truth subspace.

15:30-17:10 Session WeD2: Robust Inference and Learning

Organizer:  V. Veeravalli

Location: Solarium
Akram Awad (University of Central Florida, United States)
George Atia (University of Central Florida, United States)
A distributionally robust approach to domain adaptation

ABSTRACT. Domain Adaptation (DA) methods, which adapt a learning model across domains with mismatched distributions, often fall short of generalizing well to unseen examples from the target domain. In this work, we develop and analyze an approach to DA that leverages distributionally robust optimization to learn robust, generalizable models. The enhanced out-of-sample performance of the proposed approach is further demonstrated experimentally.

Yue Wang (University at Buffalo, the State University of New York, United States)
Shaofeng Zou (University at Buffalo, the State University of New York, United States)
Policy Gradient Method For Robust Reinforcement Learning

ABSTRACT. This paper develops the first policy gradient method with global optimality guarantee and complexity analysis for robust reinforcement learning under model mismatch. Robust reinforcement learning is to learn a policy robust to model mismatch between simulator and real environment. We first develop the robust policy (sub-)gradient, which is applicable for any differentiable parametric policy class. We show that the proposed robust policy gradient method converges to the global optimum asymptotically under direct policy parameterization. We further develop a smoothed robust policy gradient method, and show that to achieve an $\epsilon$-global optimum, the complexity is $\mathcal O(\epsilon^{-3})$. We then extend our methodology to the general model-free setting, and design the robust actor-critic method with differentiable parametric policy class and value function. We further characterize its asymptotic convergence and sample complexity under the tabular setting. Finally, we provide simulation results to demonstrate the robustness of our methods.

Maohao Shen (MIT, United States)
Yuheng Bu (MIT, United States)
Prasanna Sattigeri (IBM Research, United States)
Soumya Ghosh (IBM Research, United States)
Subhro Das (IBM Research, United States)
Gregory Wornell (MIT, United States)
Post-hoc Uncertainty Learning using a Dirichlet Meta-Model

ABSTRACT. It is known that neural networks have the problem of being over-confident when directly using the output label distribution to generate uncertainty measures. Existing methods mainly resolve this issue by retraining the entire model to impose the uncertainty quantification capability so that the learned model can achieve desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive, and a trade-off might exist between prediction accuracy and uncertainty quantification. To this end, we consider a more practical post-hoc uncertainty learning setting, where a well-trained base model is given, and we focus on the uncertainty quantification task at the second stage of training. We propose a novel Bayesian uncertainty learning approach using the Dirichlet meta-model, which is effective and computationally efficient. Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties and easily adapt to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. Finally, we demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications over multiple representative image classification benchmarks.

Vikram Krishnamurthy (Cornell University, United States)
Luke Snow (Cornell University, United States)
Brian Sadler (Army Research Laboratory, United States)
Quickest Detection with Anticipatory and Quantum Decision Makers

ABSTRACT. We consider quickest change detection involving a human decision maker and a sequential detector in two frameworks. In the first framework, the human is modeled as an anticipatory agent that makes decisions by taking into account the probability of future decisions. In the second framework, the human is modeled as a quantum decision maker; this captures order effects and violation of the total probability rule that have been empirically observed in human decision making. The anticipatory models are time inconsistent meaning that Bellman’s principle of optimality does not hold. The appropriate formalism is the subgame Nash equilibrium. From a quantum decision point of view, the human psychological state evolves according to the Linbladian equation. We show that the interaction between an anticipatory/quantum decision maker and sequential detector results in an unusual (nonconvex) structure of the quickest change detection policy. Our methodology yields a useful framework for human decision makers interacting with sequential detectors.

Xinyi Ni (University of California, Davis, United States)
Lifeng Lai (University of California, Davis, United States)
Policy Gradient based Entropic-VaR Optimization in Risk-Sensitive Reinforcement Learning

ABSTRACT. In sequential decision making, a widely adopted optimization framework is the risk-neutral criterion, whose objective function is given by the expectation of the total discounted cost under a Markov decision process (MDP). However, this approach does not take either inherent-uncertainty or model-uncertainty of the MDP into consideration, which may result in events with small probability but severe consequence. In this paper, we apply a risk measure named entropic value-at-risk (EVaR) to develop robust reinforcement learning (RL) algorithms under the MDP framework. We first derive the policy gradients of the EVaR based objective function and then propose a trajectory-based policy gradient method to estimate gradients as well as update the policy until it converges to the local optimal policy. In addition, we prove the convergence of the proposed algorithm by adopting the stochastic approximation approach. Lastly, numerical results are provided to illustrate the proposed algorithm.

15:30-17:10 Session WeD3: Estimation, Prediction, and Control
Location: Butternut
Yunhan Huang (New York University, United States)
Tao Zhang (New York University, United States)
Quanyan Zhu (New York University, United States)
The Inverse Problem of Linear-Quadratic Differential Games: When is a Control Strategies Profile Nash?

ABSTRACT. This paper aims to formulate and study the inverse problem of non-cooperative linear quadratic games: Given a profile of control strategies, find cost parameters for which this profile of control strategies is Nash. We formulate the problem as a leader-followers problem, where a leader aims to implant a desired profile of control strategies among selfish players. In this paper, we leverage frequency-domain techniques to develop a necessary and sufficient condition on the existence of cost parameters for a given profile of stabilizing control strategies to be Nash under a given linear system. The necessary and sufficient condition includes the circle criterion for each player and a rank condition related to the transfer function of each player. The condition provides an analytical method to check the existence of such cost parameters, while previous studies need to solve a convex feasibility problem numerically to answer the same question. We develop an identity in frequency-domain representation to characterize the cost parameters, which we refer to as the Kalman equation. The Kalman equation reduces redundancy in the time-domain analysis that involves solving a convex feasibility problem. Using the Kalman equation, we also show the leader can enforce the same Nash profile by applying penalties on the shared state instead of penalizing the player for other players' actions to avoid the impression of unfairness.

Po-Han Li (The University of Texas at Austin, United States)
Ufuk Topcu (The University of Texas at Austin, United States)
Sandeep Chinchali (The University of Texas at Austin, United States)
Adversarial Examples for Model-Based Control: A Sensitivity Analysis

ABSTRACT. We propose a method to attack controllers that rely on external timeseries forecasts as task parameters. An adversary can manipulate the costs, states, and actions of the controllers by forging the timeseries, in this case perturbing the real timeseries. Since the controllers often encode safety requirements or energy limits in their costs and constraints, we refer to such manipulation as an adversarial attack. We show that different attacks on model-based controllers can increase control costs, activate constraints, or even make the control optimization problem infeasible. We use the linear quadratic regulator and convex model predictive controllers as examples of how adversarial attacks succeed and demonstrate the impact of adversarial attacks on a battery storage control task for power grid operators. As a result, our method increases control cost by 8500% and energy constraints by 13% on real electricity demand timeseries.

W. Steven Gray (Old Dominion University, United States)
System Identification Entropy for Chen-Fliess Series and Their Interconnections

ABSTRACT. The notion of system identification entropy is defined for the class of analytic nonlinear input-output systems having Chen-Fliess series representations. This quantity describes the growth in the number of bits needed to specify the series coefficients in order to represent the input-output map to a desired accuracy. Upper bounds are then computed in terms of the growth rate of the series coefficients and the entropy of the generating series, which is an entirely distinct entropy concept. Once the results for the system identification entropy of a single system are established, the focus turns to interconnected systems, specifically, parallel, series, and feedback interconnected systems. Examples include estimating upper bounds for the system identification entropies of parallel interconnected rational systems and cascaded Wiener systems.

Pouria Tooranjipour (Michigan State University (East Lansing, MI, US), United States)
Bahare Kiumarsi (Michigan State University (East Lansing, MI, US), United States)
Hamidreza Modares (Michigan State University (East Lansing, MI, US), United States)
Risk-Aware Safe Optimal Control of Uncertain Linear Systems

ABSTRACT. This paper synthesizes a risk-aware safe optimal controller for partially unknown linear systems under additive Gaussian noises. The risk is assessed through the concept of Conditional Value-at-Risk (CVaR) to account for the extreme low probability events that occurred in a one-step cost function without being overly conservative. The safety of the CVaR optimization solution is also guaranteed with high probability by imposing a chance constraint. A state-feedback risk-aware controller is first obtained that provides an upper bound to the formulated safe CVaR optimization problem. Then, an online data-driven quadratic programming (QP) optimization problem is devised to simultaneously and safely learn the unknown dynamics and control the system with high probability. As more measurements are collected, the safety constraint is tightened due to increasing the confidence in estimating the dynamic model. In the end, a numerical example is given to elucidate the efficacy of the proposed method.

Yanjun Han (Massachusetts Institute of Technology, United States)
Jiantao Jiao (University of California, Berkeley, United States)
Nived Rajaraman (University of California, Berkeley, United States)
Kannan Ramchandran (University of California, Berkeley, United States)
Beyond UCB: statistical complexity and optimal algorithm for non-linear ridge bandits

ABSTRACT. Many existing literature on bandits and reinforcement learning assume a linear reward/value function, but what happens if the reward is non-linear? Two curious phenomena arise for non-linear bandits: first, in addition to the "learning phase" with a standard \Theta(\sqrt(T)) regret, there is an "initialization phase" with a fixed cost determined by the reward function; second, achieving the smallest cost of the initialization phase requires new learning algorithms other than traditional ones such as UCB. For a special family of non-linear bandits, we derive upper and lower bounds on the optimal fixed cost, and in addition, on the entire learning trajectory in the initialization phase via differential equations. In particular, we show that a two-stage algorithm which first finds a good initialization and then treats the problem as a locally linear bandit is statistically optimal, while several classical algorithms such as UCB are provably suboptimal.

This is based on a recent joint work with Jiantao Jiao, Nived Rajaraman, and Kannan Ramchandran.

15:30-17:10 Session WeD4: Queueing Systems
Location: Pine
Xinyu Wu (Massachusetts Institute of Technology, United States)
Dan Wu (Massachusetts Institute of Techology, United States)
Eytan Modiano (Massachusetts Institute of Technology, United States)
Queueing Delay Minimization in Overloaded Networks via Rate Control

ABSTRACT. We study the pattern of queueing delay minimization in overloaded networks using link rate control. We show that serving packets with maximum link rates does not minimize queueing delay when networks are on overload. To understand this counter-intuitive observation and identify the optimal solution, we characterize the packet queueing delay explicitly through a fluid queue model, which enables us to characterize the sufficient and necessary condition for a policy to be delay-optimal for single-hop networks. We show that setting the link rates to be proportional to the packet arrival rates minimizes both average and maximum delay among all sources simultaneously. We further prove that a queue-proportional rate control policy, agnostic of packet arrival rates, asymptotically achieves optimal delay performance given any initial state of queue backlog. We also demonstrate that our proposed policies are implementable in distributed manner to reduce communication overhead and computational cost. We evaluate our policies over single-hop networks with different network settings, and demonstrate its superiority compared to the policies that work well in networks not overloaded.

Adrian Redder (Department of Computer Science, Paderborn University, Germany)
Arunselvan Ramaswamy (Department of Computer Science, Karlstad University, Sweden)
Holger Karl (Hasso Plattner Institute, Potsdam University, Germany)
Age of Information Process under Strongly Mixing Communication -- Moment Bound, Mixing Rate and Strong Law
PRESENTER: Adrian Redder

ABSTRACT. The decentralized nature of multi-agent systems requires continuous data exchange to achieve global objectives. In such scenarios, Age of Information (AoI) has become an important metric of the freshness of exchanged data due to the error-proneness and delays of communication systems. Communication systems usually possess dependencies: the process describing the success or failure of communication is highly correlated when these attempts are ``close'' in some domain (e.g. in time, frequency, space or code as in wireless communication) and is, in general, non-stationary. To study AoI in such scenarios, we consider an abstract event-based AoI process $\Delta(n)$, expressing time since the last update: If, at time $n$, a monitoring node receives a status update from a source node (event $A(n-1)$ occurs), then $\Delta(n)$ is reset to one; otherwise, $\Delta(n)$ grows linearly in time. This AoI process can thus be viewed as a special random walk with resets. The event process $A(n)$ may be nonstationary and we merely assume that its temporal dependencies decay sufficiently, described by $\alpha$-mixing. We calculate moment bounds for the resulting AoI process as a function of the mixing rate of $A(n)$. Furthermore, we prove that the AoI process $\Delta(n)$ is itself $\alpha$-mixing from which we conclude a strong law of large numbers for $\Delta(n)$. These results are new, since AoI processes have not been studied so far in this general strongly mixing setting. This opens up future work on renewal processes with non-independent interarrival times.

Purbesh Mitra (University of Maryland, United States)
Sennur Ulukus (University of Maryland, United States)
ASUMAN: Age Sense Updating Multiple Access in Networks

ABSTRACT. We consider a fully-connected wireless gossip network which consists of a source and $n$ receiver nodes. The source updates itself with a Poisson process and also sends updates to the nodes as Poisson arrivals. Upon receiving the updates, the nodes update their knowledge about the source. The nodes gossip the data among themselves in the form of Poisson arrivals to disperse their knowledge about the source. The total gossiping rate is bounded by a constraint. The goal of the network is to be as timely as possible with the source. In this work, we propose ASUMAN, a distributed opportunistic gossiping scheme, where after each time the source updates itself, each node waits for a time proportional to its current age and broadcasts a signal to the other nodes of the network. This allows the nodes in the network which have higher age to remain silent and only the low-age nodes to gossip, thus utilizing a significant portion of the constrained total gossip rate. We calculate the average age for a typical node in such a network with symmetric settings and show that the theoretical upper bound on the age scales as $O(1)$. ASUMAN, with an average age of $O(1)$, offers significant gains compared to a system where the nodes just gossip blindly with a fixed update rate in which case the age scales as $O(\log n)$.

Bai Liu (Massachusetts Institute of Technology, United States)
Eytan Modiano (Massachusetts Institute of Technology, United States)
Universal Policy Tracking: Scheduling for Wireless Networks with Delayed State Observation

ABSTRACT. Numerous scheduling algorithms have been proposed to optimize various performance metrics like throughput, delay and utility in wireless networks. However, these algorithms often require instantaneous access to network state information, which is not always available. While network stability can sometimes be achieved with delayed state information, other performance metrics such as latency may degrade. Thus, instead of simply stabilizing the system, our goal is to design a framework that can mimic arbitrary scheduling algorithms with performance guarantees. A naive approach is to make decisions directly with delayed information, but we show that such methods may lead to poor performance. Instead, we propose the Universal Tracking (UT) algorithm that can mimic the actions of arbitrary scheduling algorithms under observation delay. We rigorously show that the performance gap between UT and the scheduling algorithm being tracked is bounded by constants. Our numerical experiments show that UT significantly outperforms the naive approach in various applications.

Tushar Shankar Walunj (IIT Bombay, India)
Shiksha Singhal (IIT Bombay, India)
Veeraruna Kavitha (IIT Bombay, India)
Jayakrishnan Nair (IIT Bombay, India)
Pricing, competition and market segmentation in ride hailing

ABSTRACT. We analyse a non-cooperative strategic game among two ride-hailing platforms, each of which is modeled as a two-sided queueing system, where drivers (with a certain patience level) are assumed to arrive according to a Poisson process at a fixed rate, while the arrival process of passengers is split across the two providers based on QoS considerations. We also consider two monopolistic scenarios: (i) each platform has half the market share, and (ii) the platforms merge into a single entity, serving the entire passenger base using their combined driver resources. The key novelty of our formulation is that the total market share is fixed across the platforms. The game thus captures the competition among the platforms over market share, which is modeled using two different Quality of Service (QoS) metrics: (i) probability of driver availability, and (ii) probability that an arriving passenger takes a ride. The objective of the platforms is to maximize the profit generated from matching drivers and passengers.

In each of the above settings, we analyse the equilibria associated with the game. Interestingly, under the second QoS metric, we show that for a certain range of parameters, no Nash equilibrium exists. Instead, we demonstrate a new solution concept called an equilibrium cycle. Our results highlight the interplay between competition, cooperation, passenger-side price sensitivity, and passenger/driver arrival rates.