ABSTRACT. Monotone dynamical systems, which preserve a partial ordering on the states, have a long history; their theory may provide a path to establishing convergence for large-scale control systems, distinct from the more standard methods of optimization or Lyapunov-based control. We show that a number of problems in network control are amenable to this approach: dynamic consensus, load balancing, queue scheduling, and peer-to-peer file sharing all admit monotone formulations in appropriate coordinates. From these instances we explore the generalizations and limits of the method.
Join-the-Shortest-Queue with Markov-Modulated Parameters: Poisson Equation and Transform Techniques
ABSTRACT. In parallel-server systems with a single stream of arrivals, Join-the-Shortest-Queue (JSQ) is one of the most popular routing algorithms due to its simplicity and strong performance properties: it is well known that routing with JSQ is throughput and heavy-traffic optimal, regardless of whether the servers are homogeneous. Moreover, JSQ achieves this optimality without any information about the servers' rates. In this work, we study a JSQ system with Markov-modulated arrival and service rates. Specifically, we provide sufficient conditions on the Markov-modulated arrival and service rates to ensure state space collapse, and compute the heavy-traffic distribution of queue lengths. We establish the distribution of queue lengths using a novel hybrid methodology that combines the transform method for queue-length analysis with the Poisson equation of the modulating Markov chain.
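The routing rule at the heart of the abstract is simple to state. The sketch below is a minimal discrete-event simulation of baseline JSQ with fixed (non-modulated) rates and heterogeneous servers; the arrival and service rates are our illustrative choices, and the paper's Markov-modulated analysis is not reproduced here.

```python
import random

def jsq_route(queue_lengths):
    """Join-the-Shortest-Queue: route the arrival to a queue of minimal
    length, breaking ties uniformly at random."""
    m = min(queue_lengths)
    return random.choice([i for i, q in enumerate(queue_lengths) if q == m])

def simulate(arrival_rate, service_rates, horizon, seed=0):
    """Discrete-event simulation of JSQ with exponential interarrival and
    service times and fixed (non-modulated) rates."""
    rng = random.Random(seed)
    n = len(service_rates)
    queues = [0] * n
    next_service = [float("inf")] * n
    t, area = 0.0, 0.0
    next_arrival = rng.expovariate(arrival_rate)
    while True:
        t_next = min(next_arrival, min(next_service), horizon)
        area += sum(queues) * (t_next - t)
        t = t_next
        if t >= horizon:
            break
        if t == next_arrival:
            i = jsq_route(queues)
            queues[i] += 1
            if queues[i] == 1:                 # server i was idle
                next_service[i] = t + rng.expovariate(service_rates[i])
            next_arrival = t + rng.expovariate(arrival_rate)
        else:
            i = next_service.index(t)          # departure from queue i
            queues[i] -= 1
            next_service[i] = (t + rng.expovariate(service_rates[i])
                               if queues[i] > 0 else float("inf"))
    return area / horizon                      # time-average total queue length

avg_q = simulate(arrival_rate=1.8, service_rates=[1.0, 1.5], horizon=10_000)
```

Note that the routing decision uses only queue lengths, not the service rates, which is exactly the informational advantage the abstract highlights.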
Sequential policies and the distribution of their total rewards in dynamic and stochastic knapsack problems
ABSTRACT. We study a dynamic and stochastic knapsack problem in which a decision maker is sequentially presented with n items with unit rewards and independent weights drawn from a known continuous distribution. The decision maker seeks to maximize the expected reward she collects by including items in a knapsack while satisfying a capacity constraint, and while making immediate and irrevocable decisions as soon as each item's weight is revealed. We propose a reoptimized heuristic and compare its total reward with that of the optimal dynamic programming policy. We show that the two total rewards have the same asymptotic mean, the same asymptotic variance, and the same limiting distribution. In contrast, we also note that other asymptotically optimal heuristics considered in the literature have larger higher moments and different limiting distributions.
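To illustrate the reoptimization idea, the sketch below re-solves a one-parameter acceptance threshold at every step. The Uniform(0,1) weight distribution and the particular threshold formula are our assumptions for illustration, not the paper's policy.

```python
import math
import random

def reoptimized_threshold(capacity, items_left):
    """Re-solved acceptance threshold for Uniform(0,1) weights: pick t so the
    expected weight collected from the remaining items matches the remaining
    capacity, items_left * t^2 / 2 = capacity (an illustrative choice)."""
    return min(1.0, math.sqrt(2.0 * capacity / items_left))

def run_policy(n, capacity, seed=0):
    """Accept an item iff it fits and its weight is below the re-solved threshold."""
    rng = random.Random(seed)
    reward, c = 0, capacity
    for k in range(n):
        w = rng.random()                        # item weight revealed
        t = reoptimized_threshold(c, n - k)     # reoptimize with current state
        if w <= min(t, c):                      # immediate, irrevocable decision
            reward += 1                         # unit reward per accepted item
            c -= w
    return reward

reward = run_policy(n=10_000, capacity=10.0)
```

Because the threshold is recomputed from the current state (remaining capacity and remaining items), the policy self-corrects after lucky or unlucky runs of weights, which is the mechanism behind the distributional equivalence claimed in the abstract.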
Target-Following Online Resource Allocation Using Proxy Assignments
ABSTRACT. We study a target-following variation of online resource allocation. As in classical resource allocation, the decision-maker must assign sequentially arriving jobs to one of multiple available resources. However, in addition to the assignment costs incurred from these decisions, the decision-maker is also penalized for deviating from exogenously given, nonstationary target allocations throughout the horizon. The goal is to minimize the total expected assignment and deviation penalty costs incurred throughout the horizon when the distribution of assignment costs is unknown. In contrast to traditional online resource allocation, in our setting the timing of allocation decisions is critical due to the nonstationarity of allocation targets. Examples of practical problems that fit this framework include many physical resource settings where capacity is time-varying, such as manual warehouse processes where staffing levels change over time, and assignment of packages to outbound trucks whose departure times are scheduled throughout the day. We first show that naive extensions of state-of-the-art algorithms for classical resource allocation problems can fail dramatically when applied to target-following resource allocation. We then propose a novel "proxy assignment" primal-dual algorithm for the target-following online resource allocation problem that uses current arrivals to simulate the effect of future arrivals. We prove that our algorithm incurs the optimal $O(\sqrt{T})$ regret bound when the assignment costs of the arriving jobs are drawn i.i.d. from a fixed distribution. We demonstrate the practical performance of our approach by conducting numerical experiments on synthetic datasets, as well as real-world datasets from retail fulfillment operations.
Mixing Times for Reversible Countable State Markov Chains: A case study of the Erlang-C queue
ABSTRACT. The last few years have seen rapid developments in the mathematical tools for studying mixing times of Markov chains. However, most of the focus has been on finite-state chains, while many engineering problems involve countable-state Markov chains; the most popular examples are queueing systems, which model resource allocation problems in networks, data centers, ride-hailing, etc. In this work, we focus on the Erlang-C system (also known as the M/M/n queue) and bound the chi-square distance between the finite-time queue length distribution and the stationary distribution for a finite number of servers. We then use these bounds to study the behavior in the many-server heavy-traffic asymptotic regimes. The Erlang-C model exhibits a phase transition at the so-called Halfin-Whitt regime. We show that our mixing rate matches the limiting behavior in the Super-Halfin-Whitt regime, and matches it up to a constant factor in the Sub-Halfin-Whitt regime.
We obtain these results using the Lyapunov-Poincaré approach, where we first carefully design a Lyapunov function to obtain a negative drift outside a finite set. Within the finite set, we develop different strategies depending on the properties of the finite set to get a handle on the mixing behavior via a local Poincaré inequality. A key aspect of our methodological contribution is in obtaining tight guarantees in these two regions, which when combined give us tight mixing time bounds. We believe that this approach is of independent interest for studying mixing in reversible countable-state Markov chains more generally, and will serve as a stepping stone towards understanding the transient behavior of more general queueing systems.
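To make the central quantity concrete, the following sketch computes the chi-square distance between the transient and stationary queue length distributions of a truncated M/M/n chain, with the transient law obtained by uniformization. The parameter values and truncation level are arbitrary illustrative choices; the sketch only exhibits numerically the general fact that divergence to stationarity is non-increasing in time.

```python
import numpy as np

def erlang_c_generator(lam, mu, n, cutoff):
    """Generator matrix of the M/M/n (Erlang-C) queue length chain, truncated."""
    Q = np.zeros((cutoff + 1, cutoff + 1))
    for i in range(cutoff):
        Q[i, i + 1] = lam                      # arrival
        Q[i + 1, i] = min(i + 1, n) * mu       # up to n busy servers
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q

def stationary(lam, mu, n, cutoff):
    """Stationary distribution via detailed balance of the birth-death chain."""
    w = np.ones(cutoff + 1)
    for i in range(cutoff):
        w[i + 1] = w[i] * lam / (min(i + 1, n) * mu)
    return w / w.sum()

def transient(Q, p0, t, lam_unif):
    """p0 exp(tQ) by uniformization at rate lam_unif (>= max exit rate)."""
    P = np.eye(Q.shape[0]) + Q / lam_unif
    K = int(lam_unif * t + 10.0 * np.sqrt(lam_unif * t)) + 20
    term, weight = p0.copy(), np.exp(-lam_unif * t)
    p = weight * term
    for k in range(1, K + 1):
        term = term @ P
        weight *= lam_unif * t / k
        p = p + weight * term
    return p

lam, mu, n, cutoff = 8.0, 1.0, 10, 120       # rho = 0.8, stable
Q = erlang_c_generator(lam, mu, n, cutoff)
pi = stationary(lam, mu, n, cutoff)
p0 = np.zeros(cutoff + 1)
p0[0] = 1.0                                   # start from an empty system
chi2 = lambda p: float(np.sum((p - pi) ** 2 / pi))
d1 = chi2(transient(Q, p0, 1.0, lam + n * mu))
d5 = chi2(transient(Q, p0, 5.0, lam + n * mu))
```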
Dual-Sensor Quickest Change Detection with Sampling Control
ABSTRACT. The problem of quickest change detection is studied where there are two streams of observations, namely, the X-stream and the Y-stream, with the Y-stream providing higher Kullback-Leibler divergence between the pre- and post-change distributions. The objective is to solve the detection problem under various constraints on sampling.
Specifically, the following two scenarios are investigated: 1) the decision-maker can access only one of the streams, under an additional constraint on the fraction of time the Y-stream is used, and 2) the decision-maker can access both streams, under an additional constraint on the fraction of time it can do so. Optimal detection algorithms are developed for both scenarios. The data-efficiency problem, in which the decision-maker may choose not to collect any observations, subject to an additional duty cycle constraint, is also investigated.
Bandit-Based Sequential Client Scheduling in Federated Learning
ABSTRACT. Federated learning (FL) enables distributed model training across multiple clients without sharing raw data, preserving privacy and reducing communication overhead. A key challenge in FL is sequentially scheduling clients for training and transmission to balance generalization performance with latency constraints. In this work, we address this challenge through a multi-armed bandit (MAB) framework that models client scheduling as a sequential decision-making problem. We propose a novel MAB approach for sequential client scheduling in FL, designed to minimize training latency without compromising model generalization, the ability to make accurate predictions on unseen data. Building on this framework, we develop a low-complexity algorithm that efficiently balances this trade-off. We prove that the algorithm achieves logarithmic regret over time, relative to an oracle with full knowledge of client latency means. Experimental results on both synthetic and real-world datasets demonstrate that our method consistently outperforms existing client selection strategies in terms of convergence speed and generalization performance.
Sequential Change Point Detection via Denoising Score Matching
ABSTRACT. Sequential change-point detection plays a critical role in numerous real-world applications, where timely identification of distributional shifts can greatly mitigate adverse outcomes. Classical methods commonly rely on parametric density assumptions on the pre- and post-change distributions, limiting their effectiveness for high-dimensional, complex data streams. This paper proposes a score-based CUSUM change-point detection procedure, in which the score functions of the data distribution are estimated by injecting noise and applying denoising score matching. We consider both offline and online versions of score estimation. Through theoretical analysis, we demonstrate that denoising score matching can enhance detection power by effectively controlling the injected noise scale. Finally, we validate the practical efficacy of our method through numerical experiments on two synthetic datasets and a real-world earthquake precursor detection task, demonstrating its effectiveness in challenging scenarios.
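The backbone of the proposed detector is the classical CUSUM recursion. The sketch below runs that recursion with an exact Gaussian log-likelihood-ratio increment as a stand-in for the score-based increments that the paper estimates via denoising score matching; the mean-shift setting and all parameter values are our illustrative assumptions.

```python
import random

def cusum_detect(stream, increment, threshold):
    """CUSUM recursion S_t = max(0, S_{t-1} + g(x_t)); alarm when S_t >= threshold."""
    s = 0.0
    for t, x in enumerate(stream, start=1):
        s = max(0.0, s + increment(x))
        if s >= threshold:
            return t                 # alarm time
    return None                      # no alarm within the stream

def llr(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """Exact Gaussian log-likelihood ratio for a mean shift mu0 -> mu1; a
    stand-in for the score-based increments estimated in the paper."""
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)

rng = random.Random(1)
change_point = 200
stream = ([rng.gauss(0.0, 1.0) for _ in range(change_point)] +
          [rng.gauss(1.0, 1.0) for _ in range(800)])
alarm = cusum_detect(stream, llr, threshold=10.0)
```

The score-matching contribution of the paper is precisely in replacing the closed-form `llr` above, which requires knowing both densities, with increments built from estimated score functions.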
Cluster-Aware Causal Mixer for Sequential Change Detection in Multivariate Time Series
ABSTRACT. Early and accurate detection of changes (e.g., anomalies) in time series data is critical, given the significant risks associated with false or missed detections. While MLP-based mixer models have shown promise in time series analysis, they lack a causality mechanism to preserve temporal dependencies inherent in the system. Moreover, real-world multivariate time series often contain numerous channels with diverse inter-channel correlations. A single embedding mechanism for all channels does not effectively capture these complex relationships. To address these challenges, we propose a novel cluster-aware causal mixer to effectively detect changes in multivariate time series. Our model groups channels into clusters based on their correlations, with each cluster processed through a dedicated embedding layer. In addition, we introduce a causal mixer in our model, which mixes the information while maintaining causality. Furthermore, we present a sequential change detection framework that accumulates the anomaly evidence over time to prevent false positives due to nominal outliers. Our proposed model operates in an online fashion, making it suitable for real-time time-series anomaly detection tasks. Experimental evaluations across six public benchmark datasets demonstrate that our model consistently achieves superior F1 scores.
Optimal Bounded-Influence Procedure for Robust Sequential Change-Point Detection
ABSTRACT. We study the problem of robust sequential change-point detection under Huber's gross error model. We incorporate ideas of the influence function from the classical offline robust statistics literature, and propose a new definition, the false alarm influence function, to quantify the robustness of general sequential change detection procedures. We then propose a general family of robust detection procedures and derive their false alarm influence functions. The result shows that they all have bounded influence functions, which implies that any single arbitrary outlier will have only a limited impact on their detection performance. Furthermore, we construct the optimal robust bounded-influence procedure in this general family, which has the smallest detection delay subject to the false alarm rate constraint and the false alarm influence function constraint. The optimal procedure turns out to be based on a truncated likelihood ratio statistic and has a simple form. Finally, we demonstrate the robustness and detection efficiency of the optimal robust bounded-influence procedure through extensive simulations.
ABSTRACT. In this paper, we present a novel hypothesis testing approach to model an attacker's process of gathering and processing information during the reconnaissance phase, in a nudge-theoretic framework where the defender uses nudges to lure the attacker towards a decoy server. Employing the developed framework, we characterize the optimal falsification nudging strategies that a defender can employ to blind the attacker, i.e., render the information gathered during reconnaissance ineffective. We further analyze the suggestion nudge, which steers the attacker towards the decoy server, thereby further degrading the attacker's decision-making capability. We also characterize the defender's optimal information-highlighting nudging strategy in response to the attacker's strategic, sequential reconnaissance behavior. Numerical results are presented to provide insights into the developed strategies.
On Subset Retrieval and Group Testing Problems with Differential Privacy Constraints
ABSTRACT. This paper focuses on the design and analysis of privacy-preserving techniques for group testing and infection status retrieval. Our work is motivated by the need to provide accurate information on the status of disease spread among a group of individuals while protecting the privacy of the infection status of any single individual involved. Practical scenarios, such as controlling the spread of infectious diseases, make this need concrete: individuals might be reluctant to participate in testing if their outcomes are not kept confidential.
The paper makes the following contributions. First, we introduce a differential privacy framework for the subset retrieval problem, which aims to securely share individuals' infection status with third parties, such as administrators and decision-makers. We characterize the trade-off between the accuracy of subset retrieval and the degree of privacy guaranteed to the individuals. In particular, we establish tight lower and upper bounds on the achievable level of accuracy subject to the differential privacy constraints. We then formulate the differential privacy framework for the noisy group testing problem in which noise is added either before or after the pooling process.
We establish a reduction between the private subset retrieval and noisy group testing problems and show that the converse and achievability schemes for subset retrieval carry over to differentially private group testing.
Approximately Optimal Toll Design for Efficiency and Equity in Arc-Based Traffic Assignment Models
ABSTRACT. Congestion pricing policies have emerged as promising traffic management tools to alleviate traffic congestion caused by travelers' selfish routing behaviors. The core principle behind deploying tolls is to impose monetary costs on frequently overcrowded routes, to incentivize self-interested travelers to select routes that are less prone to congestion. Recent literature has focused on toll design based on arc-based traffic assignment models (TAMs), which characterize commuters as traveling through a traffic network by successively selecting an outgoing arc from every intermediate node along their journey. However, existing tolling mechanisms predicated on arc-based TAMs often target the design of a single congestion-minimizing toll, ignoring crucial fairness considerations, such as the financial impact of high congestion fees on low-income travelers. To address these shortcomings, in this paper, we pose the dual considerations of efficiency and equity in traffic routing as bilevel optimization problems. Since such problems are in general computationally intractable to solve precisely, we construct a linear program approximation by introducing a polytope approximation for the set of all tolls that induce congestion-minimizing traffic flow patterns. Finally, we provide numerical results that validate our theoretical conclusions.
Efficient and Privacy-Preserving Binary Dot Product via Multi-Party Computation
ABSTRACT. Striking a balance between protecting data privacy and enabling collaborative computation is a critical challenge for distributed machine learning. While privacy-preserving techniques for federated learning have been extensively developed, methods for scenarios involving bitwise operations, such as tree-based vertical federated learning (VFL), are still under-explored. Traditional mechanisms, including Shamir's secret sharing and multi-party computation (MPC), are not optimized for bitwise operations over binary data, particularly in settings where each participant holds a different part of the binary vector. This paper addresses the limitations of existing methods by proposing a novel binary multi-party computation (BiMPC) framework. The BiMPC mechanism facilitates privacy-preserving bitwise operations, with a particular focus on dot product computations of binary vectors, ensuring the privacy of each individual bit. At the core of BiMPC is a novel approach called Dot Product Via Modular Addition (DoMA), which uses regular and modular additions for efficient binary dot product calculation. To provide privacy, BiMPC introduces two key protocols: Binary Input Secret Sharing (BISS), an information-theoretic secret sharing mechanism for linear computations of binary data, and Semantically Secure Secret Sharing (SemS3), which provides semantic security for non-linear operations of binary vectors. The privacy guarantees of the BiMPC framework are rigorously analyzed, demonstrating its efficiency and scalability in distributed settings.
ABSTRACT. We study an infinite-server queueing model that can be used to analyze the spread of a virus in a facility. We assume the spread of the virus follows the well-established Close Contact Time (CCT) model. We compare the economic reward of running the facility against the risk of contracting the virus, by explicitly modeling the probability of a susceptible individual leaving the facility uninfected when the dynamics form an $M/G/\infty$ queue. We obtain additional structural results for the special cases of $M/M/\infty$ and $M/D/\infty$ queues and compare the risk inflicted on a customer in these two facilities. We also examine operational considerations for the facility and qualitative methods to improve customer safety in $M/M/\infty$ queues.
A theoretical basis for model collapse in recursive training
ABSTRACT. It is known that recursive training from generative models can lead to the so-called `collapse' of the simulated probability distribution. This note shows that one in fact gets two different asymptotic behaviours, depending on whether an external source, however minor, is also contributing samples.
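A toy version of this dichotomy can be simulated directly: repeatedly refit a Gaussian to samples drawn from the previous fit, with and without a small admixture from a fixed external source. The Gaussian family, sample sizes, and mixing fraction below are our illustrative choices, not the note's model.

```python
import random
import statistics

def next_generation(mu, sigma, n, external_frac, rng):
    """Fit a Gaussian to n samples from the current model N(mu, sigma^2),
    mixing in a fraction of samples from the fixed external source N(0, 1)."""
    k = int(external_frac * n)
    samples = ([rng.gauss(0.0, 1.0) for _ in range(k)] +
               [rng.gauss(mu, sigma) for _ in range(n - k)])
    return statistics.fmean(samples), statistics.stdev(samples)

def recurse(generations, n, external_frac, seed=0):
    """Return the fitted sigma averaged over the last 100 generations."""
    rng = random.Random(seed)
    mu, sigma, tail = 0.0, 1.0, []
    for g in range(generations):
        mu, sigma = next_generation(mu, sigma, n, external_frac, rng)
        if g >= generations - 100:
            tail.append(sigma)
    return statistics.fmean(tail)

sigma_closed = recurse(2000, 20, external_frac=0.0)  # purely self-generated data
sigma_open = recurse(2000, 20, external_frac=0.2)    # 20% external samples
```

With no external source the fitted spread decays geometrically, since finite-sample refitting loses a little variance on average each generation; a modest external admixture pins the spread away from zero.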
ABSTRACT. Inspired by soft-decision decoding in coding theory and by the assumptions of the universal approximation theorems on the continuity of a function to be approximated over a compact region, we introduce a novel training paradigm, termed soft training, that leverages the continuous nature of neural networks by training them to approximate a continuous function (regression) before deploying them for classification tasks. By using soft labels, in contrast to the traditional “hard labels”, we incorporate properties such as differentiability and differential entropy to construct a new labeling framework. Training and inference over different datasets demonstrate that soft training improves test accuracy while significantly reducing classification errors in regions far from the dataset points.
New Convergence Rates for Function Approximation Using Kernel Methods
ABSTRACT. We derive new bounds for function approximation using reproducing kernel Hilbert spaces by analysing kernels which are k-times continuously differentiable, instead of kernels which are norm equivalent to Sobolev spaces. This means our method allows for a much wider choice of kernels, including the popular exponential quadratic and polynomial kernels, which are widely used in practical applications of kernel methods. Our analysis also reveals other possible avenues for new and improved convergence bounds, depending on how we describe our target domain and set of sample points.
Joint Problems in Learning Multiple Dynamical Systems
ABSTRACT. Clustering of time series is a well-studied problem with applications ranging from quantitative, personalized models of metabolism obtained from metabolite concentrations to state discrimination in quantum information theory. We consider a variant where, given a set of trajectories and a number of parts, we jointly partition the set of trajectories and learn linear dynamical system (LDS) models for each part, so as to minimize the maximum error across all the models. We present globally convergent methods and EM heuristics, accompanied by promising computational results. A key highlight of the method is that it does not require a predefined hidden-state dimension but instead provides an upper bound on it. Additionally, it offers guidance for determining the regularization in system identification.
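In the spirit of the EM heuristics mentioned above, a minimal alternating scheme assigns each trajectory to the linear model that predicts it best and then refits each model by least squares. This sketch is our simplification (fully observed states, no hidden dimension, squared prediction error rather than the max-error objective), not the paper's globally convergent method.

```python
import numpy as np

def fit_lds(trajs):
    """Least-squares fit of x_{t+1} = A x_t over all transition pairs."""
    X = np.hstack([tr[:, :-1] for tr in trajs])
    Y = np.hstack([tr[:, 1:] for tr in trajs])
    A, *_ = np.linalg.lstsq(X.T, Y.T, rcond=None)
    return A.T

def fit_error(A, tr):
    return float(np.linalg.norm(tr[:, 1:] - A @ tr[:, :-1]))

def cluster_lds(trajs, k, iters=20, seed=0):
    """Alternate assignment and refitting (k-means-style, may hit local optima)."""
    labels = np.random.default_rng(seed).integers(0, k, size=len(trajs))
    for _ in range(iters):
        models = [fit_lds([t for t, l in zip(trajs, labels) if l == j] or trajs)
                  for j in range(k)]
        labels = np.array([int(np.argmin([fit_error(A, t) for A in models]))
                           for t in trajs])
    return labels, models

# Two ground-truth systems: a damped rotation and an axis-aligned contraction.
rng = np.random.default_rng(1)
th = 0.6
A1 = 0.95 * np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
A2 = np.array([[0.3, 0.0], [0.0, 0.8]])

def make_traj(A, T=30):
    x, xs = rng.standard_normal(2), []
    for _ in range(T):
        xs.append(x)
        x = A @ x + 0.01 * rng.standard_normal(2)
    return np.array(xs).T

trajs = [make_traj(A1) for _ in range(6)] + [make_traj(A2) for _ in range(6)]
labels, models = cluster_lds(trajs, k=2)
```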
Artificial Parameter Based Approximate Solutions of Non-Symmetric Matrix Riccati Differential Equation
ABSTRACT. Two alternative methods for the approximate solution of a non-symmetric matrix Riccati differential equation are proposed. Both methods are based on the artificial parameter approach. In both cases, the equation is parametrized in such a way that the resulting equation coincides with the original one when the parameter equals one. The solution of the parametrized equation is represented as a power series in the parameter. Conditions for strong uniform convergence of the series are derived, including the case where the parameter is equal to one. This yields sufficient conditions for the existence of a bounded solution on the entire time interval. Moreover, subject to these conditions, a partial sum of the series constitutes an approximate solution of the original equation. A numerical example dealing with a Nash linear-quadratic differential game is presented.
LLM-Stackelberg Games: Conjectural Reasoning Equilibria, Meta-Cognition, and Applications to Spearphishing
ABSTRACT. This paper introduces a novel class of sequential decision-making games, LLM-Stackelberg Games, in which agents reason and interact via large language models (LLMs). Extending the classical Stackelberg framework, this formulation captures scenarios where players’ actions and reasoning processes are communicated in natural language, and where the follower’s strategy is only partially observable to the leader. The proposed model incorporates both reasoning-level policies and behavioral outcomes, leading to a new solution concept: the Stackelberg Reasoning and Behavioral Equilibrium (SRBE). To address settings with incomplete information, we introduce the Conjectural Reasoning and Behavioral Equilibrium, where the leader optimally forms and calibrates beliefs about the follower’s reasoning process. We extend the analysis to multi-round dynamic games, establishing conditions under which forward-calibrated beliefs and backward-optimized strategies are jointly consistent. The framework also includes a mechanism for symbolic knowledge synthesis, enabling agents to expand their reasoning capabilities over time. Applications to adversarial LLM scenarios, such as spearphishing, demonstrate the framework’s ability to model cognitive asymmetries and strategic deception in multi-agent systems.
Deception against Game-Theoretic Learning in Complex Adaptive Systems
ABSTRACT. Complex adaptive systems (CAS) are characterized by their intricate complexity and diverse components, both digital and analog, which communicate through multiple channels. These systems are prevalent in various applications, making them attractive targets for adversarial attacks aimed at causing damage, confusion, and deception. Deception acts as a strategic defense mechanism against adversaries lacking crucial information, forcing them to adopt less effective strategies to the defender's benefit. When an adversary seeks to determine the optimal attack on an unfamiliar system, a defender with thorough knowledge of the system's dynamics can introduce deceptive inputs to mislead the adversary. This work aims to design these deceptive inputs to guide the adversary towards learning a pre-selected, suboptimal attack. Specifically, we demonstrate how to corrupt the dynamics of a learning algorithm to mislead it into converging on a suboptimal policy, and how to directly attack the CAS to inflict maximum damage while remaining undetected. This challenge involves solving coupled algebraic Riccati and Lyapunov equations, which are analytically complex. However, a block successive over-relaxation algorithm is employed to obtain their numerical solution, with proven convergence under specific conditions. Simulations on a benchmark aircraft showcase the algorithm's effectiveness in steering adversaries towards less damaging attacks.
Experimentation in the Presence of Strategic Buyers: Unbiased Estimation Under Intertemporal Substitution
ABSTRACT. We consider a retailer running a switchback experiment on the price of a single product, with infinite supply.
In such a setting, buyers may intertemporally substitute: they might delay purchase in anticipation of a potentially lower price. This behavior can bias the resulting estimate of the treatment effect of the price on demand. We show that this effect is unavoidable with experiments that involve only two price levels, and we also show how experiments with three price levels can be used to construct an estimator that fully removes the bias.
Persuasion under Dynamic Trust: Parametric Sender Behavior
ABSTRACT. We investigate information design for sequential sender-receiver interactions, where trust in the sender evolves dynamically. In our setting, a higher-level decision-maker has access to private information about the state and strategically reveals it to lower-level agents. We model trust as the receiver agents' perceived credibility of the sender's messages. The sender faces a trade-off between sending accurate signals to maintain or build trust and sending misleading signals for short-term gains. Specifically, the sender uses a mixed parameterized strategy that adjusts based on the current level of trust. Receivers can detect incorrect signals with some nonnegative probability, leading to a reduction in trust. We show that trust converges almost surely to a fixed point, irrespective of the initial trust. We then characterize the optimal sender strategy that maximizes its long-run payoffs, and analyze the sensitivity of both strategies and equilibrium outcomes to changes in game parameters.
Our findings highlight the role of trust dynamics in shaping strategic communication and extend the information design framework by incorporating trust as a state variable beyond the standard Bayesian obedience paradigm.
From signaling to interviews in random matching markets
ABSTRACT. In many two-sided labor markets, interviews are conducted before matches are formed. An increase in the number of interviews in the market for medical residencies raised the demand for signaling mechanisms, in which applicants can send a limited number of signals to communicate interest. We study the role of signaling mechanisms in reducing the number of interviews in centralized random matching markets with post-interview shocks. For the market to clear, we focus on interim stability, which extends the notion of stability to ensure that agents do not regret not interviewing with each other. A matching is almost interim stable if it is interim stable after removing a vanishingly small fraction of agents.
We first study signaling mechanisms in random matching markets with $n$ agents when agents on the short side, long side, or both sides signal their top $d$ preferred partners. Interview graphs are formed by including all pairs where at least one party has signaled the other. We show that when $d = \omega(1)$, short-side signaling leads to almost interim stable matchings. Long-side signaling is only effective when the market is almost balanced. Conversely, when the interview shocks are negligible and $d = o(\log n)$, both-side signaling fails to achieve almost interim stability. For larger $d \ge \Omega(\log^2 n)$, short-side signaling achieves perfect interim stability, while long-side signaling fails in imbalanced markets. We build on our findings to propose a signaling mechanism for multi-tiered random markets. Our analysis identifies conditions under which signaling mechanisms are incentive compatible. A technical contribution is the analysis of a message-passing algorithm that efficiently determines interim stability and matching outcomes by leveraging local neighborhood structures.
Stochastic model and decomposition schemes for relaxation oscillator based SAT solver
ABSTRACT. In recent times there has been a push to use specific analog computing paradigms and architectures to solve hard combinatorial problems such as the constraint satisfaction problem---the k-SAT problem. One such approach uses a CMOS chip with relaxation oscillators and a combination of analog and digital logic to find SAT solutions (when they exist). In this paper we discuss a stochastic model for the relaxation oscillator approach; this stochastic model is a novel Markov chain Monte Carlo method for such problems. Further, we discuss how the physical constraints of hardware with fixed capabilities, such as the CMOS chip mentioned above, lead to novel decomposition schemes and, in turn, randomized algorithms for solving large constraint satisfaction problems.
Scheduling with Uncertain Holding Costs and its Application to Content Moderation
ABSTRACT. In content moderation for social media platforms, the cost of delaying the review of a piece of content is proportional to its view trajectory, which fluctuates and is a priori unknown. Motivated by such uncertain holding costs, we consider a queueing model where job states evolve based on a Markov chain with state-dependent instantaneous holding costs. We demonstrate that in the presence of such uncertain holding costs, the two canonical algorithmic principles, instantaneous-cost ($c\mu$-rule) and expected remaining-cost ($c\mu/\theta$-rule), are suboptimal. By viewing each job as a Markovian ski-rental problem, we develop a new index-based algorithm, "Opportunity-adjusted Remaining Cost (OaRC)", that adjusts to the opportunity of serving jobs in the future when uncertainty partly resolves. We show that the regret of OaRC scales as $\tilde{O}(L^{1.5}\sqrt{N})$, where L is the maximum length of a job's holding cost trajectory and N is the system size. This regret bound shows that OaRC achieves asymptotic optimality when the system size N scales to infinity. Moreover, its regret is independent of the state-space size, which is a desirable property when job states contain contextual information. We corroborate our results with an extensive simulation study based on two holding cost patterns (online ads and user-generated content) that arise in content moderation for social media platforms. Our simulations based on synthetic and real datasets demonstrate that OaRC consistently outperforms existing practice, which is based on the two canonical algorithmic principles.
Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs
ABSTRACT. Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the fully heterogeneous setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). Each WCMDP consists of N arms (or subproblems), which have distinct model parameters in the fully heterogeneous setting, leading to the curse of dimensionality when N is large. We show that, under mild assumptions, an efficiently computable policy achieves an O(1/\sqrt{N}) optimality gap in the long-run average reward per arm for fully heterogeneous WCMDPs as N becomes large. This is the first asymptotic optimality result for fully heterogeneous average-reward WCMDPs. Our main technical innovation is the construction of projection-based Lyapunov functions that certify the convergence of rewards and costs to an optimal region, even under full heterogeneity.
Dynamic Matching with Post-allocation Service and its Application to Refugee Resettlement
ABSTRACT. Motivated by our collaboration with a major refugee resettlement agency in the U.S., we study a dynamic matching problem where each new arrival (a refugee case) must be matched immediately and irrevocably to one of the static resources (a location with a fixed annual quota). In addition to consuming the static resource, each case requires post-allocation services from a server, such as a translator. Given the uncertainty in service time, a server may not be available at a given time, so we refer to it as a dynamic resource. Upon matching, the case waits for service in a first-come-first-serve manner. Bursty matching to a location may result in undesirable congestion at its corresponding server. Consequently, the central planner (the agency) faces a dynamic matching problem with an objective that combines the matching reward (captured by pair-specific employment outcomes) with the congestion cost for dynamic resources and the over-allocation cost for the static ones. Motivated by the observed fluctuations in the composition of refugee pools across the years, we aim to design algorithms that do not rely on distributional knowledge. We develop learning-based algorithms that are asymptotically optimal in certain regimes, easy to interpret, and computationally fast. Our design is based on learning the dual variables of the underlying optimization problem; however, the main challenge lies in the time-varying nature of the dual variables associated with dynamic resources. Our theoretical development brings together techniques from Lyapunov analysis, adversarial online learning, and stochastic optimization. On the application side, when tested on real data from our partner agency and incorporating practical considerations, our method outperforms existing ones, making it a viable candidate for replacing the current practice upon experimentation.
Learning and predicting uncertain dynamical systems from data
ABSTRACT. Predictive models embody our physical understanding of the world in the form of computable numerical equations, sometimes resulting in large multiphysics, multiscale codebases comprising millions of lines of code. These models make many assumptions involving choices of discretization and other numerical methods, leading to different predictions, but more often the largest model errors and predictive uncertainties lie in missing or poorly understood processes or how unresolved scales are approximated or "parameterized". The uncertainty space becomes high- or infinite-dimensional, since the key unknowns concern the functional forms of missing or postulated terms in the system governing equations.
These challenges have motivated the emergence of hybrid models that are partly numerical equation solvers and partly machine learning, which can learn missing or approximated physics from data. However, many challenges remain for hybrid models, including their training procedures, their computational expense, and the intrinsic "learnability" or system identification of the physics itself.
Using neural ordinary and partial differential equation systems as examples, we outline a research program to tackle these challenges; avenues include system identification methods, synthetic data augmentation and active learning, Bayesian uncertainty quantification and Bayesian deep learning, dimension and model reduction, and online differentiable programming. We conclude with open problems and future directions.
ABSTRACT. Neural operators offer a powerful tool for modeling complex dynamical systems, but often struggle with error accumulation over long time horizons and maintaining stability and physical consistency. We introduce a multiscale implicit neural emulator that enhances long-term prediction accuracy by conditioning on a hierarchy of lower-dimensional representations of future states. Drawing inspiration from the stability properties of numerical implicit time-stepping methods, our approach leverages predictions several steps ahead in time at increasing compression rates for next-timestep refinements. By actively adjusting the temporal downsampling ratios, our design enables the model to capture dynamics across multiple granularities and enforce long-range temporal coherence. Experiments on turbulent fluid dynamics show that our method achieves high short-term accuracy and produces long-term stable forecasts, significantly outperforming autoregressive baselines while adding minimal computational overhead.
ABSTRACT. Scientific investigations often involve modeling, understanding, and reasoning about complex systems characterized by significant uncertainty - both aleatoric (arising from inherent randomness) and epistemic (due to incomplete knowledge). These uncertainties are difficult to reduce and, in practice, impossible to eliminate entirely. The situation is further complicated by data scarcity (especially, relative to system complexity) and the high cost of acquiring additional data. In this talk, we explore how to effectively address these challenges in the context of AI for Science. We focus on Bayesian approaches to uncertainty quantification (UQ), active learning (AL), and optimal experimental design (OED), and discuss how modern AI/ML techniques can both leverage and advance these methods.
ABSTRACT. This talk details progress towards developing foundation models in a spatiotemporal physics setting. In particular, we focus on a diffusion model framework which, for a given class of PDEs and multi-modal data, provides a unified architecture that can perform forecasting conditioned on different parameters/initial conditions, solve inverse problems, and carry out data assimilation and augmentation tasks. By incorporating discretized PDE information into the generative process, our method ensures physically consistent sampling and achieves high fidelity. Additionally, we combine this approach with conditional modeling to enable efficient PDE solution generation based on parameters, macroscopic quantities, or partial field measurements, showcasing its adaptability to a range of scientific and engineering tasks.
We explore the application of this framework in weather forecasting. Inspired by earlier findings where simultaneous sequential generation mitigates forward error propagation, we demonstrate that the model performs comparably to state-of-the-art numerical methods. The generated ensemble of forecasts offers uncertainty quantification without requiring artificial noise injection, thereby reducing biases.
Automation and generative AI accelerated design of bio-therapeutics
ABSTRACT. We discuss our recent experience in building a self-driving laboratory for designing biological therapies targeting intrinsically disordered proteins (IDPs) at scale. IDPs represent a significant challenge for therapeutic targeting due to their expansive conformational flexibility and their functional diversity across cancer signaling pathways. Hence, designing therapeutics that target IDPs based on specific cancer signaling pathways can be challenging -- including the need to capture their conformational plasticity, the context of post-translational modifications, and signaling specificity with respect to their binding partners. To address these challenges, we build on a molecular- and pathway-scale reasoning system, which allows us to discover known/novel interactions based on existing literature and experimental data. We integrate automated platforms and high-throughput experimental techniques to probe IDP-mediated interactions while simultaneously instantiating multiscale molecular simulations to capture the molecular mechanisms that govern these interactions using agentic systems. We then combine generative models to suggest novel biologic designs (governed by protein scaffolds such as affitin, affibody, or antibody) that are also tested within the experimental loop. This approach lets us obtain precise control over IDP-mediated signaling within cancer signaling pathways. We present our results in the context of targeting NMNAT2 -- a nicotinamide mononucleotide adenylyltransferase involved in colorectal, lung, and ovarian cancers -- and WHSC1 -- a histone methyltransferase involved in intestinal, skin, and stomach cancers.
ABSTRACT. In this work, we generalize results in exact channel simulation to an exponential communication cost, specifically to Campbell's average codeword length L(t) of order t and to Rényi's entropy. In exact channel simulation, given a source of shared randomness, a sender wishes to communicate a message to the receiver so they can generate a sample from the conditional distribution of the channel. We lower bound the Campbell cost of channel simulation using any sampling algorithm in terms of an appropriately defined alpha-mutual information. Using the Poisson functional representation of Li and El Gamal, we prove an upper bound on L(t) whose leading alpha-mutual information term has order within epsilon of the lower bound. Our results reduce to the bounds of Harsha et al. We also provide numerical examples for the additive white Gaussian noise channel, demonstrating that the upper and lower bounds are typically within 5-10 bits of each other.
ABSTRACT. This paper studies a biometric identification system model with two distinct biometric sources, extending beyond existing models that assume all biometric sequences are drawn from a common source. We consider a scenario involving two user groups, each with an exponential number of users, where the biometric sequences in each group are generated from separate sources and stored in two corresponding databases. To identify an unknown user, the decoder first performs hypothesis testing to determine which database the observed sequence is correlated with, followed by user identification within the correlated database. The main contribution of this work is the derivation of an achievable rate region for the trade-off between identification rates and the error exponent of the type-II error probability under negligible type-I error probability. As a specific instance of the derivation, the capacity region is obtained when testing against independence is considered. Our derivation coincides with previous results in special cases.
ABSTRACT. This paper studies a distributed hypothesis testing problem under communication constraints. Specifically, given a scenario where potentially dependent sources are available to a pair of independent devices, the aim is to understand whether a device that sees its own source precisely and a compressed description of the other source can reliably distinguish whether the two sources are distributed according to a known joint distribution or whether they are instead independent. This paper explores the behavior of the corresponding hypothesis test when compression employs random-binning encoder and considers the implication of the existence of such a test on the output statistics of a random binning source code. The extension to testing against other joint distributions with identical marginals as well as the indistinguishability between two distributions from overcompressed source descriptions are also discussed.
Source-Channel Coding in Multiple-Access Channels with Correlated Sources
ABSTRACT. A correlated binning scheme is presented for multiple-access channels with correlated sources. The achievable region defines an equilibrium between two virtual sub-channels: a primary channel for correlated bin information, and a side channel for resolving the uncertainty induced by correlation. Two important approaches in the literature can be interpreted as extremal modes of this equilibrium that seek to minimize side channel usage, the first, by over-binning on the primary channel and the second, by expanding the primary channel’s input symbols with multi-letter source-coding. We show that multi-letter coding does not expand the achievable region.
On Explicit Families of Outer Bounds for Broadcast Networks
ABSTRACT. The cut-set bounds arise from the graph-theoretic notion of a cut and are used as outer bounds for broadcast networks. Although these bounds are tight for broadcasting a single message, they are loose when multiple messages are broadcast. The generalized cut-set bounds (GCBs) were proposed in [1] in implicit form as a broader class of outer bounds for this setup. In this work, we build on the GCB framework and present two families of explicit outer bounds with associated structure for broadcast networks. We also present two techniques for modifying GCBs that we call GCB-mappings. These are mappings under which the class of GCBs is closed, enabling us to customize a known outer bound to generate novel outer bounds with similar structure. The presented explicit families and GCB-mappings make up a systematic approach for deriving explicit outer bounds for broadcast networks. In addition to the concise proofs this approach yields, it also reveals a common structure shared across the inequalities within a capacity region.
We demonstrate the effectiveness of our explicit outer bounds by focusing on the combination network as a special case. First, we revisit the well-studied 3-user complete message set and show that the entire capacity region is captured by a single explicit family and a GCB-mapping. Next, we turn to a 4-user generalization of this problem that we call the cubic message set, which includes an additional receiver that decodes all seven original messages, as well as an additional unicast message. We show that the capacity region for this message set is captured by the presented explicit families and GCB-mappings. Finally, we present an explicit outer bound for the capacity region of the combination network for the 5-user cubic message set, obtained from the presented explicit outer bounds.
Reduced Complexity Encoding of Non-binary Quasi Cyclic Low-Density Parity-Check Codes
ABSTRACT. Non-binary low-density parity-check (LDPC) codes offer enhanced coding performance, particularly at small and medium block lengths. We develop encoders for systematic non-binary quasi-cyclic (QC) LDPC codes. A full-rank parity-check matrix is used to construct a generator matrix with a block-circulant structure, thereby reducing the encoder's computational complexity and making it suitable for resource-constrained applications. The proposed structured code designs demonstrate substantially reduced complexity in the encoding and decoding processes, while maintaining decoder performance within 0.2 dB of signal-to-noise ratio (SNR) compared to random assignment of optimal nonzero Galois field (GF) entries in the parity-check matrix. Lastly, we validate the design architecture by implementing the encoder for a (2,4)-regular rate-1/2 (64,32) non-binary QC LDPC code over GF(256) with a throughput of 1.6 Gbps on an FPGA board using a two-thread architecture.
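The complexity saving of a block-circulant generator comes from the fact that a circulant block is fully described by its first row, so the matrix-vector product needs O(n) storage instead of O(n^2). A minimal sketch over GF(2) (the codes above are over GF(256), but the structural point is the same; all values below are arbitrary):

```python
def circulant_rows(first_row):
    """All rows of the circulant matrix defined by its first row:
    row i equals the first row cyclically shifted right by i."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def encode_circulant(first_row, message):
    """GF(2) product C @ m computed from the stored first row only,
    i.e. without materializing the dense circulant block."""
    n = len(first_row)
    return [sum(first_row[(j - i) % n] & message[j] for j in range(n)) % 2
            for i in range(n)]

# Sanity check against the dense matrix-vector product.
row = [1, 0, 1, 1, 0, 0, 0, 1]
msg = [1, 1, 0, 1, 0, 1, 0, 0]
dense = circulant_rows(row)
naive = [sum(dense[i][j] * msg[j] for j in range(8)) % 2 for i in range(8)]
assert encode_circulant(row, msg) == naive
```

The same idea extends to GF(256) by replacing the bitwise AND/XOR with Galois-field multiplication and addition.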
ABSTRACT. Quantum relays are central to both quantum communication and distributed quantum computing, enabling long-distance transmission and modular architectures. Unlike classical repeaters, quantum repeaters preserve coherence without amplifying quantum information, relying on entanglement swapping and quantum error correction to overcome loss and decoherence. In this work, we investigate the transmission of quantum information via quantum relay channels. Our three-terminal relay model captures the trade-off between repeater-assisted and repeaterless communication strategies. We propose a decode-forward coding scheme and analyze both entanglement-assisted and unassisted scenarios. Our framework allows for different entanglement topologies between the transmitter, the relay and the destination receiver, recovering known results on entanglement-assisted and unassisted communication. Furthermore, we discuss the interpretation of coding with quantum side information. These findings serve as a stepping stone for the design of secure, efficient, and reliable quantum networks and the practical realization of quantum repeaters and long-range quantum key distribution.
A Cutting-plane Method for Semidefinite Programming with Potential Applications on Noisy Quantum Devices
ABSTRACT. There is increasing interest in quantum algorithms for optimization problems. Within convex optimization, interior-point methods and other recently proposed quantum algorithms are non-trivial to implement on noisy quantum devices. Here, we discuss an alternative approach to convex optimization in general, and semidefinite programming (SDP) in particular, based on a randomized variant of the cutting-plane method (RCP). We show how to leverage the quantum speed-up of an eigensolver to accelerate an SDP solver built on the cutting-plane method. For the first time, we demonstrate a practical implementation of a randomized variant of the cutting-plane method for semidefinite programming on instances from SDPLIB, a well-known benchmark.
Furthermore, we show that the RCP method is very robust to noise in the boundary oracle, which may make RCP suitable for use even on noisy quantum devices.
Winning Rates of (n, k) Quantum Coset Monogamy Games
ABSTRACT. We formulate the (n, k) Coset Monogamy Game, in which two players must extract complementary information of unequal size (k bits vs. n−k bits) from a random coset state without communicating. The complementary information takes the form of random Pauli-X and Pauli-Z errors on subspace states. Our game generalizes those considered in previous works that deal with the case of equal information size (k = n/2). We prove a convex upper bound of the information-theoretic winning rate of the (n, k) Coset Monogamy Game in terms of the subspace rate R = k/n in [0, 1]. This bound improves upon previous results for the case of R = 1/2, in part due to a structural result we prove on subspace permutations, which may have broader combinatorial interest. We also prove the achievability of an optimal winning probability upper bound for the class of unentangled strategies of the (n, k) Coset Monogamy Game.
The Sum of Leaf Weights in the Minimum Spanning Tree of the Complete Graph
ABSTRACT. Let $G_n$ be a random weighted complete graph on $n$ vertices, where edges have i.i.d. Unif(0,1) weights. Let $H_n$ be the Minimum Spanning Tree of $G_n$. The expected weight of $H_n$ is known to converge to $\zeta(3) = \sum_{k=1}^{\infty} \frac{1}{k^3}$ (Steele 1987). In this paper, we investigate the sum of weights of edges incident to the leaves of $H_n$, denoting this quantity by $L_n$. We give an improved lower bound on $\mathbb{E}[L_n]$, building on the technique of Bollob\'as, Gamarnik, Riordan, and Sudakov (2004), which leverages typical behavior of sparse Erd\H{o}s--R\'enyi random graphs. Our result directly yields an improvement over the known upper bound for the Probabilistic Minimum Spanning Tree problem, introduced by Bertsimas (1988).
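As a quick Monte Carlo illustration of the quantities involved (not of the paper's proof technique), the sketch below builds the MST of a random weighted complete graph with Kruskal's algorithm and sums the leaf-incident edge weights $L_n$; the size n and the seed are arbitrary:

```python
import random

def mst_kruskal(n, weight):
    """Kruskal's algorithm on the complete graph K_n; weight maps (u, v) -> w."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    tree = []
    for (u, v), w in sorted(weight.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
            if len(tree) == n - 1:
                break
    return tree

random.seed(0)
n = 150
weight = {(u, v): random.random() for u in range(n) for v in range(u + 1, n)}
tree = mst_kruskal(n, weight)
total = sum(w for _, _, w in tree)

# L_n: sum of weights of edges incident to a leaf (tree-degree-1 vertex).
deg = [0] * n
for u, v, _ in tree:
    deg[u] += 1
    deg[v] += 1
leaf_sum = sum(w for u, v, w in tree if deg[u] == 1 or deg[v] == 1)

assert len(tree) == n - 1
assert 0.8 < total < 1.7          # concentrates near zeta(3) ~ 1.202
assert 0.0 < leaf_sum < total
```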
ABSTRACT. We present a new approach to nonnegative matrix factorization (NMF) that directly focuses on the subspace structure of the data instead of the specific samples. The main idea is to find the borders of the data's principal subspace with the nonnegative orthant, which we call nonnegative subspace edges (NoSEs), and construct the factorization according to these NoSEs. We introduce a deterministic algorithm to find NoSEs in linear time, and straightforward techniques to obtain the desired NMF from these NoSEs. We show that this approach may lead to a deterministic, optimal, unique, and well-posed solution to NMF.
A Proof of The Changepoint Detection Threshold Conjecture in Preferential Attachment Models
ABSTRACT. We investigate the problem of detecting and estimating a changepoint in the attachment function of a network evolving according to a preferential attachment model on $n$ vertices, using only a single final snapshot of the network. Bet et al. show that a simple test based on thresholding the number of vertices with minimum degrees can detect the changepoint when the change occurs at time $n-\Omega(\sqrt{n})$. They further make the striking conjecture that detection becomes impossible for any test if the change occurs at time $n-o(\sqrt{n}).$ Kaddouri et al. take a step forward by proving that detection is impossible if the change occurs at time $n-o(n^{1/3}).$ In this paper, we resolve the conjecture affirmatively, proving that detection is indeed impossible if the change occurs at time $n-o(\sqrt{n}).$ Furthermore, we establish that estimating the changepoint with an error smaller than $o(\sqrt{n})$ is also impossible, thereby confirming that the estimator proposed in Bhamidi et al. is order-optimal.
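The minimum-degree statistic behind the threshold test of Bet et al. can be illustrated on a plain linear preferential attachment tree (no changepoint), where the fraction of degree-1 vertices is known to converge to 2/3; the size and seed below are arbitrary:

```python
import random

def pa_tree_degrees(n, seed=0):
    """Degrees of a linear preferential attachment tree: vertex t attaches
    to an existing vertex chosen with probability proportional to its degree."""
    random.seed(seed)
    endpoints = [0, 1]        # each edge contributes both of its endpoints,
    deg = [1, 1]              # so sampling from this list is degree-biased
    for t in range(2, n):
        target = random.choice(endpoints)
        deg.append(1)
        deg[target] += 1
        endpoints += [t, target]
    return deg

deg = pa_tree_degrees(20000)
leaf_fraction = sum(d == 1 for d in deg) / len(deg)
assert 0.6 < leaf_fraction < 0.73    # converges to 2/3 as n grows
```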
Active Generative Models for Inference and Learning in Complex Physical World
ABSTRACT. Existing AI models rely on passive data at the input and fixed resolution at the output. We will discuss the shortcomings of such simplifications for real-world applications and physical intelligence. In contrast, we introduce a new family of models that actively seek the most informative data during inference and learning. We show the power of such models via two sets of examples: 1) building more realistic world models with active token management, and 2) designing more efficient universal compressors for distributed learning.
ABSTRACT. As our world becomes increasingly interconnected, the informational landscape that drives decision-making is marked by ever-expanding scale and complex interdependencies. Motivated by computational constraints in emerging computing paradigms, we study how better leveraging structure can push the frontiers of large-scale inference. Through the lens of graphical models, we develop computationally efficient alternatives to canonical subroutines that underlie inference in modern machine learning and optimization infrastructure.
We present our contributions along two key directions. First, we optimize graph algorithms for learning from distributed data sources, addressing a key challenge in decentralized settings—namely, identifying simple probabilistic rules for organizing nodes to balance sparsity with reliable connectivity. Our results resolve several open problems related to the exact analysis of connectivity properties in a class of random graph models known as random k-out graphs, widely appearing as heuristics for network design in settings with limited trust. Second, we develop computationally efficient alternatives to parameter estimation in graphical models that retain the statistical guarantees of classical methods like maximum likelihood estimation while significantly cutting computational costs. Our work sheds new light on how the interplay between graph structure and performance can be leveraged to push the frontiers of efficient and provably reliable algorithms.
ABSTRACT. This paper proves an asymptotic lower bound on the average delay in detection of a change-point under the multi-armed bandit setting of dynamic sampling. A stopping time, which (almost) obtains this lower bound, is constructed for each sampling policy that possesses appropriate regularity conditions. The lower bound can be applied to rank sampling policies.
ABSTRACT. We consider contextual bandits with a finite number of arms, where the contexts are independent and identically distributed $d$-dimensional random vectors, and the expected rewards are linear in both the arm parameters and contexts. The LinUCB algorithm, which is near minimax optimal for related linear bandits, is shown to have a cumulative regret that is suboptimal in both the dimension $d$ and time horizon $T$, due to its over-exploration. A truncated version of LinUCB is proposed and termed "Tr-LinUCB", which follows LinUCB up to a truncation time $S$ and performs pure exploitation afterwards. The Tr-LinUCB algorithm is shown to achieve $O(d\log(T))$ regret if $S = Cd\log(T)$ for a sufficiently large constant $C$, and a matching lower bound is established, which shows the rate optimality of Tr-LinUCB in both $d$ and $T$ under a low dimensional regime. Further, if $S = d\log^{\kappa}(T)$ for $\kappa>1$, the loss compared to the optimal is an extra $\log\log(T)$ factor, which does not depend on $d$. This insensitivity to overshooting in choosing the truncation time of Tr-LinUCB is of practical importance.
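A minimal sketch of the truncation idea, assuming d = 2, two arms with hypothetical parameter vectors, and per-arm ridge regression: the UCB bonus is applied only while t < S, after which the policy purely exploits. This is an illustration of the mechanism, not the paper's exact algorithm or constants:

```python
import random

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def tr_linucb(T, S, theta, alpha=1.0, noise=0.05, seed=0):
    """Tr-LinUCB sketch: LinUCB index while t < S, pure exploitation after."""
    rng = random.Random(seed)
    K = len(theta)
    A = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(K)]   # ridge Gram matrices
    b = [[0.0, 0.0] for _ in range(K)]                  # response vectors
    regret = 0.0
    for t in range(T):
        x = [rng.random(), rng.random()]                # i.i.d. context
        scores = []
        for arm in range(K):
            Ainv = inv2(A[arm])
            est = [sum(Ainv[i][j] * b[arm][j] for j in range(2)) for i in range(2)]
            s = sum(est[i] * x[i] for i in range(2))
            if t < S:                                   # bonus only before truncation
                var = sum(x[i] * Ainv[i][j] * x[j]
                          for i in range(2) for j in range(2))
                s += alpha * var ** 0.5
            scores.append(s)
        k = max(range(K), key=scores.__getitem__)
        r = sum(theta[k][i] * x[i] for i in range(2)) + rng.gauss(0, noise)
        for i in range(2):
            b[k][i] += r * x[i]
            for j in range(2):
                A[k][i][j] += x[i] * x[j]
        best = max(sum(th[i] * x[i] for i in range(2)) for th in theta)
        regret += best - sum(theta[k][i] * x[i] for i in range(2))
    return regret

reg = tr_linucb(T=2000, S=200, theta=[[1.0, 0.0], [0.0, 1.0]])
assert 0.0 <= reg < 300.0    # far below the linear regret of uniform play
```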
ABSTRACT. We present a computationally efficient online kernel Cumulative Sum (CUSUM) method for change-point detection that utilizes the maximum over a set of kernel statistics to account for the unknown change-point location. Our approach exhibits increased sensitivity to small changes compared to existing kernel-based change-point detection methods, including the Scan-B statistic, which corresponds to a non-parametric Shewhart chart-type procedure. We provide accurate analytic approximations for two key performance metrics, the Average Run Length (ARL) and the Expected Detection Delay (EDD), which enable us to establish that the optimal window length is on the order of the logarithm of the ARL, ensuring minimal power loss relative to an oracle procedure with infinite memory. Moreover, we introduce a recursive calculation procedure for the detection statistics to ensure constant computational and memory complexity, which is essential for online implementation. Through extensive experiments on both simulated and real data, we demonstrate the competitive performance of our method and validate our theoretical results.
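For contrast with the kernel statistic, the classical parametric one-sided CUSUM already exhibits the constant-memory recursive update referred to above. A sketch assuming a unit-variance Gaussian mean shift from 0 to 1 (so the reference drift is delta/2 = 0.5); the threshold and sample sizes are arbitrary:

```python
import random

def cusum_alarm(xs, drift=0.5, threshold=8.0):
    """One-sided CUSUM for an upward mean shift: S_t = max(0, S_{t-1} + x_t - drift),
    updated in O(1) time and memory per sample; returns the alarm time or None."""
    s = 0.0
    for t, x in enumerate(xs, start=1):
        s = max(0.0, s + x - drift)
        if s > threshold:
            return t
    return None

random.seed(1)
change = 200
xs = [random.gauss(0, 1) for _ in range(change)] + \
     [random.gauss(1, 1) for _ in range(400)]
alarm = cusum_alarm(xs)
assert alarm is not None and change < alarm < change + 100
```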
ABSTRACT. An interesting issue in optimal stopping problems is how much performance is lost to time-causality of decision making, i.e., how the expected reward achieved by a "gambler" who is restricted to using stopping times adapted to a sequence of rewards, compares to that which a "prophet" who knows the entire sequence in advance can achieve. Prophet inequalities provide bounds on this loss, and tight bounds of this type are known for several problems, including notably situations in which the rewards (in [0,1]) are independent and there are sampling costs. This paper examines how Markovity affects these latter bounds. In particular, it is shown that when the rewards come from a possibly nonstationary Markov chain the independent case is worst-case, while somewhat surprisingly, i.i.d. rewards are not the worst case for stationary Markov rewards.
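The gambler/prophet comparison can be computed exactly in the classical i.i.d. Uniform(0,1) baseline (the worst case identified above) via backward induction, since E[max(X, v)] = (1 + v^2)/2 for X ~ Uniform(0,1):

```python
def gambler_value(n):
    """Optimal stopping value for n i.i.d. Uniform(0,1) rewards, by
    backward induction: v_k = E[max(X, v_{k-1})] = (1 + v_{k-1}^2) / 2."""
    v = 0.0
    for _ in range(n):
        v = (1 + v * v) / 2
    return v

def prophet_value(n):
    """E[max of n i.i.d. Uniform(0,1)] = n / (n + 1)."""
    return n / (n + 1)

for n in (1, 2, 10, 50):
    g, p = gambler_value(n), prophet_value(n)
    assert g <= p < 2 * g        # prophet inequality: the gap stays below 2
```

For example, gambler_value(1) = 0.5 and gambler_value(2) = 0.625, versus prophet values 0.5 and 2/3.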
Sequential Procedures for Early Detection of Machine Faults
ABSTRACT. Motivated by the application of detecting rolling element bearing faults, we study a non-Bayesian quickest change detection problem, where the change is from a known pre-change distribution to an unknown post-change distribution whose parameters belong to a known set. Recognizing the cyclostationary nature of rolling element bearing vibration signals, which possess correlations in the frequency domain, we construct two cyclic spectrum statistics that detect cyclostationarity in the vibration signal. We then develop sequential decision algorithms based on the above statistics and demonstrate their capability to detect small changes at low signal-to-noise ratio, with reasonable detection delay, low false alarm rate, and low computational complexity.
ABSTRACT. We consider a quantum change-point detection problem where Alice sends a sequence of states $\rho^{\otimes \tau} \otimes \sigma^{\otimes (T-\tau)}$ to Bob, one state at a time. Here $\rho$ is the original state and $\sigma$ is the mutated state. Bob wishes to determine the time index $\tau$ of the change point. In particular, we consider quickest quantum state change detection, where at each time instant, a copy of the current state is measured, using a pre-designed POVM, and then a classical CUSUM test is applied to the measurement outcomes to determine whether or not a change has occurred. We formulate the POVM optimization problems under the assumptions of known and unknown post-change states, present the optimal POVM structure, and develop efficient optimization procedures to compute the optimal POVM. We further extend the proposed framework to quickest quantum channel change detection.
Near-Field Quiet Zone Using Passive XL-RIS for 6G-IoT Coexistence in the Upper Mid-Band
ABSTRACT. The expansion of advanced applications and the high-density deployment of the Internet of Things (IoT) are driving the demand for new spectrum in sixth-generation wireless networks, especially in the upper mid-band (FR3). However, operating massive 6G-IoT networks in the FR3 band introduces critical coexistence issues with incumbent passive services, such as radio astronomy. This paper investigates an interference mitigation framework based on a passive extremely large reconfigurable intelligent surface (XL-RIS) operating in the radiative near-field. Exploiting near-field beamforming properties, we form an electromagnetically consistent quiet zone in the Fresnel region of an XL-RIS through destructive interference alignment. We aim to minimize interference at passive receivers while ensuring quality of service for IoT devices by optimizing the RIS phase shifters. We then solve this optimization problem using an efficient iterative algorithm. Simulation results demonstrate that the proposed method can effectively suppress interference at the target location and, in contrast to a conventional far-field solution, maintains IoT device connectivity even when the incumbent receiver lies in the direction of the BS beam. Moreover, this method enables the formation of a local quiet zone around the incumbent receiver with minimal impact on network coverage.
On Secrecy Capacity of Binary Beampointing Channels with Block Memory and Feedback
ABSTRACT. This paper investigates the secrecy capacity of the binary beampointing (BBP) channel with block memory and feedback, a simplified yet insightful model for millimeter-wave (mmWave) systems with beamformed transmissions and backscatter feedback. We consider a system where a legitimate receiver and a passive eavesdropper experience independent and uniformly distributed angular directions over transmission blocks, with the base station receiving noiseless, unit-delayed feedback from both, under per-symbol input cost constraints. We establish a closed-form upper bound on the secrecy capacity, which is based on the main channel between the base station and the legitimate receiver. Moreover, we propose a joint communication and adaptive sensing (JCAS) scheme and derive its achievable secrecy rate. Simulation results show that the gap between the inner and outer bounds narrows as the block length increases. This reveals the efficiency of this JCAS scheme, which strategically leverages feedback to balance the demands of sensing the legitimate user and preventing information leakage to the eavesdropper.
The Pulse-Shaping Trojan: Refinements and New Results
ABSTRACT. Recently, an insidious wireless hardware vulnerability was revealed that surreptitiously exfiltrates information from a wireless node by manipulating a transmitter's pulse-shaping filter. This paper introduces refinements and important new results for the pulse-shaping Trojan that further clarify its threat level and its clandestine nature. Error Vector Magnitude (EVM) is introduced as the metric for the distortion induced by this Trojan, replacing the inner product as a measure of the stealth of the rogue signaling. A Viterbi successive interference cancellation scheme is studied for the rogue receiver, to recover the rogue payload in the presence of the higher-power legitimate transmission. The differences and relative merits of the Viterbi receiver with respect to LMMSE interference cancellation are discussed in the context of Trojan signaling. Our findings also further illuminate the trade-off between the Trojan's stealth and its payload.
Optimizing Age of Information without Knowing the Age of Information
ABSTRACT. Consider a network where a wireless base station (BS) connects multiple source-destination pairs. Packets from each source are generated according to a renewal process and are enqueued in a single-packet queue that stores only the freshest packet. At each time slot, the BS decides which sources to schedule. Selected sources transmit their packet to the BS via unreliable links. Successfully received packets are forwarded to the corresponding destinations. The connection between the BS and the destinations is assumed unreliable and delayed. Information freshness is captured by the Age of Information (AoI) metric. The objective of the scheduling decisions is to leverage the delayed and unreliable AoI knowledge to keep the information fresh. In this talk, we derive a lower bound on the AoI achievable by any scheduling policy. Then, we develop an optimal randomized policy for arbitrary packet generation processes. Next, we develop minimum mean square error estimators of the AoI and system times, and a Max-Weight policy that leverages these estimators. We evaluate the AoI of the optimal randomized policy and the Max-Weight policy both analytically and through simulations. The numerical results suggest that the Max-Weight policy with estimation outperforms the optimal randomized policy even when the BS has no AoI knowledge.
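A minimal single-source sketch of the AoI metric, assuming a fresh packet is sent every slot over an unreliable link with success probability p (unlike the setting above, there is one source and no scheduling): a renewal-reward argument gives an average AoI of exactly 1/p, which the simulation reproduces:

```python
import random

def average_aoi(p, slots=100_000, seed=0):
    """Average Age of Information when the destination's age resets to 1
    on each successful delivery (prob. p) and grows by 1 otherwise."""
    rng = random.Random(seed)
    age, total = 1, 0
    for _ in range(slots):
        age = 1 if rng.random() < p else age + 1
        total += age
    return total / slots

avg = average_aoi(0.25)
assert 3.5 < avg < 4.5   # renewal-reward argument: average AoI = 1/p = 4
```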
On Estimation of Angles of Arrival in Monostatic ISAC Without Instantaneous Transmit CSI at the Transmitter
ABSTRACT. This paper explores the fundamental limits of Integrated Sensing and Communication (ISAC) in a more realistic setting compared to previous literature. We analyze a monostatic setting where the Base Station (BS) performs multi-target Angle of Arrival (AoA) estimation while simultaneously communicating with one of the targets. We assume that the BS has statistical CSI about all AoAs, with less uncertainty in the AoA of the communication receiver. The communication receiver is assumed to have perfect CSI. Utilizing a Bayesian Cramér-Rao Bound (BCRB) framework to characterize the fundamental limits of sensing under minimum mean square error (MMSE) criteria, we derive achievable BCRB-rate trade-off regions. Our approach introduces a number of transmission strategies that share power across sensing and communication beams over a coherence time. Our analysis reveals that beam allocation strategies leveraging the principal eigenvectors of the target-specific sensing matrices minimize individual AoA estimation errors, while strategies balancing sensing and communication directions optimize joint estimation performance at the cost of individual accuracy. We demonstrate that leveraging updated BCRB-based sensing information for the communication receiver, due to its lower channel uncertainty, enables significantly improved communication rates.
ABSTRACT. A recent line of work has emerged that uses machine learning / AI to learn the encoders and/or decoders of error-correcting codes. This is often done for AWGN channels in both the forward error correction and feedback settings. Do these codes learn dramatically new non-linear codes? When or why might they outperform model-based or algorithmically designed codes? In this talk, I will survey recent trends in learned error-correcting codes, present interesting open problems, and outline some of my work on "interpreting" such codes. Our interpretations can sometimes approximate the encoder and/or decoder black boxes, offer insights into training, lead to orders-of-magnitude reductions in the number of involved parameters, and provide an understanding of how error correction is performed.
Joint Constellation and Labeling Design for Gaussian Interference Channels via Autoencoders
ABSTRACT. We propose a learning-based modulation design and symbol labeling scheme for the two-user interference channel (IC) with arbitrary channel gains. To this end, we develop autoencoders (AEs) with trainable mappers and demappers, jointly optimized in an end-to-end manner. We show that initialization of the transmitter constellations plays a key role in achieving stable convergence and enhancing performance. Our analysis shows that the proposed AEs can learn effective modulation schemes across a wide range of interference levels and signal-to-noise ratios by jointly performing bitwise clustering through suitable labeling and maximizing the minimum symbol distances to reduce the impact of interference in the IC. We further incorporate low-density parity-check (LDPC) channel codes and demonstrate that the proposed AE-IC with LDPC outperforms existing methods in terms of bit error rates.
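One of the criteria the abstract says the autoencoders learn to optimize, maximizing the minimum symbol distance, is easy to state concretely. A minimal sketch (illustrative only, not the paper's training objective) computes that figure of merit for a complex constellation:

```python
import numpy as np

def min_symbol_distance(points):
    """Minimum pairwise Euclidean distance of a complex constellation."""
    pts = np.asarray(points, dtype=complex)
    # Pairwise distance matrix via broadcasting; keep the strict upper triangle.
    d = np.abs(pts[:, None] - pts[None, :])
    return d[np.triu_indices(len(pts), k=1)].min()
```

For unit-energy QPSK, the minimum distance is √2; a learned constellation for the interference channel would be judged, in part, by how large this quantity stays under interference.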
ABSTRACT. In this paper we propose a novel training scheme to improve the robustness of an end-to-end autoencoder-based channel code against physical universal adversarial perturbations (UAP) and jamming attacks. Our training algorithm improves robustness by augmenting clean training data with perturbations sampled uniformly from a bounded $\ell_2$-ball, encouraging consistent decoder outputs within these regions. Our contributions include the design of the training algorithm, a robustness comparison with standard autoencoders, and a benchmark against a traditional BPSK system with Hamming coding and maximum likelihood decoding. Results show that our approach enhances robustness to UAP and jamming attacks.
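The data-augmentation step described above, sampling perturbations uniformly from a bounded ℓ2-ball, can be sketched as follows (an illustrative implementation, not the authors' code; function names are ours):

```python
import numpy as np

def sample_l2_ball(dim, radius, rng):
    """Draw a point uniformly from the l2 ball of the given radius."""
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)                    # uniform direction on the sphere
    r = radius * rng.random() ** (1.0 / dim)  # radius law for uniform volume
    return r * v

def augment_batch(batch, radius, rng):
    """Add an independent bounded perturbation to each encoded signal."""
    return np.stack([x + sample_l2_ball(x.size, radius, rng).reshape(x.shape)
                     for x in batch])
```

Training would then penalize decoder disagreement between the clean and perturbed copies of each codeword, encouraging consistent outputs within the ball.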
DPIR: Deep Predictive Interference-Aware Routing for Wireless Networks
ABSTRACT. We address the problem of routing in wireless interference networks, where overlapping routes across multiple data flows cause interference and degrade link capacities. The objective is to design a multi-flow transmission strategy that maximizes overall network utility while accounting for dynamic interference patterns. While a range of classical and learning-based routing algorithms have been proposed, recent deep learning approaches are particularly promising due to their ability to generalize and adapt to complex network dynamics. However, most such methods still overlook the impact of wireless interference and rely on infrequent route updates to reduce computational and communication overhead. This introduces a fundamental challenge: achieving high spectral efficiency and adaptability without incurring the prohibitive cost of frequent, network-wide recomputation. In this collaborative research between Ben-Gurion University and Ceragon Networks Ltd., we address this challenge by developing a novel algorithm, dubbed Deep Predictive Interference-Aware Routing (DPIR), that integrates a graph neural network (GNN) with a deep reinforcement learning (DRL) agent to overcome these limitations. DPIR predicts network dynamics between routing updates and proactively determines routing schedules in advance. The GNN-based agent produces per-timestep routing decisions that are interference-aware and forward-looking. Extensive simulations over dense wireless topologies demonstrate the superiority of DPIR compared to existing routing approaches, showing consistent improvements in both throughput and packet delay. The algorithm's robustness across varying traffic loads and network structures highlights its potential for deployment in next-generation wireless systems, including 5G and beyond.
Single-Server SPIR over Binary Erasure Channels: Benefits of Noisy Side Information
ABSTRACT. We study a single-server Symmetric Private Information Retrieval (SPIR) problem. The server and the client communicate over a Binary Erasure Channel (BEC) and a public noiseless channel. The server stores $D$ files. The client has access to private, noisy side information for each file, which is generated by passing the files through a set of discrete memoryless test channels. The statistics of these test channels are shared between the client and the server, but the mapping linking each file to a specific test channel remains private to the client. Our objective is to design a protocol that allows the client to retrieve one file while ensuring that $(i)$ the server cannot deduce the mapping or the client's file choice, and $(ii)$ the client does not gain additional information about the remaining $D-1$ files. We define the SPIR capacity with side information as the supremum of all achievable rates, i.e., the ratio between the length of the requested file and the number of times the noisy channel is used.
Leaky Multi-Message Private Information Retrieval with Differential Privacy Guarantees
ABSTRACT. The Private Information Retrieval (PIR) problem aims to enable users to privately access data stored on remote servers. Recently, there has been a growing interest in \emph{leaky} PIR schemes, which enable users to trade off higher rates for a controlled leakage of information about the identities of the downloaded messages. Motivated by applications that require the simultaneous retrieval of multiple data items, we focus on the $\epsilon$-differential privacy framework for Leaky Multi-message PIR (LMPIR). Our paper complements the existing work on multi-message PIR, which has so far focused on perfect privacy.
Our main contribution is the construction of an LMPIR scheme based on a randomized query and answer generation mechanism that carefully balances the tradeoff between privacy and efficiency, and is tailored to maximize the retrieval rate under the given privacy parameter $\epsilon$. Our scheme requires a small degree of subpacketization $L$, which grows at most linearly with the number of servers. We also outline an information-theoretic upper bound on the maximum achievable rate for LMPIR. Our numerical results indicate that our scheme achieves a better privacy-efficiency tradeoff compared to alternatives, such as repeated application of leaky single-message PIR schemes.
Improved Achievable Rate for Single Server SPIR over BEC
ABSTRACT. We propose a single-server SPIR protocol over a Binary Erasure Channel (BEC) to achieve a positive retrieval rate for one-out-of-$D$ files. The server holds $D$ files and the client communicates via the BEC with erasure probability $\epsilon$, and a public noiseless link. The protocol ensures that the server cannot infer the client's choice and the client gains no information about the non-requested files. We define the retrieval rate as file length in bits per use of the erasure channel. In each round both entities perform a one-out-of-$m$ retrieval protocol ($m \leq D$), and a full one-out-of-$D$ retrieval is realized by repeating the one-out-of-$m$ step $\frac{D - 1}{m - 1}$ times. Compared with schemes that repeat one-out-of-2 retrieval $D - 1$ times, this approach cuts erasure channel uses by up to a factor of $m - 1$ when $\frac{m - 1}{m} \leq \epsilon < \frac{D - 1}{D}$. The retrieval rate meets the SPIR capacity when $D = 2$ or when $D \leq \frac{1}{1 - \epsilon}$.
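The round-counting argument above is simple arithmetic. A small sketch (ours, not the authors') makes the savings over repeated one-out-of-2 retrieval explicit:

```python
from math import ceil

def rounds(D, m):
    """One-out-of-m rounds needed to realize one-out-of-D retrieval,
    following the abstract's (D-1)/(m-1) repetition count."""
    return ceil((D - 1) / (m - 1))
```

For example, with $D = 5$ files, one-out-of-2 repetition needs 4 rounds while one-out-of-5 retrieval needs only 1, a factor-of-$(m-1)$ reduction in rounds when $m = D$.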
Is Neural Network Verification Useful and What Is Next?
ABSTRACT. Neural network verification is an instantiation of the classical formal verification problem, which is, given a model of a system and a specification, both with precise syntax and semantics, prove that the model satisfies the specification. The neural network verification problem instantiates this generic problem, where the model is a neural network represented as a mathematical function and the specification is typically defined as subsets over the input and output spaces of the function, where the model satisfies the specification if all inputs map only to desired outputs. The past decade or so has witnessed significant progress on this problem, such as improved scalability (e.g., in the sizes of the neural network and specifications), application to a variety of machine learning tasks, and important case studies in critical domains ranging from autonomy to medicine, among other advancements. However, is this progress so far useful, what are the current strengths and weaknesses of results in this research area, and what's next for this research area and related problems? This position presentation will take a critical view of this research area, discussing these aspects, highlighting where it is and is not useful, and presenting suggestions for what problems to investigate next.
ABSTRACT. Deep Neural Networks (DNN) have emerged as an effective approach to implementing challenging sub-problems. They are increasingly being used as components in critical transportation, medical, and military systems. However, like human-written software, DNNs may have flaws that can lead to unsafe system performance. To confidently deploy DNNs in such systems, strong evidence is needed that they do not contain such flaws. This has led researchers to explore the adaptation and customization of software verification approaches to the problem of neural network verification (NNV).
Many dozens of NNV tools have been developed in recent years, and as a field these techniques have matured to the point where realistic networks can be analyzed to detect flaws and to prove conformance with specifications. However, NNV tools are highly engineered and complex, and may harbor flaws that cause them to produce unsound results.
We identify commonalities in the algorithmic approaches taken by NNV tools to define a verifier-independent proof format -- activation pattern tree proofs (APTP) -- and design an algorithm for checking those proofs that is proven correct and optimized to enable scalable checking. We demonstrate that existing verifiers can efficiently generate APTP proofs, that an APTP checker significantly outperforms prior work on a benchmark of 16 neural networks and 400 NNV problems, and that it is robust to variation in APTP proof structure arising from different NNV tools.
StarV: A Qualitative and Quantitative Verification Framework for Learning-enabled Autonomy
ABSTRACT. Verification for deep learning models becomes essential for ensuring the correctness of system design involving the use of learning-based techniques. Numerous powerful verification techniques and tools have been developed to tackle this NP-hard, challenging problem. The primary focus of the state of the art has been on qualitative verification, which provides qualitative results, e.g., Safe, Unsafe, or Unknown. In this talk, we will introduce StarV, a novel framework that supports both qualitative and quantitative verification techniques. For qualitative verification, we will focus on a novel reachability-based verification approach using SparseImageStars to address two main bottlenecks in deep neural network verification: memory and scalability. For quantitative verification, we introduce a novel concept called ProbStar, which enables the reachability analysis of deep neural networks with probabilistically uncertain input sets. Additionally, we introduce a novel ProbStar Temporal Logic and associated verification algorithms that allow the quantitative verification of temporal behaviors in learning-enabled autonomy. Finally, we will conclude our talk by applying a ProbStar-based verification approach at runtime on a real learning-enabled F1Tenth platform and discussing some future work.
Reachable and Invariant Sets of Neural Controlled Systems via a Monotone Embedding
ABSTRACT. We propose a framework for reachability analysis of neural-network controlled systems. Our proposed methodology embeds the system dynamics into a higher-dimensional monotone dynamical system that is amenable to scalable and efficient analysis and synthesis methods. In particular, we introduce parametopes as a new reachable set representation defined through the intersection of time-varying, parameterized constraints on the system state. Parametopes accommodate familiar set representations such as intervals, polytopes, and ellipsoids. Our framework results in a dynamical approach to nonlinear reachability analysis: a single trajectory of a monotone embedding system provides a parametope reachable set for the original system, and uncertainties are accounted for through continuous parameter evolution. We show that any desired parameter evolution is allowable, as long as the offset dynamics are set accordingly, providing a virtual "control input" for reachable set computation that can be leveraged to reduce conservativeness. In a special case of the theory, we demonstrate how closing the loop for the parameter dynamics using the adjoint of the linearization results in a desirable first-order cancellation of the original system dynamics. We also show how our methodology extends to the setting of training safe neural controllers. We implement our approach in the JAX computation framework and demonstrate the efficiency and efficacy of reachable parametope computation on several examples and benchmarks from the literature.
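The core embedding idea, one trajectory of a doubled-up monotone system bounding all trajectories of the original, can be illustrated with a toy scalar example (this is only the interval special case, not the parametope machinery of the paper). For $\dot{x} = -x + u$ with $u \in [u_{lo}, u_{hi}]$, lower and upper bounds evolve independently:

```python
def embedding_step(lo, hi, u_lo, u_hi, dt):
    """One Euler step of the interval embedding for x' = -x + u.
    The vector field is decreasing in x's influence and increasing in u,
    so the bound dynamics decouple."""
    return lo + dt * (-lo + u_lo), hi + dt * (-hi + u_hi)

def reach_interval(x0_lo, x0_hi, u_lo, u_hi, dt=0.01, steps=100):
    """Single trajectory of the embedding system = reachable interval."""
    lo, hi = x0_lo, x0_hi
    for _ in range(steps):
        lo, hi = embedding_step(lo, hi, u_lo, u_hi, dt)
    return lo, hi
```

Starting from $x_0 \in [0, 1]$ with $u \in [-0.1, 0.1]$, the interval contracts toward $[-0.1, 0.1]$, and the single embedded trajectory certifies this over-approximation at every step.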
Analytical Lyapunov Function Discovery: An RL-based Generative Approach
ABSTRACT. Despite advances in learning-based methods, finding valid Lyapunov functions for nonlinear dynamical systems remains challenging. Current neural network approaches face two main issues: challenges in scalable verification and limited interpretability. To address these, we propose an end-to-end framework using transformers to construct (local) analytical Lyapunov functions, which simplifies formal verification, enhances interpretability, and provides valuable insights for control engineers. Our framework consists of a transformer-based trainer that generates candidate Lyapunov functions and a falsifier that verifies candidate expressions and refines the model via risk-seeking policy gradient. Unlike Alfarano et al. (2024), which utilizes pre-training and seeks global Lyapunov functions for low-dimensional systems, our model is trained from scratch via reinforcement learning (RL) and succeeds in finding local Lyapunov functions for high-dimensional and non-polynomial systems. Given the symbolic nature of the Lyapunov function candidates, we employ efficient optimization methods for falsification during training and formal verification tools for the final verification. We demonstrate the efficiency of our approach on a range of nonlinear dynamical systems with up to ten dimensions and show that it can discover Lyapunov functions not previously identified in the control literature. Full implementation is available at https://github.com/JieFeng-cse/Analytical-Lyapunov-Function-Discovery.
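The trainer/falsifier loop described above hinges on cheaply checking a symbolic candidate against the dynamics. A minimal sketch of the falsification step (our illustration, not the paper's implementation; the system and candidate are hypothetical) samples states and looks for a point where the Lie derivative fails to be negative:

```python
import numpy as np

def vdot(x):
    """Lie derivative of the candidate V(x) = x1^2 + x2^2
    along the toy dynamics x' = (-x1 + x2, -x1 - x2)."""
    x1, x2 = x
    f = np.array([-x1 + x2, -x1 - x2])
    grad = np.array([2 * x1, 2 * x2])   # gradient of V
    return grad @ f

def falsify(n_samples=1000, box=1.0, seed=0):
    """Sampling-based falsifier: return a counterexample state where
    Vdot >= 0 away from the origin, or None if the candidate survives."""
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        x = rng.uniform(-box, box, size=2)
        if np.linalg.norm(x) > 1e-6 and vdot(x) >= 0:
            return x
    return None
```

Here $\dot{V} = -2(x_1^2 + x_2^2) < 0$ everywhere away from the origin, so the falsifier finds no counterexample; in the framework above, a surviving candidate would then be handed to a formal verification tool for the final certificate.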
Bridging Learning and Reasoning: From Solvers to LLMs
ABSTRACT. From its inception, AI has had two broad sub-fields, namely, reasoning and learning, with little interaction between them. In recent years, there is a growing recognition that if our goal is to solve problems at the cutting edge of AI (trustworthy AI, AI for Science, AI for Math), then we need to bring these sub-fields together. In this talk, I will present techniques and results showing how machine learning (ML) can be used in the service of automated reasoning (à la SAT/SMT solvers), and in the reverse direction, how symbolic reasoning engines can be used to improve LLMs. The key idea in both directions is the same: the ML model is viewed as a synthesizer that generates assignments/code/proofs/molecules/equations, while the reasoning engine acts as a verifier that provides corrective feedback to the model at various points (training, fine-tuning, or inference) in its life cycle.