Program for Friday, September 19th

PROGRAM FOR FRIDAY, SEPTEMBER 19TH

Days:

previous day

all days

View: session overview talk overview

08:30-10:10 Session REGS08: Informational Characterization

Chair:

Hao Liang

Location: 3000 NCSA

08:30	Danil Akhtiamov, Reza Ghane and Babak Hassibi A Precise Performance Analysis of the Randomized Singular Value Decomposition ABSTRACT. The Randomized Singular Value Decomposition (RSVD) is a widely used algorithm for efficiently computing low-rank approximations of large matrices, without the need to construct a full-blown SVD. Of interest, of course, is the approximation error of RSVD compared to the optimal low-rank approximation error obtained from the SVD. While the literature provides various upper and lower error bounds for RSVD, in this paper we derive precise asymptotic expressions that characterize its approximation error as the matrix dimensions grow to infinity. Our expressions depend only on the singular values of the matrix, and we evaluate them for two important matrix ensembles: those with power law and bilevel singular value distributions. Our results aim to quantify the gap between the existing theoretical bounds and the actual performance of RSVD. Furthermore, we extend our analysis to polynomial-filtered RSVD, deriving performance characterizations that provide insights into optimal filter selection.
08:50	Haricharan Balasundaram and Andrew Thangaraj Estimating Error in Natural Distribution Estimation PRESENTER: Haricharan Balasundaram ABSTRACT. Given i.i.d. samples from an unknown discrete distribution, the goal of distribution estimation is to construct an accurate estimate of the underlying distribution. Natural distribution estimators assign one probability estimate to all letters occurring with the same frequency, and this is well-justified for i.i.d. models. However, natural estimators can be significantly erroneous for low frequency or missing (frequency 0) letters in large alphabet scenarios. In this work, we introduce a statistic that captures the unavoidable error at a particular frequency of any natural distribution estimator. For this proposed error statistic, which depends on the distribution and the samples, we provide an estimator that is non-linear in the prevalences (frequencies of frequencies). We show that the proposed estimator has low bias and is consistent, and can be used to ascertain if the distribution restricted to letters of same frequency is close to uniform. Our approach is validated through simulations on synthetic and natural language data.
09:10	Qing Li Linking Diffusion Models & I-MMSE ABSTRACT. This paper establishes a theoretical connection between diffusion models and the I-MMSE theorem in information theory. Specifically, we demonstrate that two widely used log-likelihood-based training objectives in diffusion models are closely related, with their relationship explicitly established via the I-MMSE theorem. As a direct consequence of this theoretical link, we derive a single-step denoising method that achieves optimal performance without iterative sampling.
09:30	Leonhard Grosse, Sara Saeidian, Tobias J. Oechtering and Mikael Skoglund Bounds on the Privacy Amplification of Arbitrary Channels via the Contraction of $f_\alpha$-Divergence PRESENTER: Leonhard Grosse ABSTRACT. We examine the privacy amplification of channels that do not necessarily satisfy any LDP guarantee by analyzing their contraction behavior in terms of $f_\alpha$-divergence, an $f$-divergence related to Rényi-divergence via a monotonic transformation. We present bounds on contraction for restricted sets of prior distributions via $f$-divergence inequalities and present an improved Pinsker's inequality for $f_\alpha$-divergence based on the joint range technique by Harremoës and Vajda. The presented bound is tight whenever the value of the total variation distance is larger than $\nicefrac{1}{\alpha}$. By applying these inequalities in a cross-channel setting, we arrive at strong data processing inequalities for $f_\alpha$-divergence that can be adapted to use-case specific restrictions of input distributions and channel. The application of these results to privacy amplification shows that even very sparse channels can lead to significant privacy amplification when used as a post-processing step after local differentially private mechanisms.
09:50	Jonathan Ponniah Information-Theoretic Outer-bounds from Sphere-Packing, Compactness, and Zero-One Laws ABSTRACT. New derivations of outer-bounds for the point-to-point channel and the relay channel are proposed using a combination of sphere-packing and compactness arguments, as well as zero-one laws. The key idea is the notion of ``decoding sets'', which map regions of the ${\bf y}$-space to transmitted codewords from the code-book. Each decoding set requires an asymptotic minimum volume of $2^{nH(Y\|X)}$ sequences in order for the decoding error to vanish. Strong converses for the point-to-point channel are derived from Kolmogorov's zero-one law. An outer-bound for the relay channel is proposed when the channel capacity exceeds $I(X_{s};Y_{r}\|X_{r})$. The proposed outer-bound is $\max_{p(x_{s})p(x_{r})}I(X_{s};\hat{Y}_{r}Y_{d}\|X_{r})$ for $I(\hat{Y}_{r};Y_{r}\|X_{r})>I(X_{r};Y_{d})$. This outer-bound can be achieved using a simplified version of the original compress-forward scheme.

08:30-10:10 Session VSGA02: Robust Inference and Learning II

Organizers: Venu Veeravalli, Shaofeng Zou, & George Atia

Chair:

Venugopal Veeravalli

Location: 1040 NCSA

08:30	Tong Yang, Yu Huang, Yingbin Liang and Yuejie Chi Transformers Provably Learn Symbolic Multi-step Reasoning via Chain of Thought ABSTRACT. Transformers have demonstrated remarkable capabilities in multi-step reasoning tasks. However, understandings of the underlying mechanisms by which they acquire these abilities through training remain limited, particularly from a theoretical standpoint. This work investigates how transformers learn to solve symbolic multi-step reasoning problems through chain-of-thought processes, focusing on path-finding in trees. We analyze two intertwined tasks: a backward reasoning task, where the model outputs a path from a goal node to the root, and a more complex forward reasoning task, where the model implements two-stage reasoning by first identifying the goal-to-root path and then reversing it to produce the root-to-goal path. Our theoretical analysis, grounded in the dynamics of gradient descent, shows that trained one-layer transformers can provably solve both tasks with generalization guarantees to unseen trees. In particular, our multi-phase training dynamics for forward reasoning elucidate how different attention heads learn to specialize and coordinate autonomously to solve the two subtasks in a single autoregressive path. These results provide a mechanistic explanation of how trained transformers can implement sequential algorithmic procedures. Moreover, they offer insights into the emergence of reasoning abilities, suggesting that when tasks are structured to take intermediate chain-of-thought steps, even shallow multi-head transformers can effectively solve problems that would otherwise require deeper architectures.
08:50	Sean Meyn Reinforcement Learning Design for Quickest Change Detection - Extended Abstract ABSTRACT. The field of quickest change detection (QCD) focuses on the design and analysis of online algorithms that estimate the time at which a significant event occurs. As in much of the existing literature, the goal of this paper is to devise a stopping rule that minimizes an L1 loss. Analysis is cast in a Bayesian framework, where QCD is formulated as an optimal stopping problem with partial observations; motivation for techniques from reinforcement learning easily follow. The main contributions are as follows: • A Q-learning algorithm is proposed for the general partially observed optimal stopping problem. It is shown to converge under linear function approximation, given suitable assumptions on the basis functions. An example is provided to demonstrate that this assumption is necessary to ensure algorithmic stability. • Prior theory motivates a particular choice of features in applying Q-learning to QCD. It is shown that, in several scenarios and under ideal conditions, the resulting class of policies contains one that is approximately optimal. • Numerical experiments show that Q-learning consistently produces policies that perform close to the best achievable within the chosen function class. When the true optimal policy lies in this class, the algorithm accurately approximates it.
09:10	Jiantao Jiao Challenges of Reliable Training and Evaluation of Large Language Models ABSTRACT. How can we robustly evaluate large language models (LLMs)? What training methods enable them to perform reasoning while generalizing to unseen tasks? And how can we interpret two-hop reasoning behavior in LLMs? In this talk, we will present our recent work addressing these questions and highlight several open problems in the field.
09:30	Jeffrey Negrea, Alexander Terenin and Gabriele Farina Follow-the-Perturbed-Leader with Between-Action Dependence ABSTRACT. We present a framework for analyzing Gaussian follow-the-perturbed-leader (FTPL) algorithms for full-information online learning problems when the perturbation distribution exhibits between-action dependence. Applications include FTPL algorithms for online learning for i) infinite action spaces when the adversary plays bounded Lipschitz reward functions, where the perturbations are random functions sampled from a Gaussian process; and ii) linear polyhedral games, where the perturbation is a random linear function. We demonstrate how to tightly account for dependence between actions in the FTPL analysis, and present an ansatz for the selection of the perturbation distribution based on a Bayesian perspective of FTPL as a variant of Thompson sampling.

08:30-10:10 Session YXRC01: Economics and Computation

Organizers: Yunzong Xu & Bhaskar Ray Chaudhury

Chairs:

Yunzong Xu and Bhaskar Ray Chaudhury

Location: 1030 NCSA

08:30	Raul Castro Fernandez Toward a Theory of Data Ecology: Understanding and Steering Data Ecosystems ABSTRACT. Data is increasingly recognized as an economic asset, but it is also a force that shapes the behavior and structure of entire ecosystems. In this talk, I will introduce the concept of data ecology, a framework for understanding how data flows among agents, supports tasks, and generates value. The goal is to go beyond conventional treatments of data as a commodity to explore how its non-rivalry, partial observability, and networked effects alter value dynamics. I will present the core components of the data ecology model, highlight a few key insights, and pose open questions about how we might design interventions — technical, economic, or institutional — that steer data ecosystems toward more desirable outcomes. While rooted in specific methodologies, the framework is designed to support interdisciplinary thinking across computer science, economics, law, public policy, and beyond, providing a shared foundation for reasoning about the dynamics and governance of data across various data ecosystems.
08:50	Davin Choo, Billy Jin and Yongho Shin Online Bipartite Matching with Advice ABSTRACT. We study online bipartite matching when the algorithm is given advice in the form of a suggested matching in each iteration. We develop algorithms for both the vertex-weighted and unweighted variants that provably dominate the naive "coin flip'' strategy of randomly choosing between the advice-following and advice-free algorithms. Our algorithm for the vertex-weighted setting extends to the AdWords problem under the small bids assumption, yielding a significant improvement over the seminal work of Mahdian, Nazerzadeh, and Saberi. Complementing our positive results, we establish a hardness bound on the performance that is attainable by any algorithm.
09:10	Siddhartha Banerjee, Chamsi Hssaine and Sean R. Sinclair Online Fair Allocation of Perishable Resources ABSTRACT. We consider a practically motivated variant of the canonical online fair allocation problem: a decision-maker has a budget of perishable resources to allocate over a fixed number of rounds. Each round sees a random number of arrivals, and the decision-maker must commit to an allocation for these individuals before moving on to the next round. The goal is to construct a sequence of allocations that is envy-free and efficient. Our work makes two important contributions toward this problem: we first derive strong lower bounds on the optimal envy-efficiency trade-off that demonstrate that a decision-maker is fundamentally limited in what she can hope to achieve relative to the no-perishing setting; we then design an algorithm achieving these lower bounds which takes as input (i) a prediction of the perishing order, and (ii) a desired bound on envy. Given the remaining budget in each period, the algorithm uses forecasts of future demand and perishing to adaptively choose one of two carefully constructed guardrail quantities. We demonstrate our algorithm's strong numerical performance - and state-of-the-art, perishing-agnostic algorithms' inefficacy - on simulations calibrated to a real-world dataset.
09:30	Simina Branzei, Reed Phillips and Nicholas Recker The Tarski Search Problem: Algorithms, Complexity, and Open Questions ABSTRACT. In the Tarski search problem, we are given a finite lattice (such as a d-dimensional grid of side length n) and a monotone function mapping each vertex to another vertex (possibly itself). Tarski's theorem guarantees such a function always has a fixed point. Depending on the model, the function f can either be explicitly known (white-box model) or unknown but queryable (black-box/query model). In both cases, the goal is identifying a fixed point efficiently, either by directly analyzing and leveraging the explicit structure or by minimizing the number of queries needed. This talk surveys known results on both the white-box and query complexity of the Tarski search problem and highlights key open questions.
09:50	Rachel Cummings, Hadi Elzayn, Vasilis Gkatzelis, Emmanouil Pountourakis and Juba Ziani Optimal Data Acquisition with Privacy-Aware Agents ABSTRACT. We study the problem faced by a data analyst or platform that wishes to collect private data from privacy-aware agents. To incentivize participation, in exchange for this data, the platform provides a service to the agents in the form of a statistic computed using all agents' submitted data. The agents decide whether to join the platform (and truthfully reveal their data) or not participate by considering both the privacy costs of joining and the benefit they get from obtaining the statistic. The platform must ensure the statistic is computed differentially privately and chooses a central level of noise to add to the computation, but can also induce personalized privacy levels (or costs) by giving different weights to different agents in the computation as a function of their heterogeneous privacy preferences (which are known to the platform). We assume the platform aims to optimize the accuracy of the statistic, and must pick the privacy level of each agent to trade-off between i) incentivizing more participation and ii) adding less noise to the estimate. We provide a semi-closed form characterization of the optimal choice of agent weights for the platform in two variants of our model. In both of these models, we identify a common nontrivial structure in the platform's optimal solution: an instance-specific number of agents with the least stringent privacy requirements are pooled together and given the same weight, while the weights of the remaining agents decrease as a function of the strength of their privacy requirement. We also provide algorithmic results on how to find the optimal value of the noise parameter used by the platform and of the weights given to the agents.

10:10-10:40Break

10:40-12:20 Session ISLA01: Information Theory, Statistics, and Data Science

Organizers: Ilan Shomorony & Lav Varshney

Chairs:

Lav Varshney and Ilan Shomorony

Location: 1030 NCSA

10:40	Ashwin Pananjady Learning and inference from temporally dependent data ABSTRACT. Given a sequence of length $n$ from an unknown, discretely supported distribution, can we estimate, for each $k$, the mass placed by the distribution on elements occurring $k$ times? Flavors of this question have been tackled by an extensive body of literature spanning statistics, information theory and computer science, and dating back to the 40s. Classical estimators of the “unseen” such as those due to Good and Turing are known to possess desirable properties when the data is i.i.d. However, these estimators are no longer reliable in modern settings spanning genomics and language, where the data can be temporally dependent. We design an alternative family of estimators for learning properties of the stationary distribution in mixing sequences. We show that these practical, linear-time estimators achieve a form of “universal consistency” in that their error can decay in the sequence length $n$ with no dependence on the size of the support. Our guarantees are also adaptive to properties of the stationary distribution. Time permitting, we will also show how similar algorithmic ideas can be applied to perform assumption-lean predictive inference with dependent data.
11:00	Joni Shaska, Urbashi Mitra, Yingying Wang and Christine Johnson Ordinal-Poisson Causal Discovery ABSTRACT. This paper deals with causal discovery in the setting where random variables in the causal graph can be either ordinal or Poisson. In particular, we show that it is possible to recover the causal network (modeled by a directed acyclic graph) from purely observational data. Such results are important in the field of causal discovery since, in general, it is impossible to recover causal relationships from purely observational data, and the identifiability of causal network structures relies on assumptions regarding the data-generating process. Causal discovery in the setting where all variables in the graph are ordinal and the setting where all variables are Poisson has been separately considered. However, the setting in this paper where the two are allowed to co-exist in the graph has, to the best of our knowledge, not been considered. In addition to proving the identifiability of ordinal-Poisson causal discovery, we show, through numerical simulations, that the Bayesian information criterion (BIC) tends to reliably recover the true causal graph as the number of observations increases.
11:20	Jennifer Tang, Aviv Adler and Cordelia Hu Synchronizing Probabilities in Model-Driven Compression ABSTRACT. Classical lossless compression algorithms such as Huffman coding or arithmetic coding rely on using probabilities of symbols to achieve compression rates near entropy. Deep neural networks provide a natural choice for estimating these probabilities using their impressive ability to model relevant domains such as natural language or images. However, to take advantage of this for compression, both the compressor and decompressor require access to the same neural network to compute these probabilities directly from context without communicating the probabilities themselves. This introduces the possibility that slight variations in hardware or software may cause the computed probabilities to differ slightly at the compressor and decompressor ends, which could derail the entire compression algorithm. In this work, we look at methods to solve this issue by synchronizing the probabilities generated at the compressor and decompressor ends, with minimal additional communication.
11:40	Juraj Marusic, Marco Avella Medina and Cynthia Rush A FREQUENTIST THEORY FOR M-POSTERIORS: ASYMPTOTIC AND ROBUSTNESS PROPERTIES ABSTRACT. An M-estimator is a general class of estimators in statistics that are defined as the minimizer of an objective function, typically derived from a loss or score function and an M-posterior incorporates the Bayesian framework through a prior distribution the parameters. In this work, we study asymptotic properties of such posteriors as well as their robustness to outliers. We show that the robustness properties of estimators based on M-posteriors are similar to those of M-estimators.
12:00	Jaimin Shah, Alex Dytso and Martina Cardone Focal Loss: Statistical and Information-Theoretic Perspectives PRESENTER: Martina Cardone ABSTRACT. The focal loss has emerged as a de facto training loss in machine learning domains, especially in computer vision. Despite its ubiquity, its theoretical properties and benefits have not been well explored. In this talk, we will investigate statistical properties of the focal loss. In particular, we will explore the consistency and asymptotic normality of estimators that minimize the focal loss. In addition, we will also study several information-theoretic aspects, including connections to the relative entropy and the emergence of novel information measures induced by the focal loss.

10:40-12:20 Session REGS09: Dynamics

Chair:

Boxuan Zhou

Location: 1040 NCSA

10:40	Aimin Li and Elif Uysal Optimal Sampling and Scheduling for Remote Fusion Estimation of Correlated Wiener Processes PRESENTER: Elif Uysal ABSTRACT. In distributed sensor networks, sensors often observe a dynamic process within overlapping regions. Due to random delays, these correlated observations arrive at the fusion center asynchronously, raising a central question: How can one fuse asynchronous yet correlated information for accurate remote fusion estimation? This paper addresses this challenge by studying the joint design of sampling, scheduling, and estimation policies for monitoring a correlated Wiener process. Though this problem is coupled, we establish a separation principle and identify the joint optimal policy: the optimal fusion estimator is a weighted-sum fusion estimator conditioned on Age of Information (AoI), the optimal scheduler is a Maximum Age First (MAF) scheduler that prioritizes the most stale source, and the optimal sampling can be designed given the optimal estimator and the MAF scheduler. To design the optimal sampling, we show that, under the infinite-horizon average-cost criterion, optimizing AoI is equivalent to optimizing MSE under pull-based communications, despite the presence of strong inter-sensor correlations. This structural equivalence allows us to identify the MSE-optimal sampler as one that is AoI-optimal. This result underscores an insight: information freshness can serve as a design surrogate for optimal estimation in correlated sensing environments.
11:00	Faezeh Shojaeighadikolaei, Shouhuai Xu and Keith Paarporn Optimizing Preventive and Reactive Defense Resource Allocation with Uncertain Sensor Signals ABSTRACT. Cyber attacks continue to be a cause of concern despite advances in cyber defense techniques. Although cyber attacks cannot be fully prevented, standard decision-making frameworks typically focus on how to prevent them from succeeding, without considering the cost of cleaning up the damages incurred by successful attacks. This motivates us to investigate a new resource allocation problem formulated in this paper: The defender must decide how to split its investment between preventive defenses, which aim to harden nodes from attacks, and reactive defenses, which aim to quickly clean up the compromised nodes. This encounters a challenge imposed by the uncertainty associated with the observation, or sensor signal, whether a node is truly compromised or not; this uncertainty is real because attack detectors are not perfect. We investigate how the quality of sensor signals impacts the defender's strategic investment in the two types of defense, and ultimately the level of security that can be achieved. In particular, we show that the optimal investment in preventive resources increases, and thus reactive resource investment decreases, with higher sensor quality. We also show that the defender's performance improvement, relative to a baseline of no sensors employed, is maximal when the attacker can only achieve low attack success probabilities.
11:20	Toye Oloko, Zsolt Talata and Morteza Hashemi Error-Sensitive Dynamic Calibration for Online Optimization with Switching Costs ABSTRACT. Neural networks (NNs) have achieved state-of-the-art performance in a wide range of applications; however, their deployment in online settings often exposes a lack of robustness due to various dynamic and uncertain conditions. In contrast, traditional online algorithms offer formal robustness guarantees, yet generally do not attain the predictive accuracy of modern neural network models. To address this trade-off, we propose a novel neural network architecture that incorporates a dynamic calibration layer, designed to enhance robustness in online environments without sacrificing predictive performance. The proposed dynamic calibration layer consists of a differentiable optimization component that adjusts the NN output in real time, enabling end-to-end training via standard backpropagation. We provide theoretical analysis establishing performance bounds relative to an offline optimal benchmark, and we leverage these bounds to constrain the function class used for dynamic calibration, ensuring both practical feasibility and theoretical soundness. Empirical evaluation is conducted on a real-world case study involving data center energy management. Comparative results against existing online and hybrid approaches demonstrate that our architecture consistently outperforms baseline methods, both in terms of average performance and robustness to environmental variability.
11:40	Eric Xu and Gireeja Ranade Control of Dynamical Systems with Multiplicative Observation Noise with Unknown Distribution Parameters ABSTRACT. In this work, we consider the problem of stabilizing a linear dynamical system with multiplicative observation noise (MON), where the precise distribution generating the MON is unknown. We propose a control algorithm that first uses system identification to estimate the necessary parameters for the optimal policy and then applies the resulting control policy using those parameter estimates. We provide theoretical guarantees for our algorithm which show that (1) our estimation scheme requires $O(\log(1/\delta)^2/\epsilon^2)$ samples to obtain estimates with $\epsilon$ accuracy and high probability $1-\delta$, (2) the resulting controller guarantees second-moment stability conditioned on a good estimation event. This controller has a bounded gap to the performance of the optimal linear memoryless controller that knows the distribution of the noise a-priori.
12:00	Thomas Flynn, Patrick Johnstone and Shinjae Yoo Problem-dependent convergence bounds for randomized linear gradient compression ABSTRACT. —In distributed optimization, the communication of model updates can be a performance bottleneck. Consequently, gradient compression has been proposed as a means of increasing optimization throughput. In general, due to information loss, compression introduces a penalty on the number of iterations needed to reach a solution. In this work, we investigate how the iteration penalty depends on the interaction between compression and problem structure, in the context of non-convex stochastic optimization. We focus on linear schemes, where compression and decompression can be modeled as multiplication with a random matrix. We consider several distributions of matrices, among them Haar-distributed orthogonal matrices and matrices with random Gaussian entries. We find that the impact of compression on convergence can be quantified in terms of a smoothness matrix associated with the objective function, using a norm defined by the compression scheme. The analysis reveals that in certain cases, compression performance is related to low-rank structure or other spectral properties of the problem and our bounds predict that the penalty introduced by compression is significantly reduced compared to worst-case bounds that only consider the compression level, ignoring problem data. We verify the theoretical findings experimentally, including fine-tuning an image classification model.

10:40-12:20 Session REGS10: Massive Inference

Chair:

Vincent Leon

Location: 3100 NCSA

10:40	Himanshu Kumar and Rishabh Agrawal BayesFlow++: A Unified Framework for Amortized Bayesian Inference in Complex Systems ABSTRACT. In the contemporary landscape of data science, the demand for robust and scalable inference methods has intensified, particularly when addressing complex, high-dimensional systems. Traditional Bayesian inference techniques, while theoretically rigorous, often suffer from computational limitations in the presence of intractable likelihoods or large parameter spaces. In order to overcome these obstacles, we present BayesFlow++, a sophisticated system that incorporates adaptive summary statistics learning and dynamic nested sampling into the core BayesFlow architecture. Through the use of invertible neural networks and simulation-based amortized inference, BayesFlow++ allows for quick posterior estimate across datasets without the need for retraining. By concentrating on high-posterior-probability zones, dynamic nested sampling enhances convergence and estimation accuracy while also enabling more efficient use of processing resources. Simultaneously, adaptive summary statistics learning replaces fixed data representations with task-specific, learned encodings, enhancing inference robustness across varying data distributions. Compared to current Bayesian methods, the unified framework provides better scalability, flexibility, and precision. To evaluate its effectiveness, we apply BayesFlow++ to three domains: epidemiological modeling, where it accurately estimates disease transmission dynamics; climate modeling, where it handles high-dimensional uncertainty in environmental systems; and in financial forecasting, where nonlinear dependencies are captured for analysis of probabilistic markets. In all cases, BayesFlow++ demonstrates superior performance in terms of inference accuracy and computational efficiency. This work establishes the foundation for future adaptations to streaming and real-time data settings and represents a major step toward general-purpose, scalable Bayesian inference in complicated real-world systems.
11:00	Oluwafemi Ajeigbe and Sandip Roy On the Sign Structure of the Ridge Estimator for Nonnegative Linear Mappings ABSTRACT. The $\ell^2$-regularized linear estimator (ridge regression estimator) is examined, in the case that the linear mapping from parameters to responses is entry-wise nonnegative. The sign pattern of several matrices associated with the estimator is characterized. The main finding is that, for several of these matrices, either their positive or negative entries define a connected graph. These analyses give some insight into the ridge estimator for the nonnegative case.
11:20	Lei Lyu and Urbashi Mitra Hybrid Atomic Norm Sparse/Diffuse Channel Estimation ABSTRACT. In this paper, the hybrid sparse/diffuse (HSD) channel model in frequency domain is proposed. Based on the structural analysis on the resolvable paths and diffuse scattering statistics in the channel, the Hybrid Atomic-Least-Squares (HALS) algorithm is designed to estimate sparse/diffuse components with a combined atomic and l2 regularization. A theoretical analysis is conducted on the Lagrangian dual problem and the conditions needed to be satisfied by primal and dual solutions are provided. This analysis, in turn, suggests an algorithm for optimal frequency support estimation. Debiased methods for improved channel estimation are provided. Given differing amounts of side information, performance bounds are derived in terms of a genie-aided estimator and constrained Cramer-Rao lower bounds (CRLB). Numerical results via simulations on synthetic data as well as real experimental data validate the efficacy of the proposed method. There are clear tradeoffs with respect to the properties of the channel with respect to performance: sparsity of specular paths and relative energy of diffuse components.
11:40	Oveys Delafrooz Noroozi, Jiyoon Han, Wei Tang, Zhengya Zhang and Upamanyu Madhow Scaling Wideband Massive MIMO Radar via Beamspace Dimension Reduction PRESENTER: Oveys Delafrooz Noroozi ABSTRACT. We present an architecture for scaling digital beamforming for wideband massive MIMO radar. Conventional spatial processing becomes computationally prohibitive as array size grows; for example, the computational complexity of MVDR beamforming scales as $O(N^3)$ for an $N$-element array. In this paper, we show that energy concentration in beamspace provides the basis for drastic complexity reduction, with array scaling governed by the $O(N \log N)$ complexity of the spatial FFT used for beamspace transformation. Specifically, we propose an architecture for windowed beamspace MVDR beamforming, parallelized across targets and subbands, and evaluate its efficacy for beamforming and interference suppression for government-supplied wideband radar data from the DARPA SOAP (Scalable On-Array Processing) program. We demonstrate that our approach achieves detection performance comparable to full-dimensional benchmarks while significantly reducing computational and training overhead, and provide insight into tradeoffs between beamspace window size and FFT resolution in balancing complexity, detection accuracy, and interference suppression.
12:00	Masoud Elyasi and Azadeh Vosoughi Multi-Cell MMSE Processing and Pilot Assignment in Massive MIMO under Correlated Fading PRESENTER: Azadeh Vosoughi ABSTRACT. This paper investigates a multi-cell massive MIMO network subject to spatially correlated Rayleigh fading, where pilot reuse occurs both within and across cells. Base stations (BSs) implement multi-cell minimum mean square error (M-MMSE) processing to mitigate interference. We present a new deterministic approximation of the uplink coherent interference term, which remains accurate in the large-antenna regime despite the presence of spatial correlation and pilot reuse—addressing a critical gap in prior analyses. Based on this analytical result, we design a multi-cell pilot assignment (PA) algorithm that fully removes pilot contamination by utilizing the spatial correlation information of all users. To enable deployment in large-scale networks, we also develop a scalable variant using partial M-MMSE (P-MMSE) processing, which restricts inter-BS coordination by selectively engaging only strong interferers, and provide a complexity evaluation to demonstrate its implementation feasibility. Simulation results confirm that our proposed PA algorithms significantly outperform existing PA schemes across various system configurations, achieving higher spectral efficiency, improved fairness, and reduced pilot overhead—offering practical design insights for next-generation massive MIMO systems.

10:40-12:20 Session REGS11: Network Engineering

Chair:

Ben Kim

Location: 3000 NCSA

10:40	Darin Jeff and Eytan Modiano Optimal Service Mode Assignment in a Simple Computation Offloading System ABSTRACT. We consider a simple computation offloading model where jobs can either be fully processed in the cloud or be partially processed at a local server before being sent to the cloud to complete processing. Our goal is to design a policy for assigning jobs to service modes, i.e., full offloading or partial offloading, based on the state of the system, in order to minimize delay in the system. We show that when the cloud server is idle, the optimal policy is to assign the next job in the system queue to the cloud for processing. However, when the cloud server is busy, we show that, under mild assumptions, the optimal policy is of a threshold type, that sends the next job in the system queue to the local server if the queue exceeds a certain threshold. Finally, we demonstrate this policy structure through simulations.
11:00	Ankita Koley and Chandramani Singh Distributed Caching of Transient Contents in a Tandem of Caches ABSTRACT. We focus on caching of transient contents in a tandem of caches. The contents fetched from the server remain usable for a finite duration. Requests arrive at the first cache. A request is forwarded to the upstream caches until a usable version of the requested content is found and is fetched from the server if such a version is not found in any of the caches. Getting the contents from upstream caches or from the server incurs costs. We formulate an optimal content caching problem to minimize the average cost subject to the cache capacity constraints. The problem falls under the category of continuous time Markov Decision Processes and suffers from curse of dimensionality. We first consider a single cache and formulate the problem as a restless multi-armed bandit (RMAB) and propose a Whittle index-based policy with explicit index expressions. We then extend the methodology to a tandem of caches. Here, a newly fetched content is associated with multiple Whittle indices, one per cache, and the index-based caching policy maximizes the sum of Whittle indices across all caches. We also suggest a distributed implementation that does not need a centralized entity with knowledge of the contents at different caches and their ages. Finally, we demonstrate via simulations that the performance of the proposed policy is very close to optimal.
11:20	Rivka Gitik and Alejandro Cohen Variable Min-Cut Max-Flow Bounds and Algorithms in Finite Regime ABSTRACT. The maximum achievable capacity from source to destination in a network is limited by the min-cut max-flow bound; this serves as a converse limit. In practice, link capacities often fluctuate due to dynamic network conditions. In this work, we introduce a novel analytical framework that leverages tools from computational geometry to analyze throughput in heterogeneous networks with variable link capacities in a finite regime. Within this model, we derive new performance bounds and demonstrate that increasing the number of links can reduce throughput variability by nearly $90\%$. We formally define a notion of network stability and show that an unstable graph can have an exponential number of different min-cut sets, up to $O(2^{\|E\|})$. To address this complexity, we propose an algorithm that enforces stability with time complexity $O(\|E\|^2 + \|V\|)$, and further suggest mitigating the delay-throughput tradeoff using adaptive rateless random linear network coding (AR-RLNC).
11:40	Irtiza Hasan and Ahmed Arafa Version Age of Information with Contact Mobility in Gossip Networks ABSTRACT. A gossip network is considered in which a source node updates its status while other nodes in the network aim at keeping track of it as it varies over time. Information gets disseminated by the source sending status updates to the nodes, and the nodes gossiping with each other. In addition, the nodes in the network are mobile, and can move to other nodes to get information, which we term contact mobility. The goal for the nodes is to remain as fresh as possible, i.e., to have the same information as the source's. To evaluate the freshness of information, we use the Version Age-of-Information (VAoI) metric, defined as the difference between the version of information available at a given node and that at the source. We analyze the effect of contact mobility on information dissemination in the gossip network using a Stochastic Hybrid System (SHS) framework for different topologies and mobility scalings with increasing number of nodes. It is shown that with the presence of contact mobility the freshness of the network improves in both end of the network connectivity spectrum: disconnected and fully connected gossip networks. We mathematically analyze the average version age scalings and validate our theoretical results via simulations. Finally, we incorporate the cost of mobility for the network by formulating and solving an optimization problem that minimizes a weighted sum of version age and mobility cost. Our results show that contact mobility, with optimized mobility cost, improves the average version age in the network.
12:00	Azadeh Sheikholeslami Minimum-Latency Reliable Routing in Interference-Powered Networks ABSTRACT. Energy-harvesting wireless networks face fundamental trade-offs between latency, reliability, and energy efficiency, particularly in contested spectral environments with adversarial interferers. This paper introduces HOPE (Harvesting-Optimized Path selection with End-to-end outage guarantees), a novel framework for joint routing and power allocation in multi-hop networks where nodes rely entirely on ambient RF energy harvesting—including from interferers—for operation. We formulate an optimization problem that minimizes the maximum harvesting latency along a path while guaranteeing a probabilistic end-to-end outage constraint. By deriving closed-form expressions for outage probabilities and leveraging Lagrangian duality, we transform the min-max problem into a tractable form solvable via Dijkstra's algorithm. Numerical results demonstrate that HOPE achieves significantly lower harvesting latency compared to benchmark strategies like shortest-path and minimum-energy routing, while robustly maintaining outage guarantees even under high interference.