IMPMS 2026: THE 5TH ITALIAN MEETING ON PROBABILITY AND MATHEMATICAL STATISTICS
PROGRAM FOR FRIDAY, JUNE 12TH

11:10-12:50 Session 16A: CS155: Asymptotic results for predictive distributions
11:10
Almost conditionally identically distributed random variables

ABSTRACT. Almost conditionally identically distributed (a.c.i.d.) random variables, \cite{bc25}, are generalizations of conditionally identically distributed (c.i.d.), \cite{berti2004limit}, and exchangeable, \cite{ald85}, random variables. This class of random variables naturally arises in statistical applications such as recursive algorithms, contamination models and heteroskedastic observations. The definition of almost conditionally identically distributed random variables depends on a sequence of parameters that quantifies the departure from exchangeability and conditional identity in distribution. Alternatively, these processes can be defined as measure-valued almost supermartingales.

In this talk, I will first introduce this new class of random variables, illustrating some specific examples in statistics. Second, I will present new limit theorems that extend those for exchangeable and c.i.d. random variables to the more general setting of a.c.i.d. random variables. Specifically, I will present asymptotic exchangeability, a strong law of large numbers and three different central limit theorems, involving respectively the predictive, empirical and asymptotic directing distributions of the process. I will also describe necessary and sufficient conditions for the asymptotic directing measure of the sequence to be absolutely continuous with respect to a given sigma-finite measure. These theorems have statistical applications, especially in Bayesian predictive inference.
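As background, the c.i.d. property of \cite{berti2004limit} that the talk generalizes can be stated as follows (standard formulation, not the talk's new a.c.i.d. definition):

```latex
% A sequence (X_n) adapted to a filtration (F_n) is c.i.d. when, for
% every bounded measurable f and all k > n,
E\big[f(X_k) \mid \mathcal{F}_n\big] \;=\; E\big[f(X_{n+1}) \mid \mathcal{F}_n\big] \quad \text{a.s.},
```

that is, given the past, all future observations share the one-step-ahead predictive distribution; the a.c.i.d. condition allows this identity to hold only approximately, up to the parameter sequence mentioned above.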

11:35
Exchangeable measure-valued Pólya sequences

ABSTRACT. Measure-valued Pólya sequences (MVPS) are stochastic processes whose dynamics are governed by generalized Pólya urn schemes with infinitely many colors. Assuming a general reinforcement rule, MVPSs can be viewed as extensions of Blackwell and MacQueen's Pólya sequence, which characterizes an exchangeable sequence with a Dirichlet process (DP) prior distribution. In this talk, we give a complete account of the class of exchangeable MVPSs. We show that under exchangeability, an MVPS is necessarily balanced and its reinforcement kernel is, after normalization, a proper regular conditional distribution. As a result, its prior distribution is that of a DP mixture with respect to a latent parameter, which is associated with the conditioning $\sigma$-algebra. Furthermore, we examine the effects of relaxing exchangeability to conditional identity in distribution (c.i.d.) and find that the two are equivalent for balanced MVPSs. In the unbalanced case, it is still possible to have c.i.d. MVPSs that are not exchangeable, but this necessitates a particular form of the reinforcement kernel.

This is joint work with Yoana R. Chorbadzhiyska and Mladen Savov.
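For reference, the Blackwell and MacQueen Pólya sequence that MVPSs generalize admits a very short simulation. The sketch below is illustrative only (plain Python, with an assumed concentration parameter `alpha` and base measure `base`); it draws from the predictive rule of a Dirichlet process prior.

```python
import random

def polya_sequence(n, alpha=1.0, base=random.random, rng=random):
    """Sample X_1,...,X_n from the Blackwell-MacQueen Polya sequence:
    X_{k+1} given X_1..X_k is a fresh draw from the base measure with
    probability alpha/(alpha+k), and otherwise a uniform pick among
    the past values (reinforcement)."""
    xs = []
    for k in range(n):
        if rng.random() < alpha / (alpha + k):
            xs.append(base())          # new "color" from the base measure
        else:
            xs.append(rng.choice(xs))  # reinforce a previously seen value
    return xs
```

The ties produced by the reinforcement step reflect the almost-sure discreteness of the Dirichlet process.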

12:00
Asymptotic and empirical properties of some classes of predictive distribution
PRESENTER: Fabrizio Leisen

ABSTRACT. This talk will focus on introducing new classes of predictive distributions which make use of sample quantities. Theoretical properties of these models will be discussed and compared with a recent proposal based on copulas. We will show some illustrations of their use in the context of predictive resampling.

12:25
Predictive Bernstein-von Mises Theorems
PRESENTER: Sandra Fortini

ABSTRACT. Predictive inference takes the sequence of one-step-ahead predictive distributions as the primitive object for learning and inference, rather than an explicit model-prior specification. This approach naturally encompasses Bayesian procedures, but also applies to prediction-based learning rules that are only asymptotically exchangeable or arise from computationally motivated approximations.

We study asymptotic inference induced by predictive learning rules that are not necessarily exchangeable but converge almost surely to a random limiting distribution. Our main contribution is a functional Doob--type Bernstein--von Mises theorem for predictive inference. We show that, under suitable regularity conditions, the conditional distribution of the limiting predictive process, centered at the current predictive distribution and suitably rescaled, converges almost surely to a Gaussian law in an appropriate functional space. The associated covariance structure is explicitly characterized in terms of predictive updates, yielding an analytic approximation of the implicit posterior distribution and providing a direct tool for uncertainty quantification and predictive efficiency assessment.

Under i.i.d. observations, we obtain a Bernstein--von Mises theorem for the predictive distribution, showing asymptotic normality of the implicit posterior centered at the predictive mean, with variance determined by the learning dynamics of the predictive rule.

Finally, we discuss extensions of the framework to supervised settings with regressors, where predictive distributions depend on covariates. In this context, functional central limit theorems for predictive distributions with fixed covariate values provide Gaussian approximations for conditional laws, with applications to regression and modern prediction-based learning methods.

11:10-12:50 Session 16B: CS101: Statistical methods for complex spatial data analysis
11:10
Bayesian nonparametric inference for covariate-driven point processes
PRESENTER: Matteo Giordano

ABSTRACT. A central task in the statistical analysis of spatial point patterns is to infer the relationship between the point distribution and a collection of covariates of interest. This talk will present recent theoretical and methodological advances for covariate-based nonparametric Bayesian intensity estimation. We devise a “multi-bandwidth” Gaussian process method, and prove that it achieves optimal and adaptive posterior contraction rates in observation schemes with replicated observations of the point pattern and the covariates. Our results cover the case of “anisotropic” intensity functions, which is common in applications where the covariates have a different physical nature. We further show how posterior inference can be implemented in practice via a suitable Metropolis-within-Gibbs sampling algorithm. Lastly, we will illustrate the performance of the method via numerical simulations, and present an application to a Canadian wildfire dataset. Joint work with Patric Dolmeta.

11:35
Low-Dose Tomography of Random Fields and the Problem of Continuous Heterogeneity
PRESENTER: Alessia Caponera

ABSTRACT. We consider the problem of nonparametric estimation of the conformational variability in a population of related structures, based on low-dose tomography of a random sample of representative individuals. In this context, each individual represents a random perturbation of a common template and is imaged noisily and discretely at only a few projection angles. Such problems arise in the cryo-electron microscopy of structurally heterogeneous biological macromolecules. We model the population as a random field, whose mean captures the typical structure, and whose covariance reflects the heterogeneity. We show that consistent estimation is achievable with as few as two projections per individual, and derive uniform convergence rates reflecting how the various parameters of the problem affect statistical efficiency, and their trade-offs. Our analysis takes the domain of the forward operator to be a reproducing kernel Hilbert space, where we establish representer and Mercer theorems tailored to the question at hand. This allows us to exploit pooling estimation strategies central to functional data analysis, illustrating their versatility in a novel context. We provide an efficient computational implementation using tensorized Krylov methods and demonstrate the performance of our methodology by way of simulation.

12:00
Guidelines for Cubature-based Likelihood approximation of 3D Poisson Point Processes
PRESENTER: Marco Tarantino

ABSTRACT. Poisson point processes are the simplest, yet fundamental, models for the analysis of spatial and spatio-temporal point patterns. They can be used to describe the locations of events or objects of interest and to estimate the intensity of these point patterns within a defined region. Beyond Poisson models, many widely used spatial and spatio-temporal point process models are built on a Poisson-type intensity. For instance, log-Gaussian Cox processes [1] and self-exciting or Hawkes-type processes [2] typically rely on a well-specified first-order intensity, and on a Poisson (or Poisson-like) likelihood to estimate it. Consequently, obtaining consistent, low-bias first-order intensity estimates is critical not only for Poisson models but also for the robustness of more complex models built upon them. Likelihood-based inference for three-dimensional Poisson point process models requires approximating the integral term in the log-likelihood over a 3D observation window. In practice this is done via a quadrature scheme [3], which replaces the integral with a weighted sum over observed points and a set of dummy points. Despite its widespread use, the accuracy of the approximation strongly depends on a few tuning parameters: whether dummy points are placed on a regular grid or randomly; the dummy-to-data multiplier $q$, so that $n_d = q\,n$ dummy points are generated (where $n$ is the number of observed points in the pattern); and the window partition resolution $n_c$, providing $n_c^3$ voxels and cubature weights. Clear guidance on how to select these parameters to obtain reliable inference is missing, especially in three dimensions, while concerns about quadrature accuracy have long been noted [4]. We formalise the cubature scheme and its replicated version for multi-type/categorical marks, and develop data-driven guidelines through an extensive simulation study. We simulate three-dimensional inhomogeneous Poisson processes across different scenarios in which the intensity depends on coordinates, external covariates and categorical marks, with multiple sample sizes, fitting models over a wide grid of $(q, n_c)$ values under both dummy-point layouts. Performance is assessed via the mean squared error of parameter estimates and by second-order diagnostics such as the inhomogeneous $K$-function weighted by the fitted intensity. Across scenarios, we identify well-delimited regions of $(q, n_c)$ that provide stable likelihood approximations and accurate intensity recovery. A validation study confirms the robustness of these recommendations. Finally, an application to the 2008 Greek background seismicity shows that baseline cubature choices can fail diagnostics, whereas guideline-based settings provide coherent parameter estimates and pass global envelope tests.
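As an illustration of the quadrature scheme described above, the following sketch approximates the 3D Poisson log-likelihood with dummy points and counting-based cubature weights (weight = voxel volume divided by the number of data-plus-dummy points in the voxel). It assumes a box-shaped window; the names `q` and `nc` mirror the abstract's tuning parameters, but everything else is an assumption of this sketch, not the talk's implementation.

```python
import numpy as np

def cubature_loglik(data, lam, window, q=4, nc=10, grid=True, rng=None):
    """Approximate  sum_i log lam(x_i) - int_W lam(u) du  over a 3D box
    window = (lo, hi), replacing the integral by a weighted sum over the
    observed points and q*n dummy points (regular grid or uniform)."""
    rng = np.random.default_rng(rng)
    lo, hi = np.asarray(window[0], float), np.asarray(window[1], float)
    data = np.asarray(data, float)
    n = len(data)
    nd = q * n
    if grid:   # regular dummy layout: m^3 >= nd interior grid points
        m = int(np.ceil(nd ** (1 / 3)))
        axes = [np.linspace(l, h, m + 2)[1:-1] for l, h in zip(lo, hi)]
        dummy = np.stack(np.meshgrid(*axes, indexing="ij"), -1).reshape(-1, 3)
    else:      # random dummy layout
        dummy = rng.uniform(lo, hi, size=(nd, 3))
    pts = np.vstack([data, dummy])
    # voxel index of each point on the nc^3 partition of the window
    idx = np.clip(((pts - lo) / (hi - lo) * nc).astype(int), 0, nc - 1)
    flat = idx[:, 0] * nc * nc + idx[:, 1] * nc + idx[:, 2]
    counts = np.bincount(flat, minlength=nc ** 3)
    vox_vol = np.prod(hi - lo) / nc ** 3
    w = vox_vol / counts[flat]              # counting cubature weights
    integral = np.sum(w * lam(pts))         # approximates int_W lam(u) du
    return np.sum(np.log(lam(data))) - integral
```

For a constant intensity and a fully occupied partition, the weights sum exactly to the window volume, so the integral term is recovered without error; the interplay of `q` and `nc` only matters for inhomogeneous intensities.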

12:25
Bayesian nonparametric estimation of spatio-temporal Hawkes processes

ABSTRACT. We develop a Bayesian nonparametric framework for inference in multivariate spatio-temporal Hawkes processes, extending existing theoretical results beyond the purely temporal setting. Our framework encompasses modelling both the background and triggering components of the Hawkes process through Gaussian process priors. Under appropriate smoothness and regularity assumptions on the true parameter and the nonparametric prior family, we derive posterior contraction rates for the intensity function and the parameter, in the asymptotic regime of repeatedly observed sequences. These results provide, to our knowledge, the first theoretical guarantees for Bayesian nonparametric methods in spatio-temporal point data. We also show that we can numerically approximate the posterior via variational inference and demonstrate the benefit of nonparametric modelling in the context of spatio-temporal events.
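For concreteness, the conditional intensity underlying spatio-temporal Hawkes models is typically of the form (standard notation, illustrative rather than taken from the talk):

```latex
\lambda(t, x) \;=\; \mu(x) \;+\; \sum_{i\,:\, t_i < t} \varphi\big(t - t_i,\; x - x_i\big),
```

with background $\mu$ and triggering kernel $\varphi$; in the framework above, both components are modelled through (suitably transformed) Gaussian process priors, componentwise in the multivariate case.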

11:10-12:50 Session 16C: CS119: Statistical Learning through Kernels and Transport
11:10
Flowing Datasets with Wasserstein over Wasserstein Gradient Flows

ABSTRACT. Many applications in machine learning involve data represented as probability distributions. The emergence of such data requires radically novel techniques to design tractable gradient flows on probability distributions over such (infinite-dimensional) objects. For instance, being able to flow labeled datasets is a core task for applications ranging from domain adaptation to transfer learning or dataset distillation. In this setting, we propose to represent each class by the associated conditional distribution of features, and to model the dataset as a mixture distribution supported on these classes (which are themselves probability distributions), meaning that labeled datasets can be seen as probability distributions over probability distributions. We endow this space with a metric structure from optimal transport, namely the Wasserstein over Wasserstein (WoW) distance, derive a differential structure on this space, and define WoW gradient flows. The latter enable the design of dynamics over this space that decrease a given objective functional. We apply our framework to transfer learning and dataset distillation tasks, leveraging our gradient flow construction as well as novel tractable functionals that take the form of Maximum Mean Discrepancies with Sliced-Wasserstein based kernels between probability distributions.

11:35
Hilbert-Based Correlation Indices for Distributional Data

ABSTRACT. We propose a unified mathematical framework for defining indices of dependence for random probability measures by embedding them into a Hilbert space and applying correlation indices to the resulting Hilbert-valued random variables. This approach overcomes the lack of a linear structure in the space of probability measures and relies on two sources of variability: the choice of the correlation index and the choice of the embedding. We consider Canonical Correlation, Centered Alignment, and Trace Correlation, combined with a Wasserstein-based embedding and two kernel-based embeddings, allowing us to reinterpret existing dependence measures and extend Kernelized Canonical Correlation and Centered Kernel Alignment to distribution-valued data. We characterize the extremal behavior of the proposed indices under independence, almost sure equality, and equality up to linear push-forward transformations, providing theoretical guarantees and interpretability. Numerical experiments on synthetic data and an application to hierarchical clustering of cortical regions in functional brain imaging illustrate the practical relevance of the framework.

12:00
Wasserstein Least Squares: statistics, geometry, and algorithms

ABSTRACT. We present Wasserstein Least Squares Regression (WLSR), a model that canonically extends least squares regression in the presence of vector-valued covariates and distribution-valued responses. Unlike competing proposals, which focus on the linear structure in the space of probability measures, ours works directly with the functional form of linear regression. In this talk, we will delve into the geometry of WLSR and draw methodological connections with regression models in Euclidean space.

12:25
Learning ergodic dynamical systems from finite trajectories
PRESENTER: Lorenzo Rosasco

ABSTRACT. We consider the problem of learning ergodic dynamical systems from a finite trajectory. We derive learning guarantees for a basic least squares estimator and contrast them with classical supervised learning results for independent and identically distributed data. We further provide extensions to higher-order systems, systems with finite state spaces, and learning Koopman operators. Our analysis integrates tools from statistical learning theory and Markov processes, together with suitable concentration results for non-i.i.d.\ Hilbert space--valued random variables.

11:10-12:50 Session 16D: CS127: Asymptotics of random graphs
11:10
Probability graphons and large deviations for random weighted graphs

ABSTRACT. Graph limit theory studies the convergence of sequences of graphs as the number of vertices grows, providing an effective framework for representing large networks. In this talk, I will give a brief introduction to graph limits and report on recent extensions to weighted graphs, colored graphs and multiplex networks (probability graphons and P-variables). As an application of this theory I will present a large deviation principle (LDP) for random weighted graphs that generalizes the LDP for Erdős-Rényi random graphs by Chatterjee and Varadhan (2011), based on joint work with Pierfrancesco Dionigi.

11:35
Convergence of subgraph densities in ERGMs
PRESENTER: Elena Magnanini

ABSTRACT. Exponential Random Graphs are a class of network models that can be seen as the generalization of the dense Erdős–Rényi random graph. They are defined, with a statistical mechanics approach, by introducing a Hamiltonian, a function that biases the occurrence of certain features, such as the number of edges or triangles. In this talk we will primarily focus on the so-called edge-triangle model, where the Hamiltonian of the system only collects edge and triangle densities, properly tuned by real parameters. Using tools from statistical mechanics and large deviation theory, we establish limit theorems and concentration inequalities for subgraph densities (mainly focusing on edge and triangle densities) in the replica-symmetric regime, where the limiting free energy of the model is known together with its phase diagram. Part of the results concern a mean-field approximation, which allows for explicit computations and provides insights into the behavior of the original model in certain parameter regions where rigorous results are hardly achievable. A generalization of the model in which vertices are allowed to carry a type will also be discussed. This talk is based on joint work with A. Bianchi, F. Collet, and G. Passuello.
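In a common parametrization from the dense ERGM literature (illustrative notation, not necessarily that of the talk), the edge-triangle model on $n$ vertices reads

```latex
P_{n,\beta}(G) \;\propto\; \exp\!\Big( n^2 \big( \beta_1\, e(G) + \beta_2\, t(G) \big) \Big),
\qquad \psi_n(\beta) \;=\; \frac{1}{n^2}\,\log Z_n(\beta),
```

where $e(G)$ and $t(G)$ are the edge and triangle densities of $G$, $Z_n(\beta)$ is the partition function, and $\psi_n$ is the finite-size free energy whose limit and phase diagram are referred to above.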

12:00
Spectral properties of directed inhomogeneous graphs

ABSTRACT. We consider the spectrum of the adjacency matrix of directed inhomogeneous graphs with independent edges. Our framework includes directed stochastic block models and the directed Chung–Lu model. We assume that the expected adjacency matrix has k non-zero eigenvalues of multiplicity 1, and we scale connection probabilities so that average degrees diverge at least poly-logarithmically in the number of vertices. We establish the existence, with high probability, of k real outliers of the spectrum, whose scale is the square of that of the bulk. We further show that, after centering and proper rescaling, the joint law of the k outliers converges in distribution to a multivariate Gaussian law with an explicit covariance matrix. This complements previous works in the symmetric setting. Based on ongoing work with Rajat Hazra.

12:25
Exponential Random Edge-Colored Graphs via Probability Graphons: Free Energies and Extremal Colorings

ABSTRACT. In joint work with B. Bhattacharya, A. Ganguly and G. Zucal, we study dense edge-colored exponential random graph models (ERGMs) through the language of probability graphons, building on the large deviation principle developed with G. Zucal in a recent paper. For a finite set of k colors, a probability graphon is a symmetric measurable map $W:[0,1]^2\to\Delta_k$ (the probability simplex on the k colors), extending the usual graphon formalism to colored graphs (and, more generally, to random weighted graphs with a prescribed edge-color law).

Within this framework, the asymptotic log-partition function admits a variational characterization that balances the chosen Hamiltonian with an explicit relative-entropy functional, extending the dense-graph ERGM theory to the colored setting. The same formulation yields compactness of maximizers and natural optimality conditions (Euler-Lagrange type) for typical interaction terms built from chromatic subgraph densities. We also discuss qualitative consequences for the typical structure of the model depending on its parameters, including regimes of uniqueness/replica-symmetric behavior and the onset of symmetry breaking as parameters vary.

Finally, a low-temperature (large-parameter) scaling links the probabilistic variational problem to extremal combinatorics: the model concentrates around near-extremizers of the underlying density functional, providing an entropic/probabilistic-method perspective on stability phenomena. We illustrate this connection on rainbow-triangle-type objectives and related extremal colorings.

11:10-12:50 Session 16E: CS157: Recent Advances in Stochastic Volterra Equations
11:10
Kolmogorov equations for stochastic Volterra processes with singular kernels

ABSTRACT. We associate backward and forward Kolmogorov equations to a class of fully nonlinear Stochastic Volterra Equations (SVEs) with convolution kernels $K$ that are singular at the origin. Working on a carefully chosen Hilbert space $\mathcal{H}_1$, we rigorously establish a link between solutions of SVEs and Markovian mild solutions of a Stochastic Partial Differential Equation (SPDE) of transport type. Then, we obtain two novel Itô formulae for functionals of mild solutions and, as a byproduct, show that their laws solve corresponding Fokker--Planck equations. Finally, we introduce a natural notion of ``singular'' directional derivatives along $K$ and prove that (conditional) expectations of SVE solutions can be expressed in terms of the unique solution to a backward Kolmogorov equation on~$\mathcal{H}_1$. Our analysis relies on stochastic calculus in Hilbert spaces, the reproducing kernel property of the state space $\mathcal{H}_1,$ as well as crucial invariance and smoothing properties that are specific to the SPDEs of interest. In the special case of singular power-law kernels, our conditions guarantee well-posedness of the backward equation either for all values of the Hurst parameter $H,$ when the noise is additive, or for all $H>1/4$ when the noise is multiplicative.
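For orientation, the SVEs in question are typically written in standard convolution form (illustrative notation):

```latex
X_t \;=\; x_0 \;+\; \int_0^t K(t-s)\, b(X_s)\, ds \;+\; \int_0^t K(t-s)\, \sigma(X_s)\, dW_s ,
```

with a kernel $K$ singular at the origin; the power-law case mentioned at the end corresponds to the fractional kernel $K(t) = t^{H-1/2}/\Gamma(H+1/2)$.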

11:35
Stochastic Volterra Equations on Convex Domains

ABSTRACT. We will present in this talk sufficient conditions on the kernel and on the coefficients to get the existence of a solution that stays in a convex domain. The underlying tool is an approximation scheme that also stays in this domain. Applications include: a comparison result for scalar SVEs, existence of solutions possibly with a jump component, weak second-order approximation schemes for SVEs with multifactor kernels such as the multi-factor approximation of the rough Heston model.

12:00
Explosions of stochastic Volterra equations
PRESENTER: Sergio Pulido

ABSTRACT. We present a Feller-type test for explosions of one-dimensional continuous stochastic Volterra processes of convolution type. We focus on dynamics driven by nonsingular kernels, which preserve the semimartingale property of the processes while incorporating memory effects through a path-dependent drift. For the Volterra square-root diffusion, also known as the Volterra CIR process, we provide a detailed discussion of the approximation of the singular fractional kernel by a sum of exponentials, a technique commonly used in the mathematical finance literature.
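The sum-of-exponentials approximation of the singular fractional kernel mentioned above is standard in the rough-volatility literature; for $H \in (0,1/2)$ it can be recalled as (notation illustrative):

```latex
K(t) \;=\; \frac{t^{H-1/2}}{\Gamma(H+1/2)}
\;=\; \int_0^\infty e^{-\gamma t}\,\mu(d\gamma)
\;\approx\; \sum_{i=1}^{N} c_i\, e^{-\gamma_i t},
\qquad
\mu(d\gamma) \;=\; \frac{\gamma^{-H-1/2}}{\Gamma(H+1/2)\,\Gamma(1/2-H)}\, d\gamma ,
```

where the weights $c_i > 0$ and mean-reversion rates $\gamma_i > 0$ are obtained by discretizing the Laplace-transform representation, turning the Volterra dynamics into a finite system of Markovian factors.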

12:25
The Volterra signature
PRESENTER: Luca Pelizzari

ABSTRACT. In this talk, we introduce the Volterra signature -- an extension of Chen's path signature that incorporates memory kernels in a principled way. Formally, it is defined as the collection of iterated integrals arising from Picard expansions of linear controlled/stochastic Volterra equations, and thus plays the role of the resolvent associated with such equations. The additional flexibility provided by the kernel yields a powerful, memory-aware feature map for machine-learning applications to path and time-series data. In the first part of the talk, we leverage analytic and algebraic properties to prove learning-theoretic results, including universal approximation theorems for continuous functionals on path spaces and PDE-based kernel tricks for the associated reproducing kernel Hilbert space (RKHS). Moreover, to exploit these learning guarantees, we develop practical algorithms to compute Volterra signatures for time series across a broad class of kernels, relying on the fundamental Volterra--Chen relation. Finally, we present first applications on synthetic and real data, showing promising performance in learning tasks with complex memory dependence. In the second part, if time permits, we discuss ongoing research on stochastic Volterra signatures, including explicit expected signature formulas, stochastic Taylor expansions, and Wong--Zakai type of approximations.

11:10-12:50 Session 16F: CS173: Probability for Graph Algorithms
11:10
Approximate 2-hop neighborhoods on incremental graphs: An efficient lazy approach

ABSTRACT. In this work, we propose, analyze and empirically validate a lazy-update approach to maintain accurate approximations of the $2$-hop neighborhoods of dynamic graphs resulting from sequences of edge insertions.

We first show that under random input sequences, our algorithm exhibits an optimal trade-off between accuracy and insertion cost: it only performs $O(\frac{1}{\varepsilon})$ (amortized) updates per edge insertion, while the estimated size of any vertex's $2$-hop neighborhood is at most a factor $\varepsilon$ away from its true value in most cases, \emph{regardless} of the underlying graph topology and for any $\varepsilon > 0$.

As a further theoretical contribution, we explore adversarial scenarios that can force our approach into a worst-case behavior at any given time $t$ of interest. We show that while worst-case input sequences do exist, a necessary condition for them to occur is that the \textit{girth} of the graph released up to time $t$ be at most $4$.

Finally, we conduct extensive experiments on a collection of real, incremental social networks of different sizes, which typically have low girth. Empirical results are consistent with and typically better than our theoretical analysis anticipates. This further supports the robustness of our theoretical findings: forcing our algorithm into a worst-case behavior not only requires topologies characterized by a low girth, but also carefully crafted input sequences that are unlikely to occur in practice.

Combined with standard sketching techniques, our lazy approach proves an effective and efficient tool to support key neighborhood queries on large, incremental graphs, including neighborhood size, Jaccard similarity between neighborhoods and, in general, functions of the union and/or intersection of $2$-hop neighborhoods.
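The accuracy/cost trade-off at the heart of the lazy approach can be illustrated with a toy counter, refreshed only when the true value drifts by a relative factor $\varepsilon$. This is a generic illustration of laziness, not the authors' 2-hop data structure or its exact amortized bound.

```python
class LazyCounter:
    """Maintain an estimate of a growing counter that is refreshed only
    when the true value has drifted by a relative factor eps, trading a
    bounded relative error for far fewer (expensive) refreshes."""

    def __init__(self, eps):
        self.eps = eps
        self.true = 0        # exact value (cheap to increment)
        self.estimate = 0    # lazily maintained approximation
        self.refreshes = 0   # number of expensive refresh operations

    def insert(self):
        self.true += 1
        if self.true > (1 + self.eps) * self.estimate:
            self.estimate = self.true   # lazy refresh
            self.refreshes += 1
```

After any insertion the invariant `estimate <= true <= (1+eps)*estimate` holds, while the number of refreshes grows only logarithmically in the number of insertions.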

11:35
Maintaining k-MinHash Signatures over Fully-Dynamic Data Streams with Recovery

ABSTRACT. We consider the task of performing Jaccard similarity queries over a large collection of items that are dynamically updated according to a streaming input model. An item here is a subset of a large universe $U$ of elements. A well-studied approach to address this important problem in data mining is to design \textit{fast-similarity data sketches}. In this paper, we focus on \textit{global solutions} for this problem, i.e., a single data structure which is able to answer both \textit{Similarity Estimation} and \textit{All-Candidate Pairs} queries, while also dynamically managing an arbitrary, online sequence of element insertions and deletions received in input.

In this talk, we introduce and provide an in-depth analysis of a dynamic, buffered version of the well-known $k$-min hash sketch. This buffered version better manages critical update operations, thus significantly reducing the number of times the sketch needs to be rebuilt from scratch using expensive recovery queries. We prove that the \textit{buffered} $k$-min hash uses $O(k \log |U|)$ memory words per subset and that its \textit{amortized} update time per insertion/deletion is $O(k \log |U|)$ \textit{with high probability}. Moreover, our data structure can return the $k$-min hash signature of any subset in $O(k)$ time, and this signature is exactly the same signature that would be computed from scratch (and thus the quality of the signature is the same as the one guaranteed by the static $k$-min hash).
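For readers unfamiliar with the underlying sketch, a minimal static bottom-$k$ MinHash (single hash function, $k$ smallest values) and the corresponding Jaccard estimator can be written as follows. This is the classical static sketch, not the buffered dynamic structure of the talk; the hash choice is an assumption of this sketch.

```python
import hashlib

def kminhash(items, k=8):
    """Bottom-k signature of a set: the k smallest 64-bit hash values
    of its elements under a single fixed hash function."""
    hs = sorted(int(hashlib.blake2b(str(x).encode(), digest_size=8)
                    .hexdigest(), 16) for x in set(items))
    return hs[:k]

def jaccard_estimate(sig_a, sig_b, k=8):
    """Estimate |A n B| / |A u B| as the fraction of the k smallest hash
    values of the union that appear in both signatures."""
    union_k = sorted(set(sig_a) | set(sig_b))[:k]
    inter = set(sig_a) & set(sig_b)
    return sum(1 for h in union_k if h in inter) / len(union_k)
```

Deleting an element whose hash belongs to a signature invalidates it, which is exactly the event the buffered version is designed to absorb without a full rebuild.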

12:00
Payment-failure times for random Lightning paths

ABSTRACT. We study a random process inspired by the payment execution mechanism of the Lightning Network \cite{ref_poon2016}, the main layer-two solution on top of Bitcoin \cite{ref_nakamoto2008}, represented as a graph in which users correspond to nodes and payment channels to bidirectional weighted edges with capacities. Each channel has a fixed publicly known capacity, while the balance, which specifies how this capacity is distributed between the two endpoints, is private and known only to the channel owners. Each user can make payments directly to adjacent nodes or indirectly through intermediate nodes, where payments will succeed only if all channels along the payment path can handle the required amount. The process we study is as follows. We are given an undirected graph $G$ in which each edge $e$ has a capacity $C_e$, and each of its endpoints $u$ and $v$ holds a balance $b_e(u)$ and $b_e(v)$ such that $C_e = b_e(u) + b_e(v)$; initially, the capacity is split equally between the endpoints. In each round, a payment of one unit is executed by choosing two nodes $u$ and $v$, and then selecting a shortest path among all possible shortest paths between them, both uniformly at random. Our goal is to investigate how long it takes for the first payment failure to occur, depending on the topology of the graph and the channel capacities. We first prove almost tight upper and lower bounds as a function of the number of nodes and the edge capacities when the underlying graph is complete. Then, we show how such a random process is related to the edge-betweenness centrality measure \cite{ref_girvan2002} and we prove upper and lower bounds for arbitrary graphs as a function of edge-betweenness and capacity. Finally, we validate our theoretical results by running extensive simulations over some classes of graphs, including snapshots of the real Lightning Network.
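The process above can be simulated directly. The sketch below is a simplified illustration (unit payments, uniformly random endpoints, a shortest path drawn uniformly via BFS path counting; all names and the connected-graph assumption are ours, not the authors'):

```python
import random
from collections import deque

def sample_shortest_path(adj, u, v, rng):
    """BFS path counting from u, then a backward walk weighted by the
    number of shortest paths, so each shortest u-v path is equally
    likely. Assumes the graph (dict node -> neighbors) is connected."""
    dist, npaths = {u: 0}, {u: 1}
    q = deque([u])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y], npaths[y] = dist[x] + 1, 0
                q.append(y)
            if dist[y] == dist[x] + 1:
                npaths[y] += npaths[x]
    path = [v]
    while path[-1] != u:
        x = path[-1]
        preds = [y for y in adj[x] if dist.get(y) == dist[x] - 1]
        path.append(rng.choices(preds, weights=[npaths[y] for y in preds])[0])
    return path[::-1]

def first_failure_round(adj, capacity, max_rounds=10 ** 6, rng=random):
    """Route unit payments along random shortest paths until one fails;
    return the number of successful rounds before the first failure.
    `capacity` maps one orientation (a, b) of each edge to C_e, which is
    split equally between the endpoints at the start."""
    bal = {}
    for (a, b), c in capacity.items():
        bal[(a, b)] = bal[(b, a)] = c / 2
    nodes = list(adj)
    for rounds in range(max_rounds):
        u, v = rng.sample(nodes, 2)
        hops = list(zip(*(lambda p: (p, p[1:]))(sample_shortest_path(adj, u, v, rng))))
        if any(bal[(a, b)] < 1 for a, b in hops):
            return rounds                # payment fails here
        for a, b in hops:                # move one unit along the path
            bal[(a, b)] -= 1
            bal[(b, a)] += 1
    return max_rounds
```

On the complete graph every shortest path is the direct edge, so the first failure time depends only on how quickly some channel direction is drained, matching the setting of the first result above.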

12:25
The Minority Dynamics

ABSTRACT. Consider $n$ agents labeled $\{1, \dots, n\}$, each holding an arbitrary initial binary opinion $x_i \in \{0,1\}$. We study the \emph{minority dynamics}, in which, at each round, each agent $i$ samples $k$ opinions uniformly at random from $\{x_1, \dots, x_n\}$, and then replaces $x_i$ with the \emph{least common} value among the sampled opinions. The minority dynamics is of interest in computer science and distributed algorithms due to its connection with the \emph{bit-dissemination problem}, which models information spread in biological systems.

This process was previously analyzed in \cite{sodapaper}, where it was shown that if $k = \Omega(\sqrt{n \log n})$ and $k \le n/2$, the system converges to a unanimous state (all 0's or all 1's) within $O(\log^2 n)$ rounds with high probability.

In this work, we analyze the minority dynamics for \emph{polylogarithmic sample sizes}, i.e., $k = \Omega(\mathrm{polylog}(n))$, and show that consensus is still reached rapidly, in $O(\mathrm{polylog}(n))$ rounds with high probability. The chaotic and non-monotone nature of the minority dynamics makes its analysis depart significantly from that of previously studied consensus dynamics in similar settings, as it precludes the identification of a natural potential function to measure progress toward consensus.
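The dynamics admits a direct synchronous simulation. In the sketch below, each agent's $k$ opinions are sampled with replacement (a simplifying modeling choice of this sketch), a unanimous sample keeps its only value, and an exact tie is broken toward 1 (another illustrative choice; the papers above treat tie-breaking carefully):

```python
import random

def minority_step(x, k, rng=random):
    """One synchronous round of the minority dynamics: every agent draws
    k opinions with replacement from the population and adopts the least
    common value in its sample (unanimous samples keep their value)."""
    n, ones = len(x), sum(x)
    new = []
    for _ in range(n):
        s = sum(1 for _ in range(k) if rng.random() < ones / n)  # ones sampled
        if s == 0 or s == k:
            new.append(1 if s == k else 0)   # unanimous sample
        elif s <= k - s:
            new.append(1)                    # ones are the (weak) minority
        else:
            new.append(0)
    return new

def rounds_to_consensus(x, k, max_rounds=10_000, rng=random):
    """Iterate until all opinions agree or the round budget is spent."""
    t = 0
    while 0 < sum(x) < len(x) and t < max_rounds:
        x = minority_step(x, k, rng)
        t += 1
    return t, x
```

The chaotic oscillations mentioned above are visible in simulation: the fraction of 1's typically swings between extremes before a fluctuation pushes the system into the absorbing unanimous state.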

11:10-12:50 Session 16G: CS187: Memory effects in Markov and non-Markov stochastic processes
11:10
Macro-Micro Inference for Galves-Löcherbach Processes: Robust Synaptic Classification via Spike-Triggered Extrapolation

ABSTRACT. In this talk, we address the challenge of synaptic classification from pairwise observational data in networks of neurons. We introduce a novel framework based on ``Spike-Triggered Extrapolation'' that exploits the local reset property of Galves-Löcherbach dynamics. By analyzing the system’s behavior at a finite macroscopic resolution, we develop a Pyramid Extrapolation algorithm capable of performing inference on the microscopic limit ($\Delta \to 0^+$) without requiring data sampling at extremely small time scales. This approach effectively decouples local synaptic signals from global network noise, ensuring robust classification even in high-noise regimes. This work is dedicated to the memory of Antonio Galves.
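For context, one common discrete-time formulation of the Galves-Löcherbach model (notation illustrative, not the talk's) makes the local reset property explicit: neuron $i$ spikes at time $t+1$ with probability

```latex
P\big(X_{t+1}(i) = 1 \mid \mathcal{F}_t\big)
\;=\; \varphi_i\Big( \sum_{j} w_{j \to i} \sum_{s = L_t^i + 1}^{t} g_j(t - s)\, X_s(j) \Big),
```

where $L_t^i$ is the last spike time of neuron $i$; the inner sum restarts from $L_t^i$, and it is this restart that spike-triggered extrapolation exploits.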

11:35
Laplace's first law of errors applied to Markov and non-Markov diffusive motion
PRESENTER: Eli Barkai

ABSTRACT. Laplace's first law of errors suggests that measurement errors often have double-exponential tails, decaying as $\exp(-c|x|)$. By contrast, standard Brownian motion yields a Gaussian positional density $P(x,t)$, as expected from the central limit theorem. Yet Laplace-type spatial tails in $P(x,t)$ for spreading clouds of random walkers have been reported in diverse settings, including tracers in glasses, live cells, and bacterial suspensions. Using large-deviation theory, we show that such exponential behavior arises broadly in transport through random media, both for models with quenched disorder and for continuous-time random walks. In these systems, the relevant large-deviation rate function is generically linear over a wide regime, producing Laplace-like tails. When the jump-length distribution is sub-exponential, this mechanism breaks down: rare displacements are dominated by a single atypically large jump, leading to a crossover to the big-jump regime.

12:00
Functional marked self-exciting point process models

ABSTRACT. Point processes are stochastic models for discrete events occurring in continuous space, time, or space–time domains. When each event carries an additional attribute, such as earthquake magnitude or burned area, the process is called a marked point process. Self-exciting point processes represent a natural framework for modeling memory effects in non-Markov stochastic processes, since the occurrence of an event increases the probability of future events through the dependence encoded in the conditional intensity function [4]. In such models, the intensity is typically decomposed into a background component and a triggering component, the latter capturing the temporal persistence and excitation mechanism. Traditional approaches mainly focus on the spatio-temporal configuration of events, without incorporating additional information. Only recently have models including explanatory variables been developed for the analysis of epidemic phenomena [8], seismicity [2], and crimes [10,5]. At the same time, interest in point processes with functional marks has increased. [7] formalized functional marked point processes (FMPPs), where marks are random elements in a (Polish) function space, such as temporal signals or spatial trajectories. In seismology, for example, each earthquake may be associated with a ground-motion waveform or its spectral representation, both naturally treated as functional covariates. Motivated by the goal of jointly analysing earthquake locations and their waveform characteristics, [6] introduced local inhomogeneous mark-weighted summary statistics to detect spatial dependence in functional marks and proposed a test to identify regions where the random labelling assumption fails. However, this approach implicitly calls for fully specified FMPP models, which remain underdeveloped.
To address this gap, we extend the Epidemic Type Aftershock Sequence (ETAS) model [9] by incorporating functional marks derived from waveform data into the triggering component, thereby enriching the representation of memory effects driving seismic activity. Following [1], waveforms are summarized through Functional Principal Component Analysis (FPCA) scores, which are included as covariates in the triggering term. Estimation is performed via the Forward Likelihood-based Predictive (FLP) approach [3]. The goal is to assess the presence of local spatial dependence in the functional marks and to evaluate how the inclusion of waveform-based covariates in the triggering part improves the fit to earthquake sequences.

12:25
A fractional Hawkes process

ABSTRACT. I summarise results of three papers [1,2,3] on a fractional Hawkes process with kernel proportional to the probability density function of Mittag-Leffler random variables. This is joint work with Jane A. Aduda, Maggie Chen, the late Alan G. Hawkes, Cassien Habyarimana, and Federico Polito. The code used to generate simulations and figures is freely available from https://github.com/habyarimanacassien/Fractional-Hawkes.

1. Chen, J., Hawkes, A.G., Scalas, E.: A fractional Hawkes process. In: Beghin, L., Mainardi, F., Garrappa, R. (eds.) Nonlocal and Fractional Operators. SEMA SIMAI Springer Series, vol. 26. Springer (2021).
2. Habyarimana, C., Aduda, J.A., Scalas, E., Chen, J., Hawkes, A.G., Polito, F.: A fractional Hawkes process II: Further characterization of the process. Physica A 615, 128596 (2023).
3. Habyarimana, C., Aduda, J.A., Scalas, E.: Parameter estimation for the fractional Hawkes process. Journal of Agricultural, Biological and Environmental Statistics (2024). https://doi.org/10.1007/s13253-024-00663-5
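For readers who want to experiment, a minimal Ogata-thinning simulation is sketched below. As a hedged stand-in, an exponential excitation kernel replaces the Mittag-Leffler density kernel studied in [1-3] (the two coincide when the fractional order equals 1), so this illustrates the simulation technique rather than the papers' exact process:

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Ogata thinning for a Hawkes process with conditional intensity
        lambda(t) = mu + alpha * sum_{t_i < t} beta * exp(-beta * (t - t_i)).
    The exponential kernel stands in for the Mittag-Leffler density kernel;
    stationarity requires the branching ratio alpha < 1."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while t < t_max:
        # current intensity upper-bounds lambda on (t, next event): it only decays
        lam_bar = mu + alpha * beta * sum(math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)          # candidate inter-arrival time
        if t >= t_max:
            break
        lam_t = mu + alpha * beta * sum(math.exp(-beta * (t - ti)) for ti in events)
        if rng.random() * lam_bar <= lam_t:    # accept with prob lambda(t)/lam_bar
            events.append(t)
    return events
```

With $\alpha < 1$ the expected number of events on $[0, T]$ is $\mu T / (1 - \alpha)$, which gives a quick sanity check on simulated runs.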

11:10-12:50 Session 16H: CS190: Methodological Issues in Multidimensional and Composite Data Analysis
11:10
Nonparametric Inference for Multivariate and Complex Data Structures

ABSTRACT. The increasing availability of high-dimensional, heterogeneous, and non-Gaussian data has reinforced the importance of nonparametric inference in modern statistical analysis. In many applied contexts, classical parametric assumptions such as normality, linearity, and homoscedasticity are often violated, potentially leading to biased or misleading conclusions [1]. As a result, rank-based and distribution-free methods have gained renewed attention in the analysis of complex data structures. This contribution focuses on nonparametric methods for multivariate analysis, with particular emphasis on Spearman's rank correlation coefficient, a robust measure of dependence originally introduced by Spearman [2]. In terms of rank differences, the classical expression is $\rho_S = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$ (1), where $d_i$ is the difference between the ranks of the two variables for the $i$-th pair of data and $n$ is the number of pairs of observations. $\rho_S$ captures monotonic relationships by operating on ranked data, making it especially suitable for ordinal variables, skewed distributions, nonlinear associations, and datasets affected by outliers [3]. These features are increasingly common in real-world applications, where strict parametric assumptions are rarely satisfied. Within multivariate frameworks, $\rho_S$ plays a dual role. First, it provides an interpretable measure of pairwise association that remains stable under deviations from normality. Second, it is an effective diagnostic tool for detecting multicollinearity and near-redundancy among variables, a critical issue in multivariate modeling and variable selection procedures [4]. High rank correlations can be used as thresholds to identify redundant information, improving model parsimony and robustness. The methodological relevance of $\rho_S$ is illustrated through its application to the analysis of the Italian pension system, a socio-economic system characterized by strong interdependencies between demographic and economic variables.
Using official institutional data and a nonparametric correlation-based approach, associations among pension costs, revenue inflows, GDP, employment rates, and retirement indicators are explored without imposing restrictive distributional assumptions. The results reveal extremely strong rank correlations among key economic aggregates, confirming structural dependencies previously highlighted in the literature on pension system sustainability [5,6]. Moreover, the analysis uncovers significant regional heterogeneity across macro-areas, emphasizing the complexity of territorial dynamics. Beyond descriptive analysis, ρS serves as a foundational step for subsequent inferential and forecasting procedures. In particular, it supports informed variable selection prior to the application of time-series models on non-stationary data, enhancing both interpretability and statistical stability.

CS190: Methodological Issues in Multidimensional and Composite Data Analysis, organized by Massimiliano Giacalone and Gianfranco Piscopo.

References
1. Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E.: Multivariate Data Analysis: A Global Perspective. Pearson (2010).
2. Spearman, C.: The proof and measurement of association between two things. The American Journal of Psychology 15(1), 72–101 (1904).
3. Bocianowski, J., Wrońska-Pilarek, D., Krysztofiak-Kaniewska, A.: Comparison of Pearson's and Spearman's correlation coefficients values for selected traits. Biometrical Letters (2023).
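The classical rank-difference formula is straightforward to compute directly; the sketch below is a minimal illustration valid in the no-ties case (the function name is illustrative):

```python
def spearman_rho(x, y):
    """Spearman's rho via rho_S = 1 - 6*sum(d_i^2) / (n*(n^2 - 1)),
    where d_i is the rank difference of the i-th pair.
    Assumes no ties within x or within y."""
    n = len(x)
    def ranks(v):
        # rank 1 = smallest value; unique values only (no-ties assumption)
        order = {val: i + 1 for i, val in enumerate(sorted(v))}
        return [order[val] for val in v]
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

Because only ranks enter the formula, any strictly monotone transformation of either variable leaves $\rho_S$ unchanged, which is precisely why it suits skewed distributions and ordinal data.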

11:35
Permutation Tests and NPC Methodology. Theory and Application in Education, in Clinical Research and in Social Security

ABSTRACT. Real-world data frequently violate the assumptions underlying classical parametric inference, including normality, homoscedasticity, and balanced experimental designs. These challenges are particularly relevant in applied contexts characterized by heterogeneous samples, mixed outcome types, and small sample sizes. This work investigates permutation tests and the Nonparametric Combination (NPC) methodology as a unified framework for robust statistical inference under minimal assumptions. Permutation tests reconstruct the null distribution of test statistics through resampling based on exchangeability, yielding exact or Monte Carlo–exact inference without distributional constraints. The NPC methodology extends this approach to multivariate settings by combining partial permutation tests using suitable combining functions, such as Fisher's statistic, enabling global inference while preserving dependence structures through synchronized permutations and controlling the family-wise error rate. From an applied perspective, this framework naturally accommodates mixed measurement scales, correlated endpoints, and unbalanced designs. The empirical contribution is illustrated through three interdisciplinary applications. In education, the methodology evaluates teaching effectiveness using a randomized classroom experiment with heterogeneous outcomes, including continuous improvement measures and ordinal satisfaction indicators, providing strong global evidence of learning gains. In clinical research, the NPC framework is applied to a dataset of patients affected by necrotizing fasciitis, combining continuous biomarkers and binary survival outcomes within a small heterogeneous sample. The analysis identifies key predictors and demonstrates the stability of permutation-based inference in noisy medical data.
Finally, in the social security domain, permutation tests and the NPC methodology are employed to assess regional pension disparities in Italy using administrative microdata, revealing statistically significant territorial inequality and highlighting the ability of the framework to integrate correlated socio-economic indicators into a single coherent inferential measure. Overall, permutation-based NPC methods emerge as robust, interpretable, and distribution-free tools for interdisciplinary statistical analysis, offering a practically relevant alternative to classical parametric approaches.

MSC 2020: 62G09; 62H15; 62P10; 62P20; 62P25

References
1. Pesarin, F.: Multivariate Permutation Tests with Applications in Biostatistics. Wiley (2001).
2. Pesarin, F., Salmaso, L.: Permutation Tests for Complex Data: Theory, Applications and Software. Wiley (2010).
3. Giacalone, M., Piscopo, G., Bandaru, S.T.: Permutation-based analysis of clinical variables using NPC methodology. Mathematics (2025).
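The mechanics of synchronized permutations and Fisher combination can be sketched compactly. This is a hedged illustration under assumptions the abstract does not fix (two groups, absolute mean-difference partial statistics, Fisher's combining function; all names are illustrative):

```python
import math
import random

def npc_fisher(group_a, group_b, n_perm=500, seed=0):
    """Two-sample synchronized permutation tests on m endpoints,
    combined with Fisher's function T = -2 * sum_j log(p_j).
    group_a, group_b: lists of length-m observation vectors.
    Returns (partial_p, global_p)."""
    rng = random.Random(seed)
    m = len(group_a[0])
    pooled = group_a + group_b
    na = len(group_a)

    def partial_stats(sample):
        # absolute difference of endpoint means between the two groups
        a, b = sample[:na], sample[na:]
        return [abs(sum(r[j] for r in a) / len(a) - sum(r[j] for r in b) / len(b))
                for j in range(m)]

    t_obs = partial_stats(pooled)
    perm_stats = []
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)      # ONE shuffle shared by all endpoints:
        perm_stats.append(partial_stats(shuffled))  # dependence is preserved

    def p_of(row):
        # permutation p-value of each entry of `row` against the null set
        return [(1 + sum(ps[j] >= row[j] for ps in perm_stats)) / (n_perm + 1)
                for j in range(m)]

    fisher = lambda p: -2 * sum(math.log(v) for v in p)
    partial_p = p_of(t_obs)
    t_glob = fisher(partial_p)
    global_p = (1 + sum(fisher(p_of(ps)) >= t_glob
                        for ps in perm_stats)) / (n_perm + 1)
    return partial_p, global_p
```

Because the same shuffle is applied to every endpoint, the joint null distribution of the partial statistics retains the correlation structure of the data, which is the key property that makes the combined test valid for correlated endpoints.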

12:00
An EWMA Control Chart to Monitor a Multivariate Binomial Process

ABSTRACT.

Paolo Carmelo Cozzucoli

Department of Economics, Statistics and Finance ’Giovanni Anania’, University of Calabria, Italy,

paolo.cozzucoli@unical.it

This contribution extends the control chart defined by Cozzucoli and Ingrassia (2008) for monitoring multivariate binomial processes. In many industrial and service applications, items are inspected for multiple non-mutually exclusive defects, making the multivariate binomial distribution a more appropriate model than the multinomial. The paper (Cozzucoli and Ingrassia, 2008) proposed a two-sided Shewhart-type control chart based on a weighted index of overall defectiveness, ξ, which incorporates severity weights for different defect types. However, Shewhart charts are known to be relatively insensitive to small and moderate process shifts. To address this limitation, we propose an enhancement by integrating the ξ index into an Exponentially Weighted Moving Average (EWMA) framework, creating an EWMA-ξ chart. The EWMA-ξ chart is expected to offer superior sensitivity to small and persistent changes in the overall defectiveness level while maintaining a desired in-control Average Run Length (ARL). The core contribution of this paper is a comprehensive Monte Carlo simulation framework designed to evaluate and compare the performance of both the Shewhart-ξ and EWMA-ξ charts. The simulation plan systematically varies key practical factors, including the number of defect categories (k = 2, 3, 5), sample sizes (n = 50, 100, 200, 500), in-control probability vectors (low, medium, and skewed defectiveness), weighting schemes (linear, geometric, and cutoff), and correlation structures (independence, weak positive, strong positive). Performance is assessed using ARL metrics, with the EWMA-ξ chart calibrated to match the nominal ARL0 of the Shewhart-ξ chart (e.g., 370).
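The EWMA recursion itself is standard and easy to state. The sketch below monitors a generic standardized defectiveness index; the actual ξ index, its severity weights, and the multivariate-binomial variance are as defined in the cited paper and are not reproduced here, so target and sigma are placeholder inputs:

```python
import math

def ewma_chart(xi, lam=0.2, target=0.0, sigma=1.0, L=3.0):
    """EWMA statistic z_t = lam*xi_t + (1-lam)*z_{t-1}, z_0 = target,
    with exact time-varying control limits
        target +/- L*sigma*sqrt(lam/(2-lam) * (1 - (1-lam)^(2t))).
    Returns a list of (z, lcl, ucl, signal) tuples, one per observation."""
    z, out = target, []
    for t, x in enumerate(xi, start=1):
        z = lam * x + (1 - lam) * z
        half = L * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        out.append((z, target - half, target + half, abs(z - target) > half))
    return out
```

Small persistent shifts accumulate in z_t, which is what gives the EWMA its sensitivity advantage over the memoryless Shewhart rule; in practice λ is chosen around 0.05–0.25 and L near 3, tuned to hit a nominal in-control ARL such as 370.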

Keywords: multivariate binomial process · EWMA control chart · ARL · Monte Carlo simulations.

Contributed Session

CS190: Methodological Issues in Multidimensional and Composite Data Analysis, organized by Massimiliano Giacalone and Gianfranco Piscopo

References

  1. Cassady, C., Nachlas, J.A.: Evaluating and implementing 3-level control charts. Quality Engineering 18(3), 285–292 (2006).
  2. Cozzucoli, P., Ingrassia, S.: A control chart to monitor a multivariate binomial process. In: Proceedings of the XLIV Riunione Scientifica della Società Italiana di Statistica, Cleup, Padova (2008). ISBN 978-88-6129-228-4.
  3. Hotelling, H.: Multivariate quality control. In: Techniques of Statistical Analysis. McGraw-Hill (1947).
  4. Johnson, N.L., Kotz, S., Balakrishnan, N.: Discrete Multivariate Distributions. Wiley, New York (1997).
  5. Lowry, C.A., Woodall, W.H., Champ, C.W., Rigdon, S.E.: A multivariate exponentially weighted moving average control chart. Technometrics 34(1), 46–53 (1992).
  6. Marcucci, M.: Monitoring multinomial processes. Journal of Quality Technology 17(2), 86–91 (1985).
  7. Montgomery, D.C.: Introduction to Statistical Quality Control, 8th edition. Wiley, New York (2024).
  8. Roberts, S.W.: Control chart tests based on geometric moving averages. Technometrics 1(3), 239–250 (1959).
  9. Westfall, P.H., Young, S.S.: p-Value adjustments for multiple tests in multivariate binomial models. Journal of the American Statistical Association 84(407), 780–786 (1989).
  10. Westfall, P.H., Young, S.S.: Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley, New York (1993).
12:25
New Distributed Beacon-Based Approaches for Multi-Parameter Monitoring: Sentinel-GRID
PRESENTER: Angelo Romano

ABSTRACT. The increasing complexity of low-voltage distribution assets calls for scalable, non-intrusive monitoring solutions that are sustainable in terms of energy consumption and maintenance. This contribution presents Sentinel-GRID, a distributed, beacon-based approach for multi-parameter monitoring of secondary substations and distribution transformers, built around ultra-low-power IoT sensor nodes operating in an event-driven "fit-and-forget" paradigm. The nodes integrate heterogeneous sensing (electromagnetic signatures, temperature, vibration and acoustic indicators) and remain mostly in deep sleep, waking only when relevant conditions occur, thus drastically reducing energy draw and enabling multi-year autonomy, potentially supported by micro energy-harvesting. Data collection relies on a hybrid communication architecture: BLE beacons [2] for opportunistic acquisition through mobile gateways (e.g., smartphones carried by maintenance crews interacting in harsh environments via a PWA and Data over Audio links [1]) and long-range connectivity (e.g., LoRaWAN) for periodic reporting and resilient backhaul. In critical situations, the system enables priority alerting over high-availability channels. Alongside physical sensors, the solution also includes low-power computer-vision nodes for visual inspection: by monitoring retroreflective strips applied near tightening points, it is possible to estimate micro-displacements and movements of transformer bolts [3], providing a direct indicator of mechanical loosening and supporting multimodal diagnostics. On the backend, a cloud platform integrates a Digital Twin and ML/AI analytics tailored to sparse and asynchronous data streams, offering anomaly detection, explainable diagnostics, and remaining useful life (RUL) estimation. A representative use case is the early identification of loosened bolted connections, a typical precursor to overheating and reliability degradation.
The proposed approach aims to reduce OPEX, increase safety, and accelerate predictive maintenance adoption in distribution networks.