CM2025: 22ND COPPER MOUNTAIN CONFERENCE ON MULTIGRID METHODS
PROGRAM FOR TUESDAY, APRIL 15TH
Days:
previous day
next day
all days

View: session overviewtalk overview

07:00-08:15Breakfast Buffet
08:00-10:05 Session 7A: Coupled physics problems and iterative methods
Location: Bighorn B
08:00
Parameter-robust Preconditioners for the Stokes-Darcy Coupled Problem without Fractional Operators
PRESENTER: Xiaozhe Hu

ABSTRACT. We consider the Stokes-Darcy coupled problem, which models the interaction between free-flow and porous medium flow. By enforcing the normal flux continuity interface condition directly within the finite-element spaces, we establish unified well-posedness results for the coupled system under various boundary condition scenarios. Using the operator preconditioning framework, we develop a parameter-robust preconditioner that avoids the use of fractional operators. Numerical experiments employing $H(\operatorname{div})$-conforming is presented to confirm the theoretical findings and demonstrate the robustness of the proposed block preconditioners with respect to the physical parameters and mesh size.

08:25
Decoupled solution methods for multiphysics problems based on dynamic interface flux surrogates
PRESENTER: Pavel Bochev

ABSTRACT. Digital twin (DT) modeling, simulation, and data assimilation tasks often require numerical solution of complex multiphysics system models comprising many heterogeneous components. Decoupled, or partitioned solution methods for such models can significantly improve the efficiency of DT workflows. Typically, decoupled methods employ data transfers between the constituent physics components to synchronize them and enable their independent solution. By treating each subproblem as a separate entity, these methods enable code reuse, increase concurrency and provide a convenient framework for plug-and-play multiphysics simulations. However, accuracy and stability of partitioned methods depends critically on the type of information exchanged between the subproblems. The exchange mechanisms can vary from minimally intrusive remap across interfaces to more accurate but also more intrusive and expensive estimates of the necessary information, such as dual Schur complements or variational flux recovery techniques. Data-driven system identification techniques provide an opportunity to close the accuracy, performance and intrusiveness gaps between these exchange mechanisms by enabling the construction of accurate, computationally efficient and minimally intrusive data transfer surrogates. This approach shifts the principal computational burden to an offline phase, leaving the application of the surrogate as the sole additional cost during the online simulation phase. To demonstrate the approach, we consider the problem of diffusive transport of a scalar quantity across an interface separating two different materials. In the offline phase we use training data generated by simulating this problem with appropriately chosen initial conditions to learn small dynamical systems approximating the evolution of the interface flux. In the online, simulation phase, we use these dynamic flux surrogates to estimate the flux exchanged between the subdomains for arbitrary initial conditions. We discuss several different strategies for learning the flux surrogates, such as Dynamic Mode Decomposition (DMD) and Operator Inference. Numerical results confirm that the computational cost of the resulting decoupled scheme is comparable to that of a traditional loosely coupled partitioned method, while its accuracy is significantly higher.

08:50
Subspace Descent Method for Non-linear Problems
PRESENTER: Zixiao Yang

ABSTRACT. The subspace descent (SD) method and its randomized variant have recently been proposed for solving nonlinear optimization problems, providing a theoretical framework for analyzing algorithms such as coordinate descent and the full approximation storage (FAS) scheme. In this talk, we compare FAS with nested iteration, another widely used nonlinear multigrid method, within the SD framework. We compare the approaches' performance on the p-Laplacian problem coming from graph-based semi-supervised learning. Additionally, we explore the application of SD to non-PDE problems, such as logistic regression, assessing its broader effectiveness in optimization.

09:15
Nonlinear methods for shape optimization problems in liquid crystal tactoids
PRESENTER: James Adler

ABSTRACT. An emerging theme across many domains of science and engineering is modeling materials that can change shape. In this talk, we focus on modeling the evolution of liquid crystals with free boundaries, known as tactoids, that are in contact with an isotropic fluid. We present promising results from applying a class of classical nonlinear numerical methods to this model and compare them with previously used gradient--descent methods. Moreover, by wrapping the algorithms in a multilevel nested iteration approach, we see significant improvements with a variety of initial guesses. We present shape optimization models with free boundaries to illustrate a two-- and three-- dimensional landscape of both nematic and cholesteric liquid crystal tactoids coupled with some theoretical discussions.

09:40
Comparisons between Anderson Acceleration and Momentum Acceleration Algorithms
PRESENTER: Satchel Lefebvre

ABSTRACT. Anderson Acceleration (AA) has a wide array of uses across disciplines, but less is known about its theoretical convergence rates. In this work, we consider windowed AA, denoted by AA(m), which uses the previous m−1 terms to update the iteration. Empirically, it is easy to find linear problems where AA(m) improves the convergence rate of standard iterative methods such as Richardson, Jacobi, and Gauss-Seidel. However, concise theoretical results are still elusive, and it is still unclear whether AA is the best choice for all applications. To address this, we compare momentum acceleration (MA) algorithms with AA(m), highlighting specific problems where MA outperforms AA(m) under certain conditions. A deeper understanding of AA(m) in linear settings could provide valuable insights into its behavior for nonlinear problems.

08:00-10:05 Session 7B: Applications (Part 1 of 2)
Location: Bighorn C
08:00
Quantum Bayesian Optimisation of Reservoir Simulation Models
PRESENTER: Steven Samoil

ABSTRACT. Reservoir models used in predicting subsurface fluid and rock behaviours have become increasingly large and complex with upwards of billions of grid cells and are modelling increasingly complex behaviour in emerging areas beyond simply energy resource production. Even highly parallel reservoir simulation platforms such as the Reservoir Simulation Group parallel simulator with a suite of linear solvers including AMG are reaching the limits of classical computing. One area under investigation to address these limitations is the process of history matching to update reservoir models with new data to better model real world behaviour. This requires numerous simulation runs and due to its time consuming nature is an area that could greatly benefit from improvements to existing techniques. In this work we propose the use of a quantum computing form of Bayesian optimisation as a technique utilising emerging computational hardware to perform history matching to take advantage of the potential benefits offered by quantum computers. The results found in this work show promise as they perform as well as the classical approach when utilizing a simulated quantum system. This work relied on the use of a hybrid cloud framework that utilized remote compute resources on a large scale cluster as well as developing the necessary tools to run the quantum Bayesian optimization on a real quantum system for future projects.

08:25
A multi-domain relaxation strategy with quasi-Monte Carlo for Mine Counter Measure Simulations

ABSTRACT. Modelling and simulating mine countermeasures (MCM) search missions performed by autonomous vehicles is a challenging endeavour. The goal of these simulations typically consists of calculating trajectories for autonomous vehicles in a designated zone such that the coverage (residual risk) of the zone is below a certain user defined threshold. We have chosen to model and implement the MCM problem as a stochastic optimal control problem. Mathematically, the MCM problem is defined as minimizing the total mission time needed to survey a designated zone for a given residual risk of not detecting sea mines in the user-chosen square domain. The output of our stochastic optimal control implementation consists of an optimal trajectory in the square domain for the autonomous vehicle. Important to note is that the residual risk is mathematically represented as an expected value integral. In order to compute a solution for this expected value integral, we use on the one hand a quasi-Monte Carlo (qMC) sampling scheme based on a Rank-1 Lattice rule, and on the other hand we use the traditional Monte Carlo (MC) sampling scheme. However, we observed that in our most ‘naive’ implementation of the MCM problem, the residual risk obtained at the end of the optimisation run was often higher than the maximally allowed user defined residual risk. In order to remedy to this issue, we developed a multi-domain relaxation strategy. The main idea of our multi-domain relaxation strategy consists of sequentially solving the stochastic optimal control problem with an ever increasing size of the domain until the requested risk is satisfied. The qMC or MC points we use are generated once at the start of the simulation on the unit cube domain, after which they are mapped to the desired domains. By following this approach, we ensure that the MCM risk obtained at the end of the optimisation run is lower or at most equal to the requested residual MCM risk. We observed a speedup up to a factor two in terms of total computational cost in favour of qMC when compared to MC.

08:50
Mulltiscale variational sintering model with degenerate incompressibility
PRESENTER: Arkadz Kirshtein

ABSTRACT. I will present a new approach to micromechanical phase-field model of viscous sintering where only part of the domain satisfies incompressibility constraint. Then I will discuss analytical and numerical challenges that come with this model as well as present numerical scheme and simulation results.

09:15
Multigrid-based Stochastic Minimization for Ptychographic Phase Retrieval
PRESENTER: Borong Zhang

ABSTRACT. We propose a novel approach to ptychographic phase retrieval that integrates a stochastic minimization framework accelerated by a multigrid method. Our formulation addresses ptychographic reconstruction by iteratively minimizing a quadratic surrogate that approximates the original objective function. Notably, this framework encompasses the widely used Ptychographic Iterative Engine (PIE) as a special case. By efficiently solving the surrogate problem using a multigrid strategy, our method mitigates ill-posedness and delivers significant improvements in both convergence speed and reconstruction quality compared to conventional PIE techniques.

09:40
Effects of Process/Thread Allocation for Optimization of Communication-Computation Overlapping in Parallel Multigrid Methods

ABSTRACT. Preconditioned iterative methods based on the Krylov subspace technique are widely employed in various scientific and technical computing. When utilizing large-scale parallel computing systems, the communication overhead tends to increase with the growth in the number of nodes, making its reduction a crucial challenge. In parallel FEM/FVM, halo communication and computation overlapping (CC-Overlapping) are commonly employed, often in conjunction with the dynamic loop scheduling feature of OpenMP. In the previous work, the author proposes a method to apply CC-Overlapping to the forward and backward substitutions of the IC(0) smoother of the parallel Conjugate Gradient method preconditioned by Multigrid (MGCG). Using up to 4,096 nodes on Wisteria/BDEC-01 (Odyssey) with A64FX, performance improvement of approximately 40+% was achieved compared to the original implementation. In the present work, effects of process/thread allocation within a compute node in OpenMP/MPI Hybrid parallel programming model has been conducted for optimization of CC-Overlapping in parallel MGCG solver using CGA (Coarse Grid Aggregation). It has been observed that even when the number of threads and cores per MPI process is low, as in the cases of HB 3×16 and HB 6×8, the effects of CC-Overlapping can still be achieved similarly to the traditional HB 12×4 configuration in the previous works. When the problem size per node increases, the performance of HB 3×16, HB 6×8 and HB 3×16 and HB 12×4 tends to be comparable. However, for smaller problem sizes per node, the performance of HB 3×16 and HB 6×8 configurations are actually superior. This is primarily attributed to the memory access performance in multi-threaded parallelism. Additionally, the number of parallel threads at each level of the smoother in parallel multigrid procedure and coarse grid solver was investigated. It was found that reducing the number of threads had little effect on computation time except at the finest level (level-1). Specifically, for the coarse grid solver, when the problem size was less than 10^4, reducing the number of threads to one at all levels did not make a significant difference in computation time. Further investigation is needed in this regard to determine the underlying causes.

10:05-10:25Coffee Break
10:25-12:30 Session 8: Algebraic multigrid
Location: Bighorn B
10:25
Generalized Optimal AMG Convergence Theory for Stokes Equations Using Smooth Aggregation and Vanka Relaxation Strategies
PRESENTER: Ahsan Ali

ABSTRACT. This paper discusses our recent generalized optimal algebraic multigrid (AMG) convergence theory applied to the steady-state Stokes equations discretized using Taylor-Hood elements ($\mathbb{P}_2/\mathbb{P}_1$). The generalized theory is founded on matrix-induced orthogonality of the left and right eigenvectors of a generalized eigenvalue problem involving the system matrix and relaxation operator. This framework establishes a rigorous lower bound on the spectral radius of the two-grid error-propagation operator, enabling precise predictions of the convergence rate for symmetric indefinite problems, such as those arising from saddle-point systems. We apply this theory to recently developed monolithic smooth aggregation AMG (SA-AMG) solver for Stokes, constructed using evolution-based strength of connection, standard aggregation, and smoothed prolongation. The performance of these solvers is evaluated using additive and multiplicative Vanka relaxation strategies. Additive Vanka relaxation constructs patches algebraically on each level, resulting in a nonsymmetric relaxation operator due to the partition of unity being applied on one side of the block-diagonal matrix. Although symmetry can be restored by eliminating the partition of unity, this compromises convergence. Alternatively, multiplicative Vanka relaxation updates velocity and pressure sequentially within each patch, propagating updates multiplicatively across the domain and effectively addressing velocity-pressure coupling, ensuring a symmetric relaxation. We demonstrate that the generalized optimal AMG theory consistently provides accurate lower bounds on the convergence rate for SA-AMG applied to Stokes equations. These findings suggest potential avenues for further enhancement in AMG solver design for saddle-point systems.

10:50
Nodal AMG Coarsening and Interpolation for PDE Systems
PRESENTER: Taoli Shen

ABSTRACT. We present an approach to constructing a practical coarsening algorithm and interpolation operator for the algebraic multigrid (AMG) method, tailored towards systems of partial differential equations (PDEs) with large near-kernels, such as H(curl) and H(div). Our method builds on compatible relaxation (CR) and the ideal interpolation model within the generalized AMG (GAMG) framework but introduces several modifications to define an AMG method for PDE systems. We construct an interpolation operator through a coarsening process that first coarsens a nodal dual problem and then builds the coarse and fine variables using a matching algorithm. Our interpolation follows the ideal formulation; however, we enhance the sparsity of ideal interpolation by decoupling the fine and coarse variables completely. When the coarse variables align with the geometric refinement, our method reproduces re-discretization on unstructured meshes. Together with an automatic smoother construction scheme that identifies the local near kernels, our approach forms a complete two-grid method. Finally, we also show numerical results that demonstrate the effectiveness of this interpolation scheme by applying it to targeted problems and the Stokes system.

11:15
Advancing Adaptive Algebraic Multigrid: Composite Preconditioners for Anisotropic Problems and Contact Elasticity
PRESENTER: Austen Nelson

ABSTRACT. This work continues the research on adaptive Algebraic Multigrid (AMG) methods that we first presented at the 2024 Copper Mountain Conference on Iterative Methods. In that earlier study, we introduced a methodology for adaptivity through the construction of composite preconditioners utilizing multiple coarse-grid hierarchies, established a theoretical framework to ensure convergence of these composite operators, and reported preliminary results for finite element discretizations of diffusion problems with highly anisotropic and heterogeneous coefficients. Building on these foundations, we now provide a more detailed investigation different interpolation scheme’s approximation properties, present a more comprehensive analysis on challenging anisotropic and heterogeneous diffusion problems, and introduce an application to contact problems in linear elasticity as a preconditioner for an inequality constrained quadratic program.

11:40
Aggressive coarsening for faster CFD simulations on GPU-accelerated supercomputers

ABSTRACT. Owing to its effectiveness in solving elliptic PDEs, Algebraic Multigrid (AMG) is the default preconditioner of many Computational Fluid Dynamics (CFD) simulation codes. Nevertheless, its quality comes with relatively high memory requirements and setup and application costs, and making it lighter without harming its fast convergence yields significant speed-ups. A way to achieve this is through aggressive coarsening [1], which Chronos implements by computing the symbolic power of the adjacency matrix, T, corresponding to the graph of strongly connected unknowns (for a given strength-of-connection measure and filtering). Then, it computes a maximum independent set on the resulting matrix, T^k, which ensures there are no coarse nodes at a distance ≤k. The greater k, the more aggressive the coarsening and the fewer levels in the multigrid hierarchy, but the potentially worse accuracy and, therefore, slower convergence. Hence, to preserve AMG's excellent convergence, aggressive coarsening requires accurate prolongations relying on long-distance interpolations [2]. Chronos uses the dynamic-pattern least squares (DPLS) interpolation, which results in remarkably low operator complexities. Nevertheless, DPLS was designed to accommodate multiple test vectors, which makes it perform poorly on CFD problems due to their one-dimensional near-null space. However, combining it with energy minimization proved very effective in improving DPLS quality while keeping the operator complexity low, resulting in an AMG converging comparably regardless of the aggressive coarsening [3]. Numerical experiments harnessing our novel aggressive coarsening GPU implementation will be presented at the conference.

12:05
A new AMG Strength-of-connection measure that incorporates finite element mass matrices
PRESENTER: Raymond Tuminaro

ABSTRACT. AMG's current strength-of-connection (SOC) measures remain problematic especially for finite element discretizations on stretched grids. In particular, it is well-known that several popular finite element discretizations can give rise to matrices where relatively large magnitude matrix off-diagonal entries correspond to directions that should be categorized as weak within an AMG procedure. This is largely due to "smearing" effects associated with basis function inner products that are used to define the matrix. In this talk, we illustrate how these smearing effects can be reduced by instead considering strength-of connection for the matrix M^-1 A as opposed to the matrix A. Here, A and M are the finite element discretization and mass matrices respectively. Of course, this requires users to now also provide the matrix M, though this is often easy within many finite element applications. While the inverse of the mass matrix is dense, it is well-conditioned and can be approximated by a number of relatively inexpensive methods. A key point is that the underlying approximation can employ a sparsity pattern that corresponds to that of A or |A|^2. For problems where the mesh does not change, an M^-1 approximation can be re-used throughout a simulation to solve the associated sequence of linear systems. Further storage reductions can be achieved by storing only the upper triangular portion of the symmetric matrix matrix and employing low precision ideas. Computational results will be shown illustrating the advantages of this approach on a number of stretched finite element examples illustrating employing a range of different approximation methods for the inverse mass matrix.

12:30-16:00Lunch Break
16:00-16:30Coffee Break
16:30-18:35 Session 9: Structured and matrix-free methods
Location: Bighorn B
16:30
Overlapping Schwarz methods are not anisotropy-robust multigrid smoothers
PRESENTER: Oliver Krzysik

ABSTRACT. We consider overlapping multiplicative Schwarz methods as smoothers in the geometric multigrid solution of two-dimensional anisotropic diffusion problems. The smoothing properties of point-wise smoothers, such as Gauss–Seidel, rapidly deteriorate as the strength of anisotropy increases, while global smoothers based on line smoothing tend to provide good smoothing, independent of the anisotropy strength. We analyze whether global methods are really necessary to achieve good smoothing, or whether it can be obtained with locally overlapping block smoothers using sufficiently large blocks and overlap. For a fixed block size, we find that the smoothing properties of overlapping multiplicative Schwarz rapidly deteriorate with increasing anisotropy, irrespective of the amount of overlap between blocks.

LA-UR-25-20777

16:55
Semi-Structured Algebraic Multigrid in hypre
PRESENTER: Wayne Mitchell

ABSTRACT. Due to their optimal linear complexity, multigrid methods are widely used for the solution of large-scale linear systems arising from PDE discretizations. Algebraic multigrid methods are particularly attractive due to their black-box approach, constructing a multigrid hierarchy based only on the linear system to be solved. This approach generally neglects the structure that is often present in the underlying PDE discretizations, however. This talk describes the Semi-structured Algebraic Multigrid (SSAMG) algorithm, which leverages the structure present in the linear system while maintaining a black-box algebraic approach. SSAMG is implemented in hypre, a library of high-performance linear solvers specializing in multigrid algorithms. This talk will give an overview of the approach and design of SSAMG and present the latest numerical results highlighting the benefits of SSAMG compared to fully unstructured AMG.

17:20
A Characterization of the Behavior of Nonlinear GMRES on Linear Systems
PRESENTER: Yunhui He

ABSTRACT. The Nonlinear GMRES (NGMRES) proposed by Oosterlee and Washio [SIAM Journal on Scientific Computing, 2000, 21(5):1670-–1690] is an acceleration method for fixed point iterations. It has been demonstrated to be effective, but its convergence properties have not been extensively studied in the literature so far, even for linear systems. In this work we aim to close some of this gap. We offer a convergence analysis for a windowed version of NGMRES applied to linear systems, which we denote by NGMRES($m$) for window size $m$ applied to a Richardson fixed-point iteration. A central part of our analysis focuses on identifying equivalences between NGMRES and the classical Krylov subspace GMRES method. Under the mild assumption that the Euclidean norm of the GMRES residual strictly decreases throughout the iteration, we prove the residuals of NGMRES($\infty$) and GMRES are identical in the absence of roundoff errors. Moreover, if the matrix of the linear system is nonsingular, we prove equivalence of NGMRES($\infty$) and GMRES.  We prove that NGMRES(1) coincides with MINRES if the coefficient matrix is symmetric or shifted skew-symmetric. We also provide some insights and comparisons between the convergence of NGMRES and Anderson acceleration in a few specific cases.  We derive a simple algorithm for updating the solution iterates, which keeps the cost of the updates acceptably modest. Finally, we derive upper bounds on the convergence rate of NGMRES for a few specific examples of interest.

17:45
A matrix-free AMG method targeting large-scale GPU architectures
PRESENTER: Peter Ohm

ABSTRACT. With the advent of advanced computing architectures such as GPUs, many algorithms have required changes in order to properly utilize them. One such approach that is well-suited for GPUs is matrix-free methods. Matrix-free methods can be advantageous over traditional methods due to their memory efficiency and lower computational complexity, but with the cost of needing specialized software frameworks and solvers. Geometric multigrid methods have seen great success on GPU architecture with the regular geometric stencils naturally lending themselves to matrix-free implementations. Algebraic multigrid methods are problematic because they usually require information about matrix entries, which may not be easily obtainable in the matrix-free framework.

In this talk we discuss the implementation and performance of a matrix-free AMG solver implemented in NEKO, a portable framework for high-order spectral element flow simulations. We utilize an hp-multigrid approach, where the problem is first coarsened from high-order elements to low-order elements and then the low-order system is further coarsened spatially in a matrix-free AMG fashion.

We present numerical results demonstrating the performance and scalability of our method and applicability to real-world applications.

18:10
Error Estimates for the Arnoldi Approximation of a Matrix Square Root
PRESENTER: Zhongqin Xue

ABSTRACT. The Arnoldi process provides an efficient framework for approximating functions of a matrix applied to a vector, i.e., of the form $f(M)b$. In this paper, we present an a priori error estimate for approximating matrix functions using the Arnoldi process, extending the error analysis of the Lanczos method for Hermitian matrices in [Chen et al., SIAM J. Matrix Anal. Appl., 2022] to non-Hermitian cases. By employing the integral representation for the error of an Arnoldi iteration, we reformulate it in terms of the error for approximating $M^{-1}b$. Furthermore, assuming the matrix is preprocessed utilizing low-rank approximations that preserve positive definiteness, we derive an error bound of the Arnoldi approximation for the square root of a matrix. Numerical experiments on large-scale matrices arising in particulate suspensions validate the effectiveness and practicality of the proposed approach.