CM2020: 16TH COPPER MOUNTAIN CONFERENCE ON ITERATIVE METHODS
PROGRAM FOR SUNDAY, MARCH 22ND
Days:
previous day
next day
all days

View: session overviewtalk overview

08:00-10:05 Session 2A: Preconditioning for Multiphysics Problems I. Bighorn B
08:00
A Scalable Approximate Inverse Block Preconditioner for an Incompressible Magnetohydrodynamics Model Problem

ABSTRACT. We introduce a new approximate inverse preconditioner for a mixed finite element discretization of an incompressible magnetohydrodynamics (MHD) model problem. The derivation exploits the nullity of the discrete curl-curl operator in the Maxwell subproblem. We show that the inverse of the coefficient matrix contains zero blocks, and use discretization considerations to obtain a practical preconditioner based on further sparsification. We demonstrate the viability of our approach with a set of numerical experiments.

08:25
Block preconditoning strategy for reduced magnetohydrodynamic (RMHD) equations

ABSTRACT. Magnetohydrodynamic (MHD) equations are used to model the flow of plasma when the time scale of interest is long compared to the particle velocities. An important application of MHD is for models of plasma in tokomak geometry, where there is a very strong magnetic field in the toroidal direction. Near the edge of the tokomak, a reduced form can be used to model “edge effects”. Discrete approximations to the reduced MHD equations require the solution of a system of multiple, coupled, partial differential equations. This talk will describe this system and the reorganization of the equations into a block form. A preconditioning strategy for the block system that involves Shur complements and AMG solvers will be describe and preliminary numerical results will be presented.

08:50
Scalable implicit, adaptive MFEM-based solvers for reduced resistive MHD

ABSTRACT. The extended magnetohydrodynamics (XMHD) equations are advanced continuum models to understand the evolution of complex plasma dynamics in tokamak disruptions. However, solving XMHD numerically is challenging due to its disparity in time and length scales, its strongly hyperbolic nature and its nonlinearity. Therefore, scalable, adaptive implicit algorithms based on efficient preconditioning strategies are necessary for XMHD.

In this work, we design and develop several finite-element schemes for a simple model, the reduced resistive MHD equations. Both explicit and implicit schemes are developed using the scalable C++ framework MFEM (mfem.org). The implicit scheme is based on the Jacobian-free Newton-Krylov method with a physics-based preconditioner as proposed in [Chacon et al, 2002]. The preconditioner is generalized for the finite element discretization, and algebraic multigrid methods are used to invert certain operators to achieve scalability. Our preconditioner effectively treats stiff hyperbolic components in the system. Both explicit and implicit solvers are implemented in parallel with adaptive mesh refinement and dynamic load-balancing. Benchmark results will be presented to demonstrate the accuracy and scalability of the implicit scheme in the presence of strong scale disparity.

09:15
An element-based preconditioner for mixed finite element problems

ABSTRACT. We introduce a new and generic approximation to Schur complements arising from inf-sup stable mixed finite element discretizations of self-adjoint multi-physics problems. The approximation ex- ploits the discretization mesh by forming local, or element, Schur complements and projecting them back to the global degrees of freedom. The resulting Schur complement approximation is sparse, has low construction cost (with the same order of operations as a general finite element matrix), and can be solved using off-the-shelf techniques, such as multigrid. Using the Ladyshenskaja-Babuška-Brezzi condition, we show that this approximation has favorable eigenvalue distributions with respect to the global Schur complement. We present several numerical results to demonstrate the viability of this approach on a range of applications. Interestingly, numerical results show that the method gives an effective approximation to the non-symmetric Schur complement from the steady state Navier-Stokes equations.

09:40
Scalable Block Preconditioning of Implicit / IMEX FE Continuum Plasma Physics Models

ABSTRACT. Continuum plasma physics models are used to study important phenomena in astrophysics and in technology applications such as magnetic confinement (e.g. tokamak), and pulsed inertial confinement (e.g. NIF, Z-pinch) fusion devices. The computational simulation of these systems, requires solution of the governing PDEs for conservation of mass, momentum, and energy, along with various approximations to Maxwell's equations. The resulting systems are characterized by strong nonlinear coupling of fluid and electromagnetic phenomena, as well as the significant range of time- and length-scales that these interactions produce.  To enable accurate, and stable approximation of these systems, a wide-range of spatial discretizations that include mixed integration, stabilized, and structure-preserving approaches are employed. For effective long-time-scale integration some implicitness is required. In this context, fully-implicit, and implicit-explicit methods have shown considerable promise. These characteristics make scalable and efficient iterative solution, of the resulting poorly-conditioned discrete systems, extremely difficult. Our approach to overcome these challenges has been the development of efficient fully-coupled multilevel preconditioned Newton-Krylov methods. This talk considers the structure of these algorithms, demonstrates the flexibility of this approach, and presents results on scaling of the methods on up to 1M cores. (This is joint work with Michael Crockatt, Sidafa Conde, Roger Pawlowski, Ari Rappaport, Edward Phillips, Eric Cyr, Sean Miller and Paul Lin (LBL), Sibu Mabuza (Clemson))

*This work was partially supported by the DOE Office of Science Advanced Scientific Computing Research (ASCR) - Applied Math Research program and an ASCR/Office of Fusion Energy SciDAC Partnership Project at Sandia National Laboratories. Sandia is a multimission laboratory managed and operated by NTES of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy NNSA under contract DE-NA0003525.

08:00-10:05 Session 2B: Krylov Methods. Bighorn C/1
08:00
On the Convergence Rate of Variants of the Conjugate Gradient Algorithm in Finite Precision Arithmetic

ABSTRACT. We consider three mathematically equivalent variants of the conjugate gradient algorithm and how they perform in finite precision arithmetic. We are concerned mainly with the rate at which the A-norm of the error is reduced before the ultimately attainable accuracy (which may be different for the different variants) is achieved. It was shown in ["Behavior of slightly perturbed Lanczos and conjugate-gradient recurrences", Lin. Alg. Applied. 113 (1989), pp. 7-63] that if certain local orthogonality conditions and a 3-term recurrence are approximately satisfied, then the finite precision computation behaves like exact CG for a matrix with many eigenvalues distributed throughout tiny intervals about the eigenvalues of the given matrix. We determine how closely each variant satisfies these conditions using a set of test problems and demonstrate that there is significant correlation between how well these conditions are satisfied by the different variants and how rapidly they reduce the A-norm of the error for difficult problems, before the final accuracy level is achieved.

08:25
Sharp 2-norm Error Bounds for the Conjugate Gradient Method and LSQR

ABSTRACT. For positive definite linear systems Ax=b, we propose a method for cheaply computing an upper bound on the 2-norm error of the iterates produced by the conjugate gradient method. Our bound is only slightly tighter than existing estimates, but we also prove that it is sharp in the following sense: it uses only the information revealed by the Lanczos process along with a lower bound on the smallest singular value of A, and among all such methods it provides the tightest possible upper bound on the error. Thus our work demonstrates the limits of one current approach to estimating the 2-norm error.

We also consider LSQR, an iterative least-squares solver. As LSQR is formally equivalent to the conjugate gradient method applied to the normal equations, we derive analogous results for LSQR.

08:50
Singular Systems along with Eigenvalues and Singular Values

ABSTRACT. We consider singular linear equations where there are small non-zero eigenvalues that matter to the solution. We propose a method with a version of GMRES that computes a null vector and uses it to assist solution of linear equations with the nullity removed. Small eigenvalues are also computed, and the method can be adapted to find singular values. Tools used are rank-one perturbation, the deflated method GMRES-DR, and possibly a version of Golub-Kahan.

09:15
Delayed Re-Orthogonalization to Achieve One-Synch Classical Gram-Schmidt Applied to GMRES and Krylov-Schur Algorithms

ABSTRACT. We are interested in the Gram-Schmidt orthogonalization process in the context of the Arnoldi expansion, where we reduce the number of synchronization steps within the algorithm. We then present an Arnoldi expansion algorithm that is fully stable and requires only one synchronization per step. Hernandez et al. published a similar algorithm (one synchronization per step), albeit less numerically stable. An extensive numerical study was performed and is presented demonstrating the stability of this new method. Performance results on large scale parallel computers are presented as well.

09:40
A single-pole rational Krylov subspace method for approximating Markov functions of matrices

ABSTRACT. A single-pole rational Krylov subspace method (SPRKSM) is considered for numerical approximation of the action of $f(A)$ to a vector $b$, where the function $f$ is Markov type. SPRKSM has the same framework as the extended Krylov subspace method (EKSM), replacing the zero pole in EKSM with a properly chosen fixed nonzero pole. For symmetric positive definite matrices, the optimal pole is derived for SPRKSM to ensure the fastest asymptotic convergence rate. In the nonsymmetric setting, SPRKSM is also effective for many problems. For certain difficult problems that cannot be efficiently approximated by SPRKSM, a four-pole rational Krylov subspace method is proposed, which aims to strike a balance between SPRKSM and adaptive RKSM to reduce runtime. For large matrices that can be handled efficiently by factorization-based direct linear solvers, numerical experiments show that SPRKSM and the four-pole variant outperform EKSM in both storage and runtime, and they usually have advantage in runtime over adaptive RKSM.

08:00-10:05 Session 2C: PDE-constrained Optimization. Bighorn C/2
08:00
An Approximation Scheme For Distributionally Robust Nonlinear PDE-Constrained Optimization

ABSTRACT. We develop a sampling-free approximation scheme for distributionally robust optimization (DRO) with PDEs. The DRO problem can be written in a bilevel form that involves maximal (i.e., worst case) value functions of expectations of nonlinear functions which depend on the optimization variables and random parameters. The maximum values are taken over an ambiguity set of probability measures that is defined by moment constraints. To achieve a good compromise between tractability and accuracy we approximate nonlinear dependencies of the cost / constraint functions on the random parameters by quadratic Taylor expansions. This results in an approximate DRO problem which on the lower level then involves value functions of parametric trust-region problems and of parametric semidefinite programs. Using trust-region duality, a barrier approach, and other techniques we construct gradient consistent smoothing functions for these value functions and show global convergence of a corresponding continuation method. We discuss the application of our approach to PDE constrained optimization under uncertainty and present numerical results.

08:25
An ADMM-based time domain decomposition approach for PDE constrained optimization

ABSTRACT. We consider optimization problems governed by time-dependent parabolic PDEs and discuss the construction of parallel solvers based on time-domain decomposition. We propose an ADMM-based approach to decompose the problem in independent subproblems and combine it with a multigrid strategy. We analyze the convergence properties and present numerical results.

08:50
Algebraic Multigrid Preconditioning for PDE-constrained Optimization

ABSTRACT. We construct an algebraic multigrid (AMG) based preconditioner for the reduced Hessian of a linear-quadratic optimization problem constrained by an elliptic equation. While the preconditioner generalizes a geometric multigrid preconditioner introduced in earlier works, its construction relies entirely on standard AMG infrastructure build for solving the forward elliptic equation, thus allowing for it to be implemented using a variety of AMG methods and standard packages. Our analysis establishes a clear connection between the quality of the preconditioner and the AMG method used. The proposed strategy has a broad and robust applicability to problems with unstructured grids, complex geometry, and varying coefficients. Several numerical examples are presented.

09:15
Fractional Optimal Control Problems with State Constraints: Algorithm and Analysis

ABSTRACT. Motivated by several applications in geophysics and machine learning, in this talk, we introduce a novel class of optimal control problems with fractional PDEs. The main novelty is due to the obstacle type constraints on the state. The analysis of this problem has required us to create several new, widely applicable, mathematical tools such as characterization of dual of fractional Sobolev spaces, regularity of PDEs with measure-valued datum. We have created a Moreau-Yosida based algorithm to solve this class of problems. We establish convergence rates with respect to the regularization parameter. Finite element discretization is carried out and a rigorous convergence of the numerical scheme is established. Numerical examples confirms our theoretical findings.

10:25-12:30 Session 3A: Preconditioning for Multiphysics Problems II. Bighorn B
10:25
Preconditioning a Fully Implicit Gyrokinetic Electromagnetic Particle-in-Cell Method

ABSTRACT. A fully implicit particle-in-cell (PIC) method based on the work of G. Chen and L. Chacón [1] has been developed to study gyrokinetic electromagnetic modes in tokamak plasmas. A fully implicit time discretization scheme overcomes stability issues due to the inductive component of the electric field in the sympletic formulation of gyrokinetics [2] while avoiding a well-known “cancellation problem” associated with the Hamiltonian formulation of gyrokinetics [3].

To handle the resulting system of nonlinear equations, we employ the strategy outlined in [1] to formulate the system in terms of field equation residuals. This formulation requires far less solver memory than an equivalent residual formulated in terms of particle quantities. A preconditioned Picard iteration scheme, accelerated with Anderson mixing [4], is then applied.

The preconditioner is derived from an electron fluid model, which captures the stiff modes of the kinetic system, and includes additional effects to account for the numerics of the PIC method. Application of the preconditioner requires the solution of a linear system of equations resulting from the discretization of a coupled PDE system. We will discuss strategies for solving the linear system including block factorizations, multigrid methods, and ADI-like approaches using PETSc. Finally, we will present numerical results to validate our scheme, including the simulation of the ITG-KBM transition [5] and long wavelength Alfven waves, which has been problematic with previous approaches.

[1] G. Chen, L. Chacon, Comput. Phy. Comm. 197, 73-87, 2015. [2] J.V.W. Reynders, Ph.D. thesis, Princeton University, 1992. [3] J.C. Cummings, Ph.D. thesis, Princeton University, 1995. [4] D.G. Anderson, J. Assoc. Comput. Mach., 12, 547-560, 1965. [5] T. Gorler, N. Tronko, et al., Phys. Plasmas, 23, 072503, 2016.

10:50
A coupled thermal radiative transfer and kinetic plasma solver

ABSTRACT. Simulating inertial confinement fusion (ICF) implosions is challenging due to the extreme conditions in the target capsule during the shot, different time scales from the fast plasma interactions to the slower material motions and nonlocal effects in radiation and plasma. The current mainstream approach in modeling such a system is to use a radiation-hydrodynamic model, where the plasma is modeled as a fluid. However, this approach neglects kinetic effects, such as differential ion motions, enhanced viscosity, and nonlocal heat transport. Recent work of our group has focused on a kinetic plasma solver for ICF simulations, but has neglected thermal radiative transfer (TRT) so far. At the kilo-electronvolt temperature regime, in which most ICF experiments live in, TRT becomes the dominant mechanism for energy transfer. Additionally, the driving force of ICF implosions is radiation (laser for direct drive and X-ray for indirect drive). Hence a radiation model allows us to drive our simulations without the need for external radiation-hydrodynamics calculations.

In this work we present a coupled kinetic plasma physics and radiation solver based on a hierarchical structure of coupled and kinetic solvers. The coupling between radiation, ion, and electron physics is done in a low-order (LO) system in a high-order low-order (HOLO) scheme. The cheaper, monolithic LO solver takes care of the stiff physics, related to the nonlocal coupling between the different ion species, electrons, and radiation. Correction terms, which are informed by high-order (HO) systems for each physics, force the LO solution to be equivalent to the kinetic solution. This approach allows us to decouple the expensive HO solves, and additionally, accelerate the convergence.

The first step is a radiation diffusion solver, which is fully coupled to a plasma fluid model. This solver is designed to be the LO system in our HOLO-scheme. It will implement all multiphysics coupling terms between the thermal radiation and fluid electrons, and will be later expanded to include the kinetic correction terms from the radiation HO system. We demonstrate our implementation with a challenging radiation driven shock problem. These calculations will use a radiation diffusion solver, and a hybrid ion-kinetic electron-fluid model.

11:15
A moment-accelerated iterative implicit solver for the multispecies 1D-2V Vlasov-Fokker-Planck – Ampère kinetic equation

ABSTRACT. We discuss a fully-implicit iterative solver for the Vlasov-Fokker-Planck – Ampère (VFPA) system with one configuration-space dimension and two velocity-space dimensions (1D-2V). The VFPA system couples a multispecies kinetic plasma description to the electrostatic electric field (Ampère). The purpose of this work is to enable efficient, high fidelity, fully kinetic simulations of weakly coupled high-energy-density plasmas with an arbitrary number of species, particularly in regimes relevant to inertial confinement fusion (ICF). For many applications of interest (e.g., ICF implosions) the VFPA system exhibits large separation of numerical and physical scales in time and space, in particular due to the plasma frequency, fast electron motion, and electron collisions [1]. This numerical stiffness is addressed here by iterative implicit timestepping, which is accelerated using a high-order – low-order (HOLO) strategy (essentially a multiscale moment-based preconditioning scheme [2]). The scheme also employs an adaptive mesh strategy in physical and velocity space [3], in which the velocity space of each species is analytically rescaled and shifted by its spatially and temporally varying thermal speed and bulk flow, respectively; a moving mesh is used in the configuration space that allows tracking features with sharp spatial gradients, such as shocks and material interfaces. Further, conservation of mass, momentum, and energy, together with Gauss’ law, are discretely preserved – commensurately with the iterative nonlinear relative tolerance – through a Lagrange-multiplier-like approach using nonlinear constraint functions [1, 4]. We demonstrate the effectiveness and accuracy of the algorithm with several benchmark problems of varying complexity, including a fully kinetic collisional plasma shock in planar geometry.

[1] S. Anderson et al., An efficient, conservative, time-implicit solver for the fully kinetic arbitrary-species 1D-2V Vlasov-Ampère system. eprint arXiv:1910.02099, 2019 [2] L. Chacón et al, Multiscale high-order/low-order (HOLO) algorithms and applications. J. Comp. Phys., 330:21—45, 2017 [3] Taitano et al., A conservative phase-space moving-grid strategy for a 1D-2V Vlasov-Fokker-Planck Equation, eprint arXiv:1903.05467, 2019 [4] Taitano et al., An adaptive, implicit, conservative, 1D-2V multi-species Vlasov–Fokker–Planck multi-scale solver in planar geometryblab la, J. Comp. Phys., 365:173—205, 2018

11:40
A solution strategy for fluid-structure interaction using the unified continuum formulation, quasi-direct coupling, and nested block preconditioning

ABSTRACT. In this work, we first present a unified continuum modeling framework for viscous fluids and hyperelastic solids using the Gibbs free energy as the thermodynamic potential. This framework naturally leads to a pressure primitive variable formulation for the continuum body, which is well-behaved in both compressible and incompressible regimes. We perform variational multiscale (VMS) analysis for this general continuum model, and the resulting formulation recovers the residual-based VMS formulation for the Navier-Stokes equations. For hyperelastic materials, this VMS formulation provides a mechanism to circumvent the inf-sup condition for low-order tetrahedral elements. For the fully discrete problem, a block factorization is performed for the consistent tangent matrix, which reduces the linear algebra problem to a simpler one with a 2-by-2 block structure. This naturally leads to the quasi-direct coupling method for fluid-structure interaction (FSI) problems, which is amenable to implementation within existing software framework. As a generalization of the conventional block preconditioners, a three-level nested block preconditioner using Schur complement reduction is introduced to attain a better representation of the Schur complement, which plays a key role in the overall algorithm robustness and efficiency. This approach provides a flexible, algorithmic way to represent the Schur complement for problems involving multiscale and multiphysics coupling. The robustness, efficiency, and parallel scalability of the proposed technique are then examined in several settings, including the tensile test of a fully-incompressible anisotropic hyperelastic arterial wall model, moderately high Reynolds number flows, and physiological flows with strong resistance effect due to the downstream vasculature model.

12:05
Simulating Multi-Species Compressible Flow with a Fully-Implicit All-Speed Flow Solver

ABSTRACT. The numerical simulation of flows associated with laser-induced phase change in metal additive manufacturing, supercritical fluids, and the slow cook-off of energetic materials present significant challenges. In all cases, the Mach number is very low and gas compression and expansion occurs due to large thermal gradients. In thermal cookoff problems, multiple species and multiple thermodynamic phases (gases, liquids, solids) are present, creating additional modeling complications.

To address these challenges, we present a high-order, fully-implicit all-speed fluid dynamics solver for simulating multi-species compressible flow at very low-Mach numbers. The governing equations are the conservative Navier-Stokes equations with N-species transport equations and an equation of state. The equations are discretized in space with a Discontinuous Galerkin method and integrated in time with L-stable fully implicit time discretization schemes. The resulting set of non-linear/linear equations is converged using a robust preconditioned Newton-Krylov solver.

We demonstrate that our fully-implicit solver is able to converge multi-species compressible reactive flow with a Mach number less than 10-5 and in the presence of large viscosity gradients. Future work will focus on developing scalable block-based preconditioning techniques, incorporating Non-Newtonian rheology, and handling sharp density jumps across gas-liquid interfaces.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

10:25-12:30 Session 3B: Iterative Methods and Preconditioning. Bighorn C/1
Chair:
10:25
Band-Toeplitz preconditioners for nonsymmetric ill-conditioned Toeplitz systems

ABSTRACT. Preconditioning for ill-conditioned Toeplitz systems has been of interest over the past few decades. While circulant preconditioners, band-Toeplitz preconditioners, and various techniques have been developed for the Hermitian case in particular, the non-Hermitian case is less well investigated. Considering instead (real) nonsymmetric ill-conditioned Toeplitz systems, we propose a simple yet effective preconditioning approach for solving them using the minimal residual (MINRES) method that can achieve the overall optimal $\mathcal{O}(n\log{n})$ complexity. Namely, we first symmetrize a given Toeplitz system by using a permutation matrix and construct several band-Toeplitz type preconditioners for the modified system. Then, under certain assumptions, we show that the eigenvalues of the preconditioned matrix-sequence are clustered around $\pm 1$, except a number of outliers and hence superlinear convergence rate of MINRES can be obtained. An extension of this work to the multilevel (block) Toeplitz case is also included. Moreover, we indicate that our preconditioning approach can be employed for the systems arising from solving fractional differential equations. Numerical examples are provided to demonstrate the effectiveness of our proposed preconditioners.

10:50
Efficient low-order refined solvers for high-order matrix-free continuous and discontinuous Galerkin methods

ABSTRACT. Modern computer architectures favor the use of high-order discretizations for many problems. These methods feature high accuracy per degree of freedom and high arithmetic intensity, both of which are attractive properties for modern architectures such as GPUs. However, the cost of assembling or even storing the matrix associated with these discretizations can be prohibitively expensive, thus necessitating the use of matrix-free methods and solvers. Sum-factorization and related techniques allow for efficient operator evaluation, however solving the resulting large linear systems remains challenging. Furthermore, common preconditioners such as those arising from domain decomposition methods typically require dense blockwise factorizations, which are not possible without an explicit representation of the matrix.

In this talk, I will describe recent work on the development of matrix-free preconditioners and solvers for very high-order finite element discretizations. These solvers are based on a sparse, low-order refined discretization which is spectrally equivalent to the high-order discretization. This equivalence (often known as SEM-FEM equivalence) allows for the construction of preconditioners whose memory requirements scale optimally with the order of the method. In order to obtain good performance, the geometric anisotropy of the low-order method must be treated. The resulting preconditioners are robust with respect to the mesh spacing and polynomial degree. These preconditioners can be extended to discontinuous Galerkin methods, for which they are also robust with respect to the penalty parameter. Applications of these methods to a high-performance solver for the incompressible Navier-Stokes equations will be discussed.

11:15
Robust and Efficient Multilevel-ILU Preconditioning of Newton-GMRES with Applications to Navier-Stokes Equations

ABSTRACT. We introduce a new preconditioner for the Newton-GMRES method for solving nonlinear systems, especially for incompressible Navier-Stokes equations. When the Reynolds number is high, these systems often involve more than millions of unknowns, and the solutions sometimes have corner singularities in pressure, making it challenging to achieve robustness and efficiency. We propose to address this dual challenge by using a multilevel ILU technique, in particular HILUCSI, which the author and co-workers introduced recently for solving large-scale indefinite linear systems. We apply HILUCSI on physics-aware preconditioning operators in Newton-GMRES methods with hot start and damping. We also introduce adaptive refactorization, dynamic thresholding, and iterative refinement within the preconditioner, and implement them in a new nonlinear solver called HILUNG. We demonstrate the robustness and efficiency of HLUNG for a 2D driven-cavity problem with Reynolds number 5000 as well as a 3D channel-flow problem with tens of millions of unknowns and compare it with some state-of-the-art alternatives.

11:40
A new implicit Eulerian solver for semiconductor models

ABSTRACT. I will present a new nonlinear iterative strategy for solving semiconductor models in the Eulerian framework. We use a kinetic equation model, which can have nonlocal operators, high dimensionality, and multiple scales in the particle collision frequency. Stiffness of the problem may necessitate the use of implicit timestepping methods, and it is important to develop efficient nonlinear solvers to address this.

The new solver poses the problem as a fixed point problem on a lower dimensional space using a domain decomposition approach. We take advantage of the reduced dimensionality to store more solution vectors, and are thus able to apply Anderson Acceleration to speed up convergence. We also accelerate convergence by preconditioning the system with the drift-diffusion limiting equations when collisions dominate.

10:25-12:30 Session 3C: Inverse Problems and Regularization. Bighorn C/2
Chair:
10:25
Iterative Image Reconstruction and Segmentation Using Mesh Adaptation

ABSTRACT. In large-scale image processing applications, one important problem is to reconstruct and segment an image from measured data. The reconstruction problem is ill-posed, and requires regularization in order to compute a meaningful approximation of the solution. Segmentation is often then done as a post-processing tool, after the inverse problem is solved. This two-step process (reconstruct then segment) can then be iterated to produce higher resolution reconstructions, and more accurate segmentations.

In this talk we focus on image reconstruction and segmentation starting from indirect measurements. A classical example in such a direction is represented by tomographic imaging when the reconstruction-segmentation process works directly on x-ray projection data. In particular, we aim at improving the performances of some of the approaches currently available in the literature by means of a mesh adaptation procedure. The benefits due to the employment of adapted meshes have been already investigated, for instance, when image segmentation is driven by the minimization of suitable energy functionals and starting from direct measurements. In this case, mesh adaptation leads to better computational performances without giving up the accuracy, with a considerable saving in terms of memory storage.

10:50
An Inner-Outer Iterative Method for Edge Preservation in Image Restoration and Reconstruction

ABSTRACT. We present a new inner-outer iterative algorithm for edge enhancement in imaging problems. At each outer iteration, we formulate a Tikhonov-regularized problem where the penalization is expressed in the 2-norm and involves a regularization operator designed to improve edge resolution as the outer iterations progress, through an adaptive process. An efficient hybrid regularization method is used to project the Tikhonov-regularized problem onto approximation subspaces of increasing dimensions (inner iterations), while conveniently choosing the regularization parameter (by applying well-known techniques, such as the discrepancy principle or the L-curve criterion, to the projected problem). This procedure results in an automated algorithm for edge recovery that does not involve regularization parameter tuning by the user, nor repeated calls to sophisticated optimization algorithms, and is therefore particularly attractive from a computational point of view. A key to the success of the new algorithm is the design of the regularization operator through the use of an adaptive diagonal weighting matrix that effectively enforces smoothness only where needed. We demonstrate the value of our approach on applications in X-ray CT image reconstruction and in image deblurring, and show that it can be computationally much more attractive than other well-known strategies for edge preservation, while providing solutions of greater or equal quality.

11:15
A Hybrid Approach for Image Reconstruction in Electrical Impedance Tomography

ABSTRACT. Electrical impedance tomography (EIT) is an imaging method that has been gaining more popularity due to its ease of use and non-invasiveness. In EIT, the inner distribution of resistivity, corresponding to contrast in resistivity properties of different tissues, is estimated from the voltage potentials measured on the boundary of the object being imaged. In this paper, a novel hybrid method combining the analytical regularization method using mollifiers and iterative method using Iteratively Regularized Gauss-Newton (IRGN) method is proposed. A comprehensive numerical and computational comparison for EIT is presented.

11:40
CALIBRATING SENSING DRIFT IN TOMOGRAPHIC INVERSION

ABSTRACT. Scanning-probe x-ray tomography is useful for imaging nanoscale structures of a sample. The image resolution, which can reach down to 10 nm by the increased brightness and coherence of the x-ray optics, however, is highly susceptible to experimental error. Failure to address these errors can lead to a smeared image and, in the worst case, to misinterpretation of the imaged object’s structure. In this work, we present a novel optimization-based approach to calibrate a common yet challenging source of experimental error, the drifts of the scanning positions, while simultaneously reconstructing the object. This approach utilizes the coupled and complementary information from different measurements to inform a coherent system between the measurements and the reconstruction. We illustrate the proposed approach on both synthetic and real tomography images and show its superior performance compared without explicit error calibration.

12:05
A Multilevel Hyperbolic Neural Network for Image Segmentation.

ABSTRACT. Convolutional Neural Networks (CNN) have recently seen tremendous success in various computer vision tasks. However, their application to problems with high dimensional input and output, such as high-resolution image and video segmentation or 3D medical imaging, has been limited by various factors. Primarily, in the training stage, it is necessary to store network activations for back propagation. In these settings, the memory requirements associated with storing activations can exceed what is feasible with current hardware, especially for problems in 3D. Previously proposed reversible architectures allow one to recalculate activations in the backwards pass instead of storing them. For computer visions tasks, only block reversible networks have been possible because pooling operations are not reversible. Block-reversibility still requires storing a number of activations that grows with the number of blocks. Motivated by the propagation of signals over physical networks, that are governed by the hyperbolic Telegraph equation, in this work we introduce a fully conservative hyperbolic network for problems with high dimensional input and output. We introduce a coarsening operation that allows completely reversible CNNs by using the Discrete Wavelet Transform and its inverse to both coarsen and interpolate the network state and change the number of channels. This means that during training we do not need to store any of the activations from the forward pass, and can train arbitrarily deep networks. We show that fully reversible networks are able to achieve results comparable to the state of the art in image depth estimation and full 3D video segmentation, with a much lower memory footprint that is a constant independent of the network depth.

16:30-18:35 Session 4A: Preconditioning for Multiphysics Problems III. Bighorn B
16:30
Monolithic Multigrid for a Stabilized Discretization of the Poroelastic Equations

ABSTRACT. In this talk we discuss a monolithic multigrid method for a recently developed stabilized discretization of the poroelastic equations. The discretization is well-posed with respect to the physical and discretization parameters. We use geometric multigrid with finite-element interpolation operators and Galerkin coarsening, focusing on different relaxation schemes. We consider both Braess-Sarazin and Vanka relaxation, and optimize the relaxation parameters via local Fourier analysis. Numerical results agree with the theoretical predictions provided by the local Fourier analysis of the system, and provide insight into which methods work best.

16:55
Robust Preconditioners for Mixed-dimensional Models of Flow in Fractured Porous Media

ABSTRACT. Mixed-dimensional partial differential equations arise in many physical applications including flow in fractured porous media, where the fractures and their intersections form a hierarchy of lower-dimensional submanifolds. An essential component, and usually the most time-consuming part of simulating PDEs, is solving the large-scale and ill-conditioned linear systems of equations arising from discretizations. In this work, we generalize the traditional framework of designing preconditioners for the saddle point systems and develop effective preconditioners that are robust with respect to the physical and discretization parameters for mixed-dimensional models for flow in fractured porous media. Preliminary numerical experiments are presented to support the theory and demonstrate the robustness of our preconditioners.

17:20
A New Solver for Coupling Maxwell's Equations to Transmission Lines

ABSTRACT. We have developed a new self-consistent variational algorithm for coupling a 1D transmission line model to a full electromagnetic model. This algorithm enables large scale power flow simulations by reducing the computational resources required to simulate large portions of a pulsed power driver. This algorithm applies to general unstructured finite element discretizations and can be combined with other boundary conditions such as absorbing boundary conditions at the coupled boundary. This talk will outline the algorithm and present initial verification results for transmission lines coupled to electromagnetic particle-in-cell plasma simulations. We will focus specifically on our efficient technique for solving the coupled system which draws on methods developed in block preconditioning.

17:45
An iterative particle-in-cell scheme for the hybrid plasma model using physics-based preconditioning

ABSTRACT. The efficient simulation of many important multi-scale plasma physics problems requires a combination of physical model reduction and algorithmic innovation. The hybrid-PIC (particle-in-cell) approach, which uses a particle-based discretization of the kinetic ions and a grid-based fluid model for the electrons, is a promising method to simulate a large class of problems where the coupling between the macro-scale system and the ion kinetic scales is of primary importance (e.g. [1]). However, a number of algorithmic issues with the hybrid approach need to be addressed to fully enable efficient simulation of the largest systems. These issues include 1) the need to step over stiff timescales associated with electromagnetic whistler waves and ion gyration about the magnetic field, 2) the need to mitigate severe numerical errors caused by the non-conservation of momentum and energy, and 3) the need to selectively resolve fine-scale regions in space, such as current sheets and shock fronts.

Here, we present a novel hybrid algorithm [2] that addresses these issues. 1) An implicit multi-rate time-stepping approach is used to accurately integrate the fast ion gyro-motion, while allowing large time-steps to advance the electromagnetic fields. A physics-based preconditioner is developed from the fluid-moment equations and is used to efficiently step over the stiff electromagnetic timescales. 2) The new scheme is shown to give exact momentum and energy conservation and, in contrast to existing hybrid algorithms [3], is stable with respect to finite-grid instabilities caused by particle-mesh aliasing. 3) The scheme has been generalized for arbitrary curvilinear meshes to allow selective spatial resolution of fine structures. Finally, we remark upon a novel `cancellation problem' [4] that can give significant wave dispersion errors in the long-wavelength limit that is caused by the specific choice of finite-sized particles for ions and a grid-based fluid description for electrons.

[1] A. Stanier et al., Phys. Rev. Lett., 115, 175004 (2015). [2] A. Stanier, L. Chacon, G. Chen, J. Comput. Phys., 376, 597 (2019). [3] P. W. Rambo, J. Comput. Phys., 118, 152-158 (1995). [4] A. Stanier, L. Chacon, A. Le, arXiv preprint, arXiv:1912.04860 (2019).

18:10
On the robust discretization and fast solver for the H(curl) and H(div) convection-diffusion problems

ABSTRACT. In this talk, we present robust discretization and fast solver for H(curl) and H(div) convection-dominated PDEs discretized on unstructured simplicial grids. The derivation of these schemes makes use of some intrinsic properties of differential forms and in particular some crucial identities from differential geometry. Both theoretical analysis and numerical experiments show that the new upwind finite element schemes provide an accurate and robust discretization and fast solver in many applications and in particular for simulation of magnetohydrodynamics systems.

16:30-18:35 Session 4B: AMG for Nonsymmetric Problems. Bighorn C/1
16:30
Algebraic Multigrid for Highly Convective Flow Simulations

ABSTRACT. With an ever growing variety of space vehicles, the simulation of hypersonic flight gives rise to increasingly important research challenges associated with the efficient solution of highly non-symmetric systems. While the use of multigrid within sub-sonic and transonic flow simulations has been well studied, the hypersonic flow case has received little attention and is still very much an open research topic. In this talk, we investigate the application of algebraic multigrid to solve the linear systems that arise from the use of Newton's method in conjunction with adaptive pseudo-time stepping to solve steady state hypersonic flow problems. The overall approach gives rise to a series of linear systems associated with a Jacobian approximation of the nonlinear equations that is then solved by a multigrid method. Unfortunately, the application of algebraic multigrid can be problematic for highly non-symmetric systems. Specifically, the coarse grid equations produced by a standard Petrov-Galerkin approximation may not be stable, though the fine grid equations are stable. Thus, standard smoothing algorithms such as ILU and line Gauss-Seidel often diverge when applied directly to coarse level equations. Further, solutions from a coarse grid do not necessarily reduce error (as the underlying projections are not orthogonal). We discuss a few algebraic multigrid algorithms to address the above-mentioned concerns. This includes the adaptation of specialized grid transfers that attempt to mimic upwinding ideas used for highly convective flows and the use of non-Galerkin or non-Petrov-Galerkin coarse grid operators. The basic idea behind a non-Petrov-Galerkin approach is to add a perturbation to the coarse level operator produced by a standard Petrov-Galerkin projection. Two natural possibilities for this matrix perturbation include a diagonal term (motivated by considering a reduced pseudo-time step on the coarse level) as well as a projection portion of the symmetric part of the matrix (loosely motivated by artificial dissipation considerations). In both cases, the aim is to add the smallest perturbation such that the standard smoothing algorithms will converge on the resulting system. Error expressions associated with multilevel methods will be presented along with numerical results corresponding to some highly convective flow simulations.

16:55
Nonlinear and Linear Agglomeration Multigrid Methods for Reynolds-averaged Navier-Stokes Problems in Aerodynamics

ABSTRACT. We discuss three solver variants based on agglomeration multigrid methods for Reynolds-averaged Navier-Stokes with applications in aerodynamics. The first approach consists of a traditional nonlinear full-approximation storage (FAS) multigrid method. The second approach involves a linear multigrid solver which is used at each nonlinear step to solve a simplified linearized problem based on a first-order accurate Jacobian, resulting in an outer nonlinear defect-correction step. In the third approach, this first-order linear multigrid solver is used as a preconditioner for a GMRES method that solves the nonlinear problem using a Newton-Krylov approach. The performance of these three solver strategies is examined for problems of increasing difficulty, using unstructured meshes with highly anisotropic stretching in near wall regions, employing a line smoother on all levels. Test cases consist of large-scale steady-state turbulent compressible flows of industrial relevance in aerodynamics. In general, for simple problems where all three solvers achieve convergence, the Newton-Krylov solver is found to be more costly in terms of cpu time. However, the Newton-Krylov method is found to be the most robust approach and is capable of achieving convergence to machine zero on problems that fail to converge with the other two solvers. Strategies for further improving the performance of these solvers are discussed.

17:20
Incompressible Navier-Stokes with Two-Stage Gauss-Seidel Smoothers for FGMRES-AMG

ABSTRACT. In the high Re number flow regimes encountered in high-resolution wind-turbine simulations with Nalu-Wind, we have found that hybrid variants of a two-stage nested form of Gauss-Seidel iteration, Frommer and Szyld (1992), are robust and lead to lower compute times than existing hybrid (block-Jacobi) Gauss-Seidel smoothers. Adaptive multigrid for incompressible Navier-Stokes solvers was described by Drikakis et al (2000). Adaptive smoothers were first presented at Copper by Rude (1993). Wolfson-Pau and Chow (2016), (2017) have recently explored distributed Gauss-Southwell relaxation as a parallel solver and AMG smoother. The AMG preconditioner may vary at each GMRES iteration depending on the updates of residual vector elements or in the case of random or asynchronous iterations. Thus, the one-reduce Flexible GMRES recently derived by Swirydowicz et al (2020) is applied. We present higher order variants of the (two-stage) Gauss-Seidel algorithm suitable for GPUs and examine the resulting solver convergence rates with two-stage GS as the momentum preconditioner and FGMRES-AMG for pressure in Nalu-Wind benchmark simulations.

17:45
A parallel multigrid technique for nonsymmetric elliptic systems with application to variable-density flows

ABSTRACT. A geometric multigrid algorithm is introduced for solving nonsymmetric linear systems resulting from the discretization of the variable density Navier-Stokes equations on nonuniform structured rectilinear grids and high-Reynolds number flows. The restriction operation is defined such that the resulting system on the coarser grids is symmetric, thereby allowing for the use of efficient smoother algorithms. To achieve an optimal rate of convergence, the sequence of interpolation and restriction operations are determined through a dynamic procedure. A parallel partitioning strategy is introduced to minimize communication while maintaining the load balance between all processors. Testing the algorithm on systems with up to a billion unknowns shows that the cost varies linearly with the number of unknowns. This linear scaling of cost confirms the robustness of the proposed multigrid method and its suitability for multiscale simulation of high-Reynolds number turbulent flows. The robustness of our method to density variations is established by considering cases where density varies sharply in space by a factor of up to $10^4$, showing its applicability to two-phase flow problems. Strong and weak scalability studies are carried out, employing up to 30,000 processors, to examine the parallel performance of our implementation. Excellent scalability of our solver is shown for a granularity as low as $10^4$ to $10^5$ unknowns per processor. At its tested peak throughput, it solves approximately 4 billion unknowns per second employing over 16,000 processors with a parallel efficiency higher than 50\%.

18:10
Discretization of the Multi-fluid Plasma Model using IMEX and Mixed Continuous/Discontinuous FEM

ABSTRACT. Multi-fluid plasma models are used to accurately simulate the various time and spatial behaviors found in high-density plasma physics. Each plasma species is treated as a separate electromagnetically/collisionally-coupled Euler fluid interacting with Maxwell's equations, which can lead to a challenging implicit system. In this talk we discuss a numerical technique for modeling multi-fluid plasmas using mixed continuous and discontinuous Galerkin spatial discretizations and implicit-explicit time integration. Important time and velocity scales will be identified and leveraged to reduce the complexity of the assembly and preconditioning of the system. Numerical techniques, including block decomposition with multigrid applied to the continuous Schur complement, diagonal grouping of discontinuous terms, and quasi-Newton iteration with Anderson acceleration, will be discussed with the goal of a fast, scalable treatment of complex physics.

Sandia National Labs is managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a subsidiary of Honeywell International, Inc., for the U.S Dept. of Energy’s National Nuclear Security Administration under contract DE-NA0003525.

16:30-18:35 Session 4C: Nonlinear Solvers and Coupling Methods in Climate Modeling. Bighorn C/2
16:30
Initialization of the thermo-mechanical state of an ice sheet model using PDE-constrained optimization

ABSTRACT. We present an implicit thermo-mechanical solver for modeling ice sheets dynamics. The model is highly nonlinear due to the non Newtonian ice rehology and the ice phase changes. In order to estimate the initial state and unknown parameters of the ice sheet model we solve a PDE-constrained optimization problem to minimize the mismatch with observations, using the thermo-mechanincal model as a constraint. We present results related to the initialization of the Greenland ice sheet, and discuss the challenges that this multi-physics problem poses to the linear solver, the nonlinear solver and the optimization solver.

16:55
A scalable semi-implicit barotropic mode solver for the MPAS-Ocean

ABSTRACT. A scalable semi-implicit barotropic mode solver for the Model for Prediction Across Scale-Ocean (MPAS-O) has been implemented. A nonlinear barotropic system is discretized in time with the Crank-Nicolson method to build a Helmholtz-type problem. To solve this nonlinear system, barotropic thickness inside each divergence operator is replaced by lagged values from the previous time step and an outer iteration stage, where two Krylov solver iterations in the outer iteration stage are carried out to obtain up-to-date lagged values. In this stage, the coefficient matrix is reassembled using barotropic thickness from the previous time step. The coefficient matrix in the main iteration stage is reassembled once again using up-to-date barotropic thickness from the outer iteration stage. During the Krylov solver iteration, the coefficient matrix is not changed so that the system is treated as a linear system. As a Krylov subspace solver, we use the pipelined preconditioned biconjugate gradient stabilized method (BiCGSTAB), which has better parallel scalability compared to other BiCGSTAB algorithms. To accelerate the convergence of this solver, the restricted additive Schwarz (RAS) preconditioner which is a block matrix type local preconditioner is used, in which the preconditioner requires a local halo exchange before preconditioning. Before construction of the RAS preconditioning matrix, the global coefficient matrix is reordered to make a near-block diagonal form to obtain better performance from the RAS preconditioner. Several numerical experiments show that the semi-implicit barotropic mode solver has better numerical efficiency compare to an existing explicit subcycling scheme. When using 16,320 cores with 1.4M grid cells, compute time for the barotropic mode is almost three times faster than the existing explicit scheme. Moreover, it is found that the semi-implicit solver can take almost two times larger time step size without any numerical instability.

17:20
Meshless Remap of Native Fields for Earth System Models via Generalized Moving Least Squares

ABSTRACT. Coupling global Earth system models requires passing information between codes where the fields to be transferred exist on meshes that are independently discretized. The coupling takes place over the shared subdomain boundary, resulting in interface discretizations that are generally mismatched. Generalized moving least squares (GMLS) is a meshless reconstruction technique for generating a target functional from nearby neighbor information. Reconstruction of fields defined on a sphere are performed by first approximating the tangent plane for each point where a field reconstruction is requested. The tangent plane approximation is then used as a local coordinate system on which sampling functionals are evaluated acting on a basis. After solving a least squares problem, the field is then reconstructed in the local coordinate system, and transformed to a the global coordinate system, if necessary. The Compadre toolkit is used for the implementation of all described techniques. Numerical results will be presented demonstrating remap of several fields on grids where the degrees of freedom represent cell-averaged quantities, measuring both accuracy and conservation.

17:45
Interface Flux Recovery Coupling Method for the Ocean-Atmosphere System

ABSTRACT. The simplest atmosphere-ocean coupling schemes transfer fluxes between models in an explicit fashion. This can be formally shown to be equivalent to a single iteration of an iterative method. In general, this type of coupling leads to stability restrictions. Possible remedies are to solve the monolithic system in a coupled fashion or to sacrifice conservation of flux by mixing data at different time-steps to achieve implicitness in the method. In this talk, we present a coupling method that constructs the flux from the fully discrete monolithic system for the so-called bulk condition that is common in the atmosphere-ocean coupling. However, the method constructs the flux in an implicit fashion while decoupling the models. We refer to this method as the Bulk-Interface-Flux-Recovery method (Bulk-IFR). The bulk condition leads uniquely different method than the method which inspired the Bulk-IFR method, the Implicit-Value-Recovery method (IVR). The IVR method applied to the simpler transmission conditions leads to a Lagrange multiplier method for monolithic models whose semi-discrete in space formulations can be reduced to Hessenberg index-1 differential algebraic equations. On the other hand, the presence of the bulk condition leads to a lifted system with an auxiliary variable. The resulting formulation allows the Bulk-IFR method to be implemented with implicit or explicit time-steps which is not the case in the original IVR method. In the Bulk-IFR method, the two model equations and the interface conditions are combined into a monolithic system. A Schur complement is then used to decouple the system. We will show that this allows us to find the flux at the current time-step while still solving the individual models in a decoupled fashion. The method is applied to a simplified atmosphere-ocean model of heat transfer with implicit and explicit time stepping. This results in a partitioned coupling algorithm. Accuracy, stability, and conservation results are presented for the Bulk-IFR method applied to the simplified atmosphere-ocean heat transfer test case.

18:10
Overcoming Gibbs Phenomenon in Data Remap and Convection-Dominant PDEs using WLS-ENO Techniques

ABSTRACT. The Gibbs phenomenon (or spurious oscillations) often occur at discontinuities or under-resolved regions in various applications, such as convection-dominant partial differential equations (PDEs) and data remap in multiphysics problems or domain-decomposition methods. The resolution of the Gibbs phenomenon has been a long-term active research area, and it is particularly challenging in the context of high-order methods. Some commonly used regularization techniques, such as low-order limiters, often compromise the accuracy of the solution, while others, such as streamline diffusion, are highly specialized and difficult to generalize. In this work, we introduce a novel indicator to identify discontinuities and then use Weighted-Least-Squares-based Essentially Non-Oscillatory (WLS-ENO) techniques to resolve the Gibbs phenomenon in the context of data remap between surfaces and in convection-dominant PDEs. Analogous to some mollification techniques for Gibbs phenomenon, WLS-ENO adaptively replaces the basis functions near detected discontinuities using polynomial approximations constructed from weighted-least-squares (WLS) approximations, where the weights are adapted based on the smoothness of the solutions. We demonstrate the effectiveness of our proposed techniques in the context of repeated date transfer between meshes and in solving stationary convection-diffusion equations even at the hyperbolic limit.