previous day
all days

View: session overviewtalk overview

07:00-08:15Breakfast [Copper Conference Center]
08:00-10:05 Session 14A: Preconditioning II [Bighorn C]
Fast High-Order Finite Element Solvers on Simplices
PRESENTER: Pablo Brubeck

ABSTRACT. We present new high-order finite elements discretizing the $L^2$ de Rham complex on triangular and tetrahedral meshes. The finite elements discretize the same spaces as usual, but with different basis functions. They allow for fast linear solvers based on static condensation and space decomposition methods.

The new elements build upon the definition of degrees of freedom given by (Demkowicz et al., De Rham diagram for $hp$ finite element spaces. Comput.~Math.~Appl., 39(7-8):29--38, 2000.), and consist of integral moments on a symmetric reference simplex with respect to a numerically computed polynomial basis that is orthogonal in both the $L^2$- and $H(\mathrm{d})$-inner products ($\mathrm{d} \in \{\mathrm{grad}, \mathrm{curl}, \mathrm{div}\}$).

On the reference symmetric simplex, the resulting stiffness matrix has diagonal interior block, and does not couple together the interior and interface degrees of freedom. Thus, on the reference simplex, the Schur complement resulting from elimination of interior degrees of freedom is simply the interface block itself.

This sparsity is not preserved on arbitrary cells mapped from the reference cell. Nevertheless, the interior-interface coupling is weak because it is only induced by the geometric transformation. We devise a preconditioning strategy by neglecting the interior-interface coupling. We precondition the interface Schur complement with the interface block, and simply apply point-Jacobi to precondition the interior block.

The combination of this approach with a space decomposition method on small subdomains constructed around vertices, edges, and faces allows us to efficiently solve the canonical Riesz maps in $H^1$, $H(\mathrm{curl})$, and $H(\mathrm{div})$, at very high order. We empirically demonstrate iteration counts that are robust with respect to the polynomial degree.

Achieving h- and p-robust monolithic multigrid solvers for saddle-point systems
PRESENTER: Scott MacLachlan

ABSTRACT. The numerical analysis of higher-order mixed finite-element discretizations for saddle-point problems, such as the Stokes equations, has been well-studied in recent years. While the theory and practice of such discretizations is now well-understood, the same cannot be said for efficient preconditioners for solving the resulting linear (or linearized) systems of equations. In this work, we propose and study variants of the well-known Vanka relaxation schemes that lead to effective geometric multigrid preconditioners for various higher-order discretizations of the stationary Stokes and Navier-Stokes equations. Numerical results demonstrate robust performance with respect to FGMRES iteration counts for increasing polynomial order for some of the considered discretizations, and expose several open questions in this area.

Redundant equation methods for stiff wave motions

ABSTRACT. There are a number of areas in applied partial differential equations for time-dependent systems, in which the stiffness can be expressed in terms of a subsystem extracted from the full system using a normal-mode decomposition of the primary dependent variables. Such systems include compressible fluid dynamics at low Mach numbers, fast gravity wave dynamics in atmospheric fluid dynamics, and fast magnetosonic waves in MHD. It is often the case that the stiff subsystem has a structure that, when discretized implicitly, leads to manifestly well-behaved linear systems, with a much smaller number of degrees of freedom than that of the full system. In this work we describe a class of methods for discretizing the full system in a way that the time step is not constrained by the fast scales, but which represents long-wavelength motions associated with the fast scales correctly. It is based on augmenting the the original system of PDEs with the PDEs describing the stiff subsystem obtained from the normal-mode analysis, with the right-hand side of the original equations depending on the stiff variables in an upper-triangular fashion. The resulting system is equivalent to the original system of equations combined with a constraint on the initial values the says that the redundant variables corresponding to the stiff subsystem are consistent with the initial data for the original system. For example, for low-Mach number compressible flows, the redundant equations are an evolution equation for the curl-free component of the velocity field, combined with the classical equation for the evolution of the pressure field. The latter is redundant to the evolution equations for the density and energy, via the equation of state. The augmented system is discretized in time using an IMEX method, with only the unknowns from the redundant variables treated implicitly. This leads to a method for which the CFL time step constraint for the stiff wave motions is eliminated.

We demonstrate this approach for the case of low-Mach number compressible flow, using a fourth-order additive Runge-Kutta method for the time discretization, and fourth-order finite-volume discretizations in space on block-structured adaptive grids. This leads to an implicit system of equations for the redundant stiff variables that can be expressed in terms of the solution of a symmetric positive-definite Helmholtz equation for the pressure, along with a constant-coefficient Poisson solver for computing the Hodge decomposition of the right-hand side of the velocity equation side into divergence-free and curl-free parts. For these solvers, we use FAS geometric multigrid methods. Example calculations include the evolution of viscous reacting flows in closed containers, for which we recover the classical zero-Mach-number limiting behavior for such solutions. We also compute the generation of long-wavelength sound waves generated by a localized vorticity distribution on a block-structured refined mesh. the time step is controlled by the advective CFL condition; in particular, the stiff acoustic CFL condition does not constrain the time step. We also recover fourth-order accuracy for the non-stiff component of the evolution, as well as for the long-wavelength stiff waves.

Implicit time and rank adaptive method for time-dependent PDEs
PRESENTER: Yingda Cheng

ABSTRACT. In this work, we develop implicit time- and rank-adaptive schemes for stiff time-dependent matrix differential equations. The idea of dynamic low rank approximation (DLRA) is to capture the dynamic evolution of the low rank matrices by a numerical procedure. Our scheme is based on a three-step procedure similar to the unconventional robust integrator (BUG integrator). First, a prediction step is made computing the approximate column and row spaces at the next time level. Second, a Galerkin evolution step is invoked using a base implicit solve for the small core matrix. Finally, a truncation is made according to the error threshold. However, compared with the BUG integrator, which has convergence issues for some PDEs with cross terms, our scheme has two important features which enhance the robustness for convergence. We use the row and column space s from the explicit step truncation method in the prediction step to alleviate the tangent projection error associated with the dynamic low rank approximation. In addition, a time adaptive strategy is proposed in conjunction with rank adaptivity. We benchmark the scheme in several tests such as anisotropic diffusion to show robust convergence properties.

08:00-10:05 Session 14B: Machine Learning / Data Science Algorithms and Applications [Bighorn B]
Acceleration methods for scientific and data science applications

ABSTRACT. There has been a surge of interest in recent years in general-purpose `acceleration' methods that take a sequence of vectors converging to the limit of a fixed point iteration, and produce from it a faster converging sequence. In this talk, we will discuss a new class of nonlinear acceleration algorithms based on extending conjugate residual-type procedures from linear to nonlinear equations. The main algorithm has strong similarities with Anderson acceleration as well as with inexact Newton methods- depending on which variant is implemented. We will demonstrate the efficiency of the proposed method on a variety of problems from simulation experiments to deep learning applications. This is joint work with Yousef Saad, Huan He, Ziyuan Tang and Shifan Zhao.

Efficient Hybrid Spatial-Temporal Operator Learning
PRESENTER: Francesco Brarda

ABSTRACT. Recent advancements in operator-type neural networks, such as Fourier Neural Operator (FNO) and Deep Operator Network (DeepONet), have shown promising results in approximating the solutions of parametric Partial Differential Equations (PDEs). However, these neural networks often entail considerable training expenses, and may not always achieve the desired accuracy required in many scientific and engineering disciplines. In this paper, we propose a new operator learning framework to address these issues. The proposed paradigm leverages the traditional wisdom from numerical PDE theory and techniques to refine the pipeline of existing operator neural networks. Specifically, our architecture initiates the training for a single or a few epochs for the operator-type neural networks in consideration, concluding with the freezing of its parameters. The latter are then fed into our error correction scheme: a single parametrized linear spectral layer trained with a convex loss function defined through a functional-type a posteriori error estimator. The computation of the error functional is facilitated by the spatial-spectral correspondence using negative Sobolev norms. This design allows the operator neural networks to effectively tackle low-frequency errors, while the added linear layer addresses high-frequency errors. Numerical experiments on a commonly used benchmark of 2D Navier-Stokes equations demonstrate significant improvements in both computational time and accuracy, compared to existing FNO variants and traditional numerical approaches.

Fast Solvers for Neural Network Least-Squares Approximations

ABSTRACT. Neural networks provide an effective way to approximate functions, especially for some challenging situations with discontinuities, large variations, and sharp transitions. In our recent development of a novel block Gauss-Newton method for least-squares approximations via ReLU shallow neural networks, some dense linear systems arise in the iterations for finding some linear and nonlinear parameters. The coefficient matrices are shown to be symmetric and positive definite. We can further show that they are highly ill conditioned, and the condition numbers get even worse for some challenging function approximations. The ill-conditioned dense linear systems are thus difficult to solve by traditional direct and iterative solvers. On the other hand, we prove that the matrices have some intersting features that we can explore so that the systems can be solved efficiently and accurately. This is joint work with Zhiqiang Cai, Tong Ding, Min Liu, and Xinyu Liu.

Smoothing Gradient Descent (SmoothGD)
PRESENTER: Andrew Starnes

ABSTRACT. In this talk, we discuss recent work on the convergence of a family of smoothing-based gradient descent methods (SmoothGD) that have been applied to unconstrained optimization problems. SmoothGD uses the gradient of the convolution between the target function and a probability distribution (e.g., normal or compactly supported distributions) and then iteratively changes the variance. Intrinsically, this convolution smoothes out noise and small oscillations, which allows SmoothGD to avoid local extrema that would typically trap traditional gradient descent. This approach can also be applied to stochastic gradient descent giving SmoothSGD. In addition to the smoothing benefits, these methods have the practical benefit that they can be used as a gradient-free or zero-order optimization method.

We present convergence rates for L-smooth functions for both SmoothGD and SmoothSGD, which explicitly depend on how the variance changes between steps. We also examine how the choice of distribution impacts these iterative methods. Numerical examples on machine learning problems and optimization test functions will be provided to complement the theoretical results.

Structure Optimization and Compression Methods for Finite Element Operators
PRESENTER: Graham Harper

ABSTRACT. As computing technology moves toward utilizing accelerators with little onboard memory, traditional finite element methods (FEMs) must be adapted in order to fit extreme-scale simulations on fixed resources. One of the main challenges is evaluating and storing basis values for extreme numbers of basis functions. Methods such as sum factorization handle the polynomial degree scaling of those costs, but depend on geometry and choice of finite element basis. As an alternative, we introduce an approach that compresses the physical FEM basis values using measures of mesh cell similarity. Our approach may be used separately or complementary to sum factorization methods. Our compressibility metrics involve Jacobians of the reference-to-physical cell mappings, allowing for broad applicability to simplicial or tensor-product geometry and many finite element bases. When two different mesh cells are similar in shape, the reference-to-physical Jacobians are similar; therefore, the FEM basis values are also similar, and thus do not need to be re-evaluated. The errors made by this approximation may be characterized by considering the transforms specified by de Rham complex. Compressing these basis values allows for greater throughput in evaluating function values while reducing the memory overhead of the simulation. We further enhance the compressibility of the operator by performing r-adaptive mesh optimization offline using an iterative trust-region augmented Lagrangian sequential quadratic programming (ALESQP) method with cell volume constraints, which requires efficiently solving a saddle-point system many times per iteration. This approach perturbs mesh nodes to reduce the number of similarity classes, which increases compression. To validate the theory, we present a variety of results related to the approximation quality, mesh optimization, performance metrics, and how one may use this approach to simulate 1 billion mesh cells on a single node of a typical supercomputer.

10:25-12:30 Session 15: Robust & Scalable Iterative Solution of Multi-physics Problems [Bighorn B]
Space-time parallel iterative solvers for the integration of parabolic problems

ABSTRACT. Due to the limitations of sequential computing, parallelization has become a strong tool to integrate evolutionary differential problems. In this context, we present a new family of space-time parallel iterative solvers suited for the integration of parabolic problems. These methods are based on the combination of the well-known parallel-in-time parareal algorithm (cf. [4]) and suitable splitting techniques that permit us to perform space parallelization without any additional iteration.

The parareal algorithm is a rather simple iterative method based on two propagators, the coarse and fine propagators. The former is cheap and inaccurate, whereas the latter is expensive but accurate. This last propagator is computed in parallel by the algorithm.

On the other hand, we consider two types of splittings, i.e., dimensional (cf. [3]) and domain decomposition (cf. [5]) splittings. The resulting partitioned problems are solved using first-order splitting time integrators, that is, the fractional implicit Euler scheme (see [3]) and the Douglas-Rachford method (see [1]). The main contribution of this work is to consider these integrators as the propagators involved in the parareal algorithm, enabling both space and time parallelization. This results in an optimal use of the available processors.

We study the main stability and convergence properties of the designed methods, considering the convergence factor defined in [2], and we analyze the influence of L-stability of the propagators considered by the parareal algorithm. In this regard, we remark the requirement that the coarse propagator must be L-stable in order to satisfy convergence and stability properties. Moreover, the L-stability of the fine propagator yields faster convergence of the method in the case of having problems with large real negative eigenvalues. Finally, we perform numerical experiments to illustrate our theoretical analysis.


[1] J. Douglas Jr. and H. H. Rachford, On the numerical solution of heat conduction problems in two and three space variables, Trans. Amer. Math. Soc., 82 (1956), pp. 421--439.

[2] M. J. Gander and S. Vanderwalle, Analysis of the parareal time-parallel time-integration method, SIAM J. Sci. Comput., 29 (2007), pp. 556--578.

[3] W. Hundsdorfer and J. Verwer, Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations, vol. 33 of Springer Ser. Comput. Maths., Springer-Verlag, Berlin/Heidelberg, 2003.

[4] J.L. Lions, Y. Maday, and G. Turinici, Résolution d'EDP par un schéma en temps ``pararéel", C. R. Acad. Sci. (1) - Math., 332 (2001), pp. 661--668.

[5] T. P. Mathew, P. L. Polyakov, G. Russo, and J. Wang, Domain decomposition operator splittings for the solution of parabolic equations, SIAM J. Sci. Comput., 19 (1998), pp. 912--932.

A new decoupled solution method for a novel oscillation-free numerical scheme for Biot's model
PRESENTER: Carmen Rodrigo

ABSTRACT. Single-phase flow problems on deformable porous media are modelled by means of the so-called Biot's model. Several challenges appear in the numerical solution of this model. On the one hand, it is important to choose appropriate discretization schemes that avoid the appearance of spurious oscillations in the pressure approximation when low permeabilities and/or small time steps are considered. On the other hand, the efficient solution of the large-sparse systems of equations arising after discretization also is challenging. In this work, for different finite element discretizations of Biot's model, we propose a new stabilized scheme that provides numerical solutions that are free of non-physical oscillations, and that, at the same time, allows us to iterate the fluid and mechanic problems in a convergent way to obtain the solution of the whole coupled system. We present numerical results illustrating the robust behavior of both the stabilization and iterative solver with respect to the physical and discretization parameters of the model.

Parallel Compact High-Resolution GMRES-PFFT-Type Algorithms for 3D Stochastic Subsurface Scattering Problems.
PRESENTER: Yury Gryazin

ABSTRACT. In this talk, we present efficient parallel compact high-resolution algorithms for the solution of a stochastic subsurface electromagnetic scattering problem. The methods are based on a combination of the GMRES method and a partial FFT-type (PFFT) lower order deterministic preconditioner. The four and six-order compact finite difference schemes are considered for the solution of the 3D stochastic Helmholtz equation. The PFFT preconditioner utilized the fast implementation of low-dimensional eigenvectors solvers to efficiently implement the developed iterative approach.

The convergence analysis of the proposed iterative method for the stochastic scattering problem is analyzed based on an extension of the approaches proposed by authors in previous publications to analyze the deterministic subsurface solvers. The complexity and scalability of the methods are analyzed on scattering problems with realistic ranges of parameters in soil and mine-like targets.

Improving Greedy Algorithms for Rational Approximation
PRESENTER: Zhongqin Xue

ABSTRACT. When developing robust preconditioners for multiphysics problems, fractional functions of the Laplace operator often arise and need to be inverted. Rational approximation in the uniform norm can be used to convert inverting those fractional operators into inverting a series of shifted Laplace operators. A desired property is that the poles of the approximation be negative so that the shifted Laplace operators remain symmetric positive definite, making them better conditioned. In [Li et al., arXiv:2305.18038], an orthogonal greedy algorithm (OGA) was proposed. Although the approach produces negative poles, the OGA provides rational approximation in the L2 norm. In this work, we address this issue by studying two greedy algorithms for finding rational approximations to such operators, allowing for efficient solution methods for these applications. The first algorithm improves OGA by adding one minimization step in the uniform norm to the procedure. The second approach employs the weak Chebyshev greedy algorithm (WCGA) in the uniform norm, which converges monotonely. We present analysis and numerical results demonstrating that both proposed algorithms ensure the poles are negative while providing rational approximation in the uniform norm. Furthermore, our approach can be extended to other approximation problems by expanding the dictionary used in the algorithm, demonstrating potential flexibility and applicability.

A Multi-Tiered Iterative Algorithm for Determining the Rotational Diffusion of Particle Systems from the Temporal-Angular Cross-Correlation of Their Power Spectrum

ABSTRACT. Investigating particle Brownian motion is critical for understanding the dynamic properties of particles in many fields like molecular biology and materials science. Such motion involves two main movements: translational and rotational diffusion. An experiment known as X-ray Photon Correlation Spectroscopy (XPCS) enables the analysis of these particle system dynamics by computing the correlation function of a time series of X-ray images. While traditional XPCS analysis techniques can extract translational diffusion from correlations in time, they are unable to estimate rotational diffusion.

In this presentation, we introduce the first methodology for estimating rotational diffusion from a time series of XPCS images, each of which can be viewed as the power spectrum of the particle system’s density. Our method is based on computing the angular-temporal cross-correlations of the images, yielding a 4-way data tensor that can encode the rotational motion of the system. However, recovering the rotational diffusion from this tensor is challenging because the images are dominated by noise, and it requires solving a very high-dimensional nonconvex inverse problem.

To overcome these challenges, we develop a new algorithmic framework called Multi-Tiered Estimation for Correlation Spectroscopy (MTECS) for the determination of the rotational diffusion coefficient of the system from XPCS images. MTECS decomposes the problem into a series of constrained subproblems that are either high-dimensional and convex or low-dimensional and nonconvex. For each subproblem, we develop a new mathematical operator to project the physical parameters of the system to the closest solution values. By applying these operators in an iterative scheme, MTECS is able to converge to a set of parameters that are consistent with the underlying mathematical model, allowing us to reduce the dimensionality of data representation, filter out the experimental noise, and extract the rotational diffusion coefficient.

We demonstrate the efficacy of MTECS in determining rotational diffusion from XPCS data by applying it to simulated X-ray data of a 2D system of dumbells and 3D systems of crossing nanotubes and proteins that evolve under translational and rotational Brownian motion for a wide range of different diffusion rates and noise levels. In particular, our results show that MTECS is able to determine rotational diffusion coefficients within a few percent error.