PETSC '18: PETSC USER MEETING 2018
PROGRAM FOR WEDNESDAY, JUNE 6TH

09:30-10:15 Session 12: Invited III
09:30
How ETH Seismology and Wave Physics group employs PETSc (invited talk)
SPEAKER: Vaclav Hapla

ABSTRACT. Measurements of mechanical waves traveling through a medium can be used to reveal the subsurface and interior structure of unknown objects. This has plentiful applications ranging from medical imaging at millimeter scale to seismic tomography at the planetary scale. However, solving these problems is challenging from both a mathematical and computational perspective, and scalable simulation tools are key to enabling scientific progress.

The Seismology and Wave Physics group at ETH Zurich, which I recently joined as a postdoc, deals with this problem domain. The group's flagship software is Salvus, a suite for full waveform modeling and inversion. It uses PETSc's DMPlex module for mesh representation and mesh operations. DMPlex represents a mesh as a graph whose vertices uniformly represent cells, faces, edges, and nodes. Discretization methods can therefore be used unchanged for meshes of different shapes and dimensions. This flexibility allows Salvus to be applied to physically realistic domains.
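As a rough illustration of the graph representation described above, the following Python sketch stores a two-triangle mesh as a single graph in which cells, edges, and vertices are all plain integer points. It is a toy structure, not the DMPlex API; the dictionary `cone` and the helpers `support` and `closure` are hypothetical names borrowed from DMPlex terminology.

```python
# Hypothetical toy mesh: two triangles sharing one edge.
# Every mesh point is a graph node; cone[p] lists the points on p's boundary.
cone = {
    0: [2, 3, 4],                        # cell 0 is bounded by edges 2, 3, 4
    1: [4, 5, 6],                        # cell 1 shares edge 4 with cell 0
    2: [7, 8], 3: [8, 9], 4: [9, 7],     # edges of cell 0 -> their vertices
    5: [7, 10], 6: [10, 9],              # remaining edges of cell 1
    7: [], 8: [], 9: [], 10: [],         # vertices have empty cones
}

def support(point):
    """Points whose cone contains `point` (the reversed graph edges)."""
    return sorted(p for p, boundary in cone.items() if point in boundary)

def closure(point):
    """`point` plus everything reachable from it by following cones."""
    seen, stack = set(), [point]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(cone[p])
    return sorted(seen)

cell_closure = closure(0)     # the cell, its three edges, and its three vertices
edge_neighbours = support(4)  # the cells on either side of the shared edge
```

Because the queries only follow graph edges, the same two functions would work unchanged for quadrilaterals, tetrahedra, or mixed meshes, which is the dimension-independence the abstract refers to.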

In 2018, the NASA InSight mission will place a highly sensitive broadband seismometer on Mars' surface to investigate its deep interior structure. Elastic wave propagation simulations are a crucial component for data interpretation. Hence, it is a relevant challenge for Salvus. At the highest frequencies that are typically used, the resulting system has trillions of spatial degrees of freedom and requires hundreds of thousands of time steps. For such large simulations, loading the whole mesh onto a single processor must be avoided. Hence, I recently adopted parallel I/O, partitioning, and load-balancing techniques in Salvus in order to work with a distributed DMPlex representation throughout the whole simulation.

Another interesting application of PETSc, which I recently became involved in, is an inversion solver for image reconstruction in Ultrasound Computed Tomography (USCT) for early breast cancer detection. USCT is a non-invasive, radiation- and pressure-free technique that uses both transmitted and reflected signals to create images of the soft tissue's acoustic properties. These images are particularly useful for characterizing interior breast tissue and differentiating between benign and malignant lesions. The acoustic properties of the breast tissue are reconstructed using ray-based tomography techniques, which can result in a much quicker time-to-solution than that available with full-waveform techniques. The resulting problem is a linear least-squares problem with a large sparse rectangular matrix.
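The abstract does not say which solver is used for this least-squares problem. As one common option, the sketch below applies CGLS (conjugate gradients on the normal equations, without ever forming the product of the transposed matrix with the matrix) to a tiny sparse rectangular system stored as (row, column, value) triplets. The matrix, the right-hand side, and all function names are hypothetical.

```python
def matvec(triplets, x, nrows):
    """y = A x for a sparse matrix given as (row, col, value) triplets."""
    y = [0.0] * nrows
    for i, j, v in triplets:
        y[i] += v * x[j]
    return y

def rmatvec(triplets, y, ncols):
    """x = A^T y for the same triplet storage."""
    x = [0.0] * ncols
    for i, j, v in triplets:
        x[j] += v * y[i]
    return x

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cgls(triplets, b, ncols, max_iter=50, rtol=1e-20):
    """Minimise ||A x - b||_2 using only products with A and A^T."""
    x = [0.0] * ncols
    r = list(b)                        # residual b - A x for the zero start
    s = rmatvec(triplets, r, ncols)    # normal-equations residual A^T r
    p = list(s)
    gamma = gamma0 = dot(s, s)
    for _ in range(max_iter):
        if gamma <= rtol * gamma0:     # ||A^T r||^2 has dropped far enough
            break
        q = matvec(triplets, p, len(b))
        alpha = gamma / dot(q, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(triplets, r, ncols)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x

# Hypothetical 3x2 system: rows (1, 0), (0, 1), (1, 1) with an inconsistent b.
A_triplets = [(0, 0, 1.0), (1, 1, 1.0), (2, 0, 1.0), (2, 1, 1.0)]
b = [1.0, 2.0, 4.0]
x = cgls(A_triplets, b, ncols=2)       # least-squares solution is (4/3, 7/3)
```

Working only through products with the matrix and its transpose keeps the sparsity intact, which matters when the rectangular matrix is large.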

10:15-11:00 Coffee
11:00-12:20 Session 13: Preconditioners
11:00
Matrix-free multigrid preconditioners for radiation transport

ABSTRACT. Simulating the distribution of radiation (neutrons/photons) in nuclear reactors is very challenging. One of the main difficulties is due to the dimensionality of the (linear) Boltzmann transport equation (BTE) that must be solved; this equation has three spatial, two angular, one time and one energy dimension (7D). Different material properties also mean the problem has both strongly hyperbolic and diffusive regions.

Simply solving the linear BTE with deterministic technology for a whole reactor core is a “grand-challenge” problem that cannot yet be run on the world's largest supercomputers. This also neglects the difficulties of coupling to fully turbulent (multi-phase) fluid flow, heat transfer, radiolytic gas bubble generation, heat-based deformation of materials, etc.

Traditionally, deterministic solver technology and the space/angle discretisations are intimately linked; sweep-based (wavefront) methods are typically used with DG FEM in space and Sn in angle to solve the BTE. These parallelise well (scaling to >100,000 cores) on structured grids; however, achieving good scaling on unstructured grids is still an open problem.

This talk will focus on alternate space/angle discretisations we have been developing within the Applied Modelling and Computation Group (AMCG) at Imperial College. These discretisations are intimately linked to the matrix-free multigrid technology we have built, allowing the possibility of good parallel scaling on unstructured grids.

11:20
PDEs: they should be the solver's problem

ABSTRACT. Many optimal solvers for PDEs require access to auxiliary operators, or compositions thereof, over and above what is easily offered by PETSc's Amat, Pmat interface. Although it is possible to set things up "by hand", doing so is tricky, error-prone, and requires code changes whenever we want to compare the performance of different solver options. This is especially the case when we wish to provide problem-specific data deep in some nested solver.

In this talk, I will describe how we address some of these problems in Firedrake, by augmenting operators (and hence preconditioners) with the ability to provide auxiliary operators as needed. By more tightly coupling the PDE library with the linear algebra, we can make solvers problem- and discretisation-aware.

Recently, we have taken this approach to develop a very flexible framework for domain-decomposition preconditioning, utilising DMPlex to define topological patches, and the auxiliary information to provide operator assembly. Most of this is not specific to Firedrake, and so the question naturally arises as to how to develop discretisation- and problem-aware preconditioning infrastructure that can live in PETSc, yet be usable by the plethora of PDE libraries in the wider community. I do not have the answer to this question, but am hopeful of useful discussion.

11:40
New Coarse Corrections for Optimized Restricted Additive Schwarz Using PETSc.

ABSTRACT. Additive Schwarz Methods (ASM) are implemented in PETSc's PCASM preconditioner. By default, however, PCASM applies the Restricted Additive Schwarz (RAS) method because of its better convergence behavior. We present here two further improvements for this method: a new and more effective coarse correction, as well as optimized transmission conditions, resulting in an Optimized 2-level Restricted Additive Schwarz method.
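To make the one-level RAS building block concrete, here is a minimal Python sketch of RAS used as a stationary iteration on a 1D Laplace problem with two overlapping subdomains. It is a toy model under stated assumptions, not the authors' implementation and not the PCASM code: the matrix is the unscaled tridiagonal stencil (-1, 2, -1), the subdomain and ownership ranges are hypothetical, and neither the coarse correction nor the Robin modification is included.

```python
def solve_tridiag_laplace(rhs):
    """Solve T e = rhs for T = tridiag(-1, 2, -1) with the Thomas algorithm."""
    m = len(rhs)
    c = [0.0] * m                       # modified super-diagonal
    d = [0.0] * m                       # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]          # b_i - a_i * c'_{i-1} with a_i = -1
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    e = [0.0] * m
    e[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        e[i] = d[i] - c[i] * e[i + 1]
    return e

def residual(x, b):
    """r = b - A x for the global matrix A = tridiag(-1, 2, -1)."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * x[i] - left - right))
    return r

def ras_iterate(b, subdomains, owned, iterations):
    """Stationary RAS: solve on overlapping blocks, keep only owned entries."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        r = residual(x, b)              # one residual shared by all subdomains
        for (lo, hi), (own_lo, own_hi) in zip(subdomains, owned):
            e = solve_tridiag_laplace(r[lo:hi])   # local Dirichlet solve
            for i in range(own_lo, own_hi):       # restricted prolongation
                x[i] += e[i - lo]
    return x

# Hypothetical setup: 9 unknowns, subdomains [0, 6) and [3, 9) overlap on 3..5,
# while the ownership ranges [0, 5) and [5, 9) form a non-overlapping partition.
b = [1.0] * 9
x = ras_iterate(b, subdomains=[(0, 6), (3, 9)], owned=[(0, 5), (5, 9)], iterations=60)
```

The only difference from classical additive Schwarz is the ownership loop: corrections computed in the overlap are discarded unless the subdomain owns that entry, so no value is added twice.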

It is well known that domain decomposition methods applied to elliptic problems need a coarse correction to be scalable, since without it, information is only transferred from each subdomain to its direct neighbors which makes the number of iterations grow with the number of subdomains. Scalability is achieved by introducing a coarse grid on which a reduced-size calculation is performed, yielding a coarse correction at each iteration of the solution process. Such a 2-level method permits global propagation of the iterative corrections throughout the entire domain, leading to the scalability of the method. Many choices for the coarse grid point locations are possible and all lead to scalable methods, provided coarse grid points lie in each of the subdomains. However, a good choice of coarse grid point locations can lead to much faster methods. We follow here the method introduced in [M.J. Gander, L. Halpern and K. Santugini, A New Coarse Grid Correction for RAS/AS, Domain Decomposition Methods in Science and Engineering XXI, LNCSE, Springer-Verlag, 2014]. The coarse grid points are placed in the overlap and chosen in 1D to be the extreme grid points of the non-overlapping subdomains used to define RAS. Similarly, for a rectangular decomposition in 2D, four coarse grid points are placed around each cross point of the non-overlapping decomposition of RAS. This choice is based on approximating what is called an optimal coarse space which leads to convergence of the 2-level method in a finite number of iterations. Our choice of placing the coarse grid nodes leads to substantially faster convergence than the classical option of equally distributing the coarse grid points within each subdomain.

Optimized transmission conditions stem from a similar idea, namely, to approximate what are called optimal transmission conditions which also leads to convergence of the domain decomposition method in a finite number of iterations. Since these optimal transmission conditions are non-local, in practice one uses local approximations. For RAS we consider here Robin transmission conditions instead of the classical Dirichlet ones, i.e. a well-chosen combination of Dirichlet and Neumann values at subdomain interfaces. A good choice of the Robin coefficient representing the relative weight of Dirichlet and Neumann values permits minimizing the number of iterations, which led to the name Optimized Schwarz Methods. We follow here the method described in [O. Dubois, M.J. Gander, S. Loisel, A. St-Cyr, D.B. Szyld, The Optimized Schwarz Methods with a Coarse Grid Correction, SIAM J. Sci. Comp., vol 34(1), 2012] which only requires modifying the diagonal entries of interface nodes in the subdomain matrices. Again, a good choice of these diagonal entries, for which closed form formulas are available based on the mesh size and the problem parameters, leads to much faster convergence of the associated domain decomposition method than using the standard diagonal entries from RAS.

Our implementation is based on PCASM and, additionally, uses preconditioner composition for the coarse correction (the PCASM is multiplicatively composed with a self-defined PCSHELL implementing the coarse correction) and submatrix modification (PCSetModifySubMatrices) for optimized Robin coefficients. We combine these two improvements and apply them to a 2D Laplace test case up to 16384 cores. We obtain substantially improved computation times with this new optimized 2-level RAS method which, despite a larger memory footprint, proves to be competitive with the multigrid library HYPRE (with the default options of the PETSc interface to this library).

12:00-13:30 Lunch
13:30-14:15 Session 14: Invited IV
13:30
pTatin3D, an example of problem-driven new development in the PETSc library

ABSTRACT. Over the last ten years, the speed and robustness of linear solvers for Stokes flow with a free surface have brought many new concepts into the earth sciences community. pTatin3D, one of these 3D Stokes solvers, is devoted to problems in geodynamics; it relies heavily on the PETSc libraries and encompasses many of these recent developments.

Problems in the earth sciences are special because they require capturing large deformations over millions of years, with large variations in coefficients, together with an accurate description of the topography.

Using the development of pTatin3D as an example, I will illustrate how improvements to the code that were initially problem driven turned out to be implemented in a sufficiently general manner to be contributed back to PETSc. This benefits the community, and also us, since we no longer need to maintain or upgrade these parts of the code ourselves.

I will then show how being able to simulate the Earth in 3D has improved our understanding of the deformation of the first 150 km of the Earth, before discussing the remaining numerical issues that need to be addressed to further our understanding of the Earth.

14:15-14:55 Session 15: Performance
14:15
A performance spectrum model based on the Time-Accuracy-Size (TAS) analysis
SPEAKER: Justin Chang

ABSTRACT. We present a performance analysis appropriate for comparing algorithms using different numerical discretizations. By taking into account the total time-to-solution, numerical accuracy with respect to an error norm, and the problem size, a cost-benefit analysis can be performed to determine which algorithm/solver and numerical discretization are particularly suited for an application. This work extends the performance spectrum model in Chang et al. (2017) for interpretation of hardware and algorithmic tradeoffs in numerical PDE simulation. As a proof-of-concept, we first compare CG and DG methods provided by the PETSc and Firedrake Project libraries for solving Poisson's equation. Then, we extend this analysis to some mixed formulation approaches for dual permeability/porosity flow through a porous medium. It turns out that this TAS spectrum analysis is indeed necessary in order to perform any comparative finite element studies.
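One way to picture how time, accuracy, and size can be combined is sketched below in Python. The metric definitions (logarithmic "digits" of size, accuracy, and accuracy per unit time) are assumptions chosen for illustration and are not necessarily the quantities used in the talk; the two runs and their numbers are entirely hypothetical.

```python
import math

def tas_metrics(dofs, error, seconds):
    """Combine problem size, error norm, and time-to-solution for one run."""
    return {
        "digits_of_size": math.log10(dofs),              # how large the problem is
        "digits_of_accuracy": -math.log10(error),        # how accurate the answer is
        "dofs_per_second": dofs / seconds,               # raw solver throughput
        "digits_of_efficacy": -math.log10(error * seconds),  # accuracy bought per unit time
    }

# Hypothetical measurements: (degrees of freedom, error norm, wall-clock seconds).
runs = {
    "method_A": (10_000, 1e-4, 2.0),
    "method_B": (40_000, 1e-5, 10.0),
}
report = {name: tas_metrics(*run) for name, run in runs.items()}
```

With these made-up numbers, method_A processes more unknowns per second, yet method_B scores higher once accuracy is weighed against time, which is the kind of tradeoff a throughput-only comparison would miss.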

14:35
An exploration of solver performance and optimisations through the mini-app TeaLeaf.

ABSTRACT. At AWE we work on a range of scientific problems, where many of the codes developed in-house to solve these problems leverage PETSc, and other DOE libraries, heavily, in particular for their linear and nonlinear solvers. In recent years, a concerted effort has been made to understand the underlying algorithms and their performance on modern/future architectures (Xeon, Xeon-Phi, GPUs), with the majority of this work being realised through the development of an MPI+X mini-app, TeaLeaf. The development of TeaLeaf has subsequently facilitated fruitful engagement with both academic partners and library developers, and TeaLeaf has been run at scale on some of the largest supercomputers in the world. We will discuss recent efforts such as: an investigation of the performance of PETSc solvers on multiple GPUs through CUSP and ViennaCL, communication-avoiding optimisations for CPUs and GPUs in single-level solvers, cache reuse through loop tiling for CPUs, and the exploration of exponential integrators as a timestepping alternative. © British Crown Owned Copyright 2018/AWE

14:55-15:30 Coffee
15:30-16:30 Session 16: Applications II
15:30
Algebraic solvers for discrete tide models
SPEAKER: Robert Kirby

ABSTRACT. Firedrake is a high-level package for the automated finite element solution of partial differential equations. It works closely with PETSc in many ways. We will illustrate several advanced features of this integration in the context of damped tide models. While the mixed finite element methods used for these models provide robust theoretical properties, implicit time stepping requires the scalable solution of the resulting algebraic systems. When the damping is nonlinear, one typically utilizes a Newton-type solver with a preconditioned Krylov method inside. We will study a range of options for these that include a seamless interface to hypre_AMS and hybridization. Thanks to tidy interfaces, it is easy to evaluate a wide range of practical methods.

15:50
Hybrid-mixed compatible finite element solvers for numerical weather prediction
SPEAKER: Thomas Gibson

ABSTRACT. We present compatible finite element spatial discretizations for three-dimensional equations relevant to numerical weather prediction. This work is a subject of ongoing development with the UK Meteorological Office, as part of the "Gung-Ho" dynamical core project. We focus on a procedure known as "hybridization," where the equations are discretized in such a way that element-wise static condensation is permitted. This approach leads to a sparse problem defined only on mesh interfaces, and the original fields can then be recovered in a purely local manner. The implementation of hybridized finite element solvers is challenging since this procedure requires invasive intervention in intricate numerical codes for matrix assembly to construct the appropriate condensed operators.

Firedrake is a finite element library composed of several domain-specific abstractions. Recent developments have introduced a new framework for the code-generation of static condensation and local recovery operations. In this talk, we show how the development of hybridized solvers can be simplified significantly by using a framework which abstracts away each component of the algorithm. Our approach allows for the design of sophisticated preconditioning interfaces using hybridization that composes naturally with the numerical linear algebra library provided by PETSc.
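The two steps named in the abstract, static condensation onto the interface unknowns and local recovery of the element unknowns, can be sketched on a tiny hand-built system. The Python below is generic Schur-complement elimination, not Firedrake's code-generation framework. It assumes two hypothetical "elements", each with a 2x2 local matrix, coupled through a single interface unknown `lam`; all matrices, vectors, and function names are made up for illustration.

```python
def solve(M, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [float(rhs[i])] for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(aug[i][k]))
        aug[k], aug[piv] = aug[piv], aug[k]
        for i in range(k + 1, n):
            factor = aug[i][k] / aug[k][k]
            for j in range(k, n + 1):
                aug[i][j] -= factor * aug[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def condense_and_recover(elements, g):
    """elements: list of (A_e, B_e, f_e) with A_e u_e + B_e lam = f_e per element;
    the interface equation is sum_e B_e . u_e = g."""
    schur, reduced_rhs, local = 0.0, -g, []
    for A, B, f in elements:
        Ainv_B = solve(A, B)               # element-local solves only
        Ainv_f = solve(A, f)
        schur += dot(B, Ainv_B)            # contribution to the condensed operator
        reduced_rhs += dot(B, Ainv_f)
        local.append((Ainv_B, Ainv_f))
    lam = reduced_rhs / schur              # condensed interface problem (1x1 here)
    # Local recovery: u_e = A_e^{-1} (f_e - B_e lam), element by element.
    u = [[yf - yb * lam for yb, yf in zip(Ainv_B, Ainv_f)] for Ainv_B, Ainv_f in local]
    return lam, u

# Hypothetical element data: (local matrix, coupling to lam, local right-hand side).
elements = [
    ([[2.0, 1.0], [1.0, 3.0]], [1.0, 0.0], [1.0, 2.0]),
    ([[4.0, 1.0], [1.0, 2.0]], [0.0, -1.0], [3.0, 1.0]),
]
lam, u = condense_and_recover(elements, g=0.0)
```

The global solve involves only the interface unknown, and every other operation is independent per element, which is why the condensed problem is smaller and the recovery step is purely local.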

16:10
Rational eigensolvers in SLEPc: application to the source-free Helmholtz equation
SPEAKER: Jose E. Roman

ABSTRACT. SLEPc provides solvers for nonlinear eigenvalue problems, including polynomial (PEP) and general (NEP). We have recently added specific support for rational eigenproblems as a particular case of NEP. This formulation appears in the solution of the source-free Helmholtz equation in frequency-dispersive media, for instance in the analysis of photonic open structures or the study of scattering resonances of metallic nano-structures.