09:40-11:00 Session 14: Applications III
MOOC-HPFEM: Teaching High performance FEM with Unicorn/FEniCS to 5000+ students
SPEAKER: Johan Jansson

ABSTRACT. ``Learn how to make cutting edge simulations! Engineering simulations are rapidly becoming fundamental in virtually all industrial sectors, from medicine to energy, aerospace and beyond. The breakthrough general adaptive finite element methods (AFEM) and open source FEniCS software you will learn in this course will position you to take lead to effectively solve the grand challenges in science and engineering.''

We will present our Unicorn/FEniCS-based Massive Open Online Course (MOOC) \href{https://www.edx.org/course/high-performance-finite-element-modeling-kthx-hpfem01-1x#}{High performance FEM (HPFEM)}, which is available on the edX platform and has attracted over 5000 students, with 100 new enrollments every day. The teaching team is composed of the researchers in the CFDCT research line at BCAM and the Numerical Methods group at the EECS school at KTH.

The course employs a Jupyter-FEniCS \cite{kluyver2016jupyter,alnaes2015fenics,LoggMardalEtAl2012a} cloud computing framework for basic assignments, and Unicorn/FEniCS in the open source MSO4SC \cite{MSO4SC,hoffmanppam} HPC-cloud framework for advanced supercomputing assignments.

The course covers adaptive FEM, with a simple pedagogical starting point in $L^2$ projection, through adaptive error control, to supercomputing simulation of flight with the Direct FEM methodology in Unicorn/FEniCS.
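As a rough illustration of that starting point, the sketch below computes the $L^2$ projection of a function onto continuous piecewise-linear elements on a uniform 1D mesh, using only the Python standard library. It is not course material and does not use the FEniCS API. The function name `l2_project_p1`, the unit interval, and the use of Simpson's rule for the load vector are all assumptions made for this example.

```python
def l2_project_p1(f, n_cells):
    """L2-project f onto continuous piecewise-linear functions on [0, 1].

    Finds nodal values u such that M u = b, where M is the P1 mass matrix
    and b_i = integral(f * phi_i), approximated with Simpson's rule.
    """
    h = 1.0 / n_cells
    n = n_cells + 1
    nodes = [i * h for i in range(n)]

    # Tridiagonal mass matrix stored as three bands.
    lower = [0.0] * n  # lower[i] = M[i][i-1]
    diag = [0.0] * n   # diag[i]  = M[i][i]
    upper = [0.0] * n  # upper[i] = M[i][i+1]
    rhs = [0.0] * n

    for k in range(n_cells):
        a, b = nodes[k], nodes[k + 1]
        m = 0.5 * (a + b)
        # Element mass matrix h/6 * [[2, 1], [1, 2]].
        diag[k] += h / 3.0
        diag[k + 1] += h / 3.0
        upper[k] += h / 6.0
        lower[k + 1] += h / 6.0
        # Simpson's rule; the hat functions take the values
        # (1, 0.5, 0) and (0, 0.5, 1) at (a, m, b).
        rhs[k] += h / 6.0 * (f(a) + 2.0 * f(m))
        rhs[k + 1] += h / 6.0 * (2.0 * f(m) + f(b))

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return nodes, u


if __name__ == "__main__":
    xs, us = l2_project_p1(lambda x: x * x, 8)
    for x, u in zip(xs, us):
        print(f"x = {x:.3f}  u_h = {u:.6f}")
```

A function that already lies in the discrete space, such as a linear function, is reproduced at the nodes up to rounding error, which makes a convenient sanity check.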

We will describe recent ground-breaking performance improvements enabling robust large timesteps and prediction of stall in the open source Unicorn framework \cite{unicorncaf}, demonstrating 100x faster and cheaper computation of a full aircraft in the HiLiftPW-3 workshop in 2017 \cite{jansson2017time} compared to HiLiftPW-2 in 2014 with an in-house version of Unicorn. This illustrates the power of open source and open science in the entire FEniCS software chain, and enables accurate full aircraft simulation in just a few hours, so that participants in MOOC-HPFEM can carry out such simulations themselves. This is an answer to the main challenge today in CFD of reliably predicting turbulent-separated flows for a complete air vehicle \cite{witherden2017future,slotnick2014cfd}. We discuss the possibility of real-time aerodynamics computation enabled by the performance in the Unicorn/FEniCS framework.

Phaseflow: FEniCS applied to the monolithic simulation of convection-coupled phase-change

ABSTRACT. The melting and solidification of phase-change materials (PCMs) are relevant to many applications, ranging from latent-heat-based energy storage devices, to ice-ocean coupling and its effects on Earth's climate, to the evolution of the icy moons of our solar system. The phase-change process is strongly affected by the presence of convection in the liquid. Accurately simulating convection-coupled phase-change, including the transient evolution of the phase-change interface, is a demanding problem and an active area of research. In this contribution, we will discuss our model and numerical methods, our implementation using FEniCS, and benchmark applications.

To mathematically model the coupled phase-change system, we adopt an enthalpy-formulated, single-domain, semi-phase-field, variable-viscosity approach, with monolithic system coupling and global Newton linearization. We discretize in time via implicit Euler finite differences, and in space via the Galerkin method. For the vector-valued system of PDEs, we consider a mixed finite element function space. We use the Taylor-Hood element for the pressure and velocity subspaces, which we stabilize via the pressure penalty method. Using FEniCS, we implemented this model in our open-source Python module, named Phaseflow, hosted publicly on GitHub. Accurately resolving the phase-change interface requires adaptive mesh refinement (AMR). To this end, we use the adaptive solver from FEniCS, which implements a dual-weighted residual method for goal-oriented AMR.
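To make the semi-phase-field and variable-viscosity ideas concrete, the sketch below shows one possible regularization: a smooth solid fraction that switches from 1 to 0 across a regularization temperature, and a viscosity that blends between a large solid value and the liquid value. The abstract does not state the functional forms, so the tanh profile, the linear viscosity blend, the function names, and all parameter values here are assumptions for illustration and not the Phaseflow API.

```python
import math


def solid_fraction(T, T_r=0.0, r=0.05):
    """Smoothed phase indicator: about 1 in the solid (T < T_r), about 0 in the liquid.

    T_r is the regularization (melting) temperature and r controls the
    width of the smeared interface. Both defaults are hypothetical.
    """
    return 0.5 * (1.0 + math.tanh((T_r - T) / r))


def viscosity(T, mu_liquid=1.0, mu_solid=1.0e8, T_r=0.0, r=0.05):
    """Phase-dependent viscosity: a very large value in the solid suppresses
    flow there, so one set of equations can hold on the whole domain."""
    phi = solid_fraction(T, T_r, r)
    return mu_liquid + (mu_solid - mu_liquid) * phi


if __name__ == "__main__":
    for T in (-0.2, -0.05, 0.0, 0.05, 0.2):
        print(f"T = {T:+.2f}  phi = {solid_fraction(T):.4f}  mu = {viscosity(T):.3e}")
```

A smaller `r` gives a sharper interface but a stiffer nonlinear problem, which is why the mesh needs to be refined near the interface.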

We present a series of test cases verifying components of Phaseflow. This shows how appropriate parameter selection reduces our general model to classical benchmarks, such as the heat-driven cavity. The abstractions of FEniCS allowed quick application of Phaseflow to 1D, 2D, and 3D problems, with and without the effects of convection, heat transfer, and phase-change. We demonstrate the current capability of Phaseflow with an octadecane PCM melting benchmark. Looking forward, to practically apply Phaseflow to realistic 3D problems, the adaptive solver implementation must be extended with mesh coarsening and with distributed memory parallel execution (using MPI). Additionally, robust application of Phaseflow to new problems will require bounds on parameters such as the time step size and the semi-phase-field regularization parameter, as well as a systematic procedure for generating initial meshes before the onset of AMR.

Multi-Scale Modeling of Plasticity: a Coupling between Dislocation Dynamics and FEniCS

ABSTRACT. Plastic deformation of crystalline materials is the result of the collective movement of dislocations, in response to their mutual interactions and externally applied loads. The study of dislocation behavior is of fundamental importance in a wide range of fields in materials science, such as the prediction of the mechanical response of ductile materials or the plastic relaxation of strained epitaxial films. The quantitative modeling of these systems, with an efficient mathematical and/or numerical approach, remains a challenging problem.

A reliable tool to study these complex phenomena is 3D Dislocation Dynamics (DD). Nevertheless, classical DD simulations, based on an analytical description of the dislocation stress field, present limitations such as the impossibility of handling problems with complicated boundary conditions and strongly heterogeneous loading. Addressing these problems is of particular relevance in the study of thin films and micro- and nano-objects, where the strong influence of the free surfaces can affect the evolution of the dislocation microstructure.

Here, to overcome the limitations of classical DD simulations, we present a coupling between a 3D DD code (microMEGAS [1]) and FEniCS, exploiting the Discrete-Continuum Model (DCM) algorithm, as presented in Ref. [2]. In this approach, the DD simulation code is in charge of the evolution of the dislocation microstructure and short-range dislocation-dislocation interactions, while the long-range mechanical fields and dislocation interactions with the boundaries are handled by solving the mechanical equilibrium by means of the FE code.

As shown in Fig. 1, the coupling with a FE code succeeds in providing a numerically exact solution accounting for the presence of complex boundary conditions. Thus, it provides a reliable tool for modeling the mechanical properties at the micro and nano scale where the presence of the free surfaces influences substantially the final dislocation microstructure [3].


[1] B. Devincre, R. Madec, G. Monnet, S. Queyreau, R. Gatti and L. Kubin, Mechanics of Nano-objects (2011) 81-100.
[2] O. Jamond, R. Gatti, A. Roos and B. Devincre, International Journal of Plasticity 80, 19 (2016).
[3] F. Rovaris, F. Isa, R. Gatti, A. Jung, G. Isella, F. Montalenti and H. von Känel, Physical Review Materials 1, 073602 (2017).

A two-phase flow and transport model in porous media for simulating laboratory tests implemented in FEniCS

ABSTRACT. In this work, a two-phase flow and transport model in porous media to simulate, analyze and interpret laboratory tests is presented. The two-phase flow model is based on the oil phase pressure and total velocity formulation, and accounts for capillary pressure, relative permeabilities, the effects of gravity, and dynamic modification of porosity and permeability. The transport model includes physical-chemical phenomena such as advection, diffusion, dispersion and reactions. For the numerical solution of the nonlinear equation system, a finite element method in space and a backward Euler finite difference method in time are applied, resulting in a fully implicit scheme. Its computational implementation was carried out in the programming language Python using the FEniCS project. From the methodological point of view, each stage of model development (conceptual, mathematical, numerical and computational) is described. The resulting model is applied to a case study of low salinity water injection (LSWI) in a core at laboratory conditions. The numerical solutions are compared with an implementation using the commercial software COMSOL Multiphysics.
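The structure of a fully implicit scheme can be sketched on a scalar model problem: each backward Euler step requires solving a nonlinear equation for the new state, done here with Newton's method. The model equation $du/dt = -u^3$, the function name `backward_euler_step`, and the tolerances are assumptions chosen for this example. The model in the abstract is a coupled PDE system discretized with finite elements, which this sketch does not attempt to reproduce.

```python
def backward_euler_step(u_old, dt, f, dfdu, tol=1e-12, max_iter=50):
    """Advance du/dt = f(u) by one backward Euler step.

    The new value u solves the nonlinear equation
        u - u_old - dt * f(u) = 0,
    which is solved by Newton's method starting from u_old.
    """
    u = u_old
    for _ in range(max_iter):
        residual = u - u_old - dt * f(u)
        if abs(residual) < tol:
            return u
        jacobian = 1.0 - dt * dfdu(u)  # derivative of the residual w.r.t. u
        u -= residual / jacobian
    raise RuntimeError("Newton iteration did not converge")


def integrate(u0, dt, n_steps, f, dfdu):
    """Apply n_steps backward Euler steps starting from u0."""
    u = u0
    for _ in range(n_steps):
        u = backward_euler_step(u, dt, f, dfdu)
    return u


if __name__ == "__main__":
    # Model problem du/dt = -u**3 with u(0) = 1; the exact solution is
    # u(t) = 1 / sqrt(1 + 2 t).
    u_end = integrate(1.0, 0.01, 100, lambda u: -u**3, lambda u: -3.0 * u**2)
    print("backward Euler:", u_end, " exact:", 1.0 / 3.0**0.5)
```

In a finite element setting the scalar residual and derivative become a residual vector and a Jacobian matrix, but the step-then-Newton structure is the same.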

14:00-14:40 Session 16: Framework III
Complex-valued PDE support in UFL and Firedrake
SPEAKER: David A. Ham

ABSTRACT. Support for PDEs defined over the field of complex numbers has been a frequently requested feature by users of both Firedrake and FEniCS over a number of years. Firedrake now has experimental support for complex-valued PDEs. This required changes to the symbolic language and semantics of UFL, as well as a number of more mechanical changes in the form compiler, mesh iterator and other supporting software. This presentation will focus on the changes to UFL which are required to support complex values. It is intended both to introduce the current functionality to users, and to inform discussion among developers with a view to merging this functionality into master UFL.

Subjects which will be addressed include: 1. Supporting complex-valued functions, arguments, constants, and expressions. 2. Additional required intrinsics. 3. Redefining inner, outer, and dot products as sesquilinear operators. 4. Sesquilinearity conventions and interactions with those of the linear algebra backend. 5. Implications for UFL's algorithmic differentiation support given the limited class of functions which are differentiable over the complex field. 6. Enforcing sesquilinearity in the form arity checker. 7. Maintaining compatibility with real-mode solvers. 8. Interfacing complex UFL with the rest of the Firedrake stack.
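The meaning of a sesquilinear inner product can be checked numerically on plain complex vectors. The sketch below uses only built-in Python complex numbers and does not call UFL or Firedrake. The convention it adopts, linear in the first argument and conjugate-linear in the second, is an assumption made for this example, since the choice of convention is one of the subjects listed above.

```python
def inner(u, v):
    """Sesquilinear inner product of two complex vectors.

    Assumed convention: linear in u, conjugate-linear in v.
    """
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))


def scale(alpha, u):
    """Multiply every entry of u by the scalar alpha."""
    return [alpha * ui for ui in u]


if __name__ == "__main__":
    u = [1 + 2j, 3 - 1j]
    v = [2 - 1j, 1j]
    alpha = 0.5 - 2j

    print("inner(u, v)         =", inner(u, v))
    # Linear in the first argument.
    print("inner(alpha u, v)   =", inner(scale(alpha, u), v))
    print("alpha * inner(u, v) =", alpha * inner(u, v))
    # Conjugate-linear in the second argument.
    print("inner(u, alpha v)         =", inner(u, scale(alpha, v)))
    print("conj(alpha) * inner(u, v) =", alpha.conjugate() * inner(u, v))
    # inner(u, u) has zero imaginary part and is non-negative, so it
    # defines a norm.
    print("inner(u, u) =", inner(u, u))
```

With the opposite convention the conjugate would fall on the first argument instead, so a form written for one convention must be read with care when it is passed to a linear algebra backend that uses the other.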

Automated cross element vectorization in Firedrake
SPEAKER: Tianjiao Sun

ABSTRACT. Modern CPUs increasingly rely on SIMD instructions to achieve higher throughput and better energy efficiency. It is therefore important to vectorize sequences of computations in order to sufficiently utilize the hardware today and in the future. This requires the instructions to operate on groups of data whose size is a multiple of the vector lane width (e.g. 4 doubles or 8 floats with AVX2 instructions). The assembly kernels that appear in typical finite element computations suffer from issues that often preclude efficient vectorization. These include complicated loop structure, poor data access patterns, and loop trip counts that are not multiples of the vector width. General purpose compilers often perform poorly in generating efficient, vectorized code for such kernels.

In this work, we present a generic and portable solution in Firedrake based on cross element vectorization. Although vector-expanding the assembly kernel is conceptually clear, it is only enabled by applying a chain of complicated loop transformations. Loo.py is a Python package which defines array-style computations in the integer polyhedral model, and supports a rich family of transformations that operate on this model. In Firedrake, we adapt the form compiler, TSFC, to generate Loo.py kernels for local assembly operations, and systematically generate data gathering and scattering operations across the mesh in PyOP2. Firedrake drives loop transformations using Loo.py from this high level interface to generate efficient code vectorized across a group of elements which fully utilizes the vector lanes. This toolchain automates the tedious and error-prone process of data layout transformation, loop unrolling and loop interchange, while being transparent to the users.
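The loop structure behind cross element vectorization can be sketched in plain Python. The example below expands a small local kernel (the 2x2 P1 mass matrix of a 1D interval, whose loop trip counts of 2 do not fill a 4-wide vector) across a batch of elements, so that the innermost loop runs over exactly `VECTOR_WIDTH` elements, and pads the last batch. Pure Python executes no SIMD instructions, so this shows only the data layout and loop ordering. The kernel choice and all names are assumptions for this example and are not the Loo.py, TSFC or PyOP2 interfaces.

```python
VECTOR_WIDTH = 4  # assumed lane count, e.g. 4 doubles in one AVX2 register

# Reference P1 mass matrix on an interval of unit length.
REF_MASS = [[2.0 / 6.0, 1.0 / 6.0],
            [1.0 / 6.0, 2.0 / 6.0]]


def local_mass(h):
    """Scalar kernel: 2x2 mass matrix for one element of length h."""
    return [[REF_MASS[i][j] * h for j in range(2)] for i in range(2)]


def batched_local_mass(h_batch):
    """Vector-expanded kernel for VECTOR_WIDTH elements at once.

    The output is laid out as out[i][j][lane], so the innermost loop runs
    over elements with unit stride and a trip count equal to the vector
    width. That loop is the one a compiler could map to SIMD lanes.
    """
    out = [[[0.0] * VECTOR_WIDTH for _ in range(2)] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for lane in range(VECTOR_WIDTH):
                out[i][j][lane] = REF_MASS[i][j] * h_batch[lane]
    return out


def assemble_batched(lengths):
    """Compute all local matrices batch by batch, padding the last batch."""
    n = len(lengths)
    # Pad with zero-length dummy elements up to a multiple of the width.
    padded = list(lengths) + [0.0] * (-n % VECTOR_WIDTH)
    result = []
    for start in range(0, len(padded), VECTOR_WIDTH):
        batch = batched_local_mass(padded[start:start + VECTOR_WIDTH])
        for lane in range(VECTOR_WIDTH):
            if start + lane < n:  # discard results from padding lanes
                result.append([[batch[i][j][lane] for j in range(2)]
                               for i in range(2)])
    return result


if __name__ == "__main__":
    lengths = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    print(assemble_batched(lengths) == [local_mass(h) for h in lengths])
```

The gather of `h_batch` and the scatter of the per-lane results correspond to the data gathering and scattering operations that the abstract says are generated across the mesh.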

We will present experimental results performed on multiple kernels and meshes. We achieve speedups consistent with the vector architecture available, compared to a baseline that vectorizes inside the local assembly kernels. The global assembly computations reach tens of percent of hardware peak arithmetic performance.