| 09:00 | To Be Defined |
| 10:30 | A Physics-Based Classifier for Identifying Uncertain Regions in RANS Models PRESENTER: Seoyeon Heo ABSTRACT. A physics-based classifier is developed to identify uncertain flow regions in Reynolds-averaged Navier–Stokes (RANS) predictions. RANS models remain reliable for well-attached turbulent boundary layers but lose accuracy in complex flows involving separated, secondary, and vortical flows. Identifying these uncertain regions is essential for adaptive RANS modeling strategies, particularly in data-driven turbulence modeling frameworks. This study proposes a physics-based classifier that automatically distinguishes reliable and uncertain regions using well-established physical characteristics of attached turbulent boundary layers. The robust applicability of the proposed classifier is demonstrated using various flow scenarios, including an attached boundary layer, a separated flow, and a jet in crossflow. |
| 10:55 | On Using Source Terms to Sustain Freestream Turbulence for the Langtry-Menter Transition Model PRESENTER: Balaji Shankar Venkatachari ABSTRACT. Freestream disturbance levels experienced by the boundary layer form an important input parameter in the prediction of laminar-turbulent transition via local correlation-based transition models, such as the Langtry-Menter γ-Re_θt model coupled with Menter’s shear stress transport (SST) model. However, it is challenging to match the freestream turbulence level (Tu) at the desired location close to the test article, because of the decay in Tu across the upstream portion of the computational domain and the sensitivity of the predicted decay to the distance from the inflow station to the leading edge of the test article, the grid resolution, and the numerical scheme. Addition of source terms to the SST equations to prevent the destruction of turbulence below a desired value has been proposed as a solution in the context of fully turbulent computations, and it has also been adopted for transition prediction. However, these additional source terms are also active within the boundary layer region, resulting in erroneous predictions of transition onset and of the surface pressure distribution, especially at moderate-to-high turbulence levels. This paper presents a general methodology to shield both laminar and turbulent boundary layers, as well as wake regions, from the unphysical effects of these source terms. The shielding formulation is designed to work well with both stationary RANS computations and any of the hybrid RANS-LES formulations. A detailed assessment of both the baseline implementation and the proposed shielding is carried out using both 2D and 3D cases under subsonic-to-transonic speed regimes involving natural and bypass transition. 
Illustrative results are presented in this extended abstract and a detailed formulation and the outcomes of the assessment will be reported in the final paper. |
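The freestream decay that motivates the sustaining source terms in the talk above can be illustrated with a short Python sketch. It assumes the standard k-ω SST destruction terms with the commonly used constants β* = 0.09 and β = 0.0828 and a uniform freestream without production, for which the decay has a closed form. The function names, the ambient-floor form of the source terms, and all numerical values are illustrative assumptions and are not taken from the abstract; the shielding formulation itself is not reproduced here.

```python
import math

BETA_STAR = 0.09  # destruction constant for k (commonly used SST value)
BETA = 0.0828     # outer-layer destruction constant for omega (commonly used SST value)


def decayed_k_omega(k_in, omega_in, x, u_inf):
    """Closed-form freestream decay without sustaining terms.

    Solves dk/dt = -BETA_STAR*k*omega and domega/dt = -BETA*omega**2
    along a streamline, with travel time t = x / u_inf.
    """
    t = x / u_inf
    growth = 1.0 + BETA * omega_in * t
    return k_in * growth ** (-BETA_STAR / BETA), omega_in / growth


def turbulence_intensity(k, u_inf):
    """Turbulence intensity Tu in percent for isotropic fluctuations."""
    return 100.0 * math.sqrt(2.0 * k / 3.0) / u_inf


def sustained_rhs(k, omega, k_amb, omega_amb):
    """Destruction plus ambient source terms (assumed form).

    The added sources balance the destruction exactly at the ambient
    state, so k and omega stop decaying once they reach (k_amb, omega_amb).
    """
    dk_dt = -BETA_STAR * k * omega + BETA_STAR * k_amb * omega_amb
    domega_dt = -BETA * omega ** 2 + BETA * omega_amb ** 2
    return dk_dt, domega_dt


# Hypothetical inflow: Tu = 1 % at 50 m/s, omega = 1000 1/s, leading edge 1 m downstream.
k_le, omega_le = decayed_k_omega(k_in=0.375, omega_in=1000.0, x=1.0, u_inf=50.0)
tu_le = turbulence_intensity(k_le, 50.0)
```

Comparing `tu_le` with the inflow value shows how strongly the level seen by the test article depends on the inflow distance and on the inflow ω, which is the sensitivity the abstract describes.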
| 11:20 | Wall modeling via function enrichment for RANS simulations of compressible wall-bounded turbulent flows PRESENTER: Xiaorui Xu ABSTRACT. Accurately simulating high-Reynolds number compressible wall-bounded turbulent flows is computationally intensive due to the fine grids needed to resolve near-wall gradients. To address this, we propose a wall modeling technique incorporating function enrichment within a finite volume framework for efficient RANS simulations. This approach integrates a wall function that approximates the sharp streamwise velocity profile, expanding the solution space and enabling accurate capture of near-wall turbulent features on coarse grids. Analysis of compressible flat plate boundary layers shows that function enrichment effectively models momentum and total energy variables at Mach numbers below three, using a wall function originally developed for incompressible flows. The solution in the enriched space is determined via variational reconstruction, and preliminary analysis indicates discontinuities are unlikely in the boundary layer, simplifying shock capturing without the need for limiting in enriched cells. Numerical |
| 11:45 | Large Eddy Simulation of Burgers' Turbulence Through Integrated Filtering, Modeling and Discretization ABSTRACT. The research presented seeks to advance large eddy simulation (LES) through the integration of its three core components - filtering, modeling and discretization. To begin with, motions below the numerically resolvable scales are filtered out by applying the conservation laws over control volumes of prescribed minimum size. This process introduces two spatial filters: one performing control‑volume averaging and another defining the evaluation of fluxes from volume‑averaged quantities. The two spatial filters yield a three‑way partition of kinetic energy. The flux-based filter sets the effective resolution of the large eddies. The net effect of smaller scales is modeled in a way that aligns physical and numerical interpretations, treating filtering, modeling and discretization as a coherent whole. The model prevents large eddies from generating scales smaller than those the fluxes can represent. The resulting framework is successfully tested on a canonical problem: one‑dimensional decaying Burgers turbulence, which provides a controlled setting for examining nonlinear energy transfer. The abstract presents the approach in one spatial dimension. Preliminary 3D results are already available. |
| 10:30 | Online learning of turbulence closure models via ensemble Kalman inversion and reinforcement learning PRESENTER: Katerina Kostova ABSTRACT. Introduction Turbulence closure remains a major source of uncertainty in atmospheric and oceanic models, particularly at coarse grid resolution, where subgrid scales strongly influence large-scale dynamics. While machine-learning-based closures have shown promise, most existing approaches rely on offline training and require large amounts of subgrid-scale data, raising concerns regarding generalization and stability in non-stationary geophysical flows. In this work, we present two online learning strategies for turbulence-closure modeling, in which model parameters are adaptively updated during large-eddy simulations (LES) using limited statistical information from high-fidelity reference data. Both approaches are demonstrated in idealized two-dimensional beta-plane geophysical turbulence, serving as a testbed for atmospheric and oceanic flows. Methodology The first approach employs Ensemble Kalman Inversion (EKI) to perform parametric online calibration of physics-based closures, including Smagorinsky-, Leith-, and backscatter-type models [1]. Closure parameters are iteratively updated to minimize mismatches between LES kinetic energy spectra and filtered direct numerical simulation (DNS) spectra. This derivative-free framework is data-efficient, requiring only a small number of DNS snapshots, and preserves physical interpretability while enabling uncertainty quantification of calibrated parameters. The second approach uses reinforcement learning (RL) to learn a state-dependent closure policy online [2]. Here, a neural network policy dynamically adjusts closure coefficients based on resolved-scale spectral information, with rewards defined through the agreement with DNS enstrophy spectra. 
This formulation allows for spatially and temporally varying closures, naturally incorporates backscatter, and improves the representation of extreme events. Conclusions By contrasting these two online learning methods, we highlight trade-offs between interpretability, flexibility, and data requirements. Our results demonstrate that online learning offers a robust pathway for adaptive turbulence modeling in atmospheric and oceanic simulations, bridging physics-based closures and data-driven methods within a unified framework. References [1] Guan, Yifei, Pedram Hassanzadeh, Tapio Schneider, Oliver Dunbar, Daniel Zhengyu Huang, Jinlong Wu, and Ignacio Lopez-Gomez. "Online learning of eddy-viscosity and backscattering closures for geophysical turbulence using ensemble Kalman inversion." arXiv preprint arXiv:2409.04985 (2024). [2] Mojgani, Rambod, Daniel Waelchli, Yifei Guan, Petros Koumoutsakos, and Pedram Hassanzadeh. "Extreme event prediction with multi-agent reinforcement learning-based parametrization of atmospheric and oceanic turbulence." arXiv preprint arXiv:2312.00907 (2023). |
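The derivative-free character of the EKI calibration described in the talk above can be seen in a minimal Python sketch. It uses a single scalar parameter, a single scalar datum, and the deterministic update without perturbed observations. `forward_model` is a hypothetical linear stand-in for "run the LES and return a spectral statistic", and the ensemble size, noise variance, and target value are arbitrary choices for illustration, not settings from the cited work.

```python
import random
import statistics


def forward_model(theta):
    """Hypothetical stand-in for an LES run that returns one statistic."""
    return 2.0 * theta + 1.0


def eki_step(ensemble, y_obs, noise_var):
    """One ensemble Kalman inversion update (scalar parameter, scalar datum).

    Only forward evaluations are needed: the gain is built from the
    ensemble covariance between parameters and model outputs.
    """
    outputs = [forward_model(theta) for theta in ensemble]
    theta_mean = statistics.fmean(ensemble)
    g_mean = statistics.fmean(outputs)
    n = len(ensemble)
    cov_theta_g = sum((t - theta_mean) * (g - g_mean)
                      for t, g in zip(ensemble, outputs)) / (n - 1)
    var_g = sum((g - g_mean) ** 2 for g in outputs) / (n - 1)
    gain = cov_theta_g / (var_g + noise_var)
    return [t + gain * (y_obs - g) for t, g in zip(ensemble, outputs)]


rng = random.Random(0)
ensemble = [rng.uniform(0.0, 5.0) for _ in range(20)]  # prior parameter guesses
for _ in range(5):
    ensemble = eki_step(ensemble, y_obs=5.0, noise_var=1e-6)
calibrated = statistics.fmean(ensemble)  # approaches 2.0, since 2*2 + 1 = 5
```

In a real calibration the parameter and the data would be vectors (several closure coefficients, a binned spectrum), and the scalar covariances above would become matrices.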
| 10:55 | Shocklet-Capturing Spatial Artificial Neural Network Model for Large Eddy Simulation of Compressible Isotropic Turbulence PRESENTER: Jiahao Zhang ABSTRACT. To address the issues of inaccurate dissipation and insufficient robustness of traditional subgrid-scale (SGS) models in large eddy simulation (LES) of highly compressible turbulence dominated by shocklets, a Shocklet-Capturing Spatial Artificial Neural Network (SCSANN) model is proposed in this paper. Based on high-fidelity direct numerical simulation (DNS) data of compressible isotropic decaying turbulence with an initial turbulent Mach number M_t = 1.2, we systematically design multi-dimensional input features that integrate first-order physical basis tensors, second-order spatial derivatives, and a seven-point spatial stencil. A priori tests demonstrate that the SCSANN model can accurately reconstruct the SGS stress and heat flux with exceedingly high correlation coefficients and low relative errors. Furthermore, a posteriori LES tests verify the model's excellent generalization capability and long-term computational stability under unseen turbulent evolution states. Compared with the dynamic Smagorinsky model (DSM) and the no-model implicit LES, the SCSANN is capable of reproducing the energy cascade process of the flow field with high fidelity. This study provides an effective data-driven paradigm for high-precision SGS modelling in highly compressible turbulent environments. |
| 11:20 | Data-driven detached-eddy simulations with explicit algebraic stress modelling PRESENTER: Haochen Liu ABSTRACT. This work presents a data-driven explicit algebraic stress-based detached-eddy simulation (DES) method. Despite the widespread adoption of data-driven methods in model development for both Reynolds-averaged Navier-Stokes (RANS) and large-eddy simulations (LES), their applications to DES remain limited. The challenge primarily lies in the absence of modeled stress data, the requirement for proper length scales in RANS and LES branches, and the maintenance of reasonable switching behavior. The data-driven DES approach is constructed based on the algebraic stress equation. Control of RANS/LES switching is achieved through the eddy viscosity in the linear part of the modeled stress, under the ℓ²–ω DES framework. Three model coefficients associated with the pressure-strain terms and the LES length scale are represented by a neural network as functions of scalar invariants of the velocity gradient. The neural network is trained using velocity data via the ensemble Kalman method, thereby circumventing the need for modeled stress data. Moreover, baseline coefficient values are incorporated as additional reference data to ensure reasonable switching behavior. The proposed method is evaluated on two challenging turbulent flows, namely, the secondary flow in a square duct and the separated flow over a bump. The trained model achieves significant improvements in predicting mean flow statistics compared with the baseline model. This improvement is attributed to better predictions of the modeled stress. The trained model also exhibits reasonable switching behavior, expanding the LES region to resolve more turbulent structures. Furthermore, the model shows satisfactory generalization capabilities for both cases in similar flow configurations. |
| 11:45 | A neural-network-based subgrid-scale model for LES: application to unseen flows PRESENTER: Haecheon Choi ABSTRACT. We develop a neural-network (NN)-based subgrid-scale model for large eddy simulation with enhanced generalizability across diverse flow configurations. The model is trained on a combined dataset from two different turbulent flows, turbulent channel flow and flow over a circular cylinder. To enable consistent training across these flows, we modify the NN architecture by removing bias terms and batch normalization layers and introduce a non-dimensional loss function, thereby eliminating flow-specific scaling dependency. Input features are constructed based on the Vreman model (Vreman 1994) and normalized using filter-scale quantities to ensure geometric invariance. The resulting model generalizes to unseen flow configurations: it captures the reattachment and redeveloping boundary layer characteristics in backward-facing step flow and laminar separation bubble characteristics in flow over an SD7003 airfoil. These results are better than or similar to those of traditional SGS models, demonstrating the model’s ability to learn physically meaningful subgrid-scale interactions rather than overfitting to specific training conditions. |
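One property that may underlie the removal of bias terms in the talk above is that a network without biases and with ReLU activations is positively homogeneous: scaling the input by a positive factor scales the output by the same factor, so no absolute magnitude is baked into the weights. The Python sketch below demonstrates this on a tiny two-layer network. The ReLU activation, the layer sizes, and the weight values are assumptions made for illustration; the abstract does not specify the architecture.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def matvec(weights, vector):
    """Dense matrix-vector product with the matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def bias_free_mlp(x, w1, w2):
    """Two-layer perceptron with no bias terms and no normalization layers."""
    return matvec(w2, relu(matvec(w1, x)))


# Arbitrary illustrative weights: 2 inputs, 3 hidden units, 1 output.
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
W2 = [[1.0, -0.5, 0.25]]

x = [0.8, -0.3]
scale = 10.0
y = bias_free_mlp(x, W1, W2)
y_scaled = bias_free_mlp([scale * xi for xi in x], W1, W2)
# y_scaled[0] equals scale * y[0] up to round-off; a bias term would break this.
```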
| 10:30 | Lattice Boltzmann Modeling of Oldroyd-B Viscoelastic Flows Enabling Fluid–Structure Interaction PRESENTER: Dario De Marinis ABSTRACT. 1. Introduction Viscoelastic carrier fluids are increasingly employed in microfluidic devices to enhance the manipulation and separation of biological particles, where purely inertial effects may be insufficient [1]. Elastic stresses generated by dilute polymer solutions give rise to viscoelastic and elasto-inertial focusing regimes [2]. Such flows are commonly characterized by the Reynolds number, defined as the ratio between inertial and viscous forces, the Weissenberg number, defined as the product of the polymer relaxation time and a characteristic shear rate, and the viscosity ratio, defined as the ratio between solvent viscosity and total viscosity. Here, the fluid density, characteristic velocity and length scales, polymer relaxation time, solvent dynamic viscosity, polymer dynamic viscosity, and their sum defining the total viscosity determine the governing dimensionless parameters. In this work, we present a fully three-dimensional framework based on lattice Boltzmann methods for Oldroyd-B fluids targeting the low-to-moderate elasticity regimes relevant to biomicrofluidics. The scheme advances mass and momentum using a lattice Boltzmann formulation with forcing terms modeled through the Guo approach [3], while an advection–diffusion lattice Boltzmann scheme, following Su et al. [4], evolves each component of the polymeric stress tensor through an auxiliary set of distribution functions. The lattice Boltzmann solver is validated against analytical solutions for steady and transient planar Poiseuille flow, showing excellent agreement over the considered parameter range. The framework is designed to be coupled with an Immersed Boundary technique for the simulation of rigid microparticles and nanoparticles suspended in viscoelastic flows. 
The Immersed Boundary method enforces kinematic and dynamic coupling between the fluid and the suspended structures [5]. 2. Methodology The Oldroyd-B constitutive equation describes the viscoelastic fluid as a dilute solution in which polymer chains, modeled as beads connected by an infinitely extensible spring, are suspended in a Newtonian solvent. We consider an incompressible, isothermal Oldroyd-B fluid governed by the continuity equation expressing mass conservation and by the momentum balance equation. The momentum equation includes inertial effects, the pressure gradient, viscous stresses associated with the solvent, the divergence of the polymeric stress tensor, and a generic external body force per unit volume, such as an imposed pressure-gradient surrogate, gravity forcing, or Immersed Boundary forcing. Pressure, velocity, and density denote the hydrodynamic fields of the system, while the polymeric stress tensor accounts for the elastic contribution of the dissolved polymers. The polymeric stress tensor evolves according to the Oldroyd-B constitutive equation, which combines stress advection by the flow, stress stretching and rotation due to velocity gradients, and a relaxation term driving the stress toward its equilibrium value over a characteristic polymer relaxation time, with the polymer viscosity determining the magnitude of the elastic response. 2.1 Lattice Boltzmann Method for viscoelastic fluids The flow evolution is modeled on a three-dimensional computational lattice using nineteen distribution functions associated with the discrete velocities of the D3Q19 model. All right-hand-side terms are evaluated locally in space and time. The lattice Boltzmann equation is written in terms of collision and streaming steps, in which the distribution functions relax toward their local equilibrium values and include a forcing contribution. 
The relaxation time is related to the solvent kinematic viscosity, which depends on the solvent dynamic viscosity and the fluid density. The lattice speed of sound is determined by the spatial and temporal discretization. The forcing term is computed via the Guo forcing scheme [3] using the total force acting on the fluid, which includes the divergence of the polymeric stress tensor and the external body force. The polymeric stress tensor is advanced using a second set of distribution functions, one set for each tensor component, following the advection–diffusion lattice Boltzmann formulation of Su et al. [4]. These auxiliary distributions relax toward equilibrium values and include a source term constructed from the constitutive operator of the Oldroyd-B model, which depends on the stress tensor, the velocity gradient, the polymer viscosity, and the polymer relaxation time. A Chapman–Enskog analysis shows that this formulation recovers the Oldroyd-B constitutive equation plus a controllable diffusive term. The relaxation parameter associated with the stress evolution is chosen slightly above its lower stability limit in order to limit spurious diffusivity while ensuring numerical stability. Macroscopic quantities are reconstructed as low-order moments of the distribution functions. The hydrodynamic populations yield the velocity field and the associated pressure variable, while each component of the polymeric stress tensor is directly obtained from the corresponding auxiliary distributions. Velocity or pressure is enforced at no-slip walls and at inlet/outlet boundaries using the non-equilibrium bounce-back approach proposed in [5] for the hydrodynamic populations. Here, the same procedure is consistently extended to the auxiliary stress distributions. This strategy ensures accurate Dirichlet boundary conditions and avoids reliance on periodic forcing when prescribed inlet profiles are required. 
The algorithm proceeds as follows: evaluation of the divergence of the polymeric stress; collision, streaming, and boundary reconstruction of the hydrodynamic distributions; update of the macroscopic fields; evaluation of the velocity gradient and constitutive operator; collision, streaming, and boundary reconstruction of the stress distributions; and update of the polymeric stress tensor. 2.2 Fluid–Structure Interaction Particles can be coupled to the viscoelastic carrier fluid through an Immersed Boundary technique enforcing kinematic and dynamic continuity at the fluid–structure interface [5]. Within this framework, deformable particle dynamics may be modeled using finite element methods, rigid-particle motion through rigid-body dynamics, and nanoparticle transport through Langevin-based models accounting for stochastic forcing. 3. Validation and Conclusions To assess the accuracy of the proposed formulation, we consider a planar Poiseuille flow of an Oldroyd-B fluid between two parallel plates. The computational domain consists of a square lattice with equal length in the two in-plane directions, with three nodes in the homogeneous streamwise direction and periodic boundary conditions applied along the streamwise and spanwise directions. A constant body force is imposed in the spanwise direction as a pressure-gradient surrogate, with magnitude proportional to the total viscosity, the characteristic velocity, and the inverse square of the channel height. No-slip conditions are enforced at the walls through non-equilibrium bounce-back reconstructions. The velocity and stress fields are initially set to zero. To validate the transient response, we analyze the flow at the channel center for viscosity ratio values equal to 0.1, 0.3, 0.5, 0.7, and 0.9, at fixed Reynolds and Weissenberg numbers equal to 1. The temporal evolution of the centerline velocity is examined. 
The solver correctly reproduces the characteristic oscillatory relaxation induced by viscoelastic stresses. The amplitude of the first velocity peak is predicted within 0.25% for all tested viscosity ratios, demonstrating the accuracy and stability of the proposed formulation. We then consider a Reynolds number equal to 1, a viscosity ratio equal to 0.3, and Weissenberg numbers equal to 0.05, 0.3, 0.6, and 1. A comparison between numerical and analytical solutions is performed at the mid-plane of the channel. The numerical velocity profiles exhibit excellent agreement with the analytical solution [6], with relative errors below 0.05%. The normal and shear components of the polymeric stress tensor are also accurately captured, and their relative errors remain below 0.5%. Overall, the proposed 3D LB solver accurately reproduces viscoelastic benchmark flows while maintaining numerical stability in regimes where inertia and elasticity coexist. The benchmark configurations confirm the correct coupling between momentum and polymer stress transport and establish the foundation for forthcoming simulations of particle migration and focusing in complex geometries. References [1] Jian Zhou, Chunlong Tu, Yitao Liang, Bobo Huang, Yifeng Fang, Xiao Liang, Ian Papautsky, and Xuesong Ye. Isolation of cells from whole blood using shear-induced diffusion. Scientific Reports, 8(1):9411, 2018. [2] Seungyoung Yang, Jae Young Kim, Seong Jae Lee, Sung Sik Lee, and Ju Min Kim. Sheathless elasto-inertial particle focusing and continuous separation in a straight rectangular microchannel. Lab on a Chip, 11(2):266–273, 2011. [3] Zhaoli Guo, Chuguang Zheng, and Baochang Shi. Discrete lattice effects on the forcing term in the lattice Boltzmann method. Physical Review E, 65(4):046308, 2002. [4] Jin Su, Jie Ouyang, Xiaodong Wang, Binxing Yang, and Wen Zhou. Lattice Boltzmann method for the simulation of viscoelastic fluid flows over a large range of Weissenberg numbers. 
Journal of Non-Newtonian Fluid Mechanics, 194:42–59, 2013. [5] Dario De Marinis, Domenico Careccia, Francesco Ferrara, and Marco Donato de Tullio. Fully resolved simulations of rigid particle focusing in serpentine microfluidic devices. Physical Review Fluids, 10:104202, 2025. [6] EO Carew, P Townsend, and MF Webster. Taylor-Galerkin algorithms for viscoelastic flow: application to a model problem. Numerical Methods for Partial Differential Equations, 10(2):171–190, 1994. |
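For orientation, the steady-state limit of the planar Poiseuille benchmark in the talk above has a simple closed form for an Oldroyd-B fluid, sketched below in Python. The sketch assumes a channel of height H with walls at y = 0 and y = H, flow along x, a body force chosen so that the centerline velocity equals the characteristic velocity U, a Reynolds number based on the total viscosity, and a Weissenberg number based on the shear rate U/H. These conventions, and all function and variable names, are assumptions for illustration; the abstract does not state which definitions the authors use, and the transient solution of [6] is not reproduced.

```python
def poiseuille_oldroyd_b(y, height, u_center, eta_s, eta_p, lam):
    """Steady fully developed Oldroyd-B channel flow at wall distance y.

    Returns velocity, polymeric shear stress, and polymeric normal stress.
    """
    eta = eta_s + eta_p                               # total viscosity
    body_force = 8.0 * eta * u_center / height ** 2   # gives u = u_center at mid-height
    u = body_force / (2.0 * eta) * y * (height - y)
    shear_rate = body_force / (2.0 * eta) * (height - 2.0 * y)
    tau_xy = eta_p * shear_rate                       # polymeric shear stress
    tau_xx = 2.0 * lam * eta_p * shear_rate ** 2      # first normal stress from elasticity
    return u, tau_xy, tau_xx


def dimensionless_groups(rho, u_center, height, eta_s, eta_p, lam):
    """Reynolds number, Weissenberg number, and viscosity ratio (assumed definitions)."""
    eta = eta_s + eta_p
    return {
        "Re": rho * u_center * height / eta,
        "Wi": lam * u_center / height,
        "beta": eta_s / eta,
    }


# Illustrative values chosen so that Re = Wi = 1 and the viscosity ratio is 0.3.
groups = dimensionless_groups(rho=1.0, u_center=1.0, height=1.0,
                              eta_s=0.3, eta_p=0.7, lam=1.0)
u_mid, _, _ = poiseuille_oldroyd_b(0.5, 1.0, 1.0, 0.3, 0.7, 1.0)
_, tau_xy_wall, tau_xx_wall = poiseuille_oldroyd_b(0.0, 1.0, 1.0, 0.3, 0.7, 1.0)
```

The velocity profile is the Newtonian parabola because the polymeric shear stress is linear in the shear rate at steady state; elasticity appears only through the normal stress, which grows with the relaxation time.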
| 10:55 | Hybrid Discrete Element Lattice Boltzmann Method for fluid-particle systems with large particle size ratio PRESENTER: Pei Zhang ABSTRACT. Fluid-particle systems characterized by a large particle size ratio are ubiquitous in natural and industrial processes, yet they present significant challenges for computational modeling. Fully resolved simulations are computationally prohibitive for the entire system, whereas standard unresolved methods fail to capture the complex hydrodynamic interactions of the larger particles. To bridge this gap, this paper presents a novel hybrid numerical framework coupling the Discrete Element Method (DEM) with the Lattice Boltzmann Method (LBM). The fluid phase is solved using the LBM, while the kinematics of both large and small particles are tracked via DEM. To efficiently and accurately handle the extreme size disparity, a dual-scale coupling strategy is implemented. Large particles, which span multiple fluid lattice cells, are treated as fully resolved using the Partially Saturated Method (PSM) to accurately map the solid-fluid boundary conditions and capture detailed hydrodynamic forces. Conversely, small particles (sub-grid scale) are modeled using an unresolved CFD-DEM volume-averaging approach, relying on local void fractions and empirical drag correlations for momentum exchange. This hybrid multi-scale approach effectively balances computational efficiency with hydrodynamic accuracy, providing a robust tool for investigating complex multiphase flows with extreme polydispersity. |
| 11:20 | Generation of efficient adjoint lattice Boltzmann collision kernels with reverse mode algorithmic differentiation PRESENTER: Shota Ito ABSTRACT. Background and Objective Adjoint-based optimization is widely used to address large-scale flow control problems with distributed control variables, as the computational cost of gradient evaluation is independent of the dimension of the control space. Within this context, the LBM represents an attractive discretization scheme, as it not only reduces the computational expense within optimization iteration loops but also exposes Jacobian expressions through its explicit operator-split formulation, thereby simplifying adjoint analysis. Despite these advantages, most existing adjoint LBM approaches rely on manual derivation of the adjoint system, lack automation, and are highly case-specific. High-level LBM frameworks such as XLB demonstrate the feasibility of using algorithmic differentiation (AD) by evaluating gradients at the compiler or framework level. While these frameworks significantly lower the entry barrier for adjoint-based optimization, they completely hide the structure of the discrete adjoint system from the user and are inherently constrained by global tape-based reverse mode AD with performance overheads. Their reliance on Python-based back-end frameworks, such as JAX or PyTorch, limits transparency and complicates integration into established large-scale C++ code bases such as OpenLB. Taken together, these limitations reflect the lack of an algorithm-aware, performance-transparent, and automated framework for discrete adjoint LBM suitable for large-scale optimization. In this talk, we present a novel framework implemented in the open-source library OpenLB that enables the automated generation of discrete adjoint LBM collision kernels. Algorithmic differentiation is applied locally to evaluate the Jacobians of the adjoint system, eliminating manual derivations. 
Code generation combined with common subexpression elimination (CSE) removes the runtime overhead of reverse mode AD tapes, allowing efficient adjoint collision kernels to be generated directly from their primal implementations. The main objectives of this work are summarized as follows. An operator-level adjoint framework is to be developed for LBM that exploits the algorithmic structure of the method to localize and automate Jacobian evaluation using AD. A systematic discussion and evaluation of the choice of AD modes for adjoint LBM, focusing on arithmetic cost, should be performed. Finally, a comprehensive performance evaluation of automatically generated adjoint kernels assessing the impact of CSE should be carried out. Methodology and Implementation By formulating the adjoint derivation independently of specific LBM models, the proposed approach is applicable to a wide range of collision operators without requiring manual Jacobian derivations. Exploiting the operator-split structure of LBM, it can be demonstrated that the locality of the primal collision operator is naturally inherited by its adjoint counterpart, enabling an efficient and transparent adjoint formulation. For stationary flow problems, the evaluation of the Jacobian of the collision operator with respect to the particle populations was identified as the computationally dominant component of the adjoint simulation. By seeding the reverse AD with adjoint pre-collision populations, the required gradients can be computed in a single reverse mode pass, avoiding the explicit construction of the full Jacobian. The OpenLB framework supports the flexible exchange of the value type at operator level, enabling expression tree extraction and their differentiation with a native reverse AD implementation based on operator overloading. In this way, the adjoint collision kernel is generated from the primal implementation before simulation using code generation. 
The generated kernels are optimized for performance, as they are CSE-optimized and free of AD type overhead. Validation and Performance Evaluation The proposed framework was validated using a synthetic inverse problem with distributed controls. The computed gradients were shown to be consistent with finite-difference quotients for an inverse problem based on a manufactured solution of the incompressible Navier–Stokes equations. To showcase the performance of the generated adjoint kernels, a domain identification problem was solved for 10 optimization steps with 216 million cells, involving 16 primal and 13 adjoint simulations. The optimization completed in 26.9 hours on a single node equipped with four NVIDIA H200 GPUs, achieving 48,000 MLUP/s for the primal solver and 16,000 MLUP/s for the adjoint solver. Performance investigations were carried out for the primal and adjoint collision kernels used in the domain identification problem. The results revealed that CSE has a stronger impact on compute-intensive adjoint collision kernels compared to their primal counterparts, yielding on average a speedup by a factor of three on both CPU and GPU systems. On general-purpose GPUs with limited double-precision throughput, CSE proved to be a decisive optimization, as the adjoint kernels are more strongly compute-bound, yielding speedups of up to a factor of nine. Roofline plot results highlight that kernel-level arithmetic efficiency is a critical factor for adjoint-based optimization, complementing traditional strong and weak scaling analyses. Conclusion Overall, the presented approach combines the transparency and performance of manually derived adjoint schemes with the flexibility and automation of algorithmic differentiation. By restricting AD to well-defined operator-level scopes and eliminating runtime overhead through code generation, the framework enables efficient and extensible adjoint LBM formulations within OpenLB. 
By seeding the reverse AD with adjoint pre-collision populations, the required gradients can be computed in a single reverse mode pass, avoiding the explicit construction of the full Jacobian as required for forward mode approaches. The impact of CSE has been investigated across different hardware architectures and floating-point precisions, demonstrating speedups of up to a factor of nine for adjoint collision kernels in double precision on general-purpose GPUs. The extension to non-local collision operators and coupled multi-physics systems represents a promising direction for future research and is currently under investigation. |
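The idea of seeding reverse-mode AD so that a single pass yields the product of the transposed collision Jacobian with an adjoint vector, without ever forming the Jacobian, can be sketched in plain Python. The example below is not OpenLB code: the `Var` class, the toy D1Q3 BGK collision with an equilibrium that is quadratic in the momentum, and the seed values are all hypothetical, and the code-generation and CSE stages described in the abstract are not shown.

```python
class Var:
    """Scalar node recorded by operator overloading for reverse-mode AD."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents  # tuple of (parent node, local partial derivative)
        self.adjoint = 0.0

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __sub__(self, other):
        other = lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def lift(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Var) else Var(x)


def collide(f, omega):
    """Toy D1Q3 BGK collision; works on floats and on Var nodes alike."""
    weights = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)
    velocities = (0.0, 1.0, -1.0)
    rho = f[0] + f[1] + f[2]
    j = f[1] - f[2]
    post = []
    for fi, w, c in zip(f, weights, velocities):
        cj = c * j
        feq = w * (rho + 3.0 * cj + 4.5 * cj * cj - 1.5 * j * j)
        post.append(fi - omega * (fi - feq))
    return post


def vector_jacobian_product(inputs, outputs, seeds):
    """One reverse sweep: returns J^T * seeds without building J."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before the nodes that use them

    for out in outputs:
        visit(out)
    for node in order:
        node.adjoint = 0.0
    for out, seed in zip(outputs, seeds):
        out.adjoint += seed
    for node in reversed(order):  # consumers are processed before their inputs
        for parent, partial in node.parents:
            parent.adjoint += partial * node.adjoint
    return [x.adjoint for x in inputs]


f_pre = [Var(0.6), Var(0.25), Var(0.15)]
adjoint_seed = [1.0, -2.0, 0.5]  # hypothetical adjoint populations used as seeds
f_post = collide(f_pre, omega=1.2)
adjoint_result = vector_jacobian_product(f_pre, f_post, adjoint_seed)
```

A forward-mode evaluation would need one pass per input population to assemble the same product, which is the cost argument the abstract makes for the reverse mode.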
| 11:45 | Turbulent Fluid Flow Sampling with OpenLB-UQ PRESENTER: Johannes Leonard Grafen ABSTRACT. Uncertainty quantification (UQ) in computational fluid dynamics (CFD) is traditionally bottlenecked by the curse of dimensionality, requiring thousands of realizations to achieve statistical convergence. Although promising frameworks such as OpenLB-UQ have been recently developed, the computational cost of CPU-based solvers for three-dimensional (3D) turbulent flows renders comprehensive UQ studies demanding when the computational effort for a single sample exceeds a certain threshold. We provide an overview of OpenLB-UQ and present recent results that enable the structural refactoring of the framework for future work toward a GPU-accelerated lattice Boltzmann method (LBM) backend to enable high-throughput sampling of complex fluid systems. OpenLB is an open-source C++ library designed for efficient and extensible simulations of complex fluid dynamics on high-performance computers. In OpenLB-UQ, we leverage the efficiency of OpenLB for large-scale flow sampling with a dedicated and integrated UQ module. The methodological focus is placed on non-intrusive stochastic collocation (SC) methods based on generalized polynomial chaos (gPC) and (quasi-)Monte Carlo (MC) sampling. The OpenLB-UQ framework is extensively validated in convergence tests with respect to statistical metrics and sample efficiency using selected benchmark cases, including two-dimensional Taylor-Green vortex (TGV) flows with up to four-dimensional uncertainty and a flow past a cylinder. Our results confirm the expected convergence rates and demonstrate promising scalability, robust statistical accuracy, as well as computational efficiency across thousands of samples and thousands of CPU cores. 
We present results from an efficient UQ workflow based on OpenLB-UQ for large-eddy simulations (LES) of real urban geometries (within the city of Reutlingen, Germany) including numerically assimilated measurement data of wind speed and flow direction. Key quantities of interest, including the mean flow field, standard deviation, and confidence intervals, are obtained without modifying the deterministic solver. By leveraging a scalable ensemble implementation, the workflow achieves faster-than-real-time performance, completing a 48-hour inflow sequence in only 15 hours of wall-clock time. Our results confirm the scalability of the framework, its convergence properties for mathematically well-defined test cases, and its applicability to realistic turbulent fluid flows. Current research is focused on the structural refactoring of the framework's I/O toward a fully GPU-accelerated LBM backend to enable high-throughput sampling of complex fluid systems. After integrating the existing transparent parallelization of the LBM collision and streaming kernels in OpenLB and adapting the UQ layer of OpenLB-UQ for GPU-aware background data manipulations, the 3D RTGV will be used as a validation case for analyzing the speedup achieved compared to the CPU-based OpenLB-UQ implementation. This work will pave the way for UQ in industrial-scale turbulent applications, including complex internal and external flows. |
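As a rough illustration of the non-intrusive sampling idea in the OpenLB-UQ abstract above (mean, standard deviation, and confidence intervals obtained without modifying the deterministic solver), the following Python sketch runs plain Monte Carlo sampling around a toy model. The function `toy_solver`, the uniform viscosity range, and the sample count are hypothetical stand-ins chosen for illustration; the sketch does not use the OpenLB-UQ API.

```python
import math
import random
import statistics


def toy_solver(viscosity: float) -> float:
    """Hypothetical stand-in for one deterministic flow simulation.

    Returns exp(-2 * viscosity), a decaying-mode amplitude at t = 1;
    a real study would launch the unchanged CFD solver here.
    """
    return math.exp(-2.0 * viscosity)


def monte_carlo_uq(n_samples: int, seed: int = 0):
    """Non-intrusive Monte Carlo: sample the input, rerun the solver as-is."""
    rng = random.Random(seed)
    # Assumed input uncertainty: viscosity uniform in [0.01, 0.02].
    outputs = [toy_solver(rng.uniform(0.01, 0.02)) for _ in range(n_samples)]
    mean = statistics.fmean(outputs)
    std = statistics.stdev(outputs)
    # Normal-approximation 95% confidence interval for the sample mean.
    half_width = 1.96 * std / math.sqrt(n_samples)
    return mean, std, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    mean, std, ci = monte_carlo_uq(2000)
    print(f"mean={mean:.5f} std={std:.5f} 95% CI=({ci[0]:.5f}, {ci[1]:.5f})")
```

The solver is treated as a black box, which is what makes the approach non-intrusive; stochastic collocation would replace the random draws with quadrature nodes and weights while keeping the same structure.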
| 10:30 | Validated CFD-EFD Assessment of Fin Stabilizer Impacts on Hydrodynamic Performance, Motions, and Fin Loads of Atlantic Canadian Fishing Vessels PRESENTER: Fatima Jahra ABSTRACT. Passive fin stabilizers are widely used on fishing vessels to mitigate roll motion and improve safety and operability. However, the hydrodynamic penalties and local loading effects associated with fin installation are often not quantified with sufficient rigour, leading to conservative or uneconomical designs. This study presents a combined experimental fluid dynamics (EFD) and computational fluid dynamics (CFD) investigation of fin stabilizers installed on a representative Atlantic Canadian fishing vessel, with emphasis on resistance, powering, motion responses, and fin base loads. A 1:7.5-scale model was tested in calm water and regular waves to measure resistance, power, motions, and fin loads under controlled conditions. In parallel, full-scale CFD simulations were performed to resolve viscous flow features, fin-hull interactions, and local pressure distributions governing fin loading. A structured validation and uncertainty analysis framework was implemented. Experimental uncertainties were quantified by accounting for instrumentation accuracy, repeatability, and scaling effects, while numerical uncertainties were evaluated through grid-refinement studies, time-step sensitivity analyses, and turbulence-model assessments. The CFD predictions agreed with measured resistance and power within approximately 5%, and fin load trends were captured within the combined experimental and numerical uncertainty bounds. The results demonstrate that while fin stabilizers provide measurable roll reduction, they also increase resistance and generate localized hydrodynamic loads at the fin root, which are critical for structural assessment. 
Although structural optimization was not undertaken in this study, the quantified load envelopes provide an essential basis for future structural design and optimization work aimed at maximizing motion reduction, minimizing power penalties, and mitigating the risks of structural damage or capsize under extreme operating conditions. The validated CFD-EFD framework developed herein establishes a reliable methodology for assessing fin performance and loading, supporting improved design guidance for motion stabilization systems in small commercial fishing vessels. |
| 10:55 | Fast and Accurate Prediction of High-Resolution Airfoil Ice Accretion Using Gradient Boosting Models Trained on Validated CFD Simulations PRESENTER: Sobhan Ghorbani Nohooji ABSTRACT. Aircraft icing is one of the most critical safety hazards in aviation because it degrades aerodynamic performance, increases drag, reduces lift, and compromises stability. Ice accretion occurs when supercooled water droplets impact exposed surfaces and freeze, forming rime, glaze, or mixed ice depending on atmospheric conditions. These ice formations modify the aerodynamic geometry of lifting surfaces and can significantly reduce stall margins. Similar challenges are also observed in wind energy systems, where ice accretion on turbine blades operating in cold climates reduces power output and increases structural loads. Traditional ice accretion prediction relies on high-fidelity numerical solvers that couple Navier–Stokes flow computation with droplet impingement and thermodynamic ice growth modeling. Although physically accurate, such simulations are computationally expensive and require repeated mesh deformation as ice evolves. This limits their use in real-time applications and large parametric studies involving thousands of atmospheric scenarios. To address this limitation, this study develops a systematic benchmark for fast and accurate prediction of full ice accretion profiles using machine learning models trained on a validated high-fidelity dataset. A total of 5120 icing simulations are generated using the FENSAP-ICE framework for a NACA0012 airfoil under systematically varied free-stream velocity, angle of attack, ambient temperature, liquid water content, median volumetric diameter, and icing time. The learning task is formulated as a direct regression from flight and atmospheric parameters to the high-resolution ice-thickness distribution sampled along the airfoil surface. 
Both raw input variables and geometry-aware engineered features are employed, including surface tangents, normals, curvature, and arc-length descriptors, together with physically motivated interaction terms. Five predictive models are evaluated under identical training and testing conditions. Three gradient boosting algorithms—CatBoost, XGBoost, and LightGBM—are compared against two neural network-based approaches: a geometrically constrained neural network and a convolutional autoencoder-based architecture. Hyperparameters are optimized using a unified framework to ensure a fair comparison. Model performance is assessed using both pointwise error metrics and shape-aware similarity measures to quantify geometric deviations between predicted and reference ice profiles. Results demonstrate that gradient boosting methods consistently outperform neural network models in both numerical accuracy and geometric fidelity. Among the evaluated approaches, CatBoost achieves the highest predictive performance with an R-squared value of 0.998 and very low root-mean-square error, while maintaining robustness across diverse icing regimes, including cases with pronounced horn structures. The proposed benchmark establishes a quantitative baseline for real-time-oriented ice accretion prediction and highlights the strong potential of ensemble learning methods as efficient surrogate models for high-fidelity icing simulations. |
| 11:20 | Prediction of aerodynamics installation effects of USF using body force modeling and metric-based mesh adaptation PRESENTER: Armandojanni Petrucci Orefice ABSTRACT. In recent years, efforts to increase aircraft engine efficiency and reduce losses have highlighted the Unshrouded Single Fan (USF) architecture as a promising option. By enabling larger fan diameters and smaller nacelles, USF can enhance propulsive efficiency while reducing parasite drag and engine weight, but it remains an industrially disruptive and not yet fully consolidated solution. A key challenge is the strong interaction between this propulsion system and the aircraft, notably the aerodynamic coupling between wing and propulsor that motivates the present study, together with aeroacoustic, structural and aeroelastic constraints, all of which demand significant development and testing effort. CFD is therefore used to limit experimental costs and support performance prediction and optimization, but full aircraft–engine simulations are often too expensive in preliminary design. To alleviate this, reduced-order or surrogate models are introduced to reproduce propeller effects on the flow, such as actuator disk approaches. A more recent reduced-order class is the body force method, which replaces blades with source-term distributions capturing flow turning and entropy rise. This avoids blade meshing, yields steady computations even with inflow distortion, and has been applied to UHBR installation studies and propeller configurations. The present work also exploits anisotropic mesh adaptation for turbulent RANS flows, shown to deliver accurate predictions on fully unstructured tetrahedral meshes with 20–100 times fewer cells than expert meshes, making discretization error negligible. The objective is to combine body force modeling and mesh adaptation to obtain steady, mesh-converged RANS solutions for USF installations over the flight envelope. 
The proposed strategy pursues two main targets: on the one hand, the body force modeling of the propeller makes steady (RANS) simulations in the aircraft frame possible; on the other hand, the mesh adaptation strategy ensures mesh convergence of the solution. Guaranteeing both targets at any flight condition is a challenging task, and to the best of the authors' knowledge there are no other examples in the literature that use body force models in conjunction with mesh adaptation strategies to achieve this result. The paper first presents the body force model, then the adaptive solution platform, and finally demonstrates the method on isolated and installed USF cases. |
| 11:45 | Numerical Assessment of the Impact of Icing on Aerodynamics and Mission Safety of Small Unmanned Aerial Vehicles PRESENTER: Zoia Sazanishvili ABSTRACT. Small unmanned aerial vehicles (UAVs) are especially vulnerable to atmospheric icing due to their limited aerodynamic margins, low flight altitudes, and high sensitivity to surface contamination. At the same time, traditional icing certification criteria developed for manned aircraft do not adequately reflect the operational specifics of small UAVs. This paper presents a numerical methodology for assessing the impact of icing on the aerodynamic performance and mission safety of small UAVs based on coupled simulation of airflow, supercooled droplet transport, and ice accretion. The approach combines CFD modeling of viscous flow, a multiphase description of the air–droplet mixture, and a surface ice growth model. Typical low-altitude meteorological scenarios involving supercooled droplets, drizzle, and freezing rain are considered. Using a representative airfoil configuration, a systematic multiparametric numerical study is performed over a wide range of flight speeds, ambient temperatures, liquid water content, and droplet sizes. The evolution of ice shapes and their influence on lift, drag, and stall characteristics are analyzed. Based on the numerical results, integral criteria and nomograms for rapid icing risk assessment are constructed. The simulations show that even relatively small ice accretions or increased surface roughness can lead to a drastic degradation of aerodynamic performance and controllability of small UAVs. The proposed numerical framework can be used both at the mission planning stage and, in combination with meteorological data, for operational icing risk assessment and flight profile adaptation. |
| 14:00 | Finite volume-particle method (FVPM) for free surface flow PRESENTER: Jiawang Zhang ABSTRACT. This study proposes a novel finite volume-particle method (FVPM) for accurately and efficiently simulating free surface flows. The proposed FVPM synergistically combines the Eulerian finite volume method (FVM) on unstructured meshes with the Lagrangian smoothed particle hydrodynamics (SPH) approach. Specifically, the mesh-based FVM is employed in regions distant from the free surface to leverage its computational efficiency and numerical accuracy, while a weakly compressible SPH formulation is applied in the vicinity of the interface to maintain robust free-surface tracking capabilities. A key innovation of this framework is an adaptive bidirectional conversion strategy that dynamically transfers computational domains between mesh cells and particles, thereby capitalizing on the respective strengths of both methodologies. To ensure seamless data exchange across the Lagrangian-Eulerian interface, a dedicated ghost cell-particle algorithm is developed, enabling stable and accurate information transfer between the two discretization paradigms. Furthermore, an isothermal gas-kinetic scheme incorporating gravitational effects and compatible with arbitrary equations of state is formulated to enhance the physical fidelity of free surface simulations. The performance and reliability of the proposed FVPM are rigorously validated through a series of benchmark test cases involving complex free surface phenomena, including dam breaks and wave impacts. Numerical results demonstrate that the hybrid method achieves superior accuracy compared to pure SPH approaches while significantly reducing computational cost. |
| 14:25 | Non-Inertial Lagrangian Particle Tracking on Arbitrarily Moving Grids PRESENTER: Francesco Caccia ABSTRACT. We present the development, verification, and validation of a numerical algorithm for Lagrangian particle tracking in one-way coupled multiphase flows over grids undergoing arbitrary motion. The carrier flow is assumed to be known a priori and computed on a moving mesh within an Arbitrary Lagrangian-Eulerian framework. Particles are advanced in a non-inertial reference frame attached to the moving grid, while their inertial state is restored at discrete CFD time levels. The formulation relies solely on the discrete mesh motion provided by the carrier flow solver and does not require continuous interpolation of the grid kinematics. The overall framework is verified through analytical tests and validated on an unsteady pitching airfoil configuration. |
| 14:50 | Revisiting Edge-Based Non-Linear V4-Scheme for Turbulent Flows using Metric-Based Anisotropic Mesh Adaptation PRESENTER: Frederic Alauzet ABSTRACT. The second-order vertex-centered Mixed-Element-Volume (MEV) MUSCL scheme has been very successful for steady Reynolds-Averaged Navier-Stokes (RANS) simulations when coupled with metric-based anisotropic mesh adaptation, notably for sonic boom prediction, high-lift prediction, vortex-dominated flows, compressors, film-cooled turbines, and many other applications. At the core of the MEV method lie the edge-based linear V4-scheme and V6-scheme, which discretize the convective terms. Generally, the V4-scheme is considered for steady applications thanks to its good iterative convergence property, and the V6-scheme is used for unsteady applications, in particular for eddy-resolved simulations because of its lower dissipation. In this work, we focus on the V4-scheme but the results extend easily to the V6-scheme. Theoretically, the linear V4 convective numerical scheme is third-order accurate for linear advection on uniform unstructured or structured meshes. If non-linear equations are considered and/or if the mesh is non-uniform, then it becomes a low-dissipation second-order scheme with a fourth-order numerical dissipation, which makes it very accurate. On the numerical side, this scheme is very efficient because it reduces to solving a 1D flux for each edge. It thus requires only one loop over the edges to compute the MUSCL extrapolation, the limiter, and the approximate Riemann solver for convective fluxes. In comparison, more classical edge-based methods using stencil limiters require one loop to compute solution extrema in each vertex neighborhood, one loop to compute the limiter coefficient, and one loop to compute the MUSCL extrapolation and convective fluxes. In the quest for an even more accurate numerical scheme, Koobus et al.
proposed a non-linear V6-scheme, but its formulation makes it very difficult to apply a limiter to the MUSCL extrapolation. Consequently, it was not usable for practical applications. Recent work on economical high-order flux-solution-reconstruction (FSR) schemes and the third-order edge-based convective scheme gives clues for solving that issue. The goal of this paper is to combine both approaches to propose a new non-linear V4-scheme. Theoretically, the non-linear V4-scheme should exhibit the same properties as the linear V4-scheme on uniform unstructured or structured meshes, including for the Euler equations. If non-uniform meshes (such as adapted meshes) are considered, then it becomes a low-dissipation second-order scheme with a fourth-order numerical dissipation. We expect this scheme to be more accurate than the linear one when solving the RANS equations. This work will fairly compare the linear and the non-linear V4-schemes on realistic applications in aeronautics. |
| 15:15 | General Riemann Solvers and Dissipation Control for Third-Order Edge-Based Schemes for Metric-Based Adapted Simulations PRESENTER: Cosimo Tarsia Morisco ABSTRACT. In the last decade, second-order schemes for anisotropic metric-based mesh adaptation have been shown to be robust and very accurate for industrial applications on complex geometries. However, these schemes provide by definition first-order gradients and consequently engineering functionals involving such quantities (e.g., skin friction, heat flux, etc.) are poorly captured. The challenge of the future is developing robust high-order schemes for highly anisotropic adapted meshes. In this scenario, the third-order edge-based method is one of the most efficient third-order accurate discretization algorithms for general tetrahedral meshes. One main advantage is that it only requires a single numerical flux evaluation per edge, just like the second-order method. In this way, the residuals can be computed efficiently in a loop over edges followed by a loop over boundary elements, and third-order accuracy is achieved with flux and solution reconstruction using second-order accurate gradient algorithms, requiring neither second derivatives nor high-order curved meshes. The most common algorithm for second-order accurate gradients is the quadratic least-squares (QLS) method. However, this method is generally known to destabilize a flow solver, especially for very thin and highly curved meshes. For inviscid flow problems, it has been shown that a quadratic implicit edge-based gradient (QIEBG) is a superior alternative and improves both the accuracy and iterative convergence of the third-order edge-based Euler solver. A robustness similar to second-order schemes was recently shown also on highly anisotropic adapted meshes, providing great convergence improvements with respect to second-order simulations for smooth and shock-dominated flows.
Moreover, a clever dedicated implementation makes the QIEBG algorithm as convenient as QLS in terms of CPU cost. All of that makes the former the current best candidate to handle third-order accurate solutions for industrial applications. Up to now, the third-order edge-based method has been easily applicable only to the Roe flux function or those in a similar form, i.e., the averaged flux with a dissipation term, but not to others. This is because rewriting general flux functions in a similar form is not necessarily a straightforward process. In this paper, we present a flux-correction form of the third-order edge-based method that enables us to directly call general flux functions. Specifically, we focus on its applications to adaptive tetrahedral meshes towards efficient and automated computational fluid dynamics (CFD) simulations with anisotropic metric-based mesh adaptation. Adapted meshes and solutions obtained for different approximate Riemann solvers will be compared and discussed. |
| 15:40 | An Alternative Finite Volume Discretization of Multidimensional Compressible Euler Equations on General Unstructured Grids PRESENTER: Pierre-Henri Maire ABSTRACT. The Euler equations constitute a system of conservation laws governing the evolution of mass, momentum, and total energy in compressible fluid flows. The development of accurate and robust numerical discretizations for the multidimensional compressible Euler equations on general unstructured grids remains a major challenge, particularly in the presence of strong discontinuities. An ideal numerical method should not only correctly capture strong shocks and rarefaction waves while remaining minimally sensitive to grid irregularities, but should also accurately resolve the low-Mach-number limit without resorting to ad hoc switching parameters. To meet these stringent requirements, we investigate the potential of so-called factorizable schemes. Factorizable discretizations for the Euler equations were introduced more than two decades ago by Sidilkover and have recently been revisited to extend their applicability to Lagrangian hydrodynamics—that is, the Euler equations formulated in a Lagrangian framework. Here, we propose to utilize this unconventional approach to design a novel cell-centered Finite Volume discretization for the compressible Euler equations characterized by a genuinely multidimensional flux vector splitting between the Lagrangian and the convective parts of the numerical flux. |
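For reference, the conservation-law form of the compressible Euler equations referred to in the abstract above can be written as below. The notation is assumed here for illustration and is not fixed by the abstract: density $\rho$, velocity $\mathbf{u}$, pressure $p$, specific total energy $E$, and identity tensor $\mathbf{I}$.

```latex
\begin{aligned}
\partial_t \rho + \nabla\cdot(\rho\,\mathbf{u}) &= 0, \\
\partial_t(\rho\,\mathbf{u}) + \nabla\cdot\left(\rho\,\mathbf{u}\otimes\mathbf{u} + p\,\mathbf{I}\right) &= \mathbf{0}, \\
\partial_t(\rho E) + \nabla\cdot\left((\rho E + p)\,\mathbf{u}\right) &= 0.
\end{aligned}
```

The three lines express conservation of mass, momentum, and total energy, respectively; the system is closed by an equation of state relating $p$ to $\rho$ and the internal energy.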
| 14:00 | Look-ahead Adaptive Mesh Refinement on Hierarchical Cartesian Grids PRESENTER: Julian Vorspohl ABSTRACT. Adaptive Mesh Refinement (AMR) on hierarchical Cartesian grids is a fundamental tool for high-fidelity fluid simulations. However, on modern GPU-accelerated hardware, the overhead of dynamic memory reallocation and memory constraints often make the grid adaptation process significantly slower than the numerical solution itself. To maximize performance, it is necessary to minimize both the number of refined cells and the frequency of adaptation cycles without sacrificing accuracy. This work introduces a novel "look-ahead" AMR strategy designed to optimize the balance between refinement density and adaptation frequency. The methodology utilizes a performance model that quantifies wall-clock time as a function of refinement count and adaptation intervals. Unlike standard AMR, which relies on instantaneous sensor values, the proposed algorithm advects adaptation sensors by solving an additional transport equation. This predictive step allows the method to determine where refinement will be required in future time steps. To maintain efficiency, this transport equation is solved on the lower levels of the hierarchical grid. Preliminary results, applied to a Lattice-Boltzmann method within the m-AIA solver framework, demonstrate that the look-ahead approach primarily adds cells in the direction of flow features (such as vortex motion), allowing for larger adaptation intervals for a similar number of cells. Benchmarks conducted on both CPU (AMD EPYC 7742) and GPU (AMD MI300A) systems show that predictive mesh refinement reduces simulation times. Specifically, on GPU-based systems, the performance was increased by 28.6%. The results suggest that reducing memory-intensive adaptation cycles leads to substantial performance benefits across diverse architectures. |
| 14:25 | High-Order Conservative Solution Transfer for Anisotropically Adapted Tetrahedral Meshes PRESENTER: Tomáš Levý ABSTRACT. We present a conservative high-order solution transfer procedure for three-dimensional tetrahedral meshes in the context of time-dependent anisotropic mesh adaptation. The method is formulated as a local Galerkin projection between non-matching meshes and relies on a robust supermesh construction based on tetrahedron–tetrahedron intersections. Particular emphasis is placed on geometric robustness, conservation, and preservation of formal accuracy. The transfer algorithm is subsequently embedded into a time-accurate hybridized discontinuous Galerkin (HDG) framework with predictor-based anisotropic mesh adaptation. Numerical experiments confirm optimal high-order convergence and demonstrate stability under successive applications in fully three-dimensional adaptive simulations. |
| 14:50 | A Subelement-Based Strategy for Hanging-Node Resolution in Tree-Based Adaptive Mesh Refinement PRESENTER: Lena Plötzke ABSTRACT. 1. Introduction Adaptive mesh refinement (AMR) locally adapts the mesh resolution according to an adequate error indicator in mesh-based simulations and is widely employed in computational fluid dynamics (CFD). By refining the mesh only in regions where higher accuracy is required, AMR enables simulations to capture complex flow dynamics while keeping computational cost manageable. This localized refinement significantly reduces the number of elements compared to uniformly refined meshes, resulting in lower memory consumption and shorter solver run times while maintaining nearly the same level of accuracy. In the forest-of-trees approach, refinement relationships are organized in multiple refinement trees, each associated with a coarse mesh element. Space-filling curves (SFCs) are employed to uniquely identify elements, enable efficient data layouts, and support scalable parallel algorithms for mesh adaptation, partitioning, and ghost-layer construction. While AMR significantly improves efficiency by concentrating resolution where needed, the complexity of managing dynamic, parallel adaptive meshes introduces substantial development overhead, motivating the use of dedicated mesh-management libraries such as t8code. The open-source library t8code (pronounced "tetcode") provides a modular implementation and extends the forest-of-trees approach to a wide range of element types relevant to CFD applications. Nevertheless, the adoption of AMR in solver frameworks is often limited by the challenge of hanging nodes, which are element corners without counterparts in neighboring elements. A mesh that satisfies a one-to-one correspondence between element faces and their neighbors is called conformal. Hierarchical AMR meshes are in general non-conformal, creating hanging nodes where an element meets neighbors of a different refinement level.
Hanging nodes are particularly problematic for finite-element formulations, which generally require conformal meshes to update solution values. Although finite-volume and discontinuous Galerkin methods can handle non-conformal meshes through special techniques like the mortar method, doing so increases implementation effort and algorithmic complexity, such that conformal meshes remain preferable. Existing methods for resolving hanging nodes geometrically are frequently restricted to specific element types or computationally expensive. In three dimensions, additional complications arise from the rapid growth of element numbers, making the development of robust, general, and scalable solutions particularly challenging. In this conference contribution, we present a conceptual design and current progress in the development of a general subelement-based strategy for hanging-node resolution in tree-based AMR. Although hanging-node resolution has been studied for decades, the presence of mixed-element meshes and higher-dimensional refinement continues to pose unresolved theoretical and practical challenges. Our strategy introduces subelements as temporary transition elements inserted after the standard recursive refinement step to geometrically restore mesh conformity. These subelements preserve the global AMR structure, integrate seamlessly with t8code’s tree-based data structures, and are removed in subsequent adaptation cycles. Beyond hanging-node resolution, the modular subelement framework will provide a foundation for further applications, including element-type conversion, anisotropic refinement for boundary layers, and GPU-optimized subpatches. This contribution presents the methodological framework and design principles of the subelement approach, highlighting its flexibility and its potential to overcome a key barrier to adopting AMR in CFD solvers. 2. 
Methodology The proposed methodology introduces a subelement-based framework for tree-based AMR, providing a modular and extensible approach to enhance mesh conformity while preserving hierarchical structure, parallel scalability, and solver compatibility. 2.1 Subelement Concept For a well-designed, extensible implementation of hanging-node resolution, we introduce the concept of subelements and integrate it into t8code’s core algorithms. Subelements may be inserted after the standard recursive refinement as an additional refinement strategy and are discarded in the subsequent adaptation cycle to permit further standard refinement. Therefore, subelements always represent the final level in a refinement tree. For hanging-node resolution, subelements act as an intermediate refinement layer that can bridge differences in refinement levels. Subelements are designed to support all occurring element types in a mesh and operations beyond hanging-node resolution. Another application for subelements is an element-type conversion after the adaptation step, e.g., converting tetrahedra to hexahedra through subdivision. While mesh generators typically use tetrahedra to mesh complex 3D domains, there exist solvers for which hexahedral elements are preferable. Subelements allow converting tetrahedra into hexahedra by subdividing each into four hexahedra after the adaptation step. Further applications include y+ refinements or subpatches for GPU parallelization. A key objective of this work is to enable the stacking of different subelement types. This allows multiple subelement operations to be applied in sequence, such as first resolving hanging nodes in a triangular mesh and then performing a type conversion to quadrilaterals. By isolating additional refinement logic within a modular subelement layer, the framework maintains the hierarchical structure of tree-based AMR, preserves SFC indexing, and ensures scalability for parallel execution.
2.2 Subelements for Hanging-Node Resolution A primary application of subelements is the resolution of hanging nodes, which arise at faces between elements of different refinement levels. The proposed methodology follows a two-step approach: - Mesh balancing: The mesh is first balanced such that neighboring elements differ by at most one refinement level. This step ensures that we only have to bridge a difference of one refinement level with subelements. The corresponding balancing algorithm is implemented in t8code. - Subelement insertion: Remaining hanging nodes are resolved by inserting subelements for elements with non-conforming faces to restore local conformity. These subelements are removed in subsequent adaptation cycles to allow the use of the standard recursive refinement scheme and to limit the number of mesh elements. The methodology has been validated for 2D quadrilateral meshes through theoretical analysis and a preliminary implementation in t8code. It will be extended to triangular and hybrid 2D meshes through red/green refinement. In 3D, hanging-node resolution is more challenging due to the rapid increase in the number of elements and the greater geometric complexity involved in resolving all hanging nodes. Established schemes exist for tetrahedral meshes. A preliminary approach for resolving hanging nodes in hexahedral meshes has already been developed. However, it should be systematically compared, for example, to 1:27 hexahedral refinement strategies, with particular attention to performance and computational efficiency. Our objective is to integrate existing hanging-node resolution techniques into the subelement logic and implement them in the software library t8code, preserving its robustness, modularity, performance, and scalability. We will examine the methods and clarify key concepts such as extending the SFC approach, interpolating data, combining different subelement types, and applying the methodology to hybrid meshes. 3.
Conclusions In this conference contribution, we present the conceptual design and current state of development of a general, modular subelement framework for hanging-node resolution in tree-based AMR. Hanging nodes can pose challenges in mesh-based simulations, as numerical methods either require additional handling of hanging nodes or rely on strict mesh conformity. The proposed approach introduces subelements as transition elements that locally restore mesh conformity while preserving the global AMR hierarchy, SFC indexing, and parallel scalability. Beyond hanging-node resolution, the framework is designed to support additional mesh operations, such as element-type conversion, anisotropic boundary-layer refinement, and GPU-optimized subpatches. The methodology has been validated for 2D quadrilateral meshes and will be extended to triangular, hybrid, and 3D meshes. By integrating the subelement logic into the open-source t8code library, the results will be made publicly available and solver compatibility is ensured. Overall, this research aims to overcome a key barrier to the adoption of AMR in CFD solvers and to provide a robust, extensible methodology. |
| 15:15 | Hybrid Mesh Adaptation using metric field for high-order Discontinuous Galerkin methods PRESENTER: Dipendrasingh Kain ABSTRACT. Flows around aerospace vehicles are dominated by complex anisotropic features, such as shocks and the boundary layer. All such important flow features should be sufficiently resolved to obtain accurate quantities of interest, such as drag or lift. Towards this end, metric-based mesh adaptation provides a robust mathematical approach for generating unstructured, adaptively refined meshes (triangles in 2D and tetrahedra in 3D) in a fully automated manner. However, for viscous, high-speed flows, it is necessary to have a structured layer of elements within the boundary layer to obtain accurate, smooth distributions of skin friction or heat flux. In this regard, we propose a hybrid-mesh adaptation framework that produces quad-dominant meshes having adapted layers of quads aligned with the viscous wall, within the boundary layer. The adapted structured layer is generated using an advancing-layer approach coupled with a metric field, resulting in layers of right-angled triangles aligned with the wall within the boundary layer. The tangential and normal spacings of layers are given by the metric field. The metric field is derived by minimizing the high-order interpolation error estimate in the Mach number. The mesh outside the boundary layer is generated using an advancing-front approach coupled with a metric field, resulting in right triangles aligned orthogonally to the flow features. The hybrid-adapted mesh is obtained by combining the triangles in the adapted triangular mesh using suitable quality criteria based on the internal angles of the quad, resulting in structured layers of adapted quads within the boundary layer. The proposed framework is coupled with a high-order solver based on Hybridized Discontinuous Galerkin (HDG) discretization. 
A benchmark test case of turbulent flow over a flat plate, taken from the NASA Turbulence Modeling Resource (TMR) website, is simulated using the HDG solver on both unstructured triangular meshes and hybrid-adapted meshes. The adapted meshes achieve drag convergence with almost 20 times fewer degrees of freedom than the two finite-volume solvers, CFL3D and FUN3D, on the NASA TMR website, which used expert-generated, fixed (non-adapted) meshes. In addition, the hybrid-adapted mesh yielded smoother, more accurate distributions of the skin-friction coefficient than a fully unstructured mesh, demonstrating the advantage of the proposed hybrid-mesh adaptation framework. |
| 15:40 | A dual adaptive multi-resolution method based on hybrid block-tile SIMD parallel architectures PRESENTER: Xuzhen Xie ABSTRACT. Adaptive mesh refinement methods are crucial for solving multi-scale flow phenomena, balancing computational accuracy with cost. They are primarily categorized into Block-based and Cell-based methods. Block-based methods, while advantageous in regular data structures, struggle with coarse refinement and grid redundancy. In contrast, Cell-based methods offer better grid reduction but suffer in grid transition quality and data locality, which affects numerical accuracy and parallel efficiency. This paper introduces a Block-Tile hybrid adaptive mesh refinement method designed for SIMD parallel architectures, establishing a dual-level data organization with Tiles as the core computational unit. The method uses Block structures for coarse-grained multi-resolution analysis up to level Lmax−1 and switches to Tile-based refinement to address over-refinement and enhance data locality. Tiles enable efficient error estimation and refinement based on flow characteristics, particularly in capturing shocks and gradients. Furthermore, the Tile concept is tailored to fit modern CPU architectures, aligning with SIMD register widths to facilitate efficient data computation and minimize latency. The flexible Tile structure supports task scheduling across multiple cores, enhancing load balancing and computational performance through both multi-core parallelism and vector acceleration. |
| 14:00 | An Implicit Two-Stage Fourth-Order Subcell Finite Volume Gas-Kinetic Solver on Unstructured Meshes PRESENTER: Bintao Yang ABSTRACT. 1. Introduction The gas-kinetic scheme (GKS) is based on the space-time evolving solution of the BGK model to construct numerical fluxes [1] and possesses the distinctive ability to simultaneously provide the flux and its time derivative [2]. The two-stage fourth-order (S2O4) time discretization method takes advantage of this property and achieves fourth-order temporal accuracy with only two stages [3,4]. Zhang et al. [5,6] combined the S2O4 method with the subcell finite volume (SCFV) method, realizing fourth-order spatial reconstruction on a compact stencil involving only face neighbors, and developed a compact high-order explicit GKS suitable for unstructured meshes. It inherited the advantages of FV, such as robustness and high resolution, and the compactness of schemes based on internal degrees of freedom, which is beneficial for parallel computing and complex boundary treatment. Nevertheless, explicit time-stepping is restricted by the CFL condition, which often severely limits its engineering applications. Cao et al. [8] combined LU-SGS [7] with the S2O4-GKS to develop an implicit high-order scheme, but for structured meshes. The present work targets unstructured meshes and integrates the compact reconstruction of the SCFV method, the high efficiency of the S2O4 time-stepping, and the large time-step capability of the implicit method to develop an implicit high-order GKS that simultaneously possesses compactness, high accuracy, and high efficiency. 2. Methodology Within the SCFV framework, each grid cell is subdivided into subcells, and the subcell-averaged values are stored as the fundamental variables. A cubic polynomial with zero mean is reconstructed on each cell. The reconstruction stencil involves only the target cell's own subcells and those of its face neighbors. 
The polynomial coefficients are determined via a constrained least-squares approach to ensure conservation on the subcells. GKS constructs a time-dependent gas distribution function at the cell interface from the integral solution of the BGK equation: Consequently, both the numerical flux and its time derivative can be obtained simultaneously. This forms the basis for the S2O4 time stepping, which advances the solution in two stages: where Q represents the conservative variables, F the flux, and R the spatial discretization operator. Compared to the traditional fourth-order Runge-Kutta method, the S2O4 method achieves fourth-order accuracy with only two stages, significantly improving computational efficiency. For implicit time advancement, the LU-SGS method is applied within each of the two stages. Taking the first stage as an example, the implicit formulation is written as: After linearization, the LU-SGS method solves the resulting linear system by decomposing the implicit operator into lower, diagonal, and upper triangular parts and performing a two-sweep procedure. 3. Conclusions A subcell finite volume gas-kinetic scheme on unstructured meshes based on an implicit two-stage fourth-order temporal discretization is developed. By incorporating the LU-SGS implicit method, the CFL stability restriction of explicit schemes is overcome, allowing much larger time steps and leading to a significant improvement in computational efficiency. Numerical tests validate the fourth-order accuracy in both space and time and demonstrate the efficiency of the proposed scheme. This work provides a basis for the development of high-order implicit GKS applicable to engineering problems. References [1] Xu K. A gas-kinetic BGK scheme for the Navier-Stokes equations. J Comput Phys, 2001, 171:289-335. [2] Li Q B, Xu K, Fu S. A high-order gas-kinetic Navier-Stokes flow solver. J Comput Phys, 2010, 229:6715-6731. [3] Li J, Du Z. 
A two-stage fourth order time-accurate discretization for Lax-Wendroff type flow solvers I: Hyperbolic conservation laws. SIAM J Sci Comput, 2016, 38:A3046-A3069. [4] Pan L, Xu K, Li Q B, Li J. An efficient and accurate two-stage fourth-order gas-kinetic scheme for the Euler and Navier-Stokes equations. J Comput Phys, 2016, 326:197-221. [5] Zhang C, Li Q B, Song P, Li J. Two-stage fourth-order gas kinetic solver based compact subcell finite volume method for compressible flows on triangular meshes. Phys Fluids, 2021, 33:126108. [6] Zhang C, Li Q B, Song P, Li J. Two-stage fourth-order subcell finite volume method on hexahedral meshes for compressible flows. Phys Fluids, 2022, 34:086110. [7] Yoon S, Jameson A. An LU-SSOR scheme for the Euler and Navier-Stokes equations. AIAA J, 1988, 26:1025-1026. [8] Cao G, Su H, Xu J, et al. Implicit high-order gas kinetic scheme for turbulence simulation. Aerosp Sci Technol, 2019, 92:958-971. |
| 14:25 | A Class of Efficient and Robust Gas Kinetic Scheme Using Discontinuity Feedback PRESENTER: Hong Zhang ABSTRACT. This work proposes a discontinuity-control strategy for high-order gas-kinetic schemes by introducing a Discontinuity Feedback Factor (DFF) into reconstruction. Interface non-smoothness at t^n is fed back to modulate reconstruction at t^{n+1}: high-order accuracy is preserved in smooth regions, while high-order components are adaptively suppressed near strong discontinuities to increase stability through controlled dissipation. The framework is unified for both structured and unstructured grids with low extra memory and computational cost. Results demonstrate effective suppression of non-physical oscillations around shocks and contact-type discontinuities, while maintaining smooth-region accuracy and key flow structures. The method offers a compact, robust, and general enhancement for high-order gas-kinetic simulations of strongly compressible flows. |
| 14:50 | An Effective Implementation of High-order Compact Gas-kinetic Scheme for Compressible Flows PRESENTER: Yaqing Yang ABSTRACT. High-resolution simulations are indispensable for resolving turbulent flows. To balance accuracy, robustness, and efficiency, a novel fifth-order compact gas-kinetic scheme (CGKS-5th) is developed for compressible flows on structured meshes. This scheme utilizes a new multidimensional compact reconstruction that incorporates line-averaged derivatives as additional degrees of freedom to achieve superior resolution on a compact stencil. For non-orthogonal meshes, reconstruction is performed in a transformed computational space, enabling a unified polynomial form that significantly reduces memory usage and computational complexity. A nonlinear adaptive method ensures high accuracy and robustness by smoothly transitioning from a high-order linear scheme in smooth regions to a second-order scheme at discontinuities. Furthermore, the scheme is accelerated via multi-GPU parallelization for efficient large-scale applications. Comprehensive numerical tests, ranging from subsonic to supersonic turbulence, validate the scheme's high accuracy, resolving capability, and excellent robustness. Performance comparisons against a conventional second-order gas-kinetic scheme (GKS-2nd) demonstrate that CGKS-5th achieves comparable solution quality at approximately an order of magnitude lower computational cost. This evaluation provides the first clear verification of the efficiency advantages of high-order compact gas-kinetic schemes in simulating viscous flows with discontinuities. To further extend the scheme's applicability to practical engineering problems, a non-equilibrium wall model is integrated to establish a wall-modeled CGKS-5th, enabling high-fidelity simulations of complex industrial flows. |
| 15:15 | Generalized ENO Reconstruction and Compact Gas-Kinetic Scheme for Compressible Flows PRESENTER: Fengxiang Zhao ABSTRACT. This study proposes a generalized ENO (GENO) nonlinear reconstruction method for compressible flow simulations. This method generalizes the adaptive concept of ENO schemes. By constructing a smooth path function that directly links high-order linear reconstruction with low-order non-oscillatory reconstruction, GENO ensures non-oscillatory behavior at discontinuities while maximally preserving the high accuracy of linear schemes in smooth regions. The direct adaptive approach of GENO simplifies the construction of nonlinear schemes, making it particularly suitable for very high-order schemes on unstructured meshes. Comparative analysis with WENO and its optimized variants demonstrates that GENO achieves an ideal transition from linear to nonlinear reconstruction. Benchmark cases confirm the superiority of GENO in terms of accuracy and shock-capturing capability. Furthermore, this study presents the application of a high-accuracy compact gas-kinetic scheme (GKS) based on GENO reconstruction in turbulence simulations. This highlights the advantages and distinct features of the space-time coupled solver of the GKS in constructing high-accuracy schemes and performing refined turbulence simulations. This study offers fresh insights into the theory and application of ENO-type schemes and the construction of high-accuracy compact schemes, providing a novel, robust, and practical methodology for high-accuracy flow simulation. |
| 15:40 | A Three-Dimensional Two-Temperature Gas-Kinetic Scheme with Generalized Kinetic Boundary Condition for Hypersonic SBLI Simulations PRESENTER: Xingjian Gao ABSTRACT. This paper presents a three-dimensional two-temperature Gas-Kinetic Scheme (GKS) on unstructured meshes designed for the accurate simulation of hypersonic flows involving strong Shock-Wave/Boundary-Layer Interactions (SBLI) and thermal non-equilibrium. Traditional computational methods often struggle to predict aerothermal loads in such regimes due to the limitations of linear constitutive relations and simplified gas-surface interaction models. To address this, we introduce a Generalized Kinetic Boundary Condition (GKBC) within the GKS framework. Unlike standard Maxwell boundary conditions, the GKBC explicitly decouples the accommodation coefficients for momentum, translational-rotational energy, and vibrational energy. This formulation allows for a more physical description of the slow relaxation of vibrational energy at solid surfaces, which is critical for accurate heat flux prediction. The numerical scheme employs a modified BGK model with a two-stage relaxation process to account for thermal non-equilibrium, integrated using an implicit LU-SGS method for efficiency. The solver is rigorously validated against canonical CUBRC LENS experiments, including sharp double-cone and hollow cylinder-flare configurations. Results demonstrate that the proposed GKBC significantly improves the prediction of surface heat flux compared to standard no-slip or Maxwell conditions by correctly modeling the distinct energy accommodation processes. Furthermore, parametric studies regarding Reynolds number variations confirm the solver's capability to capture complex flow topology changes, such as the evolution of separation bubbles and secondary vortices. The study establishes the 3D two-temperature GKS with GKBC as a robust tool for analyzing complex thermal non-equilibrium flows. |
| 14:00 | Scalable Variational Quantum Linear Solvers for CFD via an Efficient Laplacian Block Encoding and CMA-ES Optimization PRESENTER: Viraj Dsouza ABSTRACT. Solving the pressure-Poisson equation efficiently is a central computational bottleneck in incompressible Computational Fluid Dynamics (CFD), and quantum linear solvers have been identified as a potential long-term avenue for addressing this challenge. Building on our prior work (D'Souza et al., AIAA Aviation 2025), we present two concrete improvements to the Variational Quantum Linear Solver (VQLS) for structured CFD linear systems. First, we integrate a new unified block-encoding framework for discrete Laplacian operators (Boutot and Dsouza, arXiv:2603.12405), a companion contribution by the authors, which reduces two-qubit gate counts by up to 3.2x and circuit depth by up to 2.5x relative to our prior implementation, while also improving post-selection success probability. This enables near-exact recovery of the classical solution up to system size N = 128, compared to N = 32 in our prior work. Second, we show that the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) outperforms COBYLA as a VQLS optimiser at scale: both converge for N ≤ 64, but CMA-ES continues to converge at N = 128 where COBYLA stalls, and at N = 256 CMA-ES reduces the cost function to approximately 0.59 while COBYLA makes no progress beyond 0.90. The methodology is validated on two CFD-relevant benchmarks: a 2D Poisson problem with sinusoidal forcing, representative of incompressible pressure-Poisson subproblems, and a steady-state heat conduction problem with a localised Gaussian source term. All results are obtained on the Qiskit Aer statevector simulator, and together establish a clearer pathway toward practical quantum-assisted CFD solvers. |
| 14:25 | Rethinking the Design of Next-Generation CFD Solvers PRESENTER: Christophe Coreixas ABSTRACT. We present a GPU-oriented framework for high-fidelity, scale-resolving aerodynamic simulations using Cartesian Octree grids. By combining fully explicit, local operations with memory-efficient data structures, the framework enables production-level LES simulations on a single GPU node. Validation on the LAGOON 1 landing gear benchmark shows accurate prediction of the main wake structures and velocity fields at high resolutions. Performance assessments indicate that large-scale, scale-resolving simulations can be completed within a few to tens of hours on a single GPU, demonstrating the feasibility of high-fidelity simulations on affordable hardware. |
| 14:50 | Performance Analysis of Fortran, C, and Regent based GPU Accelerated Meshfree Solvers for 3D Compressible flows PRESENTER: Mayuri Verma ABSTRACT. In recent years, GPUs have emerged as a competitive alternative to CPUs in high-performance computing, offering superior throughput, cost effectiveness, and energy efficiency. Owing to their advantages in single-instruction, multiple-data (SIMD) operations, several research groups have developed GPU-accelerated codes for CFD applications using Fortran, C, and Python, primarily leveraging frameworks such as CUDA. In these programming models, the code developers typically need to implement corresponding CUDA kernels for each function or subroutine in the serial code and manage memory operations explicitly. Furthermore, for simulations on multi-node and multi-GPU systems, explicit handling of data communication and synchronisation is required, often using libraries such as MPI or NCCL. Developing and maintaining such GPU codes is tedious and requires significant human effort. In addition, CUDA based Fortran or C codes are limited to NVIDIA GPUs and cannot be executed on AMD GPUs. Supporting AMD hardware requires rewriting the code using the ROCm framework. Code developers would greatly benefit if a programming language supports implicit parallelism and also enables portability across hardware architectures. The programming language Regent precisely addresses these challenges by enabling the development of CPU or GPU parallel codes with minimal human effort. Regent is a high level, task based parallel programming language built on the Legion framework. Regent programs are composed of tasks, which are analogous to subroutines in Fortran or functions in C. Each task has specific privileges to operate on sets of data. Regent can infer data dependencies between tasks. 
Using this information, the Legion runtime implicitly schedules tasks and handles all memory operations, including data communication and synchronisation while preserving the sequential semantics of the program. Furthermore, the Regent compiler transforms the tasks into equivalent kernels that can execute on any GPU architecture. To the best of our knowledge, a rigorous investigation and comparative assessment of the performance of GPU codes for three-dimensional compressible flows, developed using both traditional programming languages and Regent, has not yet been reported. In this work, an attempt has been made to present a comprehensive performance analysis of GPU codes for three-dimensional inviscid flows written in Fortran, C, and Regent. The underlying CFD solver is based on the meshfree least squares kinetic upwind method. The GPU codes based on Fortran and C are developed using the CUDA parallel computing platform. To evaluate the computational efficiency of the GPU solvers, benchmark simulations are performed on a node equipped with an NVIDIA H100 GPU card. Numerical results show that CUDA Fortran exhibits superior performance, followed by CUDA C and Regent codes. This can be attributed to better SM utilisation, and more efficient PTX code generation in Fortran over other language implementations. Although the Regent code is slower, its performance gap is well compensated by ease in code development and its portability across GPU and CPU architectures. |
| 15:15 | GPU-Accelerated Wall-Modeled Large Eddy Simulations for Transonic Shock-Buffet with GALÆXI PRESENTER: Yannik Feldner ABSTRACT. The development of scale-resolving, high-fidelity CFD frameworks for unsteady, complex flow phenomena remains a fundamental challenge for both applications in aerospace engineering and the broader field of fluid mechanics. In their CFD Vision 2030, the National Aeronautics and Space Administration (NASA) identified several challenges and requirements for realizing the ambitious goals of wall resolved LES simulations of full, powered aircraft configurations across the whole flight envelope, including the development of efficient numerical schemes and harnessing the capabilities of accelerator-based high performance computing (HPC) systems [1]. A major focus of the Vision 2030 roadmap is the simulation of unsteady, separated flows around complex geometries for high Reynolds numbers, including shock-boundary layer interactions [2, 3]. Achieving this milestone is especially relevant for capturing the 2D and 3D transonic buffet phenomenon [4] on airfoils and swept wings with increasing spanwise extent. The buffet phenomenon occurs for certain combinations of the Mach number and angle of attack (AoA) and is characterized by a self-sustained, large-scale shock oscillation on the suction side of the airfoil [5]. The result is a shock-induced boundary layer separation and strong variations of the lift coefficient due to the low-frequency oscillations that limit the flight envelope. The majority of publications investigating the unsteady buffet phenomenon rely on solving the Unsteady Reynolds-Averaged Navier-Stokes equations (URANS) in order to reduce the computational demand. 
While the application of URANS is often justified due to the slower time scales of the buffet phenomenon compared to the turbulent boundary layer scales, these approaches often fall short in accurately predicting the unsteady nature of the shock turbulent boundary layer interaction (STBLI). The URANS approach exhibits strong sensitivity to the applied turbulence model and is unable to capture the broadband turbulent spectrum [6, 7]. Although wall-resolved large-eddy simulations represent a methodology with the potential of enhancing the numerical accuracy, they remain prohibitive for realistic, high Reynolds numbers and large spanwise extents [8]. Consequently, wall-modeled LES (WMLES) can be introduced as a trade-off between numerical accuracy and computational cost, motivating the development and optimization of efficient wall models for the emerging accelerator-based HPC systems. Therefore, this work presents the implementation of an algebraic, equilibrium wall-stress model based on the publication of Kawai and Larsson [9, 10] into the GPU-accelerated, high-order discontinuous Galerkin spectral element (DGSEM) framework GALÆXI [11]. A transonic buffet phenomenon over a supercritical OAT15A airfoil is investigated using the newly implemented wall-stress model. The increased computational power offered by GPUs allows experiments to examine geometries with a spanwise extent large enough to capture meaningful three-dimensional effects. With this application case, the computational performance and accuracy of the WMLES implementation in GALÆXI are examined. [1] Jeffrey P Slotnick, Abdollah Khodadoust, Juan Alonso, David Darmofal, William Gropp, Elizabeth Lurie, and Dimitri J Mavriplis. CFD vision 2030 study: a path to revolutionary computational aerosciences. Technical report, 2014. [2] A Cary, J Chawner, E Duque, W Gropp, B Kleb, R Kolonay, E Nielsen, and B Smith. The CFD vision 2030 roadmap: 2020 status progress and challenges. AIAA Paper, 2726:2021, 2021. 
[3] Andrew W Cary, John Chawner, Earl P Duque, William Gropp, William L Kleb, Raymond M Kolonay, Eric Nielsen, and Brian Smith. CFD vision 2030 road map: Progress and perspectives. In AIAA aviation 2021 forum, page 2726, 2021. [4] Will Pazner, Michael Franco, and Per-Olof Persson. High-order wall-resolved large eddy simulation of transonic buffet on the OAT15A airfoil. In AIAA Scitech 2019 Forum, page 1152, 2019. [5] Laurent Jacquin, Pascal Molton, Sebastien Deck, Bernard Maury, and Didier Soulevant. Experimental study of shock oscillation over a transonic supercritical profile. AIAA journal, 47(9):1985–1994,2009. [6] Sebastian Illi, Thorsten Lutz, and Ewald Krämer. On the capability of unsteady RANS to predict transonic buffet. Third Symposium Simulation of Wing and Nacelle Stall, pages 21–22, 2012. [7] Nicholas F. Giannelis, Gareth A. Vio, and Oleg Levinski. A review of recent developments in the understanding of transonic shock buffet. Progress in Aerospace Sciences, 92:39–84, 2017. [8] Joshua Holgate, Alex Skillen, Timothy Craft, and Alistair Revell. A review of embedded large eddy simulation for internal flows. Archives of Computational Methods in Engineering, 26(4):865–882, 2019. [9] Johan Larsson, Soshi Kawai, Julien Bodart, and Ivan Bermejo-Moreno. Large eddy simulation with modeled wall-stress: Recent progress and future directions. Mechanical Engineering Reviews, 3(1):15–00418–15–00418, 2016. [10] Soshi Kawai and Johan Larsson. Wall-modeling in large eddy simulation: Length scales, grid resolution, and accuracy. Physics of fluids, 24(1), 2012. [11] Marius Kurz, Daniel Kempf, Marcel P. Blind, Patrick Kopper, Philipp Offenhäuser, Anna Schwarz, Spencer Starr, Jens Keim, and Andrea Beck. GALÆXI: Solving complex compressible flows with high-order discontinuous Galerkin methods on accelerator-based systems. Computer Physics Communications, 306:109388, 2025. |
| 15:40 | CABA: A Large-Scale Parallel Solver for Automatic Adaptive Cartesian Grid Generation and Flow Simulation PRESENTER: Shuo Zhang ABSTRACT. In adaptive Cartesian grid generation for complex 3D geometries, geometric query overhead and parallel load imbalance are major bottlenecks. This paper proposes a mesh generation method designed for large-scale parallel environments. First, a k-d tree is employed to filter candidates during geometric queries on triangular patches, significantly reducing the cost of distance calculations, intersection tests, and inside-outside checks. Second, a dynamic weighted parallel repartitioning strategy is developed. By using the computational workload of geometric queries rather than cell count as the load metric, this strategy effectively mitigates the load imbalance caused by local refinement. Results demonstrate that an adaptive mesh with over 1.3 billion cells can be generated within tens of seconds using 1,024 cores. Compared to the unoptimized method, generation efficiency is improved by up to 60%, with a parallel efficiency maintained above 70%. |