Session Chair - Pete Peterson
Session Chair - Maria Patrou
Session Chairs - Brett Eiffert and Reece Boston
A Digital Twin Framework for Liquid-Cooled Supercomputers ABSTRACT. We have developed an open-source framework, called “ExaDigiT”, for building comprehensive digital twins of liquid-cooled supercomputers. It integrates three main modules: (1) a transient thermo-fluidic cooling model, (2) a resource allocator and power simulator, and (3) an augmented reality model of the supercomputer and central energy plant. The framework enables the study of “what-if” scenarios, system optimizations, and virtual prototyping of future systems. Using Frontier as a case study, we demonstrate the framework's capabilities by replaying six months of system telemetry for systematic verification, validation, and functional testing. ExaDigiT is able to elucidate complex transient dynamics of the cooling system, run synthetic or real workloads, and predict energy losses due to rectification and voltage conversion. We envision the digital twin will be a key enabler for advancing sustainable and energy-efficient supercomputing.
A Domain-Specific Compiler for Building GPU-Capable Physics Applications ABSTRACT. Exploiting the Graphical Processing Unit (GPU) power of modern supercomputers is challenging, especially for physicists. Issues with data locality, latency, parallelism, branching, and more make GPU programming significantly different from traditional code. Compounding these issues is the proprietary nature of GPUs. Applications developed for one GPU vendor do not run on the GPUs of competitors. Unless codes were designed with proper abstraction layers, significant refactoring is needed every time a new GPU architecture comes out. This continuously accumulating technical debt hinders scientific advancement. Over the years, attempts have been made to build performance-portable solutions to the heterogeneous GPU programming problem. OpenCL was one of the first attempts at developing a cross-platform solution. However, a lack of vendor support and the subsequent poor performance led to limited uptake. OpenACC uses source annotation to lower the barrier to entry. Again, the lack of vendor support and differences in programming considerations compared to Central Processing Units (CPUs) limit its usefulness. The emergence of machine learning frameworks like TensorFlow and PyTorch provides a modern solution to the heterogeneous programming problem. These frameworks use a graph computation approach which abstracts tensor operations from the underlying compute. However, since they are geared towards machine learning, they can be challenging to use. Their black-box nature can result in excessive memory usage and poor compute utilization. Support for GPU platforms other than CUDA is often lacking. Additionally, as Python frameworks, they are not easy to embed into non-Python codes. To overcome these problems, we have developed a domain-specific compilation framework written in C++. This framework is designed to abstract the physics equations from the compute backends. Computation is deferred by building a graph data structure representing the physics equations. In graph form, algebraic simplification can be applied to reduce complexity. Additionally, analytic derivatives are obtained by transforming the graph operations through the chain rule. The simplified graph can then be just-in-time (JIT) compiled to a GPU or CPU backend. Focusing only on the operations needed for physics problems results in a much simpler framework. This simplicity allows us to customize the graph to meet the physics needs. New graph operations are added by defining methods to evaluate, reduce, differentiate, and compile them. Using this framework, we developed a truly cross-platform, GPU-capable RF ray-tracing code. This code exploits the properties of the graph framework, simplifying the problem of encoding the RF physics in different geometries. We demonstrate the ability to solve a power deposition problem in a real tokamak equilibrium on parallel CPUs and on CUDA, HIP, and Apple GPUs at a scale beyond what is possible using legacy codes.
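To make the deferred-graph idea concrete, here is a minimal, self-contained Python sketch of the pattern described above (illustrative only; the framework in the abstract is written in C++, and none of the class or method names here are taken from it): expressions are recorded as graph nodes, algebraically simplified, and differentiated via the chain rule before evaluation.

```python
# Illustrative sketch of a deferred expression graph (hypothetical; not the C++
# framework from the abstract): build, simplify, differentiate, then evaluate.

class Node:
    def __add__(self, other): return Add(self, other)
    def __mul__(self, other): return Mul(self, other)

class Const(Node):
    def __init__(self, value): self.value = value
    def eval(self, env): return self.value
    def diff(self, var): return Const(0.0)

class Var(Node):
    def __init__(self, name): self.name = name
    def eval(self, env): return env[self.name]
    def diff(self, var): return Const(1.0 if var == self.name else 0.0)

class Add(Node):
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, env): return self.a.eval(env) + self.b.eval(env)
    def diff(self, var): return Add(self.a.diff(var), self.b.diff(var))

class Mul(Node):
    def __init__(self, a, b): self.a, self.b = a, b
    def eval(self, env): return self.a.eval(env) * self.b.eval(env)
    def diff(self, var):  # product rule
        return Add(Mul(self.a.diff(var), self.b), Mul(self.a, self.b.diff(var)))

def simplify(node):
    """Algebraic reduction: fold constants, drop multiply-by-zero and add-zero terms."""
    if isinstance(node, (Const, Var)):
        return node
    a, b = simplify(node.a), simplify(node.b)
    if isinstance(node, Mul):
        if (isinstance(a, Const) and a.value == 0) or (isinstance(b, Const) and b.value == 0):
            return Const(0.0)
        if isinstance(a, Const) and isinstance(b, Const):
            return Const(a.value * b.value)
        return Mul(a, b)
    # node is an Add
    if isinstance(a, Const) and a.value == 0:
        return b
    if isinstance(b, Const) and b.value == 0:
        return a
    if isinstance(a, Const) and isinstance(b, Const):
        return Const(a.value + b.value)
    return Add(a, b)

# f(x, y) = x*y + x; df/dx simplifies to y + 1
x, y = Var("x"), Var("y")
f = x * y + x
dfdx = simplify(f.diff("x"))
print(f.eval({"x": 2.0, "y": 3.0}))     # 8.0
print(dfdx.eval({"x": 2.0, "y": 3.0}))  # 4.0
```

A JIT-compiling backend would walk the same simplified graph and emit GPU or CPU kernels instead of evaluating it directly.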
Scalable, energy-efficient training of graph neural networks for accurate and stable predictions of atomistic properties ABSTRACT. We present our work on developing and training scalable graph neural networks using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural network computations in both training scale and model adaptability. It abstracts over message passing algorithms, allowing both reproduction of and comparison across algorithmic innovations that define nearest-neighbor convolution in graph neural networks. This work discusses a series of optimizations that have allowed scaling up the HydraGNN training to tens of thousands of GPUs on datasets that consist of hundreds of millions of graphs. HydraGNN uses multi-task learning (MTL) to simultaneously learn graph-level and node-level properties of atomistic structures, such as the total energy and atomic forces. Using over 150 million atomistic structures for training, we illustrate the performance of our approach along with the lessons learned on two state-of-the-art United States Department of Energy (US-DOE) supercomputers, namely the Perlmutter petascale system at NERSC and the Frontier exascale system at ORNL. The HydraGNN architecture achieves near-linear strong scaling performance using more than 2,000 GPUs on Perlmutter and 16,000 GPUs on Frontier. Hyperparameter optimization (HPO) was performed using more than 8,000 nodes on Frontier to select HydraGNN architectures with high accuracy. Early stopping was applied to each HydraGNN architecture for energy awareness when performing such an extreme-scale task. The training of an ensemble of the highest-ranked HydraGNN architectures was continued until convergence to establish uncertainty quantification (UQ) capabilities with ensemble learning. Our contribution opens the door for rapidly developing, training, and deploying HydraGNN models using large-scale computational resources to enable AI-accelerated materials discovery and design.
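The scaling described above relies on data-parallel training across many GPUs. Below is a generic PyTorch DistributedDataParallel skeleton of the kind such training builds on (a sketch under stated assumptions, not HydraGNN's actual code; HydraGNN additionally combines per-head losses, e.g., energy plus forces, for multi-task learning).

```python
# Generic PyTorch DistributedDataParallel skeleton (illustrative only; HydraGNN's
# real training loop, models, and graph datasets differ).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train(model_fn, dataset, epochs=1):
    # One process per GPU; rank, world size, and rendezvous env vars
    # (MASTER_ADDR, etc.) are typically set by the launcher (srun/torchrun).
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    model = DDP(model_fn().cuda(local_rank), device_ids=[local_rank])
    sampler = torch.utils.data.distributed.DistributedSampler(dataset)
    loader = torch.utils.data.DataLoader(dataset, batch_size=32, sampler=sampler)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(epochs):
        sampler.set_epoch(epoch)  # reshuffle the per-rank shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()   # gradients are all-reduced across ranks here
            opt.step()
    dist.destroy_process_group()
```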
JACC: Leveraging HPC Meta-Programming and Performance Portability with the Just-In-Time and LLVM-Based Julia Language ABSTRACT. We present JACC, a performance-portable, metaprogramming model for the just-in-time and LLVM-based Julia language. JACC provides a unified and lightweight front end across different back ends available in Julia to enable the same Julia code to run efficiently on current HPC CPU and GPU targets. This is the first time that metaprogramming and performance-portability capabilities have been brought to the just-in-time, interactive Julia ecosystem, elevating programming productivity for the implementation of Julia scientific and HPC codes. We evaluated the performance of JACC for common HPC kernels (e.g., AXPY and DOT) as well as for some of the most computationally demanding kernels used in applications, such as MiniFE, a proxy application for unstructured implicit finite element codes, HPCCG, a supercomputing benchmark test for sparse domains, and HARVEY, a blood flow simulator to assist in the diagnosis and treatment of patients suffering from vascular diseases. We carried out the performance analysis on some of the most advanced supercomputers today: Aurora, Frontier, and Perlmutter, which provide an excellent platform to demonstrate the advantages of JACC. Overall, we show that JACC has negligible overhead versus Julia's vendor-specific solutions, reporting GPU speedups over the CPU implementations at no extra cost to programmability.
Implementation of the new Bio-SANS detector in drtsans, the Data Reduction Toolkit for SANS at ORNL ABSTRACT. The implementation of the new BIOSANS detector in the drtsans data reduction toolkit for Small Angle Neutron Scattering (SANS) at Oak Ridge National Laboratory (ORNL) represents a significant advancement in Q-resolution for data collected at the BIOSANS beamline. This "midrange" detector is designed to generate intensity profiles that overlap with those of the "main" and "wing" detectors. The Data Reduction Toolkit for SANS (drtsans) has been extended to include calibration and reduction for the "midrange" detector in conjunction with the "main" and "wing" detectors. Bar-scan and tube-width calibrations determine the effective position, dimension, and width of each detector pixel, ensuring precise measurements by correcting for spatial distortions and occlusions. In the data reduction workflow, the toolkit converts Time-of-Flight (TOF) data collected at the "midrange" detector to momentum transfer (Q) space, applying corrections for transmission, sample thickness, and background noise. These advancements in the BIOSANS detector and the drtsans toolkit significantly enhance the capabilities of SANS experiments, providing researchers with more accurate reduced data.
Moving from Desktop to Web Application Architecture - Design Considerations using GARNET ABSTRACT. The Single Crystal Graphical Advanced Reduction Neutron Event Toolkit (GARNET) project will enable users to select single crystal diffraction data and transform them into a meaningful form. The main goal is to combine several tools from many instruments into a user-friendly environment for data reduction. Working with a prototype for data manipulation, we are exploring the GARNET application architecture as it moves from a desktop to a web application design. We compare the two designs with regard to the front end and back end, presenting considerations, challenges, similarities, and differences within the GARNET ecosystem. Finally, we show how visualization changes when moving from a desktop to a web application design for the same screen, following common visualization principles.
ACTIVE: the Automated Control Testbed for Integration Verification and Emulation ABSTRACT. ACTIVE is a software framework to aid in the definition, testing, and deployment of building operations software. ACTIVE integrates with Python programs to allow easy swapping of the communications code and runtime environment used by the building control logic, including creating emulated services for the code to run against. This allows the code to be easily deployed across multiple scenarios, such as a local development environment running against local HTTP servers and a production environment deploying the code as a VOLTTRON agent using real devices. Test scripts can be included to define scenarios to test against, such as ensuring resiliency against a service temporarily losing its connection.
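A minimal sketch of the backend-swapping idea (hypothetical interfaces; not ACTIVE's actual API): the control logic depends only on an abstract communications layer, so the same logic can run against emulated services locally or real devices in production.

```python
# Illustrative strategy-pattern sketch (hypothetical; not ACTIVE's API): the
# building-control logic never changes, only the comms backend it is given.
from abc import ABC, abstractmethod

class Comms(ABC):
    @abstractmethod
    def read_point(self, device, point): ...
    @abstractmethod
    def write_point(self, device, point, value): ...

class EmulatedComms(Comms):
    """In-memory stand-in for devices, useful for local development and tests."""
    def __init__(self): self.points = {}
    def read_point(self, device, point): return self.points.get((device, point), 0.0)
    def write_point(self, device, point, value): self.points[(device, point)] = value

class HttpComms(Comms):
    """Talks to a local HTTP server exposing device points (deployment-specific)."""
    def __init__(self, base_url): self.base_url = base_url
    def read_point(self, device, point):
        import requests
        return requests.get(f"{self.base_url}/{device}/{point}").json()["value"]
    def write_point(self, device, point, value):
        import requests
        requests.post(f"{self.base_url}/{device}/{point}", json={"value": value})

def control_logic(comms: Comms):
    # Example control rule: open the damper when the zone is too warm.
    temp = comms.read_point("ahu1", "zone_temp")
    comms.write_point("ahu1", "damper_cmd", 1.0 if temp > 24.0 else 0.0)

control_logic(EmulatedComms())  # same logic, emulated services
```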
Green Energy Solutions: Fusion and Solar ABSTRACT. Green energy is a top research priority, and its importance has only grown over time due to environmental, geopolitical, and practical reasons. Two projects related to creating large-scale research tools for these areas of interest will be explored: GITR and EMT_AGILE. The GITR code models the complex phenomena inside prototype fusion reactors, from atomic-scale phenomena on tiny timescales to macroscopic outcomes for the whole device. The EMT_AGILE code models large-scale power grid systems containing many elements, with an emphasis on performance to enable simulations as large as modern hardware can handle.
INTERSECT-SDK: Integrating Diverse Microservices Into a Single Ecosystem ABSTRACT. Developing a framework for microservice integration into the wider Interconnected Scientific Ecosystem presents many challenges. The framework should handle INTERSECT integration almost automatically, yet be as unopinionated as possible regarding the rest of the application; it should allow for straightforward configuration of interactions with the INTERSECT control plane and data plane; and it should allow for extensible communication patterns, yet require concrete standardization. We present a library that allows users to generate an entire schema for their microservice via a straightforward and declarative API, providing the INTERSECT ecosystem a consistent mechanism for creating and orchestrating scientific campaigns between numerous microservices, while allowing microservice developers to focus on domain-oriented input and output definitions.
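As a rough illustration of a declarative, schema-generating API of this kind (hypothetical names; not the actual INTERSECT-SDK interface), typed request/reply definitions plus a registration decorator are enough to derive a message schema for a microservice:

```python
# Hypothetical sketch of declarative capability definitions (not INTERSECT-SDK's
# API): the developer declares typed inputs/outputs, and a schema is generated.
from dataclasses import dataclass, fields

@dataclass
class TemperatureRequest:
    sensor_id: str
    units: str = "C"

@dataclass
class TemperatureReply:
    sensor_id: str
    value: float

REGISTRY = {}

def capability(name):
    """Register a handler; its type annotations drive schema generation."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@capability("get_temperature")
def get_temperature(req: TemperatureRequest) -> TemperatureReply:
    return TemperatureReply(sensor_id=req.sensor_id, value=21.5)

def generate_schema():
    """Derive a message schema from the registered, annotated handlers.
    (A real implementation would inspect the signature rather than assume
    a parameter literally named 'req'.)"""
    schema = {}
    for name, fn in REGISTRY.items():
        req_type = fn.__annotations__["req"]
        rep_type = fn.__annotations__["return"]
        schema[name] = {
            "request": {f.name: f.type.__name__ for f in fields(req_type)},
            "reply": {f.name: f.type.__name__ for f in fields(rep_type)},
        }
    return schema

print(generate_schema())
```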
SNAPRed: A Novel Approach to Data Reduction for the Highly Re-configurable SNAP Diffractometer ABSTRACT. This submission updates the progress reported in last year's submission. The SNAP diffractometer at the Spallation Neutron Source (SNS) stands as a hallmark of versatility, designed to analyze a wide range of material forms. Enhanced with cutting-edge detectors, optics, and expansive angular coverage, SNAP offers unparalleled precision in neutron diffraction analysis. It boasts a unique assortment of pressure devices, capable of achieving pressures up to 100 GPa, making it indispensable for studies in extreme conditions like planetary ices and hydrogen bonding dynamics. However, SNAP's intricate features, such as movable detectors and wide angular coverage, introduce challenges in data reduction and interpretation. Recognizing these complexities, the SNAPRed project was initiated. SNAPRed is a desktop application designed for lifecycle management of SNAP data. It incorporates advanced Mantid algorithms, provides a streamlined calibration process, and effectively manages the instrument's re-configurability. Notably, SNAPRed promises consistent data management and introduces standardization in reduction and calibration practices, ensuring reliable and accurate results. Future enhancements for SNAPRed will include improved diagnostics, data visualization, enhanced user experience, and advanced data management features.
Developing Interactive Neutron Science Applications using Galaxy ABSTRACT. Galaxy is being used at Oak Ridge National Laboratory (ORNL) as a central component of the Neutron Data Interpretation Platform project to enable users to create, execute, share, and reuse scientific tools and workflows for reproducible neutron scattering experimental data analysis. Last year, our focus was on ensuring Galaxy runs smoothly across multiple computational clusters, including Frontier, our leadership computing system. Now that we've successfully completed this infrastructural milestone, we're shifting our attention to improving the user experience. To accomplish this, we've implemented various enhancements to Galaxy, such as live job monitoring and the ability to stop running jobs if necessary. Additionally, we've created about 60 neutron science tools and integrated interactive tools like noVNC and ThinLinc desktops, allowing users to access any Linux (and Windows) graphical applications seamlessly through Galaxy. Furthermore, we're developing interactive neutron science applications using an open-source web framework, Trame. These applications integrate with Galaxy, providing users with advanced visualization capabilities for complex 3D datasets utilizing the VTK library and GPUs. In our poster, we demonstrate how we've integrated the Trame framework with Galaxy to create powerful scientific tools. We believe these developments will significantly benefit researchers working in neutron science and other scientific domains.
Live Simulation Feedback for Additive Manufacturing using INTERSECT ABSTRACT. The Interconnected Science Ecosystem – Software Development Kit (INTERSECT SDK) project is an open software framework for autonomous laboratories. Using a microservices architecture, it enables connecting an active additive manufacturing (AM) machine with in-situ monitoring accessories to computational hardware for complex simulation feedback mid-print. Low-level control mechanisms for the AM machine are integrated with INTERSECT via the open-source Robot Operating System (ROS) to dynamically adjust print parameters. The Adamantine heat transfer modeling software is managed by an INTERSECT service on computational hardware that can support rapid modeling of a print. Because each is integrated with INTERSECT, the two services can interact to optimize a print and support future developments in printing and simulation using INTERSECT's extensible SDK.
Smart Spectral Matching ABSTRACT. The Smart Spectral Matching (SSM) library leverages machine learning to help identify unknown spectra. It uses the Compendium of Uranium Raman and Infrared Experimental Spectra (CURIES) database to allow the creation of machine learning models used in the characterization of uranium mineral spectra.
Rapid Radiological Calculations for New Nuclear Facility Design ABSTRACT. In new nuclear facility design, radiological calculations are complicated by inherent design iteration. The Radioisotope Processing Facility (RPF) project has developed a set of Python scripts to enable rapid recalculation and analysis of basic radiological information. The RPF code consists of three subroutines that calculate irradiation, shielding, and consequences. This code has already proven extremely useful for facility planning and hot cell design, and it could have applications for facility planning beyond RPF.
Cabana: a scalable and performance portable particle library ABSTRACT. We present Cabana, a scalable and performance-portable library for building scientific applications, including particle-based (mesh-free) methods from atomic scales (molecular dynamics) to cosmology (N-body), hybrid particle-mesh (e.g., particle-in-cell), and structured grid simulation. Cabana was created through the U.S. Department of Energy Exascale Computing Project to enable particle simulations on exascale supercomputers, as well as local workstations. Cabana uses a Kokkos+MPI strategy to separate the concerns of the application physics from the threaded parallelism and vendor backends, as well as from domain decomposition and distributed parallelism. Cabana provides data structure, parallelism, and algorithmic extensions to Kokkos for both particles and structured grids, as well as MPI communication for both. Examples of current Cabana-based development and use include fracture mechanics, materials manufacturing, and plasma physics.
HKL diffractometer computations as an EPICS IOC using PyDevice ABSTRACT. We extend and integrate a diffractometer HKL software package[1] into an EPICS[2] IOC with the help of PyDevice[3]. This IOC provides a generalized interface for mapping real-space motor positions to HKL reflections, with the advantage of being built directly in the EPICS control system. This approach binds core C functions to Python, which takes advantage of the efficiency of C and the readability of Python. Extension to inelastic scattering is in progress, which will allow our IOC to be applicable to both neutron and x-ray diffractometers. Multiple diffractometer geometries (4-circle, 6-circle, kappa) will be supported, with a planned extension to triple-axis spectrometers. References: [1] https://repo.or.cz/hkl.git [2] https://epics-controls.org/ [3] https://github.com/klemenv/PyDevice/ Acknowledgments: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/doe-public-access-plan). A.B.'s research was supported in part by an appointment to the Oak Ridge National Laboratory GEM Fellowship Internship Program, sponsored by the U.S. Department of Energy and administered by the Oak Ridge Institute for Science and Education.
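For readers unfamiliar with the underlying computation, the following numpy sketch shows the UB-matrix formalism (Busing-Levy style) that relates HKL indices to the scattering geometry; the lattice constant, wavelength, and orientation here are example values, and this is not code from the hkl library or the IOC.

```python
# Conceptual UB-matrix sketch (illustrative only): B encodes the reciprocal
# lattice, U the crystal orientation, and Q = U @ B @ hkl is the scattering
# vector that the motor angles must bring into the diffraction condition.
import numpy as np

a = 5.431                         # cubic lattice constant in Angstrom (example)
B = (2 * np.pi / a) * np.eye(3)   # reciprocal-lattice metric for a cubic cell

def rotation_z(deg):
    t = np.radians(deg)
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

U = rotation_z(30.0)              # orientation, normally refined from observed reflections
hkl = np.array([1.0, 1.0, 0.0])

Q = U @ B @ hkl                   # scattering vector in the diffractometer frame
d = 2 * np.pi / np.linalg.norm(Q)          # d-spacing for this reflection
wavelength = 1.54                 # Angstrom (example)
two_theta = 2 * np.degrees(np.arcsin(wavelength / (2 * d)))  # Bragg condition
print(f"|Q| = {np.linalg.norm(Q):.3f} 1/A, d = {d:.3f} A, 2theta = {two_theta:.2f} deg")
```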
Fusion Graph Neural Networks (FuGNN): a GNN-based framework for dynamic graph representation learning ABSTRACT. The Fusion Graph Neural Network (FuGNN) is an innovative software library designed to address dynamic graph problems by leveraging advanced graph representation learning techniques. At its core, FuGNN employs GraphSAGE to transform graph data into latent embeddings, which are then processed by transformers to learn their temporal evolution. This modular approach allows for diverse downstream tasks to be handled via specialized decoders. Built on top of PyTorch and DGL, FuGNN integrates Lightning and MLflow to streamline the training and deployment processes. The primary goal of FuGNN is to empower domain scientists, who may not be experts in machine learning, to effectively utilize graph neural networks for solving complex, dynamic graph problems in their respective fields. This capability is critical for advancing research and applications in areas such as social network analysis, biological systems, and infrastructure management, making FuGNN a versatile and powerful tool for scientific and industrial advancements.
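An illustrative PyTorch/DGL sketch of the encode-then-model-time pattern described above (module names, shapes, and the readout/decoder choices are assumptions, not FuGNN's API): GraphSAGE layers embed each graph snapshot, and a transformer encoder models how those embeddings evolve across time steps.

```python
# Illustrative GraphSAGE + transformer pipeline for temporal graph snapshots
# (assumed structure; not FuGNN's actual modules).
import torch
import torch.nn as nn
import dgl
from dgl.nn import SAGEConv

class SnapshotEncoder(nn.Module):
    def __init__(self, in_feats, hidden):
        super().__init__()
        self.conv1 = SAGEConv(in_feats, hidden, aggregator_type="mean")
        self.conv2 = SAGEConv(hidden, hidden, aggregator_type="mean")

    def forward(self, g, x):
        h = torch.relu(self.conv1(g, x))
        h = self.conv2(g, h)
        g.ndata["h"] = h
        return dgl.mean_nodes(g, "h")   # one embedding per graph snapshot

class TemporalModel(nn.Module):
    def __init__(self, in_feats, hidden=64, heads=4, layers=2):
        super().__init__()
        self.encoder = SnapshotEncoder(in_feats, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=layers)
        self.decoder = nn.Linear(hidden, 1)  # one possible task-specific head

    def forward(self, snapshots, features):
        # snapshots: list of DGLGraphs over time; features: matching node features
        embeds = torch.stack(
            [self.encoder(g, x) for g, x in zip(snapshots, features)], dim=1)
        out = self.temporal(embeds)          # (batch, time, hidden)
        return self.decoder(out[:, -1, :])   # predict from the last time step

# Toy usage: three random snapshots of a 10-node graph with 8 features per node.
snaps = [dgl.add_self_loop(dgl.rand_graph(10, 30)) for _ in range(3)]
feats = [torch.randn(10, 8) for _ in range(3)]
model = TemporalModel(in_feats=8)
print(model(snaps, feats).shape)  # torch.Size([1, 1])
```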
Software for insight into metal additive manufacturing processing conditions and microstructure development ABSTRACT. Through the Exascale Computing Project's ExaAM initiative and the Digital Factory framework at the ORNL Manufacturing Demonstration Facility (MDF), a series of software capabilities for simulating metal additive manufacturing processes and the resulting microstructure have been developed. These tools have been validated against benchmark data from the National Institute of Standards and Technology's AM benchmark series and incorporated into the Myna framework for launching coupled process-microstructure simulations to create digital threads of MDF builds. This poster will summarize the process and microstructure simulation capabilities currently in use as part of MDF workflows, including models at various length scales and levels of physics fidelity. Model validation and use cases for each component and a summary of model formulation and assumptions will be given, as well as gaps in the current capabilities and planned software development to fill these gaps.
Scalable GPU sort on Frontier with Chapel ABSTRACT. Sorting is a critical algorithm in the Chapel-based Arkouda exploratory data analytics library. At OSDX 2023, we presented initial performance results for a multi-GPU sort implementation on NVIDIA GPUs. This poster will update the community on our efforts to implement a scalable, distributed GPU sort on Frontier with AMD GPUs. We will show the interaction between sorting algorithms and system architecture with regard to the particular features of Frontier. We will also present initial performance results for sorting a large dataset on multiple nodes of Frontier, and discuss opportunities for further performance improvements.
Automated Testing and Continuous Integration for the KORC Codebase ABSTRACT. Runaway electrons (REs) generated during disruptions in burning fusion plasmas pose a major risk for plasma-facing components in the ITER tokamak reactor. The ORNL-developed Kinetic Orbit Runaway electrons Code (KORC), paired with the NIMROD extended-magnetohydrodynamic (MHD) solver, brings quantitative dynamics simulation capabilities for studying ways to mitigate reactor damage. Efforts are underway to expand KORC to run on Perlmutter using the OpenACC framework. We use the GitHub Actions Matrix Strategy, facilitating a succinct, parameterized approach for automating builds across a wide variety of operating systems and hardware. In the same manner, we leverage Spack's combinatorial versioning capability for managing toolchains and software stacks. CMake allows for feature selection and adjustments to KORC during build time, while CTest automates testing KORC-driven calculations against golden data. Together, these tools comprise a robust paradigm for enhancing the KORC project's software development cycle. Finally, a secure human-in-the-loop CI mechanism is devised, allowing safe CI interoperability with self-hosted ExCL (ORNL Experimental Computing Lab) GPU-accelerated machines.
Accessible Content Organization for Research Needs (ACORN) Schemas ABSTRACT. Communicating research breakthroughs is often done through fact sheets and posters as physical mediums. These provide a high-level overview of projects and groups for sponsors, research peers, government officials, and the public. They can also serve as recruitment tools, marketing materials, and watershed pieces for holistic lab communication. With ACORN, we aim to standardize and automate the research content creation process so we can better represent our projects, capabilities, technology, and, most importantly, our people at ORNL.
Distributed Workflows Automation at OLCF with Zambeze and Flowcept ABSTRACT. Modern science relies on end-to-end workflows that incorporate experimental instruments and utilize edge, cloud, or high-performance computing and storage resources. These components are geographically dispersed across various user facilities and interconnected through high-speed networks. In this poster, we present two tools: Zambeze and Flowcept. Zambeze is a framework for distributed workflows that automatically facilitates cross-facility workflows. Utilizing swarm intelligence principles, Zambeze orchestrates science campaigns by managing distributed autonomous agents. These agents can offer a suite of services, including computing, storage, and data management. We demonstrate the feasibility of Zambeze through a real-world application involving electron microscopy, enhanced with artificial intelligence capabilities. Flowcept is a data analysis framework that enables scientists to view and integrate telemetry and application data from across the orchestrated execution of multiple workflows. When combined, we envision Zambeze and Flowcept becoming a powerful unified solution for both executing advanced cross-facility science workflows at ORNL and making them FAIR.
Taking Open MPI to New Frontiers ABSTRACT. The new exascale systems at the U.S. Department of Energy (DOE) laboratories include a new network interconnect from HPE Cray. Frontier at the Oak Ridge Leadership Computing Facility (OLCF) is the first DOE exascale system that includes the new Slingshot 11 network. This same interconnect is used in Aurora at the Argonne Leadership Computing Facility (ALCF) and Perlmutter at NERSC. As such, support for Slingshot 11 is an important capability to meet the needs of exascale applications. This poster highlights the design and development of supporting infrastructure to enable Open MPI to efficiently support the new Slingshot 11 platforms. The focus of the poster is on enhancements for intra-node and inter-node communication that use the libfabric shared memory (SHM) and Slingshot (CXI) providers. We include initial performance results for Message Passing Interface benchmarks using Slingshot 11 on systems running at OLCF.
Adaptive Sparse Grid Discretization - ASGarD ABSTRACT. The ASGarD project has the goal of building a solver specifically targeting high-dimensional PDEs where the "curse of dimensionality" has previously precluded useful continuum/Eulerian (grid- or mesh-based, as opposed to Monte Carlo sampling) simulation. Our approach is based on a discontinuous Galerkin finite-element solver built atop an adaptive hierarchical sparse grid.
NetZero-ARMADA ABSTRACT. ARMADA (Action-Relevant Modeling And Decision Analysis) is a scalable multi-sector analysis platform for the energy transition.
pyGlobOpt: Adaptive Learning Global Optimization and Reaction Pathways Search ABSTRACT. The search for global minima is a crucial aspect of chemistry, with numerous applications in materials science and catalysis. However, this task is incredibly challenging for larger systems, as it requires navigating a vast potential energy surface with multiple possible outcomes. The pyGlobOpt software is used to minimize the total energy of large systems (~10^2-10^3 atoms) by efficiently navigating the potential energy surface. The algorithm is based on a swarm intelligence search, which can accelerate the structural search by orders of magnitude due to its parallel implementation. Each global minimum search considers nanoparticle size, shape, and surface coverage. Our approach has been successfully used to identify global minima of various clusters. Applications include determining optimal structures for nanoparticles, enabling the design of materials with tailored properties for catalysis; understanding the behavior of complex materials in the condensed phase; and finding the most stable configurations of molecular clusters, shedding light on complex reaction mechanisms.
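For context, a generic particle-swarm search on a toy two-dimensional energy surface looks like the following (illustrative of the swarm-intelligence idea only; pyGlobOpt's actual moves, parallelization, and atomistic energy evaluations are more involved).

```python
# Generic particle-swarm minimization of a toy multi-minimum surface
# (not pyGlobOpt code; the global minimum of this surface is at the origin).
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    # Rastrigin-like surface with many local minima.
    return np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0, axis=-1)

n_particles, dim, iters = 40, 2, 200
w, c1, c2 = 0.7, 1.5, 1.5                  # inertia and attraction weights

pos = rng.uniform(-5.0, 5.0, size=(n_particles, dim))
vel = np.zeros_like(pos)
best_pos = pos.copy()
best_val = energy(pos)
g_best = best_pos[np.argmin(best_val)].copy()

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g_best - pos)
    pos = pos + vel
    val = energy(pos)
    improved = val < best_val              # update per-particle bests
    best_pos[improved], best_val[improved] = pos[improved], val[improved]
    g_best = best_pos[np.argmin(best_val)].copy()

print("global best:", g_best, "E =", best_val.min())
```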
Model America and Future Weather files development for Building Energy Simulation ABSTRACT. The Oak Ridge National Laboratory (ORNL) has pioneered the development of the Automatic Building Energy Modeling (AutoBEM) software suite, a robust and comprehensive tool designed to revolutionize building energy simulations. The AutoBEM suite seamlessly processes multiple data types, extracts building-specific descriptors, generates detailed building energy models, and facilitates large-scale simulations using High Performance Computing (HPC) resources. This work delves into the functionality and application of AutoBEM, emphasizing its capability to implement various EnergyPlus measures to optimize energy efficiency across a wide array of building types in the United States. One of the central components of the work is the detailed presentation of Model America v2.0, an expansive update of the original dataset. This new version extends the coverage to approximately 141 million buildings across the U.S. and its territories, significantly enhancing the granularity and applicability of the data. Model America v1.0 provides free access to energy models for 97.8% of U.S. buildings, making it an invaluable resource for energy researchers, urban planners, and policy makers. This dataset supports a wide range of applications, from regional energy planning to national policy development, by offering a comprehensive baseline for evaluating and forecasting building energy demands under various scenarios. Furthermore, the work describes the development of historical and future Typical Meteorological Year (fTMY) weather files from 1980 to 2100. These files are critical for refining the accuracy of building energy simulations by representing realistic, year-specific climate variations. The integration of fTMY into the AutoBEM suite enhances the predictive accuracy and reliability of the energy models, facilitating better-informed decisions in building design and energy conservation measures. The implications of this work are profound. By providing a detailed and accessible dataset alongside powerful modeling tools, ORNL's initiatives empower stakeholders to make data-driven decisions that promote energy efficiency and sustainability. The advancements in AutoBEM and the Model America dataset support the pursuit of reduced energy consumption and carbon emissions and contribute to the broader goal of creating more resilient and sustainable urban environments.
Exploring Vision Transformers on The Frontier Supercomputer for Remote Sensing and Geoscientific Applications ABSTRACT. The earth sciences research community has an unprecedented opportunity to exploit the vast amount of data available from earth observation (EO) satellites and earth system models (ESM). The ascent and application of artificial intelligence foundation models (FM) can be attributed to the availability of large volumes of curated data, access to extensive computing resources, and the maturity of deep learning techniques. Vision transformer (ViT) architectures have been adapted for image and image-like data, such as EO data and ESM simulation output. Pretraining foundation models is a compute-intensive process, often requiring 10^5 - 10^7 GPU hours for large-scale scientific applications. There is a limited body of knowledge on compute-optimal methods for pretraining, necessitating a trial-and-error process. We have performed a series of experiments using ViT and Masked Autoencoder backbones at different scales to understand optimal and cost-effective ways to improve scientific throughput. We then finetune the trained models using the AI-driven Cloud Classification Atlas (AICCA), an AI-generated dataset of satellite ocean clouds, to validate their scientific applicability for classification tasks. This preliminary benchmark provides an assessment of which architectures and model configurations are favorable in a given scientific context.
Intelligent Runtime System (IRIS) SDK: Portable Abstractions for Extreme Heterogeneity ABSTRACT. IRIS represents a pioneering leap in intelligent runtime systems tailored for extremely heterogeneous computer architectures. This innovative platform dramatically simplifies the complexities associated with programming heterogeneous computers, including memory management, data transfers, and task scheduling across diverse devices. Ultimately, IRIS enhances performance, productivity, and portability, ensuring applications efficiently leverage the full spectrum of computing resources. IRIS stands out as the world's only framework capable of concurrently executing application tasks on a diverse array of devices, including CPUs, NVIDIA/AMD GPUs, FPGAs, and Hexagon DSPs. This unparalleled versatility facilitates hardware-agnostic application development, allowing code to run seamlessly across high-performance computing (HPC) platforms (such as Summit, Frontier, and Andes) and embedded systems (including Qualcomm mobile SoCs with Hexagon DSP), without any modifications or performance engineering. One of IRIS's hallmark features is its intelligent heterogeneous memory handler. This advanced mechanism orchestrates data transfers seamlessly, eliminating the need for developers to engage with cumbersome device-specific transfer APIs. Empirical evidence showcases IRIS's capability to achieve up to a fivefold increase in performance efficiency compared to traditional user-driven approaches, alongside significant reductions in memory transfer operations between devices. The IRIS Software Development Kit (SDK) comes bundled with a comprehensive suite of tools (MatRIS, Hunter, and Dagger), each designed to extend IRIS's functionality further. Empowered by the IRIS runtime, MatRIS is a genuinely heterogeneous and portable linear algebra library that delivers comprehensive support for BLAS, LAPACK, and trigonometric operations alongside an API for tiled algorithms and load distribution across multiple devices. Hunter offers an extensive collection of scheduling algorithms in Python and C++, enabling precise analysis of various scheduling scenarios. The Dagger framework provides a robust tool for validating the scheduler of the IRIS runtime system; Dagger employs synthetic application task graphs to explore scalability across a parameterized range of task counts and device configurations. Multiple teams are building solutions on IRIS. In one example, the IRIS SDK has played a critical role in the development of CHARM-SYCL by Tsukuba University, serving as a backend runtime system that seamlessly integrates with SYCL backend kernels for dynamic task mapping and autonomous data management. In another example, IRISX, an extension of the IRIS runtime, coordinates with the SPIRAL code generation engine to offer an integrated solution for code generation, portability, and heterogeneity. SPIRAL, developed by Carnegie Mellon University, generates optimized code for scientific kernels, which, when combined with IRIS, allows the execution of architecture-agnostic tasks that are specialized for specific devices at runtime.
IRI Science Applications on the Advanced Computing Ecosystem Testbed PRESENTER: David Rogers ABSTRACT. The DOE Integrated Research Infrastructure effort aims to connect computational and experimental facilities in ways that enable new possibilities for science. This poster presents ongoing work in the National Center for Computational Sciences (ORNL NCCS) to explore this application space, along with supporting infrastructure projects collaborating with NCCS. Science applications wanting to preview this integrated architecture are encouraged to join the collaboration on our Advanced Computing Ecosystem testbed cluster as the work develops.
A Survey on Data Software for Small Angle Neutron Scattering and Reflectometry ABSTRACT. The Spallation Neutron Source and the High Flux Isotope Reactor are neutron-producing facilities that help outside users measure nanoscopic substances. Providing user-friendly data reduction and analysis software for neutron science is a necessary part of maintaining an effective user facility. The different scattering data require varying levels of adjustments, which makes the software landscape complex and diverse. I focused on the neutron scattering techniques of small angle neutron scattering and reflectometry. Throughout my internship, we had informal discussions with instrument scientists about the software used and how the instrument scientists interact with the data and users. We also scoured the internet for supporting material on the software, including technical documents, user guides, source locations, and papers. We organized the information and arranged each software package based on the data processing type (data reduction and data analysis), instrument facility (Spallation Neutron Source and High Flux Isotope Reactor), and neutron scattering technique (small angle neutron scattering and reflectometry). We have collected information and resources to understand how users process and analyze their data and what the user experience is when using the various software to achieve that.
Developing Efficient Multivariate Surface Uncertainty Visualization Algorithms Using VTK-m Software ABSTRACT. Uncertainty visualization is an emerging research topic in data visualization because neglecting uncertainty in the visualization can lead to inaccurate assessments. In this work, we study the effect of data uncertainty on surface-based visualization of multivariate data. Although there have been a few advancements in understanding uncertainties in surface-based scientific visualization, three critical challenges remain to be addressed. First, state-of-the-art uncertainty visualization algorithms are limited to the analysis of univariate and bivariate data. Second, these algorithms are computationally intensive and lack support for cross-platform portability. Third, as a consequence of the computational expense, integration into interactive production visualization tools is impractical. In this work, we address these research gaps with a threefold contribution. First, we extend the existing surface-based univariate and bivariate uncertainty visualization framework to multivariate data with more than two variables. Second, we develop a parallel and platform-portable multivariate surface-based uncertainty visualization algorithm using the VTK-m visualization toolkit to accelerate computations. Lastly, we demonstrate the integration of our algorithm with the ParaView software. With VTK-m's shared-memory parallelism and cross-platform compatibility features, we demonstrate the acceleration of multivariate uncertainty visualization using OpenMP and AMD GPUs. We present performance enhancements of our VTK-m algorithms and their integration with ParaView through experiments on large-scale multivariate data with three and four variables.
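A common building block in this kind of analysis is the per-cell probability that a surface (level crossing) exists given uncertain vertex values. The Monte Carlo sketch below illustrates that quantity for independent Gaussian uncertainty (a generic formulation for intuition only; the VTK-m algorithm described above uses its own, parallelized approach).

```python
# Monte Carlo estimate of a per-cell level-crossing probability under
# vertex-wise Gaussian uncertainty (generic illustration, not the VTK-m code).
import numpy as np

rng = np.random.default_rng(1)

def crossing_probability(mean, std, isovalue, n_samples=10000):
    """Probability that an isosurface at `isovalue` passes through a cell whose
    vertex values are independent Gaussians with the given means/stddevs."""
    samples = rng.normal(mean, std, size=(n_samples, len(mean)))
    above = samples > isovalue
    # The surface crosses the cell iff the vertices are neither all above nor all below.
    crosses = ~(above.all(axis=1) | (~above).all(axis=1))
    return crosses.mean()

# A quadrilateral cell (4 vertices) whose mean values straddle the isovalue:
mean = np.array([0.8, 0.9, 1.1, 1.2])
std = np.array([0.05, 0.05, 0.05, 0.05])
print(crossing_probability(mean, std, isovalue=1.0))  # close to 1: a crossing is very likely
```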
Software Developments for IMAGINE-X: Future Neutron Beamline using Dynamic Nuclear Polarization for Protein Crystallography ABSTRACT. Through the DOE Office of Science's Biopreparedness Research Virtual Environment, or BRaVE, Initiative, ORNL has received funding to transform the IMAGINE neutron single crystal diffractometer at the High Flux Isotope Reactor with new dynamic nuclear polarization (DNP) capabilities by 2026. This transformation comes with a planned rename of the beamline to IMAGINE-X. Currently, IMAGINE is purpose-built to study protein crystallography, an established technique for determining the structure and function of proteins. Going forward, DNP will take advantage of the nuclear spin dependence of neutron scattering, polarizing both the neutron beam and the sample. This allows for order-of-magnitude improvements in signal-over-background for hydrogen scattering, which is key to many of the functions of proteins. Current software developments for this ongoing project will be presented, covering virtual data creation for testing, data reduction, deployment, and future data analysis work.
Ongoing Journey to Digital Twin and Experimental Steering for Neutron Diffraction at ORNL ABSTRACT. Routinely, neutron powder diffraction experiments are used to understand the evolution of materials processes. Examples include structure changes in battery materials during charge-discharge cycles and temperature-dependent phase changes in carbon capture materials. The neutron powder diffractometers have the measurement capabilities to track these structural changes within seconds. These types of high-rate experiments add increasing complexity, since more data are measured than can be manually analyzed and then turned into experimental control decisions. Experimental Steering for Powder Diffraction (ESPD) and NeutMATIX are two projects aimed at helping with automation, smart data pipelining, and digital twin development for neutron powder diffraction experiments at ORNL. ESPD is focused on completing autonomous experiment loops with data analysis and secure instrument controls. NeuMATIX is focused on providing virtual digital twins of neutron powder diffractometers to bridge atomistic modeling all the way to instrument-controlled virtual instruments. Recent developments and future work will be described, along with how, together, these projects will create reusable tools required for others to perform similar autonomous experiments.
Efficient and portable tensor contractions using NTCL ABSTRACT. Tensor contractions emerge naturally in many computational problems. These range from a simple scalar product of two vectors or a matrix-matrix multiplication to a multi-index contraction of cluster tensors in coupled cluster theory. Therefore, it is of the utmost importance that these can be computed as efficiently as possible, utilizing all the available computational resources. However, different computers have very different hardware, meaning that the most efficient implementation of a tensor contraction will, by necessity, be different. NTCL provides a hardware-independent, simple-to-use interface for Fortran programs, while the backend is hardware-specific. This hardware decoupling makes physics code easier to write while allowing the same code to utilize the hardware of all DOE machines efficiently.
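To illustrate the range of operations the abstract refers to, the same kinds of contractions can be written with numpy.einsum (NTCL itself exposes an analogous Fortran interface whose backends dispatch to hardware-specific implementations; the tensor shapes below are arbitrary examples).

```python
# Three flavors of tensor contraction, expressed with numpy.einsum for illustration.
import numpy as np

rng = np.random.default_rng(2)
u, v = rng.standard_normal(8), rng.standard_normal(8)
A, B = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
T = rng.standard_normal((4, 4, 8, 8))      # e.g. a two-body amplitude-like tensor
W = rng.standard_normal((8, 8, 4, 4))      # e.g. an integral-like tensor

dot = np.einsum("i,i->", u, v)             # scalar product of two vectors
C = np.einsum("ik,kj->ij", A, B)           # matrix-matrix multiplication
X = np.einsum("abij,cdab->cdij", T, W)     # multi-index contraction over two indices
print(dot.shape, C.shape, X.shape)         # () (8, 8) (8, 8, 8, 8)
```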
Software development for the IMAGINE-X experiment to enable neutron crystallography with dynamic nuclear polarization ABSTRACT. Drug design requires precise information about the active site of an enzyme, where most often a reaction or chemical transformation occurs by charge transfer, bond dissociation, protonation, or deprotonation. Conventional X-ray bio-macromolecular crystallography provides insight into heavy atom positions but does not provide reliable information about hydrogens in the active site due to their low scattering contrast. Neutron macromolecular crystallography (NMC) is the structural biology method of choice that provides important, isotope-dependent information from scattering by hydrogen or deuterium nuclei. Here, we are concerned with the emerging method of dynamic nuclear polarization (DNP)-NMC. In this computational analysis, we introduce a novel method to simulate and analyze DNP neutron diffraction and directly map the hydrogen nuclear density, up to a single global phase factor. Our method and software provide the foundation to design the IMAGINE-X experiment, which allows for the refinement or "decoration" of existing structural models that do not feature hydrogens. Furthermore, this DNP-NMC implementation may allow one to probe the quantum-mechanical nature of the spin-dependent physics in biological matter, based on the alignment between incident neutron and active-site proton spins.
Session Chair - Maria Patrou
Session Chair - Marshall McDonnell