PEARC'22: PRACTICE AND EXPERIENCE IN ADVANCED RESEARCH COMPUTING 22
PROGRAM FOR TUESDAY, JULY 12TH

10:30-12:00 Session 9A: Systems Track: HTCondor/Docker/Kubernetes

Systems Track 1

10:30
Auto-scaling HTCondor Pools using Kubernetes Compute Resources
PRESENTER: Igor Sfiligoi

ABSTRACT. HTCondor has been very successful in managing globally distributed, pleasantly parallel scientific workloads, especially as part of the Open Science Grid. HTCondor's system design makes it ideal for integrating compute resources provisioned from anywhere, but it has very limited native support for autonomously provisioning resources managed by other solutions. This work presents a solution that allows for autonomous, demand-driven provisioning of Kubernetes-managed resources. A high-level overview of the employed architectures is presented, paired with a description of the setups used in both on-prem and Cloud deployments in support of several Open Science Grid communities. The experience suggests that the described solution should be generally suitable for contributing Kubernetes-based resources to existing HTCondor pools.
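
The paper's provisioner is not reproduced in the abstract; purely as an illustration of the demand-driven pattern it describes, the sketch below polls an HTCondor schedd for idle jobs and scales a hypothetical Kubernetes Deployment of glidein worker pods to match. The deployment name, namespace, and one-pod-per-idle-job policy are assumptions for illustration, not the authors' implementation.

    # Hypothetical sketch: scale a Deployment of HTCondor worker ("glidein") pods to demand.
    # Assumes condor_q and a kubeconfig are available; all names and the policy are invented.
    import json
    import subprocess
    from kubernetes import client, config

    DEPLOYMENT = "osg-glidein"   # assumed Deployment of HTCondor worker pods
    NAMESPACE = "osgpool"        # assumed namespace
    MAX_PODS = 50

    def idle_job_count() -> int:
        """Count idle (JobStatus == 1) jobs via condor_q's JSON output."""
        out = subprocess.run(
            ["condor_q", "-allusers", "-constraint", "JobStatus == 1", "-json"],
            capture_output=True, text=True, check=True).stdout
        return len(json.loads(out)) if out.strip() else 0

    def scale_to(replicas: int) -> None:
        """Patch the Deployment's replica count (one worker pod per idle job here)."""
        config.load_kube_config()
        apps = client.AppsV1Api()
        apps.patch_namespaced_deployment_scale(
            name=DEPLOYMENT, namespace=NAMESPACE,
            body={"spec": {"replicas": replicas}})

    if __name__ == "__main__":
        scale_to(min(idle_job_count(), MAX_PODS))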

10:45
Early Experiences with Tight Integration of Kubernetes in an HPC Environment
PRESENTER: Troy Baer

ABSTRACT. The Ohio Supercomputer Center has deployed a Kubernetes cluster with tight integration to a high performance computing (HPC) environment. This deployment leverages existing file systems for data sharing between HPC systems and Kubernetes objects, monitoring, account management, resource management, and accounting systems. This paper describes the motivation and overall design, the novel methods for the implementation, and the applications supported by this new resource. It also presents a short description of future work and some of the questions raised by this design.

11:00
The Anachronism of Whole-GPU Accounting
PRESENTER: Igor Sfiligoi

ABSTRACT. NVIDIA has been making steady progress in increasing the compute performance of its GPUs, resulting in order-of-magnitude compute throughput improvements over the years. With several models of GPUs coexisting in many deployments, the traditional accounting method of treating all GPUs as equal no longer reflects compute output. Moreover, for applications that require significant CPU-based compute to complement the GPU-based compute, it is becoming harder and harder to make full use of the newer GPUs, requiring sharing of those GPUs between multiple applications in order to maximize the achievable science output. This further reduces the value of whole-GPU accounting, especially when the sharing is done at the infrastructure level. We thus argue that GPU accounting for throughput-oriented infrastructures should be expressed in GPU core hours, much like it is normally done for CPUs. While GPU core compute throughput does change between GPU generations, the variability is similar to what we expect to see among CPU cores. To validate our position, we present an extensive set of run time measurements of two IceCube photon propagation workflows on 14 GPU models, using both on-prem and Cloud resources. The measurements also outline the influence of GPU sharing at both the HTCondor and Kubernetes infrastructure levels.
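
As a back-of-the-envelope illustration of the proposed metric (not the authors' accounting code), the snippet below converts whole-GPU wall time into GPU core hours using published CUDA core counts per model; the example jobs are fabricated.

    # Illustrative only: charge jobs in GPU core hours rather than whole-GPU hours.
    # CUDA core counts are public specs; the example jobs below are made up.
    GPU_CORES = {"T4": 2560, "P100": 3584, "V100": 5120, "A100": 6912}

    def gpu_core_hours(model: str, gpu_hours: float, share: float = 1.0) -> float:
        """share = fraction of the GPU allocated when the device is shared."""
        return GPU_CORES[model] * gpu_hours * share

    jobs = [("V100", 10.0, 1.0),   # 10 h on a whole V100
            ("A100", 10.0, 0.5)]   # 10 h on half of a shared A100

    for model, hours, share in jobs:
        print(f"{model}: {gpu_core_hours(model, hours, share):,.0f} GPU core hours")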

11:15
Buzzard: Georgia Tech’s Foray into the Open Science Grid
PRESENTER: Mehmet Belgin

ABSTRACT. Open Science Grid (OSG) is a consortium that enables many scientific breakthroughs by providing researchers with access to shared High Throughput Computing (HTC) compute clusters in support of large-scale collaborative research. To meet the demand on campus, Georgia Institute of Technology (GT)’s Partnership for an Advanced Computing Environment (PACE) team launched a centralized OSG support project, powered by Buzzard, an NSF-funded OSG cluster. We describe Buzzard’s unique multi-tenant architecture, which supports multiple projects on a single CPU/GPU pool, for the benefit of other institutions considering a similar approach to support OSG on their campuses.

11:30
Containerizing Visualization Software: Experiences and Best Practices
PRESENTER: Andrew Solis

ABSTRACT. The standard process for software development has changed dramatically in the past decade. What was once a large effort of installing the same software across different systems has become much more streamlined with the rapid emergence and wide-scale adoption of Docker as the de facto container management ecosystem. This has also had an impact on the HPC and scientific computing community, allowing system maintainers to install and maintain packages with less effort[12]. This can be seen through the adoption of containers on many large-scale systems, including those supported by the Texas Advanced Computing Center (TACC), DOE, XSEDE, and the wider NSF community[23][34][15]. An extra layer of work is necessary when developing containers that require visualization technologies. This includes applications that need a windowing system such as X[9] to render GUIs, and it can be further complicated by the need to expose NVIDIA or other GPU-related capabilities to containers[32]. While the ability to containerize applications is widely available, there is no central resource for creating containers with visualization requirements. Our work aims to consolidate common issues for visualization containers, along with both shared and unique solutions. We detail the various ways we have worked at TACC to make visualization containers more available to researchers using HPC systems, to make development easier, and to show their promise for lab research spaces. Our hope is to share challenges that other researchers may face and to provide possible solutions, so that containers can be more widely adopted.
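
The paper's container recipes are not reproduced here; as a minimal sketch of the two pain points it names (an X windowing system and GPU access), the snippet below assembles a docker run command that forwards the host X socket and exposes NVIDIA GPUs. The image and application names are placeholders, and the flags assume a Linux host with Docker and the NVIDIA container toolkit installed.

    # Rough sketch: launch a GUI/GPU visualization container with X11 forwarded.
    # "viz-tools:latest" and "paraview" are placeholders; requires Docker + NVIDIA container toolkit.
    # You may also need to allow local X connections first (e.g. run: xhost +local:).
    import os
    import subprocess

    image = "viz-tools:latest"   # hypothetical visualization image
    cmd = [
        "docker", "run", "--rm", "-it",
        "--gpus", "all",                              # expose NVIDIA GPUs to the container
        "-e", f"DISPLAY={os.environ.get('DISPLAY', ':0')}",
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix:ro",     # share the host X socket
        image, "paraview",                            # placeholder GUI application
    ]
    subprocess.run(cmd, check=True)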

10:30-12:00 Session 9B: Systems Track: Testbed/Simulation

Systems Track 2

10:30
PROWESS: An Open Testbed for Programmable Wireless Edge Systems
PRESENTER: Jayson Boubin

ABSTRACT. Edge computing is a growing paradigm where compute resources are provisioned between data sources and the cloud to decrease compute latency from data transfer, lower costs, comply with security policies, and more. Edge systems are as varied as their applications, serving internet services, IoT, and emerging technologies. Due to the tight constraints experienced by many edge systems, research computing testbeds have become valuable tools for edge research and application benchmarking. Current testbed infrastructure, however, fails to properly emulate many important edge contexts, leading to inaccurate benchmarking. Institutions with broad interests in edge computing can build testbeds, but prior work suggests that edge testbeds are often application or sensor specific. A general edge testbed should include access to many of the sensors, software, and accelerators on which edge systems rely, while slicing those resources to fit user-defined resource footprints. PROWESS is an edge testbed that answers this challenge. PROWESS provides access across an institution to sensors, compute resources, and software for testing constrained edge applications. PROWESS runs edge workloads as sets of containers with access to sensors and specialized hardware on an expandable cluster of lightweight edge nodes that leverage institutional networks to decrease implementation cost and provide wide access to sensors. We implemented a multi-node PROWESS deployment connected to sensors across Ohio State University's campus. Using three edge-native applications, we demonstrate that PROWESS is simple to configure, has a small resource footprint, scales gracefully, and minimally impacts institutional networks. We also show that PROWESS closely approximates native execution of edge workloads and facilitates experiments that other systems testbeds cannot.

11:00
CHI-in-a-Box: Reducing Operational Costs of Research Testbeds
PRESENTER: Michael Sherman

ABSTRACT. Making scientific instruments for computer science research available and open to all is more important than ever given the constantly increasing pace of opportunity and innovation – yet, such instruments are expensive to build and operate given their complexity and need for rapid evolution to keep pace with the advancing frontier of science. This paper describes how we can lower the cost of computer science testbeds by making them easier to deploy and operate. We present CHI-in-a-Box, a packaging of CHameleon Infrastructure (CHI) underlying the Chameleon testbed, describe the practices that went into its design and implementation, and present three case studies of its use.

11:30
Developing Accurate Slurm Simulator

ABSTRACT. A new Slurm simulator compatible with the latest Slurm version has been produced. It was constructed by systematically transforming the Slurm code step by step to maintain the proper scheduler output realization while speeding up simulation time. To test this simulator, a container-based Virtual Cluster was generated which fully mimicked a production HPC cluster. As for all Slurm simulators, the realization is a stochastic process dependent on the computational hardware. Under favorable conditions the simulator is able to approximate the actual Slurm scheduling realization. The simulation fidelity is sufficient to use the simulator for its main function, that is, to test Slurm parameter configurations without having to experiment on full production systems.

11:45
Comparing Single-node and Multi-node performance of an Important Fusion HPC Code Benchmark
PRESENTER: Igor Sfiligoi

ABSTRACT. Fusion simulations have traditionally required the use of leadership scale High Performance Computing (HPC) resources in order to produce advances in physics. The impressive improvements in compute and memory capacity of many-GPU compute nodes are now allowing some problems that once required a multi-node setup to be solvable on a single node. When possible, the increased interconnect bandwidth can result in order of magnitude higher science throughput, especially for communication-heavy applications. In this paper we analyze the performance of the fusion simulation tool CGYRO, an Eulerian gyrokinetic turbulence solver designed and optimized for collisional, electromagnetic, multiscale simulation, which is widely used in the fusion research community. Due to the nature of the problem, the application has to work on a large multi-dimensional computational mesh as a whole, requiring frequent exchange of large amounts of data between the compute processes. In particular, we show that the average-scale nl03 benchmark CGYRO simulation can be run at an acceptable speed on a single Google Cloud instance with 16 A100 GPUs, outperforming 8 NERSC Perlmutter Phase1 nodes, 16 ORNL Summit nodes and 256 NERSC Cori nodes. Moving from a multi-node to a single-node GPU setup, we get comparable simulation times using less than half the number of GPUs. Larger benchmark problems, however, still require a multi-node HPC setup due to GPU memory capacity needs, since at the time of writing no vendor offers nodes with a sufficient GPU memory setup. The upcoming external NVSwitch does, however, promise to deliver an almost equivalent solution for up to 256 NVIDIA GPUs.

10:30-12:00 Session 9C: Applications Track: Optimization and Enhancement Tools

Applications Track 1

10:30
Performance Optimization of the Open XDMoD Datawarehouse
PRESENTER: Gregary Dean

ABSTRACT. Open XDMoD is an open source tool to facilitate the management of high performance computing resources. It is widely deployed at academic, industrial, and governmental HPC centers and is used to monitor large and small HPC and cloud systems. The core of Open XDMoD is a MySQL-based data warehouse that is designed to support the storage of historical information for hundreds of millions of jobs with a fast query time for the interactive web portal. In this paper, we describe the transition that we made from the MyISAM to the InnoDB storage engine. Other improvements were also made to the database queries, such as reordering and adding indices. We were able to attain substantial performance improvements in both query execution and data ingestion/aggregation. It is a common trend that databases tend to grow in size and complexity throughout their lifetime; this work presents a practical guide for the types of practices and procedures that can be done to maintain data retrieval and ingestion performance.
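
The paper's actual schema changes are not listed in the abstract; the statements below are only a generic illustration of the two kinds of change it describes, switching a table's storage engine from MyISAM to InnoDB and adding an index to speed a common query. The database, table, and column names are placeholders.

    # Generic MySQL maintenance illustration (database/table/column names are hypothetical).
    # Requires: pip install mysql-connector-python
    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="xdmod",
                                   password="secret", database="modw")
    cur = conn.cursor()
    # Migrate a fact table from MyISAM to InnoDB.
    cur.execute("ALTER TABLE jobfact ENGINE=InnoDB")
    # Add an index to speed up a common time-window query.
    cur.execute("ALTER TABLE jobfact ADD INDEX idx_end_time (end_time_ts)")
    conn.commit()
    cur.close()
    conn.close()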

11:00
AlphaFold2 Workflow Optimization for High Throughput Predictions in HPC Environment
PRESENTER: Edwin Posada

ABSTRACT. In this work, we propose a high-throughput implementation that executes AlphaFold2 efficiently in a High-Performance Computing environment. In this case, we have tested our proposed workflow with the T1050 CASP14 sequence on PSC’s Bridges-2 HPC system. The results showed an improvement in computation-only runtimes and the opportunity to reuse the protein databases when calculating many structures simultaneously, which would lead to massive time savings while maximizing the utilization of computing resources.

11:15
Extending Functionalities on a Web-based Portal for Research Computing
PRESENTER: Duy Pham

ABSTRACT. This paper introduces a research computing portal built as an extension to the Open OnDemand (OOD) framework. The portal was implemented entirely by students at Texas A&M University's (TAMU) High Performance Research Computing (HPRC) facility. It offers an intuitive way for researchers to see all their research computing information on a single web page, including billing accounts, file quotas, recently completed jobs, and currently running jobs. A researcher can also view detailed job information for both running and completed jobs. The dashboard also provides a fully visual interface for creating jobs and “offloading” user codes to the cluster, as well as functionality to manage accounts and request quota increases and software installations.

11:30
Data Discoverability in Science Gateways at Scale using Elasticsearch Cluster Architecture

ABSTRACT. Science gateways allow science & engineering communities to access shared data, software, computing services, instruments, educational materials, and other resources specific to their disciplines. One specific example is the use of science gateways to connect researchers with HPC resources by providing a graphical interface to submit jobs and manage shared data sets. In addition to job and data management, robust search features are a highly valuable addition to gateways because they enhance the navigability of a user’s personal data as well as the discoverability of collaborative data resources. For a facility managing multiple science gateway products, maintaining up-to-date search indices is a challenge. In this paper, we discuss our framework, architecture, and operation of a multitenant Elasticsearch cluster designed to fulfill the search needs of an expanding portfolio of science gateways. By leveraging Elasticsearch’s distributed data model and role-based access control, we designed a secure search solution which has scaled to over 200 million indexed entities representing approximately 700 terabytes of research data.
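
As a hedged sketch of the per-tenant isolation pattern described (not the facility's actual configuration), the snippet below uses Elasticsearch's security REST API to create a role confined to one gateway's index pattern, a user bound to that role, and a sample document index. All index names, role names, URLs, and credentials are invented.

    # Illustrative multitenancy sketch using Elasticsearch's security REST API.
    # All names/credentials/URLs are placeholders; assumes security is enabled.
    import requests

    ES = "https://localhost:9200"
    ADMIN = ("elastic", "changeme")                      # placeholder admin credentials

    # Role limited to indices belonging to one science gateway.
    requests.put(f"{ES}/_security/role/gateway_a_role", auth=ADMIN, json={
        "indices": [{
            "names": ["gateway-a-*"],                    # tenant-specific index pattern
            "privileges": ["read", "write", "create_index"],
        }]
    }, verify=False).raise_for_status()

    # Service account for that gateway, mapped to the role above.
    requests.put(f"{ES}/_security/user/gateway_a_svc", auth=ADMIN, json={
        "password": "placeholder-password",
        "roles": ["gateway_a_role"],
    }, verify=False).raise_for_status()

    # The gateway then indexes and searches only within its own indices.
    requests.post(f"{ES}/gateway-a-files/_doc",
                  auth=("gateway_a_svc", "placeholder-password"),
                  json={"path": "/data/run42/output.h5", "owner": "user1"},
                  verify=False).raise_for_status()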

11:45
ScriptManager: An Interactive Platform for Reducing Barriers to Genomics Analysis
PRESENTER: Olivia Lang

ABSTRACT. ScriptManager was built to be a lightweight and easy-to-use genomics analysis tool for novice bioinformaticians. It includes a graphical interface for easy navigation of inputs and options, while also supporting a command line interface for automation and integration with workflow managers like Galaxy. We describe here how a user unfamiliar with the command line can leverage national supercomputing resources using a graphical desktop interface like Open OnDemand to perform their analyses and generate publication-quality figures for their research. Widespread adoption of this tool in the genomics community would lower technical barriers to accessing supercomputing resources and allow biochemists to prototype their own workflows that can be integrated into large-scale production pipelines. Source code and precompiled binaries are available at https://github.com/CEGRcode/scriptmanager.

10:30-12:00 Session 9D: Workforce Track: Outreach & education

Workforce Track 1

10:30
Building Experience and Confidence in HPC Practitioners through the Project-Based, Hands-On Practical HPC Course
PRESENTER: Lauren Milechin

ABSTRACT. The MIT SuperCloud and Lincoln Laboratory Supercomputing Center have been introducing High Performance Computing (HPC) to a new audience through the "Practical High Performance Computing: Scaling Beyond your Laptop" class for the past four years. This informal class, open to the entire MIT community, introduces HPC, identifies canonical HPC workflows, and provides hands-on activities to explore the challenges encountered in the HPC environment. The students use their own research applications as project work to apply the class concepts to gain experience and confidence in using an HPC system and throughout the scaling process. Survey data collected before and after each class demonstrate that students feel they gain familiarity and experience in the concepts taught in the course and confidence in their own ability to apply those concepts.

10:45
HPC Outreach and Education at Nebraska

ABSTRACT. Outreach and education play a critical role in any high-performance computing (HPC) center to help researchers accelerate their research and analyses. At the University of Nebraska's Holland Computing Center (HCC), this remains true with numerous users, classes, and research groups utilizing the high-performance resources available at HCC. The Holland Computing Center is working on expanding and growing the capabilities of researchers using HPC resources and reducing the barrier to using HPC resources at all stages of experience. This is currently accomplished with training events such as workshops and tutorials, documentation, and different tools. With these, HCC is aiming to further improve training and learning opportunities for researchers utilizing HCC resources.

11:00
Building the Research Innovation Workforce: Challenges and Recommendations from a Virtual Workshop to Advance the Research Computing Community
PRESENTER: Thomas Hacker

ABSTRACT. The workforce for research computing, cyberinfrastructure, and data analytics is a complex global ecosystem comprised of workers across academia, national laboratories, and industry. To explore the underlying factors that affect the growth and vitality of this workforce ecosystem, we conducted an NSF funded virtual workshop during the third quarter of 2020 attended by 100 participants. The workshop identified challenges affecting the workforce pipeline and ecosystem and generated recommendations to help address these challenges. This paper provides a summary of the workshop, challenges, and recommendations.

11:30
IndySCC: A New Student Cluster Competition That Broadens Participation
PRESENTER: Darshan Sarojini

ABSTRACT. This paper describes IndySCC, a cloud-based sister competition to the Student Cluster Competition (SCC) held at the SuperComputing (SC) conference. The competition furthers the educational mission of SC and serves as a path to incorporating students into the HPC community. This paper also reports demographic data of the SCC applicants from 2010 to 2021. IndySCC was developed to reduce entry barriers to HPC research and education and create an education-focused track. It included teams with less competitive SCC applications to help them develop stronger applications in the future, to help maintain team participation, and keep university curriculum programs healthy. IndySCC was conducted virtually in a university-course style spanning July to November. Teams were provided with HPC hardware through Chameleon Cloud. In addition, teams received continuous support from HPC experts in the months leading up to the competition and received feedback on assignments periodically. Finally, the teams competed by running real-world applications with maximum throughput for 49 hours straight.

11:45
SimVascular Gateway for Education and Research
PRESENTER: Justin Tran

ABSTRACT. Over the last two decades, science gateways have become essential tools for supporting both research and education. The SimVascular application is an open-source software package providing a complete pipeline from medical image data segmentation to patient-specific blood flow simulation and analysis. With an ever-increasing user base of students, educators, clinicians, and researchers, the development group wanted a user-friendly web portal that lets users run SimVascular flow simulations, supports a large number of users with minimal effort, and hides the complexity of using HPC systems. This paper discusses how the SimVascular Science Gateway became a tool for students, educators, and researchers of all levels and continues to gather and grow a strong research community.

10:30-12:00 Session 9E: Panel: Campus Research Computing Consortium (CaRCC) Town Hall
10:30
Campus Research Computing Consortium (CaRCC) Town Hall
PRESENTER: Dana Brunson

ABSTRACT. CaRCC – the Campus Research Computing Consortium – is an organization of dedicated professionals developing, advocating for, and advancing campus research computing and data and associated professions. CaRCC advances the frontiers of research by improving the effectiveness of research computing and data (RCD) professionals, including their career development and visibility, and their ability to deliver services and resources for researchers. CaRCC connects RCD professionals and organizations around common objectives to increase knowledge sharing and enable continuous innovation in research computing and data capabilities. The new RCD Nexus CI CoE pilot provides an RCD Resource and Career Center to share products and resources with the community. This panel will gather CaRCC leaders and community members to discuss recent products and significant activities that CaRCC has supported as well as new initiatives for 2022 and beyond. The main goal of the panel is to provide a forum for the community to share their ideas and input on how CaRCC can serve the RCD Professional community, to discuss priorities for CaRCC going forward, and to identify productive partnerships between CaRCC and peer organizations in supporting RCD Professionals.

12:00-13:30 Session 10: Co-located event in The Square
12:00
ACCESS Breakfast / Lunch gathering(s)

13:30-14:30 Session 11A: BOF in Studio 1
13:30
Open OnDemand User Group Meeting
PRESENTER: Alan Chalker

ABSTRACT. The goal of this BoF is to provide a forum for the Open OnDemand (OOD) community to exchange experiences and best practices, as well as to engage with the project development team.

13:30-14:30 Session 11C: BOF in Studio 2
13:30
NSF innovative computing technology testbed community exchange
PRESENTER: Robert Harrison

ABSTRACT. Proposed is a BoF to bring together the leadership, technical staff, and user communities of the National Science Foundation’s (NSF) innovative/prototype technology testbeds supported by category II of the NSF program “Advanced Computing Systems & Services: Adapting to the Rapid Evolution of Science and Engineering Research.” These diverse testbeds examine novel technologies, architectures, usage modes, etc., and explore new target applications, methods, and paradigms for discovery in science and engineering. Despite the great diversity of activities and resources, which span networking, novel and advanced processor technologies, and advancing AI and its applications, these testbeds have many common interests, communities, and potential synergies. This BoF will continue and expand the conversations initiated and explored over the last few PEARC conferences and has the explicit objectives of:

• more widely advertising the availability, capabilities, and successes of these resources, and especially their motivating technology opportunities and challenges;

• strengthening collaborations and partnerships between the facilities and their teams, including identifying new potential synergies; and

• bringing both the teams and their user communities together to share experiences and best practices, and to start conversations that will broaden/deepen impact and create/identify new opportunities.

The BoF is an opportunity for testbed sites to share their experiences, including challenges in deploying, installing, and running these unique systems, on-boarding users, user education, and porting/tuning applications, and to explore synergies with other sites and meet potential future collaborators. Similarly, for current or potential users it is an opportunity to meet testbed staff in person, to learn more about available sites, to meet other users, and to have open conversations about the reality of the systems as deployed. The BoF will be structured as a 90-minute session (to be adjusted in accord with the conference schedule) and will comprise three sections:

1. 5- to 8-minute presentations (slides optional) from each member of a panel comprising one member from each testbed project. Panel members will be asked to directly address the objectives listed above (circa 45 minutes).

2. Open-mike Q&A with the panel and the audience in response to the presentations (circa 15 minutes).

3. Open-mike discussion structured around the objectives stated above, with active involvement from the chair to keep the conversation moving and focused and to solicit audience participation (circa 30 minutes).

The session will be chaired by Robert Harrison, and the panel will represent all testbed sites attending PEARC, not just those from the list of authors. A scribe will take notes during the discussion, and the lead authors (Harrison and Siegmann) will prepare a brief report that, after review by our co-authors, will be shared with the community through the appropriate mechanisms consistent with past practices (e.g., XSEDE and testbed news feeds, HPCwire).

13:30-14:30 Session 11D: BOF in The Loft
13:30
Reimagining Visualization Support at the Research University
PRESENTER: Eric Wernert

ABSTRACT. This Birds-of-a-Feather session is intended for PEARC22 attendees who are interested in the role of visualization and visualization support groups at research universities. The goals of this proposed session are to 1) understand the unique opportunities and challenges related to visualization support; 2) identify the commonalities and differences among the groups and institutions in the community; 3) share site-specific challenges, success stories, insights, and best practices in supporting visualization; and 4) discuss potential methods to create a more enduring dialog and methods for information sharing across the community.

15:00-16:30 Session 12A: Systems Track: Security

Systems Track 1

15:00
Corralling Sensitive Data in the Wild West: Supporting Research with Highly Sensitive Data

ABSTRACT. Due to increased demand from researchers working with highly sensitive data, UC Berkeley developed the Secure Research Data and Compute (SRDC) platform and service. This article describes the design and architecture of the platform as well as key use cases for researchers working on SRDC. The article concludes with observations and lessons learned about the platform and service.

15:30
SciAuth: A Lightweight End-to-End Capability-Based Authorization Environment for Scientific Computing
PRESENTER: Jim Basney

ABSTRACT. We introduce a new end-to-end software environment that enables experimentation with using SciTokens for capability-based authorization in scientific computing. This set of interconnected Docker containers enables science projects to gain experience with the SciTokens model prior to adoption. It is a product of our SciAuth project, which supports the adoption of the SciTokens model through community engagement, support for coordinated adoption of community standards, assistance with software integration, security analysis and threat modeling, training, and workforce development.
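
The SciAuth containers themselves are not shown in the abstract; as a minimal, assumption-laden sketch of the capability-based model, the snippet below treats a SciToken as the JWT it is and checks its scope claim with PyJWT before authorizing a write. The issuer, audience, key handling, and scope string are illustrative; a production service would fetch the issuer's public keys and use the scitokens library's full validation instead.

    # Simplified capability check for a SciToken-style JWT (illustrative only).
    # pip install pyjwt cryptography; a real deployment should resolve the issuer's
    # keys dynamically (e.g. via the scitokens library) rather than a local PEM file.
    import jwt

    TRUSTED_ISSUER = "https://demo.scitokens.org"             # assumed issuer
    PUBLIC_KEY_PEM = open("issuer_public_key.pem").read()     # assumed pre-fetched key

    def authorize(token: str, needed_scope: str = "write:/protected") -> bool:
        claims = jwt.decode(token, PUBLIC_KEY_PEM,
                            algorithms=["RS256", "ES256"],
                            audience="https://storage.example.org",   # assumed audience
                            issuer=TRUSTED_ISSUER)
        # SciTokens express capabilities as space-separated entries in "scope".
        scopes = claims.get("scope", "").split()
        return any(s == needed_scope or needed_scope.startswith(s) for s in scopes)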

15:45
Designing a Vulnerability Management Dashboard to Enhance Security Analysts’ Decision Making Processes

ABSTRACT. Network vulnerability management reduces threats posed by weaknesses in software, hardware, or organizational practices. As networks and related threats grow in size and complexity, security analysts face the challenges of analyzing large amounts of data and prioritizing and communicating threats quickly and efficiently. In this paper, we report our work-in-progress of developing a vulnerability management dashboard that helps analysts overcome these challenges. The approach uses interviews to identify a typical security analyst workflow and proceeds with an iterative design that relies on real-world data. The vulnerability dashboard development was based on a common security analyst workflow and includes functions to allow vulnerability prioritization according to their age, persistence, and impact on the system. Future work will look to execute full-scale user studies to evaluate the dashboard’s functionality and decision-making utility.

16:00
Artificial Intelligence to Classify and Detect Masquerading Users on HPC Systems from Shell Histories
PRESENTER: Kirby Kuznia

ABSTRACT. Modern high-performance computing (HPC) systems are typically accessed through interactive Linux shell sessions. Composed of login and compute nodes, the system is accessed by researchers who first log into a login node and are provided a bash shell session. By default, bash shell histories are recorded up to a certain number of commands. Since 2013, Arizona State University HPC clusters have enabled a login-sourced shell utility that records session histories to a hidden user home directory. These HPC shell histories are typically used to help diagnose researcher issues as they arise and have proved invaluable in that respect. However, these histories may have additional value: they can characterize how researchers engage with HPC systems, or perhaps be leveraged to foster collaboration or improve the HPC research cycle. This study documents a novel analysis of these prospective datasets by training two different machine learning methods on typical shell behavior to detect masquerading users.
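
The study's two ML methods and its dataset are not reproduced here; the sketch below only illustrates the general shape of such a classifier, vectorizing per-user command histories with TF-IDF and fitting a random forest to flag sessions that do not look like their claimed owner. The data is a synthetic toy example and scikit-learn is assumed to be installed.

    # Toy illustration of masquerade detection from shell histories (not the paper's models).
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline

    # Each sample is one session's command history; labels are the owning user.
    sessions = [
        "module load gcc; sbatch run.sh; squeue -u alice",
        "squeue -u alice; tail -f slurm-123.out; scancel 123",
        "nano model.py; python model.py; git commit -am tweak",
        "git pull; python model.py --epochs 50; nvidia-smi",
    ]
    owners = ["alice", "alice", "bob", "bob"]

    clf = make_pipeline(TfidfVectorizer(token_pattern=r"[^\s;]+"),
                        RandomForestClassifier(random_state=0))
    clf.fit(sessions, owners)

    # A new session claimed by "alice": a low predicted probability is suspicious.
    suspect = "git pull; python model.py --epochs 100; nvidia-smi"
    prob_alice = clf.predict_proba([suspect])[0][list(clf.classes_).index("alice")]
    print(f"P(session really belongs to alice) = {prob_alice:.2f}")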

16:15
Migrating towards Single Sign-On and Federated Identity
PRESENTER: Jason Anderson

ABSTRACT. This paper describes a two-tier architecture and implementation for single sign-on (SSO) federated identity support in Chameleon and the rationale that shaped it. We also describe how we migrated our users to a new account management system in privacy-preserving ways, a community that numbered in several thousand users and had created hundreds of thousands of digital artifacts.

15:00-16:30 Session 12B: Systems Track: HPC Management

Systems Track 2

15:00
A Fully Automated Scratch Storage Cleanup Tool for Heterogeneous Parallel Filesystems
PRESENTER: Fang Liu

ABSTRACT. Transitional data are a common component of most large-scale simulations and data analysis. Most research computing centers provide scratch storage to keep temporary data needed only during the runtime of jobs. Efficient management of scratch storage becomes critical for HPC centers with limited resources. Different research computing centers employ various policies and approaches to sustain the tricky balance between filesystem capabilities, user expectations, and excellence in customer support. In this paper, we present a homegrown, fully automated scratch storage cleanup tool, along with the policies and procedures that we’ve built around it. This tool runs without human intervention to clean up our scratch space periodically, and it is compatible with both GPFS and Lustre, two popular parallel filesystems that we use for our scratch service. Our approach takes into consideration both filesystems’ unique features and makes the workflow generic enough to accommodate these differences. The workflow has successfully run at our center for several months. Because of the limited literature documenting this type of work, we share our experience with the community to benefit other centers with similar needs.
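
The production tool and its GPFS/Lustre-specific machinery are not shown in the abstract; purely as an illustration of the core idea, the sketch below walks a scratch tree and flags files whose access time exceeds a retention window. The 60-day window and mount point are placeholders, the script is a dry run by default, and a real deployment would use the filesystems' policy/scan interfaces, exemption lists, and logging rather than a Python walk.

    # Minimal illustration of an age-based scratch purge (dry run by default).
    import os
    import time

    SCRATCH = "/scratch"          # placeholder mount point
    MAX_AGE_DAYS = 60             # placeholder retention policy
    DRY_RUN = True

    cutoff = time.time() - MAX_AGE_DAYS * 86400
    for root, dirs, files in os.walk(SCRATCH):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.lstat(path).st_atime < cutoff:
                    print(f"purge candidate: {path}")
                    if not DRY_RUN:
                        os.remove(path)
            except OSError:
                pass   # file vanished or is unreadable; skip it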

15:30
Anvil - System Architecture and Experiences from Deployment and Early User Operations
PRESENTER: Carol Song

ABSTRACT. Anvil is a new XSEDE advanced capacity computational resource funded by NSF. Designed with a systematic strategy to meet the ever increasing and diversifying research needs for advanced computational capacity, Anvil integrates a large capacity high-performance computing (HPC) system with a comprehensive ecosystem of software, access interfaces, programming environments, and composable services in a seamless environment to support a broad range of current and future science and engineering applications of the nation’s research community. Anchored by a 1000-node CPU cluster featuring the latest AMD EPYC 3rd generation (Milan) processors, along with a set of 1TB large memory and NVIDIA A100 GPU nodes, Anvil integrates a multi-tier storage system, a Kubernetes composable subsystem, and a pathway to Azure commercial cloud to support a variety of workflows and storage needs. Anvil was successfully deployed and integrated with XSEDE during the world-wide COVID-19 pandemic. Entering production operation in February 2022, Anvil will serve the nation’s science and engineering research community for five years. This paper describes the Anvil system and services, including its various components and subsystems, user facing features, and shares the Anvil team’s experience through its early user access program from November 2021 through January 2022.

16:00
Phoenix: The Revival of Research Computing and the Launch of the New Cost Model at Georgia Tech
PRESENTER: Aaron Jezghani

ABSTRACT. Originating from partnerships formed by central IT and researchers supporting their own clusters, the traditional condominium and dedicated cluster models for research computing are appealing and prevalent among emerging centers throughout academia. In 2008, Georgia Institute of Technology (GT) launched a campus strategy to centralize the hosting of computing resources across multiple science and engineering disciplines under a group of expert support personnel, and in 2009 the Partnership for an Advanced Computing Environment (PACE) was formed. Due to the increases in scale over the past decade, however, the initial models created challenges for the research community, systems administrators, and GT’s leadership. In 2020, GT launched a strategic initiative to revitalize research computing through a refresh of the infrastructure and computational resources in parallel with the migration to a new state-of-the-art datacenter, Coda, followed by the transition to a new consumption-based cost model. These efforts have resulted in an overall increase in cluster utilization, access to more hardware, a decrease in queue wait times, a reduction in resource provision times, and an increase in return on investment, suggesting that such a model is highly advantageous for academic research computing centers. Presented here are the methods employed in making the change to the new cost model, data supporting these claims, and the ongoing improvements to continue meeting the needs of the GT research community, whose research is accelerated by the deployment of the new cost model and the Phoenix cluster, which ranked #277 on the November 2020 Top500 list.

15:00-16:30 Session 12C: Applications Track: Applications in Container and Cloud

Applications Track 1

15:00
C3F: A Collaborative Container-based Model Coupling Framework
PRESENTER: Jungha Woo

ABSTRACT. Solving complex real-world grand challenge problems requires in-depth collaboration of researchers from multiple disciplines. Such collaboration often involves harnessing multiscale and multi-dimensional data and combining models from different fields to simulate systems. However, the progress on this front has been limited mainly due to significant gaps in domain knowledge and tools that are typically employed in disciplinary silos. Researchers from different fields face considerable barriers to understanding and reusing each other’s data/models in order to collaborate effectively. For example, in solving global sustainability problems, researchers from hydrology, climate science, agriculture, and economics need to run their respective models to study different components of the global and local food, energy and water systems while, at the same time, needing to interact with other researchers and integrate the results of one model with another. Developing this kind of model coupling workflow calls for (1) a large amount of data being processed and exchanged across domains and organizations, (2) identifying and processing the output of one model to make it ready for integration into another model, (3) controlling the workflow dynamically so that it runs until a certain convergence condition or other criterion is met, and (4) close collaboration among the modelers to explore, tune, and test the configuration and data transformation needed to link the models. We have developed C3F, a flexible collaborative model coupling framework to help researchers accelerate their model integration and linking efforts by leveraging advanced cyberinfrastructure such as high-performance computing and virtual containers. In this paper, we describe our experience and lessons learned in developing this cyberinfrastructure solution to support the linking of the Water Balance Model (WBM) and the SIMPLE-G agricultural economic model in an NSF-funded INFEWS project and a DOE-funded Program on Coupled Human and Earth Systems (PCHES) to study the implications of groundwater scarcity for food-energy-water systems. The C3F model coupling framework can be extended to facilitate other model linkages as well.
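
C3F itself is not shown in the abstract; the loop below is only a schematic of the iterate-until-convergence coupling pattern it describes, running two containerized models in turn over a shared data volume until an exchanged quantity stabilizes. The container image names, exchange-file format, metric, and convergence test are all invented for illustration.

    # Schematic of a container-based model-coupling loop (all names are hypothetical).
    import json
    import os
    import subprocess

    def run_model(image: str, workdir: str) -> None:
        """Run one containerized model step with a shared data volume."""
        subprocess.run(["docker", "run", "--rm", "-v", f"{workdir}:/data", image],
                       check=True)

    def water_withdrawal(workdir: str) -> float:
        with open(os.path.join(workdir, "coupling.json")) as f:   # assumed exchange file
            return json.load(f)["withdrawal_km3"]

    workdir, tol, prev = "/tmp/c3f-run", 1e-3, float("inf")
    os.makedirs(workdir, exist_ok=True)
    for iteration in range(20):
        run_model("wbm-model:latest", workdir)        # hydrology step (placeholder image)
        run_model("simple-g:latest", workdir)         # economics step (placeholder image)
        current = water_withdrawal(workdir)
        if abs(current - prev) < tol:                 # stop once the coupled value settles
            print(f"converged after {iteration + 1} iterations")
            break
        prev = current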

15:30
Parallel Multi-Physics Simulation of Biomass Furnace and Cloud-based Workflow for SMEs
PRESENTER: Xavier Besseron

ABSTRACT. Biomass combustion is a well-established process to produce energy that offers a credible alternative to reduce the consumption of fossil fuel. To optimize the process of biomass combustion, numerical simulation is a less expensive and time-effective approach than the experimental method. However, biomass combustion involves intricate physical phenomena that must be modeled (and validated) carefully, in the fuel bed and in the surrounding gas. With this level of complexity, these simulations require the use of High-Performance Computing (HPC) platforms and expertise, which are usually not affordable for manufacturing SMEs.

In this work, we developed a parallel simulation tool for the simulation of biomass furnaces that relies on a parallel coupling between Computation Fluid Dynamics (CFD) and Discrete Element Method (DEM). This approach is computation-intensive but provides accurate and detailed results for biomass combustion with a moving fuel bed. Our implementation combines FOAM-extend (for the gas phase) parallelized with MPI, and XDEM (for the solid particles) parallelized with OpenMP, to take advantage of HPC hardware. We also carry out a thorough performance evaluation of our implementation using an industrial biomass furnace setup. Additionally, we present a fully automated workflow that handles all steps from the user input to the analysis of the results. Hundreds of parameters can be modified, including the furnace geometry and fuel settings. The workflow prepares the simulation input, delegates the computing-intensive simulation to an HPC platform, and collects the results. Our solution is integrated into the Digital Marketplace of the CloudiFacturing EU project and is directly available to SMEs via a Cloud portal.

As a result, we provide a cutting-edge simulation of a biomass furnace running on HPC. With this tool, we demonstrate how HPC can benefit engineering and manufacturing SMEs and empower them to compute and solve problems that could not be tackled otherwise.

16:00
The C-MĀIKI Gateway: A Modern Science Platform for Analyzing Microbiome Data
PRESENTER: Sean Cleveland

ABSTRACT. In collaboration with the Center for Microbiome Analysis through Island Knowledge and Investigations (C-MĀIKI), the Hawaii EPSCoR Ike Wai project and the Hawaii Data Science Institute, a new science gateway, the C-MĀIKI gateway, was developed to support modern, interoperable and scalable microbiome data analysis. This gateway provides a web-based interface for accessing high-performance computing resources and storage to enable and support reproducible microbiome data analysis. The C-MĀIKI gateway is accelerating the analysis of microbiome data for Hawaii through ease of use and centralized infrastructure.

15:00-16:30 Session 12D: Applications Track: Machine Learning and Data Tools

Applications Track 2

15:00
A Framework to Capture and Reproduce the Absolute State of Jupyter Notebooks

ABSTRACT. Jupyter Notebooks are an enormously popular tool for creating and narrating computational research projects. They also have enormous potential for creating reproducible scientific research artifacts. Capturing the complete state of a notebook has additional benefits; for instance, the notebook execution may be split between local and remote resources, where the latter may have more powerful processing capabilities or store large or access-limited data. There are several challenges for making notebooks fully reproducible when examined in detail. The notebook code must be replicated entirely, and the underlying Python runtime environments must be identical. More subtle problems arise in replicating referenced data, external library dependencies, and runtime variable states. This paper presents solutions to these problems using Jupyter’s standard extension mechanisms to create an archivable system state for a running notebook. We show that the overhead for these additional mechanisms, which involve interacting with the underlying Linux kernel, does not introduce substantial execution time overheads, demonstrating the approach’s feasibility.
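
The paper's Jupyter-extension mechanisms are not reproduced here; as a rough sketch of what capturing notebook state involves, the cell below dumps the kernel's variable state with dill and records the exact package environment with pip freeze. This ignores referenced data and the lower-level system state the paper also handles, and the file names are invented.

    # Simplified state capture from inside a running notebook (illustrative only).
    # pip install dill; the archive layout is invented.
    import subprocess
    import dill

    # 1. Serialize the interpreter session (global variables, defined functions, ...).
    dill.dump_session("notebook_state.pkl")

    # 2. Pin the exact Python environment alongside it.
    freeze = subprocess.run(["pip", "freeze"], capture_output=True, text=True, check=True)
    with open("requirements.lock.txt", "w") as f:
        f.write(freeze.stdout)

    # Later, on another machine with the same environment installed:
    #   dill.load_session("notebook_state.pkl")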

15:30
Validating new Automated Computer Vision Workflows to Traditional Automated Machine Learning
PRESENTER: Davin Lin

ABSTRACT. This paper presents some experiments to validate the design of an Automated Computer Vision (AutoCV) library for applications in scientific image understanding. AutoCV attempts to define a search space of algorithms used in common image analysis workflows and then uses a fitness function to automatically select individual algorithmic workflows for a given problem. The final goal is a semi-automated system that can assist researchers in finding specific computer vision algorithms that work for their specific research questions. As an example of this method the researchers have built the SEE-Insight tool which uses genetic algorithms to search for image analysis workflows. This tool has been used to implement an image segmentation workflow (SEE-Segment) and is being updated and modified to work with other image analysis workflows such as anchor point detection and counting. This work is motivated by analogous work being done in Automated Machine Learning (AutoML). As a way to validate the approach, this paper uses the SEE-Insight tool to recreate an AutoML solution (called SEE-Classify) and compares results to an existing AutoML solution (TPOT). As expected, the existing AutoML tool worked better than the prototype SEE-Classify tool. However, the goal of this work was to learn from these well-established tools and possibly identify one of them that could be modified as a mature replacement for the core SEE-Insight search algorithm. Although this drop-in replacement was not found, reproducing the AutoML experiments in the SEE-Insight framework provided quite a few insights into best practices for moving forward with this research.
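
The paper's actual TPOT configuration is not given in the abstract; the snippet below is only a generic example of the kind of AutoML baseline it compares against, letting TPOT search classifier pipelines on a bundled dataset with a deliberately tiny budget so it runs quickly.

    # Generic TPOT baseline run (not the paper's experiment); pip install tpot.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    # A tiny search budget; real comparisons would use far more generations.
    automl = TPOTClassifier(generations=3, population_size=20,
                            random_state=0, verbosity=2)
    automl.fit(X_tr, y_tr)
    print("held-out accuracy:", automl.score(X_te, y_te))
    automl.export("best_pipeline.py")   # emits the winning scikit-learn pipeline as code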

15:45
CyberGIS for Scalable Remote Sensing Data Fusion
PRESENTER: Fangzheng Lyu

ABSTRACT. Satellite remote sensing data products are widely used in many applications and science domains ranging from agriculture and emergency management to Earth and environmental sciences. Researchers have developed sophisticated and computationally intensive models for processing and analyzing such data with varying spatiotemporal resolutions from multiple sources. However, the computational intensity and expertise in using advanced cyberinfrastructure have held back the scalability and reproducibility of such models. To tackle this challenge, this research employs the CyberGIS-Compute middleware to achieve scalable and reproducible remote sensing data fusion across multiple spatiotemporal resolutions by harnessing advanced cyberinfrastructure. CyberGIS-Compute is a cyberGIS middleware framework for conducting computationally intensive geospatial analytics with advanced cyberinfrastructure resources such as those provisioned by XSEDE. Our case study achieved remote sensing data fusion at high spatial and temporal resolutions based on integrating CyberGIS-Compute with a cutting-edge deep learning model. This integrated approach also demonstrates how to achieve computational reproducibility of scalable remote sensing data fusion.

16:00
Towards Practical, Generalizable Machine-Learning Training Pipelines to Build Regression Models for Predicting Application Resource Needs on HPC Systems

ABSTRACT. This paper explores the potential for cost-effectively developing generalizable and scalable machine-learning-based regression models for predicting the approximate execution time of an HPC application given its input data and parameters. This work examines: (a) to what extent models can be trained on scaled-down datasets on commodity environments and adapted to production environments, (b) to what extent models built for specific applications can generalize to other applications within a family, and (c) how the most appropriate model may change based on the type of data and its mix. As part of this work, we also describe and show the use of an automatable pipeline for generating the necessary training data and building the model.
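
As an assumption-heavy sketch of the kind of regression model the paper studies (not its actual training pipeline), the code below fits a random forest that predicts run time from a few application input parameters; the feature names and synthetic data are invented.

    # Toy runtime-prediction model (synthetic data; feature names are invented).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 500
    # Hypothetical job features: mesh size, time steps, node count.
    X = np.column_stack([
        rng.integers(64, 2048, n),      # mesh size
        rng.integers(100, 10_000, n),   # time steps
        rng.integers(1, 64, n),         # nodes
    ])
    # Synthetic "true" runtime: work divided by nodes, plus noise.
    y = X[:, 0] * X[:, 1] / (X[:, 2] * 1e4) + rng.normal(0, 5, n)

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    print("cross-validated MAE (synthetic seconds):", -scores.mean())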

16:15
Webots.HPC: A Parallel Simulation Pipeline for Autonomous Vehicles
PRESENTER: Matt Franchi

ABSTRACT. In the rapidly evolving and maturing field of robotics, computer simulation has become an invaluable tool in the design and evaluation process. Autonomous vehicle (AV) and microscopic traffic simulators can be integrated to produce cost-effective tools for simulating AVs in the traffic stream. Our research sets out to develop a formalized parallel pipeline for running sequences of Webots simulations on powerful high performance computing (HPC) resources. Since running these simulations on personal lab computers is challenging, this paper presents a framework to support the Webots and Simulation of Urban Mobility (SUMO) simulation tools in an HPC environment. Simulations can be run in sequence, with a batch job being distributed across an arbitrary number of computing nodes and each node having multiple instances running in parallel. We have demonstrated parallel execution of Webots and SUMO, a microscopic traffic simulator, with as many as 2304 simulations across 6 nodes in a 12-hour period. Overall, this capable pipeline can be used to extend existing research or to serve as a platform for new robotics simulation endeavors. This paper will serve as an important reference for researchers in efficiently simulating AVs in a mixed roadway traffic stream.
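
The Webots.HPC pipeline itself is not reproduced here; the snippet below only sketches the per-node piece of such a scheme, fanning a batch of world files out over a pool of headless Webots instances. The world list is a placeholder, the CLI flags are assumed from recent Webots releases, and the multi-node distribution would come from the batch scheduler rather than this script.

    # Per-node sketch: run several headless Webots simulations in parallel.
    # Flags --batch / --no-rendering / --mode=fast are assumed from recent Webots releases.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    WORLDS = [f"worlds/scenario_{i:03d}.wbt" for i in range(24)]   # placeholder inputs
    INSTANCES_PER_NODE = 4

    def run_one(world: str) -> int:
        proc = subprocess.run(
            ["webots", "--batch", "--no-rendering", "--mode=fast", world],
            capture_output=True, text=True)
        return proc.returncode

    with ThreadPoolExecutor(max_workers=INSTANCES_PER_NODE) as pool:
        results = list(pool.map(run_one, WORLDS))

    print(f"{results.count(0)}/{len(WORLDS)} simulations completed successfully")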

15:00-16:30 Session 12E: Workforce Track: Careers & workforce development

Workforce Track 1

15:00
Understanding Factors that Influence Research Computing and Data Careers
PRESENTER: Shafaq Chaudhry

ABSTRACT. Research Computing and Data (RCD) professionals play a crucial role in supporting and advancing research that involves data and/or computing; however, there is a critical shortage in the RCD workforce, and organizations face challenges in recruiting and retaining RCD professional staff. It is not obvious to people outside of RCD how their skills and experience map to the RCD profession, and staff currently in RCD roles lack resources to create a professional development plan. To address these gaps, the CaRCC RCD Career Arcs working group has embarked upon an effort to gain a deeper understanding of the paths that RCD professionals follow across their careers. An important step in that effort is a recent survey the working group conducted of RCD professionals on key factors that influence decisions in the course of their careers. This survey gathered responses from over 200 respondents at institutions across the United States. This paper presents our initial findings and analyses of the data gathered. We describe how various genders, career stages, and types of RCD roles impact the ranking of these factors, and note that while there are differences across these groups, respondents were broadly consistent in their assessment of the importance of these factors. In some cases, the responses clearly distinguish RCD professionals from the broader workforce, and even other Information Technology professionals.

15:30
Characterizing the U.S. Research Computing and Data (RCD) Workforce

ABSTRACT. A growing share of computationally and data-intensive research, both inside and outside of academia, requires the involvement and support of computing and data professionals. Yet little is known about the composition of the research computing and data (RCD) workforce. This paper presents the results of a survey (N=563) of RCD professionals’ demographic and educational backgrounds, work experience, current positions, job responsibilities, and views of working in the RCD field. We estimate the size of the RCD workforce and discuss how the demographic diversity and distribution of backgrounds of those in the RCD workforce fail to match that of the larger academic and technical workforces. These survey results additionally support the insights of those working in the field concerning the need to recruit a wider variety of professionals into the RCD profession, better define job descriptions and career pathways, and improve institutional recognition for the value of RCD work.

16:00
Expanding the Reach of Research Computing: A Landscape Study: Pathways Bringing Research Computing to Smaller Universities and Community Colleges

ABSTRACT. Research computing continues to play an ever-increasing role in academia. Access to computing resources, however, varies greatly between institutions. Sustaining the growing need for computing skills and access to advanced cyberinfrastructure requires that computing resources be available to students at all levels of scholarship, including community colleges. The National Science Foundation-funded Building Research Innovation in Community Colleges (BRICCs) community set out to understand the challenges faced by administrators, researchers, and faculty in building a sustainable research computing continuum that extends to smaller and two-year, terminal-degree-granting institutions. BRICCs' purpose is to address the technology gaps and encourage the development of the curriculum needed to grow a computationally proficient research workforce. Toward these goals, we performed a landscape study that culminated in a community workshop. Here, we present our key findings from the workshop discussions and identify next steps to be taken by BRICCs, funding agencies, and the broader cyberinfrastructure community.

16:15
Exchanging Best Practices for Supporting Computational and Data-Intensive Research, The Xpert Network

ABSTRACT. We present best practices for professionals who support computational and data-intensive (CDI) research projects. The practices resulted from the Xpert Network activities, an initiative that brings together major NSF-funded projects for advanced cyberinfrastructure, national projects, and university teams that include individuals or groups of such professionals. Additionally, our recommendations are based on years of experience building multidisciplinary applications and teaching computing to scientists. This paper focuses particularly on practices that differ from those in a general software engineering context. This paper also describes the Xpert Network initiative where participants exchange best practices, tools, successes, challenges, and general information about their activities, leading to increased productivity, efficiency, and coordination in the ever-growing community of scientists that use computational and data-intensive research methods.

15:00-16:30 Session 12F: Panel: Campus Champions Fellows Presentations
15:00
Campus Champions Fellows Presentations
PRESENTER: Robert Sinkovits

ABSTRACT. The 2021-2022 XSEDE Campus Champions Fellows program partners Campus Champions with staff from XSEDE’s Extended Collaborative Support Service (ECSS) groups to work side by side for one year on real-world science and engineering projects. Fellows develop expertise within varied areas of cyberinfrastructure, and they are already well positioned to share their advanced knowledge through their roles as the established conduits to students, administrators, professional staff, and faculty on their campuses. In addition to the technical knowledge gleaned from their experiences, the individual Fellows benefit from their personal interactions with the XSEDE staff and will acquire the skills necessary to manage similar user or research group project requests on their own campuses. The Campus Champions Fellows program is a unique opportunity for a select group of individuals to learn first-hand about the application of high-end cyberinfrastructure to challenging science and engineering problems. The 2021-22 Fellows who will be sharing their experiences over the past year are: Xinlian Liu, Hood College; Robert Romero, University of California Merced; Dima Shyshlov, Mass General Brigham; and Derek Strong, University of Southern California.

17:00-18:00 Session 14A: BOF in Studio 1
17:00
Science Gateways and HPCs: Next Generation Integration
PRESENTER: Marlon Pierce

ABSTRACT. This proposed Birds of a Feather (BOF) session will discuss policies and best practices for High Performance Computing (HPC) centers to enable remote access for and integration with science gateways and related cyberinfrastructure. The BOF will also promulgate the efforts of the Science Gateways Next Generation HPC Integration Working Group, co-organized by the Science Gateways Community Institute (SGCI, sciencegateways.org) and Trusted CI (trustedci.org). This working group is an open-community effort with a charter [1] to draft a set of general recommendations for academic research computing centers and science gateway providers to securely enable science gateway integration with research computing resources. The working group will base its recommendations on a broad understanding of HPC center cybersecurity requirements and concerns and on common science gateway access mechanisms.

The specific goals of this BOF session are to a) update the science gateways, research computing, and cybersecurity communities on the outcomes of focus group sessions conducted by the XSEDE evaluation team, and b) promote participation in a broader community survey developed by the working group based on the focus group session outcomes.

17:00-18:00 Session 14B: BOF in Studio 2
17:00
Reproducibility and trustworthiness of Scientific Research

ABSTRACT. This BoF will discuss the recently completed report of the Working Group on Reproducibility and Sustainability for the NSF Advisory Committee on Cyber Infrastructure (ACCI) and seek community feedback about creating a community that will focus on promoting the reproducibility and trustworthiness of scientific research. It will review the catalogue of existing provenance capture and replay tools and discuss the experience of existing reproducibility efforts aimed at defining the essentials a research team needs to capture in order to assure reproducibility and trustworthiness of scientific research. This BoF will also discuss opportunities and challenges for developing support services that expand the user base and lower barriers for capturing artifacts while doing research, and will brainstorm how to work as a community towards a concerted effort to build an ecosystem of tools that support reproducibility.

17:00-18:00 Session 14C: BOF in The Loft
17:00
Words Matter! Progress in Promoting Inclusion through Language in Advanced Research Computing
PRESENTER: Susan Mehringer

ABSTRACT. The advanced research computing community is aware of the need to update its terminology to foster a more inclusive environment. Many projects and organizations active in advanced research computing are working to ensure that the language they use, both formally and informally, is free from terminology that prevents environments from being inclusive for all community members. This applies to documentation, presentations, educational materials, and workplace language. In 2020, the Extreme Science and Engineering Discovery Environment (XSEDE) project started addressing these concerns by forming a Terminology Task Force (TTF). The XSEDE TTF and other invited groups will briefly present their progress to stimulate discussion and exchange best practices among interested PEARC22 participants.

18:30-21:00 Session 15A: Poster Reception

Poster Reception

Using Containers and Tapis to Structure Portable, Composable and Reproducible Climate Science Workflows
PRESENTER: Michael Dodge II

ABSTRACT. Provenance and reproducibility have been growing needs in scientific computing workflows. This project seeks to split the traditionally monolithic code base of a climate data computing workflow into small, functional, and semi-independent containers. Each container image is built from public code repositories, allowing a researcher to determine the exact process that was executed for both technical and scientific validation. The containers are composed into workflows using the Tapis API’s Actor-Based Container (Abaco) system, which can be hosted on a variety of computing infrastructures. They may also be run as standalone containers on computers or virtual machines with Docker installed.

Digital Evidence Acquisition and Deepfake Detection with Decentralized Applications
PRESENTER: Maryam Taeb

ABSTRACT. Smartphones, as essential devices for capturing daily life events, can provide evidence in court by documenting incidents of significant forensic interest. However, not everyone is willing to have all the data on their phone extracted for analysis, due to privacy and personal concerns. Moreover, law enforcement agencies require a great amount of storage to hold the information extracted from a witness's phone. Another important challenge for law enforcement agencies is deepfake detection, as deepfakes are used maliciously as a source of misinformation, manipulation, harassment, and persuasion in court. Blockchain has lately revolutionized the way businesses are traditionally handled. Decentralized applications (Dapps) could be a great solution to authenticate evidence, avoid the spread of misinformation, perform targeted data extraction, and provide a sharing framework that addresses privacy and storage concerns. This project aims to develop a Dapp that provides a trusted distribution channel for users to submit authenticated evidence. By leveraging machine learning classifiers, the platform not only identifies manipulated versus original media before approving it but also uses user-uploaded media to retrain itself, improving the models’ predictions and providing full transparency. The result of this work, with the help of the blockchain consensus concept, can keep a clean, timestamped record of the incident, the uploaded evidence, and useful metadata.

Halcyon: Unified HPC Center Operations: An Extensible Web Framework for Resource Allocation, Documentation, and More
PRESENTER: Shawn Rice

ABSTRACT. Due to the increasing complexity of user and resource management under a shared campus cluster model, particularly with many research groups investing distinct amounts in annual new hardware acquisitions, the Research Computing division at Purdue set out in 2011 to design a cluster management solution that empowers faculty to manage access to their own purchased resources. As operations expanded, the internal portal took on many aspects of the operation of an HPC center beyond resource allocation and management. Eventually, its components included HPC and storage resource management, user management and authorization, customer relations, communications, documentation, and ordering/purchasing. Halcyon reconstitutes these in a modular, extensible framework to allow for the growth and maintenance necessary to encompass all aspects of HPC center management. This allows centers to operate not only more efficiently but more effectively, and to deliver better services to researchers.

Methodology for Imagery Metadata Collection and Entry into a PostgreSQL Database Using Stampede2

ABSTRACT. Agencies such as the National Oceanic and Atmospheric Administration (NOAA), the Texas Natural Resources Information System (TNRIS), and the National Geographic Society, to name a few, collect Light Detection and Ranging (LIDAR) imagery data through small surveys, which are often used to generate digital elevation models (DEMs). Surface water simulations and hazard planning simulations often use these DEM data sets for localized calculations. The data sets are at times kept in data silos, creating bottlenecks modelers must overcome when creating needed applications. Moreover, the imagery data sets are not standardized: they have different coordinate reference systems, spatial sizes, resolutions, and geographic locations. To ease usability, a database was needed that includes metadata for the different types of LIDAR imagery files: where each file is located in the file system, what the file includes, what coordinate reference system is used, and what geographic location is associated with the file. This research explains how the task was accomplished using the Texas Advanced Computing Center's Stampede2 supercomputer and associated storage systems.
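The abstract above describes a metadata catalog recording, for each imagery file, its location on the file system, its contents, its coordinate reference system, and its geographic extent. Purely as an illustration (the table layout, column names, and connection string below are hypothetical, not taken from the project), such a catalog could be sketched in Python with psycopg2 as follows:

```python
# Hypothetical sketch of a LIDAR-metadata table like the one described above.
# Table name, columns, and connection string are illustrative, not the project's.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS lidar_metadata (
    id           SERIAL PRIMARY KEY,
    file_path    TEXT NOT NULL,        -- location of the file on the storage system
    contents     TEXT,                 -- what the file includes (e.g., DEM tile, point cloud)
    crs_epsg     INTEGER,              -- coordinate reference system as an EPSG code
    resolution_m DOUBLE PRECISION,     -- spatial resolution in meters
    min_lon DOUBLE PRECISION, min_lat DOUBLE PRECISION,
    max_lon DOUBLE PRECISION, max_lat DOUBLE PRECISION   -- geographic bounding box
);
"""

def register_file(conn, path, contents, epsg, res, bbox):
    """Insert one imagery file's metadata; bbox = (min_lon, min_lat, max_lon, max_lat)."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO lidar_metadata "
            "(file_path, contents, crs_epsg, resolution_m, min_lon, min_lat, max_lon, max_lat) "
            "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)",
            (path, contents, epsg, res, *bbox),
        )
    conn.commit()

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=lidar user=postgres")  # illustrative connection string
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
    register_file(conn, "/work/example/tile_001.tif", "DEM tile", 6343, 1.0,
                  (-98.6, 29.2, -98.5, 29.3))
```

A catalog of this shape lets modelers query by bounding box or CRS instead of walking the file system, which is the bottleneck the abstract describes.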

HyperShell v2: Distributed Task Execution for HPC
PRESENTER: Geoffrey Lentner

ABSTRACT. HyperShell is an elegant, cross-platform, high-performance computing utility for processing shell commands over a distributed, asynchronous queue. It is a highly scalable workflow automation tool for many-task scenarios. There are several existing tools that serve a similar purpose, but lack some aspect that HyperShell provides (e.g., distributed, detailed logging, automated retries, super scale). Novel aspects of HyperShell include but are not limited to (1) cross-platform, (2) client-server design, (3) staggered launch for large scales, (4) persistent hosting of the server, and optionally (5) a database in-the-loop for restarts and persisting task metadata. HyperShell was originally created to support researchers at Purdue University, out of a specific unmet need. It has been in use for several years now. With this next release, we’ve completely re-implemented HyperShell as both an application and a library to provide new features, scalability, flexibility, robustness, and wider support. (https://github.com/glentner/hyper-shell)

Accelerating PET Image Reconstruction with CUDA
PRESENTER: Ping Luo

ABSTRACT. Yale MOLAR is an in-house Positron Emission Tomography (PET) image reconstruction application written in C++ and MPI. It deals with hundreds of millions of lines-of-response (LORs) independently to reconstruct an image. The nature of the image reconstruction process makes MOLAR an ideal candidate for GPU acceleration. In this study, we present our work on accelerating MOLAR with CUDA, and show the results that demonstrate the effectiveness and correctness of our CUDA implementation. Overall, Yale MOLAR with CUDA runs up to 6 times faster than the CPU-only code, reducing a typical high resolution image reconstruction time from several hours to less than one hour.
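MOLAR itself is written in C++ with MPI and CUDA; purely to illustrate the per-LOR parallelism the abstract relies on, the toy Python/Numba sketch below assigns one GPU thread per line of response and atomically accumulates each contribution into the image array. The array names and random "geometry" are placeholders, not MOLAR code:

```python
# Toy illustration of per-LOR GPU parallelism: one thread per line of response,
# atomic accumulation into the image. Not the MOLAR implementation.
import numpy as np
from numba import cuda

@cuda.jit
def accumulate_lors(voxel_idx, lor_weight, image):
    i = cuda.grid(1)                 # global thread index = LOR index
    if i < voxel_idx.size:
        # LORs are independent; only the image update needs to be atomic.
        cuda.atomic.add(image, voxel_idx[i], lor_weight[i])

n_lors, n_voxels = 5_000_000, 128 ** 3
rng = np.random.default_rng(0)
voxel_idx = rng.integers(0, n_voxels, n_lors)   # placeholder geometry
lor_weight = rng.random(n_lors)

d_idx, d_w = cuda.to_device(voxel_idx), cuda.to_device(lor_weight)
d_img = cuda.to_device(np.zeros(n_voxels))

threads = 256
blocks = (n_lors + threads - 1) // threads
accumulate_lors[blocks, threads](d_idx, d_w, d_img)
print("image total:", d_img.copy_to_host().sum())
```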

Developing a Data Science Outreach Program with Rural Native Americans: Southern California Tribal Youth Participate in DataJam via San Diego Supercomputer Center
PRESENTER: Kimberly Bruch

ABSTRACT. Situated approximately 40 miles northeast of San Diego and 30 miles inland from the Pacific Ocean, the Reservation of the Pala Band of Mission Indians is home to 1250 enrolled members consisting of Cupeños and Luiseños. A vast 20 square miles of valley surrounded by mountains along the San Luis Rey River, the Pala Reservation comprises residential and agricultural areas as well as unused wildlands. The terrain is rough, and roads are steep, winding, and difficult to navigate even under the best of weather conditions (Figure 1).

Many Pala families reside in isolated areas of this rural reservation and rely upon the tribal learning center and youth center for afterschool programming. In an effort to engage youth in programming related to science, technology, engineering and math (STEM) fields – including data science – tribal leadership met with the authors to discuss potential programs. The authors next worked with the Pala Learning Center Director, Pala Youth Center Director, Pala Youth Council, and Pala Tribal Chairman to determine if a data science outreach program would align with student interest.

Simultaneously, a high school program originally developed in Pittsburgh, Pennsylvania, the DataJam, was expanding to new parts of the country, and it seemed like a great fit for the youth at Pala. The DataJam is one of the seed-funded projects from the Northeast Big Data Innovation Hub and includes co-authors Catherine Cramer and Judy Cameron on the project team; meanwhile, co-authors Christine Kirkpatrick and Kimberly Mann Bruch are part of the NSF-funded West Big Data Innovation Hub. The Hubs exist to increase the proliferation of techniques, tools, and methods, as well as to accelerate innovation around big data and data science, by connecting academia, non-profits, industry, and government. Drawing on past work with regional tribal groups, co-authors Mann Bruch, Cramer, Kirkpatrick, and Cameron joined with co-author Doretta Musick and a group of middle and high school students attending an afterschool program at the Pala Youth Center to form the Pala DataJam Team.

The Pala DataJam Team participated in their DataJam research activity from January through April 2022, with weekly in-person meetings with the lead author while co-authors met with the team remotely via Zoom. The team originally consisted of six students; however, numbers decreased to two female middle schoolers between January and March, due to demands from other school activities. With that said, the larger group helped determine the topic for the team’s data science efforts and named their project “Examining pH Data in the San Luis Rey River within the Pala Native American Reservation and Beyond.”

Their research question was whether the pH level of the river's water on their tribal land differed from that in other areas of the river, which flows throughout San Diego County. pH levels are often used as an indicator of water health, given that factors such as sewage outflow or other forms of pollution can impact the pH of a given body of water. To make the comparison between the portion of the river running through their tribal land and previously recorded data, the students collected water samples at a river site on the reservation with the help of co-author Muriel Reid (Figure 2). Reid helped the team examine the pH levels of their samples using a simple Vernier pH sensor connected to a datalogger, and co-author Louise Hicks then helped the team compare them with 101 pH measurements published by the California Department of Water Resources and available on the California Natural Resources Agency website. This data analysis exercise gave the students an opportunity to learn how to complete a t-test using a simple Google Sheets formula; this portion of the activity also allowed for an overview of data science and of careers that involve such work. The students participated in a final competition presentation at the end of the activity, in late April, where they explained their project to a panel of DataJam judges. They received the "Best New Team" Award (Figure 4) among 21 teams. One of the most powerful aspects of the Pala team's DataJam presentation was that it was delivered in English as well as Cupeño, the tribe's Native language. Another critical aspect of the project was ensuring that data collected on or about the tribal land was treated with respect and that ownership was retained by the Native youth conducting the research, operating within the framework of the CARE Principles for Indigenous Data Governance. The tenets of CARE are Collective Benefit, Authority to Control, Responsibility, and Ethics.
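For readers unfamiliar with the analysis, the comparison described above is a two-sample t-test. A rough Python equivalent of the spreadsheet formula is sketched below, with made-up pH values standing in for the team's samples and the 101 published records:

```python
# Illustrative two-sample t-test comparing on-reservation pH readings with
# previously published measurements; the numbers below are invented, not the team's data.
from scipy import stats

pala_ph = [7.9, 8.1, 8.0, 7.8]                  # hypothetical samples collected on the reservation
published_ph = [7.4, 7.6, 7.5, 7.7, 7.3, 7.8]   # hypothetical subset of the published values

t_stat, p_value = stats.ttest_ind(pala_ph, published_ph, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("pH levels differ significantly between the two sets of measurements.")
else:
    print("No significant difference detected at the 0.05 level.")
```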

While the primary goal of this activity was to teach students about the concept of data science and big data, the team also focused on ensuring that the youth realized how big data impacts many aspects of their daily lives. It was interesting to learn during the DataJam team meetings that two of the students had already been using informal data science collection methods to measure an array of daily activities in their lives (Figure 3). Future work will involve the two Pala Native American students completing an internship with the first author at the San Diego Supercomputer Center. The students plan to examine additional traits in the river both on their reservation and beyond–using their newfound data science analysis skills to understand their findings. 

Automated Support Request Categorization using Machine Learning
PRESENTER: Justin Petucci

ABSTRACT. The automatic categorization of user support requests/tickets for Pennsylvania State University’s high performance computing system is carried out using decision tree and artificial neural network models. We explore estimated model prediction performance across different text embedding techniques (TF-IDF and BERT) and prediction models (Gradient Boosted Decision Trees and Multi-Layer Perceptron Neural Networks) for this multiclass problem. The dataset comprises 6213 support tickets categorized using a broad (14 classes) and a specific (17 classes) set of labels. The results indicate that the optimal prediction accuracies for the ’broad’ and ’specific’ categories are 94.6% and 83.2%, respectively.
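As a concrete illustration of one embedding/model pairing from the comparison above (TF-IDF features with a gradient-boosted tree classifier), the sketch below uses scikit-learn with placeholder tickets and labels; it is not the authors' pipeline or dataset:

```python
# Sketch of TF-IDF text features fed to a gradient-boosted tree classifier
# for multiclass ticket categorization. Tickets and labels are placeholders.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier

tickets = [
    "My job has been stuck in the queue on the GPU partition for hours",
    "Slurm job exits immediately with an out-of-memory error",
    "I cannot log in to the cluster with my new account",
    "Password reset link never arrives for my account",
    "I need a larger storage quota on the scratch file system",
    "How do I restore files deleted from my home directory?",
    "MPI application crashes with a segmentation fault after the module update",
    "Which compiler flags should I use to build this library?",
]
labels = ["scheduler", "scheduler", "accounts", "accounts",
          "storage", "storage", "software", "software"]   # 'broad'-style categories

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),        # unigram + bigram TF-IDF features
    ("gbdt", GradientBoostingClassifier(n_estimators=200)),
])
model.fit(tickets, labels)
print(model.predict(["How do I install a Python package into my conda environment?"]))
```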

Migrating a Pipeline for the C-MĀIKI gateway from Tapis v2 to Tapis v3
PRESENTER: Yick Ching Wong

ABSTRACT. The C-MĀIKI gateway is a science gateway that leverages a computational workload management API called Tapis to support modern, interoperable, and scalable microbiome data analysis. This project focuses on migrating the existing C-MĀIKI gateway from Tapis v2 to Tapis v3 so that it stays current. This requires three major steps: 1) containerizing each existing microbiome workflow, 2) creating a new app definition for each workflow, and 3) enabling job submission to a SLURM scheduler from inside a Singularity container.

BioContainers on Purdue Clusters
PRESENTER: Yucheng Zhang

ABSTRACT. Container technologies such as Docker, Kubernetes, and SingularityCE have been receiving an increasing level of attention in academic institutions. Containers wrap up an application into an isolated file system containing everything it needs to run, such as compilers, libraries, and dependencies. This enables containers to always run the same way regardless of the environment in which they are running, making container technology a critical tool for reproducible research. In the high-performance computing (HPC) context, containers have gained popularity because they can significantly reduce the administrators’ work of deploying applications. On Purdue University HPC clusters, several hundred SingularityCE containers have been deployed. Here, we introduce how SingularityCE containers are used to create the bioinformatics tool collection (biocontainers). Due to their ease of deployment and portability, biocontainers have been deployed on Purdue’s six HPC clusters as well as on XSEDE Anvil, providing a reliable and reproducible computing environment for life science researchers.

Investigating Bias in Resource Allocation for Homelessness Prevention and Intervention
PRESENTER: Abigail Santiago

ABSTRACT. This project seeks to identify potential areas of bias in machine learning models that allocate resources for homelessness prevention and intervention by exploring the concept of fairness and searching for signs of underspecification. Underspecification refers to the phenomenon in which a model can use many different predictors to yield the same outcome, due to the structure chosen for the model or to selection bias within the data; it is an issue because it leads to unpredictable behavior during deployment. To explore fairness, a Gradient Boosted Tree classification model was trained on the individual-level data, and its F1 scores across subgroups of race and gender were visualized to evaluate the fairness of the model. The subgroups with the highest F1 scores were white men and Asian American Pacific Islander men. To look for signs of underspecification, experiments based on the concept of Rashomon sets were conducted under several different stress tests in which a random percentage of the data used for prediction was dropped. A Rashomon set can be understood as a set of equally performing models. Two Rashomon sets were created: one containing 50 LinearSVC models and another containing 50 Random Forest classifier models. The F1 scores of these models were then compared. No difference was found between the performance of the models in the Rashomon sets under any stress test. The experiments on fairness suggest bias in dataset collection. Future work includes investigating the identified avenues of potential bias, as well as performing more varied stress tests on the Rashomon set of good models.
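The fairness check described above amounts to computing F1 scores separately for each race/gender subgroup of the test set. A minimal sketch of that evaluation, using scikit-learn on entirely synthetic data with hypothetical column names (not the project's dataset), might look like this:

```python
# Sketch of per-subgroup F1 evaluation for a gradient-boosted classifier.
# Columns, labels, and data are synthetic placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
    "race": rng.choice(["white", "black", "aapi", "other"], size=n),
    "gender": rng.choice(["man", "woman"], size=n),
    "received_service": rng.integers(0, 2, size=n),   # binary outcome to predict
})

X = pd.get_dummies(df[["feature_a", "feature_b", "race", "gender"]])
y = df["received_service"]
X_train, X_test, y_train, y_test, df_train, df_test = train_test_split(
    X, y, df, test_size=0.3, random_state=0)

clf = GradientBoostingClassifier().fit(X_train, y_train)
pred = clf.predict(X_test)

# Compare F1 within each race/gender subgroup to look for disparities.
for (race, gender), grp in df_test.groupby(["race", "gender"]):
    mask = df_test.index.isin(grp.index)
    print(race, gender, round(f1_score(y_test[mask], pred[mask]), 3))
```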

Extending Tapis Workflow Management Framework with Elastic Google Cloud Distributed System using CloudyCluster by Omnibond
PRESENTER: Eric Lam

ABSTRACT. The goal of a robust cyberinfrastructure (CI) ecosystem is to catalyze discovery and innovation. Tapis does this by offering a sustainable, production-quality set of API services that let researchers accomplish computational and data-intensive research in a secure, scalable, and reproducible way and focus on their research instead of the technology needed to accomplish it.

This project aims to enable the integration of Google Cloud Platform (GCP) and CloudyCluster resources into Tapis-supported science gateways to provide the on-demand scaling needed by computational workflows. The new functionality allows researchers to augment existing local and national computing resources with cloud resources.

Simplifying Scientific Application Access in Kubernetes with Push Button Deployments
PRESENTER: Taylor Johnston

ABSTRACT. The Geddes Composable Platform is a Kubernetes-based private cloud resource at Purdue University. To streamline adoption of the platform and lower the barrier to entry, we created push button deployments for some of the popular applications used by Purdue researchers and made them available via Geddes’ Rancher web-based user interface using Helm and Rancher Charts. With little knowledge of the underlying system, a new user can use a web form to deploy custom applications, including JupyterHub instances, Alphafold, CryoSPARC and the Triton Inference Server.

Azure-based Hybrid Cloud Extension to Campus Clusters
PRESENTER: Samuel Weekly

ABSTRACT. We provide an overview of recent successes integrating and using Microsoft Azure public cloud resources for scientific computing at Purdue University, including benchmarking efforts for new processor architectures and a hybrid cloud extension to on-campus computing resources. The architecture of the hybrid cloud extension is described, which allows users to seamlessly burst workloads from Purdue community clusters to the Azure cloud. We also cover two scientific computing use cases demonstrating bursting capabilities using 3rd Gen. AMD EPYC "Milan" based HBv3 Azure instances.

HPC Data Analysis Pipeline for Neuronal Cluster Detection
PRESENTER: Esen Tuna

ABSTRACT. Obtaining neural clusters from data sets collected over different developmental stages poses a computational challenge that is complicated by the number of data sets, clustering methods, and hyperparameters. We used the MATLAB Parallel Computing Toolbox to parallelize the execution of the hyperparameter sweeps and developed a workflow for parallelizing the data processing. We present a run-time performance comparison of the workflow for two clustering methods on the Stampede2 supercomputer. Our study explored the performance of MATLAB implementations of the K-means and Louvain algorithms for cluster detection, using covariance and cosine similarity matrices, and investigated hyperparameter settings for each algorithm.
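Since the poster's implementation is MATLAB-based, the sketch below is only a Python analogue of the parallel hyperparameter sweep structure (joblib workers, scikit-learn K-means, silhouette scoring on placeholder data); the actual parameter ranges, similarity matrices, and scoring used in the study may differ:

```python
# Analogous Python sketch of a parallel K-means hyperparameter sweep.
# Data, parameter grid, and scoring are illustrative stand-ins for the MATLAB workflow.
import numpy as np
from joblib import Parallel, delayed
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

data = np.random.default_rng(0).normal(size=(500, 20))   # placeholder neural feature matrix

def run_kmeans(X, n_clusters, seed):
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(X)
    return n_clusters, seed, silhouette_score(X, labels)

# Sweep cluster counts and random restarts in parallel, one task per combination.
grid = [(k, seed) for k in range(2, 11) for seed in range(5)]
results = Parallel(n_jobs=-1)(delayed(run_kmeans)(data, k, s) for k, s in grid)

best = max(results, key=lambda r: r[2])
print(f"best: k={best[0]} (seed {best[1]}), silhouette={best[2]:.3f}")
```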

Broadening Student Participation in Cyberinfrastructure Research and Development

ABSTRACT. This poster presents preliminary observations from the pilot year of a Student Internship Program (SIP) that was created to broaden student participation in cyberinfrastructure research and development. SIP is part of the CI Compass project, which is the National Science Foundation (NSF) Cyberinfrastructure Center of Excellence, created to provide support for and enhance the data lifecycle of NSF Major Facilities (MFs) [1]. MFs are the largest-scale scientific efforts that the NSF supports; they are highly diverse, have heterogeneous data, and employ a wide range of cyberinfrastructure for capturing, processing, archiving, and disseminating data, as well as for providing access to sophisticated instruments and computational capabilities. MFs span many science domains, including astronomy, climate, ecology, natural hazards, ocean science, physics, and seismology [2]. Due to the complexity of the cyberinfrastructure and data that support MFs, it is critical to create educational opportunities for students interested in pursuing a career in the specialized cyberinfrastructure that supports large-scale science. The program aims to provide students the opportunity to learn about cyberinfrastructure development and MFs, develop cyberinfrastructure-related skill sets important to the work of MFs, and engage directly with MF CI professionals.

SIP fills a specific gap in current internships and educational programs by providing students opportunities to engage with and understand the underlying cyberinfrastructure that supports MFs, as most MF internships focus on a specific scientific domain. The program offers a Spring technical and research program that prepares students for the Summer projects program; SIP brings these together and allows students to contextualize technical skills, research skills, and how these would be applied in the real world at MFs or other large-scale cyberinfrastructure projects. By providing students with an understanding of the challenges of the data lifecycle of MFs, SIP provides a stepping stone toward a pipeline of potential future MF and CI professionals.

During the 2021-2022 academic year, CI Compass piloted SIP to create the necessary program protocols, including the recruitment plan, training materials, project descriptions, and additional logistical items, and to pilot the structure of the program. The internship spanned 12 weeks in the Spring semester and six weeks in the Summer. Students participating in the program were given course credit or audited the course during the Spring semester and were paid a stipend for the Summer. The first year of SIP was geared toward undergraduate students pursuing studies in computer science, information science, data science, applied mathematics and statistics, embedded systems, communications, and social sciences related to cyberinfrastructure.

We advertised the SIP program at the University of Southern California (USC), Indiana University (IU), and the University of Notre Dame (UND) in November of 2021 to undergraduate students pursuing studies in computer science, information science, data science, and related fields. Seventeen students applied from USC and two from UND. Additionally, four students from IU expressed interest in applying but were unable to do so because the deadline had passed, and eight students from USC expressed interest but did not apply. After interviews with the applicants, we accepted six students, five from USC and one from UND.

During Spring 2022, we commenced the inaugural student internship program. SIP consists of two programs during the Spring semester: 1) the technical skills program and 2) the research skills program. The technical skills program gives students experience with the technical skills relevant to MF cyberinfrastructure, such as Python, best practices in software development (Git, pytest, compression), containers, Docker, cloud, and parallel and distributed computing. The research skills program provides a contextual understanding of MFs and related cyberinfrastructure by having students research MFs and the data lifecycle, which helps them understand the importance and context of MFs and the related data and cyberinfrastructure. At the end of the Spring semester, each student presented their research at the SIP Symposium, where the audience included both CI Compass members and MF professionals. During Summer 2022, students have the option of participating in a project-based learning experience where they will gain additional technical skills.

We are expanding the SIP program next year. From observations and evaluation data, we will make updates to the current curriculum and program structure. Additionally, we are expanding the institutions involved in the program and will broadly advertise mentoring and student opportunities to include participants from MSIs, HSIs, and Tribal Colleges to create long-term partnerships with faculty and students at these institutions and to increase diversity and inclusion in the student internship program.

ACKNOWLEDGMENTS This project was supported by the National Science Foundation Grant no: 2127548 and Grant no: 1842042.

REFERENCES [1] Laura Christopherson, Anirban Mandal, Erik Scott, and Ilya Baldin. 2020. Toward a Data Lifecycle Model for NSF Large Facilities. In Proceedings of the Practice and Experience in Advanced Research Computing (PEARC’20). ACM, Portland, OR, 210–217. https://doi.org/10.1145/3311790.3396636 [2] National Science Foundation. Major Facilities Guide (NSF 21-107). https://www.nsf.gov/bfa/lfo/lfo_documents.jsp, last accessed 2021/10/08.

Data Management Workflows in Interdisciplinary Highly Collaborative Research
PRESENTER: Esen Tuna

ABSTRACT. Data curation is an important aspect of research projects. Effective data management is critical for data curation; it not only contributes to the success of projects but also makes research outputs findable, accessible, interoperable, and reusable. We have examined interdisciplinary highly collaborative research (IHCR) practices in selected projects to propose data management workflows. This synopsis of work in progress discusses one of these workflows, which helps locate information when there are multiple collaborators and the digital assets are spread across multiple storage systems and institutions.

Building the RNAMake Gateway on PATh: a Student-Led Design Project
PRESENTER: Dinuka De Silva

ABSTRACT. We summarize student-led work to build a science gateway for RNAMake, which is software for modeling the three-dimensional structure of RNA molecules. The gateway uses Apache Airavata, which has been extended to support HTCondor submissions. The students also extended the Airavata Django Portal to provide customized user interfaces. In the process, the students learned open source software and open governance practices.

Comparative Evaluation of Hate Speech Models and Racial Bias
PRESENTER: Jennie Youn

ABSTRACT. The growing prevalence of online hate speech is concerning, given the massive growth of online platforms. Hate speech is defined as language that attacks, humiliates, or incites violence against specific groups. According to research, there is a link between online hate speech and real-world crimes, as well as victims' deteriorating mental health. To combat the online prevalence of abusive speech, hate speech detection models based on machine learning and natural language processing are being developed to automatically detect the toxicity of online content. However, current models tend to mislabel African American English (AAE) text as hate speech at a significantly higher rate than texts written in Standard American English (SAE). To confirm the existence of systematic racism within these models, I evaluate a logistic regression model and a BERT model. Then, I determine the efficacy of a bias reduction method for the BERT model and the correlation between model performance and reduced bias.
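The disparity described above is commonly quantified as the false positive rate on non-hateful text, computed separately for AAE and SAE samples. A minimal sketch of that measurement, with placeholder predictions and dialect labels rather than the poster's models or data, follows:

```python
# Sketch of measuring the per-dialect false positive rate disparity described above.
# Labels, predictions, and dialect tags are placeholders, not the poster's data.
import numpy as np

def false_positive_rate(y_true, y_pred):
    """Fraction of non-hateful texts incorrectly flagged as hate speech."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    negatives = y_true == 0
    return (y_pred[negatives] == 1).mean()

# 0 = not hate speech, 1 = hate speech (hypothetical classifier output)
y_true  = np.array([0, 0, 0, 0, 1, 0, 0, 0, 1, 0])
y_pred  = np.array([1, 0, 1, 0, 1, 0, 0, 1, 1, 0])
dialect = np.array(["AAE", "AAE", "AAE", "SAE", "SAE",
                    "SAE", "AAE", "AAE", "AAE", "SAE"])

for d in ("AAE", "SAE"):
    mask = dialect == d
    print(d, "false positive rate:",
          round(false_positive_rate(y_true[mask], y_pred[mask]), 2))
```

A gap between the two printed rates is the kind of dialect-dependent error the evaluation above sets out to confirm.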

18:30-21:00 Session 15B: PEARC Interact!

Formerly Visualization Showcase

18:30
Equine Sinus Anatomy Collectome
PRESENTER: Scott Birch

ABSTRACT. A Collectome is a collection of related web-based content formatted for large-size video displays or "walls" and shared/delivered through a web browser URL. This Collectome features a collection of equine paranasal sinus anatomy presented in the form of interactive 2D and 3D anatomical studies segmented from cadaver CT contrast studies acquired by the co-author, Dr. James Brown, BVSc, MS, Dipl. ACT, ACVS. The objective of the Equine Paranasal Sinus Anatomy Collectome is to help equine clinicians gain knowledge and understanding of the sinus anatomy and apply that knowledge to imaging interpretation by identifying sinonasal and dental pathology, our hypothesis being that enhanced interpretation of radiographs and CT data sets can be achieved with the use of 3D models and other teaching modules developed from volumetric scans.

Three modalities are featured in the Collectome: a) interactive annotated 3D models hosted on Sketchfab.com, b) an A/B image "slideover" that shows 3D data overlaid on traditional radiographs, and c) an interactive annotated 2D volume slice viewer showing the main paranasal sinuses and related anatomy of the sinonasal region of the horse head. A PDF file containing the didactic information and supplementary imagery is embedded in the Collectome, along with a project guide/map; it includes an explanation of the project, imagery pertaining to the dental features and sinus drainage channels, and gross pathology photographs taken during dissection of cadavers.

19:00
Ohio Supercomputer Center Virtual Tour
PRESENTER: Chase Eyster

ABSTRACT. The Ohio Supercomputer Center (OSC) offers tours of its data center as a way to engage with the community while sharing information on our computer clusters and demonstrating what makes a computer “super.” Tour sessions feature a history of OSC hardware from 1987 to the present; an inside look at compute nodes and storage systems; information on the types of research that leverage HPC; and a special OSC keepsake to remember the experience. Starting in March 2020, the COVID-19 pandemic made these public tours impossible and eliminated the opportunity for interested parties to see a working supercomputer facility; virtual tours were the logical solution.

19:30
CRAVRE: Cyber Resilience Adaptive Virtual Reality Experiences
PRESENTER: Jake White

ABSTRACT. N/A (PEARC Interact poster submission)

20:00
Scenario Visualization With the ‘Ike Wai Hawai‘i Groundwater Recharge Tool
PRESENTER: Jared McLean

ABSTRACT. The primary purpose of this application is to visualize the effects of land-cover alterations on groundwater-recharge rates for the island of O‘ahu, Hawai‘i. Users can define and modify the land cover of an area of the island. Based on user-defined land-cover types and available rainfall data or projections, the application will display various metrics respective to the estimated impacts on the island’s groundwater-recharge rates. This application simplifies the process of evaluating recharge changes linked to land-cover and climate changes and provides rapid results and metrics that are useful in research and water-management decision making.

20:30
Climatological Data Visualization With the Hawai‘i Climate Data Portal
PRESENTER: Jared McLean

ABSTRACT. The Hawai‘i Climate Data Portal (HCDP) provides interactive visualization of, and the ability to export, climatological data collected in Hawai‘i. The visualization component of the application displays the location of and information about the sensor stations that collect data, as well as high-resolution derived gridded data maps approximating values for the entire state. The portal hosts both historical and near-real-time data updated via an automated pipeline at daily or monthly intervals. The data portal is embedded in a WordPress site containing additional cultural and scientific resources for researchers working in Hawai‘i.

21:00
Visualizing the Vulnerability Landscape of Major Scientific Cyberinfrastructure GitHub Ecosystems
PRESENTER: Dalya Manatova

ABSTRACT. Please see submitted poster.

21:30
Interactive 3D Animation of LOFAR Lightning Observations
PRESENTER: David Reagan

ABSTRACT. We developed a workflow to share cutting-edge lightning observations on an interactive 3D web platform.

22:00
Designing a Vulnerability Management Dashboard to Enhance Security Analysts' Decision Making Processes

ABSTRACT. Network vulnerability management reduces threats posed by weaknesses in software, hardware, or organizational practices. As networks and related threats grow in size and complexity, security analysts face the challenges of analyzing large amounts of data and prioritizing and communicating threats quickly and efficiently. In this paper, we report our work-in-progress of developing a vulnerability management dashboard that helps analysts overcome these challenges. The approach uses interviews to identify a typical security analyst workflow and proceeds with an iterative design that relies on real-world data. The vulnerability dashboard development was based on a common security analyst workflow and includes functions to allow vulnerability prioritization according to their age, persistence, and impact on the system. Future work will look to execute full-scale user studies to evaluate the dashboard's functionality and decision-making utility.

22:30
Coloring with Nanoparticles
PRESENTER: David Reagan

ABSTRACT. Make fun designs using the chemistry of stained glass in a responsive web application.

23:00
Extreme Resolution Image Analysis with Serial Block-face Scanning Electron Microscopy

ABSTRACT. The Indiana University (IU) Advanced Visualization Lab is working with clients in neuroscience to develop image processing and analysis workflows that will handle stacks of hundreds (~400) of extreme-resolution image mosaics, each >350 megapixels. To accomplish this task, we have made efficient interactive use of HPC resources at IU, mainly through IU's Research Desktop (RED) environment. RED is a virtual desktop platform based on ThinLinc by Cendio AB, giving users access to a Linux-based desktop coupled to high performance storage and large-memory, many-core compute nodes. Almost all work is completed in the IU HPC environment, using the open source Fiji package and its TrakEM2 plugin. This poster includes an interactive panorama located at https://slavin.pages.iu.edu/pearc22/.

18:30-21:00 Session 15C: Posters PEARC'20 and PEARC'21
18:30
Refactoring a statistical package for demanding memory loads: Adapting R for high performance telemetry data analytics

ABSTRACT. Winner of Best Poster for PEARC'20.

The “overlap” package provides distinct resampling and multithreaded routines for the R statistical environment that model animal activity patterns captured from camera trap data, calculate overlaps of those models, and assess statistical significance of overlaps via bootstrapping. The endeavor to leverage the package for a different input data source, namely GPS data from collared wild pigs (an input data set several orders of magnitude larger than those typically interrogated by the package), incurs a prohibitively large memory footprint for the typical desktop machine. Initial efforts revealed that the necessary memory allocations for the dataset output from the resampling exceeded the memory available on a high-memory node, making code refactoring necessary. Subsequent in-depth analysis of the package revealed that by combining and multithreading the bootstrap sampling and overlap calculations, the memory footprint could be reduced to a manageable size; however, computation time increased substantially. Further development, decomposing the resampling step into two routines, minimized the impact of a rate-limiting step, restoring shorter runtimes while maintaining the smaller memory footprint.
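The refactoring described above hinges on never materializing all bootstrap resamples at once: each resample is generated, reduced to the overlap statistic, and discarded. The Python sketch below illustrates that pattern with a placeholder overlap measure and synthetic telemetry; it is a stand-in for, not a copy of, the refactored R routines:

```python
# Memory-saving bootstrap pattern: generate each resample, reduce it to a
# scalar statistic, and discard it, so only the statistics are kept in memory.
# Python stand-in for the refactored R workflow; the overlap measure is a placeholder.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

rng = np.random.default_rng(0)
activity_a = rng.uniform(0, 24, 500_000)   # placeholder GPS-derived activity times (hours)
activity_b = rng.uniform(0, 24, 500_000)

def overlap_stat(a, b, bins=48):
    """Placeholder overlap measure: shared area of two activity histograms."""
    ha, _ = np.histogram(a, bins=bins, range=(0, 24), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0, 24), density=True)
    return np.minimum(ha, hb).sum() * (24 / bins)

def one_bootstrap(seed):
    r = np.random.default_rng(seed)
    resample_a = r.choice(activity_a, size=activity_a.size, replace=True)
    resample_b = r.choice(activity_b, size=activity_b.size, replace=True)
    return overlap_stat(resample_a, resample_b)   # only the scalar survives

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:           # multithread/multiprocess the bootstrap
        boots = list(pool.map(one_bootstrap, range(200)))
    lo, hi = np.percentile(boots, [2.5, 97.5])
    print(f"bootstrap 95% CI for overlap: [{lo:.3f}, {hi:.3f}]")
```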

19:00
SGCI Incubator and its Role in Workforce Development: Lessons Learned from Training, Consultancy, and Building a Community of Community-Builders for Science Gateways

ABSTRACT. Workforce development is an important topic in distributed computing. While traditional curricula in engineering and computer science focus primarily on disciplinary technical expertise, facilitating research cyberinfrastructures requires a diverse set of non-discipline-specific skills including usability, business planning, and community building. Science gateways, digital platforms that facilitate the use of complex research and computing resources, are an increasingly popular form of cyberinfrastructure. The Science Gateways Community Institute (SGCI) was funded in 2016 by NSF to support the creation, use, and maintenance of effective, efficient, and reliable science gateways. SGCI’s Incubator provides training, short-term consulting, and community-building measures that support workforce development and professionalization of computational solutions. We present some strategies and lessons learned from four years of science gateways engagement relevant to workforce development.

19:30
Comparing GPU effectiveness for Unifrac distance compute

ABSTRACT. Microbiome studies have recently transitioned from experimental designs with a few hundred samples to designs spanning tens of thousands of samples. After collecting and sequencing samples, one of the first questions researchers ask is how similar those samples are to each other, with the Unifrac distance being a popular metric. The recent Hybrid Unifrac implementation can make use of consumer-grade GPUs, and in this poster we present the results of benchmarking on the various GPU and CPU models available in the Pacific Research Platform. We show that consumer-grade NVIDIA GPUs provide a very good platform for this kind of compute.

20:00
Using Microsoft Azure for XRootD Network Benchmarking

ABSTRACT. The computing infrastructure is always evolving, so researchers need to evaluate products on infrastructure that is considerably more advanced than the mainstream production one. Today's production services rarely have border links exceeding 100 Gbps, and even getting access to nodes with more than 10 Gbps NICs is not trivial. Testing services over wide area networks at 20 Gbps or higher thus becomes non-trivial. While there are research testbeds that specialize in networking, it is non-trivial to get exclusive use of dedicated nodes. We thus explored the feasibility of using commercial Cloud resources, which are much easier to get access to. The tested Cloud provider was Microsoft Azure, and we limited our setup to single-node to single-node tests using XRootD HTTP-TPC protocol. We show that Azure instances can deliver 30 Gbps within a single region and 15 Gbps between regions with as much as 122 ms latency, which exceeds what we can obtain using resources available on-prem.