SMC-IT/SCC 2024: IEEE SPACE MISSION CHALLENGES FOR INFORMATION TECHNOLOGY / SPACE COMPUTING CONFERENCE
PROGRAM FOR MONDAY, JULY 15TH

07:30-08:15 Registration and Continental Breakfast
08:30-10:00 Session 2: Plenary Keynotes
Location: Hahn Auditorium
08:30
Keynote: Dr. Eugene Tu
09:15
Keynote: Dr. Elizabeth Turtle
10:00-10:30 Coffee Break
10:30-12:00 Session 3A: SCC: High Performance Computing Architectures
Location: Learning Lab
10:30
COSMOS - Computational Optimization and Scheduling for Multi-tenant Orbital Services

ABSTRACT. Major companies such as SpaceX are offering opportunities for scientists, researchers, and small businesses to send custom electronics hardware prototypes into orbit. This initiative enables them to prototype and conduct experiments in the unique environment of space. Nevertheless, the associated costs, both direct and indirect, often pose affordability challenges for smaller entities and researchers. In this presentation, we unveil a cutting-edge computing platform designed to offer small companies and research institutions rapid, dependable, and economical access to space. Our platform caters both to customers equipped with their own computing capabilities who seek to expedite and distribute computations through resource rental, and to those who prefer to concentrate on their core research without the hassle of designing custom boards, accelerators, and software. The platform guarantees that all customers have access to top-tier, state-of-the-art computational nodes, including GPUs, multicore CPUs, FPGAs, and more. These resources can be rented on demand and tailored to the specific computational and performance needs of each customer.

11:00
Open Source Architecture for Coherent, Distributed Computing

ABSTRACT. Distributed computing provides two major advantages for space travel and exploration. First, it provides performance that can scale up or down based on computing needs during a mission, and thereby a potential for power reduction. Second, it offers fault tolerance, enabling a space system to continue operating when its computing or storage devices encounter erroneous conditions caused by radiation and other hazards. A coherence-capable distributed computing architecture further enables (1) the use of computing nodes across a network (such as Ethernet), whether on-board or remote, and (2) the use of different processing system types (heterogeneous computing) in a distributed computing cluster. The architecture discussed in this paper also enables the use of remote networked memory/storage systems in a distributed computing environment. LeWiz Communications and Western Digital previously open-sourced an architecture that can support coherent distributed computing. This paper presents its architecture, potential applications, and advantages.

11:30
Application of AMD Versal Adaptive SoC to Radar Space Time Adaptive Processing in Space

ABSTRACT. Space Time Adaptive Processing (STAP) radar systems require rapid execution of complex matrix multiplications to calculate and apply covariances and adaptive filtering weights, to achieve filtering of ground clutter and to mitigate jamming. Developers of these systems require capabilities to rapidly simulate and prototype their designs. In this paper we discuss the signal processing requirements of a modern STAP radar system, and describe the implementation of a STAP design in multiple phases, with a hardware-in-the-loop simulation using MATLAB®/Simulink® tools from MathWorks® and with a full implementation in the AMD Versal Adaptive SoC, which is available as a radiation-tolerant space-grade device. Our implementation phases make use of the Adaptive Intelligent Engines in the Versal architecture to achieve rapid execution of matrix multiplication with complex values and to allow rapid modification of algorithms, without incurring additional development time due to repeated place-and-route and static timing analysis cycles.
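
A minimal illustrative sketch of the kind of complex-valued linear algebra at the core of STAP: a sample-matrix-inversion weight computation in NumPy. The dimensions, steering vector, and diagonal loading factor below are assumptions for illustration, not values or code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_chan, n_pulse, n_train = 8, 16, 256     # channels, pulses, training snapshots (assumed)
dof = n_chan * n_pulse                    # space-time degrees of freedom

# Placeholder complex training snapshots standing in for clutter-plus-noise data
X = rng.standard_normal((dof, n_train)) + 1j * rng.standard_normal((dof, n_train))

R_hat = (X @ X.conj().T) / n_train                        # sample covariance estimate
R_hat += 1e-3 * np.trace(R_hat).real / dof * np.eye(dof)  # diagonal loading for stability

v = np.ones(dof, dtype=complex)    # space-time steering vector (placeholder)
w = np.linalg.solve(R_hat, v)      # adaptive weights: solve R_hat w = v
w /= v.conj() @ w                  # normalize so that w^H v = 1 (MVDR-style)

y = w.conj() @ X[:, 0]             # apply the adaptive filter to one snapshot
```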

10:30-12:00 Session 3B: Workshop: STINT
Location: Lovelace
10:30
Some Challenges of Implementing Delay Tolerant Networking at Mars
11:00
Insights into HDTN’s High Performance LTP Implementation
11:30
Building DTN networking for LunaNet Service Providers
10:30-12:00 Session 3C: Workshop: BPFTS
Location: Hahn Auditorium
10:30
Workshop Introduction (1st Session)
10:50
University of Pittsburgh: New Approaches Toward Advanced Data Handling for Space Products
11:25
UFRGS: New Approaches Toward Advanced Data Handling for Space Products
10:30-12:00 Session 3D: SMC-IT: Mission Design
Location: Turing
10:30
Maintaining Mars Rover Operations Software On A Budget

ABSTRACT. This paper characterizes and discusses the software scripts and tools used by the Mars Science Laboratory rover planner team, and describes the effort to unify the management of this tool suite. Rover Planners have historically used numerous command-line tools to aid them in designing and implementing activities over the course of a tactical planning shift. The bulk of these tools were developed ad hoc by various individuals, with little oversight or organization. As a result, the scripts were poorly documented, difficult to locate, and mostly lacked any formal testing. As part of the modernization process, we identified all of the existing scripts and their source code locations, moved them to a centralized location within the operations venue, and wrote unit tests and integration tests for every tool. Many tools were upgraded from Python 2 to Python 3, or converted from other languages to Python 3 where feasible. We implemented a system for reviewing changes and continuous testing by moving all scripts to a single git repository, where we track and actively maintain them. Pull requests are tested automatically using Jenkins, and the entire suite of scripts and library functions is tested upon every deployment of the suite. We manage feature requests and bug fixes via GitHub issues, and a working group meets biweekly to discuss changes and progress on efforts relating to software-based tools for the MSL Rover Planners. In this paper, we detail the design and implementation of a unified system for managing these command-line tools, and the innovation and utility of the tools themselves and how they improve the tactical planning process for MSL Rover Planners. We present this framework as an example for other mission operations teams to use to manage standalone command-line scripts that make use of common tools and services.
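
As a hedged illustration of the testing approach described above, the following pytest-style integration test invokes a standalone command-line script and checks its exit status; the script name, path, and flag are hypothetical and not tools named in the paper.

```python
import subprocess
import sys

def test_drive_summary_cli_runs_cleanly():
    """Integration-style check that a standalone CLI tool exits successfully."""
    # "tools/drive_summary.py" and "--help" are invented placeholders.
    result = subprocess.run(
        [sys.executable, "tools/drive_summary.py", "--help"],
        capture_output=True, text=True, timeout=30,
    )
    assert result.returncode == 0
    assert "usage" in result.stdout.lower()
```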

11:00
Minimum Requirements for Space System Cybersecurity - Ensuring Cyber Access to Space

ABSTRACT. Space systems are continuously under cyber attack. Minimum cybersecurity design requirements are necessary to preserve our access to space. This paper proposes a scalable, extensible method for developing minimum cyber design principles, and subsequent requirements, for a space system based on any given mission priority. To test our methodology, we selected the fundamental mission priority of preserving access to space by preventing the permanent loss of control of a satellite. We then derive the minimum set of secure-by-design principles needed to prevent the permanent loss of control of a satellite and translate these into example minimum-requirement `shall' statements. Our proposed minimum-requirements methodology and example can serve as a starting point for policymakers aiming to establish security requirements for the sector. Further, our methodology for establishing minimum requirements will be used to prioritize the efforts of the emergent IEEE International Technical Standard for Space Cybersecurity (Working Group P3349).

11:30
Information Systems for Crew-Led Operations Beyond Low-Earth Orbit

ABSTRACT. On past and present human space missions, the management of vehicle health and status has primarily been executed from Earth. Missions such as Apollo, Space Shuttle, and ISS have relied on a safety net of ground-based experts with access to real-time telemetry data, broad and deep systems expertise, and powerful analytical and computing capabilities. The ground team monitors and manages the vehicle’s health in real-time and responds quickly to critical situations and malfunctions. Ground operators also provide real-time oversight and verbal guidance to flight crewmembers, especially during complex procedure execution and high-risk activities like extra-vehicular activities.

However, this operational paradigm, in place for 60 years, will not transfer to long duration exploration missions beyond low Earth orbit (LEO). Lunar and deep-space crewed missions will encounter delayed communications that prohibit real-time operational and medical support. Additionally, there will be infrequent resupply and a diminished capacity to evacuate or rescue crewmembers. A small crew must operate independently, managing the vehicle’s state, responding to time-critical events, and executing complex procedures, all without the safety net of real-time support.

A key challenge for a small crew lies in the vast amount of data they must process to support procedure execution and anomaly response. In today’s ISS mission control center, 15-20 flight controllers (working 24 hrs/day in three shifts) continuously monitor real-time data for their respective subsystems, supplemented by Back Room and Mission Evaluation Room (MER) engineers. They work to detect failures, assess impacts, troubleshoot, identify workarounds, and oversee procedure execution—all of which require access to and understanding of extensive engineering and procedure information, as well as system build, test, and configuration documentation.

As problem solving transitions onboard for missions beyond LEO, the crew needs more than mere access to this wealth of information. It must be compiled, refined, and presented appropriately to support a small crew with far less time and expertise. This challenge is aggravated by the relatively underpowered computing capabilities available to crews in space, which are designed to endure radiation and other environmental hazards. Consequently, the capability of these systems may lag behind their terrestrial counterparts by years or even decades. Additionally, crews in space contend with significantly less display real estate than ground operators. While ground operators can utilize multiple large displays, crews must manage with resources more akin to a single laptop display.

Our team’s investigation into past anomalies on Apollo and ISS missions unveiled key characteristics that make unanticipated, time-critical anomalies so challenging to resolve, including imperfect sensor data, complex causal relationships, and limited intervention options. Beyond LEO, onboard systems need capabilities that will support the crew in creative and critical problem-solving to overcome some of those challenges. This paper extends our past work, identifying the core crew interface characteristics needed to support onboard time-constrained problem solving and decision making under conditions of delayed communications with the ground team. Drawing insights from analogous domains, including healthcare and nuclear power, we present preliminary recommendations for organizing and integrating information for effective problem solving. A case study of an actual ISS anomaly resolution will be used to envision the onboard information and decision support systems for Earth independent problem solving.

12:00-13:15 Lunch Break
13:15-15:15 Session 4A: SCC: High Performance Computing Architectures
Location: Learning Lab
13:15
Selecting Space Processors for High Order Wavefront Control Adaptive Optics Systems

ABSTRACT. The real time control of many-actuator adaptive optics systems will allow future space telescopes to suppress starlight and directly image and characterize exoplanets. In the future, a measurement by this technique may be the first to directly detect extraterrestrial life in the universe. However, the real-time execution of adaptive control algorithms will place unprecedented demands on spaceborne processors. Previous work has estimated the necessary level of computational system performance based on computational density analysis. In this work, we first evaluate the relevant algorithms in numerical detail, and decompose the top-level computational system into subsystems. We then perform requirements flow-down to these subsystems to evaluate the expected performance of a range of candidate processors. We additionally consider radiation degradation of the control processor within the context of a high contrast imaging mission. With this system decomposition and requirements flow-down, we survey relevant space processors for their expected performance on wavefront sensing and control algorithms. This analysis supports the need for further development of high performance radiation tolerant processors.
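
A back-of-the-envelope sketch of the sort of requirements flow-down described above: the sustained throughput implied by a single matrix-vector wavefront-reconstruction step. All numbers are assumptions for illustration, not figures from the paper.

```python
# Illustrative sizing only; actuator count, sensor size, and loop rate are assumed.
n_act = 2048      # deformable-mirror actuators (assumed)
n_meas = 4096     # wavefront-sensor measurements (assumed)
rate_hz = 1000    # control-loop rate in Hz (assumed)

flops_per_frame = 2 * n_act * n_meas             # one multiply and one add per matrix entry
sustained_gflops = flops_per_frame * rate_hz / 1e9
print(f"~{sustained_gflops:.1f} GFLOP/s sustained for the reconstruction step alone")
# ~16.8 GFLOP/s, before margins, other algorithm stages, or precision effects
```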

13:45
Characterizing Computational Resources of GNC Algorithms

ABSTRACT. The motion of vehicles is influenced by controllers designed in Guidance, Navigation, and Control (GNC). The classic tradeoff in GNC algorithms is between performance and actuation, but any quantifiable variable can be part of the penalties that influence the governance of the system. The purpose of this research is to characterize computational resources so that they can be used by GNC controllers in situations where limited computational hardware, such as a Raspberry Pi Model B+ Rev 1.2, is combined with computationally expensive algorithms, such as Model Predictive Control (MPC). In application, lack of funding or external factors can cause available computing resources to be limited. In space applications, processors must be radiation hardened to tolerate faults in the ionizing space environment in which they operate. This radiation hardening limits the available computational resources, to the point that everyday cell phones can offer orders of magnitude more computing capability. Several computational metrics, such as central processing unit (CPU) utilization, active power consumption, and physical memory usage, are of primary interest in this research. The research utilizes a test bench of different hardware (HW), where algorithms are loaded and executed “on-board” and the computational resources are measured during execution. These signals are then analyzed using the principles of system identification: measuring the signal, selecting a model structure, estimating the model's adjustable parameters, and evaluating the estimated model's predictive performance. The target predictive model for computational resources is a spring-mass-damper system. This predictive model will then be incorporated into control synthesis, and the resulting system dynamics adjusted based on penalties incurred by the computational resource status. The algorithms are compiled in C++. The MPC algorithm has tunable convergence parameters such as the maximum number of iterations and numerical tolerance. These values are adjusted to create different signal dynamics in the computational resource metrics for analysis. In one benchmark exercise, the executable is called in a loop with static values to repeatedly provide an “impulse” to the computational metrics. Active power consumption is measured with an ONSET HOBO Plug Load Logger UX120-018, and computational resources are collected through Linux terminal monitoring commands with output redirected to a file. Follow-on work will evaluate GNC algorithms with representative spacecraft values and create a control structure called Real-Time Recursive Optimization (R2O) that adjusts the GNC algorithms given the availability of computational resources. Alternative methods for collecting computational resource data from inside the user-issued process are underway.
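
A minimal sketch of one way to log such metrics during a benchmark run, using the cross-platform psutil library rather than the Linux terminal-monitor commands the authors describe; the executable name and output file are placeholders.

```python
import csv
import subprocess
import time

import psutil

# "./mpc_benchmark" is a placeholder for a compiled MPC executable under test.
proc = subprocess.Popen(["./mpc_benchmark"])
with open("resource_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["t_s", "cpu_percent", "mem_used_mb"])
    t0 = time.time()
    while proc.poll() is None:                  # sample until the benchmark exits
        cpu = psutil.cpu_percent(interval=0.1)  # blocking 100 ms sampling window
        mem = psutil.virtual_memory().used / 1e6
        writer.writerow([round(time.time() - t0, 2), cpu, round(mem, 1)])
```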

14:15
Performance Characterization of Gemini APU Processing-in-Memory Devices for Space

ABSTRACT. Autonomous space missions are dependent on their onboard computing capability to carry out their prescribed tasks. Ideal flight hardware systems must be capable of performing mission-critical tasks in real time while satisfying size, weight, power, and cost (SWaP-C) constraints of the mission. Additionally, space hardware must also be robust to the unique hazards of space environments. To expand the onboard computing capability for future space missions, new hardware platforms must be evaluated to determine their computational performance, power usage, and radiation sensitivity. One such hardware platform is the Gemini Associative Processing Unit (APU), which features a unique processing-in-memory (PIM) architecture to limit memory transfer operations. With this architecture, computations can be performed directly in memory, eliminating the need for excess data transfers which can negatively affect performance. This study conducts a performance comparison of Gemini APU devices with both modern desktop hardware as well as current generation flight hardware. Based on the results collected in this research, Gemini APU devices can provide much higher performance per watt than modern terrestrial CPUs, and offer uniquely scalable performance at a power profile similar to modern flight hardware.

14:45
Space System High Performance, Multi-Channel FPGA Framework for Data Capture, Transmission and Analysis

ABSTRACT. Large space systems such as space telescopes capture high-resolution images continuously. These are processed with on-board DSP, then packetized for continuous transmission over networks such as TCP/IP and Ethernet to remote processing nodes. Such transmissions tend to be bursty and can reach 100 Gbps, making software-based solutions infeasible. Space-capable FPGA devices are used to implement these functions in hardware. Space systems, however, are constrained by cost, available FPGA resources, and power. This paper presents (1) a scalable, multi-channel architecture for hardware implementation of such a data capture, transmission, and analysis chain, (2) an interface mechanism for transmitting data coherently so that a remote system can receive and analyze it using commercial tools, and (3) a framework for building and validating such a high-performance system using commercially available computing systems. Lastly, it discusses other potential applications of such an architecture in Earth-orbit systems requiring data capture and analysis.

13:15-15:15 Session 4B: Workshop: STINT
Location: Lovelace
13:15
The Architectural Refinement of µD3TN: Toward a Software-Defined DTN Protocol Stack

ABSTRACT. This paper provides a comprehensive overview of the µD3TN project's development, detailing its transformation into a flexible and modular software implementation of the Delay-/Disruption-Tolerant Networking (DTN) Bundle Protocol. Originating from µPCN, designed for microcontrollers, µD3TN has undergone significant architectural refinement to increase flexibility, compatibility, and performance across various DTN applications. Key developments include achieving platform independence, supporting multiple Bundle Protocol versions concurrently, introducing abstract Convergence Layer Adapter (CLA) interfaces, and developing the so-called Application Agent Protocol (AAP) for interaction with the application layer. Additional enhancements, informed by field tests, include Bundle-in-Bundle Encapsulation and an exploratory port to the Rust programming language, indicating the project's ongoing adaptation to practical needs. The paper also introduces the Generic Bundle Forwarding Interface and AAPv2, showcasing the latest innovations in the project. Moreover, it provides a comparison of µD3TN's architecture with the Interplanetary Overlay Network (ION) protocol stack, highlighting some general architectural principles at the foundation of DTN protocol implementations.
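
For readers unfamiliar with the CLA concept, the following is a conceptual sketch only (µD3TN itself is implemented in C, and this is not its API): an abstract convergence-layer interface that separates transport details from the bundle-processing core.

```python
from abc import ABC, abstractmethod

class ConvergenceLayerAdapter(ABC):
    """Abstract CLA: the core hands it serialized bundles; it handles the transport."""

    @abstractmethod
    def start(self) -> None:
        """Bring up the underlying transport (TCP listener, serial link, ...)."""

    @abstractmethod
    def send_bundle(self, next_hop: str, bundle_bytes: bytes) -> None:
        """Transmit one serialized bundle toward the given next-hop node ID."""

    @abstractmethod
    def on_bundle_received(self, bundle_bytes: bytes) -> None:
        """Callback into the core when the transport delivers a bundle."""
```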

13:45
Distributed Volume Management in Space DTNs: Scoping Schedule-Aware Bundle Routing

ABSTRACT. This paper addresses critical improvements in the Schedule-Aware Bundle Routing (SABR) standard, pivotal for distributed space missions based on Delay-Tolerant Networking (DTN). With a focus on volume management, defined as efficiently allocating and utilizing the data transmission capacity of network contacts, we explore enhancements for distributed and scheduled DTNs. Our analysis begins by identifying and scrutinizing existing gaps in volume management within the SABR framework. We then introduce a novel concept coined contact segmentation, which streamlines the management of the transmission volumes. Our approach spans all network contacts, initial and subsequent, by unifying previously separate methods such as Effective Volume Limit (EVL), Earliest Transmission Opportunity (ETO), and Queue-Delay (QD) into a single process. Lastly, we propose a refined generic interface for volume management in SABR, enhancing the system’s maintainability and flexibility. These advancements rectify current limitations in volume management and lay a foundation for more resilient and adaptable space DTN operations in the future.
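
As context for the volume-management discussion, here is a simplified sketch of the basic contact-volume bookkeeping that SABR-style routing builds on; it illustrates the concept only and is not the contact segmentation mechanism proposed in the paper.

```python
from dataclasses import dataclass

@dataclass
class Contact:
    start_s: float           # contact start time (s)
    end_s: float             # contact end time (s)
    rate_bps: float          # data rate during the contact (bits/s)
    booked_bits: float = 0.0

    @property
    def nominal_volume_bits(self) -> float:
        return self.rate_bps * (self.end_s - self.start_s)

    @property
    def residual_volume_bits(self) -> float:
        return self.nominal_volume_bits - self.booked_bits

    def book(self, bundle_bits: float) -> bool:
        """Reserve volume for a bundle if it still fits; return success."""
        if bundle_bits <= self.residual_volume_bits:
            self.booked_bits += bundle_bits
            return True
        return False
```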

14:15
Toward a Distributed and Autonomous DTN Security Environment

ABSTRACT. This document surveys existing terrestrial network security practices, focusing on X.509 public key infrastructure (PKIX), and identifies ways that the existing systems and protocols can be used in a delay-tolerant networking (DTN) environment. Additional discussion of protocols currently under development shows how PKIX security can be used directly and efficiently within a DTN. These are combined into one possible vision for distributed and autonomous security within the NASA LunaNet architecture.

14:45
Adding Quality of Service Support to Bundle Protocol Through an Extension Block

ABSTRACT. The Bundle Protocol (BP) was designed to address the challenges inherent in space communications. While already in use in several projects led by various space agencies, including the European Space Agency (ESA) and the National Aeronautics and Space Administration (NASA), there is a need to expand BP’s capabilities, including in Quality of Service (QoS) support, an area currently lacking standardization. This document proposes a dual QoS support block for BP which facilitates the definition of QoS requirements at the source in an immutable manner while allowing dynamic adjustments by networks or subnetworks. Furthermore, preliminary results are presented, analyzing the effects of the proposed traffic prioritization system and the weighted queue management. These results show improved end-to-end delay for time-sensitive information, and a higher rate of achieved QoS requirements for all priority classes, as well as a fairer approach to network scheduling.
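
A purely hypothetical illustration of the dual-QoS idea (the field names and JSON encoding below are invented; the paper's block would be defined as a BP extension block, typically CBOR-encoded): an immutable source-defined part paired with a mutable part that networks or subnetworks may adjust.

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class SourceQoS:               # set once at the source; immutable in transit
    priority_class: int        # e.g. 0 = bulk, 1 = normal, 2 = expedited (assumed classes)
    max_latency_s: float

@dataclass
class NetworkQoS:              # may be adjusted by networks or subnetworks en route
    local_queue: int
    weight: float

def encode_qos_block(src: SourceQoS, net: NetworkQoS) -> bytes:
    """Serialize both parts; a real BP extension block would use CBOR, not JSON."""
    return json.dumps({"source": asdict(src), "network": asdict(net)}).encode()

block = encode_qos_block(SourceQoS(priority_class=2, max_latency_s=5.0),
                         NetworkQoS(local_queue=1, weight=0.6))
```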

13:15-15:15 Session 4C: Workshop: Cybersecurity
Location: Hahn Auditorium
13:15
Introduction to workshop and logistics
13:30
Panel Presentations
14:30
Breakout Sessions
13:15-15:15 Session 4D: Workshop: TFM
Location: Turing
13:15
Welcome and introductory comments
13:20
Foundation Models and Patterns for Science Time Series
13:40
AI Foundation Models for NASA Science: a Culture of Openness
14:00
Intelligent Parsing of Academic Literature Using Large Language Models
14:20
Panel
15:15-15:45 Coffee Break
15:45-17:15 Session 5A: SCC: High Performance Computing Architectures
Location: Learning Lab
15:45
Rad Hard Datacenter for Space

ABSTRACT. A datacenter for use in Space offers high performance computing similar to small terrestrial datacenters, including storage, networking, and cloud computing. When operated using software platforms from commercial Cloud Service Providers, the range of enabled applications is quite similar to terrestrial clouds. The Space datacenter is planned for reliable, power-efficient operation for longer than 30 years in any Space environment, employing on-board AI-based FDIR and self-healing. A small version is based on a single 6U-VPX enclosure. The large version is packaged in less than one cubic meter and offers about 13 TFLOPS, 100 AI/ML TOPS, and 3.5 PB of storage while dissipating 12 kW.
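
A quick worked check of the efficiency implied by the quoted figures, assuming the large configuration fills roughly one cubic meter:

```python
# Figures as quoted in the abstract; volume of exactly 1 m^3 is an assumption.
tflops, ai_tops, storage_pb, power_kw, volume_m3 = 13, 100, 3.5, 12, 1.0

print(f"{tflops * 1e3 / (power_kw * 1e3):.2f} GFLOPS per watt")       # ~1.08
print(f"{ai_tops * 1e3 / (power_kw * 1e3):.1f} AI/ML GOPS per watt")  # ~8.3
print(f"{storage_pb * 1e3 / volume_m3:.0f} TB of storage per cubic meter")  # ~3500
```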

16:15
Reliable, ML-Based Image Processing and Compression for an Accelerated Onboard Imaging Pipeline

ABSTRACT. Sensing and onboard-processing capabilities of next-generation spacecraft continue to evolve. Enabled by advances in avionic systems, large amounts of data can be collected and stored on orbit. Nevertheless, loss of signal, communication delays, and limited downlink rates remain a bottleneck for delivering data to ground stations or between satellites. This research investigates a multistage image-processing pipeline and demonstrates rapid collection, detection, and transmission of data using the Space Test Program - Houston 7 - Configurable and Autonomous Sensor Processing Research experiment aboard the International Space Station as a case study. Machine-learning (ML) models are leveraged to perform intelligent processing and compression of data prior to downlink to maximize available bandwidth. Furthermore, to ensure accuracy and preserve data integrity, a fault-tolerant ML framework is employed to increase pipeline reliability. The pipeline fuses the fault-tolerant Resilient TensorFlow framework with ML-based tile classification and the CNNJPEG compression algorithm. This research shows that the imaging pipeline is able to alleviate the impact of limited communication bandwidth by using reliable, autonomous data processing and compression techniques to achieve reduced transfer sizes of essential data. The results highlight the benefits provided by resilient classification and compression, including minimized storage use and reduced downlink time. The findings of this research are used to assess the feasibility of such a system for future space missions. The combination of these approaches enables the system to achieve up to a 98.67% reduction in data size and downlink time, as well as the capacity to capture imagery over a 75.19x longer time period for a given storage size, while maintaining reconstruction quality and data integrity.
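
The two headline numbers are consistent with each other: a 98.67% size reduction means roughly 1.33% of each image is retained, which corresponds to about 75x more imagery per unit of storage.

```python
reduction = 0.9867                         # 98.67% reduction reported in the abstract
retained_fraction = 1.0 - reduction        # ~1.33% of the original size is kept
capacity_factor = 1.0 / retained_fraction  # ~75.19x more imagery per unit of storage
print(f"{capacity_factor:.2f}x longer capture period for a fixed storage size")
```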

16:45
Radiation-Tolerant MNEMOSYNE Boot Memory and 80-bit Bus-Width DDR4 for the HPSC Processor

ABSTRACT. In space applications, the demand for high-performance computing systems capable of withstanding extreme conditions is paramount. Such systems, built around processors and/or FPGAs, require highly reliable, high-performance components. Boot/configuration memory and processing memory are essential to mission success.

16:50
Integrated Mobile Evaluation Testbed for Robotics Operations (IMETRO)

ABSTRACT. Robotic test facilities for dexterous mobile robotic manipulation will be crucial for proving the viability of advanced terrestrial technologies for space applications. A need has been expressed by many commercial, academic, and international partners for reference tasks and mockups of items and interfaces relevant to Moon to Mars exploration use cases. IMETRO is a new robotics test facility at NASA Johnson Space Center in Houston which will help to meet this need for both physical and digital-twin robotics testbeds for Artemis Campaign Use Cases. IMETRO features:
- COTS mobile robots, provided by the facility to allow partners to test software, sensors, tools, end effectors, etc.
- Medium- to high-fidelity mockups and interface testbeds relevant to multiple program stakeholders, to reduce duplication and encourage standardization among partners
- Remote operations, to simulate supervised autonomy of robots in space from Earth
- Digital-twin, open-source robotic simulations for early partner software development and testing, providing digital models of the physical testbeds and COTS robots in the facility

15:45-17:15 Session 5B: Workshop: STINT
Location: Lovelace
15:45
Panel: DTN Protocol Stack Architectures. Panelists: Felix Flentge (ESA), Rachel Dudukovich (NASA Glenn), Scott Burleigh (formerly JPL), Felix Walter (D3TN)
15:45-17:15 Session 5D: Workshop: TFM
Location: Turing
15:45
Ethics and Trustworthy Foundation Models
16:05
Leveraging AI for Assurance of Critical Software Systems
16:25
Panel
17:00
Concluding remarks
17:15-19:15 Reception