Abstract: Numerous preventive maintenance policies have been published in the reliability and maintenance literature. Nevertheless, few applications of these policies have been reported. The reasons vary, but the main one is that failure data are notoriously difficult to collect. As a result, many published maintenance policies are divorced from reality and are therefore inapplicable. This talk discusses some possible methods to shed light on such problems. We tackle the challenge by estimating system reliability from sparse data and by integrating various uncertainties into the optimisation of maintenance policies. These uncertainties stem from parameter estimation on small samples, model specification, and cost information.
10:10 | Optimal maintenance decision-making for continuously degrading systems with imperfect repairs using Markov decision processes ABSTRACT. In many practical systems, repair is often a more economical option than direct replacement. However, repairs usually have imperfect effects, as they only stochastically reduce degradation without fully restoring the system. Their effectiveness varies due to factors such as repair quality, component wear, and operational conditions. Considering this stochastic behavior, we develop a condition-based maintenance policy for a continuously degrading system that incorporates imperfect repairs alongside preventive and corrective replacements, reflecting realistic maintenance practices. The system degradation is modeled using an inverse Gaussian process with a general shape function, while the stochastic nature of repair effectiveness is captured by a beta distribution. Instead of relying on a parametric policy representation, we formulate the problem as a Markov decision process to directly learn a state-to-maintenance-action mapping under the long-run average cost criterion. The policy is optimized using dynamic programming with function approximation to efficiently handle the mixed discrete-continuous state space. To evaluate its performance, we conduct a comparative study against alternative maintenance policies, demonstrating the potential cost benefits of adaptive decision-making. Numerical experiments show that a near-optimal policy can be learned while effectively capturing degradation dynamics and repair uncertainties. |
10:30 | Joint optimization of operation & maintenance for navy ships PRESENTER: Mirel Maraha ABSTRACT. The operational performance of naval ships relies on their deployment and maintenance. The deployment is the assignment of specific missions to a specific ship. The missions indicate the type of use and the type of (weather) conditions in which the naval ship is deployed. The type of use and conditions influence the system degradation and in turn this has an influence on the maintenance needed. Maintenance needs are also affected by prescribed protocols and the present condition of the ship. Planning the deployment in accordance with maintenance, therefore, is an intricate challenge. Determining the best moment for maintenance is difficult when one does not want to interfere with the ship deployment. But it is also difficult to plan the deployment while knowing that without performing interim maintenance future missions might not be feasible. Both deployment and maintenance need to be scheduled simultaneously in order to achieve optimal availability of naval ships. They are also both needed to maximize the percentage of successful missions. This research focuses on ways to jointly plan maintenance and deployment, with the objective to optimize the availability of navy ships and maximize the percentage of successful missions. A mathematical optimization model has been created that allows the joint optimization of both deployment & maintenance. Based on preliminary research a Markov Decision Process (MDP) has been selected as the most suitable approach. This powerful mathematical framework can be used to model (sequential) decision-making when outcomes are partly random and partly under control of a decision-maker. The created mathematical model will be demonstrated on a case study for the Royal Netherlands Navy. Results are used to advise on planning of maintenance and deployment. |
10:50 | Interpretable Maintenance Impact Quantification Using JEPA-KAN with Self-Supervised Contrastive Learning: an Aircraft case study PRESENTER: Weikun Deng ABSTRACT. Optimizing maintenance strategies is critical for ensuring the safety, reliability, and cost-effectiveness of aircraft operations. A fundamental challenge in predictive maintenance lies in quantifying the impact of maintenance interventions. Existing approaches for evaluating this impact generally fall into two categories: model-based methods, which rely on predefined mathematical formulations but often fail to capture real-world complexities, and data-driven methods, which leverage real-time monitoring data but function as black-box models without explicitly characterizing maintenance effects. This lack of interpretability and generalizability limits their practical application. To address these challenges, this paper introduces a novel self-supervised learning approach for interpretable maintenance impact quantification. In particular, by employing a Siamese architecture with interpretable Boltzmann Knowledge-Action Networks (KANs), the proposed framework derives an analytical expression that quantitatively describes maintenance effects. This allows for a transparent and physically meaningful assessment of maintenance efficiency, bridging the gap between data-driven adaptability and model-based interpretability. Experimental evaluations on real-world aircraft maintenance datasets demonstrate that the proposed approach outperforms traditional methods in predicting post-maintenance system health while offering a clear, explainable rationale for its predictions. These findings highlight the potential of interpretable deep learning models to enhance predictive maintenance, ultimately supporting more informed decision-making and improving aviation safety and efficiency. |
10:10 | Model-Based Load-Dependent Degradation Modeling For PEM Fuel Cells: A Multi-Health Index Approach Toward Energy Management in Multi-Stack Systems ABSTRACT. Multi-stack proton exchange membrane fuel cells (PEMFCs) present a promising solution for high-power applications and carbon-free energy. Despite significant advancements in PEMFC technology, durability and cost remain key challenges for large-scale commercialization. One potential approach to mitigating degradation is the optimization of operating parameters, particularly power demand, which is a major influencing factor. While power is generally dictated by the application's requirements, in a multi-stack configuration, it can be dynamically allocated among the stacks. Determining the optimal load distribution requires a reliable degradation model, which remains a challenge due to the system's complexity and the multi-faceted nature of degradation mechanisms. This paper proposes a load-dependent degradation model incorporating two key health indices that contribute to fuel cell performance loss: electrochemical surface area (ECSA) degradation and internal resistance increase. The ECSA is linked to power demand through a platinum dissolution model, while resistance evolution is represented as a load-dependent stochastic process. These indices are then integrated into a fuel cell potential model, enabling a more accurate assessment of degradation dynamics. The proposed model provides a foundation for optimizing load allocation strategies, ultimately enhancing PEMFC lifespan and performance in multi-stack architectures. |
10:30 | Uncertainty Quantification as a Complementary Latent Health Indicator for Remaining Useful Life Prediction on Turbofan Engines ABSTRACT. Health Indicators (HIs) are essential for predicting system failures in predictive maintenance. While methods like RaPP (Reconstruction along Projected Pathways) improve traditional HI approaches by leveraging autoencoder latent spaces, their performance can be hindered by both aleatoric and epistemic uncertainties. In this paper, we propose a novel framework that integrates uncertainty quantification into autoencoder-based latent spaces, enhancing RaPP-generated HIs. We demonstrate that separating aleatoric uncertainty from epistemic uncertainty and cross combining HI information is the driver of accuracy improvements in Remaining Useful Life (RUL) prediction. Our method employs both standard and variational autoencoders to construct these HIs, which are then used to train a machine learning model for RUL prediction. Benchmarked on the NASA C-MAPSS turbofan dataset, our approach outperforms traditional HI-based methods and end-to-end RUL prediction models and is competitive with RUL estimation methods. These results underscore the importance of uncertainty quantification in health assessment and showcase its significant impact on predictive performance when incorporated into the HI construction process. |
10:50 | A Petri Net approach to predict the lifespan of a railway track asset PRESENTER: Emily Buttriss ABSTRACT. Accurate prediction of when a track asset requires renewal is crucial in any efficient management strategy, to ensure that adequate funding for replacements is available when needed. The end of an asset’s useful lifetime can be defined by multiple criteria, including the amount of maintenance works completed, the frequency of interventions, the total cost of maintenance, and the overall condition of the asset. This study develops a Petri net model to predict the lifespan of a component on the railway line, given the renewal criteria. The approach consists of modelling the inspection, degradation, maintenance, and renewal processes via sub-models, with strategic placing of tokens to indicate the chosen renewal strategy. The thresholds for maintenance, including the required action and its expected timeframe, are taken directly from railway operational standards, and the expected degradation events are based on a reliability study of historical track failure data. The model has been computed via Monte Carlo simulation, using data from the HS1 railway line, situated in the UK. The results from this study aim to support the railway industry when selecting the most cost-effective renewal strategy that keeps the track asset in a safe, practical condition for the longest time. |
11:30 | Task Grouping Optimization for Gas Insulated Substations (GIS) Predictive Maintenance: Trade-off between cost and environmental impact PRESENTER: Wenbo Wu ABSTRACT. An effective predictive maintenance plan ensures operational reliability while achieving key optimization goals. In the context of Gas-Insulated Substations (GIS), our core concerns lie in minimizing maintenance costs and mitigating the environmental impact of sulfur hexafluoride (SF6) gas leakage, a potent greenhouse gas. Building on our previous work optimizing maintenance schedules for individual GIS components, this study extends the approach to a second-stage task-grouping optimization consolidating maintenance tasks across multiple GIS components to address operational inefficiencies such as increased costs, underutilized resources, and prolonged downtime. Specifically, our approach combines maintenance tasks that are temporally close and operationally compatible, thereby reducing travel costs and improving resource utilization. The proposed method formulates the grouping problem as a Mixed Integer Linear Programming model, incorporating key constraints such as task time windows, resource availability, and operational priorities. Furthermore, the trade-off between maintenance costs and SF6 leakage quantity is analyzed to account for its environmental impact. |
11:50 | Mitigating Financial Risk in Maintenance Contracts for Heterogeneous Machines PRESENTER: Stijn Loeys ABSTRACT. We assess the financial risk incurred by service providers responsible for delivering all maintenance under a fixed upfront fee agreement for a heterogeneous machine fleet. To minimize maintenance costs, a preventive maintenance policy is optimized by sequentially learning the machine-specific failure behavior using data accrued during the contract. We use Bayesian updating to jointly tailor a machine's preventive maintenance policy and quantify the associated uncertainty of the maintenance cost distribution. Through an extensive simulation study, we demonstrate that incorporating maintenance visits or health indicator data into the estimation of the failure distribution and the optimization of the preventive maintenance policy leads to more accurate cost predictions and hence reduced financial risk. This data-driven, machine-specific approach results in more competitive and reliable contracts compared to a one-size-fits-all strategy. |
12:10 | Towards Automating xGSPN-mBSPN Model Generation for Scalable Fault Diagnostics in Dynamic Systems ABSTRACT. Efficient fault diagnosis is fundamental to industrial reliability and diagnostic applications, particularly for ensuring system safety and efficient performance. While traditional model-based fault detection approaches have proven effective, they face scalability challenges due to the manual effort required in model construction. This paper introduces an automated framework, integrating extended Generalized Stochastic Petri Nets (xGSPNs) for system modelling with modified Bayesian Stochastic Petri Nets (mBSPNs) for fault diagnosis. A set of novel algorithms is proposed to enable the automatic generation of the xGSPN model from system specifications and derivation of the mBSPN diagnostic module from the xGSPN representation, including the automated construction of input Conditional Probability Tables (iCPTs), required for diagnostic reasoning. These automation processes reduce manual effort, improve model accuracy, and enhance adaptability for large-scale and time-varying systems. The effectiveness of the proposed approach is validated using a water tank level control system, demonstrating its capability in detecting and diagnosing single and multiple faults. The findings contribute to advancing hybrid fault detection and diagnostic methodologies, making them more practical for industrial reliability and fault diagnostics applications. |
12:30 | Predicting the remaining life of lithium-ion batteries: a frugal data-based approach ABSTRACT. Predictive maintenance aims to anticipate the end-of-life of components, thereby optimising human intervention and use of parts, but requires constant monitoring. Calculating the Remaining Useful Life (RUL), which estimates the time remaining for a system to operate satisfactorily, is a crucial stage in predictive maintenance. Classically, there are three approaches for calculating RUL: with models, with data or with hybrid methods. Methods using data, whether statistical or using neural networks, often require very large quantities of data. In this paper, an approach that is frugal in terms of the data required is proposed to calculate the RUL of lithium-ion batteries. This method uses a polynomial as an approximation of the capacity over the cycles, whose coefficients are obtained through least mean squares optimization under linear constraints. With an average prediction horizon of 30 % remaining lifetime, this method is relevant in terms of computational complexity and required quantity of data. |
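The idea of fitting a constrained polynomial to capacity data and extrapolating it to an end-of-life threshold can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the polynomial degree (2), the single linear equality constraint c(0) = c0, the end-of-life threshold, and the function names `fit_capacity` and `end_of_life_cycle` are all hypothetical choices made for this example.

```python
import math


def fit_capacity(cycles, capacities, c0):
    """Least-squares fit of c(n) = c0 + a*n + b*n**2 with c(0) = c0 enforced.

    The equality constraint is eliminated by substitution (an assumption of
    this sketch), which leaves a 2x2 system of normal equations for a and b.
    """
    s11 = sum(n ** 2 for n in cycles)
    s12 = sum(n ** 3 for n in cycles)
    s22 = sum(n ** 4 for n in cycles)
    t1 = sum(n * (c - c0) for n, c in zip(cycles, capacities))
    t2 = sum(n ** 2 * (c - c0) for n, c in zip(cycles, capacities))
    det = s11 * s22 - s12 ** 2
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def end_of_life_cycle(c0, a, b, threshold):
    """Smallest positive cycle count at which the fitted capacity hits threshold."""
    if b == 0:
        # Linear fit: capacity only reaches the threshold if it decreases.
        return (threshold - c0) / a if a < 0 else None
    disc = a * a - 4 * b * (c0 - threshold)
    if disc < 0:
        return None  # the fitted curve never reaches the threshold
    roots = [(-a + sign * math.sqrt(disc)) / (2 * b) for sign in (1, -1)]
    positive = [r for r in roots if r > 0]
    return min(positive) if positive else None
```

The RUL at the last observed cycle is then the returned end-of-life cycle minus that cycle. Inequality constraints (for example, forcing the fitted capacity to be non-increasing) would need a quadratic programming solver and are left out of this sketch.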
11:30 | Real-Time Monitoring of Nozzle Clogging in Cold Spray Process Using Airborne Acoustic Emission and Data-Driven Prognostics ABSTRACT. The cold spray process is an emerging solid-state deposition technique that accelerates metallic powder particles to supersonic speeds through a converging-diverging nozzle using a carrier gas. Upon impact with a substrate, the particles form a dense, adherent deposition, making cold spray highly suitable for coating and repair applications. Its key advantages include low-temperature operation, minimal oxidation, and reduced thermal degradation of the substrate, making it particularly attractive for aerospace, automotive, and other industries. A common problem during cold spray is nozzle clogging. It can occur due to the slow buildup of powder inside the nozzle, which restricts the gas-particle flow and thereby affects the quality of the deposit. Detection of clogging is therefore important for quality assurance. Furthermore, the ability to predict the occurrence of clogging in advance could enable corrective action before part quality is affected. In previous work, the authors showed the potential of airborne acoustic emission (AAE) as a real-time, non-intrusive monitoring technique, providing valuable insights into the process without requiring a direct line of sight with the spray plume. Preliminary experiments showed that the acoustic waves generated during the process contained valuable information related to particle velocity, nozzle positioning with respect to the object, and nozzle clogging. In this work, the analysis of the AAE signals was further developed for the detection and prognostics of nozzle clogging. Run-to-failure experiments were performed to analyse nozzle clogging progression, reveal its stochasticity and extract relevant features to characterize the different clogging stages: healthy condition, clogging initiation, clogging buildup, and the end of life. 
A health indicator (HI) was derived from AAE signal features to quantify varying levels of nozzle clogging. This HI was not only used to quantify clogging progression but was also directly linked to the quality of the deposited coating. As clogging began, porosity increased gradually in the deposit, deteriorating its quality. Data-driven models developed during this work leverage the HI for prognostics, estimating the nozzle’s remaining useful life (RUL) before complete clogging occurs. Combining AE-based monitoring with predictive algorithms is expected to minimize unplanned downtime, reduce material waste by reducing rejected parts, and improve the overall efficiency of cold spray applications. The proposed methodology is validated through further experiments with varying process parameters, assessing its effectiveness in early clogging detection and RUL estimation. This work highlights the potential of condition-based maintenance in cold spray applications. Early detection of clogging and linking the HI to deposit quality ensure consistent coating performance and optimized operational efficiency. |
11:50 | Data-driven Prognostics under uncertainty: A comparative study on the state-of-the-art HMMs ABSTRACT. As systems become more complex, the task of making them safe and reliable without wasting materials and resources proves challenging. To tackle this challenge, the field of Prognostics and Health Management (PHM) is emerging, providing novel modelling techniques to predict the future damage state of these systems and optimize maintenance strategies. A paradigm shift in the PHM field has occurred in the last few years, where analytical modelling has been complemented (or entirely replaced) with data-driven modelling. This transition is driven by the growing capabilities of data-driven models and the increasing complexity of systems, which often render analytical methods either inaccurate or computationally prohibitive. Due to the ever-increasing popularity of artificial neural networks (ANNs), novel and high-performing models have been devised and applied in PHM. However, an aspect often overlooked when applying ANNs for prognostic tasks is that they are, by nature, deterministic. In contrast, predicting any value in the future is inherently stochastic. Any predicted variable therefore needs to be modelled as a random variable to quantify the associated uncertainty coming from the process and the prediction itself. For that reason, stochastic models are gaining traction in the PHM field. Hidden Markov models (HMMs) are one of the most popular stochastic models for predictive tasks since they have a rich mathematical formulation to model the system's hidden (not directly observable) degradation process while properly considering the associated uncertainty. A plethora of modifications to the vanilla HMM have been devised that: relax the Markovian assumption (Hidden Semi-Markov Model), use adaptation mechanisms to handle outlying cases in time (AHSMM), or use similarity-based learning to enhance their performance (SL-HSMM).
Although these extensions enhance predictive capability, they also introduce additional computational costs. This study evaluates the performance and computational efficiency of these advanced HMM variants using real-world experimental data from carbon-fiber-reinforced polymer composite specimens subjected to tensile-tensile fatigue loading. Acoustic emission data are utilized to predict the remaining useful life (RUL) of the specimens. Multiple HMM-based models are compared both in terms of accuracy and uncertainty quantification. A novel equation for calculating the RUL is also presented and compared with the literature-standard one. This work thus provides a comprehensive assessment of state-of-the-art Hidden Markov Models for PHM applications. |
12:10 | Optimal maintenance planning for offshore wind farms considering time-varying costs and limited manpower PRESENTER: Rommert Dekker ABSTRACT. With wind energy taking up a bigger share of the worldwide electricity production each year and the desire to have switched to a fully sustainable global energy landscape by 2050, finding least-cost maintenance programs for wind turbine components becomes increasingly important. In this thesis we analyse the problem of determining which maintenance activities should be conducted at times other than originally planned when encountering limited available manpower. We consider the period-dependent age replacement policy (p-ARP), block replacement policy (p-BRP) and modified block replacement policy (p-MBRP) to construct least-cost maintenance policies for a single component under time-varying costs. The first two form the groundwork for three algorithms that we propose in case of dealing with multiple components where only a limited number can be maintained simultaneously. First, we propose the Dynamic Maintenance Delay (DMD) heuristic that deals with delaying preventive maintenance activities. Next, we include bringing forward maintenance by introducing the Dynamic Maintenance Reschedule (DMR) and Static Maintenance Reschedule (SMR) heuristics. We evaluate the performances by means of simulation for eighty identical components. Moreover, we compare the outcomes with the optimal policy in case of two components. |
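As background for the age replacement policy named in this abstract, the classical constant-cost version can be sketched numerically. This is not the period-dependent p-ARP of the talk: costs here do not vary over time, the Weibull lifetime and all parameter values are assumptions for illustration, and the function names are hypothetical. The long-run cost rate of replacing at age T is g(T) = (c_p R(T) + c_f (1 - R(T))) / ∫_0^T R(t) dt, where R is the survival function.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function R(t) of a Weibull lifetime (assumed for this sketch)."""
    return math.exp(-((t / scale) ** shape))


def age_replacement_cost_rate(T, c_p, c_f, shape, scale, steps=500):
    """Long-run cost per unit time when replacing preventively at age T.

    c_p is the preventive replacement cost, c_f the (higher) failure cost.
    The expected cycle length, the integral of R over [0, T], is computed
    with the trapezoidal rule.
    """
    h = T / steps
    interior = sum(weibull_survival(i * h, shape, scale) for i in range(1, steps))
    r_end = weibull_survival(T, shape, scale)
    expected_cycle_length = h * (0.5 * (1.0 + r_end) + interior)
    expected_cycle_cost = c_p * r_end + c_f * (1.0 - r_end)
    return expected_cycle_cost / expected_cycle_length


def best_replacement_age(candidates, c_p, c_f, shape, scale):
    """Grid search for the candidate age with the lowest cost rate."""
    return min(
        candidates,
        key=lambda T: age_replacement_cost_rate(T, c_p, c_f, shape, scale),
    )
```

A grid search is used only because it is easy to verify; with time-varying costs, as in the talk, the cost rate would depend on the calendar period as well as on the component age.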
12:30 | Redefining Prognostic Essentials: Focus on Reliability, Robustness and Feasibility. ABSTRACT. Prognostics play a pivotal role in predictive and prescriptive maintenance by forecasting the future health and performance of assets based on their current condition and operational context. While traditional research has focused on enhancing RUL prediction accuracy, this work argues that the feasibility, robustness, and reliability characteristics are equally vital for addressing the demands of modern maintenance strategies. To effectively support maintenance decision-making, prognostic methodologies must meet three critical criteria: feasibility, robustness, and reliability. Feasibility refers to the ability of a prognostic methodology to function effectively with limited degraded data, as obtaining extensive degradation datasets can be prohibitively expensive. Robustness ensures that prognostics maintain reliable performance across a wide range of operational conditions, including those not encountered during training. Reliability is vital due to the inherent uncertainties in prognostics, arising from factors such as manufacturing variations, unpredictable future loading conditions, and environmental influences. Motivated by these requirements, this work introduces a novel adaptive similarity-based prognostic methodology inspired by Markov models. The proposed approach will be evaluated and validated against state-of-the-art methods using both simulated and real-world data from the aerospace sector. |
14:10 | Evaluation of Mission Reliability for a System in the Presence of Spare Parts PRESENTER: Reem Alrashed ABSTRACT. The main objective of this study is to evaluate mission reliability for a system in the presence of spare parts. That is achieved by using a Markov stochastic process and survival signature methodology. The survival signature [1] provides a methodology to evaluate the reliability of systems with multiple types of components. This study presents an analysis of a system with multiple types of components, supposing that the distribution function of the failure time of components is an Erlang distribution. During the mission, some spare parts are available for some component types to replace failed components immediately [2]. However, when all available spare parts have been used, further failing components cannot be replaced or repaired. The study introduces the method for a system with just one component type, after which the application to a system with multiple component types is presented. This method can be used as input to make decisions about the number of spare parts available to meet some mission reliability requirements. References: [1] Coolen, F. P., and Coolen-Maturi, T. (2012). Generalizing the signature to systems with multiple types of components. In Complex systems and dependability (pp.115-130). Springer Berlin Heidelberg. [2] Van Houtum, G. J., and Kranenburg, B. (2015). Spare parts inventory control under system availability constraints (Vol. 227). Springer. |
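For the simplest case mentioned in this abstract, a system with one component and immediate replacement from a stock of spares, mission reliability has a closed form that can be sketched directly. The sketch assumes that spares do not age before use and that all lifetimes are independent Erlang(shape, rate) variables; the survival signature and Markov process machinery of the study are not reproduced, and the function names are hypothetical. Under these assumptions the total life of the original component plus s spares is Erlang((s + 1)·shape, rate), so the mission succeeds if that sum exceeds the mission time.

```python
import math


def erlang_survival(t, shape, rate):
    """P(X > t) for X ~ Erlang(shape, rate), with shape a positive integer."""
    x = rate * t
    term = 1.0
    total = 1.0
    for j in range(1, shape):
        term *= x / j  # builds x**j / j! incrementally
        total += term
    return math.exp(-x) * total


def mission_reliability(mission_time, spares, shape, rate):
    """Reliability of a one-component system with `spares` cold spares."""
    return erlang_survival(mission_time, (spares + 1) * shape, rate)


def min_spares(mission_time, target, shape, rate, max_spares=100):
    """Smallest number of spares meeting the target reliability, or None."""
    for s in range(max_spares + 1):
        if mission_reliability(mission_time, s, shape, rate) >= target:
            return s
    return None
```

For example, with exponential lifetimes (shape 1), rate 0.1 and a mission time of 10, `min_spares(10, 0.9, 1, 0.1)` returns 2 under these assumptions.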
14:30 | ReLife: an open-source Python library for data-driven decision-making in asset management based on reliability theory ABSTRACT. In the context of aging infrastructures and climate change, asset managers face critical investment decisions. ReLife is an open-source Python library that aims to provide various statistical methods, including survival analysis and advanced reliability models like condition-based maintenance and optimal asset replacement. The library also provides renewal process sampling, includes a renewal equation solver, and supports Lebesgue-Stieltjes integration. ReLife helps asset managers select optimal maintenance policies that minimize expected equivalent annual costs and socio-economic impacts while serving as a collaborative platform for reliability professionals and researchers. |
14:50 | Evaluating Maintenance Strategies for Locomotive Wheelsets Using Petri Net-Based Modelling Approach PRESENTER: Maksym Ocheretniuk ABSTRACT. Effective locomotive fleet management is crucial for ensuring the safe, reliable and cost-efficient operation of railway systems. One of the primary maintenance tasks in railway transportation is the condition monitoring and management of rolling stock wheelsets, which directly influence operational safety, ride comfort, and maintenance costs. This paper proposes a Petri net-based approach to model the state transitions of wheelset flange and tread degradation to evaluate different maintenance strategies for locomotive wheelsets. The methodology aims to enhance decision-making regarding maintenance intervals, minimize locomotive downtime, and reduce maintenance costs, while maintaining high safety and reliability standards. A key feature of the proposed approach is the ability to simulate different maintenance strategies for the locomotive wheelsets and evaluate their impact on fleet performance metrics, such as availability, reliability, and life-cycle costs. By varying the inspection and maintenance policies within the simulation, the optimal strategy that balances safety, cost, and operational efficiency can be identified. Another important outcome is that the proposed model allows the safety risks to be evaluated when the locomotive wheelset condition is outside the standard’s limits that are established by maintenance policy. Additionally, the model can be used as a decision-support tool that helps to evaluate the trade-offs between different maintenance policies and select the most cost-effective strategy. |
15:10 | Reliability Prediction for Combined Hardware-Software Systems Using Survival Signature ABSTRACT. The increasing integration of hardware and software in modern safety-critical systems has increased the need for accurate reliability prediction models to prevent catastrophic failures. Several reliability approaches have been developed in the literature to evaluate the performance of combined hardware-software systems based on different assumptions using various statistical and probabilistic methods [1]. However, there remains a need to develop models that can be applied to a wide range of such systems. This research presents a unified framework for predicting the reliability of a combined hardware-software system using the survival signature concept [2]. A key advantage of this approach is its applicability to large systems and networks, as it can handle multiple types of components without requiring the assumption that their failures are independent and identically distributed. This makes it particularly well-suited for real-world systems that integrate hardware and software components. The methodology enables system reliability evaluation by examining how the system’s structure and component characteristics affect overall system performance. The survival signature methodology is implemented in this research to predict the reliability of a system consisting of n subsystems, where each subsystem comprises one hardware component and one software module, and the system requires at least k functioning subsystems to function. The analysis integrates system failure diagnosis to determine whether failures originate from hardware or software components, using diagnostic equations derived from the survival signature approach. This enables the evaluation of failure propagation and the identification of component types with the greatest impact on reliability.
Initial results demonstrate the effectiveness of this unified approach in predicting system reliability by incorporating both hardware and software failures, providing a foundation for improved system design and maintenance strategies. References [1] Sourav Sinha, Neeraj Kumar Goyal, and Rajib Mall. Survey of combined hardware-software reliability prediction approaches from architectural and system failure viewpoint. International Journal of System Assurance Engineering and Management, 10:453–474, 2019. [2] Frank P. A. Coolen and Tahani Coolen-Maturi. Generalizing the signature to systems with multiple types of components. In Complex Systems and Dependability, pages 115–130. Springer, 2012. |
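The k-out-of-n structure described in this abstract can be illustrated with a deliberately simplified calculation. The sketch below is not the survival signature framework of the talk: it assumes that all n subsystems are identical and independent, and that the hardware component and software module within a subsystem fail independently and are both needed, which are exactly the kinds of assumptions the survival signature approach is meant to relax. The function names are hypothetical.

```python
import math


def subsystem_reliability(r_hardware, r_software):
    """Subsystem works only if both hardware and software work (assumed independent)."""
    return r_hardware * r_software


def k_out_of_n_reliability(n, k, p):
    """Probability that at least k of n independent subsystems function.

    Each subsystem functions with probability p, so the number of working
    subsystems is binomial(n, p).
    """
    return sum(
        math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
        for j in range(k, n + 1)
    )


def system_reliability(n, k, r_hardware, r_software):
    """Reliability of the k-out-of-n system of hardware-software subsystems."""
    return k_out_of_n_reliability(n, k, subsystem_reliability(r_hardware, r_software))
```

In survival signature terms, this corresponds to a single component type whose signature equals 1 when at least k subsystems function and 0 otherwise.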
15:30 | Trust is Good, Monitoring is Better: FPGA- & TEE-Based Monitoring for Malware-Detection PRESENTER: Friederike Bruns ABSTRACT. Ensuring trustworthiness in electronic systems is crucial to maintain safety and data integrity. Safety properties of robotic components are rigorously validated during development and, similarly, security requires ongoing monitoring during system operation as well. However, this monitoring must also safeguard its own components from tampering. We propose a novel runtime monitoring approach using application-specific monitors within an FPGA-based Trusted Execution Environment (TEE). To protect these monitors from supply chain attacks during design, fabrication, testing, or packaging, the TEE is programmed as the final step before deployment. The monitors are directly generated from formal constraint specifications established during the design and test phases. Our approach is demonstrated on a RISC-V-based System-on-Chip (SoC) for robotic applications, featuring a force sensor and a CAN-bus interface. We monitor the timing behaviour of hardware and software to detect malicious modifications affecting data transmission to a control unit. In an FPGA prototype, the monitors successfully identified hardware and software tampering. In real ASIC implementations, programming the TEE post-packaging ensures resilience against supply chain attacks. |