Towards In-silico Design of Fusion Power-plant Systems
ABSTRACT. In the next two decades, significant effort will be expended globally to deliver first-of-a-kind, commercialisable fusion power-plant facilities. With compressed timelines and budgetary constraints, traditional engineering methodologies must be accelerated through in-silico design and qualification, using validated computational models of fusion components and devices. Deriving actionable conclusions from simulation demands sufficient modelling fidelity, together with the quantification and propagation of uncertainties. It is further desirable to facilitate exploration and discovery within a parameterised design space through modern techniques in fields such as multi-objective optimisation and data science.
The ability to exploit such technologies assumes a mechanism to automatically prepare and evaluate a model for a given design specification. Furthermore, such iterative methodologies imply the production of large volumes of simulated data; to avoid a scenario where the computational cost of generating such data is prohibitive, highly performant software applications must be employed. In this talk, ongoing activities at the United Kingdom Atomic Energy Authority in support of these ambitions are reviewed. A growing suite of open-source multi-physics applications for modelling the extremes of the fusion environment is described. The scalability of these software tools, and the potential to leverage future exascale architectures, is discussed. Finally, recent developments of a novel framework for the automated design of the breeder blanket – a critical sub-system of a magnetic-confinement fusion power-plant – are reported.
Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
ABSTRACT. When testing scientific software, it is often challenging or even impossible to craft a test oracle for checking whether the program under test produces the expected output when executed on a given input – also known as the oracle problem in software engineering. Metamorphic testing mitigates the oracle problem by reasoning on necessary properties that a program under test should exhibit regarding multiple input and output variables. A general approach consists of extracting metamorphic relations from auxiliary artifacts such as user manuals or documentation, a strategy particularly well suited to testing scientific software. However, such software typically has large input-output spaces, and the fundamental prerequisite – extracting variables of interest – is an arduous and non-scalable process when performed manually. To this end, we devise a workflow around an autoregressive transformer-based Large Language Model (LLM) for the extraction of variables from user manuals of scientific software. Apart from a prompt specification consisting of few-shot examples provided by a human user, our end-to-end approach is fully automated, in contrast to current practice requiring human intervention. We showcase our LLM workflow on three real case studies of scientific software documentation, and compare the extracted variables to a ground truth manually labelled by experts.
CLARIN-Emo: Training Emotion Recognition Models using Human Annotation and ChatGPT
ABSTRACT. In this paper, we investigate whether it is possible to automatically annotate texts with ChatGPT or generate both artificial texts and annotations for them. We prepared three collections of texts annotated with emotions at the level of sentences and/or whole documents. CLARIN-Emo contains the opinions of real people, manually annotated by six linguists. Stockbrief-GPT consists of real human articles annotated by ChatGPT. ChatGPT-Emo is an artificial corpus created and annotated entirely by ChatGPT. We present an analysis of these corpora and the results of Transformer-based methods fine-tuned on these data. The results show that manual annotation can provide better-quality data, especially in building personalized models.
ABSTRACT. We propose a hybrid representation of syntactic structures, combining constituency and dependency information. The headed constituency trees that we use offer the advantages of both those approaches to representing syntactic relations within a sentence, with a focus on consistency between them. Based on this representation, we introduce a new constituency parsing technique capable of handling discontinuous structures. The presented approach is centred around head paths in the constituency tree that we refer to as spines and the attachments between them. Our architecture leverages a dependency parser and a large BERT model and achieves a 95.98% F1 score on a dataset where ≈10% of trees contain discontinuities.
Inference of over-constrained NFA of size k+1 to efficiently and systematically derive NFA of size k for grammar learning
ABSTRACT. Grammatical inference involves learning a formal grammar as a finite state machine or set of rewrite rules. This paper focuses on inferring Nondeterministic Finite Automata (NFA) from a given sample of words: the NFA must accept some words, and reject others. Our approach is unique in that it addresses the question of whether or not a finite automaton of size k exists for a given sample by using an over-constrained model of size k+1. Additionally, our method allows for the identification of the automaton of size k when it exists. While the concept may seem straightforward, the effectiveness of this approach is demonstrated through the results of our experiments.
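At the core of any such inference procedure lies a consistency check: a candidate NFA must accept every positive word in the sample and reject every negative one. The sketch below illustrates only that check in Python; the automaton, state count, and sample words are illustrative toy assumptions, not the paper's model or solver.

```python
# Minimal sketch of the consistency check underlying NFA inference:
# a candidate NFA of size k must accept all positive sample words
# and reject all negative ones. Toy automaton, not the paper's model.

def accepts(delta, initial, finals, word):
    """Simulate an NFA given as delta[(state, symbol)] -> set of states."""
    states = set(initial)
    for symbol in word:
        states = set().union(*(delta.get((q, symbol), set()) for q in states))
        if not states:            # dead: no run can continue
            return False
    return bool(states & finals)

def consistent(delta, initial, finals, positives, negatives):
    """True iff the NFA accepts every positive and rejects every negative word."""
    return (all(accepts(delta, initial, finals, w) for w in positives)
            and not any(accepts(delta, initial, finals, w) for w in negatives))

# Illustrative 2-state NFA over {a, b} accepting words ending in 'a'.
delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}}
print(consistent(delta, {0}, {1},
                 positives=['a', 'ba', 'aba'], negatives=['', 'b', 'ab']))
```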
ABSTRACT. Data Maps is a method of graphical representation of datasets which allows observing the model's behavior for individual instances in the learning process (training dynamics). The method groups elements of a dataset into easy-to-learn, ambiguous, and hard-to-learn. In this article, we present an extension of this method, Differential Data Maps, which allows visual comparison of different models trained on the same dataset, or analysis of the effect of selected features on model behavior. We show an example application of this visualization method to explain the differences between four personalized model architectures in a sentiment analysis task. We use three datasets that differ in the type of human context: user-annotator, user-author, and user-author-annotator. Our results show that with the new explainable AI method, it is possible to pose new hypotheses explaining differences in the quality of model performance, both at the level of features in the datasets and at the level of differences in model architectures.
User Popularity Preference Aware Sequential Recommendation
ABSTRACT. In recommender systems, users' preferences for popularity are diverse and dynamic, which is reflected in the different items that users prefer. Therefore, identifying user popularity preferences is significant for personalized recommendations. Although many methods have analyzed user popularity preferences, most of them only consider particular types of popularity preferences, leading to inappropriate recommendations for users who have other popularity preferences. To comprehensively study user popularity preferences, we propose a User Popularity preference aware Sequential Recommendation (UPSR) method. By sequentially perceiving user behaviors, UPSR captures the type and the evolution of user popularity preferences. Furthermore, UPSR employs contrastive learning to gather similar users and enhance user interest encoding. Then, we can match items and user popularity preferences more accurately and make more appropriate recommendations. Extensive experiments validate that UPSR not only outperforms the state-of-the-art methods but also reduces popularity bias.
B2-FedGAN: Balanced Bi-directional FedAvg based GAN
ABSTRACT. In Federated Learning (FL), a shared model is learned across dispersed clients, each of which often has small and heterogeneous data. As such, datasets in the FL setting may suffer from the non-IID problem, i.e., data that are not independent and identically distributed. In this paper, we propose BAGAN as a machine learning model with the ability to create data for minority classes, and Bi-FedAvg as a new approach to mitigate non-IID problems in FL settings.
The comparative performance of FedAvg and Bi-FedAvg in both IID and non-IID environments is shown in terms of accuracy, convergence stability and categorical cross-entropy loss. In addition, the training and testing performance of FedAvg, FedAvg with a conditional GAN model, and FedAvg with the BAGAN-GP model in IID and non-IID environments with three imbalanced datasets is compared and discussed. The results indicate that Bi-FedAvg fails to outperform FedAvg, as Bi-FedAvg suffers from model quality loss or even divergence when running on non-IID data partitions. In addition, our experiments demonstrate that BAGAN can generate higher-quality images for complex image datasets, and that combining federated learning with the Balancing GAN model is conducive to obtaining a high level of privacy-preserving capability and achieving more competitive model performance. The project will inspire further exploration of combining Federated Learning and BAGAN for image classification in real-world scenarios.
ABSTRACT. While fairness-aware machine learning algorithms have been receiving increasing attention, the focus has been on centralized machine learning, leaving decentralized methods underexplored. Federated Learning is a decentralized form of machine learning where clients train local models, with a server aggregating them to obtain a shared global model. Data heterogeneity amongst clients is a common characteristic of Federated Learning, which may induce or exacerbate discrimination of unprivileged groups defined by sensitive attributes such as race or gender. In this work, we propose FAIR-FATE: a novel FAIR FederATEd Learning algorithm that aims to achieve group fairness while maintaining high utility via a fairness-aware aggregation method that computes the global model by taking into account the fairness of the clients. To achieve this, the global model update is computed by estimating a fair model update using a Momentum term that helps to overcome the oscillations of non-fair gradients. To the best of our knowledge, this is the first approach in machine learning that aims to achieve fairness using a fair Momentum estimate. Experimental results on real-world datasets demonstrate that FAIR-FATE outperforms state-of-the-art fair Federated Learning algorithms under different levels of data heterogeneity.
Data Heterogeneity Differential Privacy: From Theory to Algorithm
ABSTRACT. In the field of differential privacy (DP), random noise is traditionally injected equally when training with different data instances. In this paper, we first give sharper excess risk bounds for the DP stochastic gradient descent (DP-SGD) method. Considering that most previous methods operate under convexity conditions, we relax this requirement using the Polyak-Łojasiewicz condition. Then, observing that different training data instances affect the machine learning model to different extents, we consider the heterogeneity of the training data and attempt to improve the performance of DP-SGD from a new perspective. Specifically, by introducing the influence function (IF), we quantitatively measure the contributions of individual training data instances to the final machine learning model. If the contribution made by a single data instance is so small that attackers cannot infer anything from the model, we do not add noise when training with it. Based on this observation, we design a 'Performance Improving' DP-SGD algorithm: PIDP-SGD. Theoretical and experimental results show that our proposed PIDP-SGD improves performance significantly.
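A minimal sketch of the selective-noise idea, in Python with NumPy: per-example gradients are clipped as in standard DP-SGD, but Gaussian noise is injected only when an instance's influence score exceeds a threshold. The influence proxy, the threshold tau, and the toy least-squares problem are illustrative assumptions, not the paper's exact estimator or privacy accounting.

```python
import numpy as np

def pidp_sgd_step(w, grad_fn, batch, influence_fn, lr=0.1,
                  clip=1.0, sigma=1.0, tau=0.05,
                  rng=np.random.default_rng(0)):
    """One pass over a batch: clip each per-example gradient, and add
    Gaussian DP noise only for instances deemed influential."""
    for x in batch:
        g = grad_fn(w, x)
        g = g / max(1.0, np.linalg.norm(g) / clip)      # per-example clipping
        if influence_fn(x) > tau:                       # noise only when needed
            g = g + rng.normal(0.0, sigma * clip, size=g.shape)
        w = w - lr * g
    return w

# Toy usage: scalar least squares y ~ w*x; the influence proxy is the
# residual magnitude under the true slope 2.0 (a crude stand-in for IF).
data = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1)]
grad = lambda w, p: np.array([2.0 * (w[0] * p[0] - p[1]) * p[0]])
infl = lambda p: abs(p[1] - 2.0 * p[0])
print(pidp_sgd_step(np.zeros(1), grad, data, infl))
```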
Learning shape-preserving autoencoder for the reconstruction of functional data from noisy observations
ABSTRACT. We propose a new autoencoder preserving the general shape (monotonicity, convexity) of input functions after their reconstruction, without imposing a priori constraints. These properties are inherent to the coefficients of the Bernstein-Durrmeyer polynomials that serve here as theoretical descriptors. Their estimates, computed from noisy observations by the coder, play the role of latent variables. The approach is purely nonparametric, i.e., no prior finite-dimensional model is assumed. The answer to the question of how many latent variables should be used for an acceptable reconstruction accuracy of a family of functional data is inferred from learning, based on the proposed approximation of the Akaike Information Criterion. A distinguishing feature of this autoencoder is that the coder and decoder are designed as precomputed and stored matrices with Bernstein polynomial entries. Thus, after selecting the number of latent variables, the autoencoder has low computational complexity, since it is linear with respect to the observations. The proposed computational algorithms are tested on both synthetic and real data arising in mechanical engineering, where control of damping vibrations is necessary.
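Since the coder and decoder are linear, precomputed matrices of Bernstein polynomial values, the whole pipeline can be sketched in a few lines of NumPy. Below, a least-squares Bernstein fit stands in for the Bernstein-Durrmeyer coefficient estimator, and the grid, degree, and test function are illustrative assumptions.

```python
import numpy as np
from math import comb

def bernstein_matrix(ts, n):
    """B[j, i] = C(n, i) * t_j^i * (1 - t_j)^(n - i): basis values at the grid."""
    return np.array([[comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)]
                     for t in ts])

# Precomputed, stored matrices: the autoencoder is linear in the observations.
ts = np.linspace(0.0, 1.0, 101)     # observation grid on [0, 1]
n = 8                               # n + 1 latent variables (coefficients)
B = bernstein_matrix(ts, n)         # decoder matrix
E = np.linalg.pinv(B)               # encoder matrix (least-squares estimator)

rng = np.random.default_rng(0)
f = lambda t: t**2                  # convex, monotone test function
y = f(ts) + rng.normal(0.0, 0.05, ts.size)   # noisy observations
latent = E @ y                      # estimated coefficients = latent variables
y_hat = B @ latent                  # reconstruction in the Bernstein basis
print(f"max reconstruction error: {np.max(np.abs(y_hat - f(ts))):.3f}")
```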
Time Series Anomaly Detection Using SAX and Matrix Profiles Based on Longest Common Subsequence
ABSTRACT. Similarity search is one of the most popular techniques for time series anomaly detection. This study proposes SAX-MP, a novel similarity search approach that combines Symbolic Aggregate ApproXimation (SAX) and the matrix profile (MP). The proposed SAX-MP method consists of two phases. The SAX method is used in the first phase to extract all of the subsequences of a time series, convert them to symbolic strings and store these strings in an array. In the second phase, the proposed method calculates the MP based on the symbolic strings representing all subsequences extracted in the first phase. Since a subsequence is represented by a symbolic string, the MP is calculated using a distance based on the longest common subsequence rather than the z-normalized Euclidean distance. Top-k discords are detected based on the similarity MP. The proposed SAX-MP is implemented on several time series datasets. Experimental results reveal that the SAX-MP method is particularly effective at detecting anomalies when compared to HOT SAX and MP-based methods.
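A compact sketch of the two phases in Python: subsequences are SAX-symbolized, a matrix profile is built from an LCS-based distance between the symbolic words, and the top-k discords are the subsequences with the largest profile values. Window length, segment count, alphabet, and the exact distance normalization are illustrative assumptions rather than the paper's tuned settings.

```python
import numpy as np

def sax_word(sub, n_seg=4, alphabet='abcd'):
    """Phase 1: z-normalize a subsequence, apply PAA, map segments to symbols
    using the standard Gaussian breakpoints for a 4-letter alphabet."""
    sub = (sub - sub.mean()) / (sub.std() + 1e-12)
    breakpoints = np.array([-0.67, 0.0, 0.67])
    return ''.join(alphabet[np.searchsorted(breakpoints, seg.mean())]
                   for seg in np.array_split(sub, n_seg))

def lcs_len(a, b):
    """Classic dynamic-programming longest common subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if ca == cb else max(dp[i][j + 1],
                                                                 dp[i + 1][j])
    return dp[-1][-1]

def top_k_discords(ts, m=16, k=3):
    """Phase 2: matrix profile over SAX words with an LCS-based distance;
    discords are the subsequences farthest from their nearest neighbour."""
    words = [sax_word(ts[i:i + m]) for i in range(len(ts) - m + 1)]
    mp = np.full(len(words), np.inf)
    for i in range(len(words)):
        for j in range(len(words)):
            if abs(i - j) >= m:                     # exclude trivial matches
                d = 1.0 - lcs_len(words[i], words[j]) / len(words[i])
                mp[i] = min(mp[i], d)
    return np.argsort(mp)[-k:]

rng = np.random.default_rng(1)
ts = np.sin(np.linspace(0, 12 * np.pi, 300)) + 0.05 * rng.normal(size=300)
ts[150:166] += 2.0                                  # injected anomaly
print(top_k_discords(ts))                           # indices near the anomaly
```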
The idea of a student research project as a method of preparing a student for professional and scientific work
ABSTRACT. In the paper we present the idea and implementation of a student research project course within the master's program at the Faculty of Electronics, Telecommunications and Informatics, Gdańsk Tech. It aims at preparing students for performing research and scientific tasks in future professional work. We outline the evolution from group projects to research projects and the current deployment of both at the bachelor's and master's levels respectively, as well as the management of projects, i.e., steps, reporting and monitoring at both the faculty and individual project levels within our custom-built Research Project System (RPS). We further elaborate on the adopted formal settings and agreements, especially considering the possibility of external clients taking part in the projects. The methodology of conducting projects and several examples of awarded projects are presented, along with statistics on the number of submitted/conducted projects, as well as those finalized with submitted/published research papers/patents, proving the actual (inter)national impact of the course.
Empirical studies of students' behaviour using Scottie-Go block tools to develop problem-solving experience
ABSTRACT. This research introduces and evaluates a new method supporting the growth of programming skills, computational thinking and the development of a problem-solving approach, which evolutionarily introduces good programming practices and paradigms through block-based programming. The proposed approach utilizes the problem-based Scottie-Go game, followed by the Scratch programming environment, embedded in a Python programming course to improve learners' programming skills and sustain motivation for further discovery of computational problem-solving activity. To date, little practical work has been devoted to examining the relationship between beginner development environments and the development practices they stimulate in their users. This article tries to shed light on this aspect of learning programming by carefully examining the behaviour of novice programmers using the innovative block-based programming learning method.
Symbolic calculation behind floating-point arithmetic: didactic examples and experiment using CAS
ABSTRACT. Floating-point arithmetic (FP) is taught at universities in the framework of different academic courses, for example Numerical Analysis, Computer Architecture or Operating Systems. In this paper we present some simple "pathological" and "non-pathological" examples, comparing them with the symbolic calculations that lie behind double-precision FP calculations. The CAS programs Mathematica and wxMaxima are used for the calculations. We explain, by making calculations directly from the definition of double precision, why the presented examples yield such final results. We present a didactic experiment for students of the Informatics Faculty of the Warsaw University of Life Sciences within the course of Numerical Methods. The didactic approach presented in this article is multidisciplinary - it combines theoretical issues in the field of numerical analysis with the use of practical CAS computing tools.
Code Semantics Learning with Deep Neural Networks: an AI-based Approach for Programming Education
ABSTRACT. Modern programming languages are very complex, diverse, and non-uniform in their structure, code composition, and syntax. Therefore, it is a difficult task for computer science students to retrieve relevant code snippets from large code repositories according to their programming course requirements. To solve this problem, an AI-based approach is proposed for students to better understand and learn code semantics, with solutions for real-world coding exercises. First, a large number of solutions are collected from a course titled "Algorithms and Data Structures" and preprocessed by removing unnecessary elements. Second, the solution code is converted into a sequence of words and tokenized. Third, the sequence of tokens is used to train and validate the model, through a word embedding layer. Finally, the model is used for the relevant code retrieval and classification task for the students. In this study, a bidirectional long short-term memory neural network (BiLSTM) is used as the core deep neural network model. For the experiment, approximately 120,000 real-world solutions from three datasets are used. The trained model achieved an average precision, recall, F1 score, and accuracy of 94.35%, 94.71%, 94.45%, and 95.97%, respectively, for the code classification task. These results show that the proposed approach has potential for use in programming education.
Constrained aerodynamic shape optimization using neural networks and sequential sampling
ABSTRACT. Aerodynamic shape optimization (ASO) involves computational fluid dynamics (CFD)-based search for an optimal aerodynamic shape, such as airfoils and wings. Gradient-based optimization (GBO) with adjoints can be used efficiently to solve ASO problems with many design variables, but problems with many constraints can still be challenging. The recently created efficient global optimization algorithm with neural network (NN)-based prediction and uncertainty (EGONN) partially alleviates this challenge. A unique feature of EGONN is its ability to sequentially sample the design space and continuously update the NN prediction using an uncertainty model based on NNs. This work proposes a novel extension to EGONN that enables efficient handling of nonlinear constraints and a continuous update of the prediction and prediction uncertainty data sets. The proposed algorithm is demonstrated on constrained airfoil shape optimization in transonic flow and compared against state-of-the-art GBO with adjoints. The results show that the proposed constrained EGONN algorithm yields optimal designs comparable to GBO at a similar computational cost.
Outlier detection under False Omission Rate control
ABSTRACT. We argue that in many practical situations, control of the False Omission Rate (FOR) or the Bayesian False Omission Rate (BFOR) is of primary importance. We develop and investigate such a rule in the context of outlier detection, and propose its empirical formulation for practical use. We consider several score statistics used to detect outliers and study how well the introduced method controls FOR in practice. It is shown by analysis of several datasets that FOR control, in contrast to FDR control, is inherently tied to the performance of the score statistic employed on both inlier and outlier data sets.
Symbolic-Numeric Computation in Modeling the Dynamics of the Many-Body System TRAPPIST
ABSTRACT. Modeling the dynamics of the exoplanetary system TRAPPIST, with seven bodies of variable mass moving around a central parent star along quasi-elliptic orbits, is discussed. The bodies are assumed to be spherically symmetric and to attract each other according to Newton's law of gravitation. In this case, the leading factor of the dynamical evolution of the system is the variability of the masses of all bodies. The problem is analyzed in the framework of the Hamiltonian formalism, and the differential equations of motion of the bodies are derived in terms of the osculating elements of aperiodic motion on quasi-conic sections. These equations can be solved only numerically, but their right-hand sides contain many oscillating terms, and so it is very difficult to obtain their solutions over long time intervals with the necessary precision. To simplify calculations and to analyze the behavior of orbital parameters over long time intervals, we replace the perturbing functions by their secular parts and obtain a system of evolutionary equations composed of 28 non-autonomous first-order linear differential equations. Choosing some realistic laws of mass variation and physical parameters corresponding to the exoplanetary system TRAPPIST, we found numerical solutions of the evolutionary equations.
All the relevant symbolic and numeric calculations are performed with the aid of the computer algebra system Wolfram Mathematica.
Simulation-Based Optimisation Model as an Element of a Digital Twin Concept for Supply Chain Inventory Control
ABSTRACT. Supply chain management is a critical success factor for many manufacturing companies. During the pandemic period, the problem of meeting deliveries on time according to customer needs intensified in many companies around the world. Companies would like to keep inventories at a level that ensures smooth order fulfillment while minimising their own costs. Determining the optimal parameters of a Supply Chain (SC) inventory policy is, however, a major challenge. Combining simulation methods with optimisation techniques offers a methodology for obtaining an acceptable solution and, at the same time, provides a high degree of flexibility in the formulation of assumptions and the possibility of improving the decision-making process with respect to risk management. In this paper we present a simulation-based optimisation model to improve the quality of inventory management decisions in SC design and planning. Finally, we refer to the benefits of implementing the model in the concept of digital twins.
Real-time reconstruction of complex flow in nanoporous media: linear vs non-linear decoding
ABSTRACT. Physical field reconstruction from limited real-time data is a topical inverse problem that attracts substantial research effort, and complex geometries present a formidable challenge. The paper describes the reconstruction of the velocity field of a steady fluid flow through a two-dimensional porous structure from real-time gauge readings (that is, velocity values obtained at specific fixed locations). The dataset is composed of 300 Lattice-Boltzmann simulations of the flow with different boundary conditions. The number of gauges and their locations are varied. Two reconstruction techniques are applied: a neural network and a linear least squares solver. The linear solver outperforms the NN in terms of both speed and precision. Sensor locations are optimized by a Monte Carlo method: the porous structure is mapped onto a graph and the optimization is performed by Metropolis-type node-to-node trial displacements of the gauges. With 100 gauges, the linear method enables reconstruction of the velocity field in a porous structure discretized on a 256 x 256 2D grid with a normalized error of 0.57%.
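The linear decoding step can be sketched as follows: build a low-rank spatial basis from the snapshot library, then solve a small least-squares system for the basis coefficients given the gauge readings. The synthetic low-rank "fields", grid size, mode counts, and gauge count below are illustrative assumptions standing in for the Lattice-Boltzmann snapshots.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_pts, n_modes = 300, 64 * 64, 30
modes_true = rng.normal(size=(n_modes, n_pts))
X = rng.normal(size=(n_sims, n_modes)) @ modes_true   # synthetic "flow fields"

# Build a low-rank linear basis from the snapshots (POD / truncated SVD).
mean = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
r = 50
basis = Vt[:r]                                        # r spatial modes

gauges = rng.choice(n_pts, size=100, replace=False)   # sensor locations
true_field = X[0]
readings = true_field[gauges]                         # real-time gauge data

# Linear decoding: least squares for mode coefficients given gauge readings.
A = basis[:, gauges].T                                # (n_gauges, r)
coef, *_ = np.linalg.lstsq(A, readings - mean[gauges], rcond=None)
field_hat = mean + coef @ basis
err = np.linalg.norm(field_hat - true_field) / np.linalg.norm(true_field)
print(f"relative error: {err:.2e}")
```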
ABSTRACT. This paper describes the design and implementation of parallel neural networks (PNNs) with the novel programming language Golang.
In our approach, we follow the classical Single-Program Multiple-Data (SPMD) model, where a PNN is composed of several sequential neural networks, each trained on a proportional share of the training dataset.
For this purpose, we use the MNIST dataset, which contains binary images of handwritten digits. Our analysis focuses on different activation functions and optimizations in the form of stochastic gradients and the initialization of weights and biases. We conduct a thorough performance analysis, in which network configurations and different performance factors are analyzed and interpreted.
Golang and its inherent parallelization support proved well suited for parallel neural network simulation, with considerably decreased processing times compared to the sequential variants.
An analysis of Universal Differential Equations for data-driven discovery of Ordinary Differential Equations
ABSTRACT. In the last decade, the scientific community has devoted its attention to the deployment of data-driven approaches in scientific research to provide accurate and reliable analyses of a plethora of phenomena.
Most notably, Physics-informed Neural Networks and, more recently, Universal Differential Equations (UDEs) proved to be effective both in system integration and identification. However, there is a lack of an in-depth analysis of the proposed techniques. In this work, we make a contribution by testing the UDE framework in the context of Ordinary Differential Equations (ODEs) discovery. In our analysis, performed on two case studies, we highlight some of the issues arising when combining data-driven approaches and numerical solvers, and we investigate the importance of the data collection process. We believe that our analysis represents a significant contribution in investigating the capabilities and limitations of Physics-informed Machine Learning frameworks.
Non-local Neural closure models of partial differential equations
ABSTRACT. There is a growing interest in developing hybrid representations that combine a physical model and a machine learning component. In this work, we leverage recent advances in neural networks to derive new subgrid-scale models for Computational Fluid Dynamics (CFD) applications. We show the relevance of the proposed framework on a simple case study.
Battery voltage response prediction with physics-informed machine learning
ABSTRACT. In this work, we investigate the problem of predicting the dynamical response of a battery cell’s voltage given the input current and temperature profiles. This is a challenging task due to the nonlinear nature of the battery dynamics, yet also an important one, as it constitutes an intermediate step towards estimating, e.g., the battery state of charge. Two widely used strategies to address this challenge are empirical modeling and data-driven modeling. An empirical modeling approach, such as the equivalent circuit model (ECM), approximates the battery dynamics by defining electrical-circuit analogs and is highly efficient to simulate. However, the ECM usually contains unknown parameters, and not all characteristics of the battery dynamics can be captured. On the other hand, the data-driven approach can directly mine physical relations from the battery measurement data, assuming that the training data is diverse enough. Unfortunately, it is hard to verify this assumption in practice, making data-driven modeling prone to overfitting and leading to non-physical results.
The present study aims to fully exploit the respective strengths, while avoiding the weaknesses of the two aforementioned methods by adopting a physics-informed machine learning (PI-ML) technique. Specifically, we employed physics-informed Neural ODE [1] to predict the battery voltage response. This approach directly approximates the derivative of the voltage evolution and provides a flexible framework to allow having separate physics-informed and data-driven kernels, where the former represents the known formal relations (i.e., ECM), and the latter explicitly captures the discrepancy between the modeled relations and the reality. In the current study, we followed the modeling strategy proposed by Nascimento et al. [2], where we integrate the ordinary differential equations of ECM via a customized recurrent neural network.
To facilitate machine learning, we have collected lab testing data of a new lithium-ion cell under realistic load cycles. Our results indicate that the learned data-driven kernel, which is represented as a feed-forward neural network, can improve the ECM-described dynamics. As a result, the obtained voltage predictions were more accurate than those of the original ECM model.
Despite the higher prediction accuracy achieved by the learned data-driven kernel of the Neural ODE, it is still a black box in nature. To facilitate interpretation, we further employed the Sparse Identification of Nonlinear Dynamics (SINDy) approach [3] to distill the physical knowledge captured by the learned data-driven kernel. We managed to approximate the trained data-driven kernel with a simple linear equation, yielding a fully transparent, closed-form governing equation that describes the battery dynamics. Our results showed that, despite using a linear approximation, the performance drop from the ECM with the full data-driven kernel is insignificant. Our observations indicate that SINDy can extract most of the physical knowledge encapsulated by the trained data-driven kernel, and potentially even contributes to mitigating the slight overfitting of the data-driven kernel, thus promoting high predictive capability in long-term forecasting.
In summary, in the current work we demonstrated the effectiveness of leveraging physics-informed Neural ODE that combines prior physical knowledge and insights from data to achieve more accurate prediction of the battery voltage response. In addition, we also showed that the SINDy approach is invaluable as an explanation tool, as it promotes transparency and potentially leads to a more generalizable model.
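The SINDy distillation step described above reduces, at its core, to sequentially thresholded least squares over a library of candidate terms. A minimal sketch in Python; the two-variable toy system and candidate library are illustrative stand-ins for the trained data-driven kernel, not the paper's identified battery equation.

```python
import numpy as np

def sindy(X, dX, library, names, lam=0.1, n_iter=10):
    """Sequentially thresholded least squares (the core of SINDy):
    fit dX ~ Theta(X) @ xi, repeatedly zeroing small coefficients."""
    Theta = np.column_stack([f(X) for f in library])
    xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < lam
        xi[small] = 0.0
        big = ~small
        if big.any():          # refit the surviving terms only
            xi[big] = np.linalg.lstsq(Theta[:, big], dX, rcond=None)[0]
    return {n: c for n, c in zip(names, xi) if c != 0.0}

# Toy example: recover dv/dt = -0.5 v + 1.2 i from noisy samples
# (stand-ins for the voltage and current variables of the battery model).
rng = np.random.default_rng(0)
v, i = rng.normal(size=500), rng.normal(size=500)
dv = -0.5 * v + 1.2 * i + 0.01 * rng.normal(size=500)
X = np.column_stack([v, i])
lib = [lambda X: np.ones(len(X)), lambda X: X[:, 0], lambda X: X[:, 1],
       lambda X: X[:, 0] * X[:, 1]]
print(sindy(X, dv, lib, ['1', 'v', 'i', 'v*i']))
```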
Physics-Informed Long Short-Term Memory for Forecasting and Reconstruction of Chaos
ABSTRACT. We present the Physics-Informed Long Short-Term Memory (PI-LSTM) network to reconstruct and predict the evolution of unmeasured variables in a chaotic system. The training is constrained by a regularization term, which penalizes solutions that violate the system’s governing equations. The network is showcased on the Lorenz-96 model, a prototypical chaotic dynamical system, for a varying number of variables to reconstruct. First, we show the PI-LSTM architecture and explain how to constrain the differential equations, which is a nontrivial task in LSTMs. Second, the PI-LSTM is numerically evaluated in the long-term autonomous evolution to study its ergodic properties.
We show that it correctly predicts the statistics of the unmeasured variables, which cannot be achieved without the physical constraint. Third, we compute the Lyapunov exponents of the network to infer the key stability properties of the chaotic system. For reconstruction purposes, adding the physics-informed loss qualitatively enhances the dynamical behaviour of the network, compared to a data-driven only training. This is quantified by the agreement of the Lyapunov exponents. This work opens up new opportunities for state reconstruction and learning of the dynamics of nonlinear systems.
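The physics-informed regularization can be made concrete for the Lorenz-96 system: compare finite-difference time derivatives of a predicted trajectory against the governing right-hand side and penalize the mean squared residual. The finite-difference scheme, the forcing value, and the random stand-in trajectory below are illustrative assumptions, not the paper's exact constraint formulation for the LSTM.

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    """Lorenz-96 tendencies: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F."""
    return (np.roll(x, -1, -1) - np.roll(x, 2, -1)) * np.roll(x, 1, -1) - x + F

def physics_loss(traj, dt):
    """Penalize violation of the governing equations along a trajectory:
    finite-difference time derivatives vs. the Lorenz-96 right-hand side."""
    dxdt_fd = (traj[1:] - traj[:-1]) / dt            # (T-1, N) finite differences
    residual = dxdt_fd - lorenz96_rhs(traj[:-1])
    return np.mean(residual ** 2)

# Usage: total loss = data misfit on measured variables + lambda * physics_loss.
rng = np.random.default_rng(0)
traj = rng.normal(size=(100, 40))                    # stand-in for network output
print(physics_loss(traj, dt=0.01))
```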
Fixed-Budget Online Adaptive Learning for Physics-Informed Neural Networks. Towards Parameterized Problem Inference.
ABSTRACT. Physics-Informed Neural Networks (PINNs) have gained much attention in various fields of engineering thanks to their capability of incorporating physical laws into the models. PINNs integrate the physical constraints by minimizing the partial differential equation (PDE) residuals on a set of collocation points. The distribution of these collocation points has a significant impact on the performance of PINNs, and the assessment of sampling methods for these points is still an active topic. In this paper, we propose a Fixed-Budget Online Adaptive Learning (FBOAL) method, which decomposes the domain into sub-domains and adapts the collocation points based on local maxima and local minima of the PDE residuals. The effectiveness of FBOAL is demonstrated for non-parameterized and parameterized problems, and a comparison with other adaptive sampling methods is also illustrated. The numerical results demonstrate important gains in terms of accuracy and computational cost of PINNs with FBOAL over classical PINNs with non-adaptive collocation points. We also apply FBOAL in a complex industrial application involving coupling between mechanical and thermal fields. We show that FBOAL is able to identify the high-gradient locations and even gives better predictions for some physical fields than classical PINNs with collocation points sampled on a pre-adapted finite element mesh built using expert numerical knowledge. From the present study, it is expected that the use of FBOAL will help to improve conventional numerical solvers in the construction of the mesh.
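A simplified 1D sketch of the fixed-budget resampling idea: the domain is split into sub-domains, and each sub-domain's share of the budget is moved to where the magnitude of the PDE residual peaks. Using |residual| extremes on a dense candidate grid is an illustrative simplification of the paper's local maxima/minima criterion.

```python
import numpy as np

def fboal_resample(residual_fn, bounds, n_sub, pts_per_sub, n_cand=200):
    """Illustrative fixed-budget resampling in 1D: split the domain into
    sub-domains and place the budget at the largest |PDE residual| in each."""
    lo, hi = bounds
    edges = np.linspace(lo, hi, n_sub + 1)
    new_pts = []
    for a, b in zip(edges[:-1], edges[1:]):
        cand = np.linspace(a, b, n_cand)             # dense candidates per sub-domain
        order = np.argsort(np.abs(residual_fn(cand)))
        new_pts.append(cand[order[-pts_per_sub:]])   # local residual extremes
    return np.concatenate(new_pts)                   # budget = n_sub * pts_per_sub

# Toy residual with a sharp feature near x = 0.7: points concentrate there
# within its sub-domain while every sub-domain keeps its share of the budget.
res = lambda x: np.exp(-((x - 0.7) / 0.02) ** 2)
print(fboal_resample(res, (0.0, 1.0), n_sub=5, pts_per_sub=4))
```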
Champion Recommendation in League of Legends Using Machine Learning
ABSTRACT. League of Legends (LoL) is a Multiplayer Online Battle Arena (MOBA) game with over 160 champions and a competitive esports scene. The ban-and-pick system allows players to choose champions before a match, which can greatly impact the outcome, especially in professional leagues. This article presents an overview of the development of a champion recommendation system for League of Legends, with a focus on the evaluation and comparison of multiple machine learning models. The system offers real-time recommendations during the pick and ban phase, utilizing data from professional and high-level games. The accuracy and performance of the various models are analyzed and presented, providing insights into the strengths and weaknesses of each approach in solving the task of champion recommendation. Results show that the player's statistics on a champion are the biggest determinants of a game's outcome.
A Method of Social Context Enhanced User Preferences for Conversational Recommender Systems
ABSTRACT. Conversational recommender systems (CRS) can dynamically capture users' fine-grained preferences by directly asking whether a user likes an attribute or not. However, as in traditional recommender systems, accurately comprehending users' preferences remains a critical challenge for CRS in making effective conversation policy decisions. While various efforts have been made to improve the performance of CRS, they have neglected the impact of users' social context, which has been proven valuable in modeling user preferences and enhancing the performance of recommender systems. In this paper, we propose a social-enhanced user preference estimation model (SocialCRS) to leverage the social context of users to better learn user embedding representations. Specifically, we construct a user-item-attribute heterogeneous graph and apply a graph convolutional network (GCN) to learn the embeddings of users, items, and attributes. Another GCN is used on the user social context graph to learn the social embedding of users. To better estimate user preferences, an attention mechanism is adopted to aggregate the embeddings of the user's friends. By aggregating these users' embeddings, we obtain social-enhanced user preferences. Through extensive experiments on two public benchmark datasets in a multi-round conversational recommendation scenario, we demonstrate the effectiveness of our model, which significantly outperforms state-of-the-art CRS methods.
Forest Image Classification Based on Deep Learning and the XGBoost Algorithm
ABSTRACT. The main aim of this research paper is to present results obtained from an ensembled approach that employed deep learning and machine learning techniques for forest image classification. Deep learning and machine learning methods have recently been used in forest classification problems, and have shown significant improvement in terms of efficacy. However, as reported in the literature, they suffer from insufficient model variance and restricted generalization capabilities. This study proposes an ensemble approach combining a deep learning technique (ResNet50 in particular) and a machine learning model (specifically XGBoost) to increase the prediction capability of classifying satellite forest images. The sole purpose of ResNet50 is to generate a set of features, which are in turn used by the XGBoost algorithm to perform the classification process. The XGBoost algorithm was compared against other classifiers such as random forest (RF) and light gradient boosting machine (LGBM). The best classification results were obtained by XGBoost (0.77), followed by RF (0.74) and lastly LGBM (0.73).
Cerebral vessel segmentation in CE-MR images using deep learning and synthetic training datasets
ABSTRACT. This paper presents a novel architecture of a convolutional neural network designed for the segmentation of intracranial arteries in contrast-enhanced magnetic resonance angiography (CE-MRA). The proposed architecture is based on the V-Net model, albeit with substantial modifications in the bottleneck and the decoder part. In order to leverage the multiscale characteristics of the input vessel patterns, we propose passing the network embeddings generated at the encoder path output through an atrous spatial pyramid pooling (ASPP) block. We argue that this mechanism allows the decoder part to rebuild the segmentation mask based on local features determined at various ranges of voxel neighborhoods. The ASPP outputs are aggregated using a simple gated recurrent unit, which additionally facilitates learning the relevance of feature maps with respect to the final output. We also propose to enrich the global context information provided to the decoder by including a vessel-enhancement block responsible for filtering out background tissues. In this study, we also aimed to verify whether it is possible to train an effective deep-learning vessel segmentation model based solely on synthetic data. For that purpose, we reconstructed 30 realistic cerebral arterial tree models and used our previously developed MRA simulation framework.
The role of conformity in opinion dynamics modelling with multiple social circles
ABSTRACT. Interaction with others influences our opinions and behaviours. Our activities within various social circles lead to different opinions expressed in various situations, groups, and ways of communication. Earlier studies on agent-based modelling of conformism within networks were based on a single-layer approach. In contrast, in this work we propose a model incorporating conformism in which a person can share different continuous opinions on different layers depending on the social circle. Afterwards, we extend the model with further components that are known to influence opinions, e.g. authority or openness to new views. These two models are then compared to show that only pure conformism leads to opinion convergence.
CA-based Collective Behavior Approach to Solve Coverage and Lifetime Optimization Problems in Wireless Sensor Networks
ABSTRACT. We propose a multi-agent approach based on Cellular Automata (CA) participating in iterated spatial Prisoner's Dilemma games to solve, in a distributed way, the problem of lifetime optimization in Wireless Sensor Networks (WSN). Nodes of a WSN graph, created for a given deployment of the WSN in the monitored area, are considered as agents of a multi-agent system that collectively take decisions to turn their batteries on or off. A local, agent-player payoff function incorporates issues of area coverage and sensor energy spending. The payoffs of agent-players depend not only on their personal decisions but also on their neighbors' decisions. Agents act to maximize their own rewards, which results in their reaching a solution corresponding to the Nash equilibrium. We show that the system is self-optimizing, i.e. it can optimize global criteria not known to the players, related to the WSN, provide a balance between requested coverage and energy spending, and expand the WSN lifetime. The solutions proposed by the multi-agent system fulfill the Pareto optimality principles, and the desired quality of solutions can be controlled by user-defined parameters. The proposed approach is validated by a number of experimental results.
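The flavour of such self-optimizing local dynamics can be conveyed with a deliberately simplified best-response sketch (not the paper's Prisoner's Dilemma payoff matrix or CA rules): each sensor keeps its battery on only when no neighbour covers it, trading an assumed coverage reward against an assumed energy cost.

```python
import numpy as np

rng = np.random.default_rng(0)
side, energy_cost = 20, 0.6
state = rng.integers(0, 2, size=(side, side))     # 1 = sensor battery on

def n_on_neighbors(s, i, j):
    """Active sensors in the von Neumann neighbourhood (torus)."""
    return (s[(i - 1) % side, j] + s[(i + 1) % side, j]
            + s[i, (j - 1) % side] + s[i, (j + 1) % side])

# Asynchronous best-response dynamics with a toy payoff: an agent's cell is
# covered if it or a neighbour is on; keeping its own battery on costs energy.
cells = [(i, j) for i in range(side) for j in range(side)]
for _ in range(30):
    for i, j in rng.permutation(cells):
        p_on = 1.0 - energy_cost                  # covered, pays energy cost
        p_off = 1.0 if n_on_neighbors(state, i, j) > 0 else 0.0
        state[i, j] = 1 if p_on > p_off else 0

covered = all(state[i, j] or n_on_neighbors(state, i, j) for i, j in cells)
print(f"{state.mean():.0%} of sensors on, full coverage: {covered}")
```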
From Online Behaviours to Images: A Novel Approach to Social Bot Detection
ABSTRACT. Online Social Networks have revolutionized how we consume and share information, but they have also led to a proliferation of content that is not always reliable and accurate. One particular type of social account is known to promote disreputable content, hyperpartisan views, and propagandistic information: automated accounts, commonly called bots. Focusing on Twitter accounts, we propose a novel approach to bot detection: we first propose a new algorithm that transforms the sequence of actions that an account performs into an image; then, we leverage the strength of Convolutional Neural Networks to proceed with image classification. We compare our performance with state-of-the-art results for bot detection on genuine account / bot account datasets well known in the literature. The results confirm the effectiveness of the proposal, as the detection capability is on par with the state of the art, and better in some cases.
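One way such a behaviours-to-image transformation can look (an illustrative encoding, not the paper's algorithm): assign each action type an intensity and write the sequence into a fixed-size grayscale matrix that a CNN can then classify.

```python
import numpy as np

def actions_to_image(actions, vocab, side=32):
    """Illustrative encoding of an account's action sequence as a grayscale
    image: each action type gets an intensity level, and the sequence fills
    the image row by row (truncated or zero-padded to side*side actions)."""
    level = {a: (i + 1) / len(vocab) for i, a in enumerate(vocab)}
    px = np.zeros(side * side)
    for i, a in enumerate(actions[:side * side]):
        px[i] = level.get(a, 0.0)
    return px.reshape(side, side)      # ready for a CNN classifier

img = actions_to_image(['tweet', 'retweet', 'like'] * 400,
                       vocab=['tweet', 'retweet', 'like', 'reply'])
print(img.shape, img[0, :6])
```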
Combining Outlierness Scores and Feature Extraction Techniques for Improvement of OoD and Adversarial Attacks Detection in DNNs
ABSTRACT. Out-of-distribution (OoD) detection is one of the challenges for deep networks used for image recognition. Although recent works have proposed several state-of-the-art methods of OoD detection, no clear recommendation exists as to which of the methods is inherently best. Our studies and recent works suggest that there is no universally best OoD detector, as performance depends on the in-distribution (ID) and OoD benchmark datasets.
This leaves ML practitioners with a difficult problem: which OoD methods should be used in real-life applications, where limited knowledge is available on the structure of ID and OoD data? To deal with this problem, we propose a novel, ensemble-based OoD detector that combines outlierness scores from different categories: prediction score-based, (Mahalanobis) distance-based, and density-based.
We show that our method consistently outperforms individual SoTA algorithms in the tasks of (i) detecting OoD samples and (ii) detecting adversarial examples generated by a variety of attacks (including CW, DeepFool, FGSM, OnePixel, etc.). Adversarial attacks commonly rely on the specific technique of CNN feature extraction (GAP – global average pooling). We found that the detection of adversarial examples as OoD improves significantly if we also ensemble over different feature extraction methods (such as GAP, cross-dimensional weighting (CroW), and layer-concatenated GAP).
Our method can be readily applied with popular DNN architectures and does not require additional representation retraining for OoD detection.
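A minimal sketch of the score-combination idea: calibrate each detector's raw outlierness score on held-out in-distribution data by mapping it to an empirical ID percentile, then average the calibrated scores. The detector names, calibration sizes, and the assumption that larger raw scores mean "more OoD" are illustrative, not the paper's exact ensembling rule.

```python
import numpy as np

def ensemble_ood_score(scores_id_cal, scores_test):
    """Combine heterogeneous outlierness scores (e.g. softmax-, Mahalanobis-,
    and density-based) by calibrating each on held-out ID data: map raw
    scores to empirical ID percentiles, then average across detectors."""
    combined = np.zeros(len(next(iter(scores_test.values()))))
    for name, cal in scores_id_cal.items():
        # Fraction of calibration ID scores below each test score: in [0, 1].
        pct = np.searchsorted(np.sort(cal), scores_test[name]) / len(cal)
        combined += pct
    return combined / len(scores_id_cal)

rng = np.random.default_rng(0)
cal = {'msp': rng.normal(0, 1, 1000), 'maha': rng.normal(5, 2, 1000)}
test = {'msp': np.array([0.1, 3.0]), 'maha': np.array([4.0, 12.0])}
print(ensemble_ood_score(cal, test))   # second sample looks far more OoD
```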
Machine learning detects anomalies in OPS-SAT telemetry
ABSTRACT. Detecting anomalies in satellite telemetry data is pivotal in ensuring safe satellite operations. Although various data-driven techniques exist for determining abnormal parts of a signal, they are virtually never validated on real telemetry. Analyzing such data is challenging due to its intrinsic characteristics, as telemetry may be noisy and affected by incorrect acquisition, resulting in missing parts of the signal. In this paper, we tackle this issue and propose a machine learning approach for detecting anomalies in single-channel satellite telemetry. To validate its capabilities in a practical scenario, we build a dataset capturing nominal and anomalous telemetry data captured on board OPS-SAT - a nanosatellite launched and operated by the European Space Agency. Our extensive experimental study shows that the proposed algorithm offers high-quality anomaly detection in real-life satellite telemetry, reaching 98.4% accuracy over the unseen test set.
Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables
ABSTRACT. In this paper, we introduce a new approach to multivariate forecasting of cryptocurrency prices using a hybrid contextual model combining exponential smoothing (ES) and a recurrent neural network (RNN). The model consists of two tracks: the context track and the main track. The context track provides additional information to the main track, extracted from representative series. This information, as well as information extracted from exogenous variables, is dynamically adjusted to the individual series forecasted by the main track. The stacked RNN architecture with hierarchical dilations, incorporating recently developed attentive dilated recurrent cells, allows the model to capture short- and long-term dependencies across time series and to dynamically weight input information. The model generates both point daily forecasts and predictive intervals for one-day, one-week and four-week horizons. We apply our model to forecast the prices of 15 cryptocurrencies based on 17 input variables, and compare its performance with that of comparative models, including both statistical and ML ones.
Feature Extraction for Bankruptcy Prediction Using Autoencoder
ABSTRACT. An enterprise is characterized not only by financial data but also by other features, from macroeconomic, sectoral, social, board, and management perspectives. Depending on the availability of data, the number of features grows rapidly. Feature selection techniques find a narrow subset of the initial space, while feature extraction techniques ensure maximum information retention from the initial data space. In this work, using an autoencoder as a nonlinear feature extraction method, we reduce the feature set for efficient model creation by devising various autoencoder structure composition strategies. The first strategy compresses all the data at once, whereas the second compresses each data type separately, resulting in 12 unique autoencoders for each extracted data set. In the classification phase, eight different methods (with their modifications) are used: LG, DT, RF, XGBoost, ANN, CNN, EML, and SOM. The results show that features of different data types give better results when extracted all at once from the merged data sources. This led to the creation of a novel method of using an autoencoder, which could improve bankruptcy prediction models.
Enhanced Emotion and Sentiment Recognition for Empathetic Dialogue System Using Big Data and Deep Learning Methods
ABSTRACT. The article presents the results of work on improving sentiment and emotion recognition for Polish texts using a big data-based corpus expansion process and larger neural language models. The proposed recognition method is intended to serve in a therapeutic dialogue system, analyzing sentiment and emotion in human utterances. First, the language model is enhanced by replacing the BERT neural language model with RoBERTa. Next, the emotion-based text corpus is enlarged. A novel process of augmenting an emotion-labeled text corpus with semantically similar data from an unlabeled corpus, inspired by semi-supervised learning methods, is proposed. The process of using the Common Crawl web archive to create the enlarged corpus, named CORTEX+pCC, is presented. An empathetic dialogue system named Terabot, incorporating the elaborated method, is also described. The system is designed to employ elements of cognitive-behavioral therapy for psychiatric patients. The improved language model trained on the enlarged CORTEX+pCC corpus resulted in remarkably improved sentiment and emotion recognition. The average accuracy and F1 scores increased by around 3% and 8% relative, which will allow the dialogue system to operate more appropriately for the emotional state of the patient.
A Contrastive Self-Distillation BERT with Kernel Alignment-Based Inference
ABSTRACT. Early exit, as an effective method to accelerate pre-trained language models, has recently attracted much attention in the field of natural language processing. However, existing early exit methods are only suitable for low acceleration ratios due to two reasons: (1) The shallow classifiers in the model lack semantic information. (2) Exit decisions in the intermediate layers are unreliable. To address the above issues, we propose a Contrastive self-distillation BERT with kernel alignment-based inference (CsdBERT), which aims to let shallow classifiers learn deep semantic knowledge to make comprehensive predictions. Specifically, we classify the early exit classifiers into teachers and students based on classification loss to distinguish the representation ability of the classifiers. Firstly, we present a contrastive learning approach between teacher and student classifiers to maintain the consistency of class similarity between them. Then, we introduce a self-distillation strategy between these two kinds of classifiers to solidify learned knowledge and accumulate new knowledge. Finally, we design a kernel alignment-based exit mechanism to identify samples of different difficulty for accelerating BERT inference. Experimental results on the GLUE and ELUE benchmarks show that CsdBERT not only achieves state-of-the-art performance, but also maintains 95% performance at 4x speed.
DeBERTNeXT: A Multimodal Fake News Detection framework
ABSTRACT. There is a rapid influx of fake news nowadays, which poses an immense threat to our society. Fake news has been impacting us in several ways, including changing our thoughts, manipulating opinions, and causing chaos through misinformation. With the ease of accessing and sharing information on social media platforms, such fake news or misinformation has been spreading in different formats, including text, image, audio, and video. While there have been many approaches to detecting fake news in textual format only, multimodal approaches are less frequent, as it is difficult to fully use the information derived from different modalities to achieve high accuracy in a combined format. To tackle these issues, we introduce DeBERTNeXT, a multimodal fake news detection model that utilizes both textual and visual information from an article for fake news classification. We perform experiments on the large Fakeddit dataset and two other smaller benchmark datasets, Politifact and Gossipcop. Our model outperforms the existing models on the Fakeddit dataset by about 3.80%, on Politifact by 2.10%, and on Gossipcop by 1.00%.
Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets
ABSTRACT. We study the ability of pretrained large language models (LLMs) to answer questions from online question answering fora such as Stack Overflow. We consider question-answer pairs where the main part of the answer consists of source code. On two benchmark datasets – CoNaLa and a newly collected dataset based on Stack Overflow – we investigate how a closed-book question answering system can be improved by fine-tuning the LLM for the downstream task, prompt engineering, and data preprocessing. We use publicly available autoregressive language models such as GPT-Neo, CodeGen, and PanGu-Coder, and after the proposed fine-tuning achieve a BLEU score of 0.4432 on the CoNaLa test set, significantly exceeding the previous state of the art for this task.
Bayesian Networks for Named Entity Prediction in Programming Community Question Answering
ABSTRACT. In this study, we propose a new approach for natural language processing that uses Bayesian networks to predict and analyze context, and we show how this approach can be applied to the Community Question Answering domain. We discuss how Bayesian networks can detect semantic relationships and dependencies between entities, and how this is connected to different score-based approaches to structure learning. We compared Bayesian networks with different score metrics, such as BIC, BDeu, K2 and Chow-Liu trees. Our proposed approach outperforms the baseline model on the precision metric. We also discuss the influence of penalty terms on the structure of Bayesian networks and how they can be used to analyze the relationships between entities. In addition, we examine the visualization of directed acyclic graphs to analyze semantic relationships. The article further identifies issues with detecting certain semantic classes that are separated in the structure of directed acyclic graphs. Finally, we evaluate potential improvements for the Bayesian network approach.
A Framework for Effective Guided Mnemonic Journeys
ABSTRACT. The memory palace, also known as the memory journey, is a mnemonic technique where information to be remembered is encountered along a predetermined path through envisioned places, creating strong spatial and visual connections between the material and specific locations and vivid images. Constructing a memory palace for complex computer science concepts may prove challenging for students, as it requires the identification, selection, and organization of essential ideas from the material. This task is better suited to an expert on the curriculum, such as the instructor.
In this paper, a framework for designing and delivering Guided Mnemonic Journeys is proposed. Led by a teacher in person or via audio or video, the approach uses a virtual tour of the university campus at its core. The approach combines various mnemonic techniques to create a more comprehensive approach to memory enhancement. The instructor plans the story arc and places various mnemonic cues along the path and then guides the students through the narrative and imagery, providing context, clear structure, and instructions for each step.
A pilot study indicated that students perceived the delivered Guided Mnemonic Journeys positively, appreciating the rich didactic activity and the opportunity to learn new mnemonic techniques.
Analysis of outcomes from the gamification of a collaboration intensive course on computer networking basics
ABSTRACT. The paper contains a comparison of the results of conducting a computer networking basics course in a gamified and non-gamified way. In the analysis, we focus on students' learning outcomes, measured by exam scores, rather than on their satisfaction and other subjective metrics. The experience presented in the article was gathered during the last four years (2019-2022). The COVID-19 pandemic, which affected the way the classes were conducted in 2020 and 2021, gave us an opportunity to compare the same means of gamification used in the same course in different learning environments. The evaluation results show that gamification yielded better results when applied in an in-person environment.
Towards an Earned Value Management Didactic Simulation Tool for Engineering Management Teaching
ABSTRACT. Agile development (AD) is a methodology that many small businesses have adopted for production convenience, and educators have taken notice of the trend. A need to implement some form of agile development in undergraduate programs at universities is now clear, particularly for undergraduate engineering students who should understand their role in a project focused on AD. This paper presents our preliminary evaluation of user experience (UX) using an Earned Value Management (EVM) simulator, which helps the student understand the team member's role in an agile development process. The simulator uses a Task-board interface to display task status changes, a burn-down chart to depict the remaining work, and EVM metrics to assess the efficiency of the teamwork. Using the Task-board and EVM models, the simulator offers students different agile project management experimental experiences.
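The EVM metrics the simulator reports are the standard ones. A worked example with illustrative project numbers (not taken from the paper):

```python
# Standard Earned Value Management metrics (illustrative project numbers).
PV = 10_000.0   # Planned Value: budgeted cost of work scheduled to date
EV = 8_000.0    # Earned Value: budgeted cost of work actually performed
AC = 9_000.0    # Actual Cost of the work performed

SV = EV - PV            # Schedule Variance: -2000 -> behind schedule
CV = EV - AC            # Cost Variance: -1000 -> over budget
SPI = EV / PV           # Schedule Performance Index: 0.80
CPI = EV / AC           # Cost Performance Index: ~0.89
print(f"SV={SV}, CV={CV}, SPI={SPI:.2f}, CPI={CPI:.2f}")
```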
Efficient uncertainty quantification using sequential sampling-based neural networks
ABSTRACT. Uncertainty quantification (UQ) of an engineered system involves the identification of uncertainties, the modeling of those uncertainties, and their forward propagation through a system analysis model. In this work, a novel surrogate-based forward propagation algorithm for UQ is proposed. The algorithm is a new extension of the recent efficient global optimization using neural network (NN)-based prediction and uncertainty (EGONN) algorithm, which was created for optimization. The extended algorithm is designed specifically for UQ and is called uqEGONN. It sequentially samples and simultaneously trains two NNs, one predicting a nonlinear function and the other predicting the prediction uncertainty. The algorithm terminates when the absolute relative changes in summary statistics obtained from Monte Carlo simulations (MCS) fall below a tolerance, or when a given maximum number of sequential samples is reached. The algorithm is demonstrated on the UQ of the Ishigami function. The results show that the proposed algorithm yields results comparable to MCS on the true function, and that those results are more accurate than the results obtained using space-filling Latin hypercube sampling to train the NNs.
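A heavily simplified sketch of the sequential two-network idea follows; it is not the authors' uqEGONN code, and the network sizes, candidate pool, and tolerance are assumptions.

    # Simplified sketch: one NN predicts the function, a second predicts the
    # error magnitude, new samples go where that error is largest, and the
    # loop stops once an MCS summary statistic stabilises.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    def ishigami(x, a=7.0, b=0.1):
        return (np.sin(x[:, 0]) + a * np.sin(x[:, 1])**2
                + b * x[:, 2]**4 * np.sin(x[:, 0]))

    X = rng.uniform(-np.pi, np.pi, (20, 3))        # initial training set
    y = ishigami(X)
    X_mc = rng.uniform(-np.pi, np.pi, (20000, 3))  # fixed MCS population
    prev_mean = np.inf

    for it in range(30):
        f_net = MLPRegressor((64, 64), max_iter=3000, random_state=0).fit(X, y)
        err = np.abs(y - f_net.predict(X))         # observed residuals
        u_net = MLPRegressor((64, 64), max_iter=3000, random_state=0).fit(X, err)

        mean = f_net.predict(X_mc).mean()          # MCS summary statistic
        if abs(mean - prev_mean) / max(abs(mean), 1e-12) < 1e-3:
            break                                  # statistic has stabilised
        prev_mean = mean

        cand = rng.uniform(-np.pi, np.pi, (2000, 3))
        x_new = cand[np.argmax(u_net.predict(cand))][None, :]
        X = np.vstack([X, x_new])                  # add max-uncertainty point
        y = np.append(y, ishigami(x_new))

    print(f"stopped after {it + 1} iterations, mean ~ {mean:.3f}")

For the Ishigami function the exact mean is a/2 = 3.5, which gives a quick sanity check on the converged statistic.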
Reduction of the computational cost of a tuning methodology for a simulator of a physical system
ABSTRACT. We propose a methodology for calibrating a physical-system simulator whose computational model represents events as time series. The methodology reduces the search space of the fit parameters by exploring a database that contains stored historical events and their corresponding simulator fit parameters. We carry out a symbolic representation of the time series using ordinal patterns to classify the series, which allows us to search and compare by similarity over the stored representations. This classification strategy speeds up the parameter search, reduces the computational cost of the adjustment process, and consequently improves energy cost savings. Our experiments showed a 29% reduction in computational cost compared with the tuning methodology proposed in our previous research.
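Ordinal patterns encode each short window of a series by the permutation that sorts it, so series can be compared via their pattern histograms. A minimal sketch on synthetic data (the simulator events themselves are not reproduced here):

    # Minimal ordinal-pattern symbolisation: each length-m window is encoded
    # by its sorting permutation, and two series are compared through the
    # distance between their pattern histograms. Data here is synthetic.
    import numpy as np
    from itertools import permutations

    def ordinal_histogram(x, m=3):
        pats = list(permutations(range(m)))
        counts = dict.fromkeys(pats, 0)
        for i in range(len(x) - m + 1):
            counts[tuple(np.argsort(x[i:i + m]))] += 1
        h = np.array([counts[p] for p in pats], float)
        return h / h.sum()

    rng = np.random.default_rng(1)
    t = np.linspace(0, 10, 500)
    a = np.sin(t) + 0.1 * rng.standard_normal(500)        # "stored" event
    b = np.sin(t + 0.2) + 0.1 * rng.standard_normal(500)  # similar event
    c = rng.standard_normal(500)                          # dissimilar event

    ha, hb, hc = map(ordinal_histogram, (a, b, c))
    print("dist(a,b) =", np.abs(ha - hb).sum())  # small: similar dynamics
    print("dist(a,c) =", np.abs(ha - hc).sum())  # large: different dynamics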
ABSTRACT. A new surrogate-assisted, pruned dynamic programming-based optimal path search algorithm - studied in the context of ship weather routing - is shown to be both effective and (energy) efficient. The key elements in achieving this - the fast and accurate physics-based surrogate model, the pruned simulation, and the OpenCL-based SPMD-parallelisation of the algorithm - are presented in detail. The included results show the high accuracy of the surrogate model (relative approximation error medians smaller than 0.2%), its efficacy in terms of the computing-time reduction resulting from pruning (43 to 60 times), and the notable speedup of the parallel algorithm (up to 9.4). Combining these effects gives up to 565 times faster execution. The proposed approach can also be applied to other domains. It can be considered a dynamic programming-based, optimal path planning framework parameterised by a problem-specific (potentially variable-fidelity) cost-function evaluator.
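A toy sketch of the pruned dynamic-programming idea follows; the cost function is a stand-in for the physics-based surrogate, and the simple threshold prune (which trades optimality for speed) is only an echo of the paper's pruned simulation.

    # Toy pruned dynamic programming over a layered routing graph. 'cost'
    # stands in for the physics-based surrogate evaluator of the paper.
    import numpy as np

    rng = np.random.default_rng(2)
    STAGES, NODES = 12, 25

    def cost(s, i, j):
        # Stand-in surrogate: cost of sailing stage s from lane i to lane j.
        return 1.0 + 0.05 * abs(i - j) + 0.2 * rng.random()

    best = np.full((STAGES + 1, NODES), np.inf)
    best[0, NODES // 2] = 0.0
    evals = 0
    for s in range(STAGES):
        finite = np.isfinite(best[s])
        keep = finite & (best[s] <= best[s][finite].min() + 1.0)  # prune
        for i in np.flatnonzero(keep):
            for j in range(max(0, i - 2), min(NODES, i + 3)):  # reachable lanes
                evals += 1
                best[s + 1, j] = min(best[s + 1, j], best[s, i] + cost(s, i, j))

    print("best cost:", best[-1].min(), "| surrogate evaluations:", evals)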
Hierarchical Learning to Solve PDEs Using Physics-Informed Neural Networks
ABSTRACT. The neural network-based approach to solving partial differential equations has attracted considerable attention. In training a neural network, the network learns global features corresponding to low-frequency components, while high-frequency components are approximated at a much slower rate. For a class of equations whose solutions contain a wide range of scales, network training can suffer from slow convergence and low accuracy due to this inability to capture the high-frequency components. In this work, we propose sequential training based on a hierarchy of networks to improve the convergence rate and accuracy of the neural network solution to partial differential equations. The proposed method comprises multiple training levels, in which a newly introduced neural network is guided to learn the residual of the previous level's approximation. We validate the efficiency and robustness of the proposed hierarchical approach on a suite of partial differential equations.
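The hierarchical-residual idea can be illustrated on plain function approximation (deliberately not a full PINN): each new network is trained on the residual left by the previous level, so later levels pick up higher-frequency content. The target function and network sizes below are assumptions.

    # Sketch of hierarchical residual learning on plain regression: each
    # level fits the residual of the previous levels' combined approximation.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    x = np.linspace(0, 1, 400)[:, None]
    u = np.sin(2 * np.pi * x[:, 0]) + 0.1 * np.sin(50 * np.pi * x[:, 0])

    approx = np.zeros_like(u)
    for level in range(3):
        residual = u - approx                  # what is still unexplained
        net = MLPRegressor((128, 128), max_iter=5000, random_state=level)
        net.fit(x, residual)
        approx = approx + net.predict(x)       # add this level's correction
        print(f"level {level}: max error = {np.abs(u - approx).max():.4f}")

In the paper's setting the per-level loss is the PDE residual rather than a regression error, but the level-by-level correction structure is the same.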
Transparent Checkpointing for Automatic Differentiation of Program Loops through Expression Transformations
ABSTRACT. Automatic differentiation (AutoDiff) in machine learning is largely restricted to expressions used for neural networks (NN), with the depth rarely exceeding a few tens of layers. In stark contrast to NN, numerical simulations typically involve iterative algorithms like time steppers that lead to millions of iterations. Even for modest-sized models, this may yield infeasible memory requirements when applying the adjoint method, also called backpropagation, to time-dependent problems. In this situation, checkpointing algorithms provide a trade-off between recomputation and storage. In this paper, we present the package Checkpointing.jl, which leverages expression transformations in the programming language Julia and the package ChainRules.jl to automatically and transparently transform loop iterations into differentiated loops. The user may choose between various checkpointing algorithm schemes and storage devices. We describe the unique design of Checkpointing.jl and demonstrate its features on an automatically differentiated MPI implementation of Burgers' equation on the Polaris cluster at the Argonne Leadership Computing Facility.
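Checkpointing.jl itself is a Julia package; to keep one language across these notes, the same store-versus-recompute trade-off is sketched here with PyTorch's torch.utils.checkpoint. The time stepper is a stand-in (not Burgers' equation), and the segment sizes are arbitrary.

    # The checkpointing trade-off, sketched in PyTorch: checkpointed segments
    # of a time-stepping loop discard their intermediate states and recompute
    # them during the backward pass, bounding memory use.
    import torch
    from torch.utils.checkpoint import checkpoint

    def run_segment(u, steps=100, dt=1e-3):
        # Stand-in explicit time stepper (not Burgers' equation).
        for _ in range(steps):
            u = u + dt * torch.sin(u)
        return u

    u0 = torch.randn(1000, requires_grad=True)
    u = u0
    for seg in range(10):  # 10 segments x 100 steps = 1000 iterations
        # Only segment-boundary states are stored; inner states are
        # recomputed on the fly during backpropagation.
        u = checkpoint(run_segment, u, use_reentrant=False)

    u.sum().backward()
    print(u0.grad.shape)  # gradient w.r.t. the initial state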
Semi-supervised Learning Approach to Efficient Cut Selection in the Branch-and-Cut Framework
ABSTRACT. Mixed integer programming (MIP) is an extremely versatile subclass of mathematical optimization problems. Applications of MIP are ubiquitous today, ranging from scheduling to network design to production planning. Owing to their integrality constraints, MIP problems can be extremely difficult to solve efficiently, especially at large scales. The standard approach in state-of-the-art commercial solvers is called branch-and-cut. The branch-and-cut framework recursively reduces the solution space by splitting the original MIP problem into subproblems (branching). At each of these subproblems, cutting planes are added to further reduce the solution space (cutting). The selection of these cuts is an integral part of the branch-and-cut process, as high-quality cuts can greatly increase solving efficiency. Currently, cut selection is decided by heuristics that require expert knowledge and lack generalizability. In this paper, we propose an efficient and highly generalizable cut selection scheme based on semi-supervised learning. First, we design a cut evaluation metric that labels cuts according to whether they are efficient. Then, we train a deep learning classification model with unsupervised pre-training as a ranking function for cuts. In our evaluation, the proposed model outperforms standard heuristics and is comparable to existing machine learning approaches. Furthermore, the model is shown to generalize over both problem size and problem class.
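A minimal sketch of the classifier-as-ranking-function pattern described above; the cut features and efficiency labels are synthetic stand-ins for real branch-and-cut data, and no unsupervised pre-training stage is shown.

    # Sketch of learned cut ranking: a classifier is trained on labelled
    # cuts and its predicted probability serves as the ranking score.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(3)
    # Per-cut features, e.g. violation, objective parallelism, sparsity.
    X = rng.random((500, 3))
    y = (0.6 * X[:, 0] + 0.4 * X[:, 1]
         + 0.1 * rng.standard_normal(500) > 0.5).astype(int)

    clf = MLPClassifier((32, 32), max_iter=2000, random_state=0).fit(X, y)

    candidate_cuts = rng.random((20, 3))
    scores = clf.predict_proba(candidate_cuts)[:, 1]  # ranking score per cut
    top_k = np.argsort(scores)[::-1][:5]              # add the 5 best cuts
    print("selected cuts:", top_k)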
A Case Study of the Profit-Maximizing Multi-Vehicle Pickup and Delivery Selection Problem for the Road Networks with the Integratable Nodes
ABSTRACT. This paper studies an application-based model of the profit-maximizing multi-vehicle pickup and delivery selection problem (PPDSP). The graph-theoretic model used in existing studies of PPDSP defines nodes in terms of transport requests (i.e., each request corresponds to a pickup node and a delivery node). In practice, however, multiple requests often come from, or go to, an identical location. Considering road networks with such integratable nodes, we define a new model based on the integrated nodes for the corresponding PPDSP and propose a novel mixed-integer formulation. In comparative experiments with the existing formulation, as the number of integratable nodes increases, our method shows a clear advantage in the number of variables and constraints in the generated instances, and in the accuracy of the optimized solution obtained within a given time.
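A back-of-the-envelope illustration (not the paper's exact formulation) of why integrating co-located nodes shrinks the model: arc variables in a typical vehicle-indexed formulation grow roughly with the square of the node count, so collapsing 2n request nodes into m distinct locations pays off quickly.

    # Illustrative scaling only: with n requests, a request-based model has
    # 2n + depot nodes, while the integrated model keeps only the m distinct
    # locations; vehicle-indexed arc variables grow ~ vehicles * nodes^2.
    def arc_vars(nodes, vehicles=3):
        return vehicles * nodes * (nodes - 1)

    n_requests, m_locations = 50, 30      # many requests share locations
    print("request-based:", arc_vars(2 * n_requests + 1))  # 101 nodes -> 30300
    print("integrated:   ", arc_vars(m_locations + 1))     # 31 nodes  -> 2790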
An efficient ViT-based spatial interpolation learner for field reconstruction
ABSTRACT. For large-scale field reconstruction, Kriging has been a commonly used technique for spatial interpolation at unobserved locations. However, Kriging's effectiveness is often restricted when dealing with non-Gaussian or non-stationary real-world fields, and it can be computationally expensive. Supervised deep learning models can potentially address these limitations by capturing the underlying patterns between observations and the corresponding fields. In this study, we introduce ViTAE, a novel deep learning model that utilizes vision transformers and autoencoders and is designed specifically for large-scale and complex field reconstruction. Experimental results demonstrate the superiority of ViTAE over Kriging. In addition, ViTAE runs more than 1000 times faster than Kriging, enabling real-time field reconstruction.
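The Kriging baseline being compared against can be sketched with a Gaussian-process regressor; the field, sensor layout, and kernel below are synthetic assumptions.

    # Kriging-style spatial interpolation via Gaussian-process regression.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(4)
    obs_xy = rng.random((100, 2))          # scattered sensor locations
    obs_val = np.sin(3 * obs_xy[:, 0]) * np.cos(3 * obs_xy[:, 1])

    gp = GaussianProcessRegressor(kernel=RBF(0.3), alpha=1e-4).fit(obs_xy, obs_val)

    # Interpolate onto a dense grid; the cubic-in-n fit is what makes
    # Kriging costly at the scales where a learned interpolator pays off.
    gx, gy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
    field = gp.predict(np.column_stack([gx.ravel(), gy.ravel()])).reshape(50, 50)
    print(field.shape)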
Learning 4DVAR inversion directly from observations
ABSTRACT. Variational data assimilation and deep learning share many algorithmic aspects. While the former focuses on system state estimation, the latter provides strong inductive biases for learning complex relationships. Here we design a hybrid architecture that learns the assimilation task directly from partial and noisy observations, using the mechanistic constraint of the 4DVAR algorithm. We show experimentally that the proposed method learns the desired inversion with interesting regularizing properties, and that it also offers computational advantages.
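For reference, the mechanistic constraint referred to is the standard strong-constraint 4D-Var cost (standard notation, not quoted from the paper): x_b is the background state, B and R_k are error covariance matrices, M_{0->k} the model propagator, and H_k the observation operator.

    J(x_0) = \frac{1}{2} (x_0 - x_b)^\top B^{-1} (x_0 - x_b)
           + \frac{1}{2} \sum_{k=0}^{K}
             \big(y_k - H_k(M_{0 \to k}(x_0))\big)^\top R_k^{-1}
             \big(y_k - H_k(M_{0 \to k}(x_0))\big)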
Towards Online Anomaly Detection in Steel Manufacturing Process
ABSTRACT. Data generated by manufacturing processes can often be represented as a data stream. The main characteristics of such data are that it is not possible to store all of it in memory, that it is generated continuously at high speed, and that it may evolve over time. These characteristics make it impossible to use ordinary machine learning techniques; specially crafted methods are needed that can assimilate new data and dynamically adjust the model. In this work, we consider a cold rolling mill, one of the steps in steel strip manufacturing, and apply data stream methods to predict the distribution of rolling forces based on the input process parameters. The model is then used for anomaly detection during online production. Three different machine learning scenarios are tested to determine an optimal solution that fits the characteristics of cold rolling. The results show that, for our use case, the performance of a model trained offline deteriorates over time, and additional learning is required after deployment. The best performance was achieved when the batch learning model was re-trained on a data buffer upon concept drift detection. We plan to use the results of this investigation as a starting point for future research, which will involve more advanced learning methods and a broader scope in relation to the cold rolling process.
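A sketch of the best-performing scenario described above follows: a batch model retrained on a recent-data buffer whenever a drift signal fires. The stream, the linear model, and the crude error-ratio drift test are all synthetic stand-ins for the mill data and detector actually used.

    # Batch model + buffer + retrain-on-drift, on a synthetic stream.
    import numpy as np
    from collections import deque
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(5)
    model, buffer = None, deque(maxlen=500)
    base_err, window = None, deque(maxlen=50)

    for t in range(5000):
        drifted = t > 2500                    # simulated process change
        x = rng.random(4)
        y = x @ ([1, 2, 3, 4] if not drifted else [4, 3, 2, 1]) \
            + 0.05 * rng.standard_normal()
        if model is not None:
            window.append(abs(model.predict(x[None])[0] - y))
            # Crude drift test: rolling error doubles vs. training baseline.
            if base_err and len(window) == window.maxlen \
                    and np.mean(window) > 2 * base_err:
                model = None                  # force retraining below
                window.clear()
        buffer.append((x, y))
        if model is None and len(buffer) >= 200:
            X, Y = map(np.array, zip(*buffer))
            model = Ridge().fit(X, Y)         # retrain on the buffer
            base_err = np.mean(np.abs(model.predict(X) - Y))
            print(f"t={t}: (re)trained on {len(buffer)} buffered samples")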
Taking a Shortcut Through Phase Space: Neural Networks Solving Differential Equations
ABSTRACT. Differential equations are a formidable tool for describing the behaviour and evolution of complex systems, and solving them is therefore crucial to modelling and understanding these systems. This has prompted, over the years, the development of analytical and numerical methods, each with its advantages and limitations. Physics-Informed Neural Networks (PINNs) offer an alternative perspective on this problem. Since neural networks are universal function approximators, they can be trained to approximate the solution of a differential equation; PINNs achieve this by incorporating the differential equation itself in their loss function. In our contribution, we present recent work aimed at improving the efficiency of PINNs in solving differential equations by analysing the role of the problem formulation in the learning capabilities of the network.
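A minimal PINN sketch makes the loss construction concrete: for the toy ODE u'(x) = -u(x) with u(0) = 1 (an illustrative choice, not the contribution's test problem), the equation residual and the initial condition are both placed directly in the loss.

    # Minimal PINN: the ODE residual u' + u and the condition u(0)=1 form
    # the loss, so the network is trained to satisfy the equation itself.
    import torch

    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for step in range(3000):
        x = torch.rand(128, 1, requires_grad=True) * 2.0  # collocation points
        u = net(x)
        du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        residual = du + u                                 # u' + u = 0
        loss = (residual**2).mean() \
             + (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    print(float(net(torch.tensor([[1.0]]))))  # should approach exp(-1) ~ 0.368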
Balancing agents for mining imbalanced multiclass datasets - performance evaluation
ABSTRACT. The paper deals with mining imbalanced multiclass datasets. Its goal is to evaluate the performance of several balancing agents implemented by the authors. The agents have been constructed from 5 state-of-the-art classifiers originally designed for mining binary imbalanced datasets. To transform the binary classifiers into multiclass ones, we use the one-versus-one (OVO) approach, combining their predictions through majority voting. The paper describes our approach and provides a detailed description of the respective balancing agents. Their performance is evaluated in an extensive computational experiment involving multiclass imbalanced datasets from the Keel imbalanced datasets repository. The experiment results allowed us to select the best-performing balancing agents using statistical tools.
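The OVO lifting described above can be sketched in a few lines; BalancedRandomForest from imbalanced-learn stands in for the paper's 5 base classifiers, and the dataset is synthetic.

    # One-versus-one majority voting lifts a binary imbalanced-data
    # classifier to the multiclass case (base learner is a stand-in).
    from sklearn.datasets import make_classification
    from sklearn.multiclass import OneVsOneClassifier
    from imblearn.ensemble import BalancedRandomForestClassifier

    X, y = make_classification(n_samples=2000, n_classes=4, n_informative=6,
                               weights=[0.7, 0.15, 0.1, 0.05], random_state=0)

    ovo = OneVsOneClassifier(BalancedRandomForestClassifier(random_state=0))
    ovo.fit(X, y)
    print("pairwise binary models trained:", len(ovo.estimators_))  # 4*3/2 = 6

sklearn's OneVsOneClassifier resolves the pairwise predictions by exactly the majority-voting scheme the abstract refers to.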
Decision Tree-Based Algorithms for Detection of Damage in AIS Data
ABSTRACT. The Automatic Identification System (AIS) is a system developed for maritime traffic monitoring and control, based on an obligatory automatic exchange of information transmitted by ships. Satellite-AIS is a next-generation AIS that uses a satellite component to operate with a greater range. However, due to technical limitations, some AIS data collected by the satellite component are damaged, meaning that AIS messages may contain errors or unspecified (missing) values. The problem of reconstructing damaged AIS data therefore needs to be considered to improve the performance of Satellite-AIS in general, and it remains open from a research point of view. The aim of the paper is to compare selected decision-tree-based algorithms for detecting damaged AIS messages. A general concept for the detection of damaged AIS data is presented. Then the assumptions and results of the computational experiment are reported, together with final conclusions.
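The shape of such a comparison can be sketched as follows; the features and damage labels are synthetic stand-ins (real decoded AIS fields such as speed, course, and position would replace them), and the two classifiers are examples of the decision-tree family compared.

    # Decision-tree-based detection of damaged messages, on synthetic data.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(6)
    X = rng.random((1000, 5))                 # stand-in decoded AIS fields
    y = (X[:, 0] > 0.9).astype(int)           # "damaged" flag
    y[rng.random(1000) < 0.02] ^= 1           # a little label noise

    for clf in (DecisionTreeClassifier(max_depth=5),
                RandomForestClassifier(100)):
        score = cross_val_score(clf, X, y, cv=5).mean()
        print(type(clf).__name__, round(score, 3))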
Impact of Input Data Preparation on Multi-Criteria Decision Analysis Results
ABSTRACT. Multi-criteria decision analysis (MCDA) methods support stakeholders in solving decision-making problems in an environment that simultaneously considers multiple criteria whose objectives are often conflicting. These methods allow the application of numerical weights representing the relevance of criteria and, based on the provided decision matrices with the performances of alternatives, calculate scores from which rankings are created. MCDA methods differ in their algorithms: they may calculate the scores of alternatives against constructed reference solutions or focus on finding compromise solutions. An essential initial step in many MCDA methods is the normalization of the input decision matrix, which can be performed using various techniques, and the choice of technique can lead to different results. The imprecision of the data provided by decision-makers can likewise affect the results of MCDA procedures. This paper investigates the effect of normalizations other than the default on the variability of the results of three MCDA methods: Additive Ratio Assessment (ARAS), Combined Compromise Solution (CoCoSo), and Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). The research demonstrated that the impact of the normalization type is noticeable and differs depending on the MCDA method explored. The results highlight the importance of benchmarking different methods and techniques in order to select the method whose solutions are most robust to the choice of computing methods supporting MCDA procedures and to input data imprecision.
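The core of the experiment can be illustrated with TOPSIS run under two normalizations; the decision matrix and weights are made up, and all criteria are treated as benefit-type for brevity. Comparing the printed closeness scores and rankings shows how sensitive the result is to this preprocessing choice.

    # TOPSIS with two interchangeable normalizations, on a toy matrix.
    import numpy as np

    def topsis(matrix, weights, normalise):
        v = normalise(matrix) * weights
        ideal, anti = v.max(axis=0), v.min(axis=0)   # benefit criteria only
        d_pos = np.linalg.norm(v - ideal, axis=1)
        d_neg = np.linalg.norm(v - anti, axis=1)
        return d_neg / (d_pos + d_neg)               # closeness: higher is better

    minmax = lambda m: (m - m.min(0)) / (m.max(0) - m.min(0))
    vector = lambda m: m / np.linalg.norm(m, axis=0)

    M = np.array([[250., 16., 12.],
                  [200., 16.,  8.],
                  [300., 32., 16.],
                  [275., 32.,  8.]])
    w = np.array([0.4, 0.35, 0.25])

    for name, norm in [("min-max", minmax), ("vector", vector)]:
        scores = topsis(M, w, norm)
        print(name, "scores:", scores.round(3), "ranking:", np.argsort(-scores))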