The Computational Planet – with Computational Insects: Wavelet-Based Adaptive Simulations of Flapping Flight
ABSTRACT. Flying insects, spectacular little flying machines with enormous evolutionary success, are an important source of inspiration for a large, interdisciplinary community of scientists. The aerodynamic mechanisms they use for propulsion are quite different from those of human-designed fliers, and many aspects of their locomotion are not yet understood.
In this lecture, I will illustrate what insights numerical simulations can contribute to our knowledge of the aerodynamics of flapping flight and discuss why such simulations are highly challenging. I present WABBIT, my open-source framework for performing such simulations. This code solves the Navier-Stokes equations on dynamically adapted grids with local refinement, and it runs on massively parallel supercomputers. For grid adaptivity, I use biorthogonal wavelets as a tool for compression and regularity detection.
I will show computational results on bumblebees flying in heavy turbulence, demonstrating that flapping flight in turbulence is limited by flight control, not by force production. Finally, I will conclude with the latest results on the peculiar flight of some of the smallest insects. They often feature bristled wings that have lost almost their entire membrane. Yet, these animals fly surprisingly well, and I discuss how numerical simulations have helped in understanding their aerodynamics.
Incremental Mining of Frequent Serial Episodes Considering Multiple Occurrences
ABSTRACT. The need to analyze information from streams arises in a variety of applications. One of the fundamental research directions is to mine sequential patterns over data streams. Current studies mine series of items based on the mere existence of a pattern in transactions but pay no attention to series of itemsets and their multiple occurrences. Patterns over a window of an itemset stream, together with their multiple occurrences, however, provide additional capability to recognize the essential characteristics of the patterns and the inter-relationships among them that are unidentifiable by existing item-based, existence-based studies. In this paper, we study such a new sequential pattern mining problem and propose a corresponding efficient sequential miner with novel strategies to prune the search space efficiently. Experiments on both real and synthetic data show the utility of our approach.
Classifying Anomalous Members in a Collection of Multivariate Time Series Data Using Large Deviations Principle: An Application to COVID-19 Data
ABSTRACT. Anomaly detection for time series data is often aimed at identifying extreme behaviors within an individual time series. However, identifying extreme trends relative to a collection of other time series is of significant interest, as in the fields of public health policy, social justice, and pandemic propagation. We propose an algorithm that can scale to large collections of time series data using concepts from the theory of large deviations. Exploiting the ability of the algorithm to scale to high-dimensional data, we propose an online anomaly detection method to identify anomalies in a collection of multivariate time series. We demonstrate the applicability of the proposed Large Deviations Anomaly Detection (LAD) algorithm in identifying counties in the United States with anomalous trends in terms of COVID-19 related cases and deaths. Several of the identified anomalous counties coincide with counties documented to have responded poorly to the COVID-19 pandemic.
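As an illustration of the underlying idea, the sketch below scores each series by a Gaussian rate function, a minimal stand-in for the large-deviations machinery; the names rate_score and flag_anomalies and the Gaussian working model are assumptions for illustration, not the authors' LAD implementation.

    import numpy as np

    def rate_score(series, mu, cov_inv):
        # Gaussian rate function I(x) = 0.5 (x - mu)^T Sigma^{-1} (x - mu),
        # averaged over the time points of one series.
        diffs = series - mu                                   # (T, d)
        return 0.5 * np.mean(np.einsum('ti,ij,tj->t', diffs, cov_inv, diffs))

    def flag_anomalies(collection, quantile=0.95):
        # collection: (N, T, d) array of N multivariate time series.
        pooled = collection.reshape(-1, collection.shape[-1])
        mu = pooled.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(pooled, rowvar=False))
        scores = np.array([rate_score(s, mu, cov_inv) for s in collection])
        return scores, scores > np.quantile(scores, quantile)

    scores, flags = flag_anomalies(np.random.randn(100, 50, 3))

Series whose average rate (an improbability measure relative to the collection) falls in the upper tail are flagged as anomalous.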
Boosted Ensemble Learning based on Randomized NNs for Time Series Forecasting
ABSTRACT. Time series forecasting is a challenging problem when a time series exhibits multiple seasonalities, a nonlinear trend, and varying variance. In this work, to forecast complex time series, we propose ensemble learning based on randomized neural networks (RandNNs), which is boosted in three ways: ensemble learning based on residuals, on corrected targets, and on an opposed response. The latter two methods of ensembling are employed to ensure that similar forecasting tasks are solved by all ensemble members, which justifies the use of exactly the same base models as ensemble members. Unifying the tasks for all members simplifies ensemble learning and leads to increased forecasting accuracy. This was confirmed in an experimental study on forecasting time series with triple seasonality, in which we compare our three variants of ensemble boosting. The strong points of the proposed RandNN-based ensembles are extremely rapid training and a pattern-based time series representation, which extracts relevant information from time series.
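A minimal sketch of the residual variant of such boosting, assuming an ELM-style randomized NN (random fixed hidden layer, least-squares output); the class and function names are illustrative, not the authors' code.

    import numpy as np

    class RandNN:
        def __init__(self, n_hidden=100, rng=None):
            self.n_hidden = n_hidden
            self.rng = rng or np.random.default_rng()
        def fit(self, X, y):
            self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
            H = np.tanh(X @ self.W)                     # fixed random features
            self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)
            return self
        def predict(self, X):
            return np.tanh(X @ self.W) @ self.beta

    def boosted_ensemble(X, y, n_members=10):
        members, residual = [], y.copy()
        for _ in range(n_members):                      # each member fits what
            m = RandNN().fit(X, residual)               # the ensemble still misses
            members.append(m)
            residual = residual - m.predict(X)
        return members

    predict = lambda members, X: sum(m.predict(X) for m in members)

The corrected-target and opposed-response variants differ only in how the training target of each member is constructed.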
Robust control of perishable inventory with uncertain lead time using neural networks and genetic algorithm
ABSTRACT. The expansion of modern supply chains constantly triggers the need to maintain resilience and agility for higher profit. There is a need to change the standard methods of inventory control to new approaches that are highly adaptable to the uncertainties that have emerged as a result of supply chain globalization. In this paper, a novel approach based on neural networks, state-space control, and robust optimization is proposed to support perishable inventory replenishment decisions subject to uncertain lead times. We develop an approach based on the Wald criterion to compute optimal robust (i.e., “best of the worst” case) controller parameters. We incorporate lead-time-specific perturbations through plausible scenarios using several lead-time sets. Based on extensive numerical experiments, the obtained solutions highlight that the approach provides stable and robust solutions even for long lead times.
Neuroevolutionary Feature Representations for Causal Inference
ABSTRACT. Within the field of causal inference, we consider the problem of estimating heterogeneous treatment effects from data. We propose and validate a novel approach for learning feature representations to aid the estimation of the conditional average treatment effect (CATE). Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features. In contrast to previous approaches that encourage the distribution of representations to be treatment-invariant, we leverage a genetic algorithm that optimizes over representations useful for predicting the outcome to select those less useful for predicting the treatment. This allows us to retain information within the features useful for predicting the outcome even if that information may be related to treatment assignment. We validate our method on synthetic examples and illustrate its use on a real-life dataset.
Privacy paradox in social media: A system dynamics analysis
ABSTRACT. The term ‘privacy paradox’ refers to the apparent inconsistency between people’s concerns about their privacy and their actual privacy behaviour. Although several possible explanations for this phenomenon have been provided so far, these assume that (1) all people share the same privacy concerns and (2) a snapshot at a given point in time is enough to explain the phenomenon. To overcome these limitations, this article presents a system dynamics simulation model that considers the diversity of privacy concerns during the process of social media adoption and identifies the conditions under which the privacy paradox emerges. The results show that (1) the least concerned minority can induce the more concerned majority to adopt social media and (2) even the most concerned minority can be hindered by the less concerned majority from discarding social media. Both (1) and (2) are types of situations that reflect the privacy paradox.
Retrofitting structural graph embeddings with node attribute information
ABSTRACT. Representation learning for graphs has attracted increasing attention in recent years. In this paper, we define and study a new problem of learning attributed graph embeddings. Our setting considers how to update existing node representations that come from structural graph embedding methods when additional node attributes are given. To this end, we propose a Graph Embedding RetroFitting (GERF) method, which delivers a compound node embedding that follows both the graph structure and attribute space similarity. Unlike other attributed graph embedding methods, GERF does not require recalculating the embedding from scratch but rather uses existing embeddings and retrofits them according to neighborhoods defined by the graph structure and the node attribute space. Moreover, our approach keeps the same embedding space at all times, and therefore allows comparing the positions of embedding vectors and quantifying the impact of attributes on the representation update. To obtain high-quality embeddings, our GERF method updates embedding vectors by optimizing the invariance loss, the graph neighbor loss, and the attribute neighbor loss. Experiments on the WikiCS, Amazon-CS, Amazon-Photo, and Coauthor-CS datasets demonstrate that, despite working in a retrofitting manner, our proposed algorithm achieves results comparable to other state-of-the-art attributed graph embedding models.
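A minimal sketch of the retrofitting idea, assuming a plain gradient loop and illustrative weights lam_g and lam_a: the new embedding stays close to the original (invariance loss) while being pulled toward graph neighbors and attribute-space neighbors.

    import numpy as np

    def retrofit(Z0, graph_nbrs, attr_nbrs, lam_g=1.0, lam_a=1.0,
                 lr=0.05, n_iter=100):
        # Z0: (n, d) original structural embeddings; graph_nbrs / attr_nbrs:
        # per-node neighbor lists from the graph and the attribute space.
        Z = Z0.copy()
        for _ in range(n_iter):
            grad = 2 * (Z - Z0)                     # invariance loss gradient
            for i in range(len(Z)):
                for j in graph_nbrs[i]:             # graph neighbor loss
                    grad[i] += 2 * lam_g * (Z[i] - Z[j])
                for j in attr_nbrs[i]:              # attribute neighbor loss
                    grad[i] += 2 * lam_a * (Z[i] - Z[j])
            Z -= lr * grad
        return Z

Since the update only perturbs the existing vectors, retrofitted and original embeddings remain directly comparable in the same space.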
Coevolutionary Approach to Sequential Stackelberg Security Games
ABSTRACT. The paper introduces a novel coevolutionary approach (CoEvoSG) for solving Sequential Stackelberg Security Games. CoEvoSG maintains two competing populations of players' strategies. In a process inspired by biological evolution, both populations are developed simultaneously in order to approximate a Stackelberg Equilibrium. A comprehensive experimental study based on over 500 test instances of two game types proved CoEvoSG's ability to repeatedly find optimal or near-optimal solutions. The main strength of the proposed method is its time scalability, which is highly competitive with state-of-the-art algorithms and allows solving bigger and more complicated games than ever before. Due to the generic and knowledge-free design of CoEvoSG, the method can be applied to diverse real-life security scenarios.
Modularity promotes excitation-inhibition balance in synaptically coupled neuronal networks
ABSTRACT. Introduction:
Understanding the organizational principles underlying networks in the brain, which appear at different scales (from synapses and gap junctions that connect individual neurons to tracts that link large brain regions), and their potential role in shaping the functional dynamics of the nervous system is one of the most exciting challenges in neuroscience [1]. The human brain is composed of billions of neurons that are classified as either excitatory or inhibitory (based on the effect of their activity on the activity of other neurons that receive inputs from them). These are connected in intricate arrangements, generating complex collective activity that results from the stimuli they are subject to as well as their connection topology [2,3]. Networks in the brain, in common with those seen in many other complex systems, have modular organization. This is a fundamental mesoscopic design principle for networks and is characterized by a relatively high density of connections between neurons in the same module and sparse connections between those belonging to different modules [4,5]. While the structural signatures of modular neuronal networks have been studied [6,7], little is known about the precise role played by such complex network architecture, e.g., in generating spontaneous, self-sustained network activity [8]. Such persistent activity patterns, which are sustained even in the absence of external stimuli, are considered to arise from a balance of excitation and inhibition in the network, which prevents runaway excitation (due to excess excitation) on the one hand and quiescence (due to excess inhibition) on the other [9].
Main Outcomes:
In this work, we use a minimal model of spiking neurons connected via synapses to show that:
(i) Modular connection topology promotes persistent activity in a network comprising populations of excitatory (E) and inhibitory (I) neurons.
(ii) By systematically varying the modularity of the E-I network, three different dynamical regimes, characterized by persistent fluctuations, periodic bursts, and activity death, respectively, are observed.
(iii) Networks comprising only excitatory neurons can support persistent activity, but only over an extremely narrow, optimal range of network modularity.
(iv) Introducing a fraction f of inhibitory neurons in such networks allows persistence to be observed even in a completely homogeneous network for an optimal range of f.
(v) The range of f over which persistent activity is observed is considerably enhanced by modular organization of the connection topology.
(vi) There is both an optimal modularity and an optimal fraction of inhibitory neurons for which the probability of persistent activity in the network, starting from any initial state, is maximal.
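A network of the kind studied here can be constructed as sketched below; the module count, connection probabilities and the modularity knob r are illustrative assumptions, not the authors' exact model.

    import numpy as np

    def modular_ei_network(n=512, n_modules=4, f_inhib=0.2, p=0.1, r=0.9,
                           rng=np.random.default_rng(0)):
        # r in [0, 1]: fraction of each neuron's links kept inside its own
        # module; r -> 1 is strongly modular, r -> 1/n_modules is homogeneous.
        module = np.repeat(np.arange(n_modules), n // n_modules)
        same = module[:, None] == module[None, :]
        p_in = p * r * n_modules                          # intra-module density
        p_out = p * (1 - r) * n_modules / (n_modules - 1) # inter-module density
        A = np.where(same, rng.random((n, n)) < p_in,
                           rng.random((n, n)) < p_out)
        np.fill_diagonal(A, False)
        # a fraction f_inhib of neurons is inhibitory: their outgoing
        # connections (columns) carry negative sign.
        sign = np.where(rng.random(n) < f_inhib, -1, 1)
        return A * sign[None, :]

The densities are scaled so that the overall connection probability stays at p while r shifts links between the intra- and inter-module blocks.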
Conclusion:
In this work, we show that modular network architecture, together with a suitable ratio of excitatory and inhibitory neurons, results in persistent network activity. This highlights the impact of modular architecture in promoting robust collective dynamics of neuronal networks and throws light on the structure-dynamics-function relationship in such networks. Our work also relates network organization to the question of excitation-inhibition (E-I) balance and suggests that modularity promotes E-I balance in the network. The balance between excitation and inhibition plays a key role in the normal functioning of neuronal networks (e.g., the cerebral cortex) [9]. Disrupting this balance alters the brain state, resulting in either runaway excitation over the entire brain (seen during pathological conditions such as epilepsy and ADHD) or a complete absence of neuronal activity due to high inhibition. Thus, our results bring forward a hitherto understudied aspect of the functional roles of modular brain architecture.
References:
[1] D. S. Bassett and O. Sporns. Network neuroscience. Nature Neuroscience, 20(3): 353-364, 2017. doi: 10.1038/nn.4502
[2] R. Singh, S. N. Menon, and S. Sinha. Complex patterns arise through spontaneous symmetry breaking in dense homogeneous networks of neural oscillators. Scientific Reports 6: 22074, 2016. doi: 10.1038/srep22074
[3] V. Sreenivasan, S. N. Menon and S. Sinha. Emergence of coupling-induced oscillations and broken symmetries in heterogeneously driven nonlinear reaction networks. Scientific Reports 7: 1594, 2017. doi: 10.1038/s41598-017-01670-y
[4] R. K. Pan and S. Sinha. Modularity produces small-world networks with dynamical time-scale separation. EPL 85(6): 68006, 2009. doi: 10.1209/0295-5075/85/68006
[5] A. Pathak, S. N. Menon, and S. Sinha. Mesoscopic architecture enhances communication across the macaque connectome revealing structure-function correspondence in the brain. arXiv preprint arXiv:2007.14941, 2020.
[6] O. Sporns and R. F. Betzel. Modular brain networks. Annu. Rev. Psychol., 67: 613–640, 2016. doi: 10.1146/annurev-psych-122414-033634
[7] R. K. Pan, N. Chatterjee, and S. Sinha. Mesoscopic organization reveals the constraints governing Caenorhabditis elegans nervous system. PloS One 5(2): e9240, 2010. doi: 10.1371/journal.pone.0009240
[8] S. Sinha, J. Saramäki and K. Kaski. Emergence of self-sustained patterns in small-world excitable media. Physical Review E, 76(1): 015101, 2007. doi: 10.1103/PhysRevE.76.015101
[9] W. L. Shew, H. Yang, S. Yu, R. Roy and D. Plenz. Information capacity and transmission are maximized in balanced cortical networks with neuronal avalanches. Journal of Neuroscience, 31(1): 55-63, 2011. doi: 10.1523/JNEUROSCI.4637-10.2011
Acknowledgement: ZRS acknowledges travel and conference registration support from the European Union's Horizon 2020 research and innovation programme under grant agreement No 857533 (Sano).
Detecting Ottokar II’s 1248–1249 uprising and its instigators in co-witnessing networks
ABSTRACT. We provide a detailed case study showing how social network analysis makes it possible to detect a global event and identify the responsible actors in a historical network. We study the mid-13th century in the Czech lands, where a rigid political structure of noble families surrounding the monarchs led to an uprising of part of the nobility. Having collected data on approximately 2,400 noblemen from 576 charters, we attempted to uncover social network features pointing to the rebellion and to expose the noblemen who joined it. Among other findings, we observed assortativity increasing before the rebellion and resetting to random levels after it, a drop in the number of stable connections and in subgraph similarity between yearly networks, and holders of regional titles (burgraves) rising in centrality above royal court officials in that period.
Machine Learning based surrogate models for COVID-19 infection risk assessment
ABSTRACT. Airborne virus-containing particles are expelled from an infected person as aerosols during processes as simple as breathing, and may be carried by air currents for substantial times and distances. A CFD model was used to track the dissipation of breath in a ventilated room using a passive scalar. Exploring a parametric space with a large array of variables imposed an excessive computational load, so a novel deep and shallow machine learning model was developed to surrogate the now-classical CFD-based pipeline. The machine learning-based surrogate was able to learn the relations starting from a series of CFD-based simulations.
Towards Social Machine Learning for Natural Disasters
ABSTRACT. We propose an approach for integrating social media data with physical data from satellites for the prediction of natural disasters. We show that this integration can improve accuracy in disaster management models, and propose a modular system for disaster instance and severity prediction using social media as a data source. The system is designed to be extensible to cover many disaster domains, social media platform streams, and machine learning methods. We additionally present a test case in the domain context of wildfires, using Twitter as a social data source and physical satellite data from the Global Fire Atlas. We show, as a proof of concept for the system, how this model can accurately predict wildfire attributes based on social media analysis and also model social media sentiment dynamics over the course of the wildfire event. We outline how this system can be extended to cover wider disaster domains using different types of social media data as an input source, maximising the generalisability of the system.
Social Media and News Diffusion Effect on Macroeconomic Indicator Interdependencies
ABSTRACT. In a global world, the interconnectivity between financial markets and commodity prices is becoming a fundamental feature of economic systems, affecting macroeconomic trends. This paper proposes different approaches to create seven networks of seven select classical economic indicators, including financial market indices and relevant commodity prices, as follows: the broader S&P 500 index, comprising 500 large companies listed on stock exchanges in the U.S.; the narrower Dow Jones Industrial Average (DJIA) index, consisting of 30 prominent U.S.-listed companies; the FTSE 100, including the one hundred companies with the largest market capitalization trading on the London Stock Exchange; the Hang Seng index, representing the largest companies of the Hong Kong stock market; the BSE Index of 30 well-established and financially sound companies listed on the Bombay Stock Exchange; the crude oil price; and the gold price.
The first five approaches compute the interdependence between each pair of selected indicators based on correlations of their daily prices, daily returns, sentiment extracted from Twitter, sentiment from Reddit, and sentiment from GDELT news [5]. To obtain the sentiments from these three media sources, we utilize a sentiment evaluation approach for finance texts [1] based on the RoBERTa model [2] to calculate the cumulative daily sentiment for texts and news related to the selected economic indicators.
For the sixth approach, we query the Google News API for each pair of indicators and obtain the content and timestamps of the latest 100 news items. We then calculate the average daily frequency as the ratio between the number of news items related to the indicator pair and the period, calculated as the number of days between the first and last news item. This frequency of joint appearances of classical economic indicators in Google News serves as the interdependence measure of the sixth approach.
The final, seventh, approach determines each classical economic indicator's impact on forecasting the others' prices. For this approach, we use a machine learning model based on XGBoost [3] to forecast the price of a particular economic indicator based on the past price data and news sentiments of all seven indicators. We apply an explainable ML algorithm (the SHAP model [4]) to explore the importance of the features in the forecasting model. The feature importance values show how significant each feature is in predicting the price of the target classical economic indicator. We use the feature importance magnitude as a link weight in the network.
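A sketch of this seventh approach, assuming the xgboost and shap packages; the lag depth and hyperparameters are illustrative choices, not the paper's exact configuration.

    import numpy as np
    import xgboost as xgb
    import shap

    def shap_link_weights(prices, lag=5):
        # prices: (T, 7) array, one column per indicator.
        T, k = prices.shape
        # lagged feature block: columns grouped by lag offset, then indicator
        X = np.hstack([prices[t0:T - lag + t0] for t0 in range(lag)])
        W = np.zeros((k, k))
        for target in range(k):
            y = prices[lag:, target]
            model = xgb.XGBRegressor(n_estimators=200).fit(X, y)
            sv = shap.TreeExplainer(model).shap_values(X)   # (T-lag, k*lag)
            # aggregate |SHAP| over time and over lags -> one value per indicator
            per_feature = np.abs(sv).mean(axis=0).reshape(lag, k).sum(axis=0)
            W[:, target] = per_feature                      # edge j -> target
        return W

The resulting matrix W is directly usable as the weighted adjacency matrix of the seventh network (including each indicator's dependence on its own past).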
The proposed methodology offers a holistic insight into the dynamics of the global stock markets, including the U.S., U.K., Hong Kong, and Indian stock exchange markets, and relevant commodity prices such as oil and gold. For example, in Figure 1, the network is constructed using the interdependence between node pairs computed with the seventh method. It shows the dependency between nodes based on how important a node (indicator) is in predicting the other node's value.
Data-driven reduced-order models for nonlinear vibrations: proof of concept and initial insights
ABSTRACT. The global climate effort is increasingly mindful of lightweight, flexible engineering systems (significantly more energy-efficient than their predecessors) as solutions to meet ambitious emissions targets. Such designs include next-generation aircraft with high-aspect-ratio wings. This increase in efficiency introduces geometric nonlinearity to the system, leading to a dramatic and multifaceted increase in complexity, with both the structural behaviour and the mathematical techniques required to capture it posing a much greater challenge. Given the potential benefits of these designs, as well as the cost associated with their development, recent years have seen a marked interest in accurately and effectively predicting nonlinear dynamics.
Geometric nonlinearity occurs in structures experiencing amplitudes of vibration that cause excessive strain in the structure, leading to a nonlinear force-displacement relationship in place of the linear correlation defined by Hooke's law. The equations of motion for nonlinear structures typically take the form
M(d^2x/dt^2) + C(dx/dt) + Kx + F_NL(x) = F(t), (1)
where M, C, and K are the mass, damping, and linear stiffness matrices, respectively; x is a vector of displacements; F_NL is a nonlinear function of these displacements (often assumed to be a cubic polynomial); and F is an external forcing term.
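As a toy illustration of Eq. (1), the following sketch integrates a two-degree-of-freedom system with an assumed cubic F_NL using scipy; all matrix and coefficient values are illustrative.

    import numpy as np
    from scipy.integrate import solve_ivp

    M = np.eye(2)
    C = 0.05 * np.eye(2)
    K = np.array([[2.0, -1.0], [-1.0, 2.0]])
    k3 = 0.5                                          # cubic stiffness coefficient
    F = lambda t: np.array([np.sin(1.2 * t), 0.0])    # external forcing F(t)

    def rhs(t, z):
        # state z = [x, dx/dt]; Eq. (1) rearranged for the acceleration
        x, v = z[:2], z[2:]
        f_nl = k3 * x**3                              # F_NL(x), assumed cubic
        a = np.linalg.solve(M, F(t) - C @ v - K @ x - f_nl)
        return np.concatenate([v, a])

    sol = solve_ivp(rhs, (0, 100), np.zeros(4), max_step=0.01)

Time series generated this way (here from a known F_NL) are exactly the kind of data an LSTM-based NIROM would be trained and validated on.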
Despite the relatively simple appearance of Eq. (1), the nonlinearity of F_NL dramatically increases the complexity of the problem and, when the system is designed and investigated using commercial finite element (FE) software, is not necessarily known to the user. Further, in most real-world structures, the number of elements required to accurately capture the nonlinear behaviour can be prohibitively computationally expensive for applications such as control or optimisation. In this case, a non-intrusive reduced-order model (NIROM), which utilizes the nonlinear solver of the FE software to reduce the order of the model without accessing the source code, can be used. There are several leading methods for NIROM generation, but they typically involve a series of static forces and displacements, to which regression analysis can be applied to find the nonlinear coefficients. However, these methods have been shown to be highly dependent on excitation level, and the lack of clear rules to ensure accuracy has prevented uptake of the techniques in real-world settings.
In light of this, this presentation explores initial steps made to replace the aforementioned static approach with a deep learning methodology, which can be applied to realistic (and eventually operational) time series data. One of the key challenges for NIROMs is to ensure accuracy in both the time domain, which requires incorporating hysteretic behaviour, and the frequency domain, which involves predicting complex interactions and energy transfer between vibrational modes. To that end, this work employs long short-term memory (LSTM) networks to train the nonlinear coefficients in Eq. (1). A 20-degree-of-freedom (20DOF) linear mass-spring system, to which several nonlinear spring configurations are added, is used to undertake a fundamental exploration of the LSTM. The low-order models are compared with the 20DOF model in both the time and frequency domains, and optimal model sizes are calculated for both. In doing so, the training requirements for multi-domain accuracy are assessed.
Another key element of this research is to investigate optimal training time series to guarantee NIROM accuracy. Three forcing approaches are considered: random excitation, periodic excitation, and sine-sweep excitation. The various advantages and disadvantages of all three approaches are discussed. Sine sweeps provide a clear advantage for hysteretic behaviour, whereas random excitation has the potential to activate a large number of modes in a shorter timeframe, and periodic forcing allows complex parameter regions to be explored in more depth. It is possible that a combination of these approaches may provide the most complete solution, and this is discussed in more detail.
The final part of this presentation provides an overview of how this methodology will be refined and expanded to accommodate nonlinear systems in real-world and industrial settings. As well as direct expansions to the results and methods presented here, the direct steps that can be taken to accelerate industrial application are explored. These include, but are not limited to, uncertainty quantification (and subsequent updates to the NIROM), data set optimisation, and integration of external physical fields (such as aeroelasticity).
ABSTRACT. The study of the temporal evolution of the luminosity (lightcurves) of space objects is a subject of ongoing research. It is of particular interest in the field of Space Situational Awareness (SSA) as a means to obtain as much information as possible about tracked or untracked orbiting bodies. Its main advantage is that it can be performed using optical sensors, which are relatively simple and do not require very large capital investments. In this paper, we study the application of Long Short-Term Memory (LSTM) recurrent neural networks to the inversion of lightcurves, a nonlinear estimation problem. The aim is to derive information about the attitude motion of the studied object, namely its rotation axis, angular velocity and, if possible, its orientation at each observed point in time. The model is trained and tested with sets of lightcurves obtained using a simulator, which allows for different orbit parameters, geometries, and materials. A different model is developed for each of the studied satellite configurations (consisting of a combination of geometry and materials), and its accuracy is measured. It is also explored whether a minimum level of geometric complexity is required to successfully perform the lightcurve inversion.
Streaming detection of significant delay changes in public transport systems
ABSTRACT. Public transport systems are expected to reduce pollution and contribute to sustainable development. However, disruptions in public transport such as delays may negatively affect mobility choices.
To quantify delays, aggregated data from vehicle location systems are frequently used. However, delays observed at individual stops are caused, inter alia, by fluctuations in running times and by propagation of delays occurring in other locations. Hence, in this work, we propose both a method for detecting significant delays and a reference architecture, relying on stream processing engines, in which the method is implemented. The method can complement the calculation of delays defined as deviations from schedules. This provides online rather than batch identification of significant and repetitive delays, as well as resilience to the limited quality of location data.
The method we propose can be used with different change detectors, such as ADWIN, applied to a location data stream shuffled to the individual edges of a transport graph. It can detect, in an online manner, at which edges statistically significant delays are observed and at which edges delays arise and are reduced. Detections can be used to model mobility choices and to quantify the impact of repetitive rather than random disruptions on feasible trips with multimodal trip modelling engines. The evaluation performed with public transport data from over 2000 vehicles confirms the merits of the method and reveals that a limited-size subgraph of the transport system graph causes statistically significant delays.
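A minimal sketch of the per-edge detection step, assuming the river library's ADWIN detector (attribute names such as drift_detected may differ between river versions) and an illustrative (edge, delay) stream format.

    from collections import defaultdict
    from river.drift import ADWIN

    detectors = defaultdict(ADWIN)           # one change detector per graph edge

    def process(stream):
        # stream yields (edge, delay_seconds) records shuffled to graph edges
        changes = []
        for edge, delay in stream:
            det = detectors[edge]
            det.update(delay)
            if det.drift_detected:           # statistically significant change
                changes.append((edge, det.estimation))
        return changes

Keeping one detector per edge is what lets the method localise where delays arise and where they are reduced, rather than only flagging network-wide deviations.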
A study on the prediction of evapotranspiration using freely available meteorological data
ABSTRACT. Due to climate change, hydrological drought is assuming a structural character with a tendency to worsen in many countries. The frequency and intensity of droughts are predicted to increase, particularly in the Mediterranean region and in Southern Africa. Since a fraction of the fresh water that is consumed is used to irrigate urban green spaces, typically made up of gardens, lanes, and roundabouts, it is urgent to implement water-waste prevention policies. Evapotranspiration (ETO) is a measurement that can be used to estimate the amount of water being taken up or used by plants, allowing better management of watering volumes, but the exact computation of the evapotranspiration volume is not possible without complex and expensive sensor systems.
In this study, several machine learning models were developed to estimate reference evapotranspiration and solar radiation from a reduced-feature dataset comprising temperature, humidity, and wind. Two main approaches were taken: (i) directly estimate ETO, or (ii) first estimate solar radiation and then inject it into a function or method that computes ETO. For the latter case, two variants were implemented, namely the use of the estimated solar radiation as (ii.1) a feature of the machine learning regressors and (ii.2) an input to the FAO-56 PM method for computing ETO, which has solar radiation as one of its parameters. Using experimental data collected from a weather station located in Vale do Lobo, southern Portugal, the latter approach achieved the best result, with a coefficient of determination (R^2) of 0.975 over the test dataset. As a final note, the reduced feature set was carefully selected to be compatible with freely available online weather forecast services.
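A sketch of variant (ii.2) under illustrative modelling choices: a regressor predicts solar radiation from the cheap features, and ETO then follows from the standard daily FAO-56 Penman-Monteith form (Rn in MJ m-2 day-1, T in degrees C, u2 in m/s, vapour pressures in kPa). The regressor choice is an assumption, not the paper's exact model.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def fao56_pm(Rn, T, u2, es, ea, G=0.0, gamma=0.066):
        # Standard daily FAO-56 Penman-Monteith reference evapotranspiration.
        delta = 4098 * (0.6108 * np.exp(17.27 * T / (T + 237.3))) / (T + 237.3) ** 2
        num = 0.408 * delta * (Rn - G) + gamma * 900 / (T + 273) * u2 * (es - ea)
        return num / (delta + gamma * (1 + 0.34 * u2))

    rad_model = RandomForestRegressor()               # stage one: Rs regressor
    # rad_model.fit(X_weather_train, Rs_train)        # temperature, humidity, wind
    # Rs_hat = rad_model.predict(X_weather_test)      # estimated solar radiation
    # eto = fao56_pm(Rn_from(Rs_hat), T, u2, es, ea)  # stage two: FAO-56 PM

Here Rn_from, the net-radiation conversion applied to the estimated solar radiation, is a hypothetical helper standing in for the usual FAO-56 radiation bookkeeping.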
A Personalized Federated Learning Algorithm for One-Class Support Vector Machine: an Application in Anomaly Detection
ABSTRACT. Federated Learning (FL) has recently emerged as a promising method that employs a distributed learning model structure to overcome the data privacy and transmission issues posed by central machine learning models. In FL, datasets collected from different devices or sensors are used to train local models (clients), each of which shares its learning with a centralized model (server). However, this distributed learning approach presents unique learning challenges, as the data used at local clients can be non-IID (Independent and Identically Distributed) and statistically diverse, which decreases learning accuracy in the central model. In this paper, we overcome this problem by proposing a novel Personalized Conditional FedAvg (PC-FedAvg) for the One-Class Support Vector Machine (OCSVM), which aims to control weight communication and aggregation, augmented with a tailored learning algorithm to personalize the resulting support vectors at each client. Our experimental validation on two datasets showed that PC-FedAvg precisely constructs generalized client models and thus achieves higher accuracy compared to other state-of-the-art methods.
GBLNet: Detecting Intrusion Traffic with Multi-granularity BiLSTM
ABSTRACT. Detecting and intercepting malicious requests are among the most widely used countermeasures against attacks in network security, especially in the severe COVID-19 environment. Most existing detection approaches, including blacklist character matching and machine learning algorithms, have proven vulnerable to sophisticated attacks. To address these issues, a more general and rigorous detection method is required. In this paper, we formulate the problem of detecting malicious requests as a temporal sequence classification problem and propose a novel deep learning model, GBLNet, girdling bidirectional LSTM with multi-granularity CNNs. By connecting the shallow and deep feature maps of the convolutional layers, the ability to extract malicious features is improved at a finer granularity. Experimental results on the HTTP dataset CSIC 2010 demonstrate that GBLNet can efficiently detect intrusion traffic with superior accuracy and evaluation speed compared with the state of the art.
AMDetector: Detecting Large-scale and Novel Android Malware Traffic
ABSTRACT. In the severe COVID-19 environment, encrypted mobile malware is increasingly threatening personal privacy, especially malware targeting the Android platform. Existing methods mainly focus on extracting features from Android malware (DroidMal) by reversing the binary samples, which is sensitive to the reduction of available samples. Thus, they fail to tackle the insufficiency of novel DroidMal samples. Therefore, it is necessary to investigate an effective solution to classify large-scale DroidMal, as well as to detect novel instances. We treat few-shot DroidMal detection as a DroidMal encrypted network traffic classification problem and propose an image-based method with meta-learning, AMDetector, to address these issues. By capturing the network traffic produced by DroidMal, samples are augmented and thus cater to the learning algorithms. Firstly, DroidMal encrypted traffic is converted to session images. Then, session images are embedded into a high-dimensional metric space, in which traffic samples can be linearly separated by computing the distance to the corresponding prototype. Large-scale and novel DroidMal traffic is classified by applying different meta-learning strategies. Experimental results on public datasets demonstrate the capability of our method to classify large-scale known DroidMal traffic as well as to detect novel instances. It is encouraging to see that our model achieves superior performance on known and novel DroidMal traffic classification among the state of the art. Moreover, AMDetector is able to classify unseen cross-platform malware.
Accessing the Spanish Digital Network of Museum Collections through an Interactive Web-based Map
ABSTRACT. Within the scope of the SeMap project, we have developed a web-based tool that aims to offer innovative dissemination of movable assets held in museums, linking them semantically and through interactive maps. SeMap is focused on depicting the objects that are catalogued in CER.ES, the Spanish Digital Network of Museum Collections, which offers a catalogue of 300,000+ objects. To properly represent such objects in the SeMap tool, which considers their semantic relations, we needed to preprocess the data embedded in catalogues and to design a knowledge graph based on CIDOC-CRM. To that end, the collaboration among academia, heritage curators, and public authorities was of high relevance. This paper describes the steps taken to represent the CER.ES objects in the SeMap tool, focusing on the interdisciplinary collaboration. We also present the results of a usability test, which shows that the developed map is usable.
MiDaS: extract golden results from Knowledge Discovery even over incomplete Databases
ABSTRACT. The continuous growth in data collection requires effective and efficient capabilities to support Knowledge Discovery in Databases (KDD) over large amounts of complex data. However, activities such as data acquisition, cleaning, preparation, and recording may lead to incompleteness, impairing KDD processes, especially because most analysis methods do not adequately handle missing data. To analyze complex data, such as when performing similarity search or classification tasks, KDD processes require similarity assessment. However, incompleteness can disrupt this assessment, making the system unable to compare incomplete tuples. Therefore, incompleteness can render databases useless for knowledge extraction or, at best, dramatically reduce their usefulness. In this paper, we propose MiDaS, a framework based on an RDBMS that offers tools to deal with missing data employing several strategies, making it possible to assess similarity over complex data even in the presence of missing data in KDD scenarios. We show experimental results of analyses using MiDaS for similarity retrieval, classification, and clustering tasks over publicly available complex datasets, evaluating the quality and performance of several missing-data treatments. The results highlight that MiDaS is well suited for dealing with incompleteness, enhancing data analysis in several KDD scenarios.
Uncertainty occurrence in projects and its consequences for project management
ABSTRACT. Based on a survey (with 350 respondents), the occurrence of uncertainty, defined as incomplete or imperfect knowledge, in the project planning or preparation stage was described and quantified. Uncertainty with respect to customer expectations, project results, methods to be used, project stage durations and costs, and (both human and material) resources was considered, and its consequences for project management and whole organisations were analysed. The results show that the scope of uncertainty in projects cannot be neglected in practice and requires the use, in the project planning or preparation stage and during the whole project course, of advanced project and uncertainty management methods. The questionnaire used in the paper is recommended for application in organisations in order to measure and track the uncertainty scope in the projects being implemented and to adopt a tailor-made uncertainty management approach. Agile-based approaches seem to be highly recommendable in this respect. Future research directions are proposed, which include the application of a special type of fuzzy numbers to project management.
On a nonlinear approach to uncertainty quantification on the ensemble of numerical solutions
ABSTRACT. The estimation of approximation errors using an ensemble of numerical solutions is considered in an Inverse Problem statement. The Inverse Problem is set in a nonlinear variational formulation that provides additional opportunities. The ensemble of numerical results, obtained by the OpenFOAM solvers for inviscid compressible flow with an oblique shock wave, is analyzed. The numerical tests demonstrated the feasibility of obtaining the approximation errors without any regularization. The refined solution, corresponding to the mean of the numerical solutions with the approximation error correction, is also computed.
Learning Scale-Invariant Object Representations with a Single-Shot Convolutional Generative Model
ABSTRACT. Contemporary machine learning literature highlights the benefits of learning object-centric image representations, i.e., interpretability and improved generalization performance. In the current work, we develop a neural network architecture that effectively addresses the task of multi-object representation learning in scenes containing multiple objects of varying types and sizes. In particular, we take ideas from SPAIR and SPACE, which do not scale well to such complex images, and blend them with recent developments in single-shot object detection. The method overcomes the limitations of fixed-scale glimpse processing by learning representations using a feature pyramid-based approach, allowing more feasible parallelization than all other state-of-the-art methods. Moreover, the method can focus on learning representations of only a selected subset of the object types coexisting in scenes. Through a series of experiments, we demonstrate the superior performance of our architecture over SPAIR and SPACE, especially in terms of latent representation and inference on images with objects of varying sizes.
Wavelet Scattering Transform for PhotoPlethysmoGraphic (PPG) Signal Analysis
ABSTRACT. Photoplethysmography (PPG) is a noninvasive optical method accepted in clinical use for measurements of arterial oxygen saturation. Nowadays, the PPG signal is also measured by wearable devices. The presented novelty is a procedure for studying the dynamics of biomedical signals, using wavelet scattering transform-based features to classify segments of a signal into two classes: chaotic and non-chaotic. Moreover, a measure of chaoticity is proposed. The classification is based on a model trained on a carefully prepared training dataset, which consists of signals from models with known characteristics. The aim of the study was to demonstrate the usefulness of the wavelet scattering transform for the analysis of biomedical signals on the example of the PPG signal, and to indicate the importance of preparing the training set.
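As an illustration of the feature-extraction step, a minimal sketch assuming the kymatio package; the segment length and filter-bank parameters (J, Q) are illustrative assumptions.

    import numpy as np
    from kymatio.numpy import Scattering1D

    T = 2 ** 11                          # segment length in samples
    scattering = Scattering1D(J=6, shape=T, Q=8)

    segment = np.random.randn(T)         # stand-in for one PPG segment
    Sx = scattering(segment)             # scattering coefficients over time
    features = Sx.mean(axis=-1)          # time-averaged feature vector

A standard classifier trained on such vectors, labelled chaotic or non-chaotic from the known-model training signals, then yields the proposed segment classification.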
Convolutional neural network compression via tensor-train decomposition on permuted weight tensor with automatic rank determination
ABSTRACT. Convolutional neural networks (CNNs) are among the most commonly investigated models in computer vision. Deep CNNs yield high computational performance, but their common issue is a large size. To solve this problem, it is necessary to find compression methods which can effectively reduce the size of the network while keeping the accuracy at a similar level. This study provides important insights into the field of CNN compression, introducing a novel low-rank compression method based on tensor-train decomposition of a permuted kernel weight tensor with automatic rank determination. The proposed method is easy to implement, and it allows us to fine-tune neural networks from decomposed factors instead of learning them from scratch. The results of this study, examined on various CNN architectures and two datasets, demonstrate that the proposed method outperforms other CNN compression methods with respect to parameter and FLOPS compression, with only a small drop in classification accuracy.
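The decomposition stage can be sketched as a TT-SVD in which each rank is chosen automatically as the smallest number of singular values retaining a given energy fraction; the energy threshold is an illustrative stand-in for the paper's rank-determination rule, and the kernel permutation is assumed to be done by the caller.

    import numpy as np

    def tt_decompose(tensor, energy=0.99):
        # Sequential SVDs turn an order-n tensor into n TT cores; each rank is
        # the smallest count of singular values keeping `energy` of the spectrum.
        dims, cores, r_prev = tensor.shape, [], 1
        mat = tensor.reshape(r_prev * dims[0], -1)
        for k in range(len(dims) - 1):
            U, S, Vt = np.linalg.svd(mat, full_matrices=False)
            keep = np.searchsorted(np.cumsum(S**2) / np.sum(S**2), energy) + 1
            cores.append(U[:, :keep].reshape(r_prev, dims[k], keep))
            r_prev = keep
            mat = (S[:keep, None] * Vt[:keep]).reshape(r_prev * dims[k + 1], -1)
        cores.append(mat.reshape(r_prev, dims[-1], 1))
        return cores

    cores = tt_decompose(np.random.randn(8, 8, 9, 9))   # e.g. a permuted kernel

Fine-tuning then proceeds directly on the cores, which replace the original convolution kernel in the network.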
Your Social Circle Affects Your Interests: Social Influence Enhanced Session-Based Recommendation
ABSTRACT. Session-based recommendation aims at predicting the next item given a series of historical items a user interacts with in a session. Many works try to make use of social networks to achieve better recommendation performance. However, existing works treat the weights of user edges as identical and thus neglect the differences in social influence among users in a social network, even though users' social circles differ widely. In this work, we utilize an explicit way to describe the impact of social influence in a recommender system. Specifically, we build a heterogeneous graph composed of user and item nodes. We argue that the fewer neighbors users have, the more likely they are to be influenced by those neighbors, and that different neighbors may have various influences on users. Hence, weights of user edges are computed to characterize the different influences of social circles on users in a recommendation simulation. Moreover, based on the number of followers and the PageRank score of each user, we introduce various methods for computing the weights of user edges from a comprehensive perspective. Extensive experiments performed on three public datasets demonstrate the effectiveness of our proposed approach.
Action Recognition in Australian Rules Football through Deep Learning
ABSTRACT. Understanding players' actions and activities in sports is crucial to analyzing player and team performance. Within Australian Rules football, such data is typically captured manually by multiple (paid) spectators working for sports data analytics companies. This data is augmented with data from GPS tracking devices in player clothing. This paper explores the feasibility of action recognition in Australian Rules football through deep learning and the use of 3-dimensional Convolutional Neural Networks (3D CNNs). We identify several key actions that players perform: kick, pass, mark, and contested mark, as well as non-action events such as images of the crowd or players running with the ball. We explore various state-of-the-art deep learning architectures and develop a custom dataset containing over 500 video clips targeted specifically at Australian Rules football. We fine-tune a variety of models and achieve a top-1 accuracy of 77.45% using an R(2+1)D ResNet-152. We also consider team and player identification and tracking using the You Only Look Once (YOLO) and Simple Online and Realtime Tracking with a deep association metric (DeepSORT) algorithms. To the best of our knowledge, this is the first paper to address the topic of action recognition in Australian Rules football.
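A minimal fine-tuning sketch, assuming a recent torchvision; torchvision ships the smaller r2plus1d_18, used here only as a stand-in for the paper's R(2+1)D ResNet-152, and the training loop is schematic.

    import torch
    import torch.nn as nn
    from torchvision.models.video import r2plus1d_18

    classes = ['kick', 'pass', 'mark', 'contested mark', 'non-action']
    model = r2plus1d_18(weights='DEFAULT')                    # pretrained backbone
    model.fc = nn.Linear(model.fc.in_features, len(classes))  # new action head

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    # for clips, labels in loader:        # clips: (B, 3, T, H, W) video tensors
    #     loss = loss_fn(model(clips), labels)
    #     optim.zero_grad(); loss.backward(); optim.step()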
Computational Challenges for Biomolecular Simulation Approaches to Drug Discovery
ABSTRACT. Recent advances in artificial intelligence approaches to protein structure prediction offer the exciting prospect of vastly extending the number of disease targets that can be tackled by structure-based drug design approaches. Proteins are, however, constantly in motion, changing their shape and wiggling and jiggling around. Moreover, the protein motions cover a wide range of spatial and temporal scales that influence drug-target affinity and binding kinetics. The dynamic nature of protein structures thus provides challenges and opportunities for drug design. I will discuss the use of multiresolution molecular simulation techniques and machine learning to study macromolecular structural ensembles, including recent examples from studies of SARS-CoV-2 proteins.
Dynamic classification of bank clients by the predictability of their transactional behavior
ABSTRACT. We propose a method for the dynamic classification of bank clients by the predictability of their transactional behavior (with respect to the bank's chosen prediction model, quality metric, and predictability measure). The method adopts incremental learning to perform client segmentation based on predictability profiles and can be used by banks not only for identifying currently predictable (and thus, in a sense, profitable) clients but also for analyzing client dynamics during economic periods of different types. Our experiments show that (1) bank clients can be effectively divided into predictability classes dynamically, (2) the quality of prediction and classification models is significantly higher with the proposed incremental approach than without it, and (3) clients exhibited different transactional behavior in terms of predictability before and during the COVID-19 pandemic. The source code, public datasets, and results related to our study are available on GitHub.
Stock Predictor with Graph Laplacian-based Multi-task Learning
ABSTRACT. The stock market is a complex network that consists of individual stocks exhibiting various financial properties and different data distributions. For stock prediction, it is natural to build separate models for each stock while also considering the complex hidden correlations among sets of stocks. We propose a federated multi-task stock predictor with financial graph Laplacian regularization (FMSP-FGL). Specifically, we first introduce a federated multi-task framework with graph Laplacian regularization to fit separate but related stock predictors simultaneously. Then, we investigate the problem of graph Laplacian learning, which captures the associations among the dynamic stocks. We show that the proposed optimization problem with financial Laplacian constraints captures both the inter-series correlation between each pair of stocks and the relationship within the same stock cluster, which helps improve predictive performance. Empirical results on two popular stock indexes demonstrate that the proposed method outperforms baseline approaches. To the best of our knowledge, this is the first work to utilize the advantage of the graph Laplacian in multi-task learning for financial data to predict multiple stocks in parallel.
Forecasting bank default with the Merton model: The case of US banks
ABSTRACT. This paper examines whether the probability of default (Merton, 1974) can be applied to banks' default prediction. Using the case of US banks in the post-crisis period (2010–2014), we estimate several Cox proportional hazards models and assess their out-of-sample performance. We find that the Merton measure, that is, the probability of default, is not a sufficient statistic for predicting bank default, while, at the 6-month forecasting horizon, it is an extremely significant predictor and its functional form is a useful construct for predicting bank default. The findings suggest that (i) predicting banks' defaults over mid- to long-term horizons can be done more effectively by adding the inverse of equity volatility and the ratio of net income to total assets, and (ii) the role of the capital adequacy ratio is doubtful even in short-run default prediction.
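For reference, the Merton probability of default follows from treating equity as a call option on the bank's assets; a worked sketch, assuming asset value and volatility have already been backed out from equity data (the numeric inputs are illustrative):

    from math import log, sqrt
    from scipy.stats import norm

    def merton_pd(V, D, mu, sigma, T=0.5):
        # V: asset value, D: face value of debt, mu: asset drift,
        # sigma: asset volatility, T: horizon in years (6 months here).
        # Distance to default, then PD = N(-DD).
        dd = (log(V / D) + (mu - 0.5 * sigma**2) * T) / (sigma * sqrt(T))
        return norm.cdf(-dd)

    print(merton_pd(V=120.0, D=100.0, mu=0.05, sigma=0.25, T=0.5))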
ARIMA Feature-Based Approach to Time Series Classification
ABSTRACT. Time series classification is a supervised learning problem that aims at labelling time series according to their class membership. Time series can be of variable length. Many algorithms have been proposed, among which feature-based approaches play a key role, but not all of them are able to deal with time series of unequal lengths. In this paper, a new feature-based approach to time series classification is proposed. It is based on ARIMA models constructed for each time series to be classified. In particular, it uses the ARIMA coefficients, together with sampled time series data points, to form a classification model. The proposed method was tested on a suite of benchmark data sets, and the obtained results are compared with those provided by state-of-the-art approaches. The proposed method achieves satisfying classification accuracy and is suitable for time series of unequal lengths.
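A sketch of the feature construction, assuming statsmodels and illustrative choices of ARIMA order and number of sampled points; fitting a model per series gives every series a fixed-length vector regardless of its length.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    def arima_features(y, order=(2, 0, 1), n_samples=8):
        # Model-based features: fitted ARIMA coefficients (AR, MA, const, sigma2).
        coefs = np.asarray(ARIMA(y, order=order).fit().params)
        # Data-based features: n_samples points sampled evenly along the series.
        idx = np.linspace(0, len(y) - 1, n_samples).astype(int)
        return np.concatenate([coefs, y[idx]])

    # X = np.vstack([arima_features(np.asarray(s)) for s in series_list])
    # then train any standard classifier on (X, labels)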
ABSTRACT. The increasing volume and variety of science data has led to the creation of metadata extraction systems that automatically derive and synthesize relevant information from files. A fundamental component of metadata extraction systems is the mapping of extractors (lightweight tools to mine information from particular file types) to each file in a repository. However, existing methods do little to address the variety and scale of science data, thereby leaving valuable data unextracted or wasting significant compute resources by applying incorrect extractors to data. We construct an extractor scheduler that leverages file type identification (FTI) methods. We show that by training lightweight multi-label, multi-class statistical models on byte samples from files, we can correctly map 35% more extractors to files than by using libmagic. Further, we introduce a metadata quality toolkit to automatically assess the utility of extracted metadata.
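A minimal sketch of the FTI idea, under illustrative assumptions (512-byte head samples, byte histograms, a one-vs-rest logistic model) rather than the authors' exact pipeline.

    import numpy as np
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.linear_model import LogisticRegression

    def byte_histogram(path, n_bytes=512):
        # Normalized histogram of the first n_bytes of the file.
        with open(path, 'rb') as f:
            head = np.frombuffer(f.read(n_bytes), dtype=np.uint8)
        return np.bincount(head, minlength=256) / max(len(head), 1)

    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    # X = np.vstack([byte_histogram(p) for p in files])   # one row per file
    # clf.fit(X, Y)    # Y: binary matrix, Y[i, j] = extractor j applies to file i
    # applicable = clf.predict(X_new)                     # schedule these extractors

The multi-label formulation matters because a single file can legitimately feed several extractors at once, which a single-label identifier such as libmagic cannot express.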
Validation and Optimisation of Player Motion Models in Football
ABSTRACT. Modelling the trajectorial motion of humans along the ground is a foundational task in the quantitative analysis of sports like association football. Most existing models of football player motion have not yet been validated against actual data. One reason for this is that performing such a validation is not straightforward, because models of player motion are usually phrased in a way that emphasises possibly reachable positions rather than expected positions. Since positional data of football players typically contains outliers, this data may misrepresent the range of actually reachable positions.
This paper proposes a validation routine for trajectorial motion models that measures and optimises the ability of a motion model to accurately predict all possibly reachable positions, by favoring the smallest predicted area of reachable positions that encompasses all observed reached positions up to a manually defined threshold. We demonstrate validation and optimisation on four different motion models, assuming (a) motion with constant speed, (b) motion with constant acceleration, (c) motion with constant acceleration with a speed limit, and (d) motion along two segments with constant speed. Our results show that assuming motion with constant speed or constant acceleration without a limit on the achievable speed is particularly inappropriate for an accurate distinction between reachable and unreachable locations. Motion along two segments of constant speed provides by far the highest accuracy among the tested models and serves as an efficient and accurate approximation of real-world player motion.
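For the simplest model (a), the proposed validation idea can be sketched as follows; the grid search and coverage threshold are illustrative stand-ins for the paper's optimisation procedure.

    import numpy as np

    def mean_area(v, starts, targets, dts, coverage=0.99):
        # Reachable set for constant speed v: a disc of radius v*dt around the
        # start position; a reached position counts as covered if inside it.
        dist = np.linalg.norm(targets - starts, axis=1)
        if np.mean(dist <= v * dts) < coverage:
            return None                       # model misses too many positions
        return np.mean(np.pi * (v * dts) ** 2)

    def fit_speed(starts, targets, dts, grid=np.linspace(1.0, 12.0, 200)):
        # Pick the speed giving the smallest mean predicted area that still
        # covers the observed positions up to the outlier threshold.
        feasible = [(mean_area(v, starts, targets, dts), v) for v in grid]
        return min((a, v) for a, v in feasible if a is not None)[1]

The same area-versus-coverage trade-off applies unchanged to the acceleration-based and two-segment models; only the shape of the predicted reachable set differs.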
A hypothetical agent-based model inspired by the abstraction of solitary behavior in tigers and its employment as a chain code for compression
ABSTRACT. In this paper, we design an agent-based modeling simulation that represents the solitary behavior of tigers and utilize it in encoding image information. Our model mainly depends on converting the digital data into a virtual environment that has paths marked differently based on the allocation of the original data itself. Then, we introduce virtual tigers to the environment to begin the encoding process. These tiger agents are kept separated from each other, with the algorithm monitoring their movements. Tigers follow a relative movement style in which each tiger's movement direction is encoded based on the previous one. This encoding approach allows particular movements that occur in different directions to be encoded in a similar way. After that, we apply Huffman coding to the chain of movements in order to reduce its size and obtain a new representation. The experimental findings reveal that we obtain better results than leading standards in bi-level image compression, including the JBIG family of methods. Our findings strengthen those of previous studies that incorporated biological behaviors within agent-based modeling simulations and provide a new abstraction to be utilized in information-processing research.
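The final stage can be illustrated with a standard Huffman coder over the chain of relative movement symbols; the three-symbol alphabet (forward, left, right) is an assumption for illustration, not the paper's exact chain code.

    import heapq
    from collections import Counter

    def huffman_codes(chain):
        # Each heap entry: [frequency, tie-break index, {symbol: code}].
        heap = [[n, i, {s: ''}] for i, (s, n) in enumerate(Counter(chain).items())]
        heapq.heapify(heap)
        while len(heap) > 1:
            lo, hi = heapq.heappop(heap), heapq.heappop(heap)
            merged = {s: '0' + c for s, c in lo[2].items()}
            merged.update({s: '1' + c for s, c in hi[2].items()})
            heapq.heappush(heap, [lo[0] + hi[0], lo[1], merged])
        return heap[0][2]

    chain = ['F', 'L', 'F', 'F', 'R', 'F', 'L', 'F']   # relative moves of a tiger
    codes = huffman_codes(chain)
    encoded = ''.join(codes[s] for s in chain)

Because relative encoding maps frequent movement patterns onto the same symbols regardless of absolute direction, the symbol distribution is skewed, which is exactly what Huffman coding exploits.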
Network clustering-based approach for the search of reservoir analogues
ABSTRACT. This article presents a new look at the problem of finding analogue reservoirs, representing reservoirs as a network and treating the search for analogues as a community detection problem in the network. The proposed network approach allows us to effectively search for a cluster of analogues and to restore missing parameters in the target reservoir based on the found clusters of analogues. Various methods of building a network of reservoirs and various algorithms for finding clusters in the network were tested, and conclusions were drawn about the most effective ones for reservoirs. The network approach was also compared with a baseline approach and showed greater efficiency. Three approaches to restoring gaps in the target reservoir using clusters of analogues were also compared, and a conclusion was drawn about the most effective one for the case of several gaps in the reservoir. All experiments were carried out on a real database of reservoirs.
Analyzing the usefulness of public web camera video sequences for calibrating and validating pedestrian dynamics models
ABSTRACT. Calibration and validation of pedestrian dynamics models are usually conducted using data obtained in experiments with groups of people. An interesting alternative to data obtained in such a way is data from public web cameras. The article presents a case study of using public web cameras in the analysis of pedestrian dynamics and social behavior.
Analysis of public transport (in)accessibility and land use pattern in different areas in Singapore
ABSTRACT. As more and more people continue to live in highly urbanised areas across the globe, reliable accessibility to amenities and services plays a vital role in sustainable development. One of the challenges in addressing this issue is the consistent and equal provision of public services, including transport, for residents across the urban system. In this study, using a novel computational method combining geometrical analysis and information-theoretic measures, we analyse the accessibility of public transport in terms of the spatial coverage of the transport nodes (stops) and the quality of service at these nodes across different areas. Furthermore, using a network clustering procedure, we also characterise the land use pattern of those areas and relate it to their public transport accessibility. Using Singapore as a case study, we find that the commercial areas in the CBD expectedly have excellent accessibility, and the residential areas also have good to very good accessibility. However, not every residential area is equally accessible. While the spatial coverage of stops in these areas is very good, the quality of service shows substantial variation among different areas, with high contrast between the central and eastern regions compared to the others in the west and north of the city-state. We believe this kind of analysis could yield a good understanding of the current level of public transport services across the urban system and of their disparity, providing valuable and actionable insights for future development plans.
Private and public opinions in a model based on the total dissonance function: A simulation study
ABSTRACT. We study an agent-based model of opinion dynamics in which an agent's private opinion may differ significantly from that expressed publicly. The model is based on the so-called total dissonance function. The behavior of the system depends on the competition between the latter and social temperature. We focus on a special case of parental and peer influence on adolescents. In such a case, as the temperature rises, Monte Carlo simulations reveal a sharp transition between a state with and a state without private-public opinion discrepancy. This may have far-reaching consequences for developing marketing strategies.
Augmenting Graph Inductive Learning Model With Topographical Features
ABSTRACT. Knowledge Graph (KG) completion aims to find the missing entities or relationships in a knowledge graph. Although many approaches have been proposed to construct complete KGs, graph embedding methods have recently gained massive attention. These methods perform well in transductive settings, where the entire collection of entities must be known during training. However, it is still unclear how effectively the embedding methods capture the relational semantics when new entities are added to KGs over time. This paper proposes a method, AGIL, for learning relational semantics in knowledge graphs to address this issue. Given a pair of nodes in a knowledge graph, our proposed method extracts a subgraph that contains common neighbors of the two nodes. The subgraph nodes are then labeled based on their distance from the two input nodes. Some heuristic features are computed and given, along with the adjacency matrix of the subgraph, as input to a graph neural network. The GNN predicts the likelihood of a relationship between the two nodes. We conducted experiments on five real datasets to demonstrate the effectiveness of the proposed framework. In relation prediction, AGIL outperforms the baselines in both the inductive and transductive settings.
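A minimal sketch of the subgraph-extraction and distance-labelling step described above, using networkx; the hop limit and function names are illustrative assumptions, and the GNN scoring step is omitted.

import networkx as nx

def enclosing_subgraph(G, u, v, hops=2):
    # nodes within `hops` of both endpoints, i.e. the common neighbourhood
    near_u = nx.single_source_shortest_path_length(G, u, cutoff=hops)
    near_v = nx.single_source_shortest_path_length(G, v, cutoff=hops)
    nodes = set(near_u) & set(near_v)
    sub = G.subgraph(nodes).copy()
    # distance-pair labels, usable as initial node features for a GNN
    labels = {n: (near_u[n], near_v[n]) for n in nodes}
    return sub, labels

G = nx.karate_club_graph()
sub, labels = enclosing_subgraph(G, 0, 33)
print(sub.number_of_nodes(), "nodes; label of node 0:", labels[0])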
Time Series Attention Based Transformer Neural Turing Machines for Diachronic Graph Embedding in Cyber Threat Intelligence
ABSTRACT. Cyber threats often affect individuals, organizations and countries to different degrees and evolve continuously over time. Cyber Threat Intelligence (CTI) is an effective approach to solving cyber security problems. However, existing processes are inherently responses to known threats, and CTI experts recommend proactively checking for emerging threats in existing knowledge. In addition, most research focuses on static snapshots of the CTI knowledge graph, while ignoring the temporal dynamics. To this end, we create TSA-TNTM (Time Series Attention based Transformer Neural Turing Machines), a novel diachronic graph embedding framework, which uses time series self-attention to capture non-linearly evolving entity representations over time. We demonstrate significantly improved performance over various approaches. A series of benchmark experiments illustrate that TSA-TNTM generates higher-quality embeddings than state-of-the-art word embedding models in tasks pertaining to semantic analogy, clustering and threat classification, and proactively identifies emerging threats in CTI fields.
Sensitivity Analysis and Machine Learning to emulate the Level-Ice Melt Pond Parametrisation
ABSTRACT. Sea ice plays an essential role in global ocean circulation and in regulating Earth's climate and weather. Melt ponds that form on the ice have a profound impact on the Arctic's climate, and their evolution is one of the main factors affecting sea-ice albedo and hence the polar climate system. Sea ice's recent rapid decline is an alteration to the global climate that general circulation models (GCMs) systematically underestimate. This has been attributed in large part to small-scale processes that are not sufficiently captured by large-scale models. One of these processes is the formation of melt ponds on the surface of sea ice.
Accounting for the phases of evolution in albedo through the annual ice melting process in numerical models requires a parametrisation of melt ponds. Numerical experiments with sea-ice models demonstrate the sensitivity of the ice thickness to melt-pond parametrisations, and studies have shown that models lacking a melt pond parametrisation can overestimate the summertime sea ice thickness by up to 40%.
Melt pond parametrisations have increased in complexity, demonstrating notable success. Parametrisations of these physical processes are based on a number of assumptions and can include many uncertain parameters that have a substantial effect on the simulated evolution of the melt ponds. Determining the values of these parameters more accurately through observational studies, or improving the parametrisations in sea ice models, therefore remains an important task. With limited ability for in-situ observation of melt ponds, understanding the influence of each uncertain parameter on melt pond evolution is crucial.
In this study we take a state-of-the-art sea ice column physics model, Icepack, and conduct a global sensitivity analysis of all melt pond parameters. Sobol sequences are employed to ensure we sample the full parameter space, and an ensemble of perturbed-parameter simulations is performed. We focus our analysis on the effect of these parameter values on the simulated ice area fraction, ice thickness, effective pond area and total albedo. Results from the sensitivity analysis indicate that parameters controlling the amount of melt water allowed to run off to the ocean have a substantial effect on the total sea ice volume and its albedo. The results also reveal that meltwater added to the melt ponds in the early melting season plays a greater role in influencing sea ice properties.
Several studies have highlighted the potential of machine learning based parametrisation schemes, and machine learning has shown remarkable success in representing subgrid-scale processes and other parametrisations of global circulation models. With an increased understanding of the influence of Icepack's melt pond parametrisation through our Sobol sensitivity analysis, we employ various machine learning algorithms and test their ability to emulate the melt pond parametrisation in Icepack. As part of ongoing work, we also seek to train our machine learning scheme on observations, to understand whether machine learning can provide improvements on existing parametrisations of physical sea ice processes.
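A minimal sketch of a Sobol sensitivity analysis of the kind described, using the SALib library; the parameter names, bounds, and toy model are illustrative stand-ins for Icepack's actual melt-pond parameters and simulations.

import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

# Hypothetical problem definition; names and bounds are not Icepack's.
problem = {
    "num_vars": 3,
    "names": ["pond_depth_frac", "runoff_frac", "pond_albedo"],
    "bounds": [[0.1, 0.9], [0.0, 1.0], [0.1, 0.4]],
}

X = saltelli.sample(problem, 1024)  # Sobol' sequence sampling design

def toy_model(x):
    # placeholder for a column simulation returning, e.g., ice thickness
    return x[0] * np.sin(x[1]) + 2.0 * x[2] ** 2

Y = np.apply_along_axis(toy_model, 1, X)
Si = sobol.analyze(problem, Y)
print(dict(zip(problem["names"], Si["S1"].round(3))))  # first-order indices
print(dict(zip(problem["names"], Si["ST"].round(3))))  # total-order indices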
Enforcing State Constraints in Dynamical Systems Modelled with Neural Networks
ABSTRACT. Deep neural networks (NNs) are usually trained with unconstrained optimisation algorithms. With reasoning similar to the constrained Kalman filter, incorporating known information in the form of equality constraints at certain checkpoints can potentially improve prediction accuracy. For continuous-time dynamical systems, the state constraints should be enforced in an ordinary differential equation (ODE) model which embeds NNs to represent a learned part of the dynamics or a control policy. To this end, incremental correction methods are developed for post-processing of dynamical systems modelled with NNs whose parameters have been determined by a previous optimisation process. The proposed approach is to find the small amount of local correction needed to satisfy given state constraints with the updated solution. Algorithms for updating the neural network parameters and the control function are considered.
On-Edge Aggregation Strategies over Industrial Data Produced by Autonomous Guided Vehicles
ABSTRACT. Industrial IoT systems, such as those based on Autonomous Guided Vehicles (AGV), often generate a massive volume of data that needs to be processed and sent over to the cloud or private data centers. The presented research proposes and evaluates approaches to data aggregation that help reduce the volume of readings from AGVs by taking advantage of the edge computing paradigm. For the purposes of this article, we developed a processing workflow that retrieves data from AGVs, persists it in a local edge database, aggregates it in predefined time windows, and sends it to the cloud for further processing. We propose two aggregation methods used in the considered workflow. We evaluated the developed workflow with different data sets and ran experiments that allowed us to highlight the data volume reduction for each tested scenario. The results of the experiments show that solutions based on edge devices such as the Jetson Xavier NX and technologies such as TimescaleDB can successfully reduce the volume of data in pipelines that process data from Autonomous Guided Vehicles. Additionally, the use of edge computing paradigms improves resilience to data loss in cases of network failures in such industrial systems.
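A minimal sketch of time-window aggregation at the edge using pandas; the telemetry columns, 10 Hz sampling rate, and window length are illustrative assumptions rather than the article's actual workflow.

import numpy as np
import pandas as pd

# Synthetic AGV telemetry at 10 Hz (column names are hypothetical)
idx = pd.date_range("2022-01-01", periods=6000, freq="100ms")
raw = pd.DataFrame({
    "battery_v": np.random.normal(48, 0.5, len(idx)),
    "speed_mps": np.random.uniform(0, 2, len(idx)),
}, index=idx)

# Aggregate in predefined 10-second windows before uploading to the cloud
agg = raw.resample("10s").agg({"battery_v": ["mean", "min"],
                               "speed_mps": "max"})
print(f"{len(raw)} raw rows -> {len(agg)} aggregated rows")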
Acquisition, storing, and processing system for interdisciplinary research in Earth sciences
ABSTRACT. The article presents the results of research carried out as part of the interdisciplinary cooperation of scientists in the fields of geochemistry and computer science. Such a model of cooperation is justified and especially purposeful in resolving various environment protection tasks, including the issues of energy transformation and climate change. The research regards an air quality monitoring case study conducted in Ochotnica, southern Poland. The environmental data have been collected, stored, and processed using a set of sensor stations as well as a data storing and processing service. The stations and the service are very flexible and customizable, and they have been successfully designed, implemented, and tested by the authors of this paper in the mentioned air quality monitoring case study. The collaboration in the conducted research has been an opportunity to create and test in practice a comprehensive, versatile and configurable data acquisition and processing system which supports not only this use case but can also be applied to a wide variety of general data acquisition and data analysis purposes.
Detecting SQL Injection vulnerabilities using nature-inspired algorithms
ABSTRACT. In the past years, the number of users of web applications has increased, and so has the number of critical vulnerabilities in these web applications. Web application security implies building websites that function as expected even when they are under attack. SQL Injection is a web vulnerability caused by mistakes made by programmers that allows an attacker to interfere with the queries that an application makes to its database. In many cases, an attacker can see, modify or delete data without proper authorization. In this paper, we propose an approach to detect SQL injection vulnerabilities in the source code using nature-inspired algorithms: Genetic Algorithms (GA), Artificial Bee Colony (ABC), and Ant Colony Optimization (ACO). To test this approach empirically, we used purposefully vulnerable web applications such as Bricks, bWAPP, and Twitterlike. We also perform comparisons with other tools from the literature. The simulation results verify the effectiveness and robustness of the proposed approach.
Federated Learning for Anomaly Detection in Industrial IoT-enabled Production Environment Supported by Autonomous Guided Vehicles
ABSTRACT. Intelligent production requires maximum downtime avoidance since downtimes lead to economic loss for companies. Thus, the main idea of Industry 4.0 is automated production with real-time decision-making. For this purpose, new technologies such as Machine Learning (ML), Artificial Intelligence (AI), and Autonomous Guided Vehicles (AGVs) are integrated into production to optimize and automate many production processes. The increasing use of AGVs in production has far-reaching consequences for industrial communication systems. To make AGVs in production even more effective, we propose to use Federated Learning (FL), which provides a secure exchange of experience between intelligent manufacturing devices to improve prediction accuracy. We conducted research in which we exchanged experience between three virtual devices, and the results confirm the effectiveness of this approach in the production environment.
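A minimal sketch of the federated averaging (FedAvg) aggregation that typically underlies such an exchange of experience; the toy model shapes and client sizes are assumptions, not the paper's setup.

import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg: average model parameters weighted by local dataset size.
    client_weights: one list of numpy arrays per device;
    client_sizes:   number of local training samples per device."""
    total = sum(client_sizes)
    return [
        sum(w[layer] * (n / total)
            for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Three simulated AGVs, each holding a toy two-layer model
rng = np.random.default_rng(0)
clients = [[rng.normal(size=(4, 2)), rng.normal(size=2)] for _ in range(3)]
global_model = fed_avg(clients, client_sizes=[120, 300, 80])
print([w.shape for w in global_model])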
Performance of Explainable AI methods in asset failure prediction
ABSTRACT. Extensive research on machine learning models, the majority of which are black boxes, has created a great need for the development of Explainable Artificial Intelligence (XAI) methods.
Complex ML models usually require an external explanation method to understand their decisions.
The interpretation of model predictions is crucial in many fields, e.g., predictive maintenance, where it is not only required to evaluate the state of an asset, but also to determine the root causes of a potential failure.
In this work, we present a comparison of state-of-the-art ML models and XAI methods, which we used for the prediction of the remaining useful life (RUL) of aircraft turbofan engines.
We trained five different models on the C-MAPSS dataset and used SHAP and LIME to assign numerical importance to the features.
We have compared the results of explanations using stability and consistency metrics and evaluated the explanations qualitatively by visual inspection.
The obtained results indicate that the SHAP method outperforms other methods in the fidelity of explanations.
We observe that there are substantial differences in the explanations depending on the selection of the model and XAI method; thus, we find a need for further research in the XAI field.
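A minimal sketch showing how SHAP and LIME assign feature importances to the same model; the synthetic data stands in for C-MAPSS, and the model choice and settings are illustrative assumptions.

import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for sensor features and RUL targets
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = X[:, 0] * 3 + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)
model = RandomForestRegressor(n_estimators=100).fit(X, y)

# SHAP: tree-based additive attributions for the first 10 instances
shap_values = shap.TreeExplainer(model).shap_values(X[:10])

# LIME: local surrogate model around one instance
explainer = LimeTabularExplainer(X, mode="regression")
lime_exp = explainer.explain_instance(X[0], model.predict, num_features=6)
print(shap_values[0].round(2), lime_exp.as_list()[:3])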
Supporting education in advanced mathematical analysis problems by symbolic computational techniques using computer algebra system
ABSTRACT. In this paper we present two didactic examples of the use of Mathematica's symbolic calculations in problems of mathematical analysis, which we prepared for students of the Warsaw University of Life Sciences. In Example 1 we solve a convex optimization problem, and in Example 2 we calculate complex integrals. We also describe a didactic experiment for students of the Informatics and Econometrics Faculty of the Warsaw University of Life Sciences.
TEDA: A Computational Toolbox for Teaching Ensemble Based Data Assimilation
ABSTRACT. This paper presents TEDA, an intuitive Python toolbox for Teaching Ensemble-based Data Assimilation (DA). This toolbox responds to the necessity of having software for teaching and learning topics related to ensemble-based DA; this process can be critical to motivating undergraduate and graduate students towards scientific topics such as meteorological anomalies and climate change. Most DA toolboxes are related to operational software wherein the learning process of concepts and methods is not the main focus. TEDA facilitates the teaching and learning of DA concepts via multiple plots of error statistics and by providing different perspectives on numerical results, such as model errors, observational errors, error distributions, the time evolution of errors, ensemble-based DA, covariance matrix inflation, precision matrix estimation, and covariance localization methods, among others. By default, the toolbox is released with five well-known ensemble-based DA methods: the stochastic ensemble Kalman filter (EnKF), the dual EnKF formulation, the EnKF via Cholesky decomposition, the EnKF based on a modified Cholesky decomposition, and the EnKF based on B-localization. In addition, TEDA comes with three toy models: the Duffing equation (2 variables), the Lorenz 63 model (3 variables), and the Lorenz 96 model (40 variables), all of which exhibit chaotic behavior for some parameter configurations, which makes them attractive for testing DA methods. We develop the toolbox using the Object-Oriented Programming (OOP) paradigm, which makes incorporating new techniques and models into the toolbox easy. We can simulate several DA scenarios for different configurations of models and methods to better understand how ensemble-based DA methods work.
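For readers unfamiliar with the first method listed, here is a minimal numpy sketch of one stochastic EnKF analysis step with perturbed observations; the array sizes and values are illustrative, and this is not TEDA's actual code.

import numpy as np

def enkf_analysis(E, y, H, R, rng):
    """Stochastic EnKF analysis step.
    E: n x m ensemble; y: observations; H: obs operator; R: obs covariance."""
    n, m = E.shape
    A = E - E.mean(axis=1, keepdims=True)            # ensemble anomalies
    Pf = A @ A.T / (m - 1)                           # sample forecast covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # perturbed observations, one realisation per ensemble member
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, m).T
    return E + K @ (Y - H @ E)

rng = np.random.default_rng(1)
E = rng.normal(size=(3, 20))             # 20-member ensemble, 3 state variables
H = np.array([[1.0, 0.0, 0.0]])          # observe the first variable only
Ea = enkf_analysis(E, y=np.array([0.5]), H=H, R=np.eye(1) * 0.1, rng=rng)
print(E.mean(axis=1).round(2), "->", Ea.mean(axis=1).round(2))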
To Have and Have Not: Addressing inequities for learners accessing computational science environments
ABSTRACT. Learning computational science techniques involves acquiring both domain-based and technical skills, as well as having access to sufficient compute resources. Infrastructures enabling computational science for education might include HPC clusters at universities, platforms from industry such as AWS, Microsoft Azure or Google, or individual servers with GPUs that students can connect to. However, for educators with limited funding, obtaining access to needed resources is difficult, as the demand for resources continues to increase. In this paper we ask: in the absence of access to machines, how can educators navigate computing environment options, both public and privately-run, with respect to factors such as the number of students, possible maintenance issues, and the cost to compute results? We outline options for educators with no (free) access to school resources, with specific attention to cost, for example, explaining the notoriously difficult-to-understand pricing of compute jobs on commercial platforms, or the cost of a node on campus versus purchasing an on-site machine. We find that, while funding resources might be available, understanding how to budget for computational spaces for coursework can be challenging. We also find that solutions are not tailored to educational settings where there might be a large number of students needing individual instances to learn. We feel that nascent steps such as these to offer options for obtaining appropriate resources are but one way to address the digital divide in access to resources.
Modeling Approach in Teaching Differential Equations
ABSTRACT. Teaching methodology is evolving with modern times. Gone are the days when students learned differential equations purely procedurally, from the analytical perspective. There is a great need to update pedagogies and introduce a modeling-first approach to appreciate differential equations techniques. Inspired by the MINDE workshop, I introduced the modeling-first approach in my classroom teaching to welcome inquiry-oriented learning. Data collection, data visualization, and parameter estimation using technology have led to a better understanding of mathematical modeling with differential equations, not only for Math majors but for all STEM students who take differential equations as a core course. In this talk, I will present my effort to incorporate several modeling scenarios in my differential equations and mathematical modeling classes.
SEGP: Stance-Emotion joint Data Augmentation with Gradual Prompt-tuning for Stance detection
ABSTRACT. Stance detection is an important task in opinion mining, which aims to determine whether the author of a text is in favor of, against, or neutral towards a specific target. The scarcity of annotations remains one of the open problems in stance detection. In this paper, we propose a Stance-Emotion joint Data Augmentation with Gradual Prompt-tuning (SEGP) model to address this problem. In order to generate more training samples, we propose an auxiliary-sentence-based Stance-Emotion joint Data Augmentation (SEDA) method, which formulates data augmentation as a conditional masked language modeling task. We leverage different relations between stance and emotion to construct auxiliary sentences. SEDA generates augmented samples by predicting the masked words conditioned on both their context and the auxiliary sentences. Furthermore, we propose a Gradual Prompt-tuning method to make better use of the augmented samples, which combines prompt-tuning with curriculum learning. Specifically, the model starts by training on only original samples, then adds augmented samples as training progresses. Experimental results show that SEGP significantly outperforms state-of-the-art approaches.
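A minimal sketch of conditional masked-language-model augmentation in the spirit of SEDA, using the HuggingFace fill-mask pipeline; the model choice and the auxiliary sentence are illustrative assumptions, not the paper's configuration.

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Hypothetical auxiliary sentence encoding an assumed stance/emotion relation
aux = "The author is against this target and feels anger."
sample = "climate change is a [MASK] that we must stop."

# The mask is predicted conditioned on both the context and the auxiliary
# sentence, yielding candidate augmented samples
for cand in fill(aux + " " + sample, top_k=3):
    print(cand["token_str"], round(cand["score"], 3))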
Virtual Reality Prototype of a Linear Accelerator Simulator for Oncological Radiotherapy Training
ABSTRACT. Learning to operate medical equipment is one of the essential skills for providing efficient treatment to patients. One of the current problems faced by many medical institutions is the lack or shortage of specialized infrastructure for medical practitioners to conduct hands-on training. Medical equipment is mostly used for patients, limiting training time drastically. Virtual simulation can help alleviate this problem by providing the virtual embodiment of the medical facility in an affordable manner. This paper reports the current results of an ongoing project aimed at providing virtual reality-based technical training on various medical equipment to radiophysicist trainees. In particular, we introduce a virtual reality (VR) prototype of a linear accelerator simulator for oncological radiotherapy training. The paper discusses the main challenges and features of the VR prototype, including the system design and implementation. A key factor for trainees' access and usability is the user interface, particularly tailored in our prototype to provide a powerful and versatile yet friendly user interaction.
Image Features Correlation with the Impression Curve for Automatic Evaluation of the Computer Game Level Design
ABSTRACT. In this study, we confirm the existence of a correlation between image features and the computer game level Impression Curve. Even a single image feature can describe the impression value with good precision (significant strong relationship, Pearson r > 0.5). Best results were obtained by combining several image features using multiple regression (significant very strong positive relationship, Pearson r = 0.75 at best). We also analyze the different sets of image features at different level design stages (from blockout to final design), where significant correlation (strong to very strong) was observed regardless of the level design variant. Thanks to the study results, the user impression of virtual 3D space can be estimated with a high degree of certainty by automatic evaluation using image analysis.
A review of 3D point clouds parameterization methods
ABSTRACT. The parameterization of 3D point clouds is a very important research topic in the fields of computer graphics and computer vision, with many applications such as texturing, remeshing and morphing. Different from mesh parameterization, point cloud parameterization is in general a more challenging task, as there is normally no connectivity information between points. Due to this challenge, there are far fewer papers on point cloud parameterization than on mesh parameterization. To the best of our knowledge, there are no review papers about point cloud parameterization. In this paper, we present a survey of existing methods for parameterizing 3D point clouds. We start by introducing the applications and importance of point cloud parameterization before explaining some relevant concepts. According to the organization of the point clouds, we first divide point cloud parameterization methods into two groups: organized and unorganized ones. Since various methods for unorganized point cloud parameterization have been proposed, we further divide this group into subgroups based on the technique used for parameterization. The main ideas and properties of each method are discussed, aiming to provide an overview of the various methods and to help with the selection of methods for different applications.
Devulgarization of Polish Texts Using Pre-trained Language Models
ABSTRACT. In this paper we propose a text style transfer method for replacing vulgar expressions in Polish utterances with their non-vulgar equivalents while preserving the main characteristics of the text. We fine-tune three pre-trained language models (GPT-2, GPT-3 and T5) on a newly created parallel corpus of sentences containing vulgar expressions and their equivalents. Then we evaluate the resulting models, checking their style transfer accuracy, content preservation and language quality. To the best of our knowledge, the proposed solution is the first of its kind for Polish.
Classification and Generation of Derivational Morpho-semantic Relations for Polish Language
ABSTRACT. In this article, we take a new look at the automated analysis and recognition of morpho-semantic relations in Polish. We present a combination of two methods for the joint exploration of word-form information -- generating new forms (morphology) and classifying pairs of words in derivational relations (lexical semantics). As the method of generation, we used the Transformer architecture on the seq-2-seq task, operating on character strings. Classification is performed using a neural network and a vector representation based on the fastText method. In the classification process, we paid attention to the important elements in the classified words. Finally, we discuss the results obtained in the experiments.
A deep neural network as a TABU support in solving LABS problem
ABSTRACT. One of the leading approaches for solving various hard discrete problems is designing advanced solvers based on local search heuristics. This observation is also relevant to the low autocorrelation binary sequence (LABS) problem -- an open, hard optimisation problem that has many applications. There are many dedicated heuristics, such as the steepest-descent local search algorithm (SDLS), Tabu search or the xLostovka algorithm. This paper introduces a new concept of combining well-known solvers with neural networks that improve the solvers' parameters based on the local context. The contribution proposes an extension of Tabu search (one of the well-known optimisation heuristics) with an LSTM neural network to optimise the number of iterations for which particular bits are blocked. Regarding the presented results, it should be concluded that the proposed approach is a very promising direction for developing highly efficient heuristics for the LABS problem.
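A minimal sketch of the LABS objective and one steepest-descent move; all identifiers are illustrative. In Tabu search, recently flipped bits would additionally be blocked for a number of iterations, which is the quantity the proposed LSTM extension learns to set.

import numpy as np

def labs_energy(s):
    # sum of squared aperiodic autocorrelations of a +/-1 sequence
    n = len(s)
    return sum(int(np.dot(s[:n - k], s[k:])) ** 2 for k in range(1, n))

def flip(s, i):
    t = s.copy()
    t[i] = -t[i]
    return t

rng = np.random.default_rng(0)
s = rng.choice([-1, 1], size=32)
# one steepest-descent move: flip the bit that lowers the energy most
best = min(range(len(s)), key=lambda i: labs_energy(flip(s, i)))
print("E =", labs_energy(s), "-> E =", labs_energy(flip(s, best)),
      "after flipping bit", best)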
On the Explanation of AI-based Student Success Prediction
ABSTRACT. Student success prediction is one of the many applications of artificial intelligence (AI) which helps educators identify the students requiring tailored support. The intelligent algorithms used for this task consider various factors to make accurate decisions. However, the decisions produced by these models often become ineffective due to a lack of explainability and trust. To fill this gap, this paper employs several machine learning models on a real-world dataset to predict students' learning outcomes from their social media usage. By leveraging SHapley Additive exPlanations (SHAP), we conduct a critical analysis of the model outcomes. We found that several sensitive features were considered important by these models, which can lead to questions of trust and fairness regarding the use of such features. Our findings were further evaluated by a real-world user study.
Transfer Learning based Natural Scene Classification for Scene Understanding by Intelligent Machines
ABSTRACT. Scene classification plays an important role in the currently emerging field of automation. Traditional classification methods suffer from tedious processing techniques, while the advent of CNNs and deep learning models has greatly accelerated the job of scene classification. In our paper we consider an area of application where deep learning can assist in civil and military applications and aid in navigation. Current image classification efforts concentrate on the various available labeled image datasets. This work concentrates on the classification of scenes that contain pictures of people and places affected by floods, with the aim of assisting rescue officials in the event of natural calamities, disasters, military attacks, etc. The proposed work describes a classification system that can categorize a small scene dataset using a transfer learning approach. We collected pictures of scenes from websites and created a small dataset with different flood-affected activities. We utilized the ResNet transfer learning model in our proposed work, which showed an accuracy of 88.88% for ResNet50 and 91.04% for ResNet101, and provides a fast and economical solution for the application involved.
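A minimal PyTorch sketch of the transfer-learning recipe described: freeze an ImageNet-pretrained ResNet50 backbone and retrain only a new classification head; the two-class setup and hyperparameters are illustrative assumptions.

import torch
import torch.nn as nn
from torchvision import models

# Load pretrained weights (downloaded on first use) and freeze the backbone
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)   # new head: flood / no flood

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy training step on a stand-in batch
x = torch.randn(4, 3, 224, 224)
loss = criterion(model(x), torch.tensor([0, 1, 1, 0]))
loss.backward()
optimizer.step()
print(float(loss))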
A Survey on Sustainable Software Ecosystems to Support Experimental and Observational Science at Oak Ridge National Laboratory
ABSTRACT. In the search for a sustainable approach for software ecosystems that supports experimental and observational science (EOS) across Oak Ridge National Laboratory (ORNL), we conducted a survey to understand the current and future landscape of EOS software and data. In this paper, we describe the survey design we used to identify significant areas of interest, gaps and potential opportunities, followed by a discussion on the obtained responses. The survey formulates questions about project demographics, technical approach, and skills required for the present and the next five years. The study was conducted among 38 ORNL participants between June and July of 2021 and followed the required guidelines for human subjects training. We plan to use the collected information to help guide a vision for sustainable, community-based, and reusable scientific software ecosystems that need to adapt effectively to: i) the evolving landscape of heterogeneous hardware in the next generation of instruments and computing (e.g. edge, distributed, accelerators), and ii) data management requirements for data-driven science using artificial intelligence.
Software Architecture for Highly Scalable Urban Traffic Simulation
ABSTRACT. Parallel computing is currently the only feasible way to provide sufficient performance for large scale urban traffic simulations. The need to represent large areas with detailed, continuous space and motion models exceeds the capabilities of a single computer in terms of performance and memory capacity. Efficient distribution of such computation, which is considered in this paper, poses a challenge due to the need for repetitive synchronization of the common simulation model. We propose an architecture for efficient memory and communication management, which allows executing simulations of huge urban areas and efficient utilization of hundreds of computing nodes. In addition to analyzing performance tests, we also provide general guidelines for designing large-scale distributed simulations.
Automated and Manual Testing in the Development of the Research Software RCE
ABSTRACT. Research software is often developed by individual researchers or small teams in parallel to their research work. The more people and research projects rely on the software in question, the more important it is that software updates implement new features correctly and do not introduce regressions. Thus, developers of research software must balance their limited resources between implementing new features and thoroughly testing any code changes.
We present the processes we use for developing the distributed integration framework RCE at DLR. These processes aim to strike a balance between automation and manual testing, reducing the testing overhead while addressing issues as early as possible. We furthermore briefly describe how these testing processes integrate with the surrounding development and release processes.
Towards Automated Application-Agnostic Verification for Simulations
ABSTRACT. Recent advances in computational simulations and the wider access to exascale machines have revolutionised our ability to study real-world phenomena. Despite its wide applicability and fundamental importance, the verification of computational simulations remains a challenging and under-explored area that will require new mathematical and software engineering approaches to be addressed. In this work, we report on our ongoing effort on specifying an application-agnostic verification methodology for simulations. We identify the main data gathering activities that must be incorporated in the simulation development to make it amenable to verification, and present a set of verification strategies that can be applied.
Learning I/O Variables from Scientific Software's User Manuals
ABSTRACT. Scientific software often involves many input and output variables. Identifying these variables is important for such software engineering tasks as metamorphic testing. To reduce the manual work, we report in this paper our investigation of machine learning algorithms for classifying variables from software's user manuals. We identify thirteen natural-language features and use them to develop a multi-layer solution where the first layer distinguishes variables from non-variables and the second layer classifies the variables into input and output types. Our experimental results on three scientific software systems show that a random forest and a feedforward neural network can best implement the first and second layers, respectively.
Digging Deeper Into the State of the Practice for Domain Specific Research Software
ABSTRACT. We have developed a methodology for assessing the state of the software development practice for a given research software domain. Our methodology prescribes the following steps: i) Identify the domain; ii) Identify a list of candidate software packages; iii) Filter the list to a length of about 30 packages; iv) Collect repository related data on each software package, like number of stars, number of open issues, number of lines of code; v) Fill in the measurement template (the template consists of 108 questions to assess 9 qualities, including installability, usability and visibility); vi) Rank the software using the Analytic Hierarchy Process (AHP); vii) Interview developers (the interview consists of 20 questions and takes about an hour); and, viii) Conduct a domain analysis. The collected data is analyzed by: i) comparing the ranking by best practices against the ranking by popularity; ii) comparing artifacts, tools and processes to current research software development guidelines; and, iii) exploring pain points. We estimate the time to complete an assessment for a given domain at 173 person hours. The method is illustrated via the example of Lattice Boltzmann Solvers, where we find that the top packages engage in most of the recommended best practices, but still show room for improvement with respect to providing API documentation, a roadmap, a code of conduct, a programming style guide, uninstall instructions and continuous integration.
Statistical prediction of extreme events from small datasets
ABSTRACT. We propose Echo State Networks (ESNs) to predict the statistics of extreme events in a turbulent flow.
We train the ESNs on small datasets that lack information about the extreme events. We assess whether the networks are able to extrapolate from the small, imperfect datasets and predict the heavy-tail statistics that describe the events. We find that the networks correctly predict the events and improve the statistics of the system with respect to the training data in almost all cases analysed.
This opens up new possibilities for the statistical prediction of extreme events in turbulence.
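A minimal numpy sketch of an echo state network: a fixed random reservoir driven by the input, with only a ridge-regression readout trained; the toy sine input stands in for the turbulent-flow data, and all sizes and values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
N, rho = 200, 0.9                           # reservoir size, spectral radius
W = rng.normal(size=(N, N))
W *= rho / np.abs(np.linalg.eigvals(W)).max()   # rescale for echo-state property
W_in = rng.uniform(-0.5, 0.5, size=N)

def run(u):
    # drive the fixed random reservoir with the input signal u
    x, states = np.zeros(N), []
    for u_t in u:
        x = np.tanh(W @ x + W_in * u_t)
        states.append(x)
    return np.array(states)

u = np.sin(np.linspace(0, 20, 500))         # toy input series
X = run(u[:-1])
# ridge-regression readout trained to predict the next input value
beta = 1e-6
W_out = np.linalg.solve(X.T @ X + beta * np.eye(N), X.T @ u[1:])
print("train MSE:", float(np.mean((X @ W_out - u[1:]) ** 2)))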
Outlier detection for categorical data using clustering algorithms
ABSTRACT. Detecting outliers is a widely studied problem in many disciplines,
including statistics, data mining and machine learning. All anomaly detection
activities are aimed at identifying cases of unusual behavior when compared to
the remaining set. There are many methods to deal with this issue, which are
applicable depending on the size of the dataset, the way it is stored and the
type of attributes and their values. Most of them focus on traditional datasets
with a large number of quantitative attributes. While there are many solutions
available for quantitative data, it remains problematic to find efficient methods for
qualitative data. The main idea behind this article was to compare categorical data
clustering algorithms: K-modes and ROCK. In the course of the research, the
authors analyzed the clusters detected by the indicated algorithms, using several
datasets different in terms of the number of objects and variables, and conducted
experiments on the parameters of the algorithms. The presented study has made it
possible to check whether the algorithms detect the same outliers in the data and
how much they depend on individual parameters such as the number of variables,
tuples and categories of a qualitative variable.
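A minimal sketch of K-modes-based outlier flagging using the kmodes library; the toy data, distance threshold, and small-cluster rule are illustrative assumptions rather than the authors' protocol, and the ROCK comparison is omitted.

import numpy as np
from kmodes.kmodes import KModes   # pip install kmodes

# Toy categorical dataset: the last record deviates in all attributes
X = np.array([
    ["red", "small", "round"], ["red", "small", "round"],
    ["red", "small", "oval"],  ["blue", "big", "round"],
    ["blue", "big", "round"],  ["green", "huge", "cubic"],
])

km = KModes(n_clusters=2, init="Huang", n_init=5, random_state=0)
labels = km.fit_predict(X)

# Flag records far (simple matching dissimilarity) from their cluster
# mode, or members of singleton clusters, as outlier candidates
for row, lab in zip(X, labels):
    dist = np.sum(row != km.cluster_centroids_[lab])
    size = np.sum(labels == lab)
    if dist >= 2 or size == 1:
        print("outlier candidate:", row)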
Machine Learning-based Scheduling and Resources Allocation in Distributed Computing
ABSTRACT. In this work we study a promising approach for efficient online scheduling of job-flows in high performance and distributed parallel computing. The majority of job-flow optimization approaches, including backfilling and microscheduling, require a priori knowledge of the full job queue to make optimization decisions. In the more general scenario, when user jobs are submitted individually, resource selection and allocation should be performed immediately, in online mode. In this work we consider a neural network prototype model trained to make online optimization decisions based on a known optimal solution. For this purpose, we designed the MLAK algorithm, which solves the 0-1 knapsack problem with an a priori unknown utility function. In dedicated simulation experiments with different utility functions, MLAK provides resource selection efficiency comparable to a classical greedy algorithm.
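For reference, the classic dynamic program for the 0-1 knapsack problem that MLAK builds on; the job utilities and capacity below are illustrative numbers, and in MLAK the utility values would come from an a priori unknown function that the neural network learns to approximate.

def knapsack01(values, weights, capacity):
    """Classic 0-1 knapsack dynamic program (max value within capacity)."""
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):   # reverse: each item used once
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

# jobs' utilities vs. the resource slots they occupy (illustrative numbers)
print(knapsack01(values=[10, 7, 4, 9], weights=[3, 2, 1, 4], capacity=6))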
A GPU-based algorithm for environmental data filtering
ABSTRACT. Nowadays, the Machine Learning (ML) approach is essential to many research fields. Among these is Environmental Science (ES), which involves a large amount of data to be collected and processed. On the other hand, in order to provide a reliable output, those data must be assimilated. Since this process requires a long execution time when the input dataset is very large, here we propose a parallel GPU algorithm, based on a curve fitting method, to filter the starting dataset by exploiting the computational power of the CUDA tool. Our experiments show the achieved results in terms of performance.
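A minimal serial sketch of curve-fitting-based filtering of the kind described; the polynomial degree, residual threshold, and synthetic data are assumptions, and the paper's actual contribution, the CUDA parallelisation, is not shown here.

import numpy as np

def polyfit_filter(t, y, degree=3, k=2.5):
    """Fit a polynomial trend and drop points further than k standard
    deviations of the residuals from it."""
    coeffs = np.polyfit(t, y, degree)
    residuals = y - np.polyval(coeffs, t)
    keep = np.abs(residuals) < k * residuals.std()
    return t[keep], y[keep]

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 1000)
y = 0.5 * t ** 2 + rng.normal(scale=1.0, size=t.size)
y[::100] += 15                       # inject spurious sensor spikes
tf, yf = polyfit_filter(t, y)
print(f"kept {yf.size} of {y.size} samples")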
Cloud as a Platform for W-Machine Didactic Computer Simulator
ABSTRACT. Effective teaching of how computers work is essential for future computer engineers and requires fairly simple computer simulators used in regular students' education.
This article presents an evaluation of alternative cloud computing architectures (platform-based and serverless) as a working platform for a simple didactic computer simulator called W-Machine. The model of this didactic computer is presented at the microarchitecture level, emphasizing the design of the control unit of the computer. The W-Machine computer simulator allows students to create both new instructions and whole assembly language programs.
Extensible, cloud-based open environment for remote classes reuse
ABSTRACT. The purpose of the presented work was to ease the reuse of whole remote classes, including the workflow involving multiple tools used during the class. From our perspective, such functionality is especially useful for introducing new topics into curricula, which happens quite often in laboratory-based courses in the Information Technologies area.
The proposed approach allows professors to shorten the time it takes to create remote collaboration environments designed and prepared by their peers, by integrating multiple tools in topic-oriented educational collaboration environments. To achieve that, we split the system, codenamed 'Sozisel', into two parts - one responsible for facilitating the creation of educational environment templates (models) describing the educational event workflows composed of multiple tools, the other responsible for executing the actual events.
For the environment to be extensible, we decided to use only open, standards-based technologies. That stands in contrast with most commercially available environments, which are frequently based on proprietary technologies. The functional evaluation carried out on the Sozisel prototype proved that the system does not require significant resources to be used in classes of medium size.
ABSTRACT. For over fifty years we have worked to improve the teaching of computer science and coding. Teaching computational science extends these challenges, as students may be less inclined towards coding, given they have chosen a different discipline which may have only recently become computational. Introductory coding education could be considered a checklist of skills; however, that does not prepare students for tackling innovative projects. To apply coding to a domain, students need to take their skills and venture into the unknown, persevering through various levels of errors and misunderstanding. In this paper we reflect on programming assignments in Curtin Computing's Fundamentals of Programming unit. In the recent Summer School, students were challenged to simulate the generation and movement of snowflakes, experiencing frustration and elation as they achieved varying levels of success in the assignment. Although these assignments are resource-intensive in design, student effort and assessment, we see them as the most effective way to prepare students for future computational science projects.
Computational Science 101 - Towards a Computationally Informed Citizenry
ABSTRACT. This article gives an overview of CSCI 1280, an introductory course in computational science being developed at the University of Nebraska at Omaha. The course is intended for all students, regardless of major, and is delivered in a fully asynchronous format that makes extensive use of ed tech and virtual technologies.
How to sort them? A network for LEGO bricks classification
ABSTRACT. LEGO bricks are highly popular due to the ability to build almost any type of creation. This is possible thanks to the availability of multiple shapes and colors of bricks. For a smooth build process, the bricks need to be properly sorted and arranged. In our work we aim at creating an automated LEGO brick sorter. With over 3700 different LEGO parts, brick classification has to be done with deep neural networks. The question arises: which of the available models should we use? In this paper we try to answer this question. The paper presents a comparison of 28 models used for image classification, trained to classify objects into a high number of classes with a potentially high level of similarity. For that purpose, a dataset consisting of 447 classes was prepared. The paper presents a brief description of the analyzed models and the training and comparison process, and discusses the results obtained. Finally, the paper proposes an answer as to which network architecture should be used for the problem of LEGO brick classification and other similar problems.
ACCirO: A System for Analyzing and Digitizing Images of Charts with Circular Objects
ABSTRACT. There is recent interest in improving the accessibility of data representation in the form of charts in digital media. This necessitates the automated interpretation of images of charts available in documents and on the internet. One approach to automating chart image interpretation involves decoding the data represented through graphical objects in the charts, e.g., pie segments and scatter points, along with semantic information obtained from the textual content of the chart image. We focus on scatter plots and pie charts, as they are amongst the most commonly used charts in marketing research and data analysis workflows, and they commonly contain circular objects. Thus, we propose a chart interpretation system, ACCirO (Analyzer of Charts with Circular Objects), that exploits the color and geometry of circular objects in scatter plots, their variants, and pie charts to extract the chart data from chart images. We use deep learning-based OCR approaches for text recognition to add semantics. We generate appropriate templatized sentence structures using the extracted data table for text summarization of the charts. Text summarization improves the accessibility of these charts for visually challenged users. Overall, we use both image processing and deep learning approaches in our algorithm, which improves the accuracy compared to the state-of-the-art. Our qualitative and quantitative results show the effectiveness of the proposed algorithm.
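A minimal OpenCV sketch of detecting circular chart objects and reading the colour at their centres; the input file name and Hough transform parameters are illustrative assumptions, not ACCirO's actual pipeline, and thresholds would need tuning per chart style.

import cv2
import numpy as np

img = cv2.imread("chart.png")                  # hypothetical input image
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)

# Hough transform for circles (pie bodies, scatter dots)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=15,
                           param1=100, param2=30, minRadius=3, maxRadius=200)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        b, g, rc = img[y, x]                   # colour at circle centre
        print(f"circle at ({x},{y}) r={r} colour BGR=({b},{g},{rc})")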
Comparing explanations from glass-box and black-box machine-learning models
ABSTRACT. Explainable Artificial Intelligence (XAI) aims at introducing transparency and intelligibility into the decision-making process of AI systems. In recent years, most efforts were made to build XAI algorithms able to explain black-box models. However, in many cases, including medical and industrial applications, the explanation of a decision may be worth as much as or even more than the decision itself. This raises a question about the quality of explanations. In this work, we investigate how the explanations derived from black-box models combined with XAI algorithms differ from those obtained from inherently interpretable glass-box models. We also aim at answering the question of whether there are justified cases for using less accurate glass-box models instead of complex black-box approaches. We perform our study on publicly available datasets.