ICCS 2021: INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE
PROGRAM FOR WEDNESDAY, JUNE 16TH

09:20-10:10 Session 2: Keynote Lecture 1
09:20
Does In Silico Medicine need Big Science?

ABSTRACT. The term Big Science is used to indicate the transformation of some research fields after the Second World War, characterised by the creation of very large research groups and infrastructures. Historically, bioengineering in general and computational biomedicine in particular have been characterised by small research groups working on very narrowly defined problems; the opposite of big science. But the last 12 years have seen the emergence of three very large institutes mostly focused on In Silico Medicine: the Auckland Bioengineering Institute, the Insigneo Institute, and the Sano Centre. The research team established by Prof Peter Hunter became a Large Scale Research Institute of the University of Auckland (NZ) under the name of the Auckland Bioengineering Institute (ABI) in 2008. The Insigneo Institute at the University of Sheffield (UK) was established in 2012; the Sano Centre for Computational Medicine was established in Krakow (PL) in 2019. In this presentation, we analyse the motivations behind these endeavours, framing the analysis in the context of the barriers that are slowing the widespread adoption of In Silico Medicine methods in clinical and industrial practice.

09:45
Towards Personalised Computational Medicine – Sano Centre Perspective

ABSTRACT. Patients differ in many aspects, and these differences are amplified by the complexity of disease development. To provide more personalised treatment, human physiology models are becoming increasingly complex, and the amount of data is too large to be processed in a traditional way. Therefore, advanced methods of modelling and data analysis are necessary. This means that today's medicine is increasingly entering a field similar to engineering, and we observe the development of computational medicine, which adopts advanced computational technologies and data systems.

Since 2000, the DICE Team (http://dice.cyfronet.pl/) has been involved in research focused on the development of problem-solving environments and decision support systems for medicine on top of distributed computing infrastructures. This research is financed mainly by projects of the European Commission, and the main partners are the University of Sheffield, the University of Amsterdam, the Juelich Supercomputing Centre, and LMU and the Leibniz Supercomputing Centre in Munich.

As a result of this scientific collaboration, a new scientific unit, the Sano Centre for Computational Personalized Medicine - International Research Foundation, was established in 2019 in Krakow (https://sano.science/).

Six Sano research teams cover such in-silico medicine areas as modelling and simulation, data science, artificial intelligence and machine learning methods, image processing, IT methods in medicine, large-scale computing, and decision-making support systems. Researchers at Sano use the computing and storage resources of PL-Grid, among others the Prometheus supercomputer at Cyfronet AGH. This research will result in the development of tools supporting doctors in diagnostic and treatment processes. It is extremely valuable from the point of view of an individual patient and will reduce the costs of treatment. Modern computer technologies developed at Sano may also be used in pharmaceutical and biotechnology laboratories.

Acknowledgements. The Sano Centre is financed by the European Union's Horizon 2020 Teaming programme (grant 857533, 15 M€), the International Research Agendas Programme of the Foundation for Polish Science (grant MAB PLUS/2019/13), co-funded by the European Union under the European Regional Development Fund (10 M€), and the Polish Ministry of Education and Science (after 2023, 5 M€).

10:10-10:40 Coffee Break
10:40-12:20 Session 3A: MT 1
10:40
Smoothing Speed Variability in Age-Friendly Urban Traffic Management

ABSTRACT. Traffic congestion has a negative impact on vehicular mobility, especially for senior drivers. Current approaches to urban traffic management focus on adaptive routing for the reduction of fuel consumption and travel time. Most of these approaches do not consider age-friendliness, in particular the fact that speed variability is difficult for senior drivers. Frequent stop-and-go situations around congested areas are tiresome for senior drivers and make them prone to accidents. Moreover, senior drivers' mobility is affected by factors such as travel time, surrounding vehicles' speed, and hectic traffic. Age-friendly traffic management requires a multi-criteria solution, where fuel consumption and travel time are considered together with speed variability. This paper introduces a multi-agent pheromone-based vehicle routing algorithm that smooths speed variability while also considering older drivers during traffic light control. Simulation results demonstrate a 17.6% improvement in speed variability, as well as reductions in travel time and fuel consumption of 11.6% and 19.8%, respectively, compared to the state of the art.
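For reference, pheromone-based routing schemes of this class typically maintain per-road-segment pheromone values updated by an evaporation-and-deposit rule of the form $\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \Delta\tau_{ij}(t)$, where $\rho$ is the evaporation rate and $\Delta\tau_{ij}$ aggregates the deposits of vehicles traversing segment $(i,j)$; the exact update used in the paper, and the way it feeds routing and traffic-light decisions, is not given in the abstract.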

11:00
An innovative employment of NetLogo AIDS model in developing a new chain coding mechanism for compression

ABSTRACT. In this paper, we utilize the NetLogo HIV model to construct an environment for bi-level image encoding and employ it in compression. Our model converts an image into a virtual environment that consists of female agents testing positive and negative for HIV. Female agents are scattered according to the allocation of the pixels in the original images to be tested. The simulation introduces male agents that test positive for HIV, with the purpose of tracking their movements while they infect HIV-negative female agents. The movements of the HIV+ male agents within the simulation take advantage of the relative encoding approach previously utilized by other agent-based models. That is to say, the simulation generates a high proportion of similar movement forms that are encoded in the same way regardless of the movements of agents. This is followed by applying Huffman coding to the obtained chains of movement strings for further reduction. The final results reveal that our approach outperforms existing benchmarks on all the images employed in testing.
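For illustration of the final compression stage only (the agent-based movement encoding itself is specific to the paper), a minimal Huffman coder over a chain of movement symbols could be sketched as follows; the symbol alphabet and the example chain are hypothetical.

    import heapq
    from collections import Counter

    def huffman_codes(chain):
        """Build a Huffman code table for the symbols of a movement chain."""
        freq = Counter(chain)
        # Heap entries: [frequency, tie-breaker, [(symbol, code), ...]]
        heap = [[f, i, [(sym, "")]] for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        if len(heap) == 1:                      # degenerate single-symbol chain
            return {heap[0][2][0][0]: "0"}
        while len(heap) > 1:
            lo = heapq.heappop(heap)
            hi = heapq.heappop(heap)
            lo[2] = [(s, "0" + c) for s, c in lo[2]]
            hi[2] = [(s, "1" + c) for s, c in hi[2]]
            heapq.heappush(heap, [lo[0] + hi[0], lo[1], lo[2] + hi[2]])
        return dict(heap[0][2])

    chain = "RRDRDDLLUR"                        # hypothetical chain of movement symbols
    codes = huffman_codes(chain)
    encoded = "".join(codes[s] for s in chain)  # compressed bit string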

11:20
Simulation modeling of epidemic risk in supermarkets: Investigating the impact of social distancing and checkout zone design

ABSTRACT. We build an agent-based model for evaluating the effectiveness of safety regulations related to distancing that were introduced in supermarkets after the COVID-19 outbreak. The model is implemented in the NetLogo simulation platform and calibrated to actual point-of-sale data from one of the major European retail chains. It enables realistic modeling of the checkout operations as well as of the airborne diffusion of SARS-CoV-2 particles. We find that opening checkouts in a specific order can reduce epidemic risk, but only under low and moderate traffic conditions. Moreover, scenarios where only every second checkout can be opened are suboptimal, as they significantly increase the time spent in the queue. Hence, redesigning supermarket layouts to increase distances between the queues can reduce epidemic risk only if the number of open checkouts is sufficient to serve customers during peak hours.

11:40
A multi-cell cellular automata model of traffic flow with emergency vehicles: effect of a corridor of life

ABSTRACT. The intensity of traffic flow is constantly increasing, which causes significant difficulties in the movement of emergency vehicles through the city. There are various macroscopic and microscopic road traffic models that allow traffic flow analysis. However, it should be emphasized that standard traffic flow models do not include emergency vehicle traffic. We propose a multi-agent microscopic model for analyzing the traffic flow of emergency vehicles, with constraints on the distance between vehicles and their proper distribution ("corridor of life") so as to leave free passage for a privileged vehicle. Real data were used to calibrate and validate the model. Our simulation studies show the importance of certain aspects of road traffic (distance between vehicles, the "corridor of life", size and type of roadside, frictional conflict, etc.) in increasing or decreasing the traffic flow when an emergency vehicle is approaching.

12:00
HSLF: HTTP Header Sequence based LSH fingerprints for Application Traffic Classification

ABSTRACT. Distinguishing among the proliferating network applications is a challenging task in network management that has been extensively studied for many years. Unfortunately, previous work on HTTP traffic classification relies heavily on coarse-grained prior knowledge and is therefore limited in detecting the evolution of newly emerging applications and network behaviors. In this paper, we propose HSLF, a hierarchical system that employs application fingerprints to classify HTTP traffic. Specifically, we employ a locality-sensitive hashing algorithm to obtain the importance of each field in the HTTP header, from which a rational weight allocation scheme and a fingerprint of each HTTP session are generated. Then, the similarity of fingerprints among applications is calculated to classify the unknown HTTP traffic. On a real-world dataset, HSLF achieves an accuracy of 96.6%, which outperforms classic machine learning methods and state-of-the-art models.

10:40-12:20 Session 3B: MT 2
10:40
Music genre classification: looking for the Perfect Network

ABSTRACT. This paper presents research on music genre recognition. It is a crucial task because there are millions of songs in online databases, and classifying them manually is impossible or extremely expensive. As a result, it is natural to create methods that can assign a given track to a music genre. Here, the classification of music tracks is carried out by deep learning models. The Free Music Archive dataset was used to perform the experiments. The tests were executed with the usage of a Convolutional Neural Network, Convolutional Recurrent Neural Networks with 1D and 2D convolutions, and a Recurrent Neural Network with Long Short-Term Memory cells. In order to combine the advantages of different deep neural network architectures, a few types of ensembles were proposed with two types of result-mixing methods. The best results obtained in this paper, which are equal to state-of-the-art methods, were achieved by one of the proposed ensembles. The solution described in the paper can help to make the auto-tagging of songs much faster and more accurate in the context of assigning them to particular musical genres.

11:00
Big Data for National Security in the Era of COVID-19

ABSTRACT. The COVID-19 epidemic has changed the world dramatically as societies adjust their behaviour to meet the challenges and uncertainties of the new normal. These uncertainties have led to instabilities in several facets of society, most notably health, the economy and public order. Increasing discontent within societies in response to government-mandated measures to contain the pandemic has triggered social unrest, imposing serious threats to national security. Big Data Analytics can provide a powerful force multiplier to support policy and decision makers in containing the virus while at the same time dealing with such threats to national security. This paper presents the utilisation of a big data forecasting and analytics framework to deal with COVID-19-triggered social unrest. The framework is applied and demonstrated in two different disruptive incidents in the United States of America.

11:20
Efficient prediction of spatio-temporal events on the example of the availability of vehicles rented per minute

ABSTRACT. This article presents a solution to the problem of predicting the availability of vehicles rented per minute in a city. A grid-based spatial model using an LSTM network augmented with a time-distributed layer was developed and tested against an actual vehicle availability dataset. The dataset was also made publicly available for researchers as a part of this study. The predictive model developed in the study is used in a multi-modal trip planner.

11:40
Grouped Multi-Layer Echo State Networks with Self-Normalizing Activations

ABSTRACT. We study the prediction performance and memory capacity of Echo State Networks with multiple reservoirs built based on stacking and grouping. Grouping allows independent subreservoir dynamics to develop, which improves linear separability at the readout layer. At the same time, stacking makes it possible to capture multiple time scales of an input signal through the hierarchy of non-linear mappings. Combining these two effects, together with a proper selection of model hyperparameters, can boost ESN capabilities for benchmark time series such as the Mackey-Glass system. Different strategies for determining the subreservoir structure are compared, along with the influence of the activation function. In particular, we show that the recently proposed non-linear self-normalizing activation function, together with grouped deep reservoirs, provides superior prediction performance on artificial and real-world datasets. Moreover, compared to standard hyperbolic tangent models, the new models built using the self-normalizing activation function are easier to handle in terms of hyperparameter selection. As a software contribution, we prepared and published the AutoESN library, which enables creating and testing grouped multi-layer ESNs within the PyTorch environment.
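For context, each (sub)reservoir in such architectures evolves according to the standard leaky echo state update $\mathbf{x}(t+1) = (1-\alpha)\,\mathbf{x}(t) + \alpha\, f\big(\mathbf{W}_{in}\mathbf{u}(t+1) + \mathbf{W}\mathbf{x}(t)\big)$, where $\mathbf{u}$ is the input, $\mathbf{x}$ the reservoir state, $\alpha$ the leaking rate, and only a linear readout on the collected states is trained; the grouped and stacked variants compose several such reservoirs, and the self-normalizing activation studied in the paper replaces the usual $f=\tanh$ (notation ours, for illustration).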

10:40-12:20 Session 3C: AIHPC4AS 1
10:40
Outlier removal for isogeometric spectral approximation with the optimally-blended quadratures

ABSTRACT. It is well-known that outliers appear in the high-frequency region in the approximate spectrum of isogeometric analysis of the second-order elliptic operator. Recently, the outliers have been eliminated by a boundary penalty technique. The essential idea is to impose extra conditions arising from the differential equation at the domain boundary. In this paper, we extend the idea to remove outliers in the superconvergent approximate spectrum of isogeometric analysis with optimally-blended quadrature rules. We show numerically that the eigenvalue errors are of superconvergence rate $h^{2p+2}$ and the overall spectrum is outlier-free. The condition number and stiffness of the resulting algebraic system are reduced significantly. Various numerical examples demonstrate the performance of the proposed method.

11:00
Deep learning model of logging-while-drilling electromagnetic measurements

ABSTRACT. To improve the placement of a hydrocarbon well in the target formation, it is common to intentionally change the well's trajectory (geosteer) in response to measurements acquired in real time. The most reliable source of information remains the so-called extra- or ultra-deep electromagnetic (EM) measurements, which are sensitive to rock properties and formation boundaries at distances of over one hundred feet. Reliable geosteering in a hydrocarbon reservoir requires real-time interpretation of data acquired in the field, which in turn requires thousands of evaluations of the forward EM model. We propose to use a deep neural network (DNN) for forward modelling of the full suite of EM measurements transmitted during drilling, to satisfy the requirements of real-time inversion.

The main contribution of this work is to assess the performance and accuracy of the proposed machine learning method when applied to typical realistic operations. For that, we use published geometry and geology corresponding to a section of the Goliat field in the Barents Sea. Using our experiments, we show the influence of different built-for-purpose training datasets on the training time and on the performance of the DNN model in the field case scenario. This is essential, since the ability of DNNs to extrapolate beyond the training data is limited.

11:20
Socio-cognitive Evolution Strategies

ABSTRACT. Socio-cognitive computing is a paradigm developed over the last several years; it consists in introducing into metaheuristics mechanisms inspired by inter-individual learning and cognition. It was successfully applied in hybridizing the ACO and PSO metaheuristics. In this paper, we follow our previous experience in order to hybridize the acclaimed evolution strategies. The newly constructed hybrids were applied to popular benchmarks and compared with their reference versions.

11:40
AI-accelerated CFD simulation based on OpenFOAM and CPU/GPU computing

ABSTRACT. In this paper, we propose a method for accelerating CFD (computational fluid dynamics) simulations by integrating a conventional CFD solver with our AI module. The investigated phenomenon is chemical mixing. The considered CFD simulations belong to the group of steady-state simulations and utilize the MixIT tool, which is based on the OpenFOAM toolbox. The proposed module is implemented as a CNN (convolutional neural network) supervised learning algorithm. Our method distributes the data by creating a separate AI sub-model for each quantity of the simulated phenomenon. These sub-models can then be pipelined during the inference stage to reduce the execution time, or called one by one to reduce memory requirements.

We examine the performance of the proposed method depending on whether the CPU or GPU platform is used. For test experiments with varying conditions of the simulated quantities, we achieve time-to-solution reductions of around a factor of 10. A comparison of the simulation results based on the histogram equalization method shows an average accuracy across all the quantities of around 92%.

10:40-12:20 Session 3D: BBC 1
10:40
Controlling costs in feature selection: information theoretic approach

ABSTRACT. Feature selection in supervised classification is a crucial step in many biomedical tasks. Most existing approaches assume that all features have the same cost. However, in many medical applications, this assumption may be inappropriate, as the acquisition of the values of some features can be costly. For example, in medical diagnosis, each diagnostic value extracted by a clinical test is associated with its own cost. The costs can also refer to non-financial aspects, for example the decision between invasive exploratory surgery and a simple blood test. In such cases, the goal is to select a subset of features associated with the class variable (e.g. occurrence of disease) within an assumed user-specified budget. We consider a general information-theoretic framework that allows controlling the costs of features. The proposed criterion consists of two components: the first one describes feature relevance and the second one is a penalty for its cost. We introduce a cost factor which controls the trade-off between these two components. We propose a procedure in which the optimal value of the cost factor is chosen in a data-driven way. The experiments on artificial and real medical datasets indicate that, when the budget is limited, the proposed approach is superior to existing traditional feature selection methods. The proposed framework has been implemented in an open-source Python library available at https://github.com/kaketo/bcselector
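Schematically, criteria of this two-component form can be written as $J_\lambda(X_k) = \mathrm{Rel}(X_k \mid S, Y) - \lambda\, c(X_k)$, where $\mathrm{Rel}(X_k \mid S, Y)$ is an information-theoretic relevance of the candidate feature $X_k$ to the class $Y$ given the already selected set $S$, $c(X_k)$ is its acquisition cost, and $\lambda$ is the cost factor chosen in a data-driven way; features are then added greedily while the total cost remains within the budget (notation ours, for illustration; the paper's exact relevance term is defined in the full text).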

11:00
How fast vaccination can control the COVID-19 pandemic in Brazil?

ABSTRACT. The first case of Corona Virus Disease (COVID-19) was registered in Wuhan, China, in November 2019. In March 2020, the World Health Organization (WHO) declared COVID-19 a global pandemic. The effects of this pandemic have been devastating worldwide, especially in Brazil, which occupies the third position in the absolute number of COVID-19 cases and the second position in the absolute number of deaths caused by the virus. A big question that the population yearns to have answered is: when can life return to normal? To address this question, this work proposes an extension of a SIRD-based mathematical model that includes vaccination effects. The model takes into account different daily vaccination rates and different values of vaccine effectiveness. The results show that, although the discussion centres largely on the effectiveness of the vaccine, the daily vaccination rate is the most important variable for mitigating the pandemic. Vaccination rates of 1 million doses per day can potentially stop the progression of the COVID-19 epidemic in Brazil in less than one year.
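As an illustration of this class of models (the paper's exact extension and parameter values are not reproduced here), a SIRD model with a constant daily vaccination rate $v$ and vaccine effectiveness $\epsilon$ can be written as $dS/dt = -\beta S I/N - \epsilon v$, $dI/dt = \beta S I/N - (\gamma + \mu) I$, $dR/dt = \gamma I + \epsilon v$, $dD/dt = \mu I$, where $\beta$, $\gamma$ and $\mu$ are the transmission, recovery and death rates, and effectively vaccinated susceptibles move directly to the removed compartment.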

11:20
Uncertainty Quantification of Tissue Damage Due to Blood Velocity in Hyperthermia Cancer Treatments

ABSTRACT. In 2020, cancer was responsible for almost 10 million deaths around the world. There are many types of treatments to fight against it, such as chemotherapy, radiation therapy, immunotherapy and stem cell transplants. Hyperthermia is a new treatment that is under study in clinical trials. The idea is to raise the tumour temperature enough to induce necrosis. Since this is a new treatment, some questions remain open, which can be answered using computational models. A bioheat porous media model has been used to simulate this treatment, but each work adopts distinct values for some parameters, such as the blood velocity, which may impact the results obtained. Due to the uncertainties associated with the blood velocity parameter, in this paper uncertainty quantification is used to analyse its impact on the simulations. The results of the in silico experiments have shown that, considering the uncertainties present in blood velocity, it is possible to plan the hyperthermia treatment to ensure that the entire tumour site reaches the target temperature to kill it.

11:40
EEG-based Emotion Recognition – Evaluation Methodology Revisited

ABSTRACT. The challenge of EEG-based emotion recognition has inspired researchers for years. However, the lack of efficient technologies and methods for EEG signal analysis hindered the development of successful solutions in this domain. Recent advancements in deep convolutional neural networks (CNN), facilitating automatic signal feature extraction and classification, brought hope for more efficient problem solving. Unfortunately, the vague and subjective interpretation of emotional states limits the effective training of deep models, especially when binary classification is performed based on datasets with a non-bimodal distribution of emotional state ratings. In this work we revisit the methodology of emotion recognition, proposing to use regression instead of classification, along with appropriate result evaluation measures based on the mean absolute error (MAE) and mean squared error (MSE). The advantages of the proposed approach are clearly demonstrated on the example of the well-established and explored DEAP dataset.

12:00
Modeling the electromechanics of a single cardiac myocyte

ABSTRACT. The synchronous and proper contraction of cardiomyocytes is essential for the correct function of the whole heart. Computational models of a cardiac cell may span multiple cellular sub-components, scales, and physics. As a result, they are usually computationally expensive. This work proposes a low-cost model to simulate the cardiac myocyte's electromechanics. The modeling of the action potential and active force is performed via a system of six ordinary differential equations. The cardiac myocyte's deformation, which takes details of its geometry into account, is captured using a mass-spring system. The mathematical model is integrated in time using Verlet's method to obtain the position, velocity, and acceleration of each discretized point of the single cardiac myocyte. Our numerical results show that the obtained action potential, contraction, and deformation reproduce physiological data very well. Therefore, the low-cost mathematical model proposed here can be used as an essential tool for the correct characterization of cardiac electromechanics.
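For illustration of the time-integration step only (the force terms and geometry are specific to the paper's mass-spring model), a Verlet update of a single discretized point might look as follows; force_at is a hypothetical callback returning the net spring plus active force at a given position and time.

    import numpy as np

    def verlet_step(x_curr, x_prev, t, dt, mass, force_at):
        """One Verlet step: returns the new position, velocity and acceleration."""
        a = force_at(x_curr, t) / mass                 # acceleration from the net force
        x_next = 2.0 * x_curr - x_prev + a * dt**2     # position update
        v = (x_next - x_prev) / (2.0 * dt)             # centered velocity estimate
        return x_next, v, a

    # hypothetical usage for one point of the discretized myocyte
    dt, mass = 1e-3, 1.0
    x_prev = np.array([0.0, 0.0])
    x_curr = np.array([0.01, 0.0])
    force_at = lambda x, t: -10.0 * x                  # toy linear restoring force
    for step in range(1000):
        x_next, v, a = verlet_step(x_curr, x_prev, step * dt, dt, mass, force_at)
        x_prev, x_curr = x_curr, x_next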

10:40-12:20 Session 3E: COMS 1
10:40
Expedited Trust-Region-Based Design Closure of Antennas by Variable-Resolution EM Simulations

ABSTRACT. The observed growth in the complexity of modern antenna topologies has fostered the widespread employment of numerical optimization methods as the primary tools for the final adjustment of system parameters. This is mainly caused by the insufficiency of traditional design closure approaches, largely based on parameter sweeping. Reliable evaluation of complex antenna structures requires full-wave electromagnetic (EM) analysis. Yet, EM-driven parametric optimization is, more often than not, extremely costly, especially when a global search is involved, e.g., performed with population-based metaheuristic algorithms. Over the years, numerous methods of lowering these expenditures have been proposed. Among these, methods exploiting variable-fidelity simulations have gained a certain popularity. Still, such frameworks are predominantly restricted to two levels of fidelity, referred to as coarse and fine models. This paper introduces a reduced-cost trust-region gradient-based algorithm involving variable-resolution simulations, in which the fidelity of the EM analysis is selected from a continuous spectrum of admissible levels. The algorithm is launched with the coarsest discretization level of the antenna under design. As the optimization process converges, for reliability reasons, the model fidelity is increased to reach the highest level at the final stage. The proposed algorithm allows for a significant reduction of the computational cost (up to sixty percent with respect to the reference trust-region algorithm) without compromising the design quality, which is corroborated by thorough numerical experiments involving four broadband antenna structures.

11:00
Optimum Design of Tuned Mass Dampers for Adjacent Structures via Flower Pollination Algorithm

ABSTRACT. It is a well-known fact that tuned mass dampers (TMDs) are an effective system for structures subjected to earthquake excitations. TMDs can also be used as a protective system for adjacent structures that may pound against each other. With a suitable optimization methodology, it is possible to find an optimally tuned TMD that is effective in reducing the responses of a structure, with the additional protective feature of reducing the amount of required seismic gap between adjacent structures, by using an appropriate objective function. This function considers the displacement of the structures with respect to each other. As the optimization methodology, the flower pollination algorithm (FPA) is used to find the optimum parameters of the TMDs of both structures. The method was evaluated on two 10-story adjacent structures and the optimum results were compared with a harmony search (HS) based methodology.

11:20
On Fast Multi-Objective Optimization of Antenna Structures Using Pareto Front Triangulation and Inverse Surrogates

ABSTRACT. The design of contemporary antenna systems is a challenging endeavor, where conceptual developments and initial parametric studies, interleaved with topology evolution, are followed by a meticulous adjustment of the structure dimensions. The latter is necessary to boost the antenna performance as much as possible and often requires handling several, often conflicting objectives, pertinent to both the electrical and field properties of the structure. Unless the designer's priorities are already established, multi-objective optimization (MO) is the preferred way of yielding the most comprehensive information about the best available design trade-offs. Notwithstanding, MO of antennas has to be carried out at the level of full-wave electromagnetic (EM) simulation models, which poses serious difficulties due to the high computational costs of the process. Popular mitigation methods include surrogate-assisted procedures; however, rendering reliable metamodels is problematic in higher-dimensional parameter spaces. This paper proposes a simple yet efficient methodology for the multi-objective design of antenna structures, which is based on the sequential identification of Pareto-optimal points using inverse surrogates, and triangulation of the already acquired Pareto front representation. The two major benefits of the presented procedure are low computational complexity and uniformity of the produced Pareto set, as demonstrated using two microstrip structures, a wideband monopole and a planar quasi-Yagi. In both cases, ten-element Pareto sets are generated at the cost of only a few hundred EM analyses of the respective devices. At the same time, the savings over the state-of-the-art surrogate-based MO algorithm are as high as seventy percent.

11:40
Optimizations of a Generic Holographic Projection Model for GPU's

ABSTRACT. Holographic projections are volumetric projections that make use of the wave-like nature of light and may find use in applications such as volumetric displays, 3D printing, lithography and LIDAR. Modelling different types of holographic projectors is straightforward but challenging due to the large number of samples that are required. Although computing capabilities have improved, recent simulations still have to make trade-offs between accuracy, performance and level of generalization. Our research focuses on the development of optimizations that make optimal use of modern hardware, allowing larger and higher-quality simulations to be run. Two algorithms are proposed: (1) a brute-force algorithm that can reach 20% of the theoretical peak performance and achieves a speedup of 43 w.r.t. a previous GPU implementation, and (2) a Monte Carlo algorithm that is another order of magnitude faster but has lower accuracy. These implementations help researchers to develop and test new holographic devices.

12:00
Similarity and Conformity Graphs in Lighting Optimization and Assessment

ABSTRACT. Lighting affects everyday life in terms of safety, comfort and quality of life. On the other hand, it consumes significant amounts of energy. Thanks to the effect of scale, even a small per-unit improvement in power efficiency yields significant energy and cost savings. Unfortunately, planning a highly optimized lighting installation is a task of high complexity, due to the huge number of variants to be checked. In such circumstances it becomes necessary to use a formal model, applicable for automated bulk processing, which allows finding the best setup or estimating the resultant installation power in an acceptable time, i.e., in hours rather than days. This paper introduces such a formal model relying on the concepts of similarity and conformity graphs. Examples of their practical application in outdoor lighting planning are also presented. Applying those structures allows substantially reducing the processing time required for planning large-scale installations.

10:40-12:20 Session 3F: SOFTMAC 1
10:40
Novel finite element solvers for coupled Stokes-Darcy flow problems

ABSTRACT. In this talk, we present new finite element methods for resolving coupled Stokes-Darcy flow problems. The Darcy flow is discretized for the primal variable pressure using the novel weak Galerkin finite elements, for which the shape functions are defined separately in element interiors and on edges/faces. The discrete weak gradients of these shape functions, and later on the Darcy velocity, are established in (i) the Raviart-Thomas spaces for triangles or tetrahedra, (ii) the Arbogast-Correa space for quadrilaterals, and (iii) the Arbogast-Tao space for hexahedra. The Stokes flow is discretized using the Bernardi-Raugel elements (BR1,P0) for triangles or tetrahedra and (BR1,Q0) for quadrilaterals and hexahedra. These two types of discretizations are combined at an interface, where kinematic, normal stress, and Beavers-Joseph-Saffman (BJS) conditions are applied. Efficient implementation of these methods in Matlab (for 2-dim) and C++ (for 3-dim) will be discussed, along with a presentation of numerical experiments. A part of this talk is based on joint work with Simon Tavener at Colorado State University (USA) and Graham Harper and Tim Wildey (both at Sandia National Labs, USA).

11:00
Learning nonlinear upscaling model for nonlinear transport problems

ABSTRACT. We will present a new data-driven technique for building practical computational models for applications. Specifically, we will consider transport phenomena in heterogeneous media, which occur in many applications including groundwater hydrology, petroleum engineering, atmospheric sciences, filtration processes, membrane applications, and so on. For example, in porous media applications, the contaminant is transported by the velocity due to the flow patterns. In filtration applications, the dust particles are transported in the filter by the external flow. In many of these applications, the transport is due to the velocity field, which is highly heterogeneous. Our goal is to design an efficient data-driven computational upscaling model for the simulations of transport in heterogeneous media. The key element is to make use of measurement and simulation data to learn the macroscopic equations in the computational model by some deep learning techniques. We will present some numerical results. The research is partially supported by the Hong Kong RGC General Research Fund (Project numbers 14304719 and 14302018).

11:20
Staggered DG method for Darcy flows in fractured porous media on general meshes

ABSTRACT. Recently, polygonal finite element methods have received considerable attention. In this talk, we present and analyze a staggered discontinuous Galerkin method for Darcy flows in fractured porous media on fairly general meshes. A staggered discontinuous Galerkin method and a standard conforming finite element method with appropriate inclusion of interface conditions are exploited for the bulk region and the fracture, respectively. Our current analysis works on fairly general polygonal elements even in the presence of small edges. We prove the optimal convergence estimates in $L^2$ error for all the variables by exploiting the Ritz projection. Importantly, our error estimates are shown to be fully robust with respect to the heterogeneity and anisotropy of the permeability coefficients. Several numerical experiments including meshes with small edges and anisotropic meshes are carried out to confirm the theoretical findings. Finally, our method is applied in the framework of unfitted mesh.

11:40
Multi-phase compressible compositional simulations with phase equilibrium computation in the VTN specification

ABSTRACT. In this paper, we present a numerical solution of a multi-phase compressible Darcy flow of a multi-component mixture in a porous medium. The mathematical model consists of the mass conservation equation for each component, the extended Darcy's law for each phase, and an appropriate set of initial and boundary conditions. The phase split is computed using the phase equilibrium computation in the $VTN$-specification (known as VTN-flash). The transport equations are solved numerically using the mixed-hybrid finite element method and a novel iterative IMPEC scheme [1]. We provide two examples showing the performance of the numerical scheme.
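For reference, the extended Darcy's law for each phase $\alpha$ takes the standard form $\mathbf{v}_\alpha = -\frac{k_{r\alpha}(S_\alpha)}{\mu_\alpha}\,\mathbf{K}\left(\nabla p_\alpha - \rho_\alpha \mathbf{g}\right)$, where $\mathbf{K}$ is the absolute permeability tensor, $k_{r\alpha}$ the relative permeability, $S_\alpha$ the saturation, $\mu_\alpha$ the viscosity, $p_\alpha$ the phase pressure and $\rho_\alpha \mathbf{g}$ the gravity term; these phase velocities enter the component mass conservation equations solved by the mixed-hybrid finite element scheme (notation may differ from the paper).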

12:00
Numerical Simulation of Free Surface Affected by Submarine with a Rotating Screw Moving Underwater

ABSTRACT. We conducted a numerical simulation of the free surface affected by the diving movement of an object such as a submarine. We have already proposed a computation method that combines the moving grid finite volume method and a surface height function method. In that case, the diving movement was expressed only as a traveling motion, not as a deformation. To express the deformation of a body underwater, the unstructured moving grid finite volume method and a sliding mesh approach are combined. The resulting calculation method is expected to be highly versatile. After the scheme was validated, it was put to practical use. The free surface affected by a submarine with a rotating screw moving underwater was computed using the proposed method. Owing to the computation being for a relatively shallow depth, a remarkable deformation of the free surface occurred. In addition, the movement of the submarine body had a more dominant effect than the screw rotation on changing the shape of the free water surface.

10:40-12:20 Session 3G: CLDD 1
10:40
Chosen Challenges of Imbalanced Data Stream Classification

ABSTRACT. Standard artificial intelligence methods are, in general, dedicated to problems with both stationary and balanced concept characteristics. Meanwhile, the application of such solutions in production environments means that we provide the end-users of our systems with models that are outdated from the beginning, fitted to the problem sample available in the research phase of the project. Real data streams rarely reflect the stationary characteristics of the problems, and even less frequently present concepts that appear in a balanced proportion. We relatively often deal with problems that simultaneously change their prior and posterior characteristics in the time domain.

When designing methods for processing imbalanced data streams, it is necessary to take into account both the limited memory factor and the necessity to reduce the computational complexity of the applied algorithms. Additionally, many problems of this type have limited annotation, which forces an active approach to model construction.

This talk will cover the processing of data streams in which the concept does not have to be stationary and is constantly changing, both in the posterior probability, as a result of the relatively well-known concept drift phenomenon, and, much less frequently considered in the literature, in the prior probability of the analyzed problems. A broad taxonomy of data streams will be analyzed, identifying the primitive relationships that determine the various difficulties of each variety of data stream. Particular emphasis will be placed on extreme cases, in which inference and decision making concerning the patterns of the minority class are carried out based on a very limited pool of examples. Groups of available solutions and, most of all, appropriate protocols for evaluating data stream methods will also be presented.

11:30
Soft Confusion Matrix Classifier for Stream Classification

ABSTRACT. In this paper, the issue of tailoring the soft confusion matrix (SCM) based classifier to deal with the stream learning task is addressed. The main goal of the work is to develop a wrapping classifier that enables incremental learning for classifiers that are unable to learn incrementally. The goal is achieved by making two improvements in the previously developed SCM classifier. The first one is aimed at reducing the computational cost of the SCM classifier. To do so, the definition of the fuzzy neighbourhood of an object is changed. The second one is aimed at dealing effectively with concept drift. This is done by employing an ADWIN-driven concept drift detector that is used not only to detect the drift but also to control the size of the neighbourhood. The obtained experimental results show that the proposed approach significantly outperforms the reference methods.

11:50
Some proposal of the high dimensional PU learning classification procedure

ABSTRACT. In our work, we propose a new classification method for positive and unlabeled (PU) data, called the LassoJoint classification procedure, which combines the thresholded Lasso approach in the first two steps with the joint method based on logistic regression, introduced by Teisseyre et al., in the last step. We prove that, under some regularity conditions, our procedure satisfies the screening property. We also conduct a simulation study in order to compare the proposed classification procedure with the oracle method.
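A minimal sketch of the general two-stage idea, thresholded Lasso screening followed by logistic regression on the selected features, assuming standard scikit-learn estimators; this is not the LassoJoint procedure itself, whose last step applies the joint method for PU data rather than plain logistic regression.

    import numpy as np
    from sklearn.linear_model import Lasso, LogisticRegression

    def thresholded_lasso_then_logistic(X, s, alpha=0.01, threshold=1e-3):
        """Screen features with a thresholded Lasso, then fit a logistic model
        on the selected features; s is the observed (positive/unlabeled) label."""
        lasso = Lasso(alpha=alpha).fit(X, s)
        selected = np.flatnonzero(np.abs(lasso.coef_) > threshold)   # thresholding step
        clf = LogisticRegression(max_iter=1000).fit(X[:, selected], s)
        return selected, clf

    # hypothetical usage on synthetic high-dimensional PU-style data
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 500))
    s = (X[:, 0] - X[:, 1] + rng.normal(size=200) > 0).astype(int)
    selected, clf = thresholded_lasso_then_logistic(X, s)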

10:40-12:20 Session 3H: CSOC 1
10:40
A Model for Urban Social Networks

ABSTRACT. Defining accurate and flexible models of real-world networks of human beings is instrumental in understanding the observed properties of phenomena taking place across those networks and in supporting computer simulations of dynamic processes of interest for several areas of research, including computational epidemiology, which has recently been high on the agenda. In this paper we present a flexible model for generating age-stratified and geo-referenced synthetic social networks on the basis of widely available aggregated demographic data and, possibly, of estimated age-based social mixing patterns. Using the Italian city of Florence as a case study, we characterize our network model under selected configurations and show its potential as a building block for simulating the propagation of infections. A fully operational and parametric implementation of our model is released as open source.

11:00
Three-state opinion q-voter model with bounded confidence

ABSTRACT. We study the q-voter model with bounded confidence on the complete graph. Agents can be in one of three states. Two types of agent behaviour are investigated: conformity and independence. We analyze whether this system is qualitatively different from the corresponding model without bounded confidence. The key result of this paper is that the system has two phase transitions: one between order-order phases and another between order-disorder phases.

11:20
The evolution of political views within the model with two binary opinions.

ABSTRACT. We study a model aimed at describing political views within a two-dimensional approach, known as the Nolan chart or the political compass, which distinguishes between opinions related to economic and personal freedom. We conduct Monte Carlo simulations and show that in the absence of noise, i.e. at social temperature T = 0, consensus is impossible if there is a coupling between opinions related to economic and personal freedom. Moreover, for T > 0 we show how the strength of the coupling between these opinions can hamper or facilitate consensus.

11:40
How to reach consensus? Better disagree with your neighbor.

ABSTRACT. We study the basic first-passage properties (exit probability and exit time) of a discrete one-dimensional model of opinion dynamics. The model studied here is a generalized version of the original Sznajd model (SM). This means that an update of the system consists of a random choice of a source pair (i, i+1) and the influence of this pair on the target, i.e. the two agents surrounding this pair, at sites (i-1, i+2). If the source pair is unanimous, then the target agents take the same opinion as the source of influence. Otherwise, with probability p, the target agents take the opinion opposite to that of their nearest neighbors. We study the model via Monte Carlo simulations from two types of initial conditions, parametrized by the concentration c0 of agents with a positive opinion at time t = 0: a random one as well as a sorted one. We show that the exit probability does not change with the size of the system N, whereas the average exit time tau, i.e. the time to reach an absorbing state, scales with N as tau ~ N^alpha. Moreover, we show that the exit time behaves non-monotonically with p: it decreases with increasing p up to a certain optimum p* = p*(c0), which is surprisingly high. This means that, in general, consensus is reached more rapidly if agents disagree more often with their nearest neighbors in case of uncertainty.

12:00
Efficient calibration of a financial agent-based model using the method of simulated moments

ABSTRACT. We propose a new efficient method of calibrating agent-based models using the Method of Simulated Moments. It utilizes an optimization algorithm which gradually narrows down the search area by examining local neighborhoods of promising solutions. Our method obtains better calibration accuracy for a benchmark financial agent-based model in comparison to a broad selection of other methods, while using just a tiny fraction of their computational budget.
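A minimal sketch of the Method of Simulated Moments objective that such a calibration minimizes (the paper's contribution is the search strategy around it, which is not shown here); simulate_model, the chosen moments and the weighting matrix are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize

    def moments(returns):
        """Illustrative moment set: variance, kurtosis, lag-1 autocorrelation of |r|."""
        r = np.asarray(returns)
        absr = np.abs(r)
        return np.array([
            r.var(),
            ((r - r.mean()) ** 4).mean() / r.var() ** 2,
            np.corrcoef(absr[:-1], absr[1:])[0, 1],
        ])

    def msm_objective(theta, empirical_returns, simulate_model, weight, n_sim=10):
        """Weighted distance between empirical moments and simulated-moment averages."""
        m_emp = moments(empirical_returns)
        m_sim = np.mean([moments(simulate_model(theta, seed=s)) for s in range(n_sim)], axis=0)
        g = m_emp - m_sim
        return g @ weight @ g

    # hypothetical usage, assuming simulate_model(theta, seed) returns a return series
    # result = minimize(msm_objective, x0=np.array([0.1, 0.5]),
    #                   args=(data, simulate_model, np.eye(3)), method="Nelder-Mead")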

10:40-12:20 Session 3I: DisA 1
10:40
The Methods and Approaches of Explainable Artificial Intelligence

ABSTRACT. Artificial Intelligence has found innumerable applications, becoming ubiquitous in contemporary society. Intelligent systems are being trusted with decision making increasingly often, from making unnoticeable, minor choices to determining people's fates (as in, for example, the case of predictive policing). This fact raises serious concerns about the lack of explainability of those systems. Finding ways to enable humans to comprehend the results provided by AI is a blooming area of research right now. This paper explores the current findings in the field of Explainable Artificial Intelligence (xAI), along with xAI methods and the solutions that realise them. The paper provides an umbrella perspective on available xAI options, sorting them into a range of levels of abstraction, starting from community-developed code snippets implementing facets of xAI research all the way up to comprehensive solutions utilising state-of-the-art achievements in the domain.

11:00
Fake or real? The novel approach to detect online disinformation based on multi ML classifiers

ABSTRACT. Background: machine learning (ML) techniques have been implemented in numerous applications, including healthcare, security, entertainment, and sports. In this article, we present how ML can be used for detecting fake news. The problem of online disinformation has recently become one of the most challenging issues of computer science. Methods: in this research, we developed a fake news detection method based on multiple classifiers (CNN, XGBoost, Random Forest, Naive Bayes, SVM). In the proposed method, two classifiers cooperate, which enables obtaining better results. We used realistic, publicly available data in order to train and test the classifiers. Results: in the article, we present numerous experiments; they differ in the classifiers implemented and in some of the parameters tuned. We report promising results (accuracy = 0.95, precision = 0.99, recall = 0.91, and F1-score = 0.95). Conclusion: the presented research proves that machine learning is a promising approach to fake news detection. The proposed method will be developed further in the future, and it could possibly be extended with an explainability block.

11:20
Transformer Based Models in Fake News Detection

ABSTRACT. The article presents models for detecting fake news and the results of the analyses of the application of these models. The precision, F1-score, and recall metrics were used as measures of model quality. Neural network architectures based on state-of-the-art Transformer-type solutions were applied to create the models. The computing capabilities of the Google Colaboratory remote platform, as well as the Flair library, made it feasible to obtain reliable, robust models for fake news detection. The problem of disinformation and fake news is an important issue for modern societies, which commonly use state-of-the-art telecommunications technologies. Artificial intelligence and deep learning techniques are considered to be effective tools for protection against these undesirable phenomena.

11:40
Towards Model-Agnostic Ensemble Explanations

ABSTRACT. Explainable Artificial Intelligence (XAI) methods form a large portfolio of different frameworks and algorithms. Although the main goal of all explanation methods is to provide insight into the decision process of an AI system, their underlying mechanisms may differ. This may result in very different explanations for the same task. In this work, we present an approach that aims at combining several XAI algorithms into one ensemble explanation mechanism via a quantitative, automated evaluation framework. We focus on model-agnostic explainers to provide the most robustness, and we demonstrate our approach on an image classification task.

10:40-12:20 Session 3J: CMSA 1
10:40
A new multi-objective approach to optimize irrigation using a crop simulation model and weather history

ABSTRACT. Optimization of water consumption in agriculture is necessary to preserve freshwater reserves and reduce the burden on the environment. Finding optimal irrigation schedules and water resources for crops is necessary to increase the efficiency of water usage. Many optimization approaches maximize crop yield or profit but do not consider the impact on the environment. We propose a machine learning approach based on the crop simulation model WOFOST to assess the crop yield and water use efficiency. In our work, we use weather history to evaluate extreme weather scenarios. The application of multi-criteria optimization based on the non-dominated sorting genetic algorithm II (NSGA-II) allows users to find the dates and volumes of water for irrigation, maximizing the yield and reducing the water loss. In the case study, we compared the effectiveness of NSGA-II with Monte Carlo search and a real farmer's strategy. We showed a decrease in water consumption and an increase in yield for sugar beet. Our approach produced a higher yield for potatoes than the farmer's strategy with the same level of water consumption.
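To illustrate the core of the multi-criteria selection only (NSGA-II additionally uses non-dominated sorting into successive fronts, crowding distance and genetic operators), a minimal Pareto-front extraction for the two objectives named above, maximizing yield while minimizing applied water, could look like this; the candidate schedules are hypothetical.

    def dominates(a, b):
        """a dominates b: yield no worse and water use no higher, strictly better in one."""
        no_worse = a["yield"] >= b["yield"] and a["water"] <= b["water"]
        strictly_better = a["yield"] > b["yield"] or a["water"] < b["water"]
        return no_worse and strictly_better

    def pareto_front(candidates):
        """Return the non-dominated irrigation schedules (first Pareto front)."""
        return [c for c in candidates
                if not any(dominates(other, c) for other in candidates if other is not c)]

    # hypothetical irrigation schedules: total applied water (mm) vs. simulated yield (t/ha)
    candidates = [
        {"water": 120, "yield": 55.0},
        {"water": 90,  "yield": 54.5},
        {"water": 90,  "yield": 50.0},
        {"water": 60,  "yield": 48.0},
    ]
    front = pareto_front(candidates)   # keeps (120, 55.0), (90, 54.5) and (60, 48.0)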

11:00
Bluetooth Low Energy Livestock Positioning for Smart Farming Applications

ABSTRACT. Device localization provides additional information and context to IoT systems, including Agriculture 4.0 and Smart Farming. However, enabling localization incurs additional requirements and trade-offs that often do not fit into application constraints: the use of specific radio technologies and increased communication, computational, and energy costs. This paper presents a localization method that was designed for Smart Farming and applies to a large range of radio technologies and IoT systems. The method was verified in a real-life IoT system dedicated to monitoring cow health and behavior. In a large multi-path environment with a large number of obstacles, using only 10 anchors, the system achieves an average localization error of 6.3 m. This makes it possible to use the proposed approach for animal tracking and activity monitoring, which is beneficial for well-being assessment.

11:20
Monitoring the Uniformity of Fish Feeding Based on Image Feature Analysis

ABSTRACT. The main purpose of the conducted research is the development and experimental verification of methods for detecting fish feeding, as well as checking its uniformity, in recirculating aquaculture systems (RAS) using machine vision. Particular emphasis has been placed on methods useful for rainbow trout farming, due to the planned implementation of the vision-based system in a RAS-based farming center currently under construction as part of a project conducted within the "Fisheries and the Sea" program. The obtained results, based on the analysis of individual video frames, show that the estimation of feeding uniformity in individual RAS-based farming ponds is possible using selected local image features, without the need for camera calibration. The experimental results have been achieved for images acquired in RAS-based rainbow trout farming ponds and verified using publicly available video sequences of tilapia and catfish feeding.

12:20-13:20 Lunch
13:20-15:00 Session 4A: MT 3
13:20
Out-plant milk-run-driven mission planning subject to dynamic changes of date and place delivery

ABSTRACT. We consider a dynamic vehicle routing problem in which a fleet of vehicles delivers ordered services or goods to spatially distributed customers while moving along separate milk-run routes over a given, periodically repeating time horizon. Customer orders and the feasible time windows for the execution of those orders can be revealed dynamically over time. The problem essentially entails the rerouting of routes determined in the course of their proactive planning. Rerouting takes into account current order changes, while proactive route planning takes into account anticipated (previously assumed) changes in customer orders. Changes to planned orders may involve both changes in the dates of the services provided and emerging notifications of additional customers. The considered problem is formulated as a constraint satisfaction problem using the ordered fuzzy number (OFN) formalism, which allows us to handle the fuzzy nature of the variables involved, e.g. the timeliness of the deliveries performed, through an algebraic approach. The computational results show that the proposed solution outperforms the commonly used computer simulation methods.

13:40
An Efficient Hybrid Planning Framework for In-Station Train Dispatching

ABSTRACT. In-station train dispatching is the problem of optimising the effective utilisation of the available railway infrastructure to mitigate incidents and delays. This is a fundamental problem for the efficiency of the whole railway network, and in turn for the transportation of goods and passengers, given that stations are among the most critical points in networks, since a high number of interconnections between train routes occurs there. Despite such importance, in-station train dispatching is nowadays mainly managed manually by human operators. In this paper we present a framework for solving in-station train dispatching problems, to support human operators in dealing with this task. We employ automated planning languages and tools for solving the task: PDDL+ for the specification of the problem, and the ENHSP planning engine, enhanced by domain-specific techniques, for solving it. We carry out an in-depth analysis using real data from a station in the north-west of Italy, which shows the effectiveness of our approach and the contribution that domain-specific techniques can make in efficiently solving the various instances of the problem. Finally, we also present a visualisation tool for graphically inspecting the generated plans.

14:00
Evaluating energy-aware scheduling algorithms for I/O-intensive scientific workflows

ABSTRACT. Improving energy efficiency has become necessary to enable sustainable computational science. At the same time, scientific workflows are key in facilitating distributed computing in virtually all domain sciences. As data and computational requirements increase, I/O-intensive workflows have become prevalent. In this work, we evaluate the ability of two popular energy-aware workflow scheduling algorithms to provide effective schedules for this class of workflow applications, that is, schedules that strike a good compromise between workflow execution time and energy consumption. These two algorithms make decisions based on a widely used power consumption model that simply assumes a linear correlation with CPU usage. Previous work has shown this model to be inaccurate, in particular for modeling the power consumption of I/O-intensive workflow executions, and has proposed an accurate model. We evaluate the effectiveness of the two aforementioned algorithms based on this accurate model. We find that, when making their decisions, these algorithms can underestimate power consumption by up to 360%, which makes it unclear how well they would fare in practice. We then propose a simple I/O-aware algorithm that uses the accurate power consumption model to make scheduling decisions. Experimental results show that this algorithm achieves a better compromise between energy consumption and workflow execution time than the two popular algorithms.
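The widely used linear model referred to above expresses the power draw of a host as a function of its CPU utilization $u \in [0,1]$ only: $P(u) = P_{idle} + (P_{max} - P_{idle})\,u$, which ignores the power drawn by I/O activity; the accurate model used in this evaluation additionally accounts for such non-CPU components (its exact form is given in the cited previous work and is not reproduced here).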

14:20
A Job Shop Scheduling Problem with Due Dates under Conditions of Uncertainty

ABSTRACT. In this work, we consider a job shop problem with due dates under conditions of uncertainty. Uncertainty is considered for operation execution times and job completion dates and is modeled by normal and Erlang random variables. We present algorithms whose construction is based on the tabu search method. Thanks to the application of the probabilistic model, it was possible to obtain solutions more resistant to data disturbances than in the classical approach.

13:20-15:00 Session 4B: MT 4
13:20
SGAIN, WSGAIN-CP and WSGAIN-GP: Novel Gan Methods for Missing Data Imputation

ABSTRACT. Real-world datasets often have missing values, which hinders the use of a large number of machine learning (ML) estimators. To overcome this limitation in a data analysis pipeline, data points may be deleted in a data preprocessing stage when at least one value is missing. However, a better alternative is data imputation. Several methods based on Artificial Neural Networks (ANN) have recently been proposed as successful alternatives to classical discriminative imputation methods. Amongst those ANN imputation methods are the ones that rely on Generative Adversarial Networks (GAN), a specific class of ANN. This paper presents three data imputation methods based on GAN: SGAIN, WSGAIN-CP and WSGAIN-GP. These methods were tested on datasets with different settings of missing value probabilities, where the values are missing completely at random. The evaluation of the newly developed methods shows that they are equivalent to or outperform competitive state-of-the-art imputation methods in different ways, either in terms of the accuracy of post-imputation tasks (e.g., prediction or classification), response time, or the data imputation quality.

13:40
Machine-Learning Based Prediction of Multiple Types of Network Traffic

ABSTRACT. Prior knowledge of approximate future traffic requirements allows suitable network parameters to be adjusted to improve the network's performance. To this end, various analyses and traffic prediction methods assisted by machine learning techniques are being developed. In this paper, we study on-line multiple time series prediction for traffic of various frame sizes. Firstly, we describe the gathered real network traffic data and study their seasonality and the correlations between traffic types. Secondly, we apply three machine learning algorithms, namely linear regression, k nearest neighbours, and random forest, to predict the network data, and compare them under various models and input features. To evaluate prediction quality, we use the root mean squared percentage error (RMSPE). We define three machine learning models, in which traffic related to a particular frame size is predicted based on the historical data of the corresponding frame size only, of several frame sizes, or of all frame sizes. According to the numerical experiments performed on four different datasets, linear regression yields the highest accuracy of the three algorithms. As the results indicate, including historical data of all frame sizes to predict the summary traffic of a certain frame size increases the algorithm's accuracy at the cost of longer execution times. However, with appropriate input feature selection based on seasonality, it is possible to decrease this time overhead with an almost unnoticeable decrease in accuracy.
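
For illustration, a minimal sketch of the RMSPE metric mentioned above, using a common definition (the paper's exact normalisation may differ):

```python
import numpy as np

def rmspe(y_true, y_pred):
    # Root mean squared percentage error, in percent.
    # Assumes y_true contains no zeros; a common definition, possibly not the
    # exact one used in the paper.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2))

# Example: per-frame-size traffic volumes vs. their forecasts.
print(rmspe([120.0, 95.0, 130.0], [115.0, 100.0, 128.0]))
```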

14:00
Scalable handwritten text recognition system for lexicographic sources of under-resourced languages and alphabets

ABSTRACT. The paper discusses an approach to deciphering large collections of handwritten index cards of historical dictionaries. Our study aims at reading the cards and linking their lemmas to a searchable list of dictionary entries, for a large historical dictionary entitled the Dictionary of the 17th- and 18th-century Polish, which comprises 2.8 million index cards. We apply a tailored HTR solution that involves (1) an optimized detection model based on keras-ocr-craft; (2) a recognition model to decipher the handwritten content, designed as an STN transformation followed by an RCNN with a ResNet backbone and a CTC layer, trained using the CVIT dataset and a synthetic set of 500,000 generated Polish words of different lengths; (3) a post-processing step, in which the results returned by CTC (i.e. connectionist temporal classification predictions) were decoded using a Constrained Word Beam Search: the predictions were matched against a list of dictionary entries known in advance. Our model achieved an accuracy of 0.881 at the word level, which can be considered competitive with the base model offered by an RCNN network. Within this study we also produced a set of 20,000 manually annotated index cards that can be used for future benchmarks and transfer-learning HTR applications.

14:20
Scientific workflow management on hybrid clouds with cloud bursting and transparent data access

ABSTRACT. Cloud bursting is an application deployment model wherein additional computing resources are provisioned from public clouds when local resources are not sufficient, e.g. during peak demand periods. We propose and experimentally evaluate a cloud-bursting solution for scientific workflows. Our solution is portable thanks to the use of Kubernetes for deploying the workflow management system and computing clusters in multiple clouds. We also introduce transparent data access by employing a virtual distributed file system across the clouds, allowing jobs to use a POSIX file system interface while hiding data transfer between clouds. To balance load distribution and minimize the communication volume between clouds, we leverage graph partitioning, while ensuring that the algorithm distributes the load equally at each parallel execution stage of a workflow. The solution is experimentally evaluated using the HyperFlow workflow management system integrated with the Onedata data management platform, deployed in our on-premise cloud at Cyfronet AGH and in the Google Cloud.

13:20-15:00 Session 4C: AIHPC4AS 2
13:20
AIHPC4AS KEYNOTE: The Discontinuous Petrov-Galerkin (DPG) Method for Convection-Reaction Problems

ABSTRACT. We present a progress report on the development of Discontinuous Petrov-Galerkin methods for the convection-reaction problem in the context of time-stepping and space-time discretizations of Boltzmann equations [1]. The work includes a complete analysis for both conforming (DPGc) and nonconforming (DPGd) versions of the DPG method, employing either globally continuous or discontinuous piecewise polynomials to discretize the traces. The results include the construction of a local Fortin operator for the case of constant convection and a global discrete stability analysis for both the DPGc and DPGd methods. The theoretical findings are illustrated with numerous numerical experiments in two space dimensions. [1] L. Demkowicz, N. Roberts, DPG Method for the Convection–Reaction Problem Revisited, submitted.

14:00
Refined Isogeometric Analysis for Solving Quadratic Eigenproblems in Electromagnetics

ABSTRACT. We propose the use of refined isogeometric analysis (rIGA) for efficient eigencomputations of electromagnetic wave propagation problems. rIGA reduces the continuity of certain basis functions and subdivides the computational domain into high-continuity macroelements interconnected by low-continuity hyperplanes. Thus, rIGA preserves the desirable properties of maximum-continuity isogeometric analysis (IGA), while reducing the solution cost by decreasing the matrix connectivity. When considering the quadratic eigenproblems arising in electromagnetics, we first perform a linearization to obtain a linear eigenproblem. Then, to obtain the eigensolution of the linearized multidimensional systems with multiple and clustered eigenfrequencies, we solve a shift-and-invert eigenproblem, which results in fast and accurate eigencomputations. The total eigencomputation cost of linearized eigenproblems is almost equal to the cost of matrix factorization plus the cost of the matrix-vector products performed within the Krylov projection iterations. The latter consists of two sets of operations: multiplications of system matrices by vectors, and forward/backward eliminations. When using an optimal rIGA discretization, rIGA performs matrix factorization (the most expensive operation of the eigencomputation) up to O(p^2) times faster in large domains compared to IGA, where p is the polynomial degree of the basis functions. Additionally, rIGA reduces the forward/backward elimination cost asymptotically by up to O(p) in sufficiently large domains, while it slightly increases the cost of multiplying system matrices by vectors. Thus, based on the number of required numerical operations, rIGA improves the eigencomputation cost by O(p) for moderate-to-large problems.

14:20
Supermodeling - a meta-procedure for data assimilation and parameters estimation

ABSTRACT. A supermodel synchronizes several imperfect instances of a baseline model - e.g., variously parametrized models of a complex system - into a single simulation engine with superior prediction accuracy. In this paper, we present convincing pieces of evidence in support of the hypothesis that supermodeling can also be used as a meta-procedure for fast data assimilation (DA). Thanks to this, the computational time of parameter estimation in multi-parameter models can be radically shortened. To this end, we compare various supermodeling approaches which employ: (1) three training schemes, i.e., "nudging", weighting and assimilation; (2) four classical data assimilation algorithms, i.e., ABC-SMC, 3dVar, an evolutionary algorithm, and the simplex method; and (3) various coupling schemes between the dynamical variables of the ensembled models. We have performed extensive tests on a model of diversified cancer dynamics covering tumor growth, recurrence, and remission. We demonstrate that in all the configurations the supermodels are radically more efficient than single models trained using classical DA schemes. We show that the tightly coupled supermodel, trained using the "nudging" scheme, synchronizes best, producing the most efficient and accurate prognoses of cancer dynamics. Similarly, in the context of applying supermodeling as a meta-algorithm for data assimilation, the classical 3dVar algorithm appeared to be the most efficient baseline DA scheme for both supermodel training and pre-training of the sub-models.

14:40
Effective solution of ill-posed inverse problems with stabilized forward solver

ABSTRACT. We consider inverse parametric problems for elliptic variational PDEs, which are solved through the minimization of misfit functionals. The main difficulties encountered are the multimodality and insensitivity of the misfit, as well as the weak conditioning of the direct (forward) problem, which therefore requires stabilization. A complex multi-population memetic strategy, hp-HMS, combined with the Petrov-Galerkin method stabilized by the Demkowicz operator is proposed to overcome the obstacles mentioned above. This paper delivers the theoretical motivation for the common inverse/forward error scaling, which significantly reduces the computational cost of the whole strategy. A short illustrative numerical example is provided at the end of the paper.

13:20-15:00 Session 4D: BBC 2
13:20
Towards mimetic membrane systems in Molecular dynamics: characteristics of E.coli membrane system

ABSTRACT. Plenty of research is focused on the analysis of the interactions between bacterial membranes and antimicrobial compounds or proteins. Research hypotheses are often formed according to results from numerical models such as molecular docking or molecular dynamics. However, simulated membrane models often differ significantly from the real ones, which may lead to inaccurate conclusions. In this paper, we employed molecular dynamics simulations to create a mimetic full membrane model of Escherichia coli and to evaluate how membrane complexity may influence the mainstream structural, mechanical and dynamical parameters. The impact of the presence of the O-antigen region in the outer membrane was also assessed. In the analysis, we calculated membrane thickness, area per lipid, order parameter, lateral diffusion coefficient, interdigitation of acyl chains, mechanical parameters such as bending rigidity and area compressibility, and also lateral pressure profiles. We demonstrated that the outer membrane properties strongly depend on the structure of the lipopolysaccharides, which changes its properties dramatically with respect to each of the investigated parameters. Furthermore, we showed that the presence of the inner membrane during simulations, as in a full E. coli cell envelope, significantly changes the measured properties of the outer membrane.

13:40
PathMEx: Pathway-based Mutual Exclusivity for Discovering Rare Cancer Driver Mutations

ABSTRACT. The genetic material we carry today is different from that we were born with: our DNA is prone to mutations. Some of these mutations can make a cell divide without control, resulting in a growing tumor. Typically, a large number of mutations can be detected in a cancer sample from a patient, and only a few of those are drivers - mutations that positively contribute to tumor growth. The majority are passenger mutations that either accumulated before the onset of the disease but did not cause it, or are byproducts of the genetic instability of cancer cells. One of the key questions in understanding the process of cancer development is which mutations are drivers, and should be analyzed as potential diagnostic markers or targets for therapeutics, and which are passengers. We propose PathMEx, a novel method based on the simultaneous optimization of patient coverage, mutation mutual exclusivity, and pathway overlap among putative cancer driver genes. Compared to the state-of-the-art method Dendrix, the proposed algorithm finds sets of putative driver genes of higher quality in three sets of cancer samples: brain, lung, and breast tumors. The genes in the solutions belong to pathways with known associations with cancer. The results show that PathMEx should be part of a state-of-the-art toolbox in the driver gene discovery pipeline. It can help detect low-frequency driver genes that can be missed by existing methods.

14:00
Serverless Nanopore Basecalling with AWS Lambda

ABSTRACT. The serverless computing paradigm simplifies operations and offers highly parallel execution and high scalability without the need for manual management of the underlying infrastructure. This paper aims to evaluate whether recent advancements in AWS Lambda, such as container support and increased computing resource limits, allow it to serve as an underlying platform for running bioinformatics workflows such as basecalling of nanopore reads. For the purposes of the paper, we developed a sample workflow centred on the Guppy basecaller, which was tested in multiple scenarios. The results of the experiments show that AWS Lambda is a viable platform for basecalling, capable of basecalling nanopore reads from multiple sequencing runs at the same time while keeping infrastructure maintenance overhead low. We also believe that recent improvements to AWS Lambda make it an interesting choice for a growing number of bioinformatics applications.
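
As an illustration only, a minimal sketch of how a containerised basecalling function could be dispatched from Python with boto3; the function name "guppy-basecaller" and the payload schema are assumptions, not the paper's actual interface:

```python
import json
import boto3

lam = boto3.client("lambda")

def submit_basecalling(fast5_s3_uri, output_prefix):
    # Hypothetical payload schema for a containerised Guppy Lambda function.
    payload = {"input": fast5_s3_uri, "output": output_prefix}
    resp = lam.invoke(
        FunctionName="guppy-basecaller",   # assumed function name
        InvocationType="Event",            # asynchronous invocation
        Payload=json.dumps(payload).encode(),
    )
    return resp["StatusCode"]              # 202 indicates the async call was accepted

# Many such invocations can run in parallel, one per batch of reads.
```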

14:20
A Software Pipeline Based on Sentiment Analysis to Analyze Narrative Medicine Texts

ABSTRACT. By using social media, people can exchange sentiments and emotions, making it possible to understand public opinion on specific issues. Sentiment Analysis (SA) is a text-mining (TM) and natural language processing (NLP) methodology for extracting sentiments, opinions and emotions from written texts, usually provided through social media or questionnaires. Sharing the medical and clinical experiences of patients through social media is the focus of so-called Narrative Medicine (NM). Here we report some research experiences in applying SA techniques to analyze NM texts. A problem to be faced in NM is the automatic analysis of a potentially large set of documents. SA is useful for providing immediate analysis and extracting information from medical texts quickly. We present a software pipeline based on SA and TM which allows NM texts to be analyzed effectively. First experimental results show that topics related to diseases can be discovered.
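
The abstract does not name a specific toolkit; as a hedged illustration of the sentiment-scoring step such a pipeline might contain, NLTK's VADER analyser can be used (note that VADER is tuned for social-media English):

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)   # one-off lexicon download
sia = SentimentIntensityAnalyzer()

texts = [
    "The new therapy gave me my life back.",
    "The side effects were exhausting and nobody listened to me.",
]
for text in texts:
    # Returns neg/neu/pos proportions and a compound score in [-1, 1].
    print(sia.polarity_scores(text))
```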

13:20-15:00 Session 4E: COMS 2
13:20
Pruned simulation-based optimal sailboat path search using micro HPC systems

ABSTRACT. Simulation-based optimal path search algorithms are often solved using dynamic programming, which is typically computationally expensive. This can be an issue in a number of cases, including near-real-time autonomous robot or sailboat path planners. We present a solution to this problem which is both effective and (energy) efficient. Its three key elements -- an accurate and efficient estimator of the performance measure, two-level pruning (which augments the estimator-based search space reduction with smart simulation and estimation techniques), and an OpenCL-based SPMD parallelisation of the algorithm -- are presented in detail. The included numerical results show the high accuracy of the estimator (medians of relative estimation errors smaller than $0.003$), the high efficacy of the two-level pruning (search space and computing time reductions of seventeen to twenty times), and the high parallel speedup (its maximum observed value was almost $40$). Combining these effects gives (up to) $782$ times faster execution. The proposed approach can be applied to various domains. It can be considered an optimal path planning framework parametrised by a problem-specific performance measure heuristic/estimator.

13:40
Two stage approach to optimize electricity contract capacity problem for commercial customers

ABSTRACT. The electricity tariffs available to Polish customers depend on the voltage level to which the customer is connected as well as on the contracted capacity in line with the user demand profile. Each consumer, before connecting to the power grid, declares the maximum power demand. This amount, referred to as the contracted capacity, is used by the electricity utility to assign an appropriate connection type to the power grid, including the size of the security breaker. The maximum power is also the basis for calculating fixed charges for electricity consumption. Usually, the maximum power for a household user is controlled through a circuit breaker. For industrial and business users the maximum power is controlled and metered through peak meters. If the peak demand exceeds the contracted capacity, a penalty charge is applied to the exceeded amount, which is up to ten times the basic rate. In this article, we present a solution for entrepreneurs based on a two-stage approach that predicts maximum load values and the moments of exceeding the contracted capacity in the short term, i.e., up to one month ahead. The forecast is then used to optimize the capacity volume to be contracted in the following month so as to minimize network charges for exceeding the contracted level. As shown experimentally on two datasets, the application of a multiple-output forecasting artificial neural network model and a genetic algorithm for load optimization delivers significant benefits to the customers.
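
A minimal sketch of the idea behind the second stage, under assumed tariff values (the rate and the tenfold penalty multiplier below are illustrative, taken from the description above, not the paper's exact figures):

```python
import numpy as np

def monthly_cost(contracted_kw, forecast_peaks_kw, rate=10.0, penalty_mult=10.0):
    # Fixed charge for the contracted capacity plus penalties for exceeding it.
    excess = np.maximum(np.asarray(forecast_peaks_kw) - contracted_kw, 0.0)
    return rate * contracted_kw + penalty_mult * rate * excess.sum()

forecast = np.array([105.0, 98.0, 112.0, 101.0])     # hypothetical forecast peaks (kW)
candidates = np.arange(90.0, 131.0, 1.0)             # capacities considered
best = min(candidates, key=lambda c: monthly_cost(c, forecast))
```

In the paper this choice is made with a genetic algorithm driven by neural-network forecasts; the exhaustive search above only illustrates the objective being minimised.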

14:00
Improved Design Closure of Compact Microwave Circuits by Means of Performance Requirement Adaptation

ABSTRACT. Numerical optimization procedures are widely used in the design of microwave components and systems. Most often, optimization algorithms are applied at the later stages of the design process to tune the geometry and/or material parameter values. To ensure sufficient accuracy, parameter adjustment is realized at the level of full-wave electromagnetic (EM) analysis, which creates perhaps the most important bottleneck due to the entailed computational expenses. The cost issue hinders the use of global search procedures, whereas local routines often fail when the initial design is of insufficient quality, especially in terms of the relationship between the current and the target operating frequencies. This paper proposes a procedure for automated adaptation of the performance requirements, which aims at improving the reliability of the parameter tuning process in challenging situations such as those described above. The procedure temporarily relaxes the requirements to ensure that the existing solution can be improved, and gradually tightens them when the optimization process is close to termination. The amount and timing of the specification adjustment are governed by the design quality at the current design and the convergence status of the algorithm. The proposed framework is validated using two examples of microstrip components (a coupler and a power divider), and is shown to handle design scenarios that are infeasible for conventional approaches, in particular when decent starting points are unavailable.

14:20
Graph-grammar based longest-edge refinement algorithm for three-dimensional optimally p refined meshes with tetrahedral elements

ABSTRACT. The finite element method is a popular way of solving engineering problems in geoengineering. Three-dimensional grids employed for approximating the formation layers are often constructed from tetrahedral finite elements. Refinement algorithms that avoid hanging nodes are desirable, since they avoid constrained approximation on broken edges and faces. We present a new mesh refinement algorithm for such tetrahedral grids with the following features: (1) it is a two-level algorithm, refining first the element faces and then the element interiors; (2) for the face refinements it employs a graph-grammar based version of the longest-edge refinement algorithm to avoid hanging nodes; and (3) it allows for nearly perfect parallel execution of the second stage, refining the element interiors. We describe the algorithm using the graph-grammar based formalism. We verify the properties of the algorithm by breaking 5,000 tetrahedral elements and checking their angles and proportions. On the generated meshes without hanging nodes we span polynomial basis functions of the optimal order, selected via a metaheuristic optimization algorithm, and use them for projection-based interpolation of formation layers.

14:40
Elitism in Multiobjective Hierarchical Strategy

ABSTRACT. The paper focuses on complex metaheuristic algorithms, namely the multi-objective hierarchical strategy, which consists of a dynamically evolving tree of interdependent demes of individuals. The main contribution of this paper is the introduction of elitism in the form of an archive, locally within the demes and globally across the whole tree, together with the necessary updates between them. The newly proposed algorithms (utilizing elitism) are compared with their previous versions as well as with the best state-of-the-art multi-objective metaheuristics.

13:20-15:00 Session 4F: SOFTMAC 2
13:20
A new finite element method for Stokes equations with pressure Dirichlet boundary condition

ABSTRACT. The Stokes equations with a pressure Dirichlet boundary condition arise in many fields, typically in the modeling of blood flow and pressure in arteries. For this problem, it turns out that many elements popular in computational fluid dynamics are no longer applicable, even though these inf-sup stable elements are used for the Stokes equations with a velocity Dirichlet boundary condition. The Taylor-Hood elements are such elements. As a matter of fact, the convergence theory for the Taylor-Hood elements applied to the Stokes equations with a pressure Dirichlet boundary condition has remained open to this day. In this talk, we study some new finite elements, extending and generalizing the Taylor-Hood elements, to effectively solve the Stokes equations with a pressure Dirichlet boundary condition. Theory and numerical experiments are given.

13:40
Poroelasticity Modules for DarcyLite

ABSTRACT. This paper elaborates on the design and implementation of code modules for finite element solvers for poroelasticity in our Matlab package DarcyLite [LiuSadreWang_ICCS_2016]. Biot's model is adopted. Both linear and nonlinear cases are discussed. Numerical experiments are presented to demonstrate the accuracy and efficiency of these solvers.

14:00
Mathematical Modeling of the Single-Phase Multicomponent Flow in Porous Media

ABSTRACT. A numerical scheme with higher-order approximation in space for single-phase multicomponent flow in porous media is presented. The mathematical model consists of the Darcy velocity, transport equations for the components of a mixture, a pressure equation, and associated relations for physical quantities such as viscosity or density. The discrete problem is obtained via the discontinuous Galerkin method for the discretization of the transport equations, combined with the mixed-hybrid finite element method for the discretization of the Darcy velocity and the pressure equation, both using higher-order approximation. The resulting problem is solved with the fully mass-conservative iterative IMPEC method. Numerical experiments on 2D flow are carried out.

14:20
An enhanced finite element algorithm for thermal Darcy flows with variable viscosity

ABSTRACT. This paper deals with the development of a stable and efficient unified finite element method for the numerical solution of thermal Darcy flows with variable viscosity. The governing equations couple the Darcy equations for the pressure and velocity fields to a convection-diffusion equation for the heat transfer. The viscosity in the Darcy flow is assumed to be nonlinear, depending on the temperature of the medium. The proposed method combines a semi-Lagrangian scheme with a Galerkin finite element discretization of the governing equations, along with a robust iterative solver for the associated linear systems. The main features of the proposed finite element algorithm are that the same finite element space is used for all solution variables, including the pressure, velocity and temperature. In addition, the convection terms are accurately handled by the semi-Lagrangian scheme, the standard Courant-Friedrichs-Lewy condition is relaxed, and the time truncation errors in the diffusion terms are reduced. Numerical results are presented for two examples to demonstrate the performance of the proposed finite element algorithm.

14:40
Multilevel adaptive Lagrange-Galerkin methods for unsteady incompressible viscous flows

ABSTRACT. A highly efficient multilevel adaptive Lagrange-Galerkin finite element method for unsteady incompressible viscous flows is proposed in this work. The novel approach has several advantages: the convective part is handled by the modified method of characteristics, complex and irregular geometries are discretized using the finite element method, and, for greater accuracy and efficiency, a multilevel adaptive L2-projection using quadrature rules is employed. An error indicator based on the gradient of the velocity field is used for the multilevel adaptation. Contrary to h-adaptive, p-adaptive, and hp-adaptive finite element methods for incompressible flows, the resulting linear system in our Lagrange-Galerkin finite element method keeps the same fixed structure and size at each refinement in the adaptation procedure. To evaluate the performance of the proposed approach, we solve a coupled Burgers problem with a known analytical solution for error quantification, and then we solve an incompressible flow past two circular cylinders to illustrate the performance of the multilevel adaptive algorithm.

13:20-15:00 Session 4G: CLDD 2
13:20
Classifying Functional Data from Orthogonal Projections -- model, properties and fast implementation

ABSTRACT. We consider the problem of classifying functional, random data from equidistant samples. Such data are frequently difficult to classify when one has a large number of observations that carry little information for classification. We approach this problem using tools from functional analysis. A mathematical model of such data is proposed and its correctness is verified. Then, it is shown that any finite number of descriptors, obtained by orthogonal projections onto any differentiable basis of $L_2(0, T)$, can be consistently estimated within this model.

Computational aspects of estimating the descriptors, based on a fast implementation of the discrete cosine transform (DCT), are also investigated in conjunction with learning a classifier and using it on-line. Finally, the algorithms for learning descriptors and classifiers were tested on real-life random signals, namely on accelerations from large bucket-wheel excavators that are transmitted to the operator's cabin. The aim of these tests was also to select a classifier that is well suited to working with DCT-based descriptors.
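
A minimal sketch of DCT-based descriptors of an equidistantly sampled signal (the paper's estimator and normalisation may differ):

```python
import numpy as np
from scipy.fft import dct

def dct_descriptors(signal, n_coeffs=16):
    # First n_coeffs DCT-II coefficients, used as a compact feature vector.
    return dct(np.asarray(signal, dtype=float), type=2, norm="ortho")[:n_coeffs]

# Example: two noisy vibration-like signals of length 1024.
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
x1 = np.sin(2 * np.pi * 5 * t) + 0.1 * np.random.randn(t.size)
x2 = np.sin(2 * np.pi * 9 * t) + 0.1 * np.random.randn(t.size)
d1, d2 = dct_descriptors(x1), dct_descriptors(x2)   # descriptors fed to a classifier
```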

13:40
Clustering and Weighted Scoring Algorithm Based on Estimating the Number of Clusters

ABSTRACT. Imbalanced datasets are still a major challenge in data mining and machine learning. Various machine learning methods and their combinations are considered in order to improve the classification quality on imbalanced datasets. This paper presents an approach in which clustering and a weighted scoring function based on geometric space are used. In particular, we propose a significant modification to our earlier algorithm. The proposed change concerns the use of automatic estimation of the number of clusters and the determination of the minimum number of objects in a particular cluster. The proposed algorithm was compared with our earlier proposal and with state-of-the-art algorithms on highly imbalanced datasets. The performed experiments show a significant improvement in the classification performance measure compared to the reference algorithms.
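
The abstract does not specify how the number of clusters is estimated; one common way to automate the choice, shown here only as a hedged sketch, is to pick the clustering with the best silhouette score:

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def estimate_n_clusters(X, k_range=range(2, 11), random_state=0):
    # Return the k in k_range whose KMeans clustering has the best silhouette score.
    best_k, best_score = None, -1.0
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
        score = silhouette_score(X, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```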

14:00
Exact Searching for the Smallest Deterministic Automaton

ABSTRACT. We propose an approach to minimum-state deterministic finite automaton (DFA) inductive synthesis that is based on satisfiability modulo theories (SMT) solvers. To that end, we explain how DFAs and their response to input samples can be encoded as logic formulas with integer variables, equations, and uninterpreted functions. An SMT solver is then tasked with finding an assignment for such a formula, from which we can extract an automaton of the required size. We provide an implementation of this approach, which we use to conduct experiments on a series of benchmarks. The results show that, in terms of CPU time, our method outperforms other SAT and SMT approaches as well as other exact algorithms on the prepared benchmarks.
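
A minimal sketch, not the paper's exact encoding, of how a DFA of a given size consistent with labelled samples can be expressed with integer variables and uninterpreted functions in Z3's Python API:

```python
from z3 import Function, IntSort, BoolSort, IntVal, And, Not, Solver, sat

def synthesize_dfa(n_states, alphabet, samples):
    # samples: list of (word, accepted) pairs; word is a tuple of symbols from alphabet.
    delta = Function("delta", IntSort(), IntSort(), IntSort())   # transition function
    accept = Function("accept", IntSort(), BoolSort())           # accepting-state predicate
    sym = {a: i for i, a in enumerate(alphabet)}

    s = Solver()
    # Keep every transition target inside the state range 0 .. n_states-1.
    for q in range(n_states):
        for a in alphabet:
            s.add(And(delta(q, sym[a]) >= 0, delta(q, sym[a]) < n_states))
    # Run each sample word from start state 0 and constrain its acceptance label.
    for word, accepted in samples:
        state = IntVal(0)
        for a in word:
            state = delta(state, sym[a])
        s.add(accept(state) if accepted else Not(accept(state)))
    return s.model() if s.check() == sat else None

# A minimum-state DFA can be found by trying n_states = 1, 2, ... until a model exists.
model = synthesize_dfa(2, ["a", "b"], [(("a", "b"), True), (("b", "a"), False), ((), False)])
```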

14:20
Learning Invariance in Deep Neural Networks

ABSTRACT. One of the long-standing difficulties in machine learning involves distortions present in data -- different input feature vectors may represent the same entity. This observation has led to the introduction of invariant machine learning methods, for example techniques that ignore shifts, rotations, or light and pose changes in images. These approaches typically utilize pre-defined invariant features or invariant kernels, and require the designer to analyze what types of distortions are to be expected. While specifying possible sources of variance is straightforward for images, it is more difficult in other domains. Here, we focus on learning an invariant representation from data, without any information about what distortions are present in the data, based only on information about whether any two samples are distorted variants of the same entity or not. In principle, standard neural network architectures should be able to learn the invariance from data, given sufficient numbers of examples of it. We report that, somewhat surprisingly, learning to approximate even simple types of invariant representations is difficult. We then propose a new type of layer, with a richer output representation, that is better suited to learning invariances from data.

14:40
Mimicking learning for 1-NN classifiers

ABSTRACT. We consider the problem of mimicking the behavior of the nearest neighbor algorithm with an unknown distance measure. Our goal is, in particular, to design and update a learning set so that two NN algorithms with different distance functions, rho_p and rho_q, 0 < p, q < 1, classify in the same way, and to approximate the behavior of one classifier by the other. The autism-disorder-related motivation of the problem is presented.
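
For illustration, a minimal sketch assuming rho_p is the sum of coordinate-wise absolute differences raised to the power p (for 0 < p < 1 this is not a metric, which is part of what makes the problem non-trivial); the paper's exact definition may differ:

```python
import numpy as np

def rho(x, y, p):
    # Assumed dissimilarity: sum_i |x_i - y_i|**p, with 0 < p < 1.
    return np.sum(np.abs(x - y) ** p)

def nn_classify(x, X_train, y_train, p):
    dists = [rho(x, xi, p) for xi in X_train]
    return y_train[int(np.argmin(dists))]

# Two 1-NN classifiers with different exponents may disagree on the same query point;
# the paper studies how to build a learning set on which they agree.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 2))
y_train = (X_train[:, 0] > 0).astype(int)
x = np.array([0.1, -0.3])
print(nn_classify(x, X_train, y_train, p=0.3), nn_classify(x, X_train, y_train, p=0.8))
```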

13:20-15:00 Session 4H: IoTSS 1
13:20
A Review on Visual Programming for Distributed Computation in IoT

ABSTRACT. Internet-of-Things (IoT) systems are considered one of the most notable examples of complex, large-scale systems. Some authors have proposed visual programming (VP) solutions to address part of their inherent complexity. However, in most of these solutions, the orchestration of devices and system components still depends on a centralized unit, preventing a higher degree of dependability. In this work, we carry out a systematic literature review of the current solutions that provide visual and decentralized orchestration for defining and operating IoT systems. Our work reflects upon a total of 29 proposals that address these issues. We provide an in-depth discussion of these works and find that only four of them attempt to tackle this issue as a whole, while still leaving a set of open research challenges. We finally argue that these challenges, if addressed, could make IoT systems more fault-tolerant, with an impact on their dependability, performance, and scalability.

13:40
Data preprocessing, aggregation and clustering for agile manufacturing based on Automated Guided Vehicles

ABSTRACT. Automated Guided Vehicles (AGVs) have become an indispensable component of Flexible Manufacturing Systems. AGVs are also a huge source of information that can be utilised by the data mining algorithms that support the new generation of manufacturing. This paper focuses on data preprocessing, aggregation and clustering in the new generation of manufacturing systems that use the agile manufacturing paradigm and utilise AGVs. The proposed methodology can be used as the initial step for production optimisation, predictive maintenance activities, production technology verification or as a source of models for the simulation tools that are used in virtual factories.

14:00
Comparison of Speech Recognition and Natural Language Understanding Frameworks for Detection of Dangers with Smart Wearables

ABSTRACT. Wearable IoT devices that can register and transmit the human voice can be invaluable in personal situations, such as summoning assistance in emergency healthcare scenarios. Such applications would benefit greatly from automated voice analysis to detect and classify voice signals. In this paper, we compare selected Speech Recognition (SR) and Natural Language Understanding (NLU) frameworks for Cloud-based detection of voice-based assistance calls. We experimentally test several services for speech-to-text transcription and intention recognition available on selected large Cloud platforms. Finally, we evaluate the influence of the manner of speaking and of ambient noise on the quality of recognition of emergency calls. Our results show that many services can correctly translate voice to text and provide a correct interpretation of caller intent. Still, speech artifacts (tone, accent, diction), which can differ even for the same individual in various situations, significantly influence the performance of speech recognition.

14:20
A Decision Support System Based on Augmented Reality for the Safe Preparation of Chemotherapy Drugs

ABSTRACT. The preparation of chemotherapy drugs has always presented complex issues and challenges, given the nature of the demand on the one hand and the criticality of the treatments on the other. Chemotherapy involves handling special drugs that require specific precautions. These drugs are toxic and potentially harmful for the people handling them. Their preparation therefore entails particular and complex procedures, including preparation and control. The relevant control methods are often limited to double visual control. The search for optimization and safety of pharmaco-technical processes leads to the use of new technologies with the main aim of improving patient care. In this respect, Augmented Reality (AR) technology can be an effective solution to support the control of chemotherapy preparations, and it can be easily adapted to the chemotherapy drug preparation environment. This paper introduces SmartPrep, an innovative decision support system (DSS) for the monitoring of chemotherapy drug preparation. The proposed DSS uses AR technology, through smart glasses, to facilitate and secure the preparation of these drugs. Controlling the preparation process is done by voice, since the hands are busy. SmartPrep was co-developed by the research laboratory CRISTAL, the GRITA research group, and the software publisher Computer Engineering.

13:20-15:00 Session 4I: ACMAIML 1
13:20
Using Randomized Sub-data Sets for an Ensemble of Deep Learning Models to enhance Robustness

ABSTRACT. Artificial intelligence (AI) and machine learning (ML) models are increasingly applied to many diverse domains and areas, including computer vision (CV), natural language processing (NLP), video and audio analysis, and predictive analytics. Since these problems entail complex structures, the most promising deep learning models used in these AI and ML applications are hybrid models that are ensembles of many smaller models and architectures. At the same time, the datasets used in these models are increasingly generated from many different and diverse sources and are of many different types, including personal or user data, e-commerce data, geolocation data, time-dependent and time-independent data, and metadata. Thus far, all of the data, as a whole, has been used for the entire ensemble model. In this work, randomized samples of the data are used for different sub-models and/or substructures of an ensemble model. These sub-datasets, used for different parts of the ensemble deep learning models, are generated by resampling with replacement and thus may overlap. This modeling approach is applied to an ensemble of deep learning models comprising sub-models of GRUs (gated recurrent units) and DNNs (deep neural networks) for a recommendation problem. The approach leads to improvements in the accuracy of the recommended items compared to the case when the same dataset is used for all parts of the ensemble model.
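
A minimal sketch of the sampling step only (the GRU/DNN sub-models themselves are not shown); the sizes below are illustrative:

```python
import numpy as np

def bootstrap_subsets(n_samples, n_submodels, rng=None):
    # One index set per sub-model, drawn by resampling with replacement,
    # so the sub-datasets may overlap, as described above.
    rng = np.random.default_rng(rng)
    return [rng.choice(n_samples, size=n_samples, replace=True)
            for _ in range(n_submodels)]

subsets = bootstrap_subsets(n_samples=10_000, n_submodels=5, rng=42)
# Each sub-model i would then be trained on X[subsets[i]], y[subsets[i]].
```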

13:40
A Deep Neural Network Based on Stacked Auto-Encoder and Dataset Stratification in Indoor Localization

ABSTRACT. Indoor localization has become a core part of large-scale location-aware services, especially in scalable applications. Fingerprint-based localization is carried out using the received signal strength indicator (RSSI) of WiFi signals, which has the advantages of full coverage and strong expansibility. At the same time, it also has the shortcomings of off-line data calibration and insufficient samples in dynamic environments. In order to determine the hierarchical information of the user's building, floor and space, a deep neural network for indoor positioning (DNNIP) is explored using a stacked auto-encoder and dataset stratification. Experimental results show that DNNIP has better classification accuracy than other machine learning algorithms on the UJIIndoorLoc dataset.

14:00
Recurrent Autoencoder with Sequence-Aware Encoding

ABSTRACT. Recurrent Neural Networks (RNN) have received a vast amount of attention in the last decade. Recently, Recurrent AutoEncoder (RAE) architectures have found many applications in practice. An RAE can extract semantically valuable information, called the context, which represents a latent space useful for further processing. Nevertheless, recurrent autoencoders are hard to train, and the training process takes much time. This paper proposes a new recurrent autoencoder architecture with sequence-aware encoding (RAES), and a second variant which employs a 1D convolutional layer (RAESC) to improve its performance and flexibility. We discuss the advantages and disadvantages of the solution and show that the recurrent autoencoder with sequence-aware encoding outperforms a standard RAE in terms of model training time in most cases. Extensive experiments performed on a dataset of generated sequences of signals demonstrate the advantages of RAES(C). The results show that the proposed solution dominates over the standard RAE, and that the training process is an order of magnitude faster.

14:20
A Gist Information Guided Neural Network for Abstractive Summarization

ABSTRACT. Abstractive summarization aims to condense the given documents and generate fluent summaries containing the important information. It is challenging to select the salient information and maintain semantic consistency between documents and summaries. To tackle these problems, we propose a novel framework - Gist Information Guided Neural Network (GIGN) - inspired by the way people usually summarize a document around its gist. First, we incorporate a multi-head attention mechanism with a self-adjusting query to extract the global gist of the input document, which is equivalent to a query vector asking the model "What is the document gist?". Through the interaction of the query and the input representations, the gist captures all salient semantics. Second, we propose the remaining-gist guided module to dynamically guide the generation process, which can effectively reduce redundancy by attending to different contents of the gist. Finally, we introduce a gist consistency loss to improve the consistency between inputs and outputs. We conduct experiments on the benchmark CNN/Daily Mail dataset to validate the effectiveness of our methods. The results indicate that GIGN significantly outperforms all baseline models and achieves state-of-the-art results.

14:40
Quality of Recommendations and Cold-start Problem in Recommender Systems based on Multi-Clusters

ABSTRACT. This article presents a new approach to collaborative filtering recommender systems that focuses on the problem of modelling the neighbourhood of an active user (the user to whom recommendations are generated). Precise identification of the neighbours has a direct impact on the quality of the generated recommendation lists. Clustering techniques are often used for neighbourhood calculation; however, they can negatively affect the quality (precision) of recommendations.

In this article, a new version of the algorithm based on multi-clustering, M-CCF, is proposed. Instead of a single clustering scheme, it works on a set of clusterings and selects the most appropriate one, i.e. the one that models the neighbourhood most precisely. This article presents the results of experiments validating the advantage of the multi-clustering approach, M-CCF, over traditional methods based on single-scheme clustering. The experiments particularly focus on the overall recommendation performance, including accuracy and coverage, as well as the cold-start problem.

15:00
Model of the Cold-start Recommender System Based on the Petri-Markov Nets

ABSTRACT. The article describes a model for constructing a cold-start recommender system based on the mathematical apparatus of Petri-Markov nets. The model combines stochastic and structural approaches to building recommendations. This solution makes it possible to differentiate recommendation objects by their popularity and to impose restrictions on the available latent data about the user.

13:20-15:00 Session 4J: SE4Science 1
13:20
I/O Associations in Scientific Software: A Study of SWMM

ABSTRACT. Understanding which input and output variables are related to each other is important for metamorphic testing, a simple and effective approach to testing scientific software. In this paper we report a quantitative analysis of input/output (I/O) associations in the Storm Water Management Model (SWMM), based on co-occurrence statistics from its user manual as well as association rule mining of a user forum. The results show a positive correlation for the identified I/O pairs, and further reveal the complementary roles of the user manual and the user forum in supporting scientific software engineering tasks.

13:40
Understanding Equity, Diversity and Inclusivity Challenges Within the Research Software Community

ABSTRACT. Research software - specialist software used to support or undertake research - is of huge importance to researchers, contributes to significant advances in the wider world, and requires collaboration between people with diverse skills and backgrounds. Analysis of recent survey data provides evidence for a lack of diversity in the Research Software Engineer community. We identify interventions which could address challenges in the wider research software community, as well as areas where the community is becoming more diverse. There are also lessons applicable to the wider software development field, around recruitment from other disciplines and the importance of welcoming communities.

14:00
How has the COVID-19 Pandemic affected working conditions for Research Software Engineers?

ABSTRACT. We report results from a diary study asking research software engineers to reflect on their experience of working during the COVID-19 pandemic in spring 2020. Whilst people reported difficulties with working at home (lack of space and equipment; increased childcare responsibilities), the majority were able to continue their work without major disruption. Communication changed significantly and variously, with people reporting better, worse, more frequent and qualitatively different interactions with others. Participants reported both improved productivity and potential burnout. Overall, research software engineers found the switch to working from home full-time straightforward in terms of performing the technical parts of their job role, but many found that the changes to the organizational structures surrounding them had profound effects on their well-being, through improved flexibility and inclusivity, but also poorer social interaction and increased fatigue.

15:10-16:00 Session 5: Keynote Lecture 2
15:10
Material Transport Simulation in Complex Neurite Networks Using Isogeometric Analysis and Machine Learning Techniques

ABSTRACT. Neurons exhibit remarkably complex geometry in their neurite networks. So far, how materials are transported in the complex geometry for survival and function of neurons remains an unanswered question. Answering this question is fundamental to understanding the physiology and disease of neurons. Here, we develop an isogeometric analysis (IGA) based platform for material transport simulation in neurite networks. We model the transport process by reaction-diffusion-transport equations and represent geometry of the networks using truncated hierarchical tricubic B-splines (THB-spline3D). We solve the Navier-Stokes equations to obtain the velocity field of material transport in the networks. We then solve the transport equations using the streamline upwind/Petrov-Galerkin (SU/PG) method. Using our IGA solver, we simulate material transport in a number of representative and complex neurite networks. From the simulation we discover several spatial patterns of the transport process. Together, our simulation provides key insights into how material transport in neurite networks is mediated by their complex geometry.

To enable fast prediction of the transport process within complex neurite networks, we develop a Graph Neural Networks (GNN) based model to learn the material transport mechanism from simulation data. In this study, we build the graph representation of the neuron by decomposing the neuron geometry into two basic structures: pipe and bifurcation. Different GNN simulators are designed for these two basic structures to predict the spatiotemporal concentration distribution given input simulation parameters and boundary conditions. In particular, we add the residual term from PDEs to instruct the model to learn the physics behind the simulation data. To recover the neurite network, a GNN-based assembly model is used to combine all the pipes and bifurcations following the graph representation. The loss function of the assembly model is designed to impose consistent concentration results on the interface between pipe and bifurcation. Through machine learning, we can quickly and accurately provide a prediction of material transport given a new complex neuron tree.

16:00-16:30 Coffee Break
16:30-18:10 Session 6A: MT 5
16:30
Deep learning driven self-adaptive hp finite element method

ABSTRACT. The finite element method (FEM) is a popular tool for solving engineering problems governed by Partial Differential Equations (PDEs). The accuracy of the numerical solution depends on the quality of the computational mesh. We consider the most accurate deterministic algorithm, the self-adaptive hp-FEM, which delivers optimal mesh refinements. It divides selected elements of the mesh into smaller elements and adjusts the polynomial orders of the elements so as to approximate the solution of a given PDE more accurately, and it iterates these refinements until the approximation error is sufficiently small. This is the only known algorithm that delivers exponential convergence of the numerical error with respect to the mesh size for a large class of problems. Thus, it enables solving difficult engineering problems with the highest possible numerical accuracy. In this work, we replace the computationally expensive kernel of the refinement algorithm with a deep neural network. The network learns how to optimally refine the elements and modify the orders of the polynomials. In this way, the deterministic algorithm is replaced by a neural network that selects similar-quality refinements in a fraction of the time needed by the original algorithm.

16:50
New variants of SDLS algorithm for LABS problem dedicated to GPGPU architectures

ABSTRACT. The low autocorrelation binary sequence (LABS) problem remains an open, hard optimisation problem with many applications. One of the promising directions for solving it is the design of advanced solvers based on local search heuristics. The paper proposes two new heuristics derived from the steepest-descent local search (SDLS) algorithm and dedicated to GPGPU architectures. The introduced algorithms utilise the parallel nature of GPUs and provide an effective method of solving the LABS problem. The efficiency of SDLS and the new algorithms, all implemented on the GPGPU, is compared, showing that exploring a wider neighbourhood improves the results.
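
For reference, the standard LABS objective (the paper's formulation may differ in normalisation) is the energy E(s), the sum of the squared aperiodic autocorrelations of the +/-1 sequence s; a minimal sketch:

```python
import numpy as np

def labs_energy(s):
    # E(s) = sum_{k=1}^{N-1} C_k**2 with C_k = sum_i s_i * s_{i+k}.
    s = np.asarray(s)
    n = len(s)
    return sum(np.dot(s[:n - k], s[k:]) ** 2 for k in range(1, n))

seq = np.array([1, 1, 1, -1, -1, 1, -1])          # Barker sequence of length 7
merit_factor = len(seq) ** 2 / (2 * labs_energy(seq))
print(labs_energy(seq), merit_factor)
```

Steepest-descent local search repeatedly evaluates all single-element flips of s and accepts the flip that most decreases this energy.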

17:10
Highly Effective GPU Realization of Discrete Wavelet Transform for Big-Data Problems

ABSTRACT. The discrete wavelet transform (DWT) is widely used in signal processing, analysis and recognition tasks. Moreover, its practical applications are not limited to one-dimensional signals but also extend to images and multidimensional data. Since the introduction of dedicated libraries that enable the use of graphics processing units (GPUs) for massively parallel general-purpose calculations, the development of effective GPU-based implementations of the one-dimensional DWT has been an important field of scientific research. It is also important because, if the transform is separable, the one-dimensional procedure can be used to calculate the DWT in the multidimensional case. In this paper the authors propose a novel approach to the calculation of the one-dimensional DWT based on a lattice structure, which takes advantage of shared memory and registers in order to implement the necessary inter-thread communication. The experimental analysis reveals the high time-effectiveness of the proposed approach, which can be up to 5 times higher than that of the convolution-based approach in computational tasks that can be classified as big-data problems.

17:30
A Dynamic Replication Approach for Monte Carlo Photon Transport on Heterogeneous Architectures

ABSTRACT. This paper considers Monte Carlo photon transport applications on heterogeneous compute architectures with both CPUs and GPUs. Previous work on this problem has considered only meshes that can fit entirely within the memory of a GPU, which is a significant limitation: many important problems require meshes that exceed memory size. We address this gap by introducing a new dynamic replication algorithm that adapts assignments based on the computational ability of a resource. We then demonstrate our algorithm's efficacy on a variety of workloads, and find that incorporating the CPUs provides speedups of up to 20% over the GPUs alone. Further, these speedups are well beyond the FLOPS contribution from the CPUs, which provides further justification for continuing to include CPUs even when powerful GPUs are available. In all, the contribution of this work is an algorithm that can be applied in real-world settings to make more efficient use of heterogeneous architectures.

17:50
Scaling Simulation of Continuous Urban Traffic Model for High Performance Computing System

ABSTRACT. Urban traffic simulation of extensive areas with complex driver models poses a significant computational challenge. Developing highly scalable parallel simulation algorithms is the only feasible way to provide useful results in this case. In this paper, we present extensions of the SMARTS traffic simulation tool which provide efficient scalability with a large number of parallel processes. The presented extensions enabled its scalability on HPC-grade systems. The extended version has been thoroughly tested in strong and weak scalability scenarios for up to 2400 computing cores of a supercomputer. The satisfactory scalability has been achieved by introducing several significant improvements, which are discussed in detail.

16:30-18:10 Session 6B: MT 6
16:30
A Semi-Supervised Approach for Trajectory Segmentation to Identify Different Moisture Processes in the Atmosphere

ABSTRACT. Different moisture processes in the atmosphere leave distinctive isotopologue fingerprints. Therefore, the paired analysis of water vapour and the ratio between different isotopologues, for example {H2O,dD} with dD as the standardized HDO/H2O isotopologue ratio, can be used to investigate these processes. In this paper, we propose a novel semi-supervised approach for trajectory segmentation to extract information that enables us to identify atmospheric moisture processes. While our approach can be transferred to a variety of domains, we focus our evaluation on Lagrangian air parcel trajectories and modelled {H2O,dD} fields. Our final aim is to understand the free-tropospheric {H2O,dD} pair distribution that is observable by satellite sensors of the latest generation. Our method adopts a recently developed density-based clustering algorithm with constrained expansion, CoExDBSCAN, which identifies clusters of temporal neighbourhoods that are only expanded with regard to a priori constraints in defined subspaces. By formulating a constraint for the correlation of {H2O,dD}, we can segment trajectories into multiple phases and extract the regression coefficients for each phase. Grouping segments with similar coefficients and comparing them to theoretical values allows us to find interpretable structures that correspond to atmospheric moisture processes. The experimental evaluation demonstrates that our method facilitates an efficient, data-driven analysis of large-scale climate data and of multivariate time series in general.

16:50
mRelief: A Reward Penalty based Feature Subset Selection Considering Data Overlapping Problem

ABSTRACT. Feature selection plays a vital role in machine learning and data mining by eliminating noisy and irrelevant attributes without compromising classification performance. To select the best subset of features, we need to consider several issues, such as the relationships among the features (interaction) and their relationships with the classes. Even though state-of-the-art Relief-based feature selection methods can handle feature interactions, they often fail to capture the relationship of features with different classes. That is, a feature that provides a clear boundary between two classes with a small average distance may be mistakenly ranked low compared to a feature that has a higher average distance but no clear boundary (data overlapping). Moreover, most existing methods provide a ranking of the given features rather than selecting a proper subset of them. To address these issues, we propose a feature subset selection method, namely modified Relief (mRelief), that can handle both feature interactions and the data overlapping problem. Experimental results on twenty-seven benchmark datasets taken from different application areas demonstrate the superiority of mRelief over the state-of-the-art methods in terms of accuracy, the number of selected features, and the ability to identify the features (genes) that characterize a class (disease).
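
For context, a minimal sketch of the classic Relief weighting scheme that this family of methods builds on; mRelief itself adds a reward-penalty formulation and an explicit treatment of data overlapping, which are not shown here:

```python
import numpy as np

def relief_weights(X, y, n_iter=100, rng=None):
    # Classic Relief: reward features that separate nearest misses,
    # penalise features that differ from nearest hits.
    rng = np.random.default_rng(rng)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0) + 1e-12    # feature ranges for normalisation
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(X - X[i]).sum(axis=1).astype(float)
        dist[i] = np.inf                            # exclude the sample itself
        same = (y == y[i])
        hit = int(np.argmin(np.where(same, dist, np.inf)))
        miss = int(np.argmin(np.where(~same, dist, np.inf)))
        w -= np.abs(X[i] - X[hit]) / span / n_iter
        w += np.abs(X[i] - X[miss]) / span / n_iter
    return w
```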

17:10
Reconstruction of Long-Lived Particles in LHCb CERN Project by Data Analysis and Computational Intelligence Methods

ABSTRACT. LHCb at CERN, Geneva, is a world-leading high energy physics experiment dedicated to searching for New Physics phenomena. The experiment is undergoing a major upgrade and will rely entirely on a flexible software trigger to process the data in real time. In this paper a novel approach to reconstructing (detecting) long-lived particles using a new pattern matching procedure is presented. A large simulated data sample is used to build an initial track pattern with an unsupervised approach. The pattern is then updated and verified using real collision data. As a performance index, the difference between the density estimated by nonparametric methods from experimental streaming data and the density based on theoretical premises is used. Fuzzy clustering methods are applied to reduce the pattern size. The final decision is made in a real-time regime with rigorous time constraints.

17:30
Motion Trajectory Grouping for Human Head Gestures Related to Facial Expressions

ABSTRACT. The paper focuses on human head motion in connection with facial expressions for virtual-based interaction systems. Nowadays, a virtual representation of a human, with human-like social behaviour and mechanisms of movement, can realize user-machine interaction. The presented method includes head motion because head gestures transmit additional information about the interaction's situational context. This paper presents head motion analysis based on the rotation of rigid objects for virtual-based interaction systems. First, we captured the head gestures of a human subject expressing three basic facial expressions. The proposed motion model was described using three non-deformable objects, which reflect the character of neck and head skeleton movement. Based on the captured actions, the motion trajectories were analyzed and their characteristic features were distinguished. The obtained dependencies were used to create new trajectories using the piecewise cubic Hermite interpolating polynomial (PCHIP). The resulting rotation dependencies were used to create movements of a three-dimensional human head.
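
A minimal sketch of trajectory generation with PCHIP, assuming a few hypothetical key-frame head-pitch angles rather than the captured motion data.

import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical key frames of a head gesture: normalised time vs. pitch angle (degrees).
key_times = np.array([0.0, 0.3, 0.6, 1.0])
key_pitch = np.array([0.0, 12.0, 8.0, 0.0])

# PCHIP preserves the shape of the key frames and avoids overshoot between them.
pitch_of_t = PchipInterpolator(key_times, key_pitch)
t = np.linspace(0.0, 1.0, 50)
print(np.round(pitch_of_t(t), 2))   # dense trajectory to drive the 3D head model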

17:50
DenLAC: Density Levels Aggregation Clustering - A Flexible Clustering Method

ABSTRACT. This paper introduces DenLAC (Density Levels Aggregation Clustering), a novel clustering algorithm which assembles several popular notions from data mining and statistics, such as Kernel Density Estimation, density attraction and hierarchical agglomerative clustering. DenLAC is an adaptable clustering algorithm which obtains high accuracy independent of the input dataset's shape and distribution. While the majority of clustering algorithms are specialized for particular dataset types, DenLAC achieves accurate results for spherical and elongated clusters as well as clusters of different densities.

16:30-18:10 Session 6C: AIHPC4AS 3
16:30
Design of borehole resistivity measurement acquisition systems for noisy data using deep learning

ABSTRACT. An estimation of the earth's subsurface properties facilitates the extraction of natural resources such as oil and gas. To this end, borehole resistivity measurements recorded with logging-while-drilling (LWD) instruments are routinely employed. In LWD, a well-logging tool is conveyed down the borehole, records electromagnetic measurements and transmits the data in real time, so that the formation can be evaluated and, subsequently, the inclination and azimuth of the well trajectory can be adjusted.

The main challenge when dealing with LWD technology is the need to interpret borehole resistivity measurements rapidly, possibly in real time. Thus, Deep Neural Network (DNN)-based methods are suitable for the rapid inversion of borehole resistivity measurements, as they approximate the forward and inverse problems offline during the training phase and require only a fraction of a second for evaluation. However, inverse problems are generally ill-posed and do not have unique solutions. DNNs with traditional loss functions based on data misfit are unsuitable for solving such problems. This can be partially fixed by adding regularization terms to a loss function specifically designed for encoder-decoder architectures. However, adding regularization introduces an a priori bias in the set of possible solutions. To avoid this, we use a two-step loss function without any regularization.

Also, for an optimal estimation of the inverse solution, we need to carefully select a measurement acquisition system with a sufficient number of measurements. We propose a DNN-based iterative algorithm for designing such a measurement acquisition system and illustrate it via several synthetic examples. Numerical results show that the formation predicted using a sufficient measurement acquisition system can identify and characterize both resistive and conductive layers above and below the logging instrument. Additionally, to make the method amenable to realistic scenarios, we test it against augmented synthetic data that contain additive Gaussian and multiplicative speckle noise. Our method proves to be robust against noisy data.

16:50
A Finite Element based Deep Learning solver for parametric PDEs

ABSTRACT. Inverse problems are of great importance to our society. Traditional inverse problem solvers approximate evaluations of the inverse function at certain points. To approximate the full inverse function, which is required in multiple real-time inversion applications, it is possible to use Deep Learning (DL) methods. Critical to the use of DL methods for solving inverse problems is to have a large database. Alternatively, we can use an encoder-decoder based model. In both cases, we need to efficiently solve parametric Partial Differential Equations (PDEs), possibly using a DL architecture. In this work, we propose a DL method for solving parametric PDEs that resembles the Finite Element Method (FEM). The architecture aims to mimic the finite element connectivity graph when applying mesh refinements: we associate each Neural Network (NN) layer with a mesh refinement. Each NN layer employs a residual-type architecture and extends coarse solutions to finer meshes. For simplicity, we restrict ourselves to PDEs with piecewise-constant parameters, which is an important case in multiple applications (e.g., in geophysics). The developed DL-FEM first sets an initial architecture that produces coarse solutions after training. Then, we iteratively and dynamically add layers to the architecture, maintaining the previously trained parameters and adding new ones. Subsequently, we retrain the new model end-to-end. The training utilizes a combination of customized Adam and SGD optimizers with a loss-dependent adaptive learning rate. We repeat this process until we achieve a desired degree of discretization/accuracy of the parametric solution. Each training step is the DL equivalent of performing a V-cycle of a multigrid (MG) method. Numerical results show great performance on semi-positive definite (SPD) problems. For non-SPD problems, the method also provides adequate results, and we are currently analyzing their convergence.

17:10
An application of a pseudo-parabolic modeling to texture image recognition

ABSTRACT. In this work, we present a novel methodology for texture image recognition based on partial differential equation modelling. More specifically, we employ the pseudo-parabolic Buckley-Leverett equation to impose a dynamics on the digital image representation and collect local descriptors from the images as they evolve in time. For the local descriptors we employ the magnitude and signal binary patterns, and a simple histogram of these features was capable of achieving promising results in a classification task. We compare the accuracy over well-established benchmark texture databases, and the results demonstrate competitiveness, even with the most modern deep learning approaches. The achieved results open space for future investigation of this type of modelling for image analysis, especially when there is no large amount of data for training deep learning models and model-based approaches therefore arise as suitable alternatives.

17:30
A study on a feedforward neural network to solve partial differential equations in hyperbolic-transport problems

ABSTRACT. In this work we present an application of modern deep learning methodologies to the numerical solution of partial differential equations in transport models. More specifically, we employ a supervised deep neural network that takes into account the equation and initial conditions of the model. We apply it to Riemann problems for the inviscid nonlinear Burgers' equation, whose solutions may develop discontinuities (shock waves) and rarefactions, as well as to the classical one-dimensional Buckley-Leverett two-phase problem. The Buckley-Leverett case is slightly more complex and interesting because it has a non-convex flux function with one inflection point. Our results suggest that a relatively simple deep learning model is capable of achieving promising results in such challenging tasks, providing numerical approximations of entropy solutions with very good precision, consistent with classical as well as recently proposed numerical methods in these particular scenarios.
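
A minimal sketch of an equation- and initial-condition-aware loss for the inviscid Burgers equation, with an illustrative network, initial condition and hyper-parameters that are not taken from the paper.

import torch

# Small fully connected network approximating u(x, t).
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def u0(x):
    # Riemann-type initial condition (illustrative): u = 1 for x < 0, u = 0 otherwise.
    return torch.where(x < 0.0, torch.ones_like(x), torch.zeros_like(x))

for step in range(2000):
    # Residual of u_t + u * u_x = 0 at random collocation points.
    x = torch.rand(256, 1) * 2.0 - 1.0
    t = torch.rand(256, 1)
    x.requires_grad_(True)
    t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    pde_loss = ((u_t + u * u_x) ** 2).mean()
    # Initial-condition loss at t = 0.
    x0 = torch.rand(256, 1) * 2.0 - 1.0
    u_init = net(torch.cat([x0, torch.zeros_like(x0)], dim=1))
    ic_loss = ((u_init - u0(x0)) ** 2).mean()
    loss = pde_loss + ic_loss
    opt.zero_grad()
    loss.backward()
    opt.step()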

16:30-18:10 Session 6D: MESHFREE 1
16:30
Analysis of vortex induced vibration of a thermowell by high fidelity FSI numerical analysis based on RBF structural modes embedding

ABSTRACT. The present paper addresses the numerical fluid-structure interaction (FSI) analysis of a thermowell immersed in a water flow. The study was carried out by implementing a modal superposition approach in a computational fluid dynamics (CFD) solver. The core of the procedure consists in embedding the structural natural modes, computed by finite element analysis (FEA), by means of a mesh morphing tool based on radial basis functions (RBF). In order to minimize the distortion during the morphing action and to obtain a high mesh quality, a set of corrective solutions was introduced, which allowed a sliding morphing on the duct surface. The obtained numerical results were compared with experimental data, providing satisfying agreement and demonstrating that the modal approach, with an adequate mesh morphing setup, is able to tackle unsteady FSI problems with the accuracy needed for industrial applications.

16:50
Automatic Optimization Method based on mesh morphing surface sculpting driven by Biological Growth Method: an application to the Coiled Spring section shape

ABSTRACT. The increasing importance of optimization in manufacturing processes has led to the improvement of well-established optimization techniques and to the development of new and innovative approaches. Among these, an approach that exploits the surface stress distribution to obtain an optimized configuration is the Biological Growth Method (BGM). Coupling this method with surface sculpting based on Radial Basis Functions (RBF) mesh morphing has proven to be efficient and effective in optimizing specific mechanical components. In this work, the automatic, meshless and constrained parameter-less optimization approach is applied to a classical mechanical component, the coiled spring section shape.

16:30-18:10 Session 6E: COMS 3
16:30
Modelling and forecasting based on recurrent pseudoinverse matrices

ABSTRACT. Time series modelling and forecasting techniques have a wide spectrum of applications in several fields, including economics, finance, engineering and computer science. Most available modelling and forecasting techniques are applicable to a specific underlying phenomenon and its properties and lack generality of application, while more general forecasting techniques require substantial computational time for training and application. Herewith, we present a general modelling framework based on a recursive Schur-complement technique that utilizes a set of basis functions, linear or non-linear, to form a model for a general time series. The basis functions need not be orthogonal and their number is determined adaptively based on fitting accuracy. Moreover, no assumptions are required for the input data. The coefficients of the basis functions are computed using a recursive pseudoinverse matrix, thus they can be recomputed for different input data. The case of sinusoidal basis functions is presented. Discussions around the stability of the resulting model and the choice of basis functions are also provided. Numerical results are given, depicting the applicability and effectiveness of the proposed technique.
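
A minimal sketch of the sinusoidal-basis case, assuming illustrative frequencies and a synthetic series; the coefficients are obtained here with a plain Moore-Penrose pseudoinverse rather than the recursive Schur-complement update of the paper.

import numpy as np

def design_matrix(t, freqs):
    """Columns: constant term plus a sin/cos pair for each candidate frequency."""
    cols = [np.ones_like(t)]
    for f in freqs:
        cols += [np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)]
    return np.column_stack(cols)

# Synthetic series: one dominant cycle plus noise.
rng = np.random.default_rng(1)
t_train = np.arange(0, 200.0)
y_train = 3.0 + 2.0 * np.sin(2 * np.pi * t_train / 50.0) + rng.normal(0, 0.3, t_train.size)

freqs = [1 / 50.0, 1 / 25.0]                 # candidate frequencies (illustrative)
A = design_matrix(t_train, freqs)
coef = np.linalg.pinv(A) @ y_train           # least-squares coefficients

t_future = np.arange(200.0, 220.0)
forecast = design_matrix(t_future, freqs) @ coef
print(np.round(forecast[:5], 2))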

16:50
Semi-analytical Monte Carlo optimisation method applied to the inverse Poisson problem

ABSTRACT. The research is focused on the numerical analysis of the inverse Poisson problem, namely the identification of the unknown (input) load source function, being the right-hand side of a second-order differential equation. It is assumed that additional measurement data of the solution (output) function are available at a few isolated locations inside the problem domain. The problem may be formulated as a non-linear optimisation problem with inequality constraints.

The proposed solution approach is based upon the well-known Monte Carlo concept with a random walk technique, approximating the solution of the direct Poisson problem at selected point(s) using series of random simulations. However, since it delivers an explicit linear relation between the input and the output at the measurement locations only, the objective function may be analytically differentiated with respect to the unknown load parameters. Consequently, they may be determined by the solution of a small system of algebraic equations. Therefore, the drawbacks of traditional optimization algorithms, which are computationally demanding, time-consuming and sensitive to their parameters, may be avoided. Moreover, the combination of the Monte Carlo method with selected meshless techniques allows the scope of application of the method to be extended to problems with arbitrary geometry and mixed boundary conditions. The potential power of the proposed approach is demonstrated on selected benchmark problems with various levels of complexity.
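
A minimal sketch of the underlying random-walk estimator for the direct problem, on a toy harmonic example with an illustrative grid size and walk count; it is not the authors' solver and does not include the inverse step.

import numpy as np

def mc_poisson_point(i, j, n, f, g, n_walks=5000, rng=None):
    """Estimate u at interior node (i, j) of an n x n grid (spacing h = 1/(n+1))
    for -Laplace(u) = f on the unit square with u = g on the boundary."""
    if rng is None:
        rng = np.random.default_rng(0)
    h2_4 = (1.0 / (n + 1)) ** 2 / 4.0
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    total = 0.0
    for _ in range(n_walks):
        x, y, acc = i, j, 0.0
        while 0 < x <= n and 0 < y <= n:              # walk until the boundary is hit
            acc += h2_4 * f(x / (n + 1), y / (n + 1))  # accumulate the source term
            dx, dy = steps[rng.integers(4)]
            x, y = x + dx, y + dy
        total += acc + g(x / (n + 1), y / (n + 1))     # boundary value at absorption
    return total / n_walks

# Example: f = 0 and g(x, y) = x*y, so the exact solution is u(x, y) = x*y.
est = mc_poisson_point(8, 8, 15, lambda x, y: 0.0, lambda x, y: x * y)
print(round(est, 3), "vs exact", 0.5 * 0.5)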

17:10
Modeling the contribution of agriculture towards soil nitrogen surplus in Iowa

ABSTRACT. The Midwest state of Iowa in the US is one of the major producers of corn, soybean, ethanol, and animal products, and has long been known as a significant contributor of nitrogen loads to the Mississippi river basin, supplying nutrient-rich water to the Gulf of Mexico. Nitrogen is the principal contributor to the formation of the hypoxic zone in the northern Gulf of Mexico, with a significant detrimental environmental impact. Agriculture, animal agriculture, and ethanol production are deeply connected to Iowa's economy. Thus, with increasing ethanol production, high-yield agriculture practices, growing animal agriculture, and the related economy, there is a need to understand the interrelationships of Iowa's food-energy-water system to alleviate its impact on the environment and economy through improved policy and decision making. In this work, the Iowa food-energy-water (IFEW) system model is proposed to describe these interrelationships. Further, a macro-scale nitrogen export model of the agriculture and animal agriculture systems is developed. Global sensitivity analysis of the nitrogen export model reveals that the commercial nitrogen-based fertilizer application rate for corn production and the corn yield are the two most influential factors affecting the surplus nitrogen in the soil.

17:30
An attempt to replace System Dynamics with Discrete Rate Modeling in demographic simulations

ABSTRACT. The usefulness of simulation in demographic research has been repeatedly confirmed in the literature. The most common simulation approach to modelling population trends is system dynamics (SD). Difficulties in a reliable mapping of population changes with the SD approach have, however, been reported by some authors. Another simulation approach, discrete rate modeling (DRM), has not yet been used in population dynamics modelling, despite examples of this approach being used in the modelling of processes with similar internal dynamics. The purpose of our research is to verify whether DRM can compete with the SD approach in terms of accuracy in simulating population changes and the complexity of the model. The theoretical part of the work describes the principles of the DRM approach and provides an overview of its applications versus other simulation methods. The experimental part permits the conclusion that the DRM approach does not match SD in terms of comprehensive accuracy in mapping the behavior of cohorts of complex populations. We were, however, able to identify criteria for population segmentation that may lead to better results of DRM simulation compared with SD.

17:50
New On-Line Algorithms for Modelling, Identification and Simulation of Non-Linear Multidimensional Dynamic Systems Using Modulating Functions and Non-Asymptotic State Estimators: Case Study for a Chosen Physical Process

ABSTRACT. The paper presents an advanced computational methodology, with sophisticated algorithms and calculation methods, dedicated to the optimal identification and simulation of dynamic processes. These models may have an unknown structure (the order of a differential equation) and unknown parameters. The presented methodology uses non-standard algorithms for the identification of continuous-time models that can represent linear and non-linear physical processes. Solvers for these problems are presented in the research literature, but they are devoted only to discrete-time models. However, for continuous-time models with a differential equation in which both the parameters and the derivatives of the output variable are unknown, the solution is not easy. In the paper, the identification task will be solved using a convolution transformation of the differential equation with a special Modulating Function. Also, to be able to properly simulate the behaviour of the process based on the obtained model, exact state integral observers with minimal norm will be used to reconstruct the exact value of the initial conditions (not their estimate). For the multidimensional process case, with multiple control signals (many inputs), additional problems arise that make continuous identification and observation of the state vector (and hence simulation) impossible with standard methods. The application of the above-mentioned methods to solving this problem will also be presented. Both algorithms, for the identification of parameters and the observation of the state, will be implemented on-line in two independent but cooperating windows that simultaneously move along the time axis. The presented algorithms will be tested on real process data recorded at successive time intervals for a chosen physical process: heat exchange during glass conditioning in the long channel of the forehearth, the final part of a glass melting installation.

16:30-18:10 Session 6F: SOFTMAC 3
16:30
Numerical investigation of transport processes in porous media under laminar, transitional and turbulent flow conditions with the lattice-Boltzmann method

ABSTRACT. In the present paper the mass transfer in porous media under laminar, transitional and turbulent flow conditions was investigated using the lattice-Boltzmann method (LBM). While many previous studies applied the LBM to determine species transport in complex geometries under laminar conditions, the main objective of this study was to demonstrate the applicability of the LBM to turbulent internal flows, also including the transport of a scalar quantity. Thus, besides the resolved scalar transport, an additional turbulent diffusion coefficient was introduced to account for the subgrid-scale turbulent transport. A packed bed of spheres and an adsorber geometry based on µCT scans were considered. The simulations were carried out for a Schmidt number of 1 and a turbulent Schmidt number of 0.7, which are appropriate for gaseous flows. While a two-relaxation-time (TRT) model was applied to the laminar and transitional cases, the Bhatnagar-Gross-Krook (BGK) collision operator in conjunction with the Smagorinsky turbulence model was used for the turbulent flow regime. To validate the LBM results, simulations under the same conditions were carried out with ANSYS Fluent v19.2. It was found that the pressure drop over the height of the packed bed was in close accordance with empirical correlations. Furthermore, the comparison of the calculated species concentrations for all flow regimes showed good agreement between the LBM and the results obtained with Fluent. Consequently, the proposed extension of the Smagorinsky turbulence model appears to be able to predict scalar transport under turbulent conditions.

16:50
A three-level linearized time integration scheme for tumor simulations with Cahn-Hilliard equations

ABSTRACT. The paper contains an analysis of a three-level linearized time integration scheme for Cahn-Hilliard equations. We start with a rigorous mixed strong/variational formulation of the appropriate initial boundary value problem, taking into account the existence and uniqueness of its solution. Next we pass to the definition of two time integration schemes: the Crank-Nicolson scheme and a three-level linearized one. Both schemes are applied to the discrete version of the Cahn-Hilliard equation obtained through the Galerkin approximation in space. We prove that the sequence of solutions of the mixed three-level finite difference scheme combined with the Galerkin approximation converges when the time step length and the space approximation error decrease. We also recall the verification of the second-order accuracy of this scheme and its unconditional stability with respect to the time variable. A comparative scalability analysis of parallel implementations of the schemes is also presented.

17:10
A Study on a Marine Reservoir and a Fluvial Reservoir History Matching Based on Ensemble Kalman Filter

ABSTRACT. In reservoir management, utilizing all the observed data to update the reservoir models is key to making accurate forecasts of parameter changes and future production. The Ensemble Kalman Filter (EnKF) provides a practical way to continuously update petroleum reservoir models, but its reliability in different reservoir types and the proper design of the ensemble size still remain unknown. In this paper, we mathematically describe the Ensemble Kalman Filter method, discuss its advantages over the standard Kalman Filter and the Extended Kalman Filter (EKF) in reservoir history matching, and discuss the limitations of the EnKF. We also carried out two numerical experiments, on a marine reservoir and a fluvial reservoir, using EnKF history matching to update the static geological models by fitting bottom-hole pressure and well water cut, and found the optimal way of designing the ensemble size. A comparison of the two numerical experiments is also presented. Lastly, we suggest some adjustments of the EnKF for its application to fluvial reservoirs.
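
A minimal sketch of a generic EnKF analysis step with perturbed observations, on a toy scalar-parameter example; the reservoir models, observation operators and ensemble sizes studied in the paper are of course far richer.

import numpy as np

def enkf_update(X, d_obs, obs_op, obs_err_std, rng=None):
    """X: (n_state, n_ens) ensemble; d_obs: (n_obs,) observed data; obs_op maps X to predicted data."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_ens = X.shape[1]
    HX = obs_op(X)                                    # predicted data, (n_obs, n_ens)
    Xp = X - X.mean(axis=1, keepdims=True)            # state anomalies
    Yp = HX - HX.mean(axis=1, keepdims=True)          # predicted-data anomalies
    R = (obs_err_std ** 2) * np.eye(d_obs.size)
    C_xy = Xp @ Yp.T / (n_ens - 1)
    C_yy = Yp @ Yp.T / (n_ens - 1) + R
    K = C_xy @ np.linalg.inv(C_yy)                    # Kalman gain
    D = d_obs[:, None] + rng.normal(0, obs_err_std, (d_obs.size, n_ens))  # perturbed observations
    return X + K @ (D - HX)

# Toy example: estimate a scalar permeability-like parameter m from noisy data y = 2*m.
rng = np.random.default_rng(1)
m_true = 5.0
X = rng.normal(3.0, 2.0, (1, 100))                    # prior ensemble
d = np.array([2.0 * m_true + rng.normal(0, 0.1)])
X_post = enkf_update(X, d, lambda X: 2.0 * X, 0.1, rng)
print(round(X.mean(), 2), "->", round(X_post.mean(), 2))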

17:30
Numerical Investigation on Leakage and Diffusion Characteristics of Domestic Hydrogen-Blended Natural Gas in Indoor Space

ABSTRACT. Blending hydrogen into natural gas pipelines has a significant influence on the safety accident characteristics and evolution law of natural gas. In this study, the leakage and diffusion characteristics of hydrogen-blended natural gas in an indoor kitchen are numerically simulated and investigated. A realistic mathematical model for kitchen cooker leakage is established, and the accuracy of the model is verified against experimental data reported in the literature. The finite volume method is used to solve the mathematical model, the pressure-correction algorithm is applied to handle the pressure-velocity coupling, and the k-ε turbulence model is adopted to describe the turbulent flow of the natural gas-hydrogen mixture. The hazardous area, alarm response time and concentration distribution of the leaked gas are mainly studied, and the effects of leakage rate, hydrogen blending ratio and ventilation conditions are analyzed in detail. Results show that the alarm response time advances as the hydrogen blending ratio increases: when the mass flow rate remains constant, the alarm response time of natural gas with hydrogen blending ratios of 5%, 10%, 15% and 20% is 5.41%, 11.71%, 21.62% and 23.65% earlier, respectively, than that of natural gas without hydrogen. Natural ventilation has an obvious impact on reducing the concentration of leaked gas in the kitchen. When the wind speed is only 1 m/s, the gas concentration in the kitchen can be kept below the alarm threshold value.

17:50
Modeling and Simulation of Atmospheric Water Generation Unit Using Anhydrous Salts

ABSTRACT. The atmosphere contains 3400 trillion gallons of water vapor, which would be enough to cover the entire earth in 1 inch of water. As air humidity is available everywhere, it constitutes an abundant, renewable reservoir of water known as atmospheric water. The efficiency of an atmospheric water harvesting system depends on its water sorption capacity, which is based on the adsorption phenomenon. Using anhydrous salts is an efficient process for capturing and delivering water from ambient air, especially at relative humidity as low as 15%. Many water-scarce countries, such as Saudi Arabia, have high annual solar radiation and relatively high humidity. This study focuses on modeling and simulating the water absorption and release of the anhydrous salt copper chloride (CuCl2) under different relative humidities, to produce atmospheric drinking water in scarce regions.

16:30-18:10 Session 6G: CLDD 3
16:30
Application of Multi-Objective Optimization to Feature Selection for a Difficult Data Classification Task

ABSTRACT. Many decision problems require taking into account a compromise between the various goals we want to achieve. A certain group of features often decides the state of a given object. An example of such a task is selecting features that increase the quality of the decision while minimizing the cost of the features or the total budget. The main purpose of this work is to compare feature selection methods: the classical approach, single-objective optimization and multi-objective optimization. The article proposes a method of selecting features using various criteria, i.e., cost and accuracy, with the use of a Genetic Algorithm. In this way, Pareto-optimal points for the nonlinear multi-criteria optimization problem were obtained. These points constitute a compromise between two conflicting objectives. By carrying out experiments with various base classifiers, it has been shown that the proposed approach can be used in the task of optimizing difficult data.
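
A minimal sketch of the two objectives and of Pareto-front extraction, assuming hypothetical per-feature costs and random candidate subsets in place of the Genetic Algorithm search used in the paper.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
feature_cost = rng.uniform(1.0, 5.0, X.shape[1])      # hypothetical per-feature costs

def evaluate(mask):
    """Return (cross-validated accuracy, total cost) for a boolean feature mask."""
    acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                          X[:, mask], y, cv=5).mean()
    return acc, feature_cost[mask].sum()

candidates = [rng.random(X.shape[1]) < 0.2 for _ in range(40)]
candidates = [m for m in candidates if m.any()]
scores = [evaluate(m) for m in candidates]

# Pareto front: keep subsets not dominated in (higher accuracy, lower cost).
pareto = [i for i, (a, c) in enumerate(scores)
          if not any(a2 >= a and c2 <= c and (a2 > a or c2 < c)
                     for j, (a2, c2) in enumerate(scores) if j != i)]
for i in pareto:
    print(f"features: {candidates[i].sum():2d}  accuracy: {scores[i][0]:.3f}  cost: {scores[i][1]:.1f}")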

16:50
Deep Embedding Features for Action Recognition on Raw Depth Maps

ABSTRACT. In this paper we present an approach for embedding features for action recognition on raw depth maps, which demonstrates high potential when the amount of training data is small. A convolutional autoencoder is trained to learn embedded features encapsulating the content of single depth maps. Afterwards, multichannel 1D CNN features are extracted from multivariate time series of such embedded features to represent actions on depth map sequences. In the second stream, dynamic time warping is used to extract action features from multivariate streams of statistical features of single depth maps. The output of the third stream consists of class-specific action features extracted by TimeDistributed and LSTM layers. Action recognition is achieved by voting in an ensemble of one-vs-all weak classifiers. We demonstrate experimentally that the proposed algorithm achieves competitive results on the UTD-MHAD dataset and outperforms by a large margin the best algorithms on the 3D Human-Object Interaction Set (SYSU 3DHOI).

17:10
Analysis of variance application in the construction of classifier ensemble based on optimal feature subset for the task of supporting glaucoma diagnosis

ABSTRACT. This work aims to develop a new method of constructing an ensemble of classifiers diversified by an appropriate selection of the problem subspace. The experiments were performed on a numerical dataset with three groups: healthy controls, glaucoma suspects, and glaucoma patients. Overall, it consists of medical records from 211 cases described by 48 features, being the values of biomarkers collected at the time of glaucoma diagnosis. To avoid the risk of losing information hidden in the features, the proposed method draws, for each base classifier, a separate subset of features from all available ones, with probabilities determined by the ANOVA test. The method was validated for four base classifiers and various subspace sizes, and compared with existing feature selection methods. The cross-validation results were confirmed by a non-parametric corrected t-test. For all of the presented base classifiers, the method achieved superior results in comparison with the others presented. A high level of accuracy is maintained for different subspace sizes, which also reduces the need to optimize method hyperparameters. The experiments confirmed the effectiveness of the proposed method for creating an ensemble of classifiers for small high-dimensional datasets.
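
A minimal sketch of ANOVA-weighted subspace drawing, assuming a stand-in public dataset, an illustrative base classifier and ensemble settings rather than the glaucoma data and validation protocol described above.

import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import f_classif
from sklearn.naive_bayes import GaussianNB

X, y = load_wine(return_X_y=True)
F, _ = f_classif(X, y)
p = F / F.sum()                                   # sampling probabilities from ANOVA F-scores

rng = np.random.default_rng(0)
subspace_size = 5
ensemble = []
for _ in range(10):                               # ten base classifiers, each on its own subspace
    feats = rng.choice(X.shape[1], size=subspace_size, replace=False, p=p)
    ensemble.append((feats, GaussianNB().fit(X[:, feats], y)))

def predict(x):
    """Majority vote of the base classifiers, each seeing only its drawn features."""
    votes = [clf.predict(x[feats].reshape(1, -1))[0] for feats, clf in ensemble]
    return np.bincount(votes).argmax()

print(predict(X[0]), "true:", y[0])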

17:30
Multi-objective evolutionary undersampling algorithm for imbalanced data classification

ABSTRACT. The classification of imbalanced data is an important topic of research conducted in recent years. One of the commonly used techniques for dealing with this problem is undersampling, which aims to balance the training set by selecting the most important samples of the original set. The selection procedure proposed in this paper uses the multi-objective genetic algorithm NSGA-II to search for the optimal subset of the learning set. The paper presents a detailed description of the considered method. Moreover, the proposed algorithm has been compared with a selection of reference algorithms, showing promising results.

17:50
Missing value imputation method using separate features nearest neighbors algorithm

ABSTRACT. Missing value imputation is a problem often met when working with medical and biometric data sets. Prior to working on these datasets, missing values have to be eliminated, which can be done by imputing estimated values. However, imputation should not bias the data, nor alter the class balance. This paper presents an innovative approach to the problem of imputing missing values in training data for classification. The method uses the k-NN classifier on separate features to impute missing values. The unique approach used in this method allows data from incomplete vectors to be used to impute other incomplete vectors, unlike conventional methods, where only complete vectors can be used in the imputation process. The paper also describes a test protocol, where the Cross Validation with Set Substitution method is used as an evaluation tool for scoring missing value imputation methods.
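
A minimal sketch of per-feature nearest-neighbour imputation in which incomplete rows can also act as donors; the distance handling and parameters are illustrative, not the paper's exact algorithm.

import numpy as np

def knn_impute(X, k=3):
    """Fill each missing entry from the k nearest rows that have that feature,
    measuring distance only over the features both rows have observed."""
    X = X.astype(float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        dists = []
        for r in range(X.shape[0]):
            if r == i or np.isnan(X[r, j]):
                continue                                  # a donor must have feature j
            shared = ~np.isnan(X[i]) & ~np.isnan(X[r])    # features observed in both rows
            shared[j] = False
            if shared.any():
                d = np.sqrt(np.mean((X[i, shared] - X[r, shared]) ** 2))
                dists.append((d, X[r, j]))
        if dists:
            dists.sort(key=lambda t: t[0])
            filled[i, j] = np.mean([v for _, v in dists[:k]])
    return filled

X = np.array([[1.0, 2.0, np.nan],
              [1.1, np.nan, 3.0],
              [0.9, 2.1, 2.9],
              [5.0, 6.0, 7.0]])
print(knn_impute(X, k=2))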

16:30-18:10 Session 6H: IoTSS 2
16:30
Metagenomic analysis at the edge with Jetson Xavier NX

ABSTRACT. Nanopore sequencing technologies and devices such as MinION Nanopore enable cost-effective and portable metagenomic analysis. However, performing mobile metagenomics analysis in secluded areas requires computationally and energetically efficient Edge devices capable of running the whole analysis workflow without access to extensive computing infrastructure. This paper presents a study on using Edge devices such as Jetson Xavier NX as a platform for running real-time analysis. In the experiments, we evaluate it both from a performance and energy efficiency standpoint. For the purposes of this article, we developed a sample workflow, where raw nanopore reads are basecalled and later classified with Guppy and Kraken2 software. To provide an overview of the capabilities of Jetson Xavier NX, we conducted experiments in various scenarios and for all available power modes. The results of the study confirm that Jetson Xavier NX can serve as an energy-efficient, performant, and portable device for running real-time metagenomic experiments, especially in places with limited network connectivity, as it supports fully offline workflows. We also noticed that a lot of tools are not optimized to run on such Edge devices, and we see a great opportunity for future development in that area.

16:50
Programming IoT-spaces: A User-Survey on Home Automation Rules

ABSTRACT. The Internet-of-Things (IoT) has transformed everyday manual tasks into digital and automatable ones, giving way to the birth of several end-user development solutions that attempt to ease the task of configuring and automating IoT systems without requiring prior technical knowledge. While some studies reflect on the automation rules that end-users choose to program into their spaces, they are limited by the number of devices and possible rules that the tool under study supports. There is a lack of systematic research on (1) the automation rules that users wish to configure in their homes, (2) the different ways users state their intents, and (3) the complexity of the rules themselves, without the limitations imposed by specific IoT device systems and end-user development tools. This paper surveys twenty participants about home automation rules given a standard house model and device list, without limiting their creativity or the resulting automation complexity. We analyzed and systematized the collected 177 scenarios into seven different interaction categories, representing the most common smart home interactions.

17:10
Application of the Ant Colony algorithm for routing in next generation programmable networks

ABSTRACT. New-generation 5G technology provides network resource management mechanisms to efficiently control dynamic bandwidth allocation and assure Quality of Service (QoS) in terms of KPIs (Key Performance Indicators), which is important for delay- or loss-sensitive Internet of Things (IoT) services. To meet such application requirements, network resource management in Software Defined Networking (SDN), supported by Artificial Intelligence (AI) algorithms, offers a solution. In our approach, we propose a solution in which AI is responsible for controlling intent-based routing in the SDN network. The paper focuses on algorithms inspired by biology, i.e., the ant colony algorithm for selecting the best routes in a network with an appropriately defined objective function and constraints. The proposed algorithm is compared with a Mixed Integer Programming (MIP) based algorithm and a greedy algorithm. The performance of the above algorithms is tested and compared for several network topologies. The obtained results confirm that the ant colony algorithm is a viable alternative to the MIP and greedy algorithms and provide the basis for further research on its effective application to programmable networks.
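
A minimal sketch of ant-colony route selection on a toy topology, with an illustrative delay-like link cost as the objective; the SDN integration, constraints and comparison baselines from the paper are omitted.

import random

GRAPH = {                       # adjacency list: node -> {neighbour: link cost}
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 1.0, "D": 5.0},
    "C": {"A": 4.0, "B": 1.0, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}
pher = {(u, v): 1.0 for u in GRAPH for v in GRAPH[u]}   # pheromone per directed link

def walk(src, dst, alpha=1.0, beta=2.0):
    """One ant builds a loop-free path, biased by pheromone and inverse link cost."""
    path, node = [src], src
    while node != dst:
        options = [n for n in GRAPH[node] if n not in path]
        if not options:
            return None, float("inf")
        weights = [pher[(node, n)] ** alpha * (1.0 / GRAPH[node][n]) ** beta
                   for n in options]
        node = random.choices(options, weights=weights)[0]
        path.append(node)
    cost = sum(GRAPH[a][b] for a, b in zip(path, path[1:]))
    return path, cost

best_path, best_cost = None, float("inf")
for _ in range(200):                                     # colony iterations
    path, cost = walk("A", "D")
    if path is None:
        continue
    for link in pher:                                    # pheromone evaporation
        pher[link] *= 0.95
    for a, b in zip(path, path[1:]):                     # deposit on the links just used
        pher[(a, b)] += 1.0 / cost
    if cost < best_cost:
        best_path, best_cost = path, cost
print(best_path, best_cost)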

17:30
Scalable Computing System with Two-Level Reconfiguration of Multi-Channel Inter-Node communication

ABSTRACT. The paper presents the architecture and organization of a reconfigurable inter-node communication system based on hierarchical embedding and logical multi-buses. The communication environment is a physical network with a bus topology or its derivatives (e.g. folded buses, mesh and toroidal bus networks). In the system, multi-channel communication is enforced through the use of tunable signal receivers/transmitters, with the buses or their derivatives being completely passive. In the physical environment, logical components (nodes, channels, paths) are distinguished, on the basis of which mutually separated logical connection networks are created. The embedding used for this purpose is fundamentally different from previous interpretations of this term. Improvement of communication and computational efficiency is achieved by changing the physical network architecture (e.g. the use of folded bus topologies, 2D and 3D networks), as well as at the logical level by grouping or dividing system elements (processing nodes and bus channels). As a result, it is possible to ensure uniformity of the communication and computational loads of system components. To enable formal design of the communication system, a method of describing the hierarchy and selecting its organization was proposed. In addition, methods of mathematical notation of bus topologies and the scope of their applications were analyzed. The work ends with a description of simulations and empirical research on the effectiveness of the proposed solutions, which offer high flexibility of use and a relatively low implementation cost.

17:50
Real-time Object Detection for Smart Connected Worker in 3D printing

ABSTRACT. IoT and smart systems have been introduced into advanced manufacturing, especially 3D printing, following the trend of the fourth industrial revolution. The rapid development of computer vision and IoT devices in recent years has opened a fruitful direction for real-time machine state monitoring. In this study, computer vision technology was adopted in the Smart Connected Worker (SCW) system with the use case of 3D printing. Specifically, artificial intelligence models were investigated, instead of discrete labor-intensive methods, to monitor the machine state and predict errors and risks in advanced manufacturing. The model achieves accurate supervision in real time, twenty-four hours a day, which can reduce human resource costs significantly. At the same time, the experiments demonstrate the feasibility of adopting AI technology in more aspects of advanced manufacturing.

16:30-18:10 Session 6I: ACMAIML 2
16:30
Text-Based Product Matching with Incomplete and Inconsistent Items Descriptions

ABSTRACT. In recent years Machine Learning and Artificial Intelligence have been reshaping the landscape of e-commerce and retail. Using advanced analytics, behavioral modeling, and inference, representatives of these industries can leverage collected data and increase their market performance. To perform assortment optimization, one of the most fundamental problems in retail, one has to identify products that are present in the competitors' portfolios. This is not possible without effective product matching. The paper deals with finding identical products in the offers of different retailers. The task is performed using a text-mining approach, assuming that the data may contain incomplete information. Besides the description of the algorithm, results for real-world data fetched from the offers of two consumer electronics retailers are demonstrated.
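
A minimal sketch of text-based matching between two retailers' offers, assuming hypothetical product titles and a hand-picked similarity threshold; it illustrates the general idea rather than the paper's full pipeline for incomplete and inconsistent descriptions.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

shop_a = ["Samsung Galaxy S21 128GB grey", "Lenovo IdeaPad 3 15.6 8GB RAM"]
shop_b = ["Galaxy S21 Samsung 128 GB gray smartphone", "Apple iPhone 12 mini 64GB"]

# Character n-grams are fairly robust to typos, word order and missing attributes.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 4))
sims = cosine_similarity(vec.fit_transform(shop_a), vec.transform(shop_b))

for i, row in enumerate(sims):
    j = row.argmax()
    if row[j] > 0.35:                                   # hypothetical acceptance threshold
        print(f"match: '{shop_a[i]}'  <->  '{shop_b[j]}'  (sim={row[j]:.2f})")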

16:50
Unsupervised Text Style Transfer via An Enhanced Operation Pipeline

ABSTRACT. Unsupervised text style transfer aims to change the style attribute of given unpaired texts while preserving the style-independent semantic content. In order to preserve the content, some methods directly remove the style-related words from the texts. The remaining content, together with target stylized words, is fused to produce target samples with the transferred style. In such a mechanism, two main challenges should be well addressed. First, since the style-related words are not given explicitly in the original dataset, a detection algorithm is required to recognize the words in an unsupervised paradigm. Second, the compatibility between the remaining content and the target stylized words should be guaranteed to produce valid samples. In this paper, we propose a multi-stage method following the pipeline Detection, Matching, and Generation. In the Detection stage, the style-related words are recognized by an effective joint method and replaced by mask tokens. Then, in the Matching stage, the contexts of the masks are employed as queries to retrieve target stylized tokens from candidates. Finally, in the Generation stage, the masked texts and retrieved style tokens are transformed into the target results by attentive decoding. On two public sentiment style datasets, experimental results demonstrate that our proposed method addresses the challenges mentioned above and achieves competitive performance compared with several state-of-the-art methods.

17:10
Exemplar Guided Latent Pre-trained Dialogue Generation

ABSTRACT. Pre-trained models with latent variables have proved to be an effective method for diverse dialogue generation. However, the latent variables in current models are finite and uninformative, making the generated responses lack diversity and informativeness. To address this problem, we propose an exemplar-guided latent pre-trained dialogue generation model that samples the latent variables from a continuous sentence embedding space, which can be controlled by exemplar sentences. The proposed model contains two parts: exemplar seeking and response generation. First, exemplar seeking builds a sentence graph based on the given dataset and seeks an enlightening exemplar from the graph. Next, response generation constructs informative latent variables based on the exemplar and generates diverse responses with the latent variables. Experiments show that the model can effectively improve the appropriateness and diversity of responses and achieves state-of-the-art performance.

17:30
Monte Carlo Winning Tickets

ABSTRACT. Recent research on sparse neural networks demonstrates that densely-connected models contain sparse subnetworks that are trainable from a random initialization. Existence of these so-called winning tickets suggests that we may possibly forego extensive training-and-pruning procedures, and train sparse neural networks from scratch. Unfortunately, winning tickets are data-derived models. That is, while they can be trained from scratch, their architecture is discovered via iterative pruning. In this work we propose Monte Carlo Winning Tickets (MCTWs) – random, sparse neural architectures that resemble winning tickets with respect to certain statistics over weights and activations. We show that MCTWs can match the performance of standard winning tickets. This opens a route to constructing random but trainable sparse neural networks.
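
A minimal sketch of one ingredient of such a construction: drawing a random per-layer sparse mask whose per-layer density matches that of a reference (e.g. iteratively pruned) ticket. The layer shapes and densities are illustrative; matching richer statistics over weights and activations, as proposed above, goes beyond this sketch.

import numpy as np

rng = np.random.default_rng(0)
layer_shapes = [(784, 300), (300, 100), (100, 10)]      # illustrative MLP layer shapes
target_density = [0.1, 0.1, 0.3]                        # fraction of weights kept per layer

masks = []
for shape, d in zip(layer_shapes, target_density):
    n = int(np.prod(shape))
    keep = rng.choice(n, size=int(d * n), replace=False)  # uniformly random surviving weights
    m = np.zeros(n, dtype=bool)
    m[keep] = True
    masks.append(m.reshape(shape))

print([round(float(m.mean()), 3) for m in masks])       # realised per-layer densities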

17:50
Interpreting Neural Networks Prediction for a Single Instance via Random Forest Feature Contributions

ABSTRACT. In this paper, we focus on the problem of interpreting Neural Networks at the instance level. The proposed approach uses Feature Contributions, numerical values that domain experts further interpret to reveal some phenomena about a particular instance or about model behavior. In our method, Feature Contributions are calculated from a Random Forest model trained to mimic the Artificial Neural Network's classification as closely as possible. We assume that we can trust the Feature Contributions when both predictions are the same, i.e., when the Neural Network and the Random Forest give the same result. The results show that this strongly depends on how well the Neural Network is trained, because its error is propagated to the Random Forest model. For well-trained ANNs, the interpretation based on Feature Contributions can be trusted in 80% of cases on average.
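
A minimal sketch of the mimic setup, using a public dataset and global feature importances as a stand-in for instance-level Feature Contributions; the dataset, architecture and settings are illustrative, not the paper's.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The network whose behaviour we want to explain (illustrative architecture).
nn = make_pipeline(StandardScaler(),
                   MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0))
nn.fit(X, y)

# The mimic: a Random Forest trained on the network's predictions, not on the true labels.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, nn.predict(X))

agree = rf.predict(X) == nn.predict(X)
print("NN/RF agreement rate:", agree.mean())            # interpret only where the two agree
top = np.argsort(rf.feature_importances_)[::-1][:5]
print("globally most influential features (proxy for contributions):", top)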

18:10
A Higher-Order Adaptive Network Model to Simulate Development of and Recovery from PTSD

ABSTRACT. In this paper, a second-order adaptive model is introduced for the simulation of the formation of a mental model of a traumatic course of events and its emotional responses, and of the learning processes by which a stimulus can become a trigger that activates this mental model. Furthermore, the influence of therapy on the ability of an individual to learn to control the emotional responses to the traumatic mental model was modeled. To unblock and activate this learning, a form of second-order adaptation was applied.

16:30-18:10 Session 6J: SE4Science 2

Working Session to write blog posts