11:10 | Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning PRESENTER: Hui Wang ABSTRACT. Morpion Solitaire is a popular single-player game played with paper and pencil. Due to its large state space (on the order of the game of Go), traditional search algorithms, such as MCTS, have not been able to find good solutions. A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources. After this record was achieved, to the best of our knowledge, no further progress has been reported for about a decade. In this paper we take the recent impressive performance of deep self-learning reinforcement learning approaches from AlphaGo/AlphaZero as inspiration to design a searcher for Morpion Solitaire. A challenge of Morpion Solitaire is that the state space is sparse: there are few win/loss signals. Instead, we use an approach known as ranked reward to create a reinforcement learning self-play framework for Morpion Solitaire. This enables us to find medium-quality solutions with reasonable computational effort. Our record is a 67-step solution, which is very close to the human best (68), without any other adaptation to the problem than using ranked reward. We list many further avenues for potential improvement. |
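The ranked-reward idea named in the abstract above can be illustrated with a minimal sketch: a game score is turned into a +1/-1 signal by comparing it with a percentile of recent self-play scores. The class name, buffer size, percentile, and random tie-break below are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import deque


class RankedReward:
    """Turns raw episode scores into +1/-1 signals relative to recent play.

    Hypothetical helper: buffer_size and percentile are assumed values.
    """

    def __init__(self, buffer_size=250, percentile=0.75, seed=0):
        self.scores = deque(maxlen=buffer_size)  # recent self-play scores
        self.percentile = percentile
        self.rng = random.Random(seed)

    def threshold(self):
        """Score at the chosen percentile of the buffer (None if empty)."""
        if not self.scores:
            return None
        ordered = sorted(self.scores)
        index = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[index]

    def reward(self, score):
        """Compare a new score with the current threshold, then store it."""
        cutoff = self.threshold()
        self.scores.append(score)
        if cutoff is None or score > cutoff:
            return 1
        if score < cutoff:
            return -1
        return self.rng.choice((1, -1))  # tie: break randomly
```

Because the threshold moves with the agent's own recent results, a single-player game with no opponent still yields a binary win/loss-style training signal.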
11:30 | Machine Learning based models for examining differences between modern and ancient DNA in dental calculus PRESENTER: Iulia-Monica Szuhai ABSTRACT. DNA, or deoxyribonucleic acid, carries the entirety of genetic information of any living organism. The study of the bacterial DNA extracted from human bones excavated from archaeological and anthropological sites aims to analyse the evolution of microorganisms inhabiting the human body and to contribute new insights into the health, diet and even migration of our ancestors. This paper aims to offer a solution for the discrimination between ancient and modern bacterial DNA in dental calculus. We propose three internal representations for the considered DNA sequences in order to analyse which one is the most informative for classification models. Two of these are text based, while the third one takes advantage of several physical and chemical properties of nucleotides in the DNA. We use a data set containing both ancient and modern dental calculus bacterial DNA and apply two supervised models, namely artificial neural networks and support vector machines, to distinguish between the two types of sequences. The two main conclusions indicated by the obtained results are: the representation based on physical and chemical properties seems to best capture relevant information for the task at hand; for the considered data set and DNA encoding proposals, support vector machines outperform artificial neural networks, although results obtained by both models are promising. |
11:50 | Extractive Summarization using Cohesion Network Analysis and Submodular Set Functions ABSTRACT. Numerous approaches have been introduced to automate the process of text summarization, but only a few can be easily adapted to multiple languages. This paper introduces a multilingual text processing pipeline integrated in the open-source ReaderBench framework, which can be retrofitted to cover more than 50 languages. While considering the extensibility of the approach and the problem of missing labeled data for training in various languages besides English, an unsupervised algorithm was preferred to perform extractive summarization (i.e., select the most representative sentences from the original document). In particular, two different approaches relying on text cohesion were implemented: a) a graph-based text representation derived from Cohesion Network Analysis that extends TextRank, and b) a class of submodular set functions. Evaluations were performed on the DUC dataset and use the implementation of TextRank from Gensim as the baseline. Our results using the submodular set functions outperform the baseline. In addition, two use cases on English and Romanian languages are presented, with corresponding graphical representations for the two methods. |
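As a rough illustration of extractive summarization with a submodular set function, the sketch below greedily picks sentences by marginal gain. The objective used here (number of distinct words covered) is a simple stand-in chosen for illustration; it is an assumption and not the cohesion-based function class from the paper, and the function names are hypothetical.

```python
def coverage(selected, sentences):
    """Monotone submodular objective: number of distinct words covered."""
    covered = set()
    for i in selected:
        covered |= sentences[i]
    return len(covered)


def greedy_summary(text_sentences, budget):
    """Pick up to `budget` sentences by greedy marginal-gain maximization."""
    sentences = [set(s.lower().split()) for s in text_sentences]
    selected = []
    while len(selected) < budget:
        base = coverage(selected, sentences)
        best, best_gain = None, 0
        for i in range(len(sentences)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sentences) - base
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining sentence adds new words
            break
        selected.append(best)
    # Return the chosen sentences in their original document order.
    return [text_sentences[i] for i in sorted(selected)]
```

Greedy selection is the usual choice for monotone submodular objectives because the diminishing-returns property gives it a constant-factor approximation guarantee under a cardinality budget.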
12:10 | Cohesion Network Analysis: Customized Curriculum Management in Moodle ABSTRACT. Learning Management Systems frequently act as platforms for online content which is usually structured hierarchically into modules and lessons to ease navigation. However, the volume of information may be overwhelming, or only part of the lessons may be relevant for an individual; thus, the need for customized curricula emerges. We introduce a Moodle plugin developed to help learners customize their curriculum to best fit their learning needs by relying on specific filtering criteria and semantic relatedness. For this experiment, a Moodle instance was created for doctors working in the field of nutrition in early life. The platform includes 78 lessons tackling a wide variety of topics, organized into five modules. Our plugin enables users to specify basic filtering criteria, including their field of expertise, topics of interest from a predefined taxonomy, or expected themes (e.g., background knowledge, practice & counselling, or guidelines) for a preliminary pre-screening of lessons. In addition, learners can also provide a description in natural language of their learning interests. This text is compared with each lesson’s description using Cohesion Network Analysis, and lessons are selected above an experimentally set threshold. Our approach also takes into account prior knowledge requirements, and may suggest lessons for further reading. Overall, the plugin covers the management of the entire course lifecycle, namely: a) creating a customized curriculum; b) tracking the progress of completed lessons; c) generating completion certificates with corresponding CME points. |
12:30 | What’s Been Happening in the Romanian News Landscape? A Detailed Analysis Grounded in Natural Language Processing Techniques ABSTRACT. People strive to be connected to events happening worldwide in terms of politics, technology, sports, business, and many other domains. The main source of news today resides in online publications, which can strongly influence public opinion. Our purpose is to build a comprehensive automated pipeline, integrating various Natural Language Processing techniques, to process online news written in the Romanian language. Our dataset consists of 631,565 news articles from various Romanian publications between May 2004 and December 2019, which are used to detect semantic similarities between articles and rank various publications in terms of their influence. Furthermore, we created visualizations to ease the understanding of results and ensure efficient text retrieval over the gathered articles. In the future, we plan to apply opinion mining, geographical names extraction and content quality assessments relating, for example, to the likelihood of being fake news. |
11:10 | A Metric-Based Approach to Modelling a Virtual Machine for Smart Contract Execution PRESENTER: Alexe Luca Spataru ABSTRACT. Smart contracts have shown significant potential for multiple industries by offering data integrity, transparency, non-repudiation, and elimination of trust. An essential part of introducing programmability in a blockchain is a process virtual machine that performs the execution and computes new valid states. In this paper, a model for a programmable blockchain is presented that can perform computations and track the state changes. In addition, the virtual machine responsible for the execution is defined as a deterministic state machine, and a loosely coupled architecture is proposed such that the virtual machine works independently from the blockchain. Flaws and shortcomings of the current execution environments of today's popular blockchain platforms are also discussed. The models provided could serve as an implementation guide for blockchains that want to add programmability to their system. |
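To make the notion of a virtual machine as a deterministic state machine concrete, here is a minimal sketch: a pure transition function maps a state and an instruction to a new state, so the same storage and program always produce the same result. The three-instruction set (`PUSH`, `ADD`, `STORE`) and the function names are hypothetical and are not taken from the paper's model.

```python
def step(state, instruction):
    """Pure transition function: (state, instruction) -> new state.

    A state is a pair (stack, storage), both immutable tuples.
    """
    stack, storage = state
    op, *args = instruction
    if op == "PUSH":
        return (stack + (args[0],), storage)
    if op == "ADD":
        if len(stack) < 2:
            raise ValueError("ADD needs two operands")
        return (stack[:-2] + (stack[-2] + stack[-1],), storage)
    if op == "STORE":
        if not stack:
            raise ValueError("STORE needs one operand")
        updated = dict(storage)
        updated[args[0]] = stack[-1]
        return (stack[:-1], tuple(sorted(updated.items())))
    raise ValueError(f"unknown instruction: {op}")


def execute(storage, program):
    """Run a program against contract storage and return the new storage."""
    state = ((), tuple(sorted(storage.items())))
    for instruction in program:
        state = step(state, instruction)
    return dict(state[1])
```

In this sketch `execute` knows nothing about blocks or consensus: it only receives storage and a program, which is one simple way to keep the machine decoupled from the ledger that calls it.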
11:30 | Towards Efficient Governance In Distributed Ledger Systems Using High-Performance Computational Nodes PRESENTER: Nicolae-Bogdan-Cristian Ţogoe ABSTRACT. Due to the increasing popularity of distributed ledger systems and the increasing demand for a stable blockchain model, suitable both for the development of IoT (Internet of Things) and DApps (decentralized applications), the current generation of blockchains needs to solve its main problems, ranging from scalability to security, and to bring improvements in order to fit the needs of society. One of many solutions brought by current prospective blockchains is a form of governance through different types of nodes, usually equipped with more computational resources, that have a more central significance in the network. In this paper, we aim to showcase the applicability of democratic governance in the blockchain ecosystem through the use of supernodes in order to solve some of the current dilemmas. Despite the multitude of use cases, we focus on four that show great potential for improving blockchain technology, outlining both their positive and negative points. In addition, current blockchains that have a form of governance are analyzed by examining the use cases of the supernodes as well as the benefits and negative aspects they bring to the ecosystem. |
14:00 | Quality of Pre-trained Deep-Learning Models for Palmprint Recognition ABSTRACT. In this paper, we present a comprehensive study on deep learning methods and datasets used for solving the palmprint recognition problem. The quality of image embeddings provided by deep neural networks, pre-trained on the ImageNet dataset, is evaluated on the task of palmprint recognition in the visible spectrum. In our tests, we used twelve publicly available datasets obtained with different types of acquisition procedures: constrained, partially constrained and unconstrained. Sixteen convolutional neural networks (two from the VGG family, six from ResNet, three from Inception, two from MobileNet and three from DenseNet) were evaluated. We analyzed the results from the point of view of specialization potential, the difficulty of the datasets and general parameter tuning. For evaluation, EER (Equal Error Rate) was employed. We ranked the datasets and appraised the feature vectors computed by the pre-trained networks using this metric. The best results, on average, were provided by the deep neural networks from the MobileNet family. The distances used for comparing the feature vectors were Euclidean, Cityblock, cosine and correlation. The best results were obtained with the cosine-family distances. |
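The evaluation described in the abstract above combines a distance between feature vectors with the Equal Error Rate. The sketch below shows one simple way to compute both in plain Python; the threshold sweep and the midpoint approximation of the EER are assumptions made for illustration and are not necessarily the procedure used in the study.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def equal_error_rate(genuine, impostor):
    """Approximate EER from distances of genuine and impostor pairs.

    A pair is accepted when its distance is <= the threshold. The function
    sweeps every observed distance as a threshold and returns the mean of
    FAR and FRR where the two rates are closest.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(d <= t for d in impostor) / len(impostor)  # false accepts
        frr = sum(d > t for d in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

A lower EER means the embedding separates same-palm pairs from different-palm pairs more cleanly, which is why a single number can be used to rank both networks and datasets.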
14:20 | Edge map response of dilated and reconstructed classical filters PRESENTER: Ciprian-Constantin Orhei ABSTRACT. Edge detection is a basic and fundamental feature in the image processing domain. Dilating edge filter kernels has proven beneficial for edge detection, as it helps filter out noise and takes a larger region of the image into consideration during processing. Numerous techniques have been used in the past for finding edge features, one of the most commonly used being to find features at a lower-scale level of the image pyramid. We therefore investigate whether dilating the filter kernels brings benefits similar to finding edges at a lower-scale pyramid level. |
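Kernel dilation, as discussed in the abstract above, can be sketched by spreading the taps of a classical 3x3 filter apart with zeros. The Sobel kernel is used here as an assumed example of a classical filter, the helper names are hypothetical, and square kernels are assumed.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def dilate_kernel(kernel, factor):
    """Spread kernel taps apart by inserting (factor - 1) zeros between them."""
    size = (len(kernel) - 1) * factor + 1
    dilated = [[0] * size for _ in range(size)]
    for r, row in enumerate(kernel):
        for c, value in enumerate(row):
            dilated[r * factor][c * factor] = value
    return dilated


def correlate_valid(image, kernel):
    """Slide a square kernel over the image without padding."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]
```

With `factor=2` the 3x3 kernel becomes 5x5 while keeping the same nine non-zero weights, so it responds to intensity changes over a wider neighbourhood at no extra multiplication cost per tap.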
14:40 | Semantic Image Inpainting via Maximum Likelihood PRESENTER: Sebastian Ciobanu ABSTRACT. Current approaches use deep learning to solve the image inpainting task. We propose a meta-algorithm in which we can set a probability distribution on images. The set distribution can be a classic one, e.g. Normal distribution, or a modern one, e.g. PixelCNN++. Our first observation is relatively unexpected: if a learnt distribution generates reasonable images, then this does not make it a good candidate for inpainting via our proposed algorithm, and vice versa (if a learnt distribution gives reasonable results on image inpainting via our algorithm, then this does not make it a good candidate for sampling a new image). Our second observation is that although the visual results of the state-of-the-art method are superior to ours, the training time for our method is lower. Hence, one can experiment faster with our method to see, for example, if the desired inpainting is learnable. Moreover, using a specific distribution, our algorithm can also be trained on high-quality RGB images, e.g. 1024 × 1024 pixels. As for the experiments, we included visual results. The Google Colab notebook with the code and the demo is available at: https://github.com/aciobanusebi/mle-inpainting |
15:00 | Automatic Real-Time Road Crack Identification System PRESENTER: Lucia-Georgiana Coca ABSTRACT. Crack identification is a common problem that requires human involvement and manual identification. Our work is focused on detecting street surface cracks using Computer Vision algorithms. The problem at hand has been divided into three steps: (i) the first step transforms a given 3D video into individual 2D frames, (ii) the second step processes these frames in order to identify the relevant part of the street using Support Vector Machine and Vanishing Point Detection and (iii) in the third step the detection itself has been implemented using three methods: Convolutional Neural Network, U-Net and a Local Binary Pattern. In this paper we present our methods, experiments and results for detecting cracks on surfaces like streets and sidewalks. |
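Of the three detection methods listed in the abstract above, the Local Binary Pattern is the simplest to show in a few lines. The sketch below computes the basic 8-neighbour LBP code on a grayscale image given as nested lists; the neighbour ordering and the `>=` comparison are one common convention and are assumptions here, not details from the paper.

```python
# Neighbour offsets in clockwise order, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit code: bit i is set when neighbour i is >= the centre pixel."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(image):
    """LBP codes for every interior pixel (the one-pixel border is skipped)."""
    return [[lbp_code(image, r, c) for c in range(1, len(image[0]) - 1)]
            for r in range(1, len(image) - 1)]
```

Histograms of these codes over image patches are a typical texture descriptor, which is how LBP is usually fed to a classifier for surface-defect tasks.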
15:20 | Should I trust a deep learning condition monitoring prediction? ABSTRACT. We introduce in this paper an explainable deep learning solution for non-invasive condition monitoring of cantilever beams and emphasize its advantages. The explanations of the black-box AI connectionist classifier are provided as feature-related importance rankings for the output of the probabilistic decision margin, thereby improving trust in the exact recognition of damaged beams and their characteristics, i.e., damage depth and damage size. For training the classifier, we have used precomputed distributional sets with 10 natural frequencies. The local, sample-based explanation is obtained from the model-agnostic LIME algorithm, and the global explanation is obtained by averaging SHAP values, both applied post hoc to the classifier. We have performed intensive testing and have observed that sometimes a decision is not to be trusted, despite its high accuracy, because of the features that most influenced that particular decision. |
17:10 | DS Lab Notebook: A new tool for data science applications PRESENTER: Alexandru Ionascu ABSTRACT. The technical application of this research is a web- and mobile-based data science notebook named DS Lab Notebook. Specifically, the main focus is on tackling existing challenges in data science education and on a solution built around DS Lab, an interactive computational notebook. The core idea is to extend traditional computing notebooks, especially from the Jupyter family, with live visualizations, debugging, widgets, and interactivity during the educational process across all the major platforms: web, Android, and iOS. The features are outlined in several use cases that can be useful in the data science teaching process, with a primary focus on matrix manipulations, scatter plots, and image filters. Python 3 was used as the main programming runtime, and the back-end for providing access to variable values and type information is described as a runtime-independent pipeline relying on code parsing and injection. |
17:30 | Maia’s fixed point theorems for discontinuous mappings ABSTRACT. We establish new fixed point theorems of Maia type in the setting of a space with a distance, more precisely when both metrics are replaced with two distances. We also present some examples to illustrate the theoretical results. |
17:50 | Business Decisions Support using Sentiment Analysis in CRM Systems PRESENTER: Bashar Al Asaad ABSTRACT. Sentiment Analysis plays an important role in the business decision-making process, especially nowadays, when people rely on online shopping and write opinions describing the products they purchased online. We have implemented two different approaches to deal with Sentiment Analysis in Customer Relationship Management (CRM) systems. One approach is based on Natural Language Processing (NLP) algorithms, and the second approach is based on Machine Learning Probabilistic Classification. The NLP approach is based on manually extracting the opinion words and creating an algorithm to classify customer reviews based on the extracted features. It also includes extracting the aspects of the product and their semantic orientation percentage scores. The feature extraction is based on the "Dependency Parsing" technique. The Machine Learning algorithm is a supervised learning algorithm trained on a labelled dataset; the data is transformed using the Term Frequency-Inverse Document Frequency text representation model. For experimental results, we have used a dataset of online customer reviews of a product to simulate a CRM system. The Machine Learning model showed better overall results than the NLP-based approach. However, through the NLP-based approach we were able to extract the list of the product's aspects. |
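The Term Frequency-Inverse Document Frequency representation mentioned in the abstract above can be sketched in plain Python. Several TF-IDF variants exist; the one below (relative term frequency times the unsmoothed logarithm of inverse document frequency, with whitespace tokenization) is an assumed choice for illustration, not necessarily the variant used by the authors.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dictionary per document.

    weight = (count of term in doc / doc length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

Terms that occur in every review, such as the product name, receive a weight of zero under this variant, so the classifier sees mostly the words that distinguish one review from another.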