WSOM+ 2019: 13TH INTERNATIONAL WORKSHOP ON SELF-ORGANIZING MAPS AND LEARNING VECTOR QUANTIZATION, CLUSTERING AND DATA VISUALIZATION
PROGRAM FOR THURSDAY, JUNE 27TH


09:00-10:00

Invited Talk (Aïda Valls, Spain)

10:00-10:50 Session 4

Life Science Applications, part I

10:00
A voting ensemble method to assist the diagnosis of prostate cancer using multiparametric MRI

ABSTRACT. Prostate cancer is the second most commonly occurring cancer in men. Diagnosis through Magnetic Resonance Imaging (MRI) is limited, as current practice has relatively low specificity. This paper extends a previous SPIE ProstateX challenge study in three ways: 1) it includes healthy-tissue analysis, creating a solution suitable for clinical practice, which has been requested and validated by collaborating clinicians; 2) it uses a voting ensemble method to assist prostate cancer diagnosis through a supervised SVM approach; and 3) it uses the unsupervised GTM to provide interpretability of the supervised SVM classification results. Pairwise classifiers of clinically significant lesions, non-significant lesions, and healthy tissue were developed. Results showed that, when combining multiparametric MRI and patient-level metadata, classification of significant lesions against healthy tissue attained an AUC of 0.869 (10-fold cross-validation).
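The voting idea can be illustrated with a toy one-vs-one sketch: one binary classifier per class pair, combined by majority vote. Nearest-centroid classifiers stand in for the paper's SVMs, and the 2-D Gaussian data is a synthetic assumption made only for this sketch:

```python
import numpy as np

# Toy one-vs-one voting ensemble over three tissue classes
# (significant lesion, non-significant lesion, healthy).
# Nearest-centroid classifiers stand in for the paper's SVMs.
rng = np.random.default_rng(0)
classes = [0, 1, 2]        # 0 = significant, 1 = non-significant, 2 = healthy
X = np.vstack([rng.normal(c * 3.0, 1.0, size=(30, 2)) for c in classes])
y = np.repeat(classes, 30)

pairwise = {}              # one binary classifier (centroid pair) per class pair
for a in classes:
    for b in classes:
        if a < b:
            pairwise[(a, b)] = (X[y == a].mean(axis=0), X[y == b].mean(axis=0))

def predict(x):
    """Majority vote over all pairwise classifiers."""
    votes = np.zeros(len(classes), dtype=int)
    for (a, b), (ca, cb) in pairwise.items():
        winner = a if np.linalg.norm(x - ca) < np.linalg.norm(x - cb) else b
        votes[winner] += 1
    return int(np.argmax(votes))

acc = float(np.mean([predict(x) == t for x, t in zip(X, y)]))
```

The same voting structure applies regardless of the base classifier, which is what lets the paper swap in pairwise SVMs trained on MRI features plus patient metadata.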

10:25
Classifying and grouping mammography images into communities using Fisher information networks to assist the diagnosis of breast cancer

ABSTRACT. The aim of this paper is to build a computer-based clinical decision support tool using a semi-supervised framework, the Fisher Information Network (FIN), for the visualization of a set of mammographic images. The FIN organizes the images into a similarity network from which, for any new image, closely related reference images can be identified. This enables clinicians to review not just the reference images but also ancillary information, e.g. about response to therapy. The Fisher information metric defines a Riemannian space where distances reflect similarity with respect to a given probability distribution. This metric is informed by the generative properties of the data and hence assesses the importance of directions in parameter space; it automatically performs feature relevance detection. The approach focuses on the interpretability of the model from the standpoint of the clinical user. Model predictions were validated using the prevalence of classes in each of the clusters identified by the FIN.

10:50-11:15

Coffee Break

11:15-13:00 Session 5

Applications

11:15
Incremental Traversability Assessment Learning using Growing Neural Gas Algorithm

ABSTRACT. In this paper, we report early results on the deployment of the growing neural gas algorithm for online incremental learning of traversability assessment with a multi-legged walking robot. The addressed problem is to incrementally build a model of the robot's experience with traversing the terrain that can be immediately utilized in the traversability cost assessment of seen but not yet visited areas. The main motivation of the studied deployment is to improve the performance of autonomous missions by avoiding hard-to-traverse areas and to support planning of cost-efficient paths based on continuously collected measurements characterizing the operational environment. We propose to employ the growing neural gas algorithm to incrementally build a model of the terrain characterization from exteroceptive features that are associated with a proprioception-based estimate of the traversal cost. Based on the reported results, the proposed deployment provides results competitive with the existing approach based on the Incremental Gaussian Mixture Network.
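A minimal growing neural gas loop, in the spirit of the incremental terrain model described above. All parameter values are illustrative rather than the paper's, 2-D uniform samples stand in for the robot's terrain features, and removal of isolated nodes is omitted for brevity:

```python
import numpy as np

# Minimal growing neural gas (GNG): units compete for each sample, the
# winner and its graph neighbours move toward it, edges age out, and a
# new unit is periodically inserted near the highest-error unit.
rng = np.random.default_rng(1)

eps_b, eps_n = 0.2, 0.006       # winner / neighbour learning rates
age_max, lam = 50, 100          # edge-age limit, insertion interval
alpha, beta = 0.5, 0.995        # error decay on insertion / per step

nodes = [rng.random(2), rng.random(2)]   # unit positions
error = [0.0, 0.0]                       # accumulated quantization error
edges = {}                               # (i, j) with i < j -> age

def key(i, j):
    return (min(i, j), max(i, j))

for t in range(1, 3001):
    x = rng.random(2)                    # next sample from the stream
    d = [float(np.linalg.norm(x - w)) for w in nodes]
    order = np.argsort(d)
    s1, s2 = int(order[0]), int(order[1])
    error[s1] += d[s1] ** 2
    nodes[s1] += eps_b * (x - nodes[s1])         # move winner
    for (i, j) in list(edges):
        if s1 in (i, j):
            edges[(i, j)] += 1                   # age winner's edges
            other = j if i == s1 else i
            nodes[other] += eps_n * (x - nodes[other])
    edges[key(s1, s2)] = 0                       # refresh winner-runner-up edge
    edges = {e: a for e, a in edges.items() if a <= age_max}
    if t % lam == 0:                             # insert a node near max error
        q = int(np.argmax(error))
        nbrs = [j if i == q else i for (i, j) in edges if q in (i, j)]
        if nbrs:
            f = max(nbrs, key=lambda n: error[n])
            nodes.append((nodes[q] + nodes[f]) / 2.0)
            error[q] *= alpha
            error[f] *= alpha
            error.append(error[q])
            new = len(nodes) - 1
            edges.pop(key(q, f), None)
            edges[key(q, new)] = 0
            edges[key(f, new)] = 0
    error = [e * beta for e in error]
```

Because nodes are added only where accumulated error is high, the model grows exactly where the incoming terrain data is dense or complex, which is what makes it usable online during a mission.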

11:40
Self-Organizing Maps with Convolutional Layers

ABSTRACT. Self-organizing maps (SOMs) are well suited to visualizing high-dimensional data sets. Training SOMs on raw high-dimensional data with classic metrics often leads to problems arising from the curse-of-dimensionality effect. To achieve more valuable semantic maps of high-dimensional data sets, we assume that higher-level features are necessary. We propose to obtain such higher-level features from pre-trained convolutional layers, i.e., filter banks of convolutional neural networks (CNNs). Appropriately pre-trained CNNs are required, e.g., from the same or related domains, or from semi-supervised scenarios. We introduce SOM quality measures and analyze the new approaches on two benchmark image data sets, considering different convolutional network levels.
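The underlying SOM training loop the paper builds on can be sketched as follows; random 10-D vectors stand in for the CNN filter-bank features (an assumption made only for this sketch):

```python
import numpy as np

# Bare-bones SOM: for each sample, find the best-matching unit (BMU)
# on a 6x6 grid and pull it and its grid neighbours toward the sample,
# with decaying learning rate and neighbourhood radius.
rng = np.random.default_rng(0)
grid_h, grid_w, dim = 6, 6, 10
weights = rng.random((grid_h * grid_w, dim))
w0 = weights.copy()                              # keep the random init
coords = np.array([(r, c) for r in range(grid_h)
                   for c in range(grid_w)], dtype=float)

X = rng.random((500, dim))                       # stand-in feature vectors
n_steps = 2000
for t in range(n_steps):
    x = X[rng.integers(len(X))]
    frac = t / n_steps
    lr = 0.5 * (1.0 - frac)                      # decaying learning rate
    sigma = 3.0 * (1.0 - frac) + 0.5             # decaying neighbourhood
    bmu = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
    dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
    h = np.exp(-dist2 / (2.0 * sigma ** 2))      # Gaussian neighbourhood
    weights += lr * h[:, None] * (x - weights)

def quantization_error(w):
    # mean distance from each sample to its best-matching unit
    return float(np.mean([np.sqrt(((w - x) ** 2).sum(axis=1)).min()
                          for x in X]))

qe0, qe = quantization_error(w0), quantization_error(weights)
```

The paper's contribution sits upstream of this loop: replacing the raw pixel vectors `X` with activations from pre-trained convolutional layers, so that map distances reflect semantic rather than pixel-level similarity.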

12:05
SOM-Based Anomaly Detection & Localization for Space Subsystems

ABSTRACT. The aim of this paper is to contribute to machine-learning technology that expands real-time and offline Integrated System Health Management capabilities for future deep-space exploration efforts. To this end, we have developed Anomaly Detection via Topological feature-Map (ADTM), which leverages a Self-Organizing Map (SOM)-based architecture to produce high-resolution clusters of nominal system behavior. What distinguishes ADTM from more common clustering techniques (e.g. k-means) is that it maps high-dimensional input vectors to a 2D grid while preserving the topology of the original dataset. The result is a ‘semantic map’ that serves as a powerful visualization tool for uncovering latent relationships between features of the incoming points. We successfully modeled and analyzed datasets from a NASA Ames Research Center Graywater Recycling System, which documents a real system fault. Our results show that ADTM effectively detects both known and unknown anomalies and identifies the correlated measurands from models trained using just nominal data.
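The detection idea reduces to: learn prototypes from nominal data only, then flag samples whose quantization error (distance to the nearest prototype) exceeds a threshold calibrated on nominal data. A hypothetical reduction, not ADTM's actual architecture, with synthetic 5-D "telemetry":

```python
import numpy as np

# Train a small codebook on nominal telemetry by competitive learning,
# calibrate a quantization-error threshold on the nominal data, and
# flag any sample exceeding it as anomalous.
rng = np.random.default_rng(2)
nominal = rng.normal(0.0, 1.0, size=(400, 5))    # nominal sensor vectors

protos = rng.normal(0.0, 1.0, size=(16, 5))      # small codebook
for _ in range(2000):
    x = nominal[rng.integers(len(nominal))]
    bmu = int(np.argmin(((protos - x) ** 2).sum(axis=1)))
    protos[bmu] += 0.1 * (x - protos[bmu])       # pull winner toward sample

def qerr(x):
    # distance to the nearest prototype
    return float(np.sqrt(((protos - x) ** 2).sum(axis=1)).min())

threshold = float(np.percentile([qerr(x) for x in nominal], 99))
fault = rng.normal(6.0, 1.0, size=5)             # off-nominal sample
is_anomaly = qerr(fault) > threshold
```

This matches the abstract's key property: the model is trained on nominal data alone, so both known and previously unseen fault types surface as large quantization errors.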

12:30
Using Hierarchical Clustering to Understand Behavior of 3D Printer Sensors

ABSTRACT. 3D Printing is one of the latest industrial revolutions, disrupting the manufacturing value chain massively, and it will deeply impact the new context of Industry 4.0 with new ways of manufacturing, delivering and maintaining products. The 3D Printing process is heavily reliant on the power of data, coming both from the physical OEM (original equipment manufacturer) machine and from print files. The Jet Fusion technology of 3D printing can take hours to produce a part, and defects occur on minuscule scales. Thus, it is a must to develop new data-driven techniques to actively prevent possible issues (both hardware failures and part defects). As this is a relatively new field, researchers are still actively studying various sensors and their impact on the printing process and its outcome. By appropriate profiling of printing sensors, one can reduce the post-processing effort to a minimum while ensuring the desired part quality. In this work, the authors study some specific sensors and their behavior while the machine is printing a job, to understand the relationships among them and how they govern the printing process overall. Attempts are also made to create print profiles by applying clustering techniques and visual inspection.
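Grouping sensor channels by correlation with agglomerative (single-linkage) clustering can be sketched as below; the sensor names and signals are invented stand-ins, not the authors' data:

```python
import numpy as np

# Cluster "sensor" time series by correlation distance: repeatedly merge
# the two clusters whose closest members are most correlated, until the
# desired number of groups remains (single linkage).
rng = np.random.default_rng(3)
t = np.linspace(0, 10, 200)
signals = {
    "bed_temp_a": np.sin(t) + 0.1 * rng.normal(size=t.size),
    "bed_temp_b": np.sin(t) + 0.1 * rng.normal(size=t.size),
    "fusing_lamp": np.cos(3 * t) + 0.1 * rng.normal(size=t.size),
    "lamp_mirror": np.cos(3 * t) + 0.1 * rng.normal(size=t.size),
}
names = list(signals)
X = np.array([signals[n] for n in names])
dist = 1.0 - np.abs(np.corrcoef(X))         # correlation distance matrix

clusters = [{i} for i in range(len(names))]
while len(clusters) > 2:                    # merge until two groups remain
    best, pair = np.inf, None
    for a in range(len(clusters)):
        for b in range(a + 1, len(clusters)):
            d = min(dist[i, j] for i in clusters[a] for j in clusters[b])
            if d < best:
                best, pair = d, (a, b)
    a, b = pair
    clusters[a] |= clusters[b]
    clusters.pop(b)

groups = sorted(sorted(names[i] for i in c) for c in clusters)
```

Strongly co-varying channels end up in the same group, which is the kind of sensor profile the abstract aims to build before inspecting the clusters visually.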

12:55
A walk through spectral bands: Using virtual reality to better visualize hyperspectral data

ABSTRACT. One of the basic challenges of understanding hyperspectral data arises from the fact that it is intrinsically 3-dimensional. A diverse range of algorithms have been developed to help visualize hyperspectral data trichromatically in two dimensions. In this paper we take a different approach and show how virtual reality provides a way of visualizing a hyperspectral data cube without collapsing the spectral dimension. Using several different real datasets, we show that it is straightforward to find signals of interest and make them more visible by exploiting the immersive, interactive environment of virtual reality. This enables signals to be seen which would be hard to detect if we were simply examining hyperspectral data band by band.

14:25-15:40

SPONSOR COMPANIES WORKSHOP and ROUND TABLE

15:40-17:20 Session 6

LVQ: Theory and Methods

15:40
Investigation of Activation Functions for Generalized Learning Vector Quantization

ABSTRACT. An appropriate choice of activation function plays an important role in the performance of (deep) multilayer perceptron (MLP) classification learning. These activations are applied to the perceptron units in the network. A powerful alternative is the family of prototype-based classification learning methods such as (generalized) learning vector quantization (GLVQ). These models also use activation functions, but there they are applied to the so-called classifier function instead. In this paper we investigate whether successful candidate activation functions for MLPs also perform well for GLVQ. For this purpose, we show that the GLVQ classifier function can also be interpreted as a generalized perceptron.
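The classifier function in question is mu(x) = (d_plus - d_minus) / (d_plus + d_minus), where d_plus is the distance to the closest prototype of the correct class and d_minus to the closest wrong-class prototype; an activation f is then applied to mu. A small sketch with identity and sigmoid activations (prototypes and sample are illustrative):

```python
import numpy as np

# GLVQ classifier function: negative when the correct-class prototype
# is closer (a correct classification), positive otherwise; always in [-1, 1].
def mu(x, protos, labels, y):
    d = np.linalg.norm(protos - x, axis=1)
    d_plus = d[labels == y].min()     # closest correct-class prototype
    d_minus = d[labels != y].min()    # closest wrong-class prototype
    return float((d_plus - d_minus) / (d_plus + d_minus))

protos = np.array([[0.0, 0.0], [4.0, 4.0]])
labels = np.array([0, 1])

m = mu(np.array([1.0, 1.0]), protos, labels, y=0)   # close to class 0, so m < 0

identity = m                                # identity activation
sigmoid = 1.0 / (1.0 + np.exp(-m))          # sigmoid activation of mu
```

Reading mu as the input to an activation is exactly what licenses the paper's comparison: any activation that works on a perceptron's net input can be tried on mu.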

16:05
Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks

ABSTRACT. Adversarial attacks and the development of (deep) neural networks robust against them are currently two widely researched topics. The robustness of Learning Vector Quantization (LVQ) models against adversarial attacks has however not yet been studied to the same extent. We therefore present an extensive evaluation of three LVQ models: Generalized LVQ, Generalized Matrix LVQ and Generalized Tangent LVQ. The evaluation suggests that both Generalized LVQ and Generalized Tangent LVQ have a high base robustness, on par with the current state-of-the-art in robust neural network methods. In contrast to this, Generalized Matrix LVQ shows a high susceptibility to adversarial attacks, scoring consistently behind all other models. Additionally, our numerical evaluation indicates that increasing the number of prototypes per class improves the robustness of the models.

16:30
Passive Concept Drift handling via Momentum Based Robust Soft Learning Vector Quantization

ABSTRACT. Concept drift is a change of the underlying data distribution that occurs especially with streaming data. Among other challenges in the field of streaming data classification, concept drift should be addressed to obtain reliable predictions. Robust Soft Learning Vector Quantization has already shown good performance in traditional settings and is modified in this work to handle streaming data. Further, momentum-based stochastic gradient descent is applied to tackle concept drift passively through increased learning capability. The proposed approach is tested against common benchmark algorithms on streaming data sets from the field.
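The momentum update itself is generic; a toy sketch on a quadratic objective (standing in for the RSLVQ cost, which is not reproduced here) shows why accumulated velocity helps parameters track a shifted optimum:

```python
import numpy as np

# Momentum-based SGD: the velocity term accumulates past gradients, so
# parameters keep moving in a consistent direction toward the optimum
# even when individual gradients are small or noisy.
def momentum_sgd(grad, w0, lr=0.1, beta=0.9, steps=100):
    w = np.array(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad(w)    # velocity update
        w = w - lr * v            # parameter update
    return w

target = np.array([2.0, -1.0])    # pretend the "concept" drifted here
w = momentum_sgd(lambda w: w - target, w0=[0.0, 0.0])
```

In the streaming setting the same update is applied to the RSLVQ prototypes, so after a drift the accumulated velocity carries them toward the new distribution faster than plain SGD would.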

16:55
Prototype-based classifiers in the presence of concept drift: A modelling framework

ABSTRACT. We present a modelling framework for the investigation of prototype-based classifiers in non-stationary environments. Specifically, we study Learning Vector Quantization (LVQ) systems trained from a stream of high-dimensional, clustered data. We consider standard winner-takes-all updates known as LVQ1. Statistical properties of the input data change on the time scale defined by the training process. We apply analytical methods borrowed from statistical physics which have been used earlier for the exact description of learning in stationary environments. The suggested framework facilitates the computation of learning curves in the presence of virtual and real concept drift. Here we focus on time-dependent class bias in the training data. First results demonstrate that, while basic LVQ algorithms are suitable for the training in non-stationary environments, weight decay as an explicit mechanism of forgetting does not improve the performance under the considered drift processes.
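The LVQ1 winner-takes-all rule analysed in this framework moves the winning prototype toward a sample of its own class and away from a sample of another class. A sketch on synthetic clustered data (all values illustrative):

```python
import numpy as np

# LVQ1: only the winning (closest) prototype is updated, attracted if
# its label matches the sample's label and repelled otherwise.
rng = np.random.default_rng(4)

def lvq1_step(protos, labels, x, y, lr=0.05):
    w = int(np.argmin(np.linalg.norm(protos - x, axis=1)))  # winner
    sign = 1.0 if labels[w] == y else -1.0
    protos[w] += sign * lr * (x - protos[w])
    return protos

protos = np.array([[-2.0, 0.0], [2.0, 0.0]])
labels = np.array([0, 1])
for _ in range(500):                     # stream of clustered samples
    y = int(rng.integers(2))
    mean = [-1.0, 0.0] if y == 0 else [1.0, 0.0]
    protos = lvq1_step(protos, labels, rng.normal(mean, 0.3), y)

test_pts = [(rng.normal([-1.0, 0.0] if y == 0 else [1.0, 0.0], 0.3), y)
            for y in (0, 1) for _ in range(50)]
acc = float(np.mean([
    labels[int(np.argmin(np.linalg.norm(protos - x, axis=1)))] == y
    for x, y in test_pts]))
```

Because the prototypes continually follow the incoming stream, a drift in the cluster means is tracked automatically, which is the property the paper's analytical framework quantifies.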

18:30-23:00

BCN Guided Tour & Gala Dinner