FLAIRS-37: THE 37TH INTERNATIONAL CONFERENCE OF THE FLORIDA ARTIFICIAL INTELLIGENCE RESEARCH SOCIETY
PROGRAM FOR TUESDAY, MAY 21ST

09:00-10:00 Session 9: Invited Talk
Location: Emerald D+E
09:00
Challenges and Opportunities of AI/ML and Autonomy for the Navy

ABSTRACT. Over the last dozen years, advances in machine learning have heralded and accelerated new generations of AI breakthroughs, with much of the innovation happening outside DoD and government. Since the US's ability to compete in the 21st century depends, in part, on US leadership in data, analytics, and AI, DoD's task is to adopt these innovations wherever they can add the most military value and to drive their diffusion across the enterprise. This talk will discuss the Navy's approach to AI adoption and its hierarchy of AI needs, and will emphasize the aspects of the Navy and its mission that shape the environment for, and demands on, desired AI solutions.

10:30-12:00 Session 10A: ANLP-3
Location: Emerald A
10:30
Analysis of word embeddings with Graph-Based Context Adaptation for Enhanced Word Vectors

ABSTRACT. In the aspect of information storage, text assumes a central role, necessitating streamlined and effective methods for swift retrieval. Among various text representations, the vector form stands out for its remarkable efficiency, especially when dealing with expansive datasets. This paper explores the intersection of data representation in vector form and the heightened performance and accuracy observed in Natural Language Processing (NLP) tasks, employing dynamic embedding models enriched with graph structures. The investigation delves into the merits of vectorized text in NLP, extending to the incorporation of graphs within vectors to enhance overall capabilities in information representation and retrieval. The study employs graph analysis to reveal word relatedness, utilizing a vertex embedding method for generating embeddings. Experimental deployment of this technique across diverse text corpora underscores its superiority over conventional word-embedding approaches. Furthermore, cutting-edge NLP techniques, such as contextual word embeddings from models like ELMo and GPT, are seamlessly integrated to augment text classification. Unlike traditional static embeddings, contextual embeddings consider the specific context in which a word appears, offering distinct representations for words across different contexts. This adaptability to surrounding context addresses limitations in capturing the richness of language semantics present in static word vectors. This research not only contributes valuable insights into advanced word representation methodologies but also sheds light on their implications for text classification tasks, especially within the context of dynamic embedding models. The holistic perspective provided in this paper aims to advance the understanding of optimal information representation and retrieval strategies in the dynamic landscape of NLP.

10:50
Assessing the Impact of Sequence Length Learning on Classification Tasks for Transformer Encoder Models

ABSTRACT. Classification algorithms using Transformer architectures can be affected by the sequence length learning problem whenever observations from different classes have a different length distribution. This problem causes models to use sequence length as a predictive feature instead of relying on important textual information. Although most public datasets are not affected by this problem, privately owned corpora for fields such as medicine and insurance may carry this data bias. The exploitation of this sequence length feature poses challenges throughout the value chain as these machine learning models can be used in critical applications. In this paper, we empirically expose this problem and present approaches to minimize its impacts.
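
To make the bias concrete, a minimal diagnostic in the spirit of the problem described above is to compare token-length distributions per class; the dataset, tokenizer, and toy examples below are illustrative assumptions rather than the authors' setup.

    from collections import defaultdict
    import statistics

    def length_stats_by_class(examples, tokenizer=str.split):
        """Report mean/stddev of token lengths per class to expose length/label correlation."""
        lengths = defaultdict(list)
        for text, label in examples:
            lengths[label].append(len(tokenizer(text)))
        return {label: (statistics.mean(vals), statistics.pstdev(vals))
                for label, vals in lengths.items()}

    # Large gaps between class means suggest a model may exploit length rather than
    # content; truncating or padding all classes to a common length is one mitigation.
    print(length_stats_by_class([("short claim", 0), ("a much longer insurance claim text", 1)]))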

11:10
Handling Empty Decomposition Methods in Hierarchical Planning

ABSTRACT. Hierarchical planning is a form of planning where tasks decompose into sub-tasks until primitive tasks (actions) are obtained. These decompositions might contain additional constraints, such as subtask ordering and state constraints. If a task is already fulfilled, it does not need to decompose into anything, but it may still require satisfaction of a particular state constraint (to check that the task is fulfilled). Such decomposition methods are called empty. Despite their practical usefulness, many hierarchical planning models do not fully support empty methods. This paper shows that two recently introduced hierarchical planning formalisms are equivalent with respect to empty methods. We also discuss the possibility of compiling such methods away. In particular, we show how to compile them away in totally ordered domains and discuss the difficulties in partially ordered domains.

11:30
Using Earley Parser for Verification of Totally Ordered Hierarchical Plans

ABSTRACT. Hierarchical planning extends classical planning by capturing the hierarchical structure of tasks. Plan verification is the problem of determining whether a given plan is valid according to that structure. As decomposition trees in totally ordered hierarchical planning domains resemble parse trees of context-free grammars, techniques for context-free grammars can also be exploited for hierarchical plans. Specifically, the Earley parser was proposed for checking whether a given word belongs to the language defined by a context-free grammar. This paper suggests using a modified Earley parser to verify totally ordered hierarchical plans.
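
As a rough illustration of the parsing view taken above, here is a minimal Earley recognizer applied to a toy, totally ordered decomposition treated as a context-free grammar; the grammar, task names, and plans are hypothetical, and the paper's modified parser is not reproduced.

    def earley_recognize(grammar, start, words):
        """Return True iff `words` is derivable from `start`.
        grammar: dict mapping each nonterminal to a list of right-hand sides (tuples)."""
        chart = [set() for _ in range(len(words) + 1)]
        for rhs in grammar[start]:
            chart[0].add((start, rhs, 0, 0))        # item = (head, body, dot, origin)
        for i in range(len(words) + 1):
            changed = True
            while changed:                          # run predictor/completer to a fixpoint
                changed = False
                for head, body, dot, origin in list(chart[i]):
                    if dot < len(body) and body[dot] in grammar:       # predictor
                        new = {(body[dot], rhs, 0, i) for rhs in grammar[body[dot]]}
                    elif dot == len(body):                             # completer
                        new = {(h, b, d + 1, o) for h, b, d, o in chart[origin]
                               if d < len(b) and b[d] == head}
                    else:
                        new = set()
                    if not new <= chart[i]:
                        chart[i] |= new
                        changed = True
            if i < len(words):                                         # scanner
                for head, body, dot, origin in chart[i]:
                    if dot < len(body) and body[dot] == words[i]:
                        chart[i + 1].add((head, body, dot + 1, origin))
        return any(h == start and d == len(b) and o == 0
                   for h, b, d, o in chart[len(words)])

    # Hypothetical totally ordered domain: "transport" decomposes into three actions.
    grammar = {"transport": [("load", "move", "unload")]}
    print(earley_recognize(grammar, "transport", ["load", "move", "unload"]))  # True
    print(earley_recognize(grammar, "transport", ["load", "unload"]))          # False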

10:30-12:00 Session 10B: NNDM-2
Location: Emerald B
10:30
Simultaneous count data feature selection and clustering using Multinomial Nested Dirichlet Mixture

ABSTRACT. The curse of dimensionality makes clustering count data a challenging task. This paper addresses the problem by adopting feature saliency as a feature selection method within the Multinomial Nested Dirichlet Mixture (MNDM). The MNDM is a generalization of the Dirichlet Compound Mixture (DCM), which suffers from several limitations. Model learning is accomplished through the expectation-maximization method. The Minimum Message Length criterion is used to simultaneously determine the best number of components in the mixture along with the updated selected features. The results show improved accuracy and convergence times, as the model aims to select the salient features and discount the non-salient, anomalous ones.

10:50
A hierarchical count data clustering based on Multinomial Nested Dirichlet Mixture using the Minorization-Maximization framework

ABSTRACT. Despite the wide acceptance of mixture models among researchers, obtaining good results with them involves several challenges. In this paper, we address two main challenges, namely, parameter estimation and data representation strategies. Expectation-Maximization (EM) is a widely used framework for parameter estimation. However, many factors complicate the process through intractable calculations of the posterior distribution and the parameters. Minorization-Maximization (MM) is an alternative framework that relaxes the complications and requirements of EM. This paper adopts the MM framework for the Multinomial Nested Dirichlet Mixture (MNDM) in a hierarchical manner. The hierarchical nature of the MNDM is exploited through a Hierarchical Feature Learning (HFL) framework, where the data representation is produced by the well-known Spatial Pyramid Matching method. Moreover, the number of components in the mixture is determined by the Minimum Message Length (MML) criterion. Therefore, this paper presents an HFL framework for the data representation of the MNDM, where learning is based on the MM framework and model selection is based on MML. The two proposed improvements are validated on three visual datasets using recall and precision performance metrics.

11:10
Mode Collapse Detection Strategies in Generative Adversarial Networks for Credit Card Fraud Detection

ABSTRACT. A Generative Adversarial Network (GAN) is an artificial intelligence model developed specifically to produce synthetic data that resembles real data by training a generative model and a discriminative model simultaneously using adversarial training. A GAN can be used extensively for generating replicated data; however, it suffers from several issues, one of which is mode collapse. Mode collapse takes place when the generator is unable to capture the complete range of diversity in the target data distribution, resulting in the production of limited and repeating variations of samples. Multiple metrics exist to quantify mode collapse in GANs, although no individual metric is capable of consistently providing accurate results. This research focuses on the critical need for accurate mode collapse detection techniques in GANs, specifically designed to strengthen the resilience of credit card fraud detection systems. In this work, we utilize a GAN to generate numerical data instead of image data. Our approach utilizes a wide range of measures, such as generator and discriminator loss, Wasserstein distance, precision, recall, and visualization tools, to provide a comprehensive framework for detecting mode collapse as early as possible. In addition, we introduce an alert mechanism that identifies possible mode collapse at an early stage, allowing for earlier intervention and modifications to the training process. We further propose guidelines for monitoring and analyzing generator and discriminator loss values to identify potential instances of mode collapse, helping developers optimize GAN training and consequently improve the quality of synthetic data.
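
A hedged sketch of the kind of alert mechanism described above (not the authors' exact framework): track a per-feature Wasserstein distance between real and generated batches and flag unusually low generator variance; the thresholds and the synthetic batches are illustrative assumptions.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def mode_collapse_alert(real_batch, fake_batch, dist_threshold=0.5, std_threshold=1e-3):
        """real_batch, fake_batch: 2-D arrays (samples x numeric features)."""
        real, fake = np.asarray(real_batch), np.asarray(fake_batch)
        distances = [wasserstein_distance(real[:, j], fake[:, j]) for j in range(real.shape[1])]
        low_variance = bool(np.any(fake.std(axis=0) < std_threshold))  # generator stuck on few modes
        poor_coverage = max(distances) > dist_threshold                # fake batch misses real spread
        return low_variance or poor_coverage, distances

    # Usage during training: call every few hundred steps and pause or adjust
    # training (e.g. learning rates) when an alert fires.
    rng = np.random.default_rng(0)
    alert, d = mode_collapse_alert(rng.normal(size=(256, 4)), rng.normal(size=(256, 4)))
    print(alert, np.round(d, 3))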

11:30
DeepGray: Malware Classification Using Grayscale Images with Deep Learning

ABSTRACT. In the ever-evolving landscape of cybersecurity, the threat posed by malware continues to loom large, necessitating innovative and robust approaches for its effective detection and classification. In this paper, we introduce a novel method, DeepGray, for multi-class malware classification utilizing malware images and the power of deep learning. Our dataset combines malware samples from the BODMAS dataset and benign samples from the DikeDataset. The methodology involves transforming executable files into a deep learning-friendly format by converting them into grayscale images while preserving essential data characteristics. Subsequently, Principal Component Analysis (PCA) is applied to distill the most significant features. The study harnesses the power of deep learning and transfer learning, utilizing established neural network architectures such as VGG16, InceptionV3, EfficientNetV2-B0, and Vision Transformers (ViT) for malware classification. Experimental results demonstrate the effectiveness of the proposed method in accurately classifying malware.
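
A minimal sketch of the bytes-to-grayscale conversion step described above; the fixed row width, the target image size, and the file path are assumptions for illustration rather than DeepGray's exact preprocessing.

    import numpy as np
    from PIL import Image

    def exe_to_grayscale(path, width=64, size=(224, 224)):
        """Read raw bytes, pad them into a rectangle, and return a grayscale image."""
        data = np.frombuffer(open(path, "rb").read(), dtype=np.uint8)
        rows = -(-len(data) // width)                      # ceiling division
        padded = np.zeros(rows * width, dtype=np.uint8)
        padded[:len(data)] = data
        image = Image.fromarray(padded.reshape(rows, width), mode="L")
        return image.resize(size)                          # common input size for CNN backbones

    # Hypothetical usage: exe_to_grayscale("sample.exe").save("sample.png")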

10:30-12:00 Session 10C: Main-7
Location: Emerald D
10:30
Toward Automated Knowledge Discovery in Case-Based Reasoning

ABSTRACT. Automated Case Elicitation (ACE) enables case-based reasoning (CBR) systems to automatically acquire knowledge through real-time exploration and interaction with environments. CBR is an explainable AI methodology, where decisions are based on previous encounters. ACE combined with CBR continues learning as it is being deployed, and produces specific cases that can be reviewed by humans, unlike pretrained large language models (LLMs) that learn by training offline on prior data. ACE and CBR may be useful methods to gather training data for use with generative AI, or to help such models adapt on the fly. This research demonstrates ACE's potential by applying it to chess and conducting extensive experiments against Stockfish, the world's highest-rated chess engine. An ACE agent was developed that combines random exploration with shallow alpha-beta search for novel game states. Results over 1000+ games showed the ACE player defeated Stockfish in nearly 10% of games, a notable achievement given Stockfish's extreme strength. Notably, the ACE agent required only 0.1 seconds per game compared to an average of 8 minutes for Stockfish, while still gradually improving its win rate through accrued experience. Detailed analyses revealed how the relaxation of ACE's case matching criteria, along with selective retention of useful cases, enabled accumulation of strategic chess knowledge. The research provides valuable insights into ACE's capacity for knowledge discovery in complex, adversarial domains. It also lays groundwork for integrating ACE, an unsupervised CBR learner, with modern deep learning techniques like neural networks and large language models to combine the strengths of symbolic and subsymbolic AI. By demonstrating ACE's ability to extract strategic knowledge against world-class opponents, this work highlights its potential for impact across gaming, autonomous systems, and other complex problem-solving domains.
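
To illustrate the relaxed case matching mentioned above, here is a toy retrieval sketch: return the stored case most similar to the current state, or signal a fallback to search when nothing is close enough. The similarity measure, threshold, and state encoding are assumptions, not the ACE agent's actual criteria.

    def retrieve_case(case_base, state, similarity, threshold=0.8):
        """Return the most similar stored case, or None to fall back to alpha-beta search."""
        if not case_base:
            return None
        best = max(case_base, key=lambda case: similarity(case["state"], state))
        return best if similarity(best["state"], state) >= threshold else None

    # Toy usage: states as sets of piece placements, similarity = Jaccard overlap.
    jaccard = lambda a, b: len(a & b) / len(a | b)
    cases = [{"state": {"Ke1", "Qd1", "ke8"}, "action": "Qd1-d8"}]
    print(retrieve_case(cases, {"Ke1", "Qd1", "ke8", "pa7"}, jaccard, threshold=0.5))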

10:50
Preference Reasoning Under Partial Alternatives

ABSTRACT. Selecting a single alternative among many is a key cognitive task. Many formal approaches for solving this problem have been explored, but these approaches assume having, or eliciting, complete preference information. As a natural extension, work has also been done on situations where complete preference information is unknown. What has not been studied is the effect of incomplete information with respect to the alternatives an agent may select from. This work focuses on the computational problems that arise when preference information is known but alternatives are only partially specified, such as when one searches online classified ads. We both define these problems and establish some general computational complexity results. While the complexity of the defined problems is not tightly bounded, we provide a case study which demonstrates how partial alternatives affect different preference representations differently.

11:10
Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination

ABSTRACT. Autonomous agents engineered for operating in real-world environments frequently encounter undesirable outcomes or Negative Side Effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary assigned tasks optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a Lexicographic Decentralized Markov Decision Process. In this framework, we assume independence of rewards and transitions with respect to the primary assigned tasks while recognizing that addressing negative side effects creates a form of dependence within this context. Furthermore, we focus on minimizing NSEs arising from interactions between a limited subset of agents in the system. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks (up to some given slack). Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.
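
A small sketch of lexicographic action selection with slack, in the spirit of the approach above: keep the actions whose primary-task value is within a slack of the best, then pick the one with the lowest expected side-effect penalty. The Q-tables and values below are hypothetical.

    def lexicographic_action(state, actions, q_primary, q_nse, slack=0.1):
        """Pick the least-NSE action among those near-optimal for the primary task."""
        best = max(q_primary[state][a] for a in actions)
        admissible = [a for a in actions if q_primary[state][a] >= best - slack]
        return min(admissible, key=lambda a: q_nse[state][a])

    # Toy usage with hand-written values: "right" is chosen because it is within the
    # slack on the primary task and has a lower side-effect penalty than "left".
    q_primary = {"s": {"left": 1.00, "right": 0.95, "stay": 0.20}}
    q_nse     = {"s": {"left": 0.80, "right": 0.10, "stay": 0.00}}
    print(lexicographic_action("s", ["left", "right", "stay"], q_primary, q_nse))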

11:30
On Clustering in Qualitative Spatial and Temporal Reasoning

ABSTRACT. Our understanding of the world is intricately linked to both the spatial arrangement of objects and the timing of events. Knowledge-dependent systems employ mechanisms like Qualitative Spatial and Temporal Reasoning (QSTR) to effectively process and interpret this information. This article explores the application of QSTR to data clustering and offers several contributions: introducing a formal clustering framework for qualitative data, implementing a SAT encoding to compute a clustering, introducing two appropriate distance measures for Qualitative Relation Networks, and experimentally validating the framework through adaptations of the k-means and Agglomerative Hierarchical Clustering algorithms.
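
As a rough illustration of clustering with a domain-specific distance, the sketch below runs agglomerative clustering over a precomputed pairwise distance matrix; the toy matrix merely stands in for distances between Qualitative Relation Networks and is not one of the paper's measures.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    # Hypothetical pairwise distances between four qualitative relation networks.
    D = np.array([[0.00, 0.20, 0.90, 0.80],
                  [0.20, 0.00, 0.85, 0.90],
                  [0.90, 0.85, 0.00, 0.10],
                  [0.80, 0.90, 0.10, 0.00]])

    Z = linkage(squareform(D), method="average")    # linkage expects the condensed form
    print(fcluster(Z, t=2, criterion="maxclust"))   # two clusters: networks {0,1} and {2,3}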

10:30-12:00 Session 10D: Main-8
Location: Emerald E
10:30
A Machine Learning Pipeline for Emotion Recognition based on Brain Topographic Maps Derived from Electroencephalogram Signals

ABSTRACT. Emotion recognition is an increasingly relevant field due to its direct implications for various sectors of society. The area aims to enhance the understanding of how emotions influence human behavior. Because emotions can manifest non-verbally, analyzing brain activity through electroencephalogram signals becomes a viable approach. In this scenario, machine learning applications prove promising due to the complexity of recognizing emotions from electrical signal data from the brain. The case study focuses on DEAP, a recognized dataset constructed through electroencephalography experiments in which subjects were exposed to musical and visual stimuli. The main objective of this work is to present a pipeline for the classification of emotions based on images of topographic maps generated with the EEGLAB tool from electroencephalogram signals. Additionally, the contributions of this work include a structured dataset created by mapping temporal, spatial, and frequency data derived from the topographic images, and models for predicting the dimensional emotions of arousal and valence based on the new dataset. Results demonstrate accuracies of 85.46% and 85.05% for the classification of low/high arousal and valence emotions, respectively.

10:50
Multiclass Classification of Solar Flares in Imbalanced Data Using Ensemble Learning and Sampling Methods

ABSTRACT. Solar flares are intense bursts of radiation across the electromagnetic spectrum on the surface of the Sun. They are categorized into four classes: B, C, M, and X, depending on their intensity, with X-class flares being the strongest. Being able to predict a flare's class before its occurrence is critical for anticipating the severity of its impact on Earth. We used the Space-weather HMI Active Region Patches (SHARP) parameters available from Stanford's Joint Science Operations Center (JSOC) to train machine learning models to classify these flares. However, predicting the flare class is a challenging task, as it is a multiclass classification problem involving imbalanced data due to the small number of X-class flares in a solar cycle. We propose a new method that uses a combination of random undersampling and the synthetic minority oversampling technique (SMOTE) to combat the imbalanced data problem. Furthermore, we develop an ensemble algorithm that uses nine classifiers as base learners and logistic regression as the meta-learner. Experimental results show that the proposed method is effective in predicting solar flares, especially the most intense X-class flares, within the next 24 hours.
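
A hedged sketch of the resampling-plus-stacking idea described above, using standard imbalanced-learn and scikit-learn components; the two base learners (the paper uses nine), the sampling targets, and the synthetic data are illustrative assumptions.

    from imblearn.pipeline import Pipeline
    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import RandomUnderSampler
    from sklearn.datasets import make_classification
    from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for SHARP features with a rare, "X-class"-like minority class.
    X, y = make_classification(n_samples=2000, n_classes=4, n_informative=8,
                               weights=[0.55, 0.30, 0.12, 0.03], random_state=0)

    stack = StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression(max_iter=1000))       # logistic regression meta-learner

    model = Pipeline([("under", RandomUnderSampler(sampling_strategy={0: 600}, random_state=0)),
                      ("smote", SMOTE(random_state=0)),          # oversample remaining minority classes
                      ("stack", stack)])
    model.fit(X, y)
    print(model.predict(X[:5]))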

11:10
Estimate Undergraduate Student Enrollment in Courses by Re-purposing Recommendation Tools

ABSTRACT. Resource allocation is a very challenging task in higher education. To prepare for every new semester, academic administration faces various challenges in allocating instructors, classrooms, sessions, teaching assistants, and laboratories for different possible courses, considering students' needs and the limited resources available. Predicting the number of students enrolled in a specific class in the next semester can help with this task. To address this problem, we investigate various machine learning models (direct and indirect methods) using different features of past students' course enrollment data to predict the number of enrollments in possible courses in the upcoming semester. In this work, we propose to use a course recommendation model as a first step to generate suggestions for students and then use those suggestions to estimate student enrollment in the courses of the next semester. We test four course recommendation models, two time series models, three regression models, and three baseline approaches for course enrollment prediction. The experimental evaluation demonstrates that our proposed approach performs similarly to or better than other competing approaches for predicting student enrollment in courses.
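
The aggregation step described above can be sketched very simply: turn each student's top-k recommended courses into per-course counts. The recommender itself is abstracted behind a hypothetical `recommend` callable, and the toy data is made up.

    from collections import Counter

    def estimate_enrollment(students, recommend, top_k=5):
        """Count how often each course appears in the students' top-k recommendations."""
        counts = Counter()
        for student in students:
            counts.update(recommend(student)[:top_k])
        return counts

    # Toy usage with a hand-written recommender: CS101 is expected to enroll 2 students.
    recs = {"s1": ["CS101", "CS230"], "s2": ["CS101", "MATH210"]}
    print(estimate_enrollment(recs, lambda s: recs[s]))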

11:30
Attention-Driven Multi-Agent Reinforcement Learning: Enhancing Decisions with Expertise-Informed Tasks

ABSTRACT. In this paper, we introduce an alternative approach to en-hancing Multi-Agent Reinforcement Learning (MARL) through the integration of domain knowledge and attention-based policy mechanisms. Our methodology focuses on the incorporation of domain-specific expertise into the learning process, facilitating a more efficient and targeted develop-ment of collaborative behaviors in agent interactions. This approach aims to reduce the complexity and learning over-head typically associated with MARL by enabling agents to concentrate on essential aspects of complex tasks, thus op-timizing the learning curve. The utilization of attention mechanisms plays a key role in our model. It allows for the effective processing of dynamic environmental data and nu-anced agent interactions, leading to more refined decision-making. Applied in standard MARL scenarios such as the Stanford Intelligent Systems Laboratory (SISL) Pursuit and Multi-Particle Environments (MPE) Simple Spread, our method has shown to improve both learning efficiency and the effectiveness of collaborative behaviors. The results in-dicate that our attention-based approach could offer a viable alternative to improve the efficiency of MARL training pro-cess, integrating domain-specific knowledge at the action level.