FLAIRS-37: THE 37TH INTERNATIONAL CONFERENCE OF THE FLORIDA ARTIFICIAL INTELLIGENCE RESEARCH SOCIETY
PROGRAM FOR TUESDAY, MAY 21ST

09:00-10:00 Session 9: Invited Talk
Location: Emerald D+E
09:00
Challenges and Opportunities of AI/ML and Autonomy for the Navy

ABSTRACT. Over the last dozen years, advances in machine learning have heralded and accelerated new generations of AI breakthroughs, with much of the innovation happening outside DoD and government. Since the US's ability to compete in the 21st century depends, in part, on US leadership in data, analytics, and AI, DoD's task is to adopt these innovations wherever they can add the most military value and to drive their diffusion across the enterprise. This talk will discuss the Navy's approach to AI adoption and its hierarchy of AI needs, and will emphasize the aspects of the Navy and its mission that shape the environment for, and demands on, desired AI solutions.

10:30-12:00 Session 10A: ANLP-3
Location: Emerald A
10:30
Analysis of word embeddings with Graph-Based Context Adaptation for Enhanced Word Vectors

ABSTRACT. In the aspect of information storage, text assumes a central role, necessitating streamlined and effective methods for swift retrieval. Among various text representations, the vector form stands out for its remarkable efficiency, especially when dealing with expansive datasets. This paper explores the intersection of data representation in vector form and the heightened performance and accuracy observed in Natural Language Processing (NLP) tasks, employing dynamic embedding models enriched with graph structures. The investigation delves into the merits of vectorized text in NLP, extending to the incorporation of graphs within vectors to enhance overall capabilities in information representation and retrieval. The study employs graph analysis to reveal word relatedness, utilizing a vertex embedding method for generating embeddings. Experimental deployment of this technique across diverse text corpora underscores its superiority over conventional word-embedding approaches. Furthermore, cutting-edge NLP techniques, such as contextual word embeddings from models like ELMo and GPT, are seamlessly integrated to augment text classification. Unlike traditional static embeddings, contextual embeddings consider the specific context in which a word appears, offering distinct representations for words across different contexts. This adaptability to surrounding context addresses limitations in capturing the richness of language semantics present in static word vectors. This research not only contributes valuable insights into advanced word representation methodologies but also sheds light on their implications for text classification tasks, especially within the context of dynamic embedding models. The holistic perspective provided in this paper aims to advance the understanding of optimal information representation and retrieval strategies in the dynamic landscape of NLP.

10:50
Assessing the Impact of Sequence Length Learning on Classification Tasks for Transformer Encoder Models

ABSTRACT. Classification algorithms using Transformer architectures can be affected by the sequence length learning problem whenever observations from different classes have a different length distribution. This problem causes models to use sequence length as a predictive feature instead of relying on important textual information. Although most public datasets are not affected by this problem, privately owned corpora for fields such as medicine and insurance may carry this data bias. The exploitation of this sequence length feature poses challenges throughout the value chain as these machine learning models can be used in critical applications. In this paper, we empirically expose this problem and present approaches to minimize its impacts.
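
To make the bias concrete, a minimal diagnostic in the spirit of the problem described above is to compare token-length distributions per class; the dataset, tokenizer, and toy examples below are illustrative assumptions rather than the authors' setup.

    from collections import defaultdict
    import statistics

    def length_stats_by_class(examples, tokenizer=str.split):
        """Report mean/stddev of token lengths per class to expose length/label correlation."""
        lengths = defaultdict(list)
        for text, label in examples:
            lengths[label].append(len(tokenizer(text)))
        return {label: (statistics.mean(vals), statistics.pstdev(vals))
                for label, vals in lengths.items()}

    # Large gaps between class means suggest a model may exploit length rather than
    # content; truncating or padding all classes to a common length is one mitigation.
    print(length_stats_by_class([("short claim", 0), ("a much longer insurance claim text", 1)]))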

11:10
Handling Empty Decomposition Methods in Hierarchical Planning

ABSTRACT. Hierarchical planning is a form of planning where tasks decompose into sub-tasks until primitive tasks (actions) are obtained. These decompositions might contain additional constraints, such as subtask ordering and state constraints. If a task is already fulfilled, it does not need to decompose into anything, but it may still require satisfaction of a particular state constraint (to check that the task is fulfilled). Such decomposition methods are called empty. Despite their practical usefulness, many hierarchical planning models do not fully support empty methods. This paper shows that two recently introduced hierarchical planning formalisms are equivalent with respect to empty methods. We also discuss the possibility of compiling such methods away. In particular, we show how to compile them away in totally ordered domains and discuss the difficulties in partially ordered domains.

11:30
Using Earley Parser for Verification of Totally Ordered Hierarchical Plans

ABSTRACT. Hierarchical planning extends classical planning by capturing the hierarchical structure of tasks. Plan verification is the problem of determining whether a given plan is valid according to that structure. As decomposition trees in totally ordered hierarchical planning domains resemble parse trees of context-free grammars, techniques for context-free grammars can also be exploited for hierarchical plans. Specifically, the Earley parser was proposed for checking whether a given word belongs to the language defined by a context-free grammar. This paper suggests using a modified Earley parser to verify totally ordered hierarchical plans.
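
As a rough illustration of the parsing view taken above, here is a minimal Earley recognizer applied to a toy, totally ordered decomposition treated as a context-free grammar; the grammar, task names, and plans are hypothetical, and the paper's modified parser is not reproduced.

    def earley_recognize(grammar, start, words):
        """Return True iff `words` is derivable from `start`.
        grammar: dict mapping each nonterminal to a list of right-hand sides (tuples)."""
        chart = [set() for _ in range(len(words) + 1)]
        for rhs in grammar[start]:
            chart[0].add((start, rhs, 0, 0))        # item = (head, body, dot, origin)
        for i in range(len(words) + 1):
            changed = True
            while changed:                          # run predictor/completer to a fixpoint
                changed = False
                for head, body, dot, origin in list(chart[i]):
                    if dot < len(body) and body[dot] in grammar:       # predictor
                        new = {(body[dot], rhs, 0, i) for rhs in grammar[body[dot]]}
                    elif dot == len(body):                             # completer
                        new = {(h, b, d + 1, o) for h, b, d, o in chart[origin]
                               if d < len(b) and b[d] == head}
                    else:
                        new = set()
                    if not new <= chart[i]:
                        chart[i] |= new
                        changed = True
            if i < len(words):                                         # scanner
                for head, body, dot, origin in chart[i]:
                    if dot < len(body) and body[dot] == words[i]:
                        chart[i + 1].add((head, body, dot + 1, origin))
        return any(h == start and d == len(b) and o == 0
                   for h, b, d, o in chart[len(words)])

    # Hypothetical totally ordered domain: "transport" decomposes into three actions.
    grammar = {"transport": [("load", "move", "unload")]}
    print(earley_recognize(grammar, "transport", ["load", "move", "unload"]))  # True
    print(earley_recognize(grammar, "transport", ["load", "unload"]))          # False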

10:30-12:00 Session 10B: NNDM-2
Location: Emerald B
10:30
Simultaneous count data feature selection and clustering using Multinomial Nested Dirichlet Mixture

ABSTRACT. The curse of dimensionality makes clustering count data a challenging task. This paper addresses the problem by adopting feature saliency as a feature selection method within the Multinomial Nested Dirichlet Mixture (MNDM). The MNDM is a generalization of the Dirichlet Compound Mixture (DCM), which suffers from several limitations. Model learning is accomplished through the expectation-maximization method. The Minimum Message Length criterion is used to simultaneously determine the best number of components in the mixture along with the updated selected features. The results show improved accuracy and convergence times, as the model aims to select the salient features and discount the non-salient, anomalous ones.

10:50
A hierarchical count data clustering based on Multinomial Nested Dirichlet Mixture using the Minorization-Maximization framework

ABSTRACT. Despite the wide acceptance of mixture models among researchers, obtaining good results with them involves several challenges. In this paper, we address two main challenges, namely, parameter estimation and data representation strategies. Expectation-Maximization (EM) is a widely used framework for parameter estimation. However, many factors complicate the process through intractable calculations of the posterior distribution and the parameters. Minorization-Maximization (MM) is an alternative framework that relaxes the complications and requirements of EM. This paper adopts the MM framework for the Multinomial Nested Dirichlet Mixture (MNDM) in a hierarchical manner. The hierarchical nature of the MNDM is exploited through a Hierarchical Feature Learning (HFL) framework, where the data representation is produced by the well-known Spatial Pyramid Matching method. Moreover, the number of components in the mixture is determined by the Minimum Message Length (MML) criterion. Therefore, this paper presents an HFL framework for the data representation of the MNDM, where learning is based on the MM framework and model selection is based on MML. The two proposed improvements are validated on three visual datasets using recall and precision performance metrics.

11:10
Mode Collapse Detection Strategies in Generative Adversarial Networks for Credit Card Fraud Detection

ABSTRACT. A Generative Adversarial Network (GAN) is an artificial intelligence model developed specifically to produce synthetic data that resembles real data by training a generative model and a discriminative model simultaneously using adversarial training. A GAN can be used extensively for generating replicated data; however, it suffers from several issues, one of which is mode collapse. Mode collapse takes place when the generator is unable to capture the complete range of diversity in the target data distribution, resulting in the production of limited and repeating variations of samples. Multiple metrics exist to quantify mode collapse in GANs, although no individual metric is capable of consistently providing accurate results. This research focuses on the critical need for accurate mode collapse detection techniques in GANs, specifically designed to strengthen the resilience of credit card fraud detection systems. In this work, we utilize a GAN to generate numerical data instead of image data. Our approach utilizes a wide range of measures, such as generator and discriminator loss, Wasserstein distance, precision, recall, and visualization tools, to provide a comprehensive framework for detecting mode collapse as early as possible. In addition, we introduce an alert mechanism that identifies possible mode collapse at an early stage, allowing for earlier intervention and modifications to the training process. We further propose guidelines for monitoring and analyzing generator and discriminator loss values to identify potential instances of mode collapse, helping developers optimize GAN training and consequently improve the quality of synthetic data.
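
A hedged sketch of the kind of alert mechanism described above (not the authors' exact framework): track a per-feature Wasserstein distance between real and generated batches and flag unusually low generator variance; the thresholds and the synthetic batches are illustrative assumptions.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def mode_collapse_alert(real_batch, fake_batch, dist_threshold=0.5, std_threshold=1e-3):
        """real_batch, fake_batch: 2-D arrays (samples x numeric features)."""
        real, fake = np.asarray(real_batch), np.asarray(fake_batch)
        distances = [wasserstein_distance(real[:, j], fake[:, j]) for j in range(real.shape[1])]
        low_variance = bool(np.any(fake.std(axis=0) < std_threshold))  # generator stuck on few modes
        poor_coverage = max(distances) > dist_threshold                # fake batch misses real spread
        return low_variance or poor_coverage, distances

    # Usage during training: call every few hundred steps and pause or adjust
    # training (e.g. learning rates) when an alert fires.
    rng = np.random.default_rng(0)
    alert, d = mode_collapse_alert(rng.normal(size=(256, 4)), rng.normal(size=(256, 4)))
    print(alert, np.round(d, 3))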

11:30
DeepGray: Malware Classification Using Grayscale Images with Deep Learning

ABSTRACT. In the ever-evolving landscape of cybersecurity, the threat posed by malware continues to loom large, necessitating innovative and robust approaches for its effective detection and classification. In this paper, we introduce a novel method, DeepGray, for multi-class malware classification utilizing malware images and the power of deep learning. Our dataset combines malware samples from the BODMAS dataset and benign samples from the DikeDataset. The methodology involves transforming executable files into a deep learning-friendly format by converting them into grayscale images while preserving essential data characteristics. Subsequently, Principal Component Analysis (PCA) is applied to distill the most significant features. The study harnesses the power of deep learning and transfer learning, utilizing established neural network architectures such as VGG16, InceptionV3, EfficientNetV2-B0, and Vision Transformers (ViT) for malware classification. Experimental results demonstrate the effectiveness of the proposed method in accurately classifying malware.
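
A minimal sketch of the bytes-to-grayscale conversion step described above; the fixed row width, the target image size, and the file path are assumptions for illustration rather than DeepGray's exact preprocessing.

    import numpy as np
    from PIL import Image

    def exe_to_grayscale(path, width=64, size=(224, 224)):
        """Read raw bytes, pad them into a rectangle, and return a grayscale image."""
        data = np.frombuffer(open(path, "rb").read(), dtype=np.uint8)
        rows = -(-len(data) // width)                      # ceiling division
        padded = np.zeros(rows * width, dtype=np.uint8)
        padded[:len(data)] = data
        image = Image.fromarray(padded.reshape(rows, width), mode="L")
        return image.resize(size)                          # common input size for CNN backbones

    # Hypothetical usage: exe_to_grayscale("sample.exe").save("sample.png")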

10:30-12:00 Session 10C: Main-7
Location: Emerald D
10:30
Toward Automated Knowledge Discovery in Case-Based Reasoning

ABSTRACT. Automated Case Elicitation (ACE) enables case-based reasoning (CBR) systems to automatically acquire knowledge through real-time exploration and interaction with environments. CBR is an explainable AI methodology, where decisions are based on previous encounters. ACE combined with CBR continues learning as it is being deployed, and produces specific cases that can be reviewed by humans, unlike pretrained large language models (LLMs) that learn by training offline on prior data. ACE and CBR may be useful methods to gather training data for use with generative AI, or to help such models adapt on the fly. This research demonstrates ACE's potential by applying it to chess and conducting extensive experiments against Stockfish, the world's highest-rated chess engine. An ACE agent was developed that combines random exploration with shallow alpha-beta search for novel game states. Results over 1000+ games showed the ACE player defeated Stockfish in nearly 10% of games, a notable achievement given Stockfish's extreme strength. Notably, the ACE agent required only 0.1 seconds per game compared to an average of 8 minutes for Stockfish, while still gradually improving its win rate through accrued experience. Detailed analyses revealed how the relaxation of ACE's case matching criteria, along with selective retention of useful cases, enabled accumulation of strategic chess knowledge. The research provides valuable insights into ACE's capacity for knowledge discovery in complex, adversarial domains. It also lays groundwork for integrating ACE, an unsupervised CBR learner, with modern deep learning techniques like neural networks and large language models to combine the strengths of symbolic and subsymbolic AI. By demonstrating ACE's ability to extract strategic knowledge against world-class opponents, this work highlights its potential for impact across gaming, autonomous systems, and other complex problem-solving domains.
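
To illustrate the relaxed case matching mentioned above, here is a toy retrieval sketch: return the stored case most similar to the current state, or signal a fallback to search when nothing is close enough. The similarity measure, threshold, and state encoding are assumptions, not the ACE agent's actual criteria.

    def retrieve_case(case_base, state, similarity, threshold=0.8):
        """Return the most similar stored case, or None to fall back to alpha-beta search."""
        if not case_base:
            return None
        best = max(case_base, key=lambda case: similarity(case["state"], state))
        return best if similarity(best["state"], state) >= threshold else None

    # Toy usage: states as sets of piece placements, similarity = Jaccard overlap.
    jaccard = lambda a, b: len(a & b) / len(a | b)
    cases = [{"state": {"Ke1", "Qd1", "ke8"}, "action": "Qd1-d8"}]
    print(retrieve_case(cases, {"Ke1", "Qd1", "ke8", "pa7"}, jaccard, threshold=0.5))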

10:50
Preference Reasoning Under Partial Alternatives

ABSTRACT. Selecting a single alternative among many is a key cognitive task. Many formal approaches for solving this problem have been explored, but these approaches assume having, or eliciting, complete preference information. As a natural extension, work has also been done on situations where complete preference information is unknown. What has not been studied is the effect of incomplete information with respect to the alternatives an agent may select from. This work focuses on the computational problems that arise when preference information is known but alternatives are only partially specified, such as when one searches online classified ads. We both define these problems and establish some general computational complexity results. While the complexity of the defined problems is not tightly bounded, we provide a case study which demonstrates how partial alternatives affect different preference representations differently.

11:10
Minimizing Negative Side Effects in Cooperative Multi-Agent Systems Using Distributed Coordination

ABSTRACT. Autonomous agents engineered for operating in real-world environments frequently encounter undesirable outcomes or Negative Side Effects (NSEs) when working collaboratively alongside other agents. Even when agents can execute their primary assigned tasks optimally when operating in isolation, their training may not account for potential negative interactions that arise in the presence of other agents. We frame the challenge of minimizing NSEs as a Lexicographic Decentralized Markov Decision Process. In this framework, we assume independence of rewards and transitions with respect to the primary assigned tasks while recognizing that addressing negative side effects creates a form of dependence within this context. Furthermore, we focus on minimizing NSEs arising from interactions between a limited subset of agents in the system. We present a lexicographic Q-learning approach to mitigate the NSEs using human feedback models while maintaining near-optimality with respect to the assigned tasks (up to some given slack). Our empirical evaluation across two domains demonstrates that our collaborative approach effectively mitigates NSEs, outperforming non-collaborative methods.
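
A small sketch of lexicographic action selection with slack, in the spirit of the approach above: keep the actions whose primary-task value is within a slack of the best, then pick the one with the lowest expected side-effect penalty. The Q-tables and values below are hypothetical.

    def lexicographic_action(state, actions, q_primary, q_nse, slack=0.1):
        """Pick the least-NSE action among those near-optimal for the primary task."""
        best = max(q_primary[state][a] for a in actions)
        admissible = [a for a in actions if q_primary[state][a] >= best - slack]
        return min(admissible, key=lambda a: q_nse[state][a])

    # Toy usage with hand-written values: "right" is chosen because it is within the
    # slack on the primary task and has a lower side-effect penalty than "left".
    q_primary = {"s": {"left": 1.00, "right": 0.95, "stay": 0.20}}
    q_nse     = {"s": {"left": 0.80, "right": 0.10, "stay": 0.00}}
    print(lexicographic_action("s", ["left", "right", "stay"], q_primary, q_nse))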

11:30
On Clustering in Qualitative Spatial and Temporal Reasoning

ABSTRACT. Our understanding of the world is intricately linked to both the spatial arrangement of objects and the timing of events. Knowledge-dependent systems employ mechanisms like Qualitative Spatial and Temporal Reasoning (QSTR) to effectively process and interpret this information. This article explores the application of QSTR to data clustering and offers several contributions: introducing a formal clustering framework for qualitative data, implementing a SAT encoding to compute a clustering, introducing two appropriate distance measures for Qualitative Relation Networks, and experimentally validating the framework through adaptations of the k-means and Agglomerative Hierarchical Clustering algorithms.
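
As a rough illustration of clustering with a domain-specific distance, the sketch below runs agglomerative clustering over a precomputed pairwise distance matrix; the toy matrix merely stands in for distances between Qualitative Relation Networks and is not one of the paper's measures.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    # Hypothetical pairwise distances between four qualitative relation networks.
    D = np.array([[0.00, 0.20, 0.90, 0.80],
                  [0.20, 0.00, 0.85, 0.90],
                  [0.90, 0.85, 0.00, 0.10],
                  [0.80, 0.90, 0.10, 0.00]])

    Z = linkage(squareform(D), method="average")    # linkage expects the condensed form
    print(fcluster(Z, t=2, criterion="maxclust"))   # two clusters: networks {0,1} and {2,3}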

10:30-12:00 Session 10D: Main-8
Location: Emerald E
10:30
A Machine Learning Pipeline for Emotion Recognition based on Brain Topographic Maps Derived from Electroencephalogram Signals

ABSTRACT. Emotion recognition is an increasingly relevant field due to its direct implications for various sectors of society. The area aims to enhance the understanding of how emotions influence human behavior. Because emotions can manifest non-verbally, analyzing brain activity through electroencephalogram signals becomes a viable approach. In this scenario, machine learning applications prove promising due to the complexity of recognizing emotions from electrical signal data from the brain. The case study focuses on DEAP, a recognized dataset constructed through electroencephalography experiments in which subjects were exposed to musical and visual stimuli. The main objective of this work is to present a pipeline for the classification of emotions based on images of topographic maps generated with the EEGLAB tool from electroencephalogram signals. Additionally, the contributions of this work include a structured dataset created by mapping temporal, spatial, and frequency data derived from the topographic images, and models for predicting the dimensional emotions of arousal and valence based on the new dataset. Results demonstrate accuracies of 85.46% and 85.05% for the classification of low/high arousal and valence emotions, respectively.

10:50
Multiclass Classification of Solar Flares in Imbalanced Data Using Ensemble Learning and Sampling Methods

ABSTRACT. Solar flares are intense bursts of radiation across the electromagnetic spectrum on the surface of the Sun. They are categorized into four classes: B, C, M, and X, depending on their intensity, with X-class flares being the strongest. Being able to predict a flare's class before its occurrence is critical for anticipating the severity of its impact on Earth. We used the Space-weather HMI Active Region Patches (SHARP) parameters available from Stanford's Joint Science Operations Center (JSOC) to train machine learning models to classify these flares. However, predicting the flare class is a challenging task, as it is a multiclass classification problem involving imbalanced data due to the small number of X-class flares in a solar cycle. We propose a new method that uses a combination of random undersampling and the synthetic minority oversampling technique (SMOTE) to combat the imbalanced data problem. Furthermore, we develop an ensemble algorithm that uses nine classifiers as base learners and logistic regression as the meta-learner. Experimental results show that the proposed method is effective in predicting solar flares, especially the most intense X-class flares, within the next 24 hours.
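
A hedged sketch of the resampling-plus-stacking idea described above, using standard imbalanced-learn and scikit-learn components; the two base learners (the paper uses nine), the sampling targets, and the synthetic data are illustrative assumptions.

    from imblearn.pipeline import Pipeline
    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import RandomUnderSampler
    from sklearn.datasets import make_classification
    from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for SHARP features with a rare, "X-class"-like minority class.
    X, y = make_classification(n_samples=2000, n_classes=4, n_informative=8,
                               weights=[0.55, 0.30, 0.12, 0.03], random_state=0)

    stack = StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression(max_iter=1000))       # logistic regression meta-learner

    model = Pipeline([("under", RandomUnderSampler(sampling_strategy={0: 600}, random_state=0)),
                      ("smote", SMOTE(random_state=0)),          # oversample remaining minority classes
                      ("stack", stack)])
    model.fit(X, y)
    print(model.predict(X[:5]))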

11:10
Estimate Undergraduate Student Enrollment in Courses by Re-purposing Recommendation Tools

ABSTRACT. Resource allocation is a very challenging task in higher education. To prepare for every new semester, academic administration faces various challenges in allocating instructors, classrooms, sessions, teaching assistants, and laboratories for different possible courses, considering students' needs and the limited resources available. Predicting the number of students enrolled in a specific class in the next semester can help with this task. To address this problem, we investigate various machine learning models (direct and indirect methods) using different features of past students' course enrollment data to predict the number of enrollments in possible courses in the upcoming semester. In this work, we propose to use a course recommendation model as a first step to generate suggestions for students and then use those suggestions to estimate student enrollment in the courses of the next semester. We test four course recommendation models, two time series models, three regression models, and three baseline approaches for course enrollment prediction. The experimental evaluation demonstrates that our proposed approach performs similarly to or better than other competing approaches for predicting student enrollment in courses.
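
The aggregation step described above can be sketched very simply: turn each student's top-k recommended courses into per-course counts. The recommender itself is abstracted behind a hypothetical `recommend` callable, and the toy data is made up.

    from collections import Counter

    def estimate_enrollment(students, recommend, top_k=5):
        """Count how often each course appears in the students' top-k recommendations."""
        counts = Counter()
        for student in students:
            counts.update(recommend(student)[:top_k])
        return counts

    # Toy usage with a hand-written recommender: CS101 is expected to enroll 2 students.
    recs = {"s1": ["CS101", "CS230"], "s2": ["CS101", "MATH210"]}
    print(estimate_enrollment(recs, lambda s: recs[s]))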

11:30
Attention-Driven Multi-Agent Reinforcement Learning: Enhancing Decisions with Expertise-Informed Tasks

ABSTRACT. In this paper, we introduce an alternative approach to en-hancing Multi-Agent Reinforcement Learning (MARL) through the integration of domain knowledge and attention-based policy mechanisms. Our methodology focuses on the incorporation of domain-specific expertise into the learning process, facilitating a more efficient and targeted develop-ment of collaborative behaviors in agent interactions. This approach aims to reduce the complexity and learning over-head typically associated with MARL by enabling agents to concentrate on essential aspects of complex tasks, thus op-timizing the learning curve. The utilization of attention mechanisms plays a key role in our model. It allows for the effective processing of dynamic environmental data and nu-anced agent interactions, leading to more refined decision-making. Applied in standard MARL scenarios such as the Stanford Intelligent Systems Laboratory (SISL) Pursuit and Multi-Particle Environments (MPE) Simple Spread, our method has shown to improve both learning efficiency and the effectiveness of collaborative behaviors. The results in-dicate that our attention-based approach could offer a viable alternative to improve the efficiency of MARL training pro-cess, integrating domain-specific knowledge at the action level.