FLAIRS-37: THE 37TH INTERNATIONAL CONFERENCE OF THE FLORIDA ARTIFICIAL INTELLIGENCE RESEARCH SOCIETY
PROGRAM FOR MONDAY, MAY 20TH

09:00-10:00 Session 5: Invited Talk
Location: Emerald D+E
09:00
Teaching Robots To "Get It Right"

ABSTRACT. We are interested in building and deploying service mobile robots to assist with arbitrary end-user tasks in everyday environments. In such open-world settings, how can we ensure that robots 1) are robust to environmental changes; 2) navigate the world in ways consistent with social and other unwritten norms; and 3) correctly complete the tasks expected of them? In this talk, I will survey these technical challenges, and present several promising directions to address them. To "get it right", robots will have to reason about unexpected sources of failures in the real world and learn to overcome them; glean appropriate contextual information from perception to understand how to navigate in the world; and reason about what correct task execution actually entails.

10:30-12:00 Session 6A: NNDM-1
Location: Emerald A
10:30
Neural Network Hardware Acceleration

ABSTRACT. Deep learning models require hardware acceleration, and demand for this acceleration is outpacing what current hardware can deliver. If current trends continue, by 2045 half of the world's electricity would be consumed by training deep learning models. This talk will cover background and a history of the field, the acceleration that is currently available, and what is expected in the future.

11:10
Dynamic PageRank with Decay: A Modified Approach for Node Anomaly Detection in Evolving Graph Streams

ABSTRACT. Given a large graph stream with dynamically changing structures over a given timestep, it is important to detect the sudden appearance of anomalous patterns, such as sudden spikes in IP-network attacks or unexpected surges in social media followers. In addition, it is important that we promptly identify these abrupt changes in the network by considering swift and short-term responses within the network structure. So, how can we design a model capable of adapting to dynamic changes? In this study, we introduce an approach that utilizes a modified dynamic "PageRank-with-Decay" as a node scoring function. This method enables the detection of sudden dynamic graph changes based on node importance scores, leveraging the temporal evolution of graph structures at each timestep. This approach provides a refined anomaly detection mechanism for tracking rapid structural changes in the network. Through experiments conducted on a real-world dataset, our model demonstrates faster and more accurate results (in terms of precision and recall) compared to state-of-the-art methods.
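
The abstract does not reproduce the scoring function itself; as a rough sketch of the general idea (PageRank over edge weights that decay exponentially with age; the decay schedule and all parameters are illustrative assumptions, not the authors' method), one might write:

import numpy as np

def decayed_pagerank(edges, n, t_now, damping=0.85, decay=0.5, iters=50):
    # Edges observed at time t_edge contribute decay**(t_now - t_edge),
    # so older structure fades and recent changes dominate the scores.
    W = np.zeros((n, n))
    for src, dst, t_edge in edges:
        W[src, dst] += decay ** (t_now - t_edge)
    out = W.sum(axis=1, keepdims=True)
    P = np.divide(W, out, out=np.zeros_like(W), where=out > 0)
    dangling = out.ravel() == 0
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (r @ P + r[dangling].sum() / n)
    return r

Differencing the scores between consecutive timesteps would then flag nodes whose importance changes abruptly.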

11:30
Multimodal and Explainable Android Adware Classification Techniques

ABSTRACT. With the widespread availability of adware masquerading as useful apps, there is an increasing need for robust security measures to identify adware. The identification of adware as malware is a challenging task, as it often appears benign despite its malicious intent in the background. In this study, we propose a unified approach to classify adware on Android devices using data from multiple modalities. The focus is particularly on the classification challenges posed by Airpush and Dowgin adware. Our proposed method uses both tabular and grayscale image data, with a Feedforward Neural Network as the classifier, to build a multimodal deep learning technique that achieves 95% prediction accuracy. Additionally, we incorporate Explainable AI (XAI) to enhance the interpretability of classification results for both the individual and unified approaches. The efficiency of our proposed approach is showcased through its ability to classify adware instances in an explainable manner from diverse perspectives, highlighting its significance not only in adware detection but also in fortifying against the evolving challenges posed by adware.

10:30-12:00 Session 6B: ANLP-1
Location: Emerald E
10:30
Evaluating Graph Attention Networks as an Alternative to Transformers for ABSA Task in Low-Resource Languages

ABSTRACT. Opinions toward subjects and products hold immense relevance in business to guide decision-making processes. However, due to the increase in user-generated content, manual analysis is unrealistic. Techniques such as Sentiment Analysis are paramount to understanding and quantifying human emotion expressed in text data. Aspect-Based Sentiment Analysis aims to extract aspects from an opinionated text while identifying their underlying sentiment. Graph-based text representations have been shown to bring benefits to this task, as they explicitly represent structural relationships in text. While studies have demonstrated the effectiveness of this representation for Aspect-based Sentiment Analysis using Graph Neural Networks in English, there is only sparse evidence of improvement using these techniques for low-resource languages such as Portuguese. We develop a straightforward Graph Attention Network model for the Aspect-Based Sentiment Analysis task in Brazilian Portuguese. The proposed approach achieves a Balanced Accuracy score of 0.74, yielding competitive results and ranking third place in the ABSAPT competition. Furthermore, by leveraging sparse graph connections our model is less computationally demanding than a Transformer architecture in terms of training and inference.
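
The abstract's model is not reproduced here; for readers unfamiliar with the mechanism, a minimal single-head graph attention layer (a generic textbook sketch in PyTorch, not the authors' architecture; the adjacency matrix is assumed to include self-loops) looks like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    # One attention head in the spirit of Velickovic et al. (2018).
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N), self-loops included
        z = self.W(h)
        n = z.size(0)
        pairs = torch.cat([z.repeat_interleave(n, 0), z.repeat(n, 1)], dim=1)
        e = F.leaky_relu(self.a(pairs)).view(n, n)
        e = e.masked_fill(adj == 0, float("-inf"))  # attend only along edges
        return F.elu(F.softmax(e, dim=1) @ z)

This dense sketch scores all node pairs for clarity; sparse implementations score only the graph's edges, which is the source of the computational savings the abstract mentions.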

10:50
The Impact of Data Augmentation on the Hate Speech Detection in Portuguese Language

ABSTRACT. Online communities allow users to establish a web presence, manage their identities, and stay connected with others. The internet has facilitated global outreach with just a click on the World Wide Web. However, the current landscape of online social media platforms is marred by various issues, with hate speech prominently taking center stage. Hate speech is characterized by hostile and malicious language driven by prejudice, targeting individuals or groups based on their innate, natural, or perceived characteristics. Detecting such speech is crucial for maintaining a safe online environment. This study examines the impact of dataset regularization techniques on the performance of BERTimbau-based models when applied to four Portuguese hate speech datasets: Fortuna, OFFCOMBR-2, ToLD-BR, and Hate-BR. Four Data Augmentation techniques are evaluated: Oversampling, Undersampling, Text Augmentation, and Synonym Replacement. Our experiments revealed that apart from the Fortuna dataset, the Data Augmentation techniques did not significantly enhance the performance of hate speech detection tasks.
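
Of the four techniques evaluated, Synonym Replacement is the simplest to sketch. A minimal illustration, assuming a thesaurus lookup table (the function and word list are hypothetical, not the authors' implementation):

import random

def synonym_replacement(tokens, synonyms, n=2, seed=0):
    # Replace up to n tokens that have a thesaurus entry; all other
    # tokens are left untouched, so the label is assumed preserved.
    rng = random.Random(seed)
    out = list(tokens)
    candidates = [i for i, w in enumerate(out) if w in synonyms]
    for i in rng.sample(candidates, min(n, len(candidates))):
        out[i] = rng.choice(synonyms[out[i]])
    return out

# e.g. synonym_replacement("odeio esse filme horrível".split(),
#                          {"horrível": ["terrível", "péssimo"]})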

11:10
Authorship Attribution of English Poetry using Sentiment Analysis

ABSTRACT. We present a basic methodology and share some interesting results from experiments on using sentiment analysis for authorship attribution of English poetry. We demonstrate that sentiment analysis can be effectively used to determine the authorship of poetic works given a sufficiently large training corpus. The results compare well with traditional authorship attribution approaches. The strengths and limitations of our methodology and directions for further research are outlined at the end of the paper.

11:30
Investigating Lexical and Syntactic Differences in Written and Spoken English Corpora

ABSTRACT. This paper presents an analysis of the differences between written text and the transcription of spoken text using current Natural Language Processing (NLP) methods. The purpose of the study is to investigate the long and rich history of attempts to differentiate spoken and written text in fields such as linguistics, communication, and rhetoric, which date back to the early 20th century. Given the availability of large quantities of machine-readable data and machine learning algorithms that can handle them, it is possible to use a large number of derived features. The research focuses on syntactic and lexical differences in written books and transcriptions of speeches by United States presidents. The analysis investigates morphological, lexical, syntactical, and text-level aspects. In this process, multiple features have been considered including lexical diversity, syllable count, frequency of parts of speech, and features relating to the parse tree, like the average length of noun phrases, and the use of interrogative sentences, among others. This study will enhance our understanding of the difference between written text and the transcription of spoken text in various disciplines including computer science, applied linguistics, communication, and similar fields.

10:30-12:00 Session 6C: Main-4
Location: Emerald D
10:30
Embedding Ethics Into Artificial Intelligence: Understanding What Can Be Done, What Can't, and What Is Done

ABSTRACT. Embedding ethical considerations within the development of AI-driven technologies becomes more and more pressing as new technologies are developed. Given the impact of autonomous technologies on individuals and society (e.g., environmental costs, job security, possible harm to individuals, threats to democracy, etc.), it is worth taking the time to assess and manage the ethical aspects and possible consequences of our technological endeavors. While the growing rapidity of autonomous decision processes makes it hard to keep individuals in the decision loops, people are turning their attention to the ways in which ethics could be integrated into machines and algorithms, as well as to the possibility of defining autonomous ethical machines that would be able to solve ethical dilemmas and act ethically (e.g., autonomous vehicles). Notwithstanding the theoretical and practical difficulties surrounding the possibility of defining such ethical machines, important elements should be considered when reflecting on the embedding of ethics into AI technologies. The present paper aims to critically analyze the limitations of such endeavors by exposing common misconceptions relating to AI ethics.

10:50
A Partial MaxSAT Approach to Nonmonotonic Reasoning with System W

ABSTRACT. The recently introduced system W is a nonmonotonic inductive inference operator exhibiting notable properties, such as extending rational closure and satisfying syntax splitting postulates for inference from conditional belief bases. The semantic model of system W is given by its underlying preferred structure of worlds, a strict partial order on the set of propositional interpretations, also called possible worlds, over the signature of the belief base. Existing implementations of system W are severely limited by the number of propositional variables that occur in a belief base because of the exponentially growing number of possible worlds. In this paper, we present an approach to realizing nonmonotonic reasoning with system W by using partial maximum satisfiability (PMaxSAT) problems and exploiting the power of current PMaxSAT solvers. An evaluation of our approach demonstrates that it outperforms previous implementations of system W and scales reasoning with system W up to a new dimension.
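
The paper's encoding of system W is not given in the abstract; as a generic illustration of how a partial MaxSAT instance separates mandatory hard clauses from weighted soft clauses (using the PySAT library with made-up clauses, not the authors' encoding):

from pysat.formula import WCNF
from pysat.examples.rc2 import RC2

wcnf = WCNF()
wcnf.append([1, 2])           # hard: x1 or x2 must hold
wcnf.append([-1, -2])         # hard: not both
wcnf.append([1], weight=3)    # soft: prefer x1 (penalty 3 if violated)
wcnf.append([2], weight=1)    # soft: prefer x2 (penalty 1 if violated)

with RC2(wcnf) as solver:
    model = solver.compute()   # maximizes the satisfied soft-clause weight
    print(model, solver.cost)  # e.g. [1, -2] with cost 1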

11:10
Shortest Walk in a Dungeon Graph

ABSTRACT. With games like No Man’s Sky and Diablo IV, procedural generation of content in games is ever-increasing. Heuristics are needed to assess the goodness of generated content. Using Nintendo’s The Legend of Zelda as inspiration, we formalize the idea of a dungeon graph: an undirected graph consisting of ‘locked’ edges whose ‘keys’ are acquired at specific nodes. We then introduce the Shortest Dungeon Walk Problem along with a solution to it in the context of a dungeon graph, reduce the Traveling Salesperson Problem to the Shortest Dungeon Walk Problem in polynomial time, and conclude that the Shortest Dungeon Walk Problem is NP-complete. We then assess the practical performance of the shortest walk algorithm using the first eight dungeons in The Legend of Zelda.
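
Because the problem is shown to be NP-complete, the following is only an illustration of the locked-edge semantics, not the authors' algorithm: a Dijkstra search over (node, collected-keys) states, worst-case exponential in the number of keys.

import heapq

def shortest_dungeon_walk(adj, keys_at, start, goal):
    # adj[u] = [(v, cost, lock)], lock being a key name or None;
    # keys_at maps a node to the key acquired there (if any).
    first = frozenset([keys_at[start]]) if start in keys_at else frozenset()
    pq, best = [(0, start, first)], {}
    while pq:
        d, u, ks = heapq.heappop(pq)
        if u == goal:
            return d
        if best.get((u, ks), float("inf")) <= d:
            continue
        best[(u, ks)] = d
        for v, c, lock in adj[u]:
            if lock is None or lock in ks:
                nks = ks | {keys_at[v]} if v in keys_at else ks
                heapq.heappush(pq, (d + c, v, nks))
    return None

# A locked edge (1, 2) needing key "A", which sits at node 1:
# shortest_dungeon_walk({0: [(1, 1, None)],
#                        1: [(0, 1, None), (2, 2, "A")],
#                        2: []}, {1: "A"}, 0, 2)  ->  3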

11:30
Lung and Colon Cancer Histopathological Image Classification Using 1D Convolutional Channel-based Attention Networks

ABSTRACT. Lung and colon cancers are among the leading causes of death and disability in humans, caused by a combination of genetic diseases and biochemical abnormalities. If these are diagnosed in their early stages, they can be kept from spreading to other organs and negatively impacting human life. Many deep-learning networks have recently been proposed to detect and classify these malignancies. However, incorrect detection or misclassification of these fatal diseases can significantly affect an individual's health and well-being. This paper introduces a novel, cost-effective, and mobile-embedded architecture to diagnose and classify squamous cell carcinomas and adenocarcinomas of the lung and adenocarcinomas of the colon from digital pathology images. Extensive experiments show that our proposed modifications achieve 100% testing results for lung, colon, and lung-and-colon cancer detection. Our novel architecture requires around 0.65 million trainable parameters and around 6.4 million FLOPs to achieve the best performance among deep learning approaches to lung and colon cancer detection. Compared with other results, this architecture shows state-of-the-art performance.

10:30-12:00 Session 6D: Main-3
Location: Emerald B
10:30
POLOR: Leveraging Contrastive Learning to Detect Political Orientation of Opinion in News Media

ABSTRACT. News articles are naturally influenced by the values, beliefs, and biases of the reporters preparing the stories and the policies of the publishing outlets. Numerous studies and datasets have been proposed to detect the political orientation of news articles. However, most of these studies ignore real textual clues and instead learn the textual signature of the source (commonly the publisher and rarely the writer) of the article. Moreover, a good volume of opinion pieces published by major news outlets does not reflect the political orientation of the publisher but rather that of a non-professional writer. Existing methods are not built to correct this difference in the training data and, hence, perform poorly on human-annotated data. We propose POLOR, a fine-tuned BERT model that employs contrastive learning to detect the political orientation of news articles even when the training data is labeled by the source (i.e., the publisher of the news article). Unlike previous work in the literature, the model learns features by employing different contrastive learning objectives, where each sentence is contrasted with sentences from various sources simultaneously. POLOR achieves a 15% increase on our dataset compared to previously proposed baselines. Finally, we release two datasets of opinion news: source-annotated and human-annotated datasets. The full paper including supplementary materials, code, and datasets can be found at https://www.cs.unm.edu/~ajararweh/.

10:50
Latent Beta-Liouville Probabilistic Modeling for Bursty Topic Discovery in Textual Data

ABSTRACT. Topic modeling has become a fundamental technique for uncovering latent thematic structures within large collections of textual data. However, conventional models often struggle to capture the burstiness of topics. This characteristic, where the occurrence of a word increases its likelihood of subsequent appearances in a document, is fundamental in natural language processing. To address this gap, we introduce a novel topic modeling framework, integrating Beta-Liouville and Dirichlet Compound Multinomial distributions. Our approach, named Beta-Liouville Dirichlet Compound Multinomial Latent Dirichlet Allocation (BLDCMLDA), is designed to specifically model word burstiness and support a wide range of adaptable topic proportion patterns. Through rigorous experiments on diverse benchmark text datasets, the BLDCMLDA model has demonstrated superior performance over conventional models. Our promising results in terms of perplexity and coherence scores, demonstrate the effectiveness of BLDCMLDA in capturing the nuances of word usage dynamics in natural language.

11:10
Twitter User Account Classification to Gain Insights into Communication Dynamics and Public Awareness During Tampa Bay's Red Tide Events

ABSTRACT. This study presents an innovative approach to analyzing environmental challenges, focusing on the localized impacts of toxic algal blooms, specifically the dinoflagellate Karenia brevis on Florida's Gulf Coast, commonly known as "red tide". Despite the extensive influence of social media in public discourse, its potential in environmental awareness remains largely untapped. Our research exploits Twitter data to examine communication trends and public understanding of red tide issues in the Tampa Bay area from 2018 to 2022. For that study period, we collected 63K tweets from 30K accounts that mentioned terms related to red tide. Our methodology involves a tiered labeling process to obtain over 15K labeled accounts. In the initial tier, we employ predefined dictionaries for account groups to establish preliminary class designations, streamlining the subsequent labeling tiers, one of which is aided by preliminary machine learning classification. Having used several text classification algorithms and feature preprocessing approaches, Support Vector Machine with Bidirectional Encoder Representations from Transformers (BERT) yielded the best cross-validation performance in both accuracy (90%) and versatility (unweighted F1 score of 0.67). Lastly, we creatively leveraged the Term Frequency-Inverse Document Frequency (TF-IDF) method to study the terms that most distinguish each user category from the rest.
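
The exact contrast the authors compute is not spelled out in the abstract; one common way to surface the terms that most distinguish one account category from the rest with TF-IDF (a scikit-learn sketch with illustrative settings) is:

from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def distinguishing_terms(texts, labels, category, k=10):
    # Rank terms by mean TF-IDF inside the category minus the mean outside.
    vec = TfidfVectorizer(stop_words="english")
    X = vec.fit_transform(texts)
    mask = np.array([lab == category for lab in labels])
    gap = X[mask].mean(axis=0).A1 - X[~mask].mean(axis=0).A1
    terms = np.array(vec.get_feature_names_out())
    return terms[np.argsort(gap)[::-1][:k]]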

11:30
Abstractive Text Summarization Based on Neural Fusion

ABSTRACT. Abstractive text summarization, in comparison to extractive text summarization, offers the potential to generate more accurate summaries. In our work, we present a stage-wise abstractive text summarization model that incorporates Elementary Discourse Unit (EDU) segmentation, EDU selection, and EDU fusion. We first segment the articles into a fine-grained form, EDUs, and build a Rhetorical Structure Theory (RST) graph for each article in order to represent the dependencies among EDUs. The EDUs are encoded with a Graph Attention Network (GAT), and those with higher importance are selected as candidates to be fused. The fusion stage is performed by BART, which merges the selected EDUs into summaries. Our model outperforms the baseline of BART (large) on the CNN/Daily Mail dataset, showing its effectiveness in abstractive text summarization.

13:30-15:00 Session 7A: ANLP-2
Location: Emerald D
13:30
TaxTajweez: A Large Language Model-based Chatbot for Income Tax Information In Pakistan Using Retrieval Augmented Generation (RAG)

ABSTRACT. The advent of Large Language Models (LLMs) has heralded a transformative era in natural language processing across diverse fields, igniting considerable interest in domain-specific applications. However, while proprietary models have made significant strides in sectors such as medicine, education, and law through tailored data accumulations, similar advancements have yet to emerge in the Pakistani taxation domain, hindering its digital transformation.

In this paper, we introduce TaxTajweez, a specialized Retrieval Augmented Generation (RAG) system powered by the OpenAI GPT-3.5-turbo LLM, designed specifically for income taxation. Complemented by a meticulously curated dataset tailored to the intricacies of income taxation, TaxTajweez leverages the RAG pipeline to mitigate model hallucinations, enhancing the reliability of generated responses. Through a blend of qualitative and quantitative evaluation methodologies, we rigorously assess the accuracy and usability of TaxTajweez, establishing its efficacy as an income tax advisory tool.
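
TaxTajweez itself is not open-sourced in the abstract; the following is a generic sketch of the RAG pattern it describes, with sentence-transformers standing in for whichever retriever the authors used and the tax passages invented for illustration:

from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
passages = [
    "Section 7: tax rates applicable to salaried individuals ...",  # invented
    "Section 149: withholding of tax on salary income ...",         # invented
]
passage_emb = encoder.encode(passages, convert_to_tensor=True)

def build_prompt(question, k=2):
    # Retrieve the k most similar passages and ground the prompt in them;
    # this grounding is the step that mitigates hallucination in RAG.
    q_emb = encoder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_emb, passage_emb, top_k=k)[0]
    context = "\n".join(passages[h["corpus_id"]] for h in hits)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n\nQuestion: {question}")

# The returned prompt is then sent to a chat model such as gpt-3.5-turbo.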

13:45
Beyond Binary: Revealing Variations in Islamophobic Content with Hierarchical Multi-Class Classification

ABSTRACT. In the digital age, the rise of Islamophobia, marked by irrational fear of or discrimination against Islam and Muslims, has emerged as a pressing issue, especially on social media platforms. In this paper, we employ a multi-class classification system, moving beyond traditional binary models. We categorize Islamophobic content into three main classes and various subclasses, covering a range from subtle biases to explicit incitement. Comparative analysis of data from Reddit and Twitter illuminates the distinct prevalence and types of Islamophobic content specific to each platform. This paper deepens our understanding of digital Islamophobia and provides insights for crafting targeted online counter-strategies. Additionally, it highlights the role of machine and deep learning in detecting and addressing Islamophobic content, emphasizing their significance in resolving complex social issues in the digital sphere.

14:00
Toward Inclusivity: Rethinking Islamophobic Content Classification in the Digital Age

ABSTRACT. In this paper, we implement a comprehensive three-class system to categorize social media discussions about Islam and Muslims, enhancing the typical binary approach. These classes are: I) General Discourse About Islam and Muslims, II) Criticism of Islamic Teachings and Figures, and III) Comments Against Muslims. These categories are designed to balance the nuances of free speech while protecting diverse groups like Muslims, ex-Muslims, LGBTQ+ communities, and atheists. By utilizing machine learning and employing transformer-based models, we analyze the distribution and characteristics of these classes in social media content. Our findings reveal distinct patterns of user engagement with topics related to Islam, providing valuable insights into the complexities of digital discourse. This research contributes to the fields of quantitative social science by offering an improved method for understanding and moderating online discussions on sensitive religious and cultural subjects.

14:15
Transformer Models for Brazilian Portuguese Question Generation: An Experimental Study

ABSTRACT. The rapid progress in Natural Language Processing, propelled by approaches based on Transformers, has ushered in a new era of possibilities and challenges. These models, known for their parallel multi-head attention mechanisms, have brought significant advancements to tasks such as translation, summarization, and question-answering. However, question generation poses a unique challenge. Unlike tasks such as translation or summarization, generating meaningful questions necessitates a profound understanding of context, semantics, and syntax. This complexity arises from the need to not only comprehend the given text comprehensively but also infer information gaps, identify relevant entities, and construct syntactically and semantically correct interrogative sentences. We address this challenge by proposing an experimental fine-tuning approach for encoder-decoder models (T5, FLAN-T5, and BART-PT) tailored explicitly for Brazilian Portuguese question generation. Our study involves fine-tuning these models on the SQUAD-v1.1 dataset and subsequent evaluation, also on SQUAD-v1.1. In our experiments, BART achieved the highest scores on all ROUGE metrics (ROUGE-1 0.46, ROUGE-2 0.24, ROUGE-Lsum 0.43), suggesting higher lexical similarity in the generated questions and results comparable to those reported for question generation in English. We explored how these advancements can significantly enhance the precision and quality of question generation in Brazilian Portuguese, bridging the gap between training data and the intricacies of interrogative sentence construction.
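
The reported metrics can be reproduced in form (not in value) with the rouge_score package; a minimal sketch, assuming that library and an invented reference/candidate pair:

from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeLsum"],
                                  use_stemmer=False)  # stemming is English-only
reference = "Em que ano o Brasil foi descoberto?"     # invented example
generated = "Em qual ano o Brasil foi descoberto?"
scores = scorer.score(reference, generated)
print({k: round(v.fmeasure, 2) for k, v in scores.items()})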

14:30
Bridging the Gap: A Comprehensive Study on Named Entity Recognition in Electronic Domain using Hybrid Statistical and Deep Knowledge Transfer

ABSTRACT. Training deep neural network models in NLP applications with a small amount of annotated data does not usually achieve high performance. To address this issue, transfer learning, which consists of transferring knowledge from a domain with a large amount of annotated data to a specific domain that lacks it, can be a solution. In this paper, we present a case study on named entity recognition for the electronics domain, relying on several approaches based on statistics, deep learning, and transfer learning.

Our evaluations showed a significant improvement in overall performance, with the best results obtained using transfer learning: up to +15% compared to the other approaches.

13:30-15:00 Session 7B: Security, Trust, & XAI-2
Location: Emerald E
13:30
Sharing Accountability of Versatile AI Systems: A Literature Review on the Role of Developers and Practitioners

ABSTRACT. AI systems pose both opportunities and threats in various industries. To harness these opportunities and mitigate risks, accountability is crucial. Traditionally, developers bear the responsibility for auditing and modifying algorithms. However, in the evolving landscape of versatile AI, developers may lack contextual understanding across diverse fields. This paper proposes a theoretical framework that distributes accountability to developers and practitioners according to their capabilities. This framework enhances systemic comprehension of shared roles, empowering both groups to collaboratively avert potential adverse impacts.

13:45
Enhanced Multi-Class Detection of Fake News

ABSTRACT. The rapid spread of fake news, whether intentional or erroneous, has emerged as a critical challenge in the current era. Confusion and conflict can arise if people mistake fake news for real news. Therefore, advanced detection methodologies are desirable. The goal of this paper is to identify fake news while addressing the issue of class imbalance in available training data. Our task is multi-class fake news detection, an advanced methodology beyond traditional binary classification. We highlight the Convolutional Neural Network (CNN)’s superior performance over the baseline BERT model in the literature, with improvements in accuracy, precision, recall, and F1-score. We specifically experimented with four model variants: CNN and BERT, each with both trainable embeddings and BERT embeddings. Our experiment demonstrates CNN's effectiveness in identifying text patterns. To address class imbalance, a common issue in fake news datasets, we experimented with three different balancing methods. Our study includes an evaluation of ChatGPT used for multi-class labeling, compared with actual labels. The result indicates notable limitations in ChatGPT's automated classification, which highlights the complexities of AI-based categorization. Overall, our findings demonstrate the CNN model's efficiency and effectiveness and showcase the intricacies of fake news detection. These insights confirm the need for advanced AI methodologies in combating misleading information.

14:00
A Scoping Review of Transparency and Explainability in AI Ethics Guidelines

ABSTRACT. Transparency and explainability are crucial tenets of ethical Artificial Intelligence (AI) and are often classified as technical components of AI Ethics. Many countries and international governing bodies have developed AI guidelines and principles that are made public for respective civilians with a diverse range of expertise and knowledge. This short paper compares how explainability and transparency are presented and discussed in AI ethics guidelines developed by the top ten countries leading in AI research and development according to the AI Global Index Report in 2023, as well as leading efforts from governing bodies such as the EU. Methodologically, this paper presents a thematic analysis focusing on the presence and acknowledgment of various dimensions of explainability and transparency, and the level of detail and examples given in the guideline. The aim is to uncover how various dimensions of AI ethics are presented in guideline documents and highlight their uniqueness as well as discuss potential reasoning behind the way these guidelines are expressed.

14:15
Beyond Size and Accuracy: The Impact of Model Compression on Fairness

ABSTRACT. Model compression is increasingly popular in the domain of deep learning. When addressing practical problems that use complex neural network models, the availability of computational resources can pose a significant challenge. While smaller models may provide more efficient solutions, they often come at the cost of accuracy. To tackle this problem, researchers often use model compression techniques to transform large, complex models into simpler, faster models. These techniques aim to reduce the computational cost while minimizing the loss of accuracy. The majority of the model compression research focuses exclusively on model accuracy and size/speedup as performance metrics. This paper explores how different methods of model compression impact the fairness/bias of a model. We conducted our experiments using the COMPAS Recidivism Racial Bias dataset. We evaluated a variety of model compression techniques across multiple bias groups. Our findings indicate that the type and amount of compression have substantial impact on both the accuracy and fairness/bias of the model.

14:30
Hybrid Cyber-Physical Intrusion Detection System for Smart Manufacturing

ABSTRACT. Smart manufacturing is an important part of our critical infrastructure and is the current age of industry, in which physical components such as robotic arms, 3D printers, CNC machines, etc., are all interconnected and remotely controlled or automated to provide a major boost in efficiency. While more effective, this cyber-physical integration expands the attack surface of these systems for potential threats to act on and exploit. The integration also creates gaps in current intrusion detection systems (IDS) and their research, as they focus on either the cyber or the physical components of these systems, which leaves blind spots for attacks that can only be detected by combining cyber and physical features. This paper fills that research gap by creating a cyber-physical testbed, launching denial-of-service and physical hijacking attacks, collecting benign and malicious data, and creating a hybrid IDS using K-Nearest Neighbors and Decision Tree models that consider both cyber and physical features. Our proposed hybrid IDS achieves an accuracy of 97.2%, roughly the same as the separate cyber and physical IDSs, but with a significant boost in precision (98.4%), recall (94.2%), and F1 score (96.1%) compared to the separate IDSs.
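
A minimal sketch of the fusion idea, concatenating cyber-side and physical-side features before fitting the two classifier types the paper names (data shapes and hyperparameters are placeholders, not the authors' setup):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_cyber = np.random.rand(1000, 5)   # per-window network statistics (placeholder)
X_phys = np.random.rand(1000, 3)    # synchronized sensor readings (placeholder)
y = np.random.randint(0, 2, 1000)   # 0 = benign, 1 = attack

X = np.hstack([X_cyber, X_phys])    # the "hybrid" feature vector
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

for clf in (KNeighborsClassifier(5), DecisionTreeClassifier(max_depth=8)):
    clf.fit(Xtr, ytr)
    print(type(clf).__name__, clf.score(Xte, yte))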

13:30-15:00 Session 7C: Main-5
Location: Emerald A
13:30
What Matters in Irony Detection: An Extended Feature Engineering for Irony Detection in English Tweets.

ABSTRACT. In recent years, large language models (LLMs) have become the dominant force in almost every natural language processing (NLP) task. The primary research approach has focused on selecting the most appropriate language model for specific NLP tasks and then incorporating linguistic features to enhance the model’s performance. With swift progress in this field, new features and models are evolving rapidly, and outdated systems require timely updates. In this paper, we extended the accomplishments of SemEval-2018 Task 3, enhancing its irony detection systems with novel features and more sophisticated language models. Subsequently, we conducted an ablation study to showcase the contributions of these enhancements to the LLM-based system. Furthermore, we compared our leading system with the top performers in the SemEval-2018 competition, and our best model exhibited superior performance when compared to the leading performers applied to the same corpus.

13:45
Knowledge Distillation for a Domain-Adaptive Visual Recommender System

ABSTRACT. In the last few years, large-scale foundational models have shown remarkable performance in computer vision tasks. However, deploying such models in a production environment poses a significant challenge because of their computational requirements. Furthermore, these models typically produce generic results and often need some sort of external input. The concept of knowledge distillation provides a promising solution to this problem. In this paper, we focus on the challenges faced in applying knowledge distillation techniques to the task of augmenting a dataset for object detection used in a commercial Visual Recommender System called VISIDEA; the goal is to detect items on various e-commerce websites, encompassing a wide range of custom product categories. We discuss a possible solution to problems such as label duplication, erroneous labeling, and lack of robustness to prompting, by considering examples in the field of fashion apparel recommendation.

14:00
Knowledge-infused and Explainable Malware Forensics

ABSTRACT. Despite considerable progress in malicious software forensics, the challenge of accurate attribution, formulation of appropriate response and mitigation strategies, and ensuring the interpretability of deep learning methods persists. While less flexible and less robust to noise than deep learning models, Knowledge Graphs are natively explainable and are a promising solution for exploring new features and relations and for enhancing the understandability of decisions. In this work, we aim to develop an explainable malware classifier that can classify PE executables as malign or benign by infusing external knowledge using a Knowledge Graph (KG). We enrich our Knowledge Graph using the MITRE ATT&CK ontology (i.e., domain knowledge) and the EMBER dataset, and utilize the Graph2Vec algorithm to embed KG knowledge into our classifier. We found that our classifier yields satisfactory results while maintaining a high level of explainability.

14:15
War of Words: Harnessing the Potential of Large Language Models and Retrieval Augmented Generation to Classify, Counter and Diffuse Hate Speech

ABSTRACT. This paper explores the emergence of divergent narratives in the wake of the Russia-Ukraine war, which began on February 24, 2022, and the innovative application of AI language models, specifically Retrieval-Augmented Generation (RAG) and instruction-based large language models (LLMs), in countering hateful speech on social media. We design a pipeline to automatically discover and then respond to hateful content trending on social media platforms. Monitoring via traditional topic/narrative modeling often focuses on low-level content, which is difficult to interpret. In addition, workflows for prioritization and response generation are highly manual. We utilize several large language models (LLMs) throughout our pipeline to detect and summarize topics, to determine whether tweets contain hate speech, and to generate counter-narratives. We test our approach on the Ukraine Bio-Lab Tweet Corpus of 500k Tweets and evaluate counter-narrative generation performance across several dimensions: relevance, grammaticality, factuality, and diversity.

14:30
Informed Traffic Signal Preemption for Emergency Vehicles

ABSTRACT. Emergency-vehicle traffic light preemption systems are extended by using predicted trajectories to provide favorable signal changes. The location of the emergency vehicle is tracked, and traffic signals on the requested travel route are contacted and switched to green in time, before the emergency vehicle reaches the intersection. The resulting time to destination obtained by our simulation is compared to a control simulation without preemption. In the simulations, when an emergency vehicle in the control simulation reached a red light, a delay of five seconds was added to the vehicle’s journey. According to the improvements shown by our simulations, of the 356,000 out-of-hospital cardiac arrests in the United States per year, approximately 3,000 lives could be saved.

13:30-15:00 Session 7D: Main-6
Location: Emerald B
13:30
StrXL: Approximating Permutation Invariance/Equivariance to Model Arbitrary Cardinality Sets

ABSTRACT. Current deep-learning techniques for processing sets are limited to a fixed cardinality, causing a steep increase in computational complexity when the set is large. To address this, we have taken techniques used to model long-term dependencies from natural language processing and combined them with the permutation equivariant architecture, Set Transformer (STr). The result is Set Transformer XL (STrXL), a novel deep learning model capable of extending to sets of arbitrary cardinality given fixed computing resources. STrXL's extension capability lies in its recurrent architecture. Rather than processing the entire set at once, STrXL processes only a portion of the set at a time and uses a memory mechanism to provide additional input from the past. STrXL is particularly applicable to processing sets of high-throughput sequencing (HTS) samples of DNA sequences as their set sizes can range into hundreds of thousands. When tasked with classifying HTS prairie soil samples and MNIST digits, results show that STrXL exhibits an expected memory size-accuracy trade-off that scales proportionally with the complexity of downstream tasks, but, unlike STr, is capable of generalizing to sets of arbitrary cardinality.

13:50
Constraint Composite Graph-Based Weighted CSP Solvers: An Empirical Study

ABSTRACT. The Weighted Constraint Satisfaction Problem (WCSP) is a very expressive framework for optimization problems. The Constraint Composite Graph (CCG) is a graphical representation of a given (Boolean) WCSP that reduces it to a Minimum Weighted Vertex Cover (MWVC) problem by introducing intelligently chosen auxiliary variables. It also enables kernelization, a maxflow procedure used to fix the optimal values of a subset of the variables before initiating search. In this paper, we present some CCG-based WCSP solvers and compare their performance against toulbar2, a state-of-the-art WCSP solver, on a variety of benchmark instances. We also study the effectiveness of kernelization.

14:10
Matching-based Coalition Formation for Multi-robot Task Assignment Under Partial Uncertainty

ABSTRACT. In this paper, we study the multi-robot coalition formation problem for instantaneous task allocation, where a group of robots needs to be allocated to a set of tasks to execute optimally. One robot might not be enough to complete a given task, so forming teams to complete these tasks becomes necessary. In many real-world scenarios, the robots might have noisy localization. Due to this, cost calculations for robot-to-task assignments become uncertain. However, a small amount of resources might be available to accurately localize a subset of these robots. To this end, we propose a bipartite graph matching-based task allocation strategy (centralized and distributed versions) that gracefully handles the uncertainty arising from cost calculations using an interval-based technique while leveraging the fact that a small number of robots might be localized on demand using an external system such as drones. We have tested the proposed technique in simulation. Results show that our approach is moderately fast -- scales up to 100 robots and 50 tasks in 0.85 sec. (distributed solution) while gracefully handling partial uncertainty.
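
The interval-based uncertainty handling is the paper's contribution and is not sketched here; the underlying bipartite matching step, shown pessimistically using the upper end of each uncertain cost interval, can be written with SciPy (all numbers illustrative):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Noisy localization yields a [lo, hi] travel-cost interval per (robot, task).
lo = np.array([[2.0, 9.0], [5.0, 3.0], [4.0, 6.0]])
hi = np.array([[4.0, 11.0], [7.0, 4.5], [6.0, 8.0]])

rows, cols = linear_sum_assignment(hi)      # worst-case (pessimistic) matching
print(list(zip(rows, cols)), hi[rows, cols].sum())
# With 3 robots and 2 tasks, one robot is left unassigned.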

14:30
Lattice-Based Generation of Euclidean Geometry Figures

ABSTRACT. We present a user-guided method to generate geometry figures appropriate for high school Euclidean geometry courses. We first establish that a two-dimensional geometry figure can be represented abstractly using a complete, rank-4 lattice we call a geometry figure lattice (GFL). As input, we take a user-defined vector of primitive geometry shapes and convert each into a GFL. We then exhaustively combine each of these `primitive' GFLs into a set of complex GFLs using a process we call gluing. These lattices act as a template for the second step: instantiating GFLs into a sequence of concrete geometry figures. To identify figures that are structurally similar to textbook problems, we use a discriminator model trained on a corpus of textbook geometry figures.

15:30-17:00 Session 8A: Games
Location: Emerald D
15:30
Using Genetic Algorithms to Automate Scenario Generation and Enhance the Training Value of Serious Games for Adaptive Instruction

ABSTRACT. This paper discusses an end-to-end process to enhance the variability of serious game scenario conditions to support adaptive instructional strategies that select changes in scenario difficulty to match learning objectives to trainee capabilities. This process can also be used to model or recognize human responses to serious game stimuli to support trainee feedback during after-action reviews (AARs). Adaptive instruction is a learning experience where interventions (content selection and feedback) are tailored for each individual trainee or group of trainees. Adaptive training has been shown to be as much as one to two standard deviations more effective than traditional classroom methods. Our process starts with a few parent scenarios. The population of parent scenarios is automatically expanded using a Novelty search (genetic algorithm) methodology to optimize feature variability in the game-based scenarios. Specifically, this paper focuses on the process of creating new training scenarios of varying difficulty and conditions based on changes in scenario environmental factors (i.e., light, precipitation, wind, and cloud cover). The resulting new scenarios may then be run within a serious game with embedded intelligent agents to represent realistic entity behaviors under the scenario-specified conditions. The ability to morph conditions and scenario difficulty during an adaptive training event is a significant benefit of this process.
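
A minimal sketch of the scenario-mutation and novelty-scoring idea over the four environmental factors named above (the factor values, mutation rate, and distance measure are assumptions, not the authors' implementation):

import random

ENV = {"light": ["day", "dusk", "night"],
       "precipitation": ["none", "rain", "snow"],
       "wind": ["calm", "moderate", "high"],
       "cloud_cover": ["clear", "broken", "overcast"]}

def mutate(scenario, rate=0.3, rng=random):
    # Re-draw each environmental factor with probability `rate`.
    return {k: rng.choice(ENV[k]) if rng.random() < rate else v
            for k, v in scenario.items()}

def novelty(candidate, population):
    # Novelty = mean number of factors differing from existing scenarios.
    return sum(sum(candidate[k] != p[k] for k in candidate)
               for p in population) / max(len(population), 1)

parents = [{"light": "day", "precipitation": "none",
            "wind": "calm", "cloud_cover": "clear"}]
children = sorted((mutate(parents[0]) for _ in range(20)),
                  key=lambda c: novelty(c, parents), reverse=True)[:5]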

15:50
Tip of the Spear: Developing Predictive Military Planning Tools Using Hidden Markov Models

ABSTRACT. The evolution of the modern battlefield is increasingly complex as new technologies emerge. However, the nature of the battlefield can still be explained by Baron De Jomini’s “Grand Tactics.” Military planners’ success lies in their ability to develop synergy by layering the effects of a complex system at a decisive point on the battlefield. Synergistic effects require subject matter experts (SMEs) working in planning cells to integrate systems and units in time and space. These planning cells have large footprints and become prioritized in the opposition’s targeting cycle. Without these SMEs, planners are unable to identify the latent variables needed to achieve the massing of forces at the decisive point. This paper explores the application of Hidden Markov Models (HMMs) to enhance existing Correlation of Forces and Means (COFM) calculators as predictive tools in military planning. Current tools focus on a 3-to-1 force ratio for an offensive operation and are improved by applying force equivalent factors: scalars that adjust relative combat power, for example relating a single main battle tank to a mechanized infantry vehicle. These tools lack the ability to capture the physics of the battlefield in time and space. The utilization of wargames during planning and training provides a venue for serious games to improve planning tools, allowing a visualization of target pairing and dynamics that a linear equation would miss. This study employs scenarios generated in OneSAF, ranging from simple platoon-level ambushes to combined arms maneuver featuring rotary wing assets requiring a shaping effort to ensure favorable COFMs. The focus of the research lies in leveraging HMMs to establish a time-series indicator of success probability using potentially observable data, with an emphasis on communication dynamics. In this study, we examine observable communication data generated through both visual and direct contact. Observations of contact from the friendly and opposition forces provide a predictor of the hidden state of relative advantage in time and space. With the data generated by the OneSAF simulation, the HMM determines the states of the operation and the probability of success while minimizing the required presence inside the units. This study demonstrated the successful use of HMMs as a planning tool and provides an application for future research on improving decision making in military operations.
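
A minimal sketch of fitting an HMM to per-timestep contact observations and reading off a time series of hidden-state probabilities (using hmmlearn with Gaussian emissions and invented data; the paper's observation model may differ):

import numpy as np
from hmmlearn import hmm

# Per-timestep observations, e.g. counts of visual and direct contacts
# reported by friendly and opposing units in a OneSAF run (invented data).
obs = np.random.rand(200, 2)

model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=100)
model.fit(obs)                    # learn hidden "relative advantage" states
states = model.predict(obs)       # most likely state at each timestep
probs = model.predict_proba(obs)  # time series of state probabilities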

16:10
Improving Reinforcement Learning Experiments in Unity through Waypoint Utilization

ABSTRACT. Multi-agent Reinforcement Learning (MARL) models teams of agents that learn by dynamically interacting with an environment and each other, presenting opportunities to train adaptive models for team-based scenarios. However, MARL algorithms pose substantial challenges due to their immense computational requirements. This paper introduces an automatically generated waypoint-based movement system to abstract and simplify complex environments while allowing agents to learn strategic cooperation. To demonstrate the effectiveness of our approach, we utilized a simple scenario with heterogeneous roles in each team. We trained this scenario on variations of realistic terrains and compared learning between fine-grained (almost) continuous and waypoint-based movement systems. Our results indicate efficiency in learning and improved performance with waypoint-based navigation. Furthermore, our results show that waypoint-based movement systems can effectively learn differentiated behavior policies for heterogeneous roles in these experiments. These early exploratory results point out the potential of waypoint-based navigation for reducing the computational costs of developing and training MARL models in complex environments. The complete project with all scenarios and results is available on GitHub: https://github.com/XXX/XXX

16:30
Learning Cohesive Behaviors Across Scales for Competitive Agents

ABSTRACT. The development of automated opponents in video games has been part of game development since the very beginning of the field. The advent of modern AI approaches such as reinforcement learning has opened the door to a wide variety of flexible and adaptive AI opponents. However, challenges in producing realistic opponents persist, namely scalability and generalizability. Scalability is of particular importance when many individual opponents are required to act cohesively over long distances, but this makes learning more difficult. This paper presents a novel architecture applying graph convolutional layers in a U-net with custom pooling operators in order to achieve learning across scales. League play reinforcement learning was used to train competitive agents in a navigation mesh environment.

15:30-17:00 Session 8B: Healthcare
Location: Emerald B
15:30
Using Data Synthesis to Improve Length of Stay Predictions for Patients with Rare Diagnoses

ABSTRACT. In healthcare, managing small patient cohorts, particularly those with rare diseases, presents a unique challenge due to the scarcity of data required for effective machine learning applications. Addressing this issue, our paper investigates whether a specific conditional data synthesis step, applied before training the machine learning model using the CTGAN architecture, improves the results. We choose the specific learning task of predicting the hospital length of stay (LoS) of patients leaving the emergency department. This can, e.g., be used to predict bed occupancy in a hospital and thus makes it possible to prevent shortages in bed capacity. The accuracy of the LoS prediction is strongly dependent on the rarity of the patient's disease, ranging from acceptable accuracy for, e.g., frequently occurring homogeneous cases to worse accuracy for, e.g., inhomogeneous and small ones. To increase the accuracy for such cohorts, we enrich the dataset with new, synthesized patient admissions. Then, for each cohort, a model is trained to predict the LoS of a patient of this cohort. Our experiments show that adding synthetic data can increase the accuracy for some cohorts. However, it turns out that synthesis needs to be handled with care, as adding synthetic data to a cohort does not necessarily increase its model performance; thus, we define indicators for deciding whether to use CTGAN for a specific cohort.
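
A minimal sketch of the CTGAN augmentation step using the ctgan package (the file and column names are hypothetical; the paper's conditional synthesis setup may differ):

import pandas as pd
from ctgan import CTGAN

cohort = pd.read_csv("rare_cohort.csv")            # hypothetical cohort table
discrete = ["diagnosis", "sex", "admission_type"]  # illustrative column names

synthesizer = CTGAN(epochs=300)
synthesizer.fit(cohort, discrete_columns=discrete)
augmented = pd.concat([cohort, synthesizer.sample(500)], ignore_index=True)
# `augmented` would then be used to train the per-cohort LoS predictor.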

15:50
Machine learning prediction of severity and duration of hypoglycemic events in type 1 diabetes patients

ABSTRACT. We compare the performance of machine learning methods for building predictive models to estimate the expected characteristics of hypoglycemic, or low blood glucose, events in type 1 diabetes patients. We hypothesize that the rate of change of blood glucose ahead of a hypoglycemic event may affect the severity and duration of the event, and we investigate the utility of machine learning methods that use blood glucose rate of change, in combination with other physiological and demographic factors, to predict the minimum glucose value and the duration of a hypoglycemic event. This work compares the performance of six state-of-the-art methods on prediction accuracy and feature selection. We evaluate which ML methods build the most effective predictive models for this problem and examine whether the blood glucose rate of change is identified as a relevant input to the learned models.

16:10
Detecting Human Bias in Emergency Triage Using LLMs: Literature Review, Preliminary Study, and Experimental Plan

ABSTRACT. The surge in AI-based research for emergency healthcare poses challenges such as data protection compliance and the risk of exacerbating health inequalities. Human biases in demographic data used to train AI systems may indeed be replicated. Yet, AI also offers a chance for a paradigm shift, acting as a tool to counteract human biases. Our study focuses on emergency triage, rapidly categorizing patients by severity upon arrival. Objectives include conducting a literature review to identify potential human biases in triage and presenting a preliminary study. This involves a qualitative survey to complement the review on factors influencing triage scores. Moreover, we analyze triage data descriptively and pilot AI-driven triage using an LLM with data from the local hospital. Finally, assembling these pieces, we outline an experimental plan to assess AI's effectiveness in detecting human biases in triage data.

16:25
Fluid Path Detection Model for Lab on a Chip Images Using Deep Learning-Based Segmentation Approach

ABSTRACT. Lab on a Chip technologies have gained substantial attention for their potential to revolutionize diagnostics, biotechnology, and chemical and mechanical analysis. The complexity of these chip designs and the need to extract valuable insights from fluid behavior within these chips underscore the importance of developing automated tools for data extraction from recorded microfluidic chip videos. Existing data extraction methods are often time-consuming and error-prone, limiting the comprehensive study of fluid dynamics in microfluidic chips. This paper proposes a solution by integrating advanced image processing into the automated tool, offering a robust and efficient method for precise data extraction. In this study, to address challenges in segmenting objects and to enhance performance, we employed a well-performing deep learning model, DeepLabv3, for chip path segmentation. To achieve this, we utilized a dataset of 400 images and masks from recorded videos capturing fluid behavior within a microfluidic chip developed in our lab. Then, using snapshots at different times, we labeled the patterns to create a corresponding mask for each image. Prior to training and validating our dataset using DeepLabv3 for chip path segmentation, several preprocessing steps were applied, including image resizing and data augmentation. The performance results demonstrate that DeepLabv3 achieved a remarkable validation accuracy of 98% with a low loss value of 0.012. This study aims to showcase the latest advancements in an automated data extraction model designed for microfluidic chip videos, addressing challenges related to precise fluid interface tracking and data extraction from intricate fluidic networks. The successful integration of DeepLabv3 and meticulous preprocessing steps contribute to the tool's effectiveness in enhancing our understanding of fluid behavior within microfluidic chips, thereby paving the way for advancements in chip design, diagnostic processes, and other fluid feature-based analyses.
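
A minimal sketch of setting up DeepLabv3 for binary fluid-path segmentation with torchvision (the input size and two-class setup are illustrative assumptions):

import torch
import torch.nn.functional as F
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights=None, num_classes=2)  # background vs. fluid
frames = torch.rand(4, 3, 256, 256)                  # resized chip frames
masks = torch.zeros(4, 256, 256, dtype=torch.long)   # per-pixel labels
logits = model(frames)["out"]                        # (4, 2, 256, 256)
loss = F.cross_entropy(logits, masks)
loss.backward()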

15:30-17:00 Session 8C: SLIE
Location: Emerald A
15:30
GCN-Based Issues Classification in Software Repository

ABSTRACT. Graph Convolutional Networks (GCNs) have demonstrated significant potential in various fields, particularly in classification tasks. This study introduces a GCN-based methodology for classifying issues in software repositories, highlighting advancements in agile software development. Utilizing the TAWOS dataset, our research demonstrates the potential of GCNs to accurately categorize software issues into bugs, improvements, and tasks. Our results indicate a significant improvement in issue classification, especially for bugs. Additionally, we explore a FastText GCN model, underlining its efficiency in handling dynamic, evolving datasets. This paper contributes to the fields of software engineering and machine learning, offering novel insights into enhancing issue management in software projects.

15:50
Decoding Complexity: A Mathematical Framework for Enhanced Translation Comprehension

ABSTRACT. Machine translation tools have demonstrated substantial progress in enhancing translation accuracy since the emergence of artificial intelligence. However, challenges persist in reasoning (or the lack thereof), considering contexts, addressing specific word games, and interpreting very long or very short sentences—those exceeding 50 and falling below 7 words (Bowker, 2023: 893). Additionally, accurately translating technical or specialized terms and their variations remains a hurdle. This research introduces a categorical mathematical formalization of the comprehension stages in translation, along with a model for calculating acceptations (specific meanings of words) during the verification of meaning hypotheses. The goal is to elucidate the comprehension process and integrate contextual considerations. The formalism delineates a series of fundamental cognitive operations involved in comprehension. Furthermore, it advocates for evaluating meaning hypotheses using logical modalities, particularly hypostases, described as phrases (groups of words)—a unit of discourse rather than language—signifying the structure of arguments conveying the speaker's knowledge. The strength of our proposed mathematical model lies in its independence from both source and target languages, as well as the subjectivity of text authors or translators. Additionally, the assessment of meaning hypotheses relies on verifiable logical modalities, ensuring a reliable, explicable, and controllable outcome.

16:10
Predictive model using multivariate analysis and LSTM to assess mining pipeline thickness for corrosion degradation

ABSTRACT. Pipeline corrosion has significant impacts on the human, economic and natural environment. To help better detect and prevent it over time, in this paper we propose a multivariate approach using machine learning. More precisely, we propose to study the evolution of the thickness of the mining pipeline using a multivariate approach and to implement a predictive model using the Long Short-Term Memory (LSTM) artificial neural network. Indeed, LSTM is a specific recurrent neural network (RNN) architecture designed to model temporal sequences. The proposed predictive model achieved an accuracy of 80% and a loss of 0.01, and was able to predict variations in eight thickness measurements over a period of one hundred days.
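
A minimal sketch of an LSTM regressor mapping a window of multivariate readings to the eight thickness measurements (a generic PyTorch illustration; all dimensions are assumptions, not the authors' architecture):

import torch
import torch.nn as nn

class ThicknessLSTM(nn.Module):
    def __init__(self, n_features=8, hidden=64, n_points=8):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_points)

    def forward(self, x):             # x: (batch, days, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict from the last timestep

model = ThicknessLSTM()
pred = model(torch.rand(16, 30, 8))   # 30-day windows -> (16, 8) predictions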

16:30
On Soft Sets and Formal Concept Analysis (FCA) as mathematical categorization systems - An Engineering Application

ABSTRACT. This paper deals with an analysis of the reasoning for the cognitive problem of categorization in systems. Such reasoning requires the construction of a formal model of the system that is performed in 3 steps: cognitive description of the system, extraction of the system’s meaningful concepts, and design of the mathematical model based on the concepts. Our goal is to explore the cognitive features and their predominance in some old models of categorization well known in the literature. We highlight the relation between the Formal Concept Analysis (FCA) model and soft sets, and we analyze the notion of “parameter” from the cognitive point of view. Based on this analysis, we propose the notion of double soft sets as a mathematical notion more adequate for engineering applications. All our study is conducted in the framework of material selection in mechanical engineering.

16:45
Enhancing Biomedical Knowledge Representation Through Knowledge Graphs

ABSTRACT. There is a plethora of information related to the biomedical domain on the internet. Unfortunately, retrieving this information online is challenging because of insufficient semantic metadata embedded within web documents to help search engines interpret them. To address this issue, we have introduced a “Semantically” platform to enhance the semantic annotation of biomedical texts and embed it into the document heading for more accurate data retrieval. However, available semantic annotators struggle to achieve ideal levels of accuracy and speed, and they lack dynamic knowledge representation. For our use case, we have introduced a knowledge graph with an NLP-enhanced search query approach that allows users to choose prominent semantic annotations recommended by the domain through a socio-technical and personalized approach in the “Semantically” framework. The results show that knowledge representation in a knowledge graph returns annotations more clearly and efficiently.

15:30-17:00 Session 8D: Robots
Location: Emerald E
15:30
Model for Knowledge Transfer in Agent Organizations: A Case Study on Moise+

ABSTRACT. Knowledge transfer enables the development of complex multi-agent systems featuring agents that share knowledge to execute tasks. Environments shaped by knowledge transfer involve agents assuming specific roles, engaging in reasoning about the environment and other agents, and forming organizations that transfer knowledge through well-defined plans and strategies. This paper presents an organizational model for knowledge transfer, introducing a centralizing role, named Organizer, responsible for managing relations between the system, the agents, and their roles. An interaction protocol is established to guide the step-by-step communication between the Organizer and the other agents in the system during knowledge transfer. To model knowledge transfer within agent organizations, we employ a dynamic implementation of the Moise+ organizational model, called MoiseLight, enabling the creation of organizations at runtime. By adapting the model to MoiseLight, we validate the proposal, demonstrating the Organizer's ability to facilitate knowledge transfer among agents following the specified interaction protocol.

15:50
The Dynamic Anchoring Agent: A Probabilistic Object Anchoring Framework for Semantic World Modeling

ABSTRACT. Semantic world modeling has been studied extensively, with the goal of enabling robots to understand and interact with their environment. However, existing approaches to semantic world modeling rely on well-defined perceptual data, such as distinct visual features. In situations where different objects are difficult to distinguish based on perceptual data alone, the resulting world model will be ambiguous and inconsistent. To address this challenge, we present the Dynamic Anchoring Agent (DAA), a probabilistic object anchoring framework for semantic world modeling that uses domain knowledge and reasoning to handle the ambiguity of sensor data through probabilistic anchoring. It includes a multi-hypothesis tracker (MHT) as a filter for noisy observations, and a knowledge base that encodes domain knowledge and scene context to reduce uncertainty in the anchoring process. The framework is evaluated on both synthetic and real-world datasets, demonstrating its effectiveness in resolving association ambiguities in the presence of identical-looking instances. It has also been integrated into a real robot platform. We show that with the help of domain knowledge and scene context, the proposed framework outperforms traditional pure data-based algorithms in terms of identification precision and recall, and can effectively resolve ambiguities between sensor-identical object instances.

16:10
Which Actions Are Always Necessary in Fully-Observable Non-Deterministic Planning?

ABSTRACT. Explaining the sequential action choices of an autonomous agent is essential for understanding the agent and learning from it. Given a goal-reaching action sequence in a non-deterministic environment, one important aspect of explaining action choices is deciding whether an action choice is always necessary for the action sequence, or whether it may be omitted under certain circumstances. Most work in explainable planning (XAIP) has focused on explaining action choices in deterministic environments, as assumed by classical planning, and one widely used notion is the causal link. In this work, we apply the notion of causal link to justifying action choices of an execution trace in fully-observable non-deterministic (FOND) settings. We also introduce a new notion, the always-necessary actions of a trace, and propose an approach that derives the always-necessary actions of a trace from the causal-link justifications of action choices.

16:30
Automatically Learning HTN Methods from Landmarks

ABSTRACT. Hierarchical Task Network (HTN) planning usually requires a domain engineer to provide manual input about how to decompose a planning problem. Even HTN-MAKER, a well-known method-learning algorithm, requires a domain engineer to annotate the tasks with information about what to learn. We introduce CURRICULAMA, an HTN method-learning algorithm that completely automates the learning process. It uses landmark analysis to compose annotated tasks and leverages curriculum learning to order the learning of methods from simpler to more complex. This eliminates the need for manual input, resolving a core issue with HTN-MAKER. We prove CURRICULAMA's soundness and show experimentally that it has a convergence rate substantially similar to HTN-MAKER's in learning a complete set of methods.