FLAIRS-37: THE 37TH INTERNATIONAL CONFERENCE OF THE FLORIDA ARTIFICIAL INTELLIGENCE RESEARCH SOCIETY
PROGRAM FOR SUNDAY, MAY 19TH

09:00-10:00 Session 1: Invited Talk 1
Location: Emerald D+E
09:00
The Evolution of AI: How Medical Applications Have Stimulated and Guided the Field

ABSTRACT. Five decades have passed in the evolution of Artificial Intelligence (AI) and its medical applications. Medical AI has evolved substantially while tracking the corresponding changes in computer science, hardware technology, communications, and biomedicine itself. Emerging from medical schools and computer science departments in its early years, the AI in Medicine (AIM) field is now more visible and influential than ever before, paralleling the enthusiasm and accomplishments of AI and data science more generally. This talk will briefly summarize some of AIM history, relating it to the themes in AI itself and providing an update on the status of the field as it enters its second half-century. The inherent complexity of medicine and of clinical care necessitates that AIM research and development address not only decision-making or analytical performance but also issues of usability, workflow, transparency, ethics, safety, and the pursuit of persuasive results from formal clinical trials. These requirements contribute to an ongoing investigative agenda for AIM R&D and are likely to continue to influence the evolution of AI itself.

10:30-12:00 Session 2: Poster Session
Location: Emerald A+B
Promoting Transparency and Trust in Biomedical Data: A FAIR Approach to Content Creation and Sharing [#1]

ABSTRACT. Efficient methods of sharing scientific knowledge are vital to advancing research and ensuring reproducibility. However, the dominance of the Portable Document Format (PDF) for sharing scientific findings makes it hard to adhere to the FAIR principles—Findability, Accessibility, Interoperability, and Reusability—which are essential for making scientific content reproducible. While some web-based frameworks use RDF (Resource Description Framework) and linked data to enhance content interoperability, these often target users with technical expertise. To address this, we have developed Semantically, a user-friendly platform that lets biomedical researchers create and share FAIR biomedical content, ultimately bolstering reproducibility. Semantically consists of a biomedical semantic content authoring module that incorporates technical and social aspects, provides annotation recommendations, and encourages collaboration between authors and domain experts. This approach preserves domain-specific insights and improves the accuracy and relevance of semantic annotations. The platform also proposes a publishing infrastructure that uses schema.org, a widely accepted structured data vocabulary, to ensure datasets are machine-readable and well organized. This improves the FAIRness of data, making it easier to find, access, use, and share. Combining the Semantically authoring module with schema.org for data publishing provides a comprehensive solution for enhancing the FAIRness of biomedical data and contributes to making scientific content reproducible. Semantically is an open-source tool available on GitHub at https://github.com/bukharilab/Semantically.
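As an illustration of the schema.org publishing idea described above, the following sketch builds a minimal JSON-LD Dataset record. The property choices and values are generic schema.org fields chosen for illustration, not Semantically's actual output:

```python
import json

def dataset_jsonld(name, description, url, keywords):
    """Build a minimal schema.org Dataset record as JSON-LD.

    Illustrative only: these are generic schema.org properties,
    not the exact mapping the Semantically platform uses."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }, indent=2)

print(dataset_jsonld(
    "Example biomedical dataset",
    "Annotated clinical notes shared under the FAIR principles.",
    "https://example.org/dataset/1",
    ["biomedical", "FAIR", "annotation"],
))
```

Embedding such a record in a page's markup is what makes the dataset machine-readable to search engines and harvesters.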

An Interpretable Transformer Model for Operational Flare Forecasting [#2]

ABSTRACT. Interpretable machine learning tools, including LIME (Local Interpretable Model-agnostic Explanations) and ALE (Accumulated Local Effects), are incorporated into a transformer-based machine learning model, named SolarFlareNet, to interpret the predictions made by the model. SolarFlareNet is implemented in an operational flare forecasting system that predicts whether an active region (AR) on the surface of the Sun will produce a >= M class flare within the next 24 hours. We use magnetic parameters (features) obtained from the Space-weather HMI Active Region Patches and survey flare events that occurred from May 2010 to December 2022, using the Geostationary Operational Environmental Satellite X-ray flare catalogs provided by the National Centers for Environmental Information (NCEI), to build a database of flares with identified ARs in the NCEI flare catalogs. LIME is employed to determine the ranking of features. First-order ALE plots are used to determine the effect of a single feature on the model's predictions, and second-order ALE plots to determine any additional interaction effects of feature pairs. Together, these tools help scientists better understand which features are crucial for SolarFlareNet's predictions. Experiments show that the tools can explain the internal workings of SolarFlareNet while maintaining its accuracy.
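The perturbation-based intuition behind LIME-style feature ranking can be sketched as follows. This is a much-simplified stand-in: the toy model and features are invented, and real LIME fits a weighted local surrogate model rather than averaging output changes:

```python
import random

def toy_model(x):
    # Stand-in for a trained forecaster over three magnetic
    # "features"; a fixed linear score, not the actual SolarFlareNet.
    return 2.0 * x[0] + 0.5 * x[1] + 0.1 * x[2]

def rank_features(model, x, n_samples=200, eps=0.1, seed=0):
    """Rank features by local sensitivity: the mean |f(x') - f(x)|
    when one feature at a time is perturbed around x. A simplified
    cousin of LIME's local-surrogate explanation."""
    rng = random.Random(seed)
    base = model(x)
    scores = []
    for i in range(len(x)):
        total = 0.0
        for _ in range(n_samples):
            xp = list(x)
            xp[i] += rng.uniform(-eps, eps)
            total += abs(model(xp) - base)
        scores.append(total / n_samples)
    # Most influential feature first
    return sorted(range(len(x)), key=lambda i: -scores[i])

print(rank_features(toy_model, [1.0, 1.0, 1.0]))  # → [0, 1, 2]
```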

The Impact of PDDL+ Language Features on Planning Performance: An Empirical Analysis On a Real-World Case Study [#3]

ABSTRACT. Automated planning is the field of AI that focuses on identifying sequences of actions that allow one to reach a goal state from a given initial state. To support the use of planning techniques in challenging real-world applications, which require the ability to reason about hybrid discrete and continuous changes, expressive languages such as PDDL+ have been introduced. PDDL+ includes a number of features designed to improve the readability and conciseness of the resulting knowledge models, but these features are commonly suspected of having a detrimental impact on the performance of domain-independent search techniques and heuristics.

To shed light on the extent of the impact that some of these language features can have on well-known planning techniques, in this paper we perform an empirical analysis using challenging models from a real-world application and a range of search and heuristic approaches. Surprisingly, our analysis indicates that the use of assignments and conditional effects, usually deemed detrimental to planning performance, positively affects the performance of the considered techniques.

Pulmonary Disease Classification on Electrocardiograms Using Machine Learning [#4]

ABSTRACT. Pulmonary diseases such as chronic obstructive pulmonary disease (COPD) and asthma are among the leading causes of death in the US. These lung diseases are often diagnosed by pulmonologists using a physical exam (e.g., lung auscultation) and objective measurement of lung function with pulmonary function testing (PFT). These extensive tests, however, can be inaccessible to many patients due to limited resources and availability. In this paper, we explore the use of easily accessible electrocardiograms (ECGs) to train machine learning models to classify pulmonary diseases. To this end, we developed and experimented with two approaches: deep neural network models (e.g., Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)) trained on the ECG signals directly, and non-neural models (e.g., support vector machines (SVMs) and logistic regression) trained on features derived from ECGs. On the task of classifying whether a patient has obstructive lung disease, our results show that the deep neural network models outperformed the non-neural models, though the difference is within 3% on accuracy and F1-score metrics.

Adapted Contrastive Predictive Coding Framework for Accessible Smart Home Control System Based on Hand Gestures Recognition [#5]

ABSTRACT. Smart home control systems are widely used to control multiple smart home devices such as smart TVs, smart HVAC, and smart bulbs. When designing a smart home environment, the accessibility of the entire system is crucial to achieving a comfortable environment for all individuals in the home. Thus, in this paper, we aim to improve the daily quality of life of elderly and disabled individuals by improving the user experience of a vision-based accessible home control system. The system is built on an integrated self-supervised and supervised learning framework that provides a robust capability for extracting features, adapting contrastive predictive coding concepts and deep learning for a real-time hand gesture recognition system. The proposed framework provides an accessible smart home control system with a significant improvement in the user experience.
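The self-supervised component rests on contrastive predictive coding, whose core InfoNCE objective can be sketched as follows. This is an illustrative toy with invented similarity scores, not the paper's network:

```python
import math

def info_nce(pos_score, neg_scores):
    """InfoNCE loss at the heart of contrastive predictive coding:
    the positive (true future/augmented) sample's similarity score
    competes against negative samples in a softmax, and the loss is
    the negative log-probability of picking the positive."""
    denom = math.exp(pos_score) + sum(math.exp(s) for s in neg_scores)
    return -math.log(math.exp(pos_score) / denom)

# Low loss when the positive clearly stands out from the negatives...
print(info_nce(2.0, [0.0, 0.0]))
# ...high loss when the negatives score higher than the positive.
print(info_nce(0.0, [2.0, 2.0]))
```

Minimizing this loss pushes the encoder to produce features in which true pairs are more similar than distractors, which is what makes the learned features robust for the downstream gesture classifier.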

Severe Weather Forecasting via the Sole Use of Satellite Data [#6]

ABSTRACT. The forecasting of (severe) weather/climate systems using satellite telemetry and Machine Learning (ML) is generally held back by the size and availability of the pertinent datasets. This research outlines a newly devised pipeline for the automated construction of concise datasets, designed to convert computationally expensive raw data from a netCDF4 database into a simpler format, with the end goal of future use in severe weather forecasting via the sole use of satellite data as an alternative to more conventional, expensive, and localized means. By representing components of the dataset as int8 RGB(A) values of PNG images, data can be spatially related in a concise, consistent, and visualizable manner that significantly reduces dataset size relative to the raw dataset. This method is applied to Atmospheric Motion Vectors (AMVs) derived from multispectral satellite telemetry via Optical flow Code for Tracking, Atmospheric motion vector, and Nowcasting Experiments (OCTANE) to construct a dataset usable for predicting the future movements of clouds.
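The 8-bit channel encoding described above can be sketched as follows. This is a minimal illustration with an assumed wind-speed range; the paper's actual pipeline, channel layout, and value ranges may differ:

```python
def encode_channel(values, lo, hi):
    """Quantize floats in [lo, hi] to 8-bit codes (0-255), i.e. one
    PNG color channel's worth of precision per value."""
    span = hi - lo
    return [round((v - lo) / span * 255) for v in values]

def decode_channel(codes, lo, hi):
    """Recover approximate floats from the 8-bit codes; quantization
    error is at most half a step, span / 255 / 2."""
    span = hi - lo
    return [lo + c / 255 * span for c in codes]

# Hypothetical AMV u-wind components in m/s, assumed range [-40, 40].
u_wind = [-12.5, 0.0, 7.3, 30.0]
codes = encode_channel(u_wind, -40.0, 40.0)
print(codes)  # → [88, 128, 151, 223]
print(decode_channel(codes, -40.0, 40.0))
```

Packing three or four such channels into each pixel of a PNG is what yields the large size reduction over the raw netCDF4 fields, at the cost of the quantization error shown by the round trip.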

CNN Brain Label-Maker: Automatic ICA Rejection EEG based System Architecture [#7]

ABSTRACT. The electroencephalogram (EEG) is a practical and widely applied tool for researching brain disorders and behavior changes. EEG offers a minimally restrictive and non-invasive method, but the significant difficulties in using EEG in studies of cognitive development lie in the temporal resolution, signal sources originating outside the brain, and EEG artifacts. Making an informed judgment about the application of EEG technology requires careful consideration of these factors. Independent component analysis (ICA) has been demonstrated to isolate the various source generator processes underlying signals recorded simultaneously from multiple, near-adjacent EEG scalp electrode channels. The independent components (ICs) discovered by ICA decomposition must be manually inspected, selected, and interpreted, a process that takes time and experience. Automated IC category classification can be achieved with sufficient accuracy, which expedites the analysis of large-scale EEG research and permits the use of ICA decomposition in near-real-time applications. While many such classifiers exist, the next step of brain activity rejection still falls to medical specialists. Thus, this work presents an automated convolutional neural network-based brain activity labeling system for ICA rejection, using data from tools widely used by neurologists and scientists, such as ICLabel for MATLAB and the EEGLab toolbox. Replacing the manual task with an automatic system reduces the processing time by 7200x while achieving an accuracy of 82.36%. The proposed system was trained, verified, and tested using CCHMC clinical data recorded with a 128-channel HydroCel electrode net (Magstim EGI, Eugene, OR) and an EGI NetAmp 400 at a 1000 Hz sampling rate.

Good Apples / Bad Apples: Detection and classification of damage [#8]

ABSTRACT. The sorting of apples into quality classes using vision, and the detection of bad apples using X-ray and vision, have naturally been addressed frequently in the past due to the sheer significance of this fruit. Our novelty consists in going past the detection of damage to measuring it and classifying damaged apples for potential uses beyond direct sale, rather than simply deciding whether an apple is rotten or not. The size and color of damage stains are assessed to help decide the potential use of the apples. One can redirect apples toward various uses outside markets for the general population, such as compost, animal consumption, fermentation, or preserves pre-processing plants. Parameters of the Segment Anything algorithm are evaluated for this problem domain, and the appropriateness of algorithm features is discussed for the real problem, yielding directions for algorithmic improvement.

Leveraging Organizational Hierarchy to Simplify Reward Design in Cooperative Multi-agent Reinforcement Learning [#9]

ABSTRACT. The effectiveness of multi-agent reinforcement learning (MARL) hinges largely on the meticulous arrangement of objectives. Yet conventional MARL methods might not completely harness the structures inherent in environmental states and agent relationships for goal organization. This study is conducted within the domain of military training simulations, which are typically complex, heterogeneous, non-stationary, doctrine-driven environments with a clear organizational hierarchy and a top-down chain of command. This research investigates the approximation and integration of the organizational hierarchy into MARL for cooperative training scenarios, with the goal of streamlining reward engineering and enhancing team coordination. In preliminary experiments, we employed two-tiered commander-subordinate feudal hierarchical (CSFH) networks to separate the prioritized team goal from individual goals. The empirical results demonstrate that the proposed framework enhances learning efficiency. It guarantees the learning of a prioritized policy for the commander agent and encourages subordinate agents to explore areas of interest more frequently, guided by appropriate soft constraints imposed by the commander.
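The separation of a prioritized team goal from individual goals can be illustrated with a simple shaped-reward function. The weights and the soft-constraint penalty below are invented for illustration, not the paper's values:

```python
def shaped_reward(team_reward, individual_reward, constraint_violation,
                  w_team=1.0, w_ind=0.3, penalty=0.5):
    """Combine a prioritized team reward with a subordinate's own
    reward, minus a penalty for violating the commander's soft
    constraints. Giving the team term the largest weight is what
    keeps the team goal prioritized over individual exploration."""
    return (w_team * team_reward
            + w_ind * individual_reward
            - penalty * constraint_violation)

print(shaped_reward(1.0, 0.5, 0.0))  # ≈ 1.15: no violation
print(shaped_reward(1.0, 0.5, 1.0))  # ≈ 0.65: soft constraint violated
```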

Augmenting Training Data for a Virtual Character Using GPT-3.5 [#10]

ABSTRACT. This paper compares different methods of using a large language model (GPT-3.5) to create synthetic training data for a retrieval-based conversational character. The training data take the form of linked questions and answers, which allow a classifier to retrieve a pre-recorded answer to an unseen question; the intuition is that a large language model could predict what human users might ask, saving the effort of collecting real user questions as training data. Results show small improvements in test performance for all synthetic datasets. However, a classifier trained on only a small amount of collected user data achieved a higher F-score than classifiers trained on much larger amounts of synthetic data generated using GPT-3.5. Based on these results, we see potential in using large language models for generating training data, but at this point it is not as valuable as collecting actual user data for training.
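The F-score comparison above uses the standard F1 computation, which can be sketched as follows (the counts in the example are invented, not the paper's results):

```python
def f1_score(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall, computed
    from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g. 40 correct retrievals, 10 spurious, 10 missed:
print(f1_score(40, 10, 10))  # ≈ 0.8
```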

Perception Model for Mobile Robots Assisting Humans in Decision-Making during Complex Situations [#11]

ABSTRACT. Perception, the process of comprehending and deriving meaning from one's surroundings, is fundamental to human decision-making. In this context, we explore the development of a robust perception model designed for mobile robots to facilitate effective human-robot communication and decision-making in dynamic and intricate scenarios. Our research focuses on integrating state-of-the-art sensing technologies to enhance human decision-making. The perception model incorporates multi-sensor fusion, utilizing LiDAR, cameras, and inertial sensors to create a dynamic representation of the environment. Object recognition and tracking algorithms further enable the robot to interpret the scene, providing valuable insights for informed decision-making. This paper presents a novel perception model tailored for mobile robots, emphasizing its role in assisting humans during decision-making processes. Our preliminary model includes multi-sensor fusion and semantic scene analysis and understanding, evaluated using existing SLAM datasets. Extending our perception models to domains such as disaster response, search and rescue, and industrial settings is a key objective; in these domains, a mobile robot serves as a valuable companion, aiding navigation by providing timely and relevant information. The initial stage, results, and evaluation of our perception model are detailed in this paper. Despite the introduction of various tools and techniques, effective decision-making based on perceived information remains a challenging interdisciplinary endeavor. We achieve localization without GPS in a road network using stratified sequential importance sampling, where the stratification levels are based on semantic object spaces in the map and on running time; articulating and describing the known environment in this way demonstrates the potential of our perception model.

Growth Mindset Emojifier Multimodal App [#12]

ABSTRACT. This research introduces the Growth Mindset Emojifier Multimodal App (GMEMA), a pioneering tool designed to revolutionize educational feedback through the integration of emojis. By leveraging the ChatGPT-4-Vision model, GMEMA provides emotionally nuanced feedback, fostering student engagement and promoting growth mindsets. The app's innovative features include automatic Java quiz assessment, customized feedback based on student performance, and motivational statements tailored to the extent of task completion. Initial results highlight a significant enhancement in emotional engagement and comprehension among students receiving emoji-enhanced feedback, affirming the critical role of emotional intelligence in educational technologies.

Dementia Detection with Phonetic and Phonological Features [#13]

ABSTRACT. Several previous studies have shown that dementia of the Alzheimer's Disease (AD) type can impact the pronunciation skills of patients with this disease. However, to the best of our knowledge, no previous work on AD has used features from the related linguistic levels of phonetics and phonology in a systematic way. This paper has a twofold goal. First, it aims to improve our understanding of the impact of AD on different aspects of the linguistic levels of phonetics and phonology. Second, it aims to demonstrate the feasibility of using phonetic and phonological features to build a machine learning classifier that automatically diagnoses AD based on language samples produced by AD patients and normal subjects. The ADReSS challenge dataset is used for training and testing a binary classifier designed to diagnose AD. This dataset consists of transcripts of descriptions of the Cookie Theft picture, produced by 54 subjects in the training part and 24 subjects in the test part. The number of narrative samples is 108 in the training set and 48 in the test set. Two machine learning experiments were conducted on the task of separating transcribed speech samples produced by people with AD from those produced by normal subjects. The first experiment showed that, among all the subtypes of phonetic and phonological features covered in this paper, vowels provided the best classification performance. The second experiment, which used four feature selection techniques, showed that the adopted phonetic and phonological features provided about a 0.87 F1 score, close to the 0.89 F1 performance reported in the ADReSS challenge by systems using multiple linguistic levels and machine learning techniques. This result confirms the importance of the covered features as indicators of dementia.

Intelligent Prevention of DDoS Attacks using Reinforcement Learning and Smart Contracts [#14]

ABSTRACT. (Distributed) Denial-of-Service (DoS/DDoS) attacks are among the most dangerous cybersecurity threats to computer networks. Lately, blockchain and artificial intelligence (AI) cyberdefense applications have successfully been implemented to identify attack patterns. This paper proposes a novel collaborative, blockchain-based multi-agent reinforcement learning (RL) cyberdefense method using smart contracts. Initial numerical experiments have shown that the agents quickly learn to predict attacks, which can lead to mitigating network-wide service disruptions.
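The RL component can be illustrated with a single tabular Q-learning update. This is a minimal sketch: the states, actions, and rewards below are invented, and the paper's multi-agent, smart-contract coordination is not modeled:

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the observed
    reward plus the discounted best value of the next state."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

# Toy defense agent: reward 1 for blocking traffic during an attack.
q = {"idle":   {"allow": 0.0, "block": 0.0},
     "attack": {"allow": 0.0, "block": 0.0}}
q_update(q, "attack", "block", 1.0, "idle")
print(q["attack"]["block"])  # → 0.1
```

Repeating such updates over simulated attack episodes is what lets each agent learn to predict and block attack patterns.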

Genetic algorithm feature selection resilient to increasing amounts of data imputation [#15]

ABSTRACT. This paper investigates the robustness of a genetic algorithm (GA) in feature selection across a dataset with increasing imputed missing values. Feature selection can be beneficial in predictive modeling to reduce computational costs and potentially improve performance. Beyond these benefits, it also enables a clearer understanding of the algorithm's decision-making processes. In the context of real-world datasets that can contain missing values, feature selection becomes more challenging. A robust feature selection algorithm should be able to identify the key features despite missing data values. We investigate the effectiveness of this approach against two other feature selection algorithms on a dataset with increasingly imputed values to determine whether it can sustain good performance with only the selected features. Our results reveal that, compared to the other two methods, the features selected by the GA resulted in better classification performance across different imputation rates and methods.
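A minimal GA for feature selection over bitmask chromosomes might look like the following sketch. The fitness function, population size, and operators are illustrative, not the paper's setup:

```python
import random

def ga_select(fitness, n_features, pop_size=20, generations=40, seed=1):
    """Evolve bitmask chromosomes (1 = feature kept) with elitist
    selection, one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # elitism
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:               # bit-flip mutation
                i = rng.randrange(n_features)
                child[i] ^= 1
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# Toy fitness: features 0-2 are informative, the rest only add cost.
def fitness(mask):
    return sum(mask[:3]) - 0.2 * sum(mask[3:])

best = ga_select(fitness, n_features=8)
print(best)
```

In the paper's setting the fitness would instead be a classifier's validation score on the (possibly imputed) dataset restricted to the selected features.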

Human Cognition for Mitigating the Paradox of AI Explainability: A Pilot Study on Human Gaze-based Text Highlighting [#16]

ABSTRACT. Artificial Intelligence (AI) explainability plays a crucial role in fostering robust Human-AI Interaction (HAI). However, circular reasoning compromises decision robustness due to limitations in existing AI explainability methods. To address this challenge, we propose leveraging human cognition to enhance explainability, aligning with analysis goals without relying on potentially biased labels. By developing text highlighting driven by human gaze patterns, our research demonstrates that human gaze-based text highlighting significantly reduces decision time for proficient readers, without significantly affecting accuracy or bias. This study concludes by emphasizing the value of human cognition-based explainability in advancing explainable AI (XAI) and HAI.

Advancing Precision Healthcare Analytics: Machine Learning Approaches for Diabetes Prognosis using the PIMA Indian Dataset [#17]

ABSTRACT. The prevalence of diabetes presents a significant global health challenge, necessitating effective prognostic tools for timely intervention. Harnessing machine learning offers a promising avenue for accurate prediction, aiding in early detection and prevention. This study delves into the development of machine learning models for diabetes prognosis, leveraging the PIMA Indian dataset. Emphasizing the importance of early detection, the research underscores diabetes as a modifiable condition through lifestyle adjustments. By analyzing diverse healthcare data, including electronic health records and imagery, machine learning algorithms can unveil latent patterns crucial for timely diagnosis. The research methodology encompasses a range of statistical and machine learning algorithms, applied to the PIMA dataset to identify optimal prediction methods. The findings highlight Deep Learning's (DL) efficacy in feature extraction and prediction accuracy, suggesting its potential for automated prognostic tools. Furthermore, the integration of omics data holds promise for enhancing DL model performance, paving the way for robust diagnostic solutions.

The Effectiveness of Autonomous Intersections in a City [#18]

ABSTRACT. Vehicle navigation on roads is a complex problem that will probably be solved by using artificial intelligence in key roles. Today, there are cars capable of autonomous driving, but they depend on an old infrastructure that primarily includes intersections designed for human drivers. This paper opens a new chapter in the area of autonomous intersection management. Most research to date has looked at implementing a solution for a single intersection. We have created a simulation that runs in real time, with up to several dozen intersections side by side. In this work, we conduct experiments to test the deployment of the autonomous algorithm in a city alongside traffic lights. Autonomous intersections prevail thanks to their efficiency, and in the case of a limited budget, it is most advantageous to deploy them at the busiest intersections.
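A reservation-style policy, one common approach to autonomous intersection management, can be sketched as follows. This is a minimal single-conflict-zone illustration; the paper's real-time multi-intersection simulator is far richer:

```python
def assign_slots(requests, crossing_time=2):
    """Greedy reservation manager: give each arriving vehicle the
    earliest crossing slot that does not overlap an existing
    reservation of the shared conflict zone."""
    reserved = []   # (start, end) intervals already granted
    schedule = {}
    for vehicle, arrival in sorted(requests, key=lambda r: r[1]):
        start = arrival
        for s, e in sorted(reserved):
            # Shift past any reservation that would overlap ours.
            if start < e and start + crossing_time > s:
                start = e
        reserved.append((start, start + crossing_time))
        schedule[vehicle] = start
    return schedule

# Vehicles a, b, c arrive at t = 0, 1, 5; each crossing takes 2 ticks.
print(assign_slots([("a", 0), ("b", 1), ("c", 5)]))
# → {'a': 0, 'b': 2, 'c': 5}
```

Vehicle b arrives while a is still crossing, so it is delayed one tick; c arrives after the zone is free and crosses immediately, with no traffic-light phase wasted.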

Terrain-Aware Military Planning Agents [#19]

ABSTRACT. Terrain is important, and often decisive, in military battles. This concept recurs across numerous historical examples, military theorists, and doctrinal manuals used by armies around the world. This poster discusses how Mission Command Agents, automated agents used to represent the behaviors of military forces in combat simulations, achieve this understanding of the importance of terrain. First, a review of tactical manuals from different nations consistently identified the importance of terrain to observation, fires, and mobility: depending on the unit's mission, the geography affords or prevents these activities with respect to the enemy force. The next step was to build an automated tool that could quickly calculate these effects based on anticipated positions. The third step was to formulate these quantifiable terrain effects as objectives in a multi-objective search heuristic so that different places could be compared with each other. After excellent results with these techniques in a realistic military planning scenario, the team is further enhancing human-machine collaboration in this area by adding geospatial and military characteristics of terrain to an existing standard for Command and Control – Simulation Interoperation (C2SIM). With this enhancement, the reasoning employed by the agents is more explainable to military observers. It also allows military experts to adjust agent behaviors by changing their goals, without the need to change source code, while using a user interface with which they are familiar.

AI Tutor: Student’s Perceptions and Expectations of AI-Driven Tutoring Systems: A Survey-Based Investigation [#20]

ABSTRACT. Generative AI (GenAI) and LLMs have started to influence how teachers teach and students learn, including in programming language instruction and tutoring. However, there has been debate about whether AI is beneficial to students' learning. One way to examine this issue is from the perspective of the students. Therefore, this study explored how students perceive the use of AI in their education. The data were gathered through interviews with 62 students and other stakeholders, such as instructors and IT specialists. The results showed that the students positively perceived using AI as a tutor. Moreover, this study suggests several things to consider when integrating AI tutors for programming. The findings reveal positive student perceptions regarding AI's potential within the teaching-learning process. Students envision AI tutors offering personalized assistance, adapting to individual learning styles, and providing immediate feedback, potentially augmenting their grasp of programming concepts. We applied statistical analysis, machine learning, and natural language processing techniques such as PCA, t-SNE, LDA, and sentiment analysis.

Exploring Contrastive Learning Neural-Congruency on EEG Recording of Children with Dyslexia [#21]

ABSTRACT. Electroencephalogram (EEG) recordings of children are often used to study the underlying neural basis of causal factors of reading disorders and dyslexia, such as phonological ability and naming speed. However, the inter-subject variability in EEG and the unconstrained nature of the reading experiments used to elicit and measure these factors have made it challenging for traditional EEG analysis methods to extract the neural components of these factors. In this work, we aim to explore the use of novel deep neural network architectures and contrastive learning methods to overcome the methodological limitations of traditional techniques and enhance the extraction of neural components during reading tasks. The vision is to expand the use of contrastive learning methods, common in many areas of computer vision, to EEG recordings. Motivated by the success of the recently proposed neural-congruency framework for EEG analysis - a method that extracts neural components based on the similarity of EEG responses across subjects - we propose formulating this extraction process in a deep neural framework. Notably, we formulate a neural network architecture that extracts EEG embeddings using a contrastive loss that maximizes the neural congruency in non-dyslexic children compared to children with dyslexia. We plan to evaluate our approach on three EEG datasets involving children with dyslexia performing Rapid Automatized Naming (RAN) and Phonological Processing (PA) tasks. The proposed contrastive learning framework will provide an enhanced tool to facilitate studying the neural underpinnings of naming speed and its association with reading performance and related difficulties.

Game Theoretic Analysis of the Feedback Loop Caused by Widely Available Computer Estimation on Market Values [#22]

ABSTRACT. The public availability of computer-generated predictions can change markets, and its impact is investigated here with a game-theoretic approach. Real estate inflation is not a new phenomenon, but its consistent and almost monotonous persistence over unusually many years, coinciding with the new prominence of public estimation information from successful Mass Real Estate Estimators (MREEs), has already caused various independent research organizations to investigate potential links. We model a repeated theoretical game between the MREEs and the home owners, where each player has secret information and expertise. In contrast to competing results, for our model and simulations, restricting MREE-style price estimation availability to opt-in properties may help partially reduce inflationary pressure.

Image Interpretation Confusion Resolution by Collaboration [#23]

ABSTRACT. Visual scene understanding can benefit from inputs provided by multiple participants with their different perspectives, and a distributed version of modified Waltz filtering, enriched with modern AI inferences, can potentially improve accuracy and speed trade-offs by exploiting the simultaneous perspectives and logic. Speed is improved by the implicit parallelization of processing. Accuracy improvements are expected from updating constraints with novel and more powerful inferences that the participants can apply. Automatically understanding scenes is a highly relevant problem. Multiple robots communicate with one another to classify the shapes of the edges of an object. Local reasoning can reduce communication latency.
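The core of Waltz filtering, discarding edge labels that have no consistent neighbor label until a fixed point is reached, can be sketched as follows. The toy labels and constraints are illustrative, not a full junction catalogue, and the distributed multi-robot aspect is not modeled:

```python
def waltz_filter(domains, constraints):
    """Prune each edge's candidate labels to those supported by at
    least one surviving label on a constrained neighbor, repeating
    until nothing changes (arc consistency, as in Waltz filtering)."""
    changed = True
    while changed:
        changed = False
        for (a, b), allowed in constraints.items():
            keep = {x for x in domains[a]
                    if any((x, y) in allowed for y in domains[b])}
            if keep != domains[a]:
                domains[a] = keep
                changed = True
    return domains

# Two adjacent edges; e2 is known convex, and only the listed label
# pairs are mutually consistent at their shared junction.
domains = {"e1": {"convex", "concave", "occluding"}, "e2": {"convex"}}
constraints = {
    ("e1", "e2"): {("convex", "convex"), ("occluding", "convex")},
    ("e2", "e1"): {("convex", "convex"), ("convex", "occluding")},
}
print(waltz_filter(domains, constraints))
```

In the distributed version each robot would run this pruning on the edges it sees and exchange only the reduced domains, which is where the speed gain from implicit parallelism comes from.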

Government Health Communication During the COVID-19 Pandemic: A BERT Topic Modeling Approach [#24]

ABSTRACT. Different levels of government agencies exerted great effort to communicate with the public during the COVID-19 pandemic on multiple social media platforms. This study uses BERT topic modeling, an artificial intelligence model, to extract topics from various public health agencies of cities, states, and the federal government on Twitter and Facebook for the years 2020 and 2021. We contrast and compare the major topics addressed by these agencies related to COVID-19 and the pandemic across the two major social media platforms. The findings show how BERT topic modeling can be employed to extract social media topics during a health emergency and to evaluate the extent to which the topics covered by these agencies address the major social and health concerns of the pandemic.
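After clustering documents, BERT topic modeling (as in BERTopic) typically labels each topic using a class-based TF-IDF weighting over the cluster's words. A simplified stand-in for that keyword step, skipping the transformer embedding and clustering stages entirely and using invented toy documents, might look like:

```python
import math
from collections import Counter

def class_tfidf(topics):
    """Score each word per topic: its frequency within the topic's
    pooled documents, downweighted by how many topics contain it
    (a c-TF-IDF-style weight)."""
    n_topics = len(topics)
    counts = {t: Counter(w for doc in docs for w in doc.split())
              for t, docs in topics.items()}
    df = Counter()                     # topic-level document frequency
    for c in counts.values():
        df.update(c.keys())
    scores = {}
    for t, c in counts.items():
        total = sum(c.values())
        scores[t] = {w: (n / total) * math.log(1 + n_topics / df[w])
                     for w, n in c.items()}
    return scores

topics = {
    "masks": ["wear a mask indoors", "mask mandates remain"],
    "vaccines": ["vaccine doses available", "book a vaccine appointment"],
}
scores = class_tfidf(topics)
print(max(scores["vaccines"], key=scores["vaccines"].get))  # → vaccine
```

Words shared across topics (like "a") score low, so each topic surfaces its distinctive vocabulary as its label.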

How did an election fraud narrative spread online? Testing theories using Artificial Intelligence [#25]

ABSTRACT. In this study, we investigated 1) how election fraud narratives propagated on social media, and 2) the role of influential actors in the process of building and spreading the election fraud narratives. We applied machine learning (ML) and artificial intelligence (AI) methods to analyze Twitter data on an election fraud narrative after the 2020 Presidential election. We identified influential actors and found evidence for the former President’s use of social media to cue group identity.

Moving to Advanced Research on Human Machine Interface Design [#26]

ABSTRACT. This poster paper discusses a variety of items related to Human-Machine Teaming and research in support of increasing control of autonomous machines in physical problem domains of interest. Many military tasks can be decomposed into their primary elements – intelligence preparation, reconnaissance, movement, maneuver, fires, and support across the combat domains of interest – air, land, sea, etc. There are an increasing number of autonomous and semi-autonomous ground-based systems available, such as the Multi-Utility Tactical Transport (MUTT) Unmanned Ground Vehicle for movement of materials, or the Quadrupedal Unmanned Ground Vehicle (QUGV) for the disposal of explosive ordnance, complemented by aerial platforms consisting of a wide variety of Unmanned Aerial Systems (UAS) for gathering and transmitting information, held together by a common backbone and network. The preponderance of new systems and capabilities brings new issues, one of which is critical to research: how to control and manage a large number of systems. Commercial systems of significantly less capability, such as light-up drone shows, involve approximately 2 people per 100 drones, meaning that a single controller for a single drone is simply a non-starter. How can research be applied to scale the complexity of operations upwards without additional demands on personnel?
This poster discusses several portions of early-stage research into a variety of applications, including:
• Reasoning about mixed-team processes, including real/synthetic teammates, and how information gathered about the use of synthetic teammates within simulation can be transitioned into utilization for a robotic teammate
• Embedding affective information into dialogue channels in order to save cognitive bandwidth
• Repurposing of foundational dialogue models for specific tasks and purposes
• Utilizing psychological research to design systems in order to infer user intent
• Theory-of-mind research for autonomous systems regarding their human operators
• Simulated environments and agents in order to test the simulations in representative areas
• Common-sense reasoning augmented by Large Language Model technology for instructing robotic platforms
These topics will be discussed at the poster session, representing a portfolio of ongoing work addressing the problems of large-scale multi-agent command and control. While the near-term application of these technologies is to military problem domains, the poster presentation is cleared for public release.

AI-Driven Emergency Patient Flow Optimization is Both an Unmissable Opportunity and a Risk of Systematizing Health Disparities [#27]

ABSTRACT. There is a burgeoning interest in harnessing artificial intelligence (AI) to enhance patient flow within emergency departments (EDs). However, this advancement is accompanied by a significant risk: by relying on historical healthcare data, these AI tools may perpetuate existing systemic biases associated with gender, ethnicity, age, and socioeconomic status. This paper presents a comprehensive literature survey aimed at identifying potential sources of bias within ED data. These insights are crucial for the development of responsible AI-based solutions intended to optimize ED workflows.

Your Brain on STEM Video Lessons: Exploring Neurophysiological Patterns and Educational Engagement with Video Content [#28]

ABSTRACT. The COVID-19 pandemic catalyzed a significant shift towards online learning, revealing the potential of online educational videos as an educational tool in the post-crisis era. As we venture through this evolved educational terrain, it becomes crucial to understand the effectiveness and impact educational video content has on students' engagement and performance. This research explores how different styles of educational video content on STEM (Science, Technology, Engineering, and Mathematics) topics impact students' engagement and comprehension using Electroencephalography (EEG) and eye-tracking data of participants viewing the educational videos. In particular, we propose a machine learning driven analysis framework to study which EEG and eye-gaze-based metrics are informative of students' engagement and attention to STEM-related educational videos and predict student-population-wide comprehension. Although still in the preliminary stages, our research endeavors to identify correlations between neurophysiological patterns and educational engagement across disciplines.

Enhancing Image Classification through Exploitation of Hue Cyclicity in Convolutional Neural Networks [#29]

ABSTRACT. This study introduces innovative methodologies for image classification employing Convolutional Neural Networks (CNNs) by leveraging the cyclical attributes of hue within the HSV color space. Two distinct kernels are explored to linearize the circular values of hue. The first kernel converts the angular values to three modulo distance values corresponding to three color hue points. The second kernel utilizes trigonometry to convert angles into sine and cosine linear values. Experimental evaluations demonstrate that linearizing hue values leads to a notable enhancement in classification accuracy. This research provides insights into optimizing CNN-based image classification by integrating hue cyclicity, thereby advancing the capabilities of computer vision systems.
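The two hue kernels described above can be sketched as follows (a minimal illustration only: the three anchor hues at 0°, 120°, and 240° and the degree-based convention are assumptions, not the paper's exact choices):

```python
import math

def hue_modulo_distances(hue_deg, anchors=(0.0, 120.0, 240.0)):
    """Kernel 1 (sketch): distance from a hue angle to three anchor hues,
    measured around the color circle, so 350 degrees is near 10 degrees."""
    return tuple(
        min(abs(hue_deg - a) % 360.0, 360.0 - abs(hue_deg - a) % 360.0)
        for a in anchors
    )

def hue_sin_cos(hue_deg):
    """Kernel 2 (sketch): map the hue angle to its sine and cosine,
    yielding two linear values with no wrap-around discontinuity."""
    rad = math.radians(hue_deg)
    return (math.sin(rad), math.cos(rad))
```

Either encoding removes the artificial jump at the 0°/360° boundary, which is the cyclicity problem the abstract targets.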

Enhancing Nearest Neighbor Classification Performance through Dynamic Time Warping with Progressive Penalty [#30]

ABSTRACT. Dynamic time warping (DTW) is a widely used metric for comparing time series data, offering elasticity in alignment. While the original DTW allows infinite elasticity without penalty, weighted DTW (wDTW) imposes a constant penalty regardless of elastic length. In this study, we propose DTW with a progressive penalty. Experimental evaluations across diverse time series datasets demonstrate the effectiveness of this approach using nearest neighbor classification. Optimal hyperparameters, including the number of neighbors and the progressive weight factor, are jointly identified with the Minkowski p value using a Gaussian process. The proposed methodology shows promise for enhancing performance across various applications leveraging DTW.
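The idea of a penalty that grows with elastic length can be sketched with standard DTW dynamic programming; here the penalty is taken to grow with the off-diagonal offset |i - j| scaled by a weight w, which is an assumed form for illustration, not the paper's exact definition:

```python
def dtw_progressive(x, y, w=0.1):
    """DTW distance with a progressively growing warping penalty (sketch).
    Aligning x[i] with y[j] pays an extra w * |i - j|, so longer elastic
    stretches away from the diagonal cost more; w = 0 recovers plain DTW."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1]) + w * abs(i - j)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

A nearest neighbor classifier would then use this distance in place of the Euclidean or plain-DTW distance.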

Towards Youth Mental Health Support: Developing a Prototype AI Counselor [#31]

ABSTRACT. Many youths suffer from mental, emotional, and behavioral illnesses. Prevention and treatment require health professionals providing empathetic and dedicated guidance, which takes time and frequent interactions. Frequent visits to professionals are not affordable for many youths, and an automated AI chatbot can be a good alternative. In this study, an AI chatbot is designed and implemented with youth engagement. This engagement is aimed at identifying the requirements for the automated consulting chatbot and at generating responses in a language style more relatable to youths. As a preliminary study, youths were engaged in data analytics to understand drug overdose status, and their input was used to identify design requirements that can help youths with drug issues. A prototype, ChatDAKOTA, is implemented and trained with a youth-generated response dataset to show the feasibility of an AI-driven virtual counselor that generates responses meeting the needs of youths coping with drug abuse and with the prevention and treatment of other behavioral and mental illnesses.

Reinforcement Learning Agents with Generalizing Behavior [#32]

ABSTRACT. We explore the generality of Reinforcement Learning (RL) agents on unseen environment configurations by analyzing the behavior of an agent tasked with traversing a graph-based environment from a starting position to a goal position. We find that training on a single task is likely to result in inflexible policies that do not respond well to change. Instead, training on a wide variety of scenarios offers the best chance of developing a flexible policy, at the expense of increased training difficulty. We also explain how these policies can be integrated into simulation or gaming environments.

Intelligent Infrastructure Facilitating Sequence Recommendation for Cybersecurity Education Systems [#33]

ABSTRACT. The ability to incorporate original and adapted data into query-based storage structures to provide dynamic and timely service to sequence recommendation systems is a continuous goal of learning management systems. This can be a challenging goal when data integrity and student privacy are paramount. We are developing a hybrid machine learning-assisted system (CyberTaliesin) for cybersecurity educational support. In this poster, we present the early building blocks of the system involving the use of federated knowledge graphs as a trusted knowledge source capable of learning from “less restricted” models such as large language models. How can integrating these tools yield a flexible system that improves sequence recommendations, facilitates concepts such as adaptive and personalized learning, and achieves improved competency-based educational outcomes?

Mobile Fog AI for Internet of BattleField Things (IoBT) [#34]

ABSTRACT. New mechanisms provide soldiers with automation capabilities that include interactive, mobile, fast, scalable, secure, and powerful computational communication in contested environments such as battlefields. The Internet of BattleField Things (IoBT) is a technology that develops integrated Internet of Things (IoT) solutions for the war of the future. The IoBT connects soldiers with smart technology in armour, radios, weapons, munitions, and other objects such as health sensors, GPS, and various IoT devices. These ``things'', objects, or constrained devices can communicate with each other or across existing network infrastructure such as the Internet or network meshes. They typically come with limited capacity in terms of memory storage, computational power, and energy. IoT, networking (Device-to-Device (D2D) and Peer-to-Peer (P2P) communication), machine intelligence, and multimodal sensing and fusion are some areas involved in developing interdisciplinary IoBT networks. Our aim is to equip soldiers with AI-powered IoBT devices. Existing D2D/P2P/LPWAN-based IoBT technology focuses on edge computing devices and is very limited: slow, insecure, and lacking sufficient scalability. We present a novel concept, Mobile Fog AI for IoBT, to overcome existing IoBT limitations.

Opinion Identification using a Conversational Large Language Model [#35]

ABSTRACT. The paper focuses on testing the use of conversational Large Language Models (LLMs), in particular ChatGPT and Google models, instructed to assume the role of linguistics experts to produce opinions. In contrast to knowledge/evidence-based objective factual statements, opinions are defined as subjective statements about animates, things, events, or properties in the context of an Opinion (Speech) Event in a social-cultural context. The taxonomy distinguishes explicit (direct/indirect) and implicit opinions (positive, negative, ambiguous, or balanced). Contextually richer prompts at the LLMs' training phase are shown to be needed to deal with variants of implicit opinion scenario types.

Training Reinforcement Learning Agents to React to an Ambush for Military Simulations [#36]

ABSTRACT. There is a need for realistic Opposing Forces (OPFOR) behavior in military training simulations. Current training simulations generally only have simple, non-adaptive behaviors, requiring human instructors to play the role of OPFOR in any complicated scenario. This poster addresses this need by focusing on a specific scenario: training reinforcement learning agents to react to an ambush. It proposes a novel way to check for occlusion algorithmically and presents vector fields showing the agent's actions over the course of a training run. It shows that a single agent switching between multiple goals is possible, at least in a simplified environment. Such an approach could reduce the need to develop different agents for different scenarios. Finally, it shows a competent agent trained on a simplified React to Ambush scenario, demonstrating the plausibility of a scaled-up version.

Autonomous Underwater Robot for Water Quality Measurement of Northwest Pennsylvania Lakes [#37]

ABSTRACT. The Northwestern Pennsylvania Region, notably affected by toxic algae blooms like those in Lake Erie, suffers from a lack of adequate water quality monitoring tools. This project aims to bridge this gap through the development of cost-effective underwater robots designed to gather critical water quality data. The software developed for this project facilitates the generation of analytics essential for assessing the health of aquatic environments. The robots engineered through this project are set to be deployed by the local Conservation District for routine water quality monitoring activities.

An NLP Cluster Analysis of AI Ethics Syllabi [#38]

ABSTRACT. With new technology comes new responsibilities. Examining AI through an ethical lens has become increasingly important and significant. Numerous organizations, such as IEEE, have developed AI ethics guidelines. Additionally, academia is essential for fostering innovation and developing gifted people who work in both the ethical and technical spheres. Academia prepares students to be culturally responsive, have a collaborative mindset with interdisciplinary skills, and be aware of equity issues and other social problems that plague society. To assess the content in academia, a Natural Language Processing (NLP) analysis of AI ethics syllabi at the university level was conducted. This study can be described as a scoping review of 45 AI ethics syllabi that are publicly available and sourced online. Some important features captured from each syllabus are the course description, topics, department, and year. Using various NLP tools for analysis, a general exploration of AI ethics curricula was conducted. Through supervised and unsupervised clustering and Latent Dirichlet Allocation (LDA), various patterns in AI ethics syllabus contents were found. Some of these include patterns from syllabi across various academic departments and from the pre- and post-ChatGPT eras. This study is insightful as it acts as a baseline for investigating various AI ethics topics that converge across academic departments, as well as uncovering potential gaps in AI ethics syllabus contents.

Aircraft Engine Remaining Useful Life Prediction Using Machine Learning [#39]

ABSTRACT. Knowing the Remaining Useful Life (RUL) of aircraft engines is of paramount importance in the aviation industry. RUL helps anticipate engine failures beforehand so that airlines can proactively schedule maintenance, optimize resource allocation, and reduce the risk of downtime. In this work, we consider two problems on the prediction of RUL: a binary classification problem to predict whether the engine will fail within a month, and a regression problem to predict the remaining number of operational cycles before engine failure. To this end, using the NASA C-MAPSS dataset, we trained a slew of machine learning models to address the aforementioned two problems. Our results show that the Long Short-Term Memory (LSTM) model performed the best on the binary classification problem with 0.95 precision, 0.88 recall, and 0.91 F1-score on the test set, and that the Convolutional Neural Network (CNN) model performed best on the regression problem with 14.02 RMSE on the test set. The state-of-the-art paper documented an RMSE of 16.42; our work not only surpassed the reference RMSE but also demonstrated superior predictive accuracy.
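The reported F1-score is consistent with the reported precision and recall, since F1 is their harmonic mean; a quick check:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# The abstract's classification figures: 0.95 precision, 0.88 recall
# give an F1 of about 0.91 when rounded to two decimal places.
f1 = f1_score(0.95, 0.88)
```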

Predicting Solar Energy Output On Meteorological Time-Series Data Using Machine Learning [#40]

ABSTRACT. Solar energy production using photovoltaic (PV) systems is increasingly popular as a source of renewable energy for numerous applications. However, there is a main challenge with solar energy, namely, the unpredictability of its energy output. Therefore, accurate short-term prediction of the power output of PV systems is essential for effective decision making in power grid management. To this end, this paper focuses on training selected machine learning models, both traditional regression models and deep recurrent neural networks, to accurately predict solar energy output on meteorological time-series data from the Alice Springs solar farm in Australia. These machine learning models include linear regression, gated recurrent unit, recurrent neural network, long short-term memory, and random forest regression. The results of these tests showed that simple ensemble methods can outperform powerful single models and that hyperparameter tuning can greatly improve the performance of a model.

Optimizing Dynamic Airlift Operations: Winning Strategies in the AFRL Airlift Challenge [#41]

ABSTRACT. The Air Force Research Laboratory (AFRL) has sponsored the Airlift Challenge over the past two years, aimed at addressing the dynamic airlift problem. The dynamic nature of the challenge included the random disappearance of graph edges to simulate adverse weather conditions and the spontaneous appearance of cargo requiring delivery. This poster presents the system that won the 2023 challenge and currently leads the 2024 challenge. The initial approach focused on intelligent solutions for subtasks, or 'build-smart'. It soon became clear that optimizing the scoring rate, points per second, was more important than single-instance metric performance. In the subsequent competition, a 'build-fast' strategy was adopted based on this observation. This paper discusses the impact of iteration on algorithm selection for optimization problems and suggests considerations for structuring scoring processes in future competitions.

Evaluating Vision-Language Models on the TriangleCOPA Benchmark [#42]

ABSTRACT. The TriangleCOPA benchmark consists of 100 textual questions with videos depicting the movements of simple shapes in the style of the classic social-psychology film created by Fritz Heider and Marianne Simmel in 1944. In our experiments, we investigate the performance of current vision-language models on this challenging benchmark, assessing the capability of these models for visual anthropomorphism and abstract interpretation.

Automated Mapping Tool from Moise+ to Colored Petri Nets [#43]

ABSTRACT. The demand for systems incorporating artificial intelligence, such as multi-agent systems, is continually increasing. Simultaneously, there is a growing need for developing tools to support this field, ensuring better fault tolerance within projects. This is particularly crucial given that these systems possess characteristics that render them non-deterministic, thereby amplifying the challenge of conducting tests. To address this challenge, a mapping tool has been developed. This tool automatically generates a graphical model, facilitating the identification of test paths for a given multi-agent system. It operates by taking an XML file from Moise+, an organizational model for multi-agent systems, and translates it into a colored Petri net. The resulting mapping serves as a foundation for generating test cases essential for validating the Moise+ model, guiding system testing. Automation streamlines the process, enhancing speed and eliminating the potential for human error.

13:30-15:00 Session 3A: Tutorial 1
Location: Emerald E
13:30
Hands-On Introduction to Quantum Machine Learning

ABSTRACT. This tutorial will cover a hands-on introduction to quantum machine learning. Foundational concepts of quantum information science (QIS) will be presented (qubits, single and multiple qubit gates, measurements, and entanglement). Building on that, foundational concepts of quantum machine learning (QML) will be introduced (parametrized circuits, data encoding, and feature mapping). Then, QML models will be discussed (quantum support vector machine, quantum feedforward neural network, and quantum convolutional neural network). All the aforementioned topics and concepts will be examined using code run on a quantum computer simulator. All the covered material assumes a novice audience interested in learning about QML. Further reading, software packages, and frameworks will also be shared with the audience.

13:30-15:00 Session 3B: Tutorial 2

Ethics of AI explained (Clayton Peterson, PhD)

Location: Emerald D
13:30
Ethics of AI Explained

ABSTRACT. This tutorial introduces participants to the main issues and themes pertaining to ethics of artificial intelligence (AI). Analyzing what ethics of AI is by reflecting on our understanding of both ethics and AI, the aim is to clarify and expose how ethics of AI can be conceived in different ways depending on the approach one adopts. Through the use of concrete examples, participants will be introduced to various issues in the ethics of AI literature, allowing them to reflect upon their own research and practice.

13:30-15:00 Session 3C: Tutorial 3

An Introduction to Probabilistic Graphical Models (Luigi Portinale, PhD)

Location: Emerald B
13:30
An Introduction to Probabilistic Graphical Models

ABSTRACT. This tutorial covers an introduction to Probabilistic Graphical Models (PGM), such as Bayesian Networks and Markov Random Fields, for reasoning under uncertainty in intelligent systems. Basic terminology, formal concepts, and representational and inference issues will be discussed, starting from basic notions of probability theory, in such a way that novices and the less skilled in the field will be able to follow the details. Further reading, software packages, and frameworks will also be discussed.

15:30-17:00 Session 4A: Security, Trust, & XAI-1
Location: Emerald A
15:30
Evolution of Norms for a Trustworthy AI Society and Our Responsibilities and Roles

ABSTRACT. Artificial Intelligence (AI) holds promising opportunities for enhancing human life, yet its pervasive influence on society also introduces unintended negative consequences. Throughout the 2010s, a growing recognition of ethical concerns in AI led to the formulation of principles emphasizing ethical research and development. Various global entities have since endeavored to establish norms for AI ethics. Recently, beyond the declaration of ethical principles, concrete practical guides and codes of conduct are emerging, and in some countries, they are developing into specific legal regulatory discussions. This paper provides a detailed examination of the evolutionary trajectory of AI norms and further explores the responsibilities and roles of diverse stakeholder groups, including developers, suppliers, and users, in ensuring the trustworthiness of AI. The analysis aims to move beyond abstract ethical principles to actionable norms, emphasizing the need for a holistic societal response.

15:50
Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion

ABSTRACT. Due to the rise in video content creation targeted towards children, there is a need for robust and efficient content moderation schemes for video hosting platforms. A video that is visually benign may include audio content that is inappropriate for young children while being impossible to detect with a unimodal content moderation system. Popular video hosting platforms for children such as YouTube Kids still publish videos that contain subtle audio content, which is not conducive to a child's healthy behavioral and physical development. A robust classification of malicious videos requires audio representations in addition to video features. However, recent content moderation approaches rarely employ multimodal architectures that explicitly consider non-speech audio cues. To address this, we present an efficient adaptation of CLIP (Contrastive Language–Image Pre-training) that can leverage contextual audio cues for enhanced content moderation. We incorporate 1) the audio modality, and 2) prompt learning, while keeping the backbone modules of each modality frozen. We conduct our experiments on a multimodal version of MOB (Malicious or Benign) dataset in supervised and few-shot settings. Our code and dataset will be made publicly available.

16:10
Enhancing Explainability in Predictive Maintenance: Investigating the Impact of Data Preprocessing Techniques on XAI Effectiveness

ABSTRACT. In predictive maintenance, the complexity of the data (e.g. multivariate time series) often requires the use of deep learning models. These models, called "black boxes", have proved their worth in predicting the Remaining Useful Life (RUL) of industrial machines. However, the inherent opacity of these models requires the incorporation of post-hoc explanation methods to enhance interpretability. The quality of the explanations provided is then assessed using so-called evaluation metrics. Modeling is a whole process that includes an important data pre-processing phase, with parameter selections such as the time window, smoothing parameter, or rectified RUL when dealing with time series data. We propose to analyze the impact of these pre-processing choices on the quality of explanations provided by the local post-hoc models LIME, SHAP, and L2X, using six evaluation measures: stability, consistency, congruence, selectivity, completeness, and acumen, based on the NASA C-MAPSS dataset with an LSTM model. Our findings show that certain choices of pre-processing parameters can significantly improve predictive performance. Moreover, the quality of explanations depends on the explainability methods chosen. Furthermore, a factorial analysis of the evaluation measures reveals that they do not all point in the same direction; understanding the nuanced relationships between evaluation measures is essential for a comprehensive and accurate assessment of explainability methods.

16:30
Effects of Matching on Measurements of Accuracy, Fairness, and Fairness Impossibility

ABSTRACT. Statistical matching has recently been proposed as a means of addressing fairness impossibility (Beigang 2023). In particular, it has been suggested on conceptual grounds that when matched rather than unmatched datasets are analyzed, the tradeoff between equalized odds (EO) and positive predictive value (PPV) will be reduced. In this study we evaluate matching as a practical rather than merely conceptual approach to reducing fairness impossibility. As a test case we conduct pre-matched and post-matched fairness analyses on the well-known COMPAS dataset from Broward Co., Florida, 2013-2014 (Larson et al. 2016). We then reflect on what the results suggest regarding the extent to which matching enables (a) more precise measurement of the accuracy of a classifier, (b) more precise measurements of the fairness of a classifier, and (c) reduced difference between fairness metrics (in particular, equalized odds [EO] and positive predictive value [PPV]) – that is, the extent to which it solves the “fairness impossibility” problem. We conclude that matching is a promising avenue for improved fairness evaluations on all of these fronts, but that it faces limitations and challenges under conditions extremely common to ML evaluation contexts such as incomplete data, possible confounding variables, and possibly non-independent parameters.
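The two fairness metrics contrasted above can be computed directly from per-group confusion counts; the following is an illustrative sketch (not the paper's analysis pipeline), where equalized odds is summarized as the larger of the TPR and FPR gaps between groups:

```python
def group_rates(tp, fp, fn, tn):
    """Per-group rates from confusion counts: PPV (precision),
    plus TPR and FPR, the two components of equalized odds."""
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    tpr = tp / (tp + fn) if tp + fn else float("nan")
    fpr = fp / (fp + tn) if fp + tn else float("nan")
    return {"PPV": ppv, "TPR": tpr, "FPR": fpr}

def eo_gap(rates_a, rates_b):
    """Equalized-odds violation between two groups: the larger of the
    absolute TPR and FPR differences (zero means EO is satisfied)."""
    return max(abs(rates_a["TPR"] - rates_b["TPR"]),
               abs(rates_a["FPR"] - rates_b["FPR"]))
```

The "impossibility" tension is that, except in degenerate cases, a classifier cannot simultaneously equalize PPV across groups and drive this EO gap to zero when base rates differ.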

15:30-17:00 Session 4B: Uncertain Reasoning
Location: Emerald D
15:30
Efficient Detection of Exchangeable Factors in Factor Graphs

ABSTRACT. To allow for tractable probabilistic inference with respect to domain sizes, lifted probabilistic inference exploits symmetries in probabilistic graphical models. However, checking whether two factors encode equivalent semantics and hence are exchangeable is computationally expensive. In this paper, we efficiently solve the problem of detecting exchangeable factors in a factor graph. In particular, we introduce the detection of exchangeable factors (DEFT) algorithm, which allows us to drastically reduce the computational effort for checking whether two factors are exchangeable in practice. While previous approaches iterate all O(n!) permutations of a factor’s argument list in the worst case (where n is the number of arguments of the factor), we prove that DEFT efficiently identifies restrictions to drastically reduce the number of permutations and validate the efficiency of DEFT in our empirical evaluation.
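The O(n!) baseline that DEFT improves upon can be sketched as a brute-force check; the tabular factor representation below (a dict from binary assignment tuples to potentials) is a simplifying assumption for illustration:

```python
from itertools import permutations

def exchangeable_brute_force(phi1, phi2, n):
    """Naive exchangeability check (the worst case DEFT avoids): phi1 and
    phi2 map length-n tuples of 0/1 to potentials; they are exchangeable
    if some permutation of the argument list makes the tables identical.
    Iterates up to n! permutations."""
    for perm in permutations(range(n)):
        if all(phi1[a] == phi2[tuple(a[i] for i in perm)] for a in phi1):
            return True
    return False
```

DEFT's contribution, per the abstract, is identifying restrictions that prune this permutation space rather than enumerating it.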

15:50
Intrinsic Prioritization in Answer Set Programming Based on an Adapted Notion of Tolerance

ABSTRACT. Answer set programming (ASP) is a declarative programming language suited to solve complex combinatorial search problems. Prioritized ASP is the subdiscipline of ASP which aims at prioritizing the models (answer sets) of ASP programs. Common approaches to prioritized ASP require an additional input, sometimes expressed as special literals within the ASP program, that guides the prioritization of the answer sets. In this paper, to the best of our knowledge, we propose the first approach that is able to prioritize answer sets solely based on the original ASP program. For this, we adapt the notion of tolerance from conditional reasoning according to System Z and establish tolerance partitions of ASP programs which can be used for prioritization.

16:10
Initial Goal Allocation for Multi-agent Systems

ABSTRACT. In a multi-agent environment, a human expert typically engages in the allocation of objectives among individual agents. However, autonomous agents need to determine and allocate objectives without external intervention from humans. Therefore, in this research, we attempt to solve initial goal distribution challenges in multi-agent settings by developing two goal allocation algorithms. The primary objectives are to find cost-effective goal solution sets and distribute them evenly among available agents. We introduce two algorithms for goals structured in a hierarchical goal tree, and then test their efficiency against a variety of baseline allocation methods. Both algorithms were able to increase the performance of agents in multi-agent settings by finding optimal distributions of goals and allowing agents to act independently of human intervention.

16:30
Causal Unit Selection using Tractable Arithmetic Circuits

ABSTRACT. The unit selection problem aims to find objects, called units, that optimize a causal objective function which describes the objects' behavior in a causal context (e.g., selecting customers who are about to churn but would most likely change their mind if encouraged). While early studies focused mainly on bounding a specific class of counterfactual objective functions using data, more recent work allows one to find optimal units exactly by reducing the causal objective to a classical objective on a meta-model, and then applying a variant of the classical Variable Elimination (VE) algorithm to the meta-model---assuming a fully specified causal model is available. In practice, however, finding optimal units using this approach can be very expensive because the used VE algorithm must be exponential in the constrained treewidth of the meta-model, which is larger and denser than the original model. We address this computational challenge by introducing a new approach for unit selection that is not necessarily limited by the constrained treewidth. This is done through compiling the meta-model into a special class of tractable arithmetic circuits that allows the computation of optimal units in time linear in the circuit size. We finally present empirical results on random causal models that show order-of-magnitude speedups based on the proposed method for solving unit selection.
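The linear-time property cited above rests on the fact that an arithmetic circuit can be evaluated in a single bottom-up pass over its nodes; a minimal sketch (the node representation here is hypothetical, not the compilation scheme of the paper):

```python
def evaluate_circuit(nodes, inputs):
    """Evaluate an arithmetic circuit in one bottom-up pass, i.e. time
    linear in the number of nodes. `nodes` is a topologically ordered
    list of ('leaf', name), ('+', i, j), or ('*', i, j) tuples, where
    i and j index earlier nodes; `inputs` maps leaf names to values."""
    vals = []
    for node in nodes:
        if node[0] == "leaf":
            vals.append(inputs[node[1]])
        elif node[0] == "+":
            vals.append(vals[node[1]] + vals[node[2]])
        else:  # "*"
            vals.append(vals[node[1]] * vals[node[2]])
    return vals[-1]  # the root is the last node
```

Each node is visited exactly once, which is why compiling the meta-model into such a circuit sidesteps the treewidth-exponential cost of variable elimination.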

15:30-17:00 Session 4C: Main-1
Location: Emerald B
15:30
TemporalAugmenter: An Ensemble Recurrent Based Deep Learning Approach for Signal Classification

ABSTRACT. Ensemble modeling has been widely used to solve complex problems, as it helps to improve overall performance and generalization. In this paper, we propose a novel TemporalAugmenter approach based on ensemble modeling that augments the capture of long-term and short-term temporal dependencies by integrating two variations of recurrent neural networks in two learning streams, obtaining the maximum possible temporal extraction. In addition, the proposed approach reduces the preprocessing and prior feature extraction stages, which reduces the energy required to process models built upon the proposed TemporalAugmenter approach, contributing towards green AI. Moreover, the proposed model can be simply integrated into various domains, including industrial, medical, and human-computer interaction applications. We empirically evaluated the proposed approach on speech emotion recognition, electrocardiogram signal classification, and signal quality examination, three tasks over signals with varying complexity and different temporal dependency features.

15:50
Improving Axial-Attention Network via Cross-Channel Weight Sharing

ABSTRACT. In recent years, hypercomplex-inspired neural networks have improved deep CNN architectures due to their ability to share weights across input channels and thus improve the cohesiveness of representations within the layers. The work described herein studies the effect of replacing existing layers in an Axial Attention ResNet with quaternion variants that use cross-channel weight sharing, to assess the effect on image classification. We expect the quaternion enhancements to produce improved feature maps with more interlinked representations. We experiment with the stem of the network, the bottleneck layer, and the fully connected backend, replacing each with quaternion versions. These modifications lead to novel architectures which yield improved accuracy on the ImageNet300k classification dataset. Our baseline networks for comparison were the original real-valued ResNet, the original quaternion-valued ResNet, and the Axial Attention ResNet. Since improvement was observed regardless of which part of the network was modified, this technique shows promise as a generally useful means of improving classification accuracy for a large class of networks.
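The cross-channel weight sharing at the heart of quaternion layers can be illustrated with the Hamilton-product block structure: one quaternion weight of four parameters generates a structured 4x4 map over a group of four channels, where an unconstrained real-valued layer would need 16 free parameters. A minimal sketch of a hypothetical single-weight layer, not the paper's architecture:

```python
def hamilton_block(a, b, c, d):
    """Expand one quaternion weight (a, b, c, d) into the structured 4x4 matrix
    induced by the Hamilton product; all 16 entries reuse the same 4 params."""
    return [[a, -b, -c, -d],
            [b,  a, -d,  c],
            [c,  d,  a, -b],
            [d, -c,  b,  a]]

def quat_linear(x4, w):
    """Apply a single quaternion weight to a group of 4 input channels,
    producing 4 output channels with shared (cross-channel) parameters."""
    block = hamilton_block(*w)
    return [sum(block[i][j] * x4[j] for j in range(4)) for i in range(4)]

# Feeding the unit input (1, 0, 0, 0) picks out the first column of the block.
out = quat_linear([1.0, 0.0, 0.0, 0.0], (0.5, 0.1, -0.2, 0.3))
print(out)
```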

16:10
On GAN-based Data Integrity Attacks Against Robotic Spatial Sensing

ABSTRACT. Communication is arguably the most important way to enable cooperation among multiple robots. In numerous such settings, robots exchange local sensor measurements to form a global perception of the environment. One example of this setting is adaptive multi-robot informative path planning, where robots' local measurements are ``fused'' using probabilistic techniques (e.g., Gaussian process models) for more accurate prediction of the underlying ambient phenomena. In an adversarial setting, in which we assume a malicious entity (the adversary) can modify data exchanged during inter-robot communications, these cooperating robots become vulnerable to data integrity attacks. Such attacks on a multi-robot informative path planning system may, for example, replace the original sensor measurements with fake measurements to negatively affect the achievable prediction accuracy. In this paper, we study how such an adversary may design data integrity attacks using a Generative Adversarial Network (GAN). Results show that the GAN-based technique learns spatial patterns in the training data to produce fake measurements that are relatively undetectable yet significantly degrade prediction accuracy.

16:30
Enhancing Time-Series Prediction with Temporal Context Modeling: A Bayesian and Deep Learning Synergy

ABSTRACT. In time-series classification, conventional deep learning methods often treat continuous signals as discrete windows, each analyzed independently without considering contextual information from adjacent windows. This study introduces a novel, lightweight Bayesian meta-classification approach designed to enhance prediction accuracy by integrating contextual label information from neighboring windows. Alongside training a deep learning model, we construct a Conditional Probability Table (CPT) during training to capture label transitions. During inference, this CPT is used to adjust the predicted class probabilities of each window, taking into account the predictions of preceding windows. Our experimental analysis, focused on Human Activity Recognition (HAR) time-series datasets, demonstrates that this approach not only surpasses the baseline performance of standalone deep learning models but also outperforms contemporary state-of-the-art methods that integrate temporal context into time-series prediction.
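The described inference-time adjustment can be sketched as follows, with a hypothetical 3-class CPT and made-up probabilities; the paper's actual aggregation details may differ:

```python
import numpy as np

# Hypothetical 3-class HAR problem. Rows: previous window's label; columns:
# current window's label. The CPT comes from normalized training-set
# label-transition counts.
transition_counts = np.array([[80.,  15.,  5.],
                              [10., 100., 10.],
                              [ 5.,  20., 90.]])
cpt = transition_counts / transition_counts.sum(axis=1, keepdims=True)

def contextual_adjust(probs, prev_probs, cpt):
    """Re-weight a window's predicted class probabilities by the transition
    prior implied by the previous window's prediction, then renormalize."""
    prior = prev_probs @ cpt          # expected distribution over current label
    adjusted = probs * prior
    return adjusted / adjusted.sum()

prev = np.array([0.9, 0.05, 0.05])    # confident the last window was class 0
curr = np.array([0.4, 0.45, 0.15])    # base model is uncertain on this window
post = contextual_adjust(curr, prev, cpt)
print(post.round(3))
```

With this made-up CPT the context flips the decision: the base model slightly prefers class 1, but because class-0 windows tend to be followed by class 0, the adjusted distribution favors class 0.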

15:30-17:00 Session 4D: Main-2
Location: Emerald E
15:30
Determining Legal Relevance with LLMs using Relevance Chain Prompting

ABSTRACT. In legal reasoning, part of determining whether evidence should be admissible in court requires assessing its relevance to the case, often formalized as its probative value: the degree to which its being true or false proves a fact in issue. However, determining probative value is an imprecise process and must often rely on consideration of arguments for and against the probative value of a fact. Can generative language models be of use in generating or assessing such arguments? In this work, we introduce relevance chain prompting, a new prompting method that enables large language models to reason about the relevance of evidence to a given fact using measures of chain strength. We explore different methods for scoring a relevance chain, grounded in the idea of probative value, and evaluate the outputs of large language models with ROSCOE metrics, comparing the results of our new prompting method to chain-of-thought prompting. We test the prompting methods on a dataset created from the Legal Evidence Retrieval dataset.
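A relevance chain's strength can be scored in several ways; two simple aggregations, the product of link strengths and the weakest link, are sketched below on a fabricated chain. Both the chain and the aggregation choices are illustrative assumptions, not the paper's method:

```python
# Hypothetical relevance chain from a piece of evidence to a fact in issue.
# Each link carries a model-assigned strength in [0, 1].
chain = [
    ("muddy boots found at the scene", "defendant walked through the garden", 0.8),
    ("defendant walked through the garden", "defendant was near the window", 0.7),
    ("defendant was near the window", "defendant could access the house", 0.9),
]

def chain_strength_product(chain):
    """Every link must hold, so multiply the link strengths."""
    score = 1.0
    for _src, _dst, s in chain:
        score *= s
    return score

def chain_strength_weakest_link(chain):
    """A chain is only as probative as its weakest inferential step."""
    return min(s for *_rest, s in chain)

print(chain_strength_product(chain))       # 0.8 * 0.7 * 0.9
print(chain_strength_weakest_link(chain))
```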

15:50
LLM Augmented Hierarchical Agents

ABSTRACT. Solving long-horizon, temporally extended tasks using Reinforcement Learning (RL) is extremely challenging, compounded by the common practice of learning without prior knowledge (tabula rasa learning). Humans can generate and execute plans with temporally extended actions and learn to perform new tasks because we almost never solve problems from scratch; we want autonomous agents to have the same capabilities. Recently, LLMs have been shown to encode a tremendous amount of knowledge about the world and to exhibit impressive in-context learning and reasoning capabilities. However, using LLMs to solve real-world tasks is challenging because these models are not grounded in the current task. We want to leverage the planning capabilities of LLMs while using RL to provide the essential environment interaction. In this paper, we present a hierarchical agent that uses LLMs to solve long-horizon tasks. Instead of relying entirely on LLMs, we use them to guide the high-level policy, making it significantly more sample efficient. We evaluate our method on simulation environments such as MiniGrid, SkillHack, and Crafter, and on a real robot arm in block manipulation tasks. We show that agents trained using our method outperform baseline methods and, once trained, do not depend on LLMs during deployment.

16:10
Bridging the Knowledge Gap: Improving BERT models for answering MCQs by using Ontology-generated synthetic MCQA Dataset

ABSTRACT. BERT-based models possess impressive language understanding capabilities but often lack domain-specific knowledge, limiting their performance on specialised tasks such as medical multiple-choice question answering (MCQA). In this paper, we study how biomedical ontologies, rich repositories of medical knowledge, can be harnessed to enhance BERT-based models for the medical MCQA task. Our contributions include OntoMCQA-Gen, a system that leverages different biomedical ontologies to construct BioOntoMCQA, a large synthetic MCQA dataset. OntoMCQA-Gen exploits subclass-class relationships, definitions of concepts, and synonym relationships from the ontologies to create this dataset of MCQs automatically. We then use this synthetic dataset to fine-tune various BERT-based models to answer medical MCQs. We evaluated these fine-tuned models on the challenging MedMCQA and MedQA datasets, whose questions come from admission examinations for medical degrees in India and the USA, respectively. Our evaluation shows that fine-tuning the BERT-based models on BioOntoMCQA results in significantly improved accuracy scores. BioBERT and PubMedBERT, pretrained on large medical corpora, also show significant improvements with our technique of fine-tuning on ontology-generated synthetic data. This finding highlights the effectiveness of incorporating biomedical ontologies to enhance BERT-based models in the medical domain. Moreover, our results underscore the importance of combining ontology-generated data with model adaptation for specialised domains, contributing a novel advancement in natural language processing.
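Generating an MCQ from subclass-of relations can be sketched as follows, using a toy set of edges rather than a real biomedical ontology. The function and its distractor-selection rule are illustrative assumptions, not OntoMCQA-Gen's actual implementation:

```python
import random

# Toy stand-in for an ontology's subclass-of edges as (child, parent) pairs.
subclass_of = [
    ("myocardial infarction", "heart disease"),
    ("arrhythmia", "heart disease"),
    ("asthma", "lung disease"),
    ("pneumonia", "lung disease"),
    ("migraine", "neurological disorder"),
]

def generate_mcq(parent, edges, n_distractors=3, seed=0):
    """Build one MCQ: the correct answer is a child of `parent`; distractors
    are concepts whose parent differs, so they are plausible but wrong."""
    children = [c for c, p in edges if p == parent]
    others = [c for c, p in edges if p != parent]
    rng = random.Random(seed)
    answer = rng.choice(children)
    options = rng.sample(others, n_distractors) + [answer]
    rng.shuffle(options)
    return {"question": f"Which of the following is a type of {parent}?",
            "options": options, "answer": answer}

mcq = generate_mcq("heart disease", subclass_of)
print(mcq["question"])
print(mcq["options"])
```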

16:30
Smart Sampling: Self-Attention and Bootstrapping for Improved Ensembled Q-Learning

ABSTRACT. We present a novel method aimed at enhancing the sample efficiency of the ensemble Q-learning approach. Our proposed approach integrates multi-head self-attention into the ensembled Q-networks while bootstrapping the state-action pairs ingested by the ensemble. This not only yields performance improvements over the original REDQ (Chen et al. 2021) and its variant DroQ (Hiraoka et al. 2022), thereby enhancing Q predictions, but also effectively reduces both the average normalized bias and the standard deviation of the normalized bias within Q-function ensembles. Importantly, our method performs well even in scenarios with a low update-to-data (UTD) ratio. Notably, the implementation of our proposed method is straightforward, requiring minimal modifications to the base model.
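The bootstrapping component can be sketched simply: each ensemble member draws a with-replacement resample of the replay batch, so members train on overlapping but distinct data. Sizes and seed below are arbitrary illustrative choices:

```python
import random

rng = random.Random(42)

# Hypothetical replay batch of 32 transitions and 5 Q-networks in the ensemble.
batch_size, ensemble_size = 32, 5

# Each member gets a with-replacement resample of the batch indices, which
# decorrelates the members' Q-estimates while reusing the same sampled data.
member_batches = [[rng.randrange(batch_size) for _ in range(batch_size)]
                  for _ in range(ensemble_size)]

for i, idx in enumerate(member_batches):
    print(f"member {i}: {len(set(idx))}/{batch_size} unique transitions")
```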