ICAICTA 2025: 2025 12TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS: CONCEPT, THEORY AND APPLICATION (ICAICTA)
PROGRAM FOR SATURDAY, SEPTEMBER 20TH

10:30-12:00 Session 4A: Prediction
Location: Room 7603
10:30
Predicting the Quality of Product Images in E-Commerce using Deep Learning

ABSTRACT. Product images play a critical role in e-commerce, providing customers with relevant information about the quality and appearance of products. To ensure that product images meet certain standards, it is important to define what quality means and to look for solutions that can improve the aesthetics and customer experience of e-commerce websites. In this paper, we present an approach for predicting the quality of product images in e-commerce using a convolutional neural network (CNN) trained on user-generated photos. Our approach outperformed other methods that rely on feature extraction and classical machine learning algorithms and introduced a quality score on a scale of 0 to 1, which led to more accurate predictions of the mean opinion score from human observers.
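
For illustration (not the authors' code): the quality-score setup described above can be sketched as a CNN regression head with a sigmoid output trained against normalized mean opinion scores. A minimal PyTorch sketch, assuming a pretrained ResNet-18 backbone and placeholder tensors:

    import torch
    import torch.nn as nn
    from torchvision import models

    # Pretrained backbone with a sigmoid regression head producing a quality
    # score in [0, 1]; the paper trains on user-generated photos instead.
    model = models.resnet18(weights="IMAGENET1K_V1")
    model.fc = nn.Sequential(nn.Linear(model.fc.in_features, 1), nn.Sigmoid())

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()  # regress against MOS rescaled to [0, 1]

    images = torch.randn(8, 3, 224, 224)  # placeholder product-image batch
    mos = torch.rand(8)                   # placeholder normalized opinion scores

    preds = model(images).squeeze(1)      # (B,) predicted quality scores
    loss = loss_fn(preds, mos)
    loss.backward()
    optimizer.step()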

10:45
Prediction of MMP-9 Inhibitors as Anticancer Therapeutics by Using 1D-CNN Optimized by Monarch Butterfly Algorithm

ABSTRACT. The development of therapeutic drugs targeting MMP-9 has shown potential in anticancer treatment. Matrix Metalloproteinase-9 (MMP-9) inhibitors are biomolecules that offer potential as novel anticancer therapies. The conventional development of therapeutic drugs for cancer faces significant challenges due to high average costs, and an alternative solution is the use of machine learning. This study proposes a deep learning approach using a one-dimensional Convolutional Neural Network (1D-CNN) optimized with the Monarch Butterfly Optimization (MBO) algorithm to predict the activity of MMP-9 inhibitor compounds. The dataset consisted of 1,123 samples classified based on pIC50 values. Three baseline 1D-CNN architectures were evaluated and subsequently optimized through MBO with varying population sizes. The results demonstrate that deeper CNN architectures improve classification performance, while MBO significantly enhances convergence and accuracy. Among the optimization schemes, the third scheme, using a population size of 25, achieved the highest test accuracy of 0.7747 and F1-score of 0.7744, indicating superior generalization. These findings highlight the effectiveness of combining 1D-CNN and MBO for efficient and accurate anticancer compound prediction.

11:00
Predictive Modeling of Cathepsin K Inhibitors Bioactivity for Osteoporosis Treatment by using Grey Wolf Optimizer-Support Vector Machine
PRESENTER: Marcel Tobing

ABSTRACT. Osteoporosis is a condition in which patients experience bone mass loss because their bone mineral density falls below the standard. Cathepsin K, an enzyme involved in the bone remodelling process, is one of the causative factors of osteoporosis. The negative side effects of existing osteoporosis treatments and Cathepsin K's role in causing osteoporosis make Cathepsin K the target of a new class of osteoporosis treatment, the Cathepsin K inhibitor. However, the complex and resource-intensive nature of drug development raises challenges for the development of Cathepsin K inhibitors. This study aims to provide an alternative, machine learning-based approach to overcome these challenges. We develop a predictive model of Cathepsin K inhibitor bioactivity using a Grey Wolf Optimizer-Support Vector Machine (GWO-SVM). Feature selection is performed with the Grey Wolf Optimizer to find the most influential features in the dataset. The proposed SVM model predicts the bioactivity class of Cathepsin K inhibitors obtained from the ChEMBL database. The results show that the optimized SVM model with a linear kernel achieves an accuracy of 0.996 and an F1-score of 0.996. The GWO used in this study also proves able to reduce the number of features significantly without compromising model performance.
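
For illustration (not the paper's implementation): wrapper-style GWO feature selection can be sketched by letting wolf positions encode feature masks and scoring each mask by SVM cross-validation accuracy. A simplified binary-GWO sketch; the update rule follows the standard GWO equations, and all parameters are illustrative:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    def fitness(mask, X, y):
        if not mask.any():
            return 0.0
        acc = cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=5).mean()
        return acc - 0.01 * mask.mean()  # small penalty on subset size

    def gwo_select(X, y, n_wolves=8, n_iter=20):
        d = X.shape[1]
        pos = rng.random((n_wolves, d))              # continuous positions in [0, 1]
        for t in range(n_iter):
            scores = np.array([fitness(p > 0.5, X, y) for p in pos])
            alpha, beta, delta = pos[np.argsort(scores)[::-1][:3]]
            a = 2 - 2 * t / n_iter                   # exploration factor decays to 0
            for i in range(n_wolves):
                new = np.zeros(d)
                for leader in (alpha, beta, delta):
                    r1, r2 = rng.random(d), rng.random(d)
                    A, C = 2 * a * r1 - a, 2 * r2
                    new += leader - A * np.abs(C * leader - pos[i])
                pos[i] = np.clip(new / 3, 0, 1)      # average of the three pulls
        return max((p > 0.5 for p in pos), key=lambda m: fitness(m, X, y))

    # Usage: mask = gwo_select(X, y); SVC(kernel="linear").fit(X[:, mask], y)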

11:15
Survival Analysis Models for Mortality Risk Prediction During the Pandemic: A Comparative Study Using Cox Proportional Hazards, Random Survival Forests, and DeepSurv
PRESENTER: Fitriah Fadillah

ABSTRACT. As the COVID-19 pandemic continues to place immense pressure on healthcare systems globally, there is an urgent need for analytical tools that can identify key factors affecting patient outcomes and support clinical decision-making. This study analyzes critical factors influencing outcomes among COVID-19 patients by comparing the Cox proportional hazards (CPH), random survival forests (RSF), and DeepSurv (DS) survival analysis models, using data from RSUD Dr. Kanujoso Djatiwibowo, a large public referral hospital in Balikpapan, Indonesia. Results show that RSF achieves the best performance (C-index = 0.8462; AUC = 0.7186). The analysis further reveals that older male patients requiring mechanical ventilation and ICU care experience significantly reduced survival probabilities, impacting both length of hospital stay and mortality risk. Additionally, oxygen saturation consistently emerges as the most influential predictor of mortality risk across all three models, highlighting its strong association with other clinical variables. This study contributes by comparing various survival analysis models, incorporating not only clinical variables but also comprehensive historical health conditions such as comorbidities and physiological parameters, and offering insights for decision-makers to prioritize key risk factors and optimize patient management during the COVID-19 crisis.
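
For illustration (not the study's code): the Cox proportional hazards baseline and its C-index can be reproduced with the lifelines library. A minimal sketch, assuming a hypothetical covid_cohort.csv with duration and event columns plus clinical covariates:

    import pandas as pd
    from lifelines import CoxPHFitter
    from lifelines.utils import concordance_index

    # One row per patient: covariates plus 'duration' (length of stay) and
    # 'event' (1 = death, 0 = censored). File name is a placeholder.
    df = pd.read_csv("covid_cohort.csv")

    cph = CoxPHFitter()
    cph.fit(df, duration_col="duration", event_col="event")
    cph.print_summary()  # hazard ratios per covariate (e.g., oxygen saturation)

    # C-index of the fitted model; the paper compares this against RSF and
    # DeepSurv. Partial hazards are negated because higher hazard = shorter survival.
    print(concordance_index(df["duration"],
                            -cph.predict_partial_hazard(df),
                            df["event"]))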

11:30
ESG-IDX: A Framework to Utilize ESG-Related News Sentiment for Indonesia Stock Price Prediction

ABSTRACT. Environmental, Social, and Governance (ESG) factors have become popular in investment analysis, as companies' decision-making on sustainability matters is reflected in their stock performance. However, research that integrates ESG-related news into stock price prediction is still limited. In this study, we propose a framework for ESG aspect-based sentiment analysis of Indonesian news to predict stock prices. Models based on Bidirectional Encoder Representations from Transformers (BERT) are used to classify ESG aspects in the news, and generative Large Language Models (LLMs) are used to classify the sentiment of the news for each ESG aspect. A Bi-LSTM then predicts stock prices from historical stock price data and ESG-related news sentiment. Our framework achieves better results than a time-series model without ESG news sentiment, with up to a 31.11% decrease in RMSE. Our study shows that integrating ESG-related news can improve the performance of stock price prediction.
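
For illustration (not the authors' model): the Bi-LSTM stage can be sketched in Keras, assuming each timestep concatenates the closing price with per-aspect sentiment scores already produced by the upstream BERT/LLM stages. The window length, feature count, and placeholder data below are invented:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    WINDOW, N_FEATURES = 30, 7   # assumed 30-day lookback, price + 6 sentiment features

    model = keras.Sequential([
        layers.Input(shape=(WINDOW, N_FEATURES)),
        layers.Bidirectional(layers.LSTM(64)),   # reads each window in both directions
        layers.Dense(1),                         # next-day closing price
    ])
    model.compile(optimizer="adam", loss="mse")

    X = np.random.rand(256, WINDOW, N_FEATURES)  # placeholder windows
    y = np.random.rand(256)                      # placeholder price targets
    model.fit(X, y, epochs=5, batch_size=32)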

10:30-12:00 Session 4B: Natural Language Processing
Location: Room 7604
10:30
Understanding Evolving Sentiment Dynamics in Indonesian Disaster-Related Tweets using IndoBERT and Temporal Analysis

ABSTRACT. Social media platforms generate massive amounts of unstructured information that can provide valuable insights into public sentiment and perception in crisis situations. While previous research has effectively utilized social media data for either sentiment analysis or topic detection, there is further value in combining these approaches. This study builds upon that work by integrating aspect-based sentiment analysis to capture the more detailed and nuanced views often overlooked in disaster contexts. By leveraging fine-tuned pre-trained language models and categorizing tweets into topics and sentiments, the study uncovers sentiment patterns associated with various aspects and topics, such as negative sentiment towards response efforts indicating public dissatisfaction or issues with aid. This comprehensive approach provides deep insight into public perception across topics, enabling authorities to better understand public needs and concerns.

10:45
Aligning Open-Source LLMs for Regulatory QA with AI Feedback

ABSTRACT. The growing complexity of financial regulations in Indonesia, particularly those issued by the Financial Services Authority (OJK), calls for more intelligent and automated compliance tools. This study presents a question-answering (QA) system using open-source Large Language Models (LLMs) fine-tuned on Indonesian regulatory data. The models are optimised through a combination of Supervised Fine-Tuning (SFT) and Direct Preference Optimisation (DPO), an alignment technique within the Reinforcement Learning from AI Feedback (RLAIF) framework. Evaluation results demonstrate that domain-specific SFT substantially enhances answer quality, and DPO provides additional alignment benefits in certain scenarios. This approach offers a scalable and efficient solution for regulatory comprehension and digital governance.
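
For illustration (not the authors' pipeline): DPO training on preference triples is available in the trl library. A heavily hedged sketch, assuming a hypothetical SFT checkpoint and a one-example preference dataset; exact argument names vary across trl versions:

    from datasets import Dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    # DPO expects (prompt, chosen, rejected) triples; in the paper's setting the
    # preferences would come from AI feedback ranking candidate regulatory answers.
    pairs = Dataset.from_dict({
        "prompt":   ["Apa sanksi atas pelanggaran POJK X?"],   # invented example
        "chosen":   ["Jawaban akurat yang mengutip pasal terkait..."],
        "rejected": ["Jawaban yang kurang tepat..."],
    })

    model_name = "sft-checkpoint"  # placeholder path to the SFT-tuned model
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Note: older trl releases use tokenizer= instead of processing_class=.
    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(output_dir="dpo-out", beta=0.1),
        train_dataset=pairs,
        processing_class=tokenizer,
    )
    trainer.train()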

11:00
Application of XLM-RoBERTa and Llama-3 for Cross-Lingual and Multi-Label Text-Based Emotion Detection
PRESENTER: Bill Clinton

ABSTRACT. Emotions are complex psychological responses to significant internal or external stimuli. They manifest in a variety of nuances that are often not explicitly expressed in textual communication, but rather in the context and structure of language. More than one emotion can also be expressed in a single event, for example, happiness and surprise. This makes it impossible for humans to always accurately infer the emotions that others are feeling. Although complex, emotions play an important role in everyday life, for example in the medical, education, and marketing sectors, which underlines the significance of emotion detection research. At the same time, there is a wide variety of languages in the world, and training data tend to be scarce for less widely spoken languages. Models that can detect emotions in a cross-lingual and multi-label manner are therefore needed. In this study, XLM-RoBERTa and Llama-3 were applied to address this challenge. Data from the BRIGHTER dataset, specifically from the SemEval 2025 event, were utilized. For training, the English portion of the dataset was employed, while inference was conducted on the Indonesian dataset, enabling a cross-lingual evaluation. To facilitate multi-label prediction, XLM-RoBERTa was configured to predict five emotions: anger, fear, joy, sadness, and surprise, while the disgust emotion was predicted using Llama-3. A promising macro-F1 score of 0.552 was obtained for the six-label prediction, an increase of 0.176 over the baseline of 0.376 reported by the SemEval 2025 Task 11 organizers.
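
For illustration (not the authors' code): the multi-label configuration of XLM-RoBERTa corresponds to the problem_type="multi_label_classification" setting in Hugging Face transformers, which applies a per-label sigmoid with BCE loss. A minimal sketch; the 0.5 threshold and example text are assumptions:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    LABELS = ["anger", "fear", "joy", "sadness", "surprise"]  # per the abstract

    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained(
        "xlm-roberta-base",
        num_labels=len(LABELS),
        problem_type="multi_label_classification",  # sigmoid + BCE per label
    )

    # After fine-tuning on the English BRIGHTER split, inference on Indonesian text:
    batch = tokenizer("Aku senang sekaligus terkejut!", return_tensors="pt")
    with torch.no_grad():
        probs = torch.sigmoid(model(**batch).logits)[0]
    predicted = [l for l, p in zip(LABELS, probs) if p > 0.5]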

11:15
Food Hazard Text Classification and Entity Detection using BERT Model

ABSTRACT. Monitoring food hazards throughout the food supply chain is essential to ensure the proper functioning of food safety management systems. Machine learning (ML) for food safety monitoring and prediction has been used in various studies; however, its application to text-based food hazard classification remains very limited. This study aims to address this gap by applying advanced machine learning—specifically transformer-based language models like Bidirectional Encoder Representations from Transformers (BERT)—to food hazard detection and classification. The system performs four subtasks: food hazard classification, food hazard entity detection, product classification, and product entity detection. It uses data preprocessing methods such as Named Entity Recognition (NER) and data augmentation to deepen the model’s understanding of the training data, and varies the BERT model and hyperparameters such as learning rate and number of epochs. The results show that the macro F1-score varies with the number of classes in each subtask: the more classes a subtask has, the lower the macro F1-score. All subtasks trained with the “bert-base-uncased” model achieved the highest macro F1-scores. The best performance comes from hazard classification, with a macro F1-score of 0.7919. The macro F1-score improvement over the baseline model ranges from 0.20 to 0.48 depending on the subtask.

11:30
Exploring the Feasibility of Detecting Media Frame Indicators in Indonesian News Using Transformer-Based Language Models

ABSTRACT. This study explores the feasibility of using transformer-based language models to automate the detection of media frame indicators in Indonesian news articles. Leveraging the general framing framework by Semetko and Valkenburg, the research treats each of the 20 frame indicators as a binary classification task. Using a limited dataset of annotated news articles about two Indonesian public figures, three RoBERTa-based models and one multilingual XLM-R model are fine-tuned and evaluated. The results show varying degrees of classification performance, strongly correlated with the representation of positive samples in the dataset. The study demonstrates the potential of using contextualized language models, previously fine-tuned on IndoNLU and IndoNLI tasks, for identifying latent narrative structures in news text. This approach suggests a shift toward predictive media frame analysis in low-resource settings, with implications for real-time media monitoring and bias detection.

13:00-14:30 Session 5A: Speech, Summarization and Machine Translation
Location: Room 7603
13:00
Fine-Tuning Whisper for Domain-Specific ASR: Transcribing Indonesian YouTube Content on Local Wisdom in Disaster Mitigation

ABSTRACT. In Indonesia, YouTube is a prominent platform for sharing videos about life stories and insights. The platform holds significant potential for data mining applications, especially in extracting disaster mitigation insights rooted in local wisdom. However, converting video content into text remains challenging, as existing ASR systems struggle with Indonesian regional dialects, contextual terms, and the audio disturbances typical of YouTube videos. To address these limitations, this study fine-tuned the pre-trained Whisper-small model on a domain-specific dataset derived from YouTube videos. Audio segments are extracted based on subtitle timestamps, with the subtitle texts as transcription labels. To ensure data quality, label validation was performed by comparing subtitle annotations with zero-shot transcriptions generated by Whisper-large-v3. Augmentation techniques, such as generating clean vocal versions, were applied to improve audio clarity. Additionally, secondary datasets were included to maintain the model’s flexibility in common transcription scenarios. The experimental results show that the fine-tuned Whisper-small model significantly outperformed the original, reducing the word error rate (WER) from 41.04% to 13.10% on domain-specific test data. These findings suggest that fine-tuning Whisper on targeted, domain-specific data that reflects the domain’s acoustic characteristics can greatly improve transcription accuracy.
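
For illustration (not the paper's training pipeline): a single fine-tuning step and WER evaluation for Whisper-small can be sketched with transformers and jiwer. The audio array and subtitle text below are stand-ins, not the paper's data:

    import numpy as np
    import torch
    from jiwer import wer
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(
        "openai/whisper-small", language="indonesian", task="transcribe")
    model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

    audio_array = np.zeros(16000, dtype=np.float32)    # 1 s of silence as a stand-in
    subtitle_text = "contoh transkrip kearifan lokal"  # placeholder label

    inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
    labels = processor.tokenizer(subtitle_text, return_tensors="pt").input_ids

    # One seq2seq training step on an (audio, subtitle) pair.
    loss = model(input_features=inputs.input_features, labels=labels).loss
    loss.backward()

    # Evaluation: word error rate between reference and generated hypothesis.
    pred_ids = model.generate(inputs.input_features)
    hypothesis = processor.batch_decode(pred_ids, skip_special_tokens=True)[0]
    print(wer(subtitle_text, hypothesis))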

13:15
Numerical Consistency Verification and Correction in Abstractive Summaries Using Semantic and Syntactic Features

ABSTRACT. We address numerical factuality errors in abstractive summaries and present three automatic correction methods: Key Segment Dependency-based Summary Reviser (KSDR), which aligns dependency heads of numbers after identifying key source segments; Semantic Numerical Reviser (SNR), which selects source sentences via sentence-level semantic similarity and matches corresponding numbers; and Meaning-Guided Numerical Reviser (MGNR), which leverages lexical and semantic features around numbers for flexible matching. Evaluated on 190 samples extracted from Tokyo Metropolitan Assembly minutes, SNR achieves 93.2% accuracy and an F1-score of 0.908, while MGNR attains 90.0% accuracy and 0.871 F1. In post-hoc summary revision, MGNR corrects 94.8% of the 58 summaries containing numerical errors, demonstrating practical utility for improving the reliability of political text summarization. These results suggest that incorporating semantic context around numbers yields more robust numerical error correction than relying solely on syntactic alignment.
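
For illustration (a toy in the spirit of SNR, not the paper's method): each summary number can be matched to the source number with the most similar surrounding context and substituted on disagreement. The similarity function and context window are assumptions:

    import re
    from difflib import SequenceMatcher

    def extract_numbers(text):
        # Numeric tokens plus a little surrounding context for matching.
        return [(m.group(), text[max(0, m.start() - 30):m.end() + 30])
                for m in re.finditer(r"\d[\d,\.]*", text)]

    def revise_summary(summary, source, similarity):
        # Toy reviser: replace each summary number with the source number
        # whose surrounding context scores highest under `similarity`.
        revised = summary
        src_nums = extract_numbers(source)
        if not src_nums:
            return revised
        for num, ctx in extract_numbers(summary):
            best_num, _ = max(src_nums, key=lambda s: similarity(ctx, s[1]))
            if best_num != num:
                revised = revised.replace(num, best_num, 1)
        return revised

    sim = lambda a, b: SequenceMatcher(None, a, b).ratio()
    source = "The budget rose to 120 million yen in 2023."
    summary = "The budget rose to 12 million yen in 2023."
    print(revise_summary(summary, source, sim))  # restores "120 million"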

13:30
Relevance Summarization for Example-Guided Text-to-SQL Generation in Complex Schema Database

ABSTRACT. Recent progress in Large Language Models has pushed Text-to-SQL systems closer to enabling natural language queries over complex relational databases. Unfortunately, the models sometimes misread the provided examples, making mistakes in joins, aggregations, or predicates, issues that we label as misunderstanding knowledge evidence. To tackle this, we introduce relevance summarization, a simple prompting technique that adds brief, natural-language overviews to few-shot examples, clarifying the role of each schema element. Tests on the Academic dataset, which contains twenty-two tables, indicate that this approach lifts performance. Execution accuracy rose from a baseline of 75% to 85% with relevance summaries in a 3-shot prompt setup, while the rate of semantic errors decreased markedly across models, including GPT-4.1 Mini, Gemini 1.5 Pro, and Deepseek V3. The findings reveal that this method increases correct query execution rates and decreases typical logical errors, confirming its value for Text-to-SQL in practical environments.
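
For illustration (not the paper's prompts): the pattern can be sketched as a few-shot example prefixed with a brief relevance summary of the schema elements it touches. The table and column names below are invented for illustration:

    # One few-shot example with its relevance summary (invented schema names).
    EXAMPLE = """
    -- Relevance summary: `author` holds researcher names; `writes` links
    -- authors to `publication` via aid/pid; COUNT aggregates per author.
    Question: How many papers did each author publish?
    SQL: SELECT a.name, COUNT(*) FROM author a
         JOIN writes w ON a.aid = w.aid
         GROUP BY a.name;
    """

    def build_prompt(examples, question, schema):
        # Assemble schema, relevance-summarized shots, and the target question.
        shots = "\n".join(examples)
        return f"Schema:\n{schema}\n\nExamples:\n{shots}\n\nQuestion: {question}\nSQL:"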

13:45
Incorporating Curriculum Learning into Iterative Back-Translation for Neural Machine Translation

ABSTRACT. Iterative Back-Translation (IBT) is known as an effective data augmentation method in Neural Machine Translation (NMT). The main idea of IBT is to iteratively improve the quality of pseudo-parallel data produced by Back-Translation and the performance of the two opposite-directional translation models. This work proposes introducing Curriculum Learning into IBT to further enhance NMT models. Our proposed method begins by sorting sentence pairs in the pseudo-parallel data based on their complexity for training. We then perform iterative training of IBT, starting with a simpler subset of the sentence pairs and gradually incorporating harder pairs until all pairs are utilized. We employed various metrics to assess the complexity of the sentence pairs, including sentence length, sentence BLEU score, and cosine similarity of Sentence-BERT embeddings. Our experimental evaluation demonstrated that incorporating Curriculum Learning into IBT improved the performance of NMT. Notably, Sentence-BERT emerged as the most effective metric among those compared.
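
For illustration (not the authors' code): the Sentence-BERT difficulty ordering can be sketched with sentence-transformers, scoring each pseudo-parallel pair by cross-lingual cosine similarity and scheduling easy-to-hard subsets. The model choice, toy pairs, and curriculum fractions are assumptions:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("sentence-transformers/LaBSE")  # assumed multilingual encoder

    def sort_easy_to_hard(pairs):
        # High cross-lingual similarity suggests a clean pair, treated as "easy".
        src = model.encode([s for s, _ in pairs], convert_to_tensor=True)
        tgt = model.encode([t for _, t in pairs], convert_to_tensor=True)
        sims = util.cos_sim(src, tgt).diagonal()
        return [p for _, p in sorted(zip(sims.tolist(), pairs), key=lambda x: -x[0])]

    pseudo_parallel = [("Das ist ein Test.", "This is a test."),   # toy pairs
                       ("Guten Morgen!", "Good morning!")]

    # Curriculum: train on the easiest 25%, then 50%, 75%, and finally all pairs.
    ordered = sort_easy_to_hard(pseudo_parallel)
    subsets = [ordered[: int(len(ordered) * f)] for f in (0.25, 0.5, 0.75, 1.0)]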

14:00
Topological Data Analysis for Transformer NMT: Exploring the Use of Cohomology-Based Persistence Landscapes as a Representation of Global Context

ABSTRACT. This study explores the integration of topological data analysis (TDA) into transformer-based neural machine translation (NMT), with a focus on cohomology-based persistence landscapes as a means of encoding global contextual information. We propose a modified architecture in which a new layer processes document-level persistence landscape features and feeds them into a vanilla transformer model, aiming to enrich the model’s semantic representation through topological insights. The approach was tested on a Portuguese–English translation task. Despite the theoretical appeal of using cohomological summaries to capture high-level structural features of linguistic data, our empirical findings show that the modified model does not outperform the baseline. In fact, BLEU scores were slightly lower compared to the standard transformer. This outcome suggests that either the topological features do not align well with the lexical-level objectives of BLEU, or that transformers already sufficiently capture such structural patterns through attention mechanisms. The findings highlight the challenges in bridging the gap between global topological features and token-level sequence modeling in NMT, offering insights into the limitations and potential future directions for TDA in natural language processing.

13:00-14:30 Session 5B: AI & UX in health and medical application
Location: Room 7604
13:00
A CFA–Clustering Framework for Assessing Health System Performance Considering Healthcare Resources, Socioeconomic Conditions, and Accessibility

ABSTRACT. Traditional evaluations of health system performance often emphasize clinical indicators, overlooking essential social, economic, and accessibility dimensions that are central to achieving Sustainable Development Goal 3 (good health and well-being). This study introduces a multidimensional, theory-driven framework for evaluating regional health system performance in contexts with devolved or locally governed health structures. Using confirmatory factor analysis, 10 input indicators are empirically grouped into three latent constructs: healthcare resources, socioeconomic conditions, and accessibility. The resulting model meets established thresholds for fit indices and demonstrates temporal measurement invariance, supporting its reliability and stability across three reference years—2019, 2021, and 2023. Validated factor scores are subsequently used in cluster analysis. Applied to Indonesia, a geographically diverse country with decentralized health governance, results reveal persistent interprovincial disparities, with 29 provinces maintaining consistent cluster memberships over time. The proposed approach offers a practical tool for identifying regional disparities, informing equitable resource allocation, and supporting data-driven health policy aligned with SDG 3.

13:15
Identifying Key Molecular Features for Cancer-Related hCA Inhibitor Compounds Using Machine Learning Models

ABSTRACT. Human Carbonic Anhydrase (hCA) IX and XII are overexpressed in certain cancers, making them important targets for drug discovery. Traditional methods for identifying hCA inhibitors face challenges due to limited structural data, leading to increased interest in machine learning-based approaches. This study evaluates NCART and XGBoost models for predicting compound activity against hCA isoforms and identifying key molecular features influencing their inhibitory potential. Data preprocessing steps, including duplicate removal and zero-variance feature elimination, improved model performance, with ExtraTrees remaining the best classifier. Feature importance analysis revealed that van der Waals’ surface area features, charge distribution, and ring structures play a crucial role in determining inhibitor activity and selectivity. These findings provide insights for rational drug design, aiding the development of selective inhibitors targeting cancer-related hCA isoforms while minimizing off-target effects.
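
For illustration (not the study's code): gain-based feature importances from a fitted XGBoost classifier, of the kind used to rank molecular descriptors. The synthetic descriptors below stand in for the real VSA, charge, and ring features:

    import numpy as np
    import pandas as pd
    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = pd.DataFrame(rng.random((500, 20)),
                     columns=[f"desc_{i}" for i in range(20)])  # stand-in descriptors
    y = (rng.random(500) > 0.5).astype(int)                     # active vs. inactive

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

    clf = xgb.XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
    clf.fit(X_tr, y_tr)

    # Rank descriptors by importance; in the paper this kind of analysis
    # surfaces VSA, charge distribution, and ring-structure features.
    importance = pd.Series(clf.feature_importances_, index=X.columns)
    print(importance.sort_values(ascending=False).head(10))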

13:30
Recognition of Stroke Level Based on EEG Signal Using Multi-networks of CNN-Transformer

ABSTRACT. Accurate and efficient classification of neurological conditions, such as stroke, is critical for timely diagnosis and effective treatment planning. Conventional diagnostic instruments, such as CT scans, remain expensive and subjective, whereas Electroencephalogram (EEG) signals offer a non-invasive and cost-effective alternative. This research introduces a novel multi-network deep learning architecture for classifying stroke levels (No Stroke, Minor Stroke, Moderate Stroke) from multi-channel EEG data. The model leverages two main feature types: Motor Imagery (Mu, Beta bands) and Asymmetric Channel Pairs (Delta, Theta, Mu, Beta bands), extracted via Wavelet Transform. Each feature set is processed through a dedicated 2D Convolutional Neural Network (2D-CNN) for spatial feature extraction, followed by a Transformer module to model temporal dependencies through self-attention. This parallel pipeline design, referred to as the Multi-networks 2D CNN-Transformer model, enables the system to learn complex spatial-temporal patterns across multiple EEG representations. An experimental evaluation on a 14-channel EEG dataset yielded an accuracy of 91.13%, a loss value of 0.288, and an inference time of 75.83 seconds, using the AdamW optimizer with a learning rate of 0.0001. Compared to CNN-GRU and single-stream CNN models, the proposed multi-network framework demonstrates improved accuracy, computational efficiency, and robustness. These findings highlight the effectiveness of combining spatial and temporal modeling through a parallel architecture, offering a reliable tool for automated stroke-level recognition based on EEG signals.
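
For illustration (not the authors' architecture): one branch of the described design, a 2D-CNN for spatial extraction followed by a Transformer encoder over the time axis, can be sketched in PyTorch. Layer sizes are invented; the full model would run one branch per feature set and merge their outputs:

    import torch
    import torch.nn as nn

    class CNNTransformerBranch(nn.Module):
        # 2D-CNN over a time-frequency EEG representation, then self-attention
        # over the remaining time axis; one branch of the multi-network design.
        def __init__(self, n_classes=3, d_model=64):
            super().__init__()
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
                nn.Conv2d(32, d_model, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((1, None)),  # collapse channel axis, keep time
            )
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, n_classes)

        def forward(self, x):                         # x: (batch, 1, channels, time)
            f = self.cnn(x).squeeze(2)                # (batch, d_model, time)
            f = self.transformer(f.transpose(1, 2))   # attention over time steps
            return self.head(f.mean(dim=1))           # (batch, n_classes)

    logits = CNNTransformerBranch()(torch.randn(8, 1, 14, 128))  # 14-channel toy input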

13:45
Understanding User Experience of E-Health Service Platforms Across Age Groups

ABSTRACT. This study aims to examine the user experience (UX) of e-health service platforms developed under Surabaya’s e-government initiative, with a specific focus on variations across different age groups. As digital healthcare services become more essential, understanding generational preferences is critical for ensuring inclusive and effective user engagement. The research employed the User Experience Questionnaire (UEQ) to assess three main UX dimensions: Attractiveness, Pragmatic Quality, and Hedonic Quality, each with sub-dimensions measured using a 7-point semantic differential scale. Data were collected from 100 respondents in Surabaya, of which 82 responses were analyzed after consistency screening. The findings show that Pragmatic Quality, particularly Perspicuity and Efficiency, was rated most positively across all age groups, indicating the platform’s functionality and ease of use. Generation Z and Generation X expressed a desire for balance between pragmatic and hedonic aspects, while Millennials emphasized the importance of both pragmatic and visual attractiveness. Surprisingly, Generation X evaluated the website as dull, contrasting with previous research suggesting their preference for simple, functional design. Overall, the study concludes that although the e-health website performs well functionally, enhancements in Attractiveness and Hedonic Quality are needed to meet the expectations of a multi-generational user base. Improving these dimensions is expected to increase engagement with the e-health service and contribute to better public health outcomes.

14:00
Measuring Physiological Response and Mental Workload of AR Users in Assembly Learning: Implications for Educational Multimedia Applications

ABSTRACT. This study investigates the impact of Augmented Reality (AR) media as a learning tool for assembly, functioning as a transformative interactive multimedia service in the field of education. Addressing the limitations inherent in conventional assembly teaching methods, this study analyzes AR’s impact on users’ physiological responses (heart rate) and mental workload (NASA-TLX). A comparative experimental design was employed, dividing participants into AR and non-AR groups. Heart rate measurements were taken during the pre-experiment, experiment, and post-experiment phases, while mental workload assessments were conducted post-experiment using the NASA-TLX instrument. Although heart rate analysis did not show statistically significant differences between the two groups, the consistent heart rate patterns observed across all phases indicate physiological responses related to task demands when using multimedia services. Conversely, mental workload results showed that AR users reported significantly lower NASA-TLX scores (M = 15.8660) than the non-AR group (M = 38.9340), t(4.191) = -3.273, p = 0.029. These findings highlight the effectiveness of AR as a multimedia service capable of optimizing cognitive load during the learning process. As an initial study, this research provides valuable insight into the physiological effects and mental workload of AR, contributing to the literature on multimedia services and applications in education. These results can also serve as a reference for the development of more effective AR-based learning services in the future.
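
For illustration (not the study's analysis script): the fractional degrees of freedom reported (4.191) indicate a Welch t-test, which scipy reproduces with equal_var=False. The scores below are illustrative, not the study's raw data:

    from scipy import stats

    # NASA-TLX scores per group (invented numbers for the sketch).
    ar_scores     = [12.0, 14.5, 18.1, 17.2, 17.5]
    non_ar_scores = [35.0, 41.2, 37.8, 44.0, 36.7]

    # Welch's t-test does not assume equal variances, which is what produces
    # non-integer degrees of freedom like the study's 4.191.
    t, p = stats.ttest_ind(ar_scores, non_ar_scores, equal_var=False)
    print(f"t = {t:.3f}, p = {p:.3f}")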

14:45-16:00 Session 6A: Information System and Business Framework
Location: Room 7603
14:45
Balancing Functional Complexity and Flexibility through Integrated System Design
PRESENTER: Syed Nasirin

ABSTRACT. This paper explores how local governments may effectively balance functional complexity and design flexibility through integrated system design in Management Information Systems (MIS). Drawing on a qualitative case study of a local municipality in Sabah, Malaysia, the study investigates how policy-driven rules such as licensing and payment dependencies can be embedded within modular digital platforms. Despite constraints such as limited ICT capacity, shifting political mandates, and vendor-related challenges, the municipality has developed a resilient MIS that supports both operational depth and user-centric features, including QR code access, chatbot support, and unified login systems. Using grounded theory techniques (open, axial, and selective coding), the paper identifies six core dimensions that shape this balance, including hierarchical business logic, modular architecture, and user accessibility. The findings offer a conceptual model that may inform the design of scalable and adaptive public-service transformation agendas in similar contexts globally.

15:00
Design of BPM for Inventory and Operations System in Distributor Company by using 7FE Framework

ABSTRACT. In the era of Industry 4.0, companies whose business processes are still conventional have a great opportunity to take their operations digital. A distributor company, for example, is an agency that acts as the liaison or bridge between producers and consumers. This study aims to identify the operations and inventory business processes at PT. EMS, including the sales, purchasing, and inventory processes, in order to determine requirements and design the new business processes to be implemented. The methods used in this study are literature study, interviews, and observation. Business Process Management (BPM) is implemented using the 7FE Framework, covering seven of its ten phases: organization strategy, process architecture, launch pad, understand, innovate, people, and develop. The analysis of the ongoing business processes is then translated into the design of an information system in the form of a web application that supports sales, purchasing, and stock management, and produces reports according to company needs. The resulting web application is expected to help the company run its business processes more effectively and efficiently.

15:15
Sea Animal-Inspired Metaheuristics for No-Wait Flexible Flow Shop Scheduling in Small-Scale Bakery Enterprises

ABSTRACT. This study investigates the performance of three sea animal-inspired metaheuristic algorithms—Whale Optimization Algorithm (WOA), Jellyfish Search Optimizer (JSO), and Walrus Optimization Algorithm (WaOA)—in addressing the No-Wait Flexible Flow Shop Scheduling Problem (NWFFSP) in the context of small-scale bakery production. The scheduling problem is modeled as a two-stage process with heterogeneous machines and strict no-wait constraints, and evaluated under both single-objective and multi-objective scenarios. Three experiments are conducted to assess algorithm performance based on makespan and total oven idle time (ToIT), using a factorial parameter design across sixteen configurations. Results indicate that JSO consistently outperforms the other two algorithms in both solution quality and robustness, while WaOA provides stable alternatives with moderate competitiveness. WOA performs weaker and is more sensitive to parameter tuning. These findings offer practical insights into the suitability and behavior of sea-inspired metaheuristics in real-world scheduling applications.
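
For illustration (not the paper's model): the objective the metaheuristics search over can be sketched as a no-wait makespan evaluator. This simplification assumes one machine per stage, unlike the paper's heterogeneous-machine setting:

    def nowait_makespan(sequence, p1, p2):
        # Two-stage no-wait flow shop with one machine per stage (simplified).
        # p1[j], p2[j]: processing times of job j at stage 1 (e.g., mixing)
        # and stage 2 (e.g., oven). No-wait: each job must enter stage 2 the
        # moment stage 1 finishes, so stage-1 starts are delayed as needed.
        m1_free = m2_free = 0.0
        for j in sequence:
            start = max(m1_free, m2_free - p1[j])  # oven free exactly at handover
            m1_free = start + p1[j]                # stage-1 finish = stage-2 start
            m2_free = m1_free + p2[j]
        return m2_free

    # Toy instance: WOA/JSO/WaOA would search over permutations of `sequence`.
    print(nowait_makespan([0, 2, 1], p1=[3, 2, 4], p2=[5, 6, 3]))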

15:30
Adoption Model of Autonomous Coal Mining System Based on TOE Framework

ABSTRACT. The coal mining sector plays a pivotal role in China’s economic development, yet the adoption rate of Autonomous Coal Mining Systems (ACMS) remains low. This study aims to identify the key factors influencing the acceptance of ACMS among coal mining enterprises, focusing on Ordos, a major coal-producing region. Using the Technology-Organization-Environment (TOE) framework, a structured questionnaire was distributed to senior management and IT departments of local coal companies. The collected data were analysed using multiple linear regression in SPSS to evaluate the impact of various technological, organisational, and environmental variables. The results indicate that factors such as system compatibility, leadership support, resource availability, and regulatory environment significantly affect ACMS adoption. These findings contribute to a more comprehensive understanding of ACMS acceptance in industrial contexts and offer practical guidance for policymakers and industry stakeholders aiming to promote intelligent and safer coal mining operations.
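
For illustration (the study ran its regression in SPSS): multiple linear regression over TOE factors is equivalent to an OLS fit. A sketch with statsmodels, using invented file and column names for the survey variables:

    import pandas as pd
    import statsmodels.api as sm

    # Likert-scale questionnaire data; names below are illustrative TOE variables.
    df = pd.read_csv("acms_survey.csv")  # hypothetical file
    X = sm.add_constant(df[["compatibility", "leadership_support",
                            "resource_availability", "regulatory_environment"]])
    model = sm.OLS(df["adoption_intention"], X).fit()
    print(model.summary())  # coefficients and p-values per TOE factor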

15:45
Developing an Integrated Success Model for Digital Twin Adoption in Facilities Management: A Conceptual Framework for Malaysia

ABSTRACT. Digital Twin (DT) technology is increasingly recognized for its potential to transform Facilities Management (FM) by enabling real-time monitoring, predictive maintenance, and data-driven decision-making. Despite its benefits, the adoption of DT in FM, particularly in Malaysia, remains limited due to a range of technical, organizational, and human-related challenges. This paper presents a conceptual framework called the Integrated Success Model (ISM), developed to support the effective implementation of DT in FM settings. The model is grounded in established theories, including the DeLone and McLean IS Success Model, the Technology Acceptance Model (TAM), and the Technology Readiness Index (TRI). It brings together key factors from these frameworks to address issues such as system quality, user readiness, and stakeholder involvement. By outlining this model, the paper aims to provide a practical foundation for future research and to assist policymakers, practitioners, and researchers in planning and evaluating DT adoption strategies that align with Malaysia’s digital transformation goals.

14:45-16:00 Session 6B: Computer Vision
Location: Room 7604
14:45
A Paradoxical Role of Motion Blur in 4D Reconstruction from Casual Videos

ABSTRACT. Reconstructing dynamic 4D scenes from casually captured videos remains one of the biggest challenges for generalized models. Casually captured videos inevitably contain motion blur, which we must consider as an inherent property of the input. In practice, we must decide whether to process the blurry video as-is or apply a deblurring step to improve the input. Our analysis of the effect of motion blur on reconstruction quality reveals a paradox. We find that reconstructions from sharp video inputs can unexpectedly fail, resulting in fragmentation or flawed models of dynamic subjects. Conversely, although lacking detail, motion-blurred inputs often produce more structurally coherent and stable reconstructions, though with notable ghosting artifacts. This suggests that motion blur provides a form of spatio-temporal smoothing that supports geometric reconstruction. Our contribution demonstrates that a deblurring preprocessing step effectively addresses this trade-off: it reduces fragmentation in sharp inputs while restoring detail and minimizing ghosting artifacts from blurred inputs, resulting in more complete and visually coherent 4D reconstructions. Our findings highlight that, for robust 4D reconstruction from casual video, balancing input sharpness and blur is a key challenge. It underscores the importance of deblurring in achieving both geometric accuracy and detailed visuals.

15:00
A Comparative Study of Nabla Tau and SCAR Unlearning Algorithms for CNN-Based Facial Race Classification

ABSTRACT. The proliferation of machine learning (ML) models, particularly Convolutional Neural Networks (CNNs) for sensitive tasks like facial race classification, has created an urgent need for mechanisms to remove the influence of specific data points from trained models. This need is driven by privacy regulations such as the GDPR's "right to be forgotten". While naïve retraining is a straightforward solution, it is computationally prohibitive. Approximate unlearning algorithms offer a more efficient alternative, but their performance varies. This paper presents a comparative study of two state-of-the-art approximate unlearning algorithms: Nabla Tau (∇τ) and SCAR. The study evaluates their performance on a CNN model trained for facial race classification using the FairFace Race dataset. The evaluation focuses on three key aspects: effectiveness (ability to forget), efficiency (computational cost), and utility (post-unlearning model accuracy). Experiments were conducted under two scenarios: random data removal (RR) and full class removal (CR), with varying proportions of data to be forgotten. The results indicate that SCAR demonstrates superior effectiveness in random removal scenarios. Conversely, Nabla Tau proves more effective for class removal, successfully reducing forget-class accuracy to zero, whereas SCAR exhibits catastrophic unlearning. Ultimately, Nabla Tau emerges as the more robust algorithm, preserving model utility where SCAR fails, highlighting a critical trade-off between unlearning effectiveness and the risk of catastrophic model degradation.

15:15
LiteLSTM Architecture for Mediapipe Hand Gesture Classification

ABSTRACT. Human-Computer Interaction has progressed and given birth to various methods aiming at better usability and user experience. MediaPipe is a powerful pose-estimation library that allows a computer to read hand information through a camera. Hand gesture classification can be built with LSTM (Long Short-Term Memory), an algorithm capable of handling sequential data. However, LSTM carries a computational cost that may affect device performance. LiteLSTM is a modified LSTM that reduces this computational cost while still delivering promising accuracy. We achieve hand gesture classification accuracies of 94.44% for LiteLSTM and 93.89% for standard LSTM. At the same time, we address the cost issue of LSTM, as shown, among other measures, by LiteLSTM's shorter training time compared to standard LSTM.

15:30
Indonesian Sign Language (BISINDO) Alphabet Detection Through a Mobile-Based Approach Based on YOLO Algorithm Version 11

ABSTRACT. Individuals who are Deaf or hard of hearing depend significantly on sign language, a visual mode of communication that employs bodily movements, facial expressions, and hand gestures. Advances in deep-learning methodologies offer substantial potential for automating the recognition of Bahasa Isyarat Indonesia (BISINDO), thereby enhancing the language’s accessibility to a broader range of users. This paper proposes a BISINDO alphabet recognition system based on You Only Look Once (YOLO), a state-of-the-art real-time object detection model, with the latest YOLOv11 version used to recognize hand gestures corresponding to each letter of the alphabet. The proposed system uses an effective and portable detection architecture that minimizes latency while preserving high precision in a mobile environment. Comparative training of several YOLO variants yielded a peak mean average precision (mAP) of 82.3% and a processing rate of at least 10 frames per second, demonstrating the system’s viability for real-time BISINDO recognition on mobile platforms.
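
For illustration (not the authors' training setup): training and exporting a YOLO11 nano model for mobile deployment with the ultralytics package. The dataset YAML and image path are placeholders:

    from ultralytics import YOLO

    # Nano variant suits mobile deployment; dataset YAML is a placeholder
    # describing the BISINDO alphabet classes and image paths.
    model = YOLO("yolo11n.pt")
    model.train(data="bisindo_alphabet.yaml", epochs=100, imgsz=640)

    metrics = model.val()             # reports mAP50 / mAP50-95
    model.export(format="tflite")     # TFLite export for Android deployment
    results = model("hand_frame.jpg") # single-image inference (placeholder path)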

15:45
Time Series Sentiment Analysis of YouTube Videos in the 2024 Indonesian Presidential Election

ABSTRACT. Social media platforms like YouTube have become potent tools for influencing public opinion during elections in recent years. This study examined the sentiments expressed in YouTube videos related to the three presidential candidates in the 2024 Indonesian election. We first categorized the videos into three distinct groups: official channels of the candidates, public news sources, and third-party-created content. We then conducted sentiment analysis on each video and computed a metric called the Sentiment Impact Score (SIS) to quantify overall sentiment trends. Our findings reveal a notable shift in public sentiment, with the elected candidate gaining favor, particularly in third-party-created videos.