Biobjective Optimization Problems in Simple Recurrent Neural Networks PRESENTER: Kazuma Morishita ABSTRACT. This paper considers an approach to multi-objective optimization problems [1] in recurrent neural networks. For simplicity, we consider bi-objective optimization problems in a binary neural network (BNN [2] [3]). The BNN is a simple recurrent neural network characterized by ternary cross-connection parameters and the signum activation function. In the BNN, we have a parameter setting method that guarantees storage of desired memories. Prospective applications include associative memories [4] and error correcting codes [5]. We define a bi-objective optimization problem based on two objectives. The first objective evaluates the stability of desired memories and the second objective evaluates the sparsity of connection parameters. To solve the problem, we apply a simple evolutionary algorithm. Performing precise numerical experiments for typical BNNs, we have obtained Pareto fronts that guarantee the existence of trade-offs between the two objectives. [1] Y. Jin and B. Sendhoff, Pareto-based multiobjective machine learning: An overview and case studies, IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., 38, 3, pp. 397-415, 2008. [2] K. Morishita and T. Saito, A simple bi-objective evolutionary algorithm for binary associative memory, Proc. of JKCCS, pp. 173-176, 2023. [3] K. Morishita and T. Saito, A Simple Algorithm for Multiobjective Optimization Problems in Binary Neural Networks, Proc. of NOLTA, pp. 12-15, 2023. [4] J. J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Nat. Acad. Sci., 79, pp. 2554-2558, 1982. [5] C. Anton, L. Ionescu, I. Tutanescu, A. Mazare and G. Serban, Error detection and correction using LDPC in parallel Hopfield networks, Proc. ISEEE, 2013. |
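As a rough illustration of the ingredients named in this abstract (ternary weights, signum activation, a stability objective, a sparsity objective, and a Pareto front), the following Python sketch may help. It is not the authors' method: the objective definitions, the clipped Hebbian-style weight rule, the tie-breaking of the signum at zero, and all function names are assumptions made for this example.

```python
import numpy as np

def signum(v):
    # Signum activation onto {-1, +1}; sending 0 to +1 is an assumption.
    return np.where(v >= 0, 1, -1)

def bnn_step(W, x):
    # One synchronous update of the recurrent network: x(t+1) = sgn(W x(t)).
    return signum(W @ x)

def stored_fraction(W, memories):
    # Hypothetical stability objective: fraction of memories that are fixed points.
    return float(np.mean([np.array_equal(bnn_step(W, m), m) for m in memories]))

def sparsity(W):
    # Hypothetical sparsity objective: fraction of zero connection parameters.
    return float(np.mean(W == 0))

def pareto_front(points):
    # Keep the points not dominated by any other point (both objectives maximised).
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(p)
    return front

# Two desired memories of length 4 (toy data).
memories = np.array([[1, -1, 1, -1], [1, 1, -1, -1]])
# Sum of outer products clipped to ternary values {-1, 0, +1} (an assumed rule).
W = np.clip(memories.T @ memories, -1, 1)

print(stored_fraction(W, memories), sparsity(W))
print(pareto_front([(1.0, 0.5), (0.5, 0.75), (0.5, 0.5)]))
```

An evolutionary algorithm would generate many candidate weight matrices, score each with the two objectives, and pass the resulting score pairs to a filter like `pareto_front`.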
Enhanced Cooperative Spectrum Sensing Using Two-Stage LSTM-CNN Models PRESENTER: Swe Swe Latt ABSTRACT. To tackle the challenge of limited available spectrum in the realm of wireless networks, cognitive radio, a technology enabling unlicensed devices to intelligently access licensed spectrum opportunistically, has emerged as a promising solution. This paper presents an innovative approach to cooperative spectrum sensing in cognitive radio systems. Our method employs a dual-deep learning framework, uniting a Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) for local energy detection, complemented by an LSTM-based fusion center. Our proposed method operates in two stages. Initially, the LSTM-CNN-based local detector predicts primary user (PU) signals individually. To enhance detection performance, the local decisions are then integrated and learned by the fusion center. A key strength of our method lies in its ability to predict both the current and future spectrum patterns of the PU. This is achieved by leveraging the temporal and spatial feature extraction capabilities embedded within the LSTM and CNN layers. Consequently, our method not only learns the present PU signal pattern but also anticipates its future pattern. This represents a significant advancement over traditional convolutional networks, which are limited to detecting the current pattern. The two-stage learning approach for PU signal detection aims to collectively improve the overall detection performance among all secondary users. Our method promises enhanced detection capabilities by harnessing the strengths of LSTM and CNN models, offering a comprehensive solution for cooperative spectrum sensing in cognitive radio systems. We conducted an extensive series of simulations to validate the effectiveness of our method, and the results unequivocally show that our approach outperforms existing methods in terms of detection accuracy and classification precision. |
Personalized Federated Learning with Divergence Subduing Gradient PRESENTER: Binh Nguyen ABSTRACT. Federated learning (FL) is a distributed and privacy-preserving machine learning concept that enables distributed clients, with the help of a centralized server, to learn a collaborative model without sharing their data. However, the performance of FL is obstructed by non-independent and identically distributed (non-IID) data and device heterogeneity. In this research, we examine this key challenge from the perspective of the gradient on the server side. In particular, we study the phenomenon called gradient conflict, which occurs between multiple clients due to the heterogeneity of data distribution. To tackle this problem, we propose PerDSG, a method that lessens the divergence of local gradients through Divergence Subduing Gradient. This method finds the optimal gradient direction update that not only minimizes the average loss function but also improves the worst local update of clients. Comprehensive experiments show that PerDSG significantly enhances the performance of the state-of-the-art FL baselines. |
Reducing Write Amplification of SSDs Through Machine Learning Based Hotness Classification PRESENTER: Sungjun Yun ABSTRACT. An SSD (Solid State Drive) with NAND flash memory has faster I/O performance than a conventional HDD (Hard Disk Drive). However, NAND flash memory has several constraints, such as erase-before-write and a limited number of program/erase (P/E) cycles. Additionally, in NAND flash memory, the erase unit (block) is much larger than the read/write unit (page). To address these properties of NAND flash memory, SSDs employ out-of-place updates, invalidating the previous data and writing the new data to a different place. Therefore, when free space is exhausted, the SSD must perform Garbage Collection (GC) to delete invalid pages and make some empty blocks. Because the SSD has to migrate the remaining valid pages of victim blocks to a safe place before erasing them, GC induces additional writes and decreases other I/O jobs’ performance. To address this problem, we propose a machine learning based data hotness classification method to allocate data to separate blocks based on their invalidation timing. In this paper, we construct an ML model to predict and label the hotness of each page and evaluate the proposed method with an SSD simulator. According to evaluation results, the proposed hotness classification method can reduce the number of valid pages copied by up to 32.7% during GC. |
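The link between valid-page copies during GC and write amplification can be made concrete with a small Python sketch. The block layout, page counts, and function names below are hypothetical toy values chosen for illustration; they are not taken from the paper or its simulator.

```python
def write_amplification(host_pages, gc_copied_pages):
    # Write amplification factor: pages written to flash / pages written by the host.
    if host_pages <= 0:
        raise ValueError("host_pages must be positive")
    return (host_pages + gc_copied_pages) / host_pages

def valid_pages_to_copy(block_valid_counts, victims):
    # Valid pages that GC must migrate before the victim blocks can be erased.
    return sum(block_valid_counts[b] for b in victims)

# Toy scenario: 4 blocks of 4 pages, 16 host page writes, half of the data is hot
# and has already been invalidated by the time GC runs.
# Mixed placement: every block still holds 2 valid (cold) pages.
mixed = [2, 2, 2, 2]
# Hotness-separated placement: hot blocks are fully invalid, cold blocks fully valid.
separated = [0, 0, 4, 4]

victims = [0, 1]  # GC reclaims two blocks in both cases
for name, layout in (("mixed", mixed), ("separated", separated)):
    copied = valid_pages_to_copy(layout, victims)
    print(name, copied, write_amplification(16, copied))
```

In this toy case, grouping pages with similar invalidation timing lets GC pick victim blocks that contain no valid pages, so no extra writes are needed to reclaim them.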
Reservoir Reinforcement Learning Model for Generating and Switching Motor Primitives PRESENTER: Yu Yoshino ABSTRACT. Reinforcement learning, which relies solely on reward-based learning without explicit instructions, is increasingly recognized in robotics. It presents unique challenges, such as sample efficiency, training costs, state space dimensionality, and partial observability. To address these, this study introduces a modular model for generating motor primitives—basic motion patterns with biological underpinnings. The model consists of two networks: one for generating motor primitives and another employing reinforcement learning to switch between these primitives. The evaluation demonstrated that this model effectively switched motor primitives, creating complex trajectories like the figure-eight pattern. |
Effect of Memory Capacity Characteristics on Time-Series Prediction in Reservoir Neural Network Consisting of Neurons with Local Temporal History PRESENTER: Go Ishii ABSTRACT. Previous studies using a chaotic neural network reservoir (CNNR) indicated that the local memory at the neuron level would affect the time-series prediction performance because the shape of memory capacity (MC) characteristics differs from those of ordinary reservoir neural networks. In this study, we first modify the CNNR model to propose a reservoir neural network with local temporal history introducing two memory terms in the neuron model. The permissible ranges of these local memory terms are also expanded to have both negative and positive values considering the echo state property condition. Second, one-step time-series prediction tasks are used to evaluate the effects of the local memory terms. We verify these terms improve the time-series prediction performance. Finally, the MC characteristics are employed to further investigate the effects. Here, we clearly demonstrate that the local memory terms dramatically alter the shape of MC characteristics. |
Evaluating the Capability of Transformers in Computational Algebra Problems PRESENTER: Yuta Sato ABSTRACT. This study analyzes the mathematical reasoning capability of Transformers using computational algebra problems. Particularly, we consider polynomial-related problems: computation of the greatest common divisor, factorization, and counting of the irreducible components. For each of these problems, we train a Transformer, one of the most successful deep neural networks in recent machine learning, on many pairs of problems and solutions. Our experiments show that Transformers successfully learn to solve all the problems for polynomials with integer coefficients. Interestingly, when the coefficients of a finite field are used, we observe a significant performance degradation. The results indicate the limited capability of vanilla Transformers in modular arithmetic. We also observe that the input encoding scheme and the number of encoding layers affect the performance. |
Game-Theoretic Visualization Method of Important Pixels PRESENTER: Kosuke Sumiyasu ABSTRACT. To better understand the behavior of image classifiers, it is useful to visualize the contribution of individual pixels to the model prediction. In this study, we propose a method, MoXI (Model eXplanation by Interactions), that efficiently and accurately identifies a group of pixels with high prediction confidence. The proposed method employs game-theoretic concepts, Shapley values and interactions, taking into account the effects of individual pixels and the cooperative influence of pixels on model confidence. Theoretical analysis and experiments demonstrate that our method better identifies the pixels that are highly contributing to the model outputs than widely-used visualization methods using Grad-CAM, Attention rollout, and Shapley value. While prior studies have suffered from the exponential computational cost in the computation of Shapley value and interactions, we show that this can be reduced to linear cost for our task. |
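To show why the exact Shapley value is expensive, here is a brute-force Python sketch that enumerates every coalition, which costs exponential time in the number of players. It is the textbook definition only, not MoXI or its linear-cost reduction; the three "pixels" and the `confidence` function are hypothetical stand-ins for a real classifier.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    # Exact Shapley values by enumerating all coalitions of the other players.
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi

def confidence(pixels):
    # Hypothetical model confidence: pixels "a" and "b" only help together.
    return 1.0 if {"a", "b"} <= pixels else 0.0

print(shapley_values(["a", "b", "c"], confidence))
```

Here "a" and "b" split the credit equally and "c" receives none, which reflects the cooperative (interaction) effect the abstract refers to.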
Point Cloud Data Reduction Method Considering Bandwidth Limitation at an Intersection PRESENTER: Hidetoshi Hotoyama ABSTRACT. Danger detection by in-vehicle sensors is limited in its recognition range, thus it is difficult to ensure sufficient safety at intersections because of the amount of information and the many blind spots. To maintain safety, accident prevention systems using LiDAR sensors at intersections have been proposed. However, the amount of point cloud data collected by LiDAR sensors is large and may become a bottleneck in the network communication link. In this paper, we propose a game-theoretic method for reducing the amount of point cloud data by considering the spatial importance of point cloud data. Spatial importance means the importance of dynamic objects such as vehicles, people, bicycles, etc. We show that simulations on point cloud data sets maintain the quality of high importance regions while satisfying bandwidth limitations. |
Incentive Mechanism Considering User Departure with a Participatory Sensing System PRESENTER: Yuta Miyakawa ABSTRACT. Participatory sensing is expected to be used in various fields such as traffic monitoring, infrastructure management, environmental monitoring, and smart cities because of the low cost of implementation and the wide range of sensing possibilities. Data collection is costly, and users cannot be motivated to collect data without an incentive mechanism that rewards participating users. However, conventional incentive mechanisms do not consider user departure from a participatory sensing system. In this paper, we propose an incentive mechanism that considers user departure with a participatory sensing system. |
Situation-Aware Robust Multi-Step Prediction of Vehicle Trajectory with Enhanced Transformer PRESENTER: Minsung Kim ABSTRACT. Multi-step vehicle trajectory prediction for urban transportation is particularly challenging due to the complex structure of streets. In addition, the driver’s decision is regulated by traffic lights and traffic enforcement cameras, and thus, the trajectory forecasting model is prone to diverge. In this work, we propose a Transformer-based neural network architecture for vehicle trajectory prediction that is situation-aware and robust to error accumulation. To enhance the trajectory prediction accuracy, we propose to learn from the road geometry as well as the additional traffic control systems such as traffic lights in addition to learning from the past trajectory. The proposed approach utilizes the Transformer model to analyze and forecast the multi-step vehicle trajectories. To make accurate predictions, we propose an add-on neural network, called AdvisoryNN, that provides the encoded information about the traffic control system to the Transformer model. Through extensive evaluations using real-world trajectory data, we show the effectiveness of the proposed architecture compared to the widely-used recurrent neural network models. |
Edge Server Placement for Maximum Delay Reduction Using Queueing Model PRESENTER: Koki Shibata ABSTRACT. When edge computing is used for automated driving technology or online gaming, reducing the maximum delay is important because real-time performance for all users must be ensured from Quality-of-Service (QoS) perspective. The main latency that occurs in edge computing can be expressed as the sum of the propagation delay during data transmission and the waiting time at the edge server. The conventional edge computing method focuses on reducing the average delay of user-processing requests (tasks). However, this method increases the maximum delay due to increasing the utilization variance of each edge server. In this paper, we propose a method for determining both edge server placement and task allocation to reduce the maximum delay. |
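The delay decomposition in this abstract (propagation delay plus waiting time at the edge server) can be sketched in Python as follows. The abstract does not state which queueing model is used, so the M/M/1 mean sojourn time 1/(mu - lambda) is an assumption here, as are all rates, delays, and names.

```python
def mm1_sojourn_time(arrival_rate, service_rate):
    # Mean time in an M/M/1 system (waiting plus service): 1 / (mu - lambda).
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be below service_rate")
    return 1.0 / (service_rate - arrival_rate)

def user_delays(assignment, propagation, task_rate, service_rate):
    # assignment[u] is the server index chosen for user u.
    load = {}
    for u, s in assignment.items():
        load[s] = load.get(s, 0.0) + task_rate[u]
    # Per-user delay = propagation delay to the server + time spent at the server.
    return {
        u: propagation[u][s] + mm1_sojourn_time(load[s], service_rate[s])
        for u, s in assignment.items()
    }

# Hypothetical toy instance: 3 users, 2 edge servers.
propagation = {"u0": [0.1, 0.5], "u1": [0.1, 0.5], "u2": [0.2, 0.3]}
task_rate = {"u0": 1.0, "u1": 1.0, "u2": 1.0}
service_rate = [4.0, 4.0]

nearest = {"u0": 0, "u1": 0, "u2": 0}   # everyone picks the closest server
balanced = {"u0": 0, "u1": 0, "u2": 1}  # one user is moved to the farther server

for name, a in (("nearest", nearest), ("balanced", balanced)):
    d = user_delays(a, propagation, task_rate, service_rate)
    print(name, max(d.values()), sum(d.values()) / len(d))
```

In this toy instance, moving one user to a more distant but lightly loaded server lowers the maximum delay because the queueing term dominates the extra propagation delay.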
Characteristic Analysis of Three Classes of VoIP Session Considering Talkspurt Length in Call Admission Control PRESENTER: Sota Narikiyo ABSTRACT. In an emergency such as a disaster, the level of importance depends on the purpose of each call. Call Admission Control (CAC) using thresholds is provided to prioritize connections according to the importance level of calls. Some conventional CAC methods focus on the location of the caller and classify calls into three types: emergency calls, calls from within the disaster area, and calls from outside the disaster area. However, these conventional CAC methods do not consider the talk speed of the caller in emergency situations. In this paper, we propose a CAC method for calls in emergencies considering talk speed. We model our CAC method with talk speed considering both talkspurt length and silence period by MMPP queueing theory. Moreover, we analyze characteristics of our method with various talkspurt lengths. |
Analysis of a Blockchain-based Priority Management System for disaster information PRESENTER: Tomoya Kato ABSTRACT. In an earthquake or other disaster, we need to quickly share information on the damage and evacuation status of the affected areas. However, information management systems used currently are centralized and may not be able to share information on the disaster area quickly due to traffic congestion and server problems. To solve this problem, a decentralized management system using blockchain has been proposed. However, blockchain has the problem that processing time depends on the number of transactions. In this paper, we propose a prioritized management system for disaster information, and evaluate the delay time of the blockchain using queueing theory. |
Similarity Analysis among Languages Using Context Vectors Generated in Large Language Model PRESENTER: Hiroki Azuma ABSTRACT. Machine translation, one of the AI technologies recently paid attention to, is implemented by large-scale language models, in which a numeric vector, the so-called context vector, represents the meaning of a sentence. In this study, we evaluate the similarity among languages by analyzing context vectors generated in a large language model and then compare the obtained result with the language network proposed in linguistics. |
Just In Time Compilation Code protection in WebAssembly PRESENTER: Suhyeon Song ABSTRACT. WebAssembly, a technology designed for platform-agnostic code execution, relies on the V8 WebAssembly engine, employing JIT compilation to optimize performance. However, this engine's use of Intel MPK, a hardware-dependent security feature, contradicts WebAssembly's platform-agnostic goals. This paper presents a novel software-based approach for read-write protection in WebAssembly JIT compiled code. This solution safeguards compiled code from external tampering and unintended operations, aligning with the core principle of secure and cross-platform execution in WebAssembly. |
Multimodal Cortical-based Analysis with Attention Modelling for Early Alzheimer's Disease Diagnosis PRESENTER: Quan Anh Duong ABSTRACT. In neuroimaging, cortical surface representation offers many advantages in understanding the progression of Alzheimer's disease, from quantitatively measuring cortical atrophy patterns to assessing brain function by analyzing the PET Standardized uptake value ratio (SUVR). However, existing state-of-the-art deep learning methods for AD diagnosis still rely on 3D volumetric features. In this study, we propose a new deep-learning framework for early AD diagnosis. Our framework utilizes lightweight cortical structural features extracted from T1w MRI and cortical functional features from FDG PET. We then use an attention model for AD diagnosis (cognitive normal vs. AD) and early AD diagnosis (cognitive normal vs. mild cognitive impairment) tasks. Our attention model employs cross-attention mechanisms to effectively analyze inter-modality interactions. Experimental results on the ADNI-1 cohorts demonstrate that the framework outperforms state-of-the-art volume-based methods on early AD diagnosis. |
Deep Reinforcement Learning Approach for Resource Optimization and Load Balancing on Edge Computing PRESENTER: Avilia Kusumaputeri Nugroho ABSTRACT. Placed on the edge of the network, edge computing is an efficient equivalent to cloud computing that reduces network delay and congestion. In this regard, edge computing is considered a key enabler for delay-sensitive applications. Optimization-based approaches have been proposed to promote the optimal use of edge computing resources and task offloading from resource-constrained user devices. However, widely used integer programming solutions suffer from high computational complexity, making them inapplicable to online scheduling. In this work, we propose an edge computing approach based on reinforcement learning that integrates load balancing and resource allocation optimization within online resource scheduling. Our approach uses the Proximal Policy Optimization (PPO) algorithm. This enables the agent to execute complex actions, including dynamic resource allocation, load balancing, and task prioritization. Such capabilities optimize resource management, task prioritization, and system efficiency in edge computing orchestration. To validate the effectiveness of the proposed solution, we implement an edge computing environment and then carry out performance comparisons with widely used resource scheduling algorithms. |
HAGCN: Heterogeneous Attentive GCN for gene-disease association PRESENTER: Ki Beom Kim ABSTRACT. Predicting gene-disease association (GDA) is essential for comprehending molecular mechanisms, diagnosing diseases, and targeting genes. The validation of causal relationships between diseases and genes through experimental methods can be exceedingly costly and time-consuming. Deep learning, in particular graph neural networks, has shown great promise in this area. However, a significant limitation remains despite the success in this area. One major issue is that models mainly rely on single-source homogeneous networks. Another is the need for expert knowledge and manual definition of meta-paths to build multi-source heterogeneous networks. Acknowledging these challenges, the present study presents the Heterogeneous Attentive Graph Convolution Network (HAGCN), a hybrid graph neural network. The model combines the strengths of a traditional GCN with an attention mechanism, enabling the accurate prediction of GDA. Without defining the meta-path, the model uses a heterogeneous graph consisting of Gene, Gene Ontology (GO), Disease, Disease Ontology (DO), and Human Phenotype Ontology (HPO). The experimental evaluations have demonstrated the model's superiority over four state-of-the-art models, based on metrics such as AUC-ROC, Accuracy, and F1 score. We believe that HAGCN can accelerate the sequence of finding disease-associated genes and contribute to computational drug discovery. |
10:20 | Configuration of a data-center network for strong accessibility and load balancing PRESENTER: Jaeho Kim ABSTRACT. To maintain strong accessibility and load balancing in a data-center network, we propose a configuration method inspired by a statistical physics approach and multi-hop coverage. Through numerical simulations, we show that feasible solutions obtained by the proposed method could be utilized for data-center configuration. Moreover, data-centers assigned by the proposed method have a shorter distance between data-centers and end-users, and more even load balancing among data-centers than those assigned by a conventional heuristic method. We also consider a healing method against attacks by adding proxy centers in order to re-cover end-users. We conclude that such configurations of center networks are resilient and well-distributed. |
10:35 | Design and Evaluation of a Hybrid AI Inspection System for Detecting Bad Welds in Manufacturing Quality Control PRESENTER: Seungho Lee ABSTRACT. The integration of computer vision, a key component of artificial intelligence, has revolutionized the manufacturing sector. An important application in this field is weld defect detection, which is essential to ensure product quality. Traditionally, weld inspection has relied on non-destructive testing methods such as radiographic testing (RT), but these methods suffer from human limitations in interpreting the results. In this study, we present a novel computer vision-based system for weld defect detection utilizing a dual-model approach that combines the features of DRAEM, an unsupervised learning-based anomaly detection model, and U-Net, an image segmentation model. The system addresses two major challenges in weld defect detection: the biased nature of manufacturing data and the need to transform unstructured RT image data into a structured format. With this approach, our system effectively normalizes unstructured RT images and accurately detects weld defects, confirming its potential to replace traditional human-based methods. This work not only demonstrates the potential of combining advanced machine learning models for defect detection, but also lays the foundation for future innovations in automated manufacturing processes. |
10:50 | The Boundary Between Communication System Speeds and Chaotic Random Number Generators ABSTRACT. In the digitalized world, the speed of communication systems is increasing every day, but at the same time, the need for secure communication systems emerges. On the other hand, increasing computation power makes it easier for hackers to attack secure communication systems. In this context, the design of faster and more robust random number generators has become a necessity as the heart of security systems. In this paper, the throughput and robustness parameters of Chaotic Random Number Generators, which are promising for faster operation, will be evaluated simultaneously, their limits will be revealed, and their feasibility in a typical communication system will be presented with simulation results. |
3D Mesh Models Simplification Based on Surface Similarity PRESENTER: Soma Ueno ABSTRACT. There is a technology called Virtual Reality paint (VR paint) that enables modeling of 3D models as if painting in a virtual space. The advantage of this technology is that modeling is possible without requiring specialized CG modeling techniques. In addition, when 3D models are used for content such as games and video works, it is often preferable to use simplified models rather than highly detailed ones. However, 3D models created with VR paint have a unique structure, and applying conventional simplification methods may lead to unintended results. Therefore, this study proposes an effective simplification method for 3D models created by VR paint. The results of simplification using the proposed method suggest that for shapes composed of planes, such as cubes and pyramids, the amount of data can be significantly reduced while preserving the shape. |
Error-related activity in human-robot cooperative exploration in virtual maze PRESENTER: Takumi Fuji ABSTRACT. Error-related Potential (ErrP) is an electroencephalography (EEG) signal that is generated when a human perceives an erroneous action. Using ErrP, it is possible to correct robot motion when the robot makes an error in response to human commands. Until now, we have studied cooperative systems between an autonomous mobile robot and a human for environmental exploration, and found that ErrP could be observed at 200 msec in the time domain in a small-sized maze. However, the latency and amplitude of ErrP vary depending on the task. In this study, we characterized ErrP during human-robot cooperative activity in the time domain as well as in the frequency domain in the exploration of a larger-sized maze. The results revealed an increase in θ power corresponding to ErrP in the time domain at 5-600 msec as well as at 200 msec after the robot's error. In addition, a significant decrease in β power was observed around 600 msec after the robot's error. These results suggest that features in both the time domain and the frequency domain may be useful for improving the detection of ErrP using machine learning. |
An Ensemble Learning Approach for Emotion Recognition from Gait ABSTRACT. In this study, we propose an ensemble learning-based method for emotion recognition from gait. To enhance the accuracy of emotion recognition, we apply a multi-time window feature extraction technique that can extract both global and local features from long-term to medium-term and short-term. Our proposed method constructs an ensemble of classifiers that are trained with features extracted from different multi-time windows. To verify the effectiveness of the proposed method, we evaluated it using a publicly available three-dimensional gait dataset. Additionally, with the same dataset, we extensively conducted a comprehensive performance comparison of various machine learning-based methods. The simulation results show that emotions can be accurately recognized, demonstrating that the proposed method achieves state-of-the-art performance when compared with benchmark methods. |
Enhancing Reinforcement Learning in POMDPs with the Reservoir Soft Actor-Critic Model Incorporating Multi-layer Structures PRESENTER: Tatsuro Nagai ABSTRACT. This study proposes a novel approach combining the off-policy reinforcement learning model with reservoir computing (RC) in partially observable Markov decision processes (POMDPs). The off-policy reinforcement learning model has the problem of increased learning costs when it is combined with state estimation algorithms. To address this problem, we propose a model combining RC with the soft actor-critic (SAC) off-policy reinforcement learning model to improve learning efficiency. In the experiment, we explore the effectiveness of the proposed model with multi-layer structures in the output layer of RC. We compare the proposed model with a variant without multi-layer structures to investigate their necessity. Additionally, we evaluate the proposed model based on the rewards obtained in the Markov decision process (MDP) and POMDPs. The experimental results show that the multi-layer structure in the output layer of RC improves performance and effectively facilitates learning in POMDPs. Furthermore, the proposed model achieved rewards comparable to feedforward models in the MDP and obtained more than feedforward models in POMDPs. Moreover, the proposed model could realize smaller-scale models than feedforward models. In conclusion, the proposed model could improve performance using multi-layer structures. The proposed model is expected to enable efficient and effective application in real-world environments imitated by POMDPs. |
A memory-efficient streaming architecture towards online learning for physical reservoir computing PRESENTER: Kota Tamada ABSTRACT. Due to the recent advancements of IoT, significant amounts of data can now be collected from devices. However, transmitting this data as is poses certain risks, including the possibility of data leaks. Consequently, it has become crucial to implement small-scale AI to process the data on the device prior to transmission. Reservoir computing is a viable method of implementation, although its computational complexity and memory usage leave room for improvement. The ensemble Kalman filter could address these issues. In this study, we have devised a streaming hardware configuration for the ensemble Kalman filter, which is one of several online learning methodologies for reservoir computing. Our investigation centred around identifying the stream processing capabilities that the architecture of the ensemble Kalman filter could accommodate, and consequently configuring the hardware architecture. We conducted simulations to measure the anticipated efficacy of the new architecture, including its expected learning accuracy, its effect on computing speed, and its capacity for optimising memory use. The study demonstrates that the streaming hardware architecture of the ensemble Kalman filter has achieved additional acceleration and memory preservation in reservoir computing. |
Out-of-distribution data detection applying predictive coding networks and their variational free energy PRESENTER: Takafumi Kunimi ABSTRACT. The free energy principle (FEP), which takes a macroscopic view of brain function and states that the brain functions in such a way as to minimize variational free energy (VFE), has been proposed and is being studied in various fields including the field of artificial intelligence (AI). In this study, we constructed a predictive coding network (PCN) based on the FEP and evaluated its out-of-distribution (OOD) detection performance. Through an OOD detection task using VFE as an evaluation index, we verified whether PCNs can successfully mimic the computational mechanism of the brain proposed in FEP and whether they can be a prototype for brain-like AI, and suggested their potential. |
Sparsity-centric Reservoir Computing Architecture PRESENTER: Yuki Abe ABSTRACT. Power efficiency, hardware resources, and privacy: these problems with cloud-based AI have been a concern for a decade. An edge-friendly, light, and simple ML framework is demanded. Many works attempt to implement reservoir computing on FPGAs as an edge AI system. However, most of the existing works cannot make full use of its simple algorithm, and their implementations are not so different from other RNN prediction accelerators. Against this background, we propose a novel architecture of a prediction accelerator for reservoir computing, which takes full advantage of the lightweight feature of the framework. This work proposes a novel architecture with (1) memory and hardware resource reduction by log-quantized internal weights and an LFSR-based weight generator, and (2) throughput improvement by zero-skipping and parallel computation mechanisms. By implementing our proposal, we designed a prediction accelerator for reservoir computing that operates at 450K fps prediction within <3000 LEs. |
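The two memory-saving ideas named in this abstract, generating reservoir weights from an LFSR instead of storing them and quantizing weights to powers of two, can be sketched in software as below. This Python sketch is not the authors' hardware design: the 16-bit register width, the tap positions, the seed, the mapping from register state to a weight, and all function names are assumptions chosen for illustration.

```python
import math

def lfsr16_step(state):
    # One step of a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, 11
    # (a commonly cited tap set; the paper's actual LFSR is not specified here).
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def log_quantize(w):
    # Round |w| to the nearest power of two in the log domain, keeping the sign.
    # A power-of-two weight turns a multiplication into a bit shift in hardware.
    if w == 0:
        return 0.0
    exponent = round(math.log2(abs(w)))
    return math.copysign(2.0 ** exponent, w)

def generate_weights(seed, count):
    # Regenerate pseudo-random weights on the fly from a seed instead of storing them.
    state, weights = seed, []
    for _ in range(count):
        state = lfsr16_step(state)
        w = state / 32768.0 - 1.0  # map the 16-bit state to the range (-1, 1)
        weights.append(log_quantize(w))
    return weights

print(generate_weights(0xACE1, 8))
```

Because the same seed always reproduces the same weight sequence, only the seed needs to be kept, and zero or near-zero weights are natural candidates for a zero-skipping mechanism.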
Smart Factory Integration in Small-Scale Industries: An AIoT Approach for Production Efficiency PRESENTER: Ki Hwan Kim ABSTRACT. This paper explores the growing need for intelligent production technology in small-scale manufacturing sectors, facing challenges like diverse product demands, an aging workforce, and skilled labor shortages. To address these issues, an edge-based AIoT platform was developed, facilitating 24-hour manned and unmanned production management, thereby simplifying the integration of smart factories for small manufacturers. The study introduces an AI model tailored for a food manufacturing plant, which effectively predicts the need for replacing consumables and detects abnormal conditions. The model, utilizing a comprehensive dataset from a 4-axis motor with over 4 million data points and 22 variables, achieved a notable accuracy of 0.8226 and an F1 score of 0.8243, using the Light Gradient Boosting Machine (LightGBM) approach. These metrics underscore the model's capability in enhancing operational efficiency and reducing production downtime. This research signifies a key advancement in intelligent manufacturing, offering practical solutions to the challenges faced by the industry. |
Ultrafast Channel Allocation by a Parallel Laser Chaos Decision-Maker for Downlink NOMA Systems PRESENTER: Masaki Sugiyama ABSTRACT. Non-orthogonal multiple access (NOMA) is a technique that enables multiple users to share the same channel through appropriate power and channel allocation at the transmitter, with the receiver decoding the multiplexed signal. Fast and effective channel allocation in NOMA is crucial for maintaining optimal allocation in a dynamic environment. A prior study has demonstrated the Laser Chaos Decision-Maker's capability to solve the multi-armed bandit problem effectively and at ultrafast speed. However, as the number of users grows, the complexity of channel allocation grows rapidly, posing a challenge for achieving suitable channel allocation using the single Laser Chaos Decision-Maker method from previous studies. To overcome this challenge, we propose a parallel Laser Chaos Decision-Maker that decides the channel for each user effectively and at ultrafast speed, even with an increasing number of users. The system performance is evaluated by comparison with conventional methods in simulation. |
A Study on the Prediction of Worker paths to Prevent Accidents PRESENTER: Byeongmin Lee ABSTRACT. Although manufacturing and industrial sites have applied advanced technologies such as the Internet of Things and artificial intelligence to increase efficiency, safety accident prevention is still an important challenge. In this paper, we studied the prediction of a worker’s walking path to prevent unexpected accidents caused by machine-human interaction with the introduction of advanced technologies. |
Performance Evaluation of Resource Allocation Optimization in UAV Networks Using Ising Machines PRESENTER: Tsukumo Fujita ABSTRACT. Unmanned Aerial Vehicles (UAVs) are widely deployed in wireless communications to enhance the network service experience of mobile users on the ground. This article investigates a dynamic resource allocation problem in UAV-aided wireless networks, where multiple UAVs function as aerial base stations to provide services to ground users. A high-speed UAV-user association and resource allocation method based on the Ising model is proposed to improve communication efficiency for all users in the presence of co-channel interference. Performance evaluation results demonstrate that the network performance can be enhanced using a high-performance Ising machine capable of efficiently solving Ising problems. |
Throughput Evaluation of Multi-cell Power-domain GF-NOMA for mMTC PRESENTER: Rei Oda ABSTRACT. This paper evaluates the throughput of grant-free power-domain non-orthogonal multiple access (GF-NOMA) in multi-cell environments for massive machine-type communications (mMTC). 5G/6G networks need to accommodate a massive number of devices, such as sensors, to realize smart industries in the context of mMTC. Such mMTC requires uplink grant-free (GF) access, like an ALOHA protocol, to reduce signaling overhead. A promising technology for improving the throughput of GF access is power-domain channel-inverted non-orthogonal multiple access (NOMA) with GF access, i.e., GF-NOMA. In GF-NOMA, each device selects a pre-designed power level and calculates its transmission power from the target received power associated with the selected level and its channel to a base station (BS). Thanks to the pre-designed power levels, a packet that exclusively occupies a level is successfully decoded, so increasing the number of power levels can increase the throughput in single-cell environments. In realistic environments, each device in a cell causes interference (i.e., inter-cell interference) at BSs in other cells; in such multi-cell environments, increasing the number of power levels increases inter-cell interference, and as a result, multi-cell GF-NOMA is expected to experience lower throughput than single-cell GF-NOMA. However, the throughput characteristics of multi-cell GF-NOMA have not been quantified in related works, which focus only on the performance of single-cell GF-NOMA. In this paper, we model the inter-cell interference in multi-cell GF-NOMA and analyze the characteristics of the throughput, focusing on the relationship between the number of power levels and inter-cell interference. The key approach in this paper is to analyze the impacts of inter-cell interference and intra-cell interference on the throughput. 
Our simulation results show that multi-cell GF-NOMA achieved its maximum throughput at nine power levels due to inter-cell interference. Moreover, multi-cell GF-NOMA provided 62% lower throughput than single-cell GF-NOMA. |
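The power-level mechanism described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' simulation model: the function names are hypothetical, powers are in watts, the channel gain is a linear power gain, and the decoding rule considers only whether a level is chosen by exactly one device, ignoring noise and inter-cell interference.

```python
from collections import Counter


def transmit_power(target_rx_power_w, channel_gain):
    """Channel inversion: pick P_tx so that P_tx * channel_gain equals the target received power."""
    if channel_gain <= 0:
        raise ValueError("channel_gain must be positive")
    return target_rx_power_w / channel_gain


def decodable_levels(chosen_levels):
    """Return the power levels chosen by exactly one device.

    Simplified success rule: a packet is decoded only if it exclusively
    occupies its level. Interference and noise are ignored here.
    """
    counts = Counter(chosen_levels)
    return sorted(level for level, count in counts.items() if count == 1)
```

For example, if four devices pick levels `[0, 1, 1, 2]`, only the packets at levels 0 and 2 are decodable under this rule, since level 1 is shared by two devices.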
Exploring a Model of Interaction Between Structural Change in Online Social Networks and User Dynamics PRESENTER: Ryusei Yamamoto ABSTRACT. Online Social Networks (OSNs) have facilitated rapid information exchange and efficient community building among users worldwide. However, they have also contributed to issues such as the spread of fake news and misinformation, negatively impacting real-world scenarios. Understanding user dynamics in OSNs is crucial for addressing these problems. User dynamics are closely linked to the network structure. Previous analyses have focused on how changes in network structure influence user dynamics. However, in real OSNs, this influence is bidirectional. We consider that user dynamics can also instigate changes in network structure. This study analyzes a model where user dynamics and structural changes in the network mutually influence each other. We demonstrate that the effects of this mutual influence vary based on the nature of the structural changes. Our findings offer new insights into the complex interplay between user behavior and network evolution in OSNs. |
Digital Twin Smart City: Synthesizing IFC and CityGML for Enhanced 3D City Model Visualization PRESENTER: Lam Phuoc-Dat ABSTRACT. The burgeoning interest in interacting with building data management systems, particularly Building Information Models (BIM), has instigated notable improvements in various facets of management, material supply chain analysis, documentation, and archiving. The critical objective of achieving robust integration between BIM and Geographical Information Systems (GIS) underscores the necessity for seamless interoperability. Despite existing methodologies for integration, involving labor-intensive manual scrutiny of intricate model files, proving time-consuming and exhibiting limited generalization success, this study proposes a data transformation method based on contextual mapping between the Industry Foundation Class (IFC) and City Geometry Markup Language (CityGML) domains. To enhance BIM data interoperability, we conducted an analysis of the conversion method between IFC and CityGML. Subsequently, an experiment was conducted, utilizing streaming BIM data with the Cesium Ion web-based service and Cesium for Unreal Engine. |
Design and validation of an ontology of OGC Sensorthings structures for relational queries PRESENTER: Bonhyeon Gu ABSTRACT. The advancement of software and hardware, along with the spread of the internet, has made the realization of the Semantic Web possible, with DBpedia being a notable example. The OGC Sensorthings API is an international standard that deals with a predefined, open-structured format in JSON. This standard is mainly implemented in a structure where databases and API servers are connected, representing the typical API services we are familiar with. The research aims to explore and design whether this structure can be represented through a complex, extendable ontology, addressing complex queries and possible extensions. The open standard OGC Sensorthings API, which handles sensor information in JSON format, takes the form of a typical API and requires additional implementation to handle complex queries. In this study, we design this structure as an ontology to solve the described problems, and explore how to represent UML within OGC Sensorthings API as Classes and Properties before conducting experiments on specific queries. |
Design and Implementation of AI Perfumer Service System Using ChatGPT ABSTRACT. When choosing a perfume, many people worry about what types of scents there are and what kind of scent suits them. In this paper, to alleviate these concerns, we designed and implemented an AI (Artificial Intelligence) perfume service system using ChatGPT. The system analyzes the user's image and climate environment, then either creates a perfume suited to the current user or recommends similar products sold on the market, allowing users to create or purchase personalized perfumes that suit their tastes. |
Improvement of Image Transformation Accuracy in CycleGAN Applying Data Augmentation PRESENTER: Shuhei Kanzaki ABSTRACT. Deep Learning uses multiple layers of neural networks to learn the features of data in depth step by step. This technology has been actively studied not only in the field of data recognition such as image recognition and speech recognition, but also in the field of data generation and transformation. Recently, Generative Adversarial Networks (GANs) have attracted much attention as a generative model. A GAN consists of two networks, the Generator and the Discriminator, where the Generator generates data and the Discriminator determines whether the input data is generated by the Generator or not. The data generated by GAN can be used to supplement the data needed in the machine learning field. A derivative model of GAN is CycleGAN, which can transform different images into each other. CycleGAN is composed of two networks of GAN, i.e., two each of Generator and Discriminator. Data Augmentation (DA) is one of the methods to compensate for the lack of data. DA is a technique for processing images without losing the original meaning of the images. It has been shown that applying DA to GAN improves the performance of GAN. In this study, we propose a data augmentation method by applying Data Augmentation to CycleGAN and analyze its learning performance. |
Haptic texture recognition system based on characteristics of object surface material using convolutional recurrent neural network PRESENTER: Hyoung-Gook Kim ABSTRACT. This paper proposes a Convolutional Recurrent Neural Network-based texture recognition system using haptic vibration signals that represent the texture characteristics of various object surfaces. The proposed system combines three-axis vibration acceleration signals collected during contact with object surfaces using tools such as stylus pens into one-dimensional acceleration data and converts them into log Mel-Spectrograms. The converted data is applied as input to a convolutional bidirectional gate recurrent unit model to recognize the object surface texture. The Lehrstuhl für Medientechnik (LMT) haptic texture dataset was applied to evaluate the texture recognition accuracy performance of the proposed method. The results of the experiment conducted with 10-fold cross-validation showed improved performance compared to the existing 2D CNN and BGRU methods. |
A Study on the Color Extraction Method of Minhwa Objects ABSTRACT. This study proposes a methodology for color extraction of Minhwa Objects. This study focuses on the creation of traditional Minhwa and explores the process of effectively recognizing and classifying objects in Minhwa. To do this, we propose a combination of the Object Detection model and the Color Classification Sequential model to accurately classify the colors of objects detected in Minhwa. Through this methodology, we want to overcome the limitations of color recognition models for objects that are difficult to distinguish clearly. |
15:50 | Self-supervised 3D Face Model Learning for Monocular Image PRESENTER: U-Chae Jun ABSTRACT. Estimating a 3D facial model from 2D images has been a long-standing area of research in the field of computer vision. Especially, reconstructing a facial model using monocular images is challenging due to inherent physical constraints. Consequently, previous models struggled to generalize across various variations, such as skin and lighting. In this paper, we propose a new facial reconstruction model that overcomes these limitations by combining a 3D Morphable Model with a learned correction space. By combining them to enhance generalization and regularization, our method overcomes the limitations of existing approaches that rely on prior information. Our method demonstrates high-quality monocular reconstruction even for challenging unseen faces. |
16:05 | A Bilinear Face Model for Real-time Performance-Based Applications PRESENTER: JaeEun Ko ABSTRACT. 3D human face models have been widely used in many applications, such as face recognition, facial expression recognition, and facial expression tracking. All of the previous face models for these applications contain faces with different identities and expressions. However, their expression spaces are not diverse enough to be used in applications requiring various expressions other than those included in the model, such as real-time performance-based facial image animation. Therefore, this paper proposes a face model to reproduce various captured expressional faces. We demonstrate the potential of our face model with an application of real-time performance-based facial image animation. |
16:20 | Performance Comparison of Recurrent Neural Network Models for Solar Power Generation Prediction PRESENTER: Hajin Noh ABSTRACT. With environmental issues taking center stage, the adoption of renewable energy is accelerating. However, renewable energy, obtained from nature, faces the limitation of varying power generation levels depending on weather conditions, making stable generation challenging. In this paper, we compare the performance of recurrent neural network models used for time series data forecasting to predict irregular solar power generation, and analyze the differences. |
15:50 | Using Graph Neural Networks for Identifying Influencers in Social Networks PRESENTER: Sho Tsugawa ABSTRACT. Identifying influencers in a given social network is important for several applications such as viral marketing and prevention of the spread of unwanted information. This paper tackles the influencer identification problem by using graph neural networks (GNNs). While GNNs have been shown to be effective for several graph-based learning tasks, the effectiveness of GNNs in influencer identification remains unexplored. In this paper, we discuss experiments that were conducted to test whether influencers in a social network could be identified by using a GNN. The results show that the accuracy of a GNN-based model for identifying influencers is higher than baseline methods in Facebook and co-authorship networks, but lower in a Twitter network. |
16:05 | Efficient Multi-Sensor Fusion for Depth Completion with RGB, NIR, and LiDAR Sensors PRESENTER: Janghyun Kim ABSTRACT. Most existing depth completion methods rely on a combination of RGB cameras and LiDAR sensors. However, they often fail to generate accurate depth maps in dark areas due to low-light conditions. To address this issue, we propose a depth completion method that leverages a Near-Infrared (NIR) camera to maintain accuracy even in low-light scenarios. Furthermore, existing methods suffer from slow inference times, making them difficult to adapt to real-world scenarios such as augmented reality and autonomous driving. Our method applies a simple fusion scheme to each modality to overcome these limitations. As a result, our approach preserves each sensor's properties within an efficient structure and achieves higher performance than using only RGB cameras and LiDARs. |
16:20 | Integration of Attention Mechanism in Reservoir Computing for Time Series Forecasting PRESENTER: Felix Köster ABSTRACT. Photonic reservoir computing has showcased considerable potential for forecasting chaotic time series. Nevertheless, the prediction of chaotic systems is a challenging issue. In our study, we enhance the performance of chaotic time-series prediction by integrating an attention layer into the reservoir computer's output. This mechanism is adept at giving precedence to certain temporal features, with a clear objective to improve the forecasting accuracy. Our findings illuminate a noticeable enhancement in performance capabilities when dealing with smaller reservoir sizes. We validate the efficacy of our attention-enhanced reservoir using two examples of time series: a unidirectionally coupled dual Lorenz System and a composite system of Lorenz and Rössler dynamics. Our scheme possesses the potential for applications in the real world, especially those that demand high accuracy in chaotic time-series prediction. |
16:35 | Natural Neighbor Interpolation based Positioning Error and Deviation Mapping for Enhancing UWB Indoor Positioning PRESENTER: Hyun-Gi An ABSTRACT. Accurate positioning information of tracked entities such as humans and objects is crucial in various industries including education, healthcare, construction, logistics, and security. Recently, Ultra-Wideband (UWB) technology, which provides centimeter-level accuracy in indoor Real-Time Location Systems (RTLS), has gained widespread adoption. However, the accuracy of positioning varies due to factors such as frequency interference, collisions, reflected waves, and Line of Sight (LOS) establishment. In certain environments, high positioning accuracy cannot be guaranteed. Therefore, this study proposes a method for generating positioning error and deviation maps using the natural neighbor interpolation technique, which can be utilized to correct inaccurate positioning in a given environment. This study introduces a technique to sample positioning errors and deviations within a positioning space, interpolate empty spaces using natural neighbor interpolation, and then generate a positioning error and deviation map. These maps, based on sampled data, are expected to contribute to improving the accuracy of indoor RTLS by correcting positioning results closer to the actual locations. |
17:00 | Estimation and Analysis of Train Boarding and Alighting Passengers by Time Zone According to Environment PRESENTER: Sujeong Choi ABSTRACT. In modern society, public transportation is essential, and subway systems, less affected by road traffic conditions, attract substantial demand during commuting hours. However, due to physical limitations, overcrowding occurs, leading to inevitable delays in train operations. One solution to address this issue involves optimizing train dwell times, yet this necessitates accurate passenger boarding and alighting information. To achieve more precise optimization of train dwell times, historical data is utilized to forecast boarding and alighting passenger numbers. Hence, to consider various environments distinguishing between transfer and non-transfer stations, this paper employs three models—LSTM, GRU, and RNN—to predict boarding and alighting passenger counts, comparing their respective performances. |
17:15 | A Study on the Impact of CTGAN-Based Data Augmentation on SASRec Performance PRESENTER: Dahun Seong ABSTRACT. It is difficult to secure a sufficient amount of data for model training in recommendation systems. To overcome this limitation, data augmentation using generative models is attracting considerable research attention. Among these models, CTGAN can augment tabular data while appropriately reflecting the characteristics of the original data. Therefore, in this study, two scenarios for augmenting data are proposed, and the data is augmented using CTGAN to determine the effect of the augmented data on the performance of the recommendation system. The performance evaluation results are analyzed by applying the augmented data to SASRec, a session-based recommendation system. |
17:30 | Analysis of the Relationship on Emotional Similarity between fonts with Similar Shapes PRESENTER: Lee Ga-Eun ABSTRACT. This study analyzed the relationship between font shape similarity and emotional similarity. As a result of the study, a negative correlation was confirmed between the similarity of shape and emotional value, indicating that the shape of the font can affect the user's emotions. In addition, through ANOVA analysis of the clustering results using t-SNE, it was confirmed that differences in shape and emotion between clusters were statistically significant. Based on these results, it was confirmed that when the shapes are similar, the emotions that appear are also similar. |
17:00 | Exploring the Direction of Applying A.I to Game Contents for Sustainable Development of the Gaming Industry -Focusing on Animal Crossing PRESENTER: Junkoo Kang ABSTRACT. Game content has traditionally engaged users emotionally with a strong sense of theme. One of the key elements of game content has always been to give the player a purpose to engage with and let them figure out how to achieve it. However, with the development of the internet, games have evolved from simply beating programs to encouraging human competition. This can sometimes lead to user fatigue and hinder the scalability of the content. In order to overcome this, various attempts have been made recently, and the utilization of artificial intelligence is gaining attention as a means to overcome it. We wanted to explore this direction of utilization through Animal Crossing, which is a representative user-friendly game content. |
17:15 | Perceptual JPEG Artifact Removal using Sum of Weighted IQA as Loss Function PRESENTER: Takamichi Miyata ABSTRACT. The challenge of using Flexible Blind CNN (FBCNN) to remove compression artifacts from JPEG images with unknown quality parameters is that the restored images are overly smoothed and their perceived quality is low. This is due to the use of the L1 norm as the loss function, which has a low correlation with perceptual quality. On the other hand, using a single image quality assessment (IQA) as the loss function is known to introduce new artifacts into the image. We propose a new perceptual quality-aware JPEG artifact removal method that uses a weighted sum of multiple IQAs as the loss function. The proposed method includes a modification of the FBCNN architecture to prevent other artifacts caused by JPEG artifact removal. Experimental results show that the proposed method significantly improves the perceptual quality of JPEG artifact-removed images quantitatively and qualitatively compared to FBCNN. |
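The loss described in this abstract is a weighted sum of several quality terms. The following Python sketch shows only that combination step, under loud assumptions: the two metric functions are simple stand-ins (mean absolute error and mean squared error on flat lists of pixel values), not the perceptual IQA measures used in the paper, and the weights and function names are hypothetical.

```python
def l1_distance(restored, reference):
    """Mean absolute error between two equally sized flat pixel lists (stand-in metric)."""
    return sum(abs(a - b) for a, b in zip(restored, reference)) / len(reference)


def mse(restored, reference):
    """Mean squared error between two equally sized flat pixel lists (stand-in metric)."""
    return sum((a - b) ** 2 for a, b in zip(restored, reference)) / len(reference)


def weighted_loss(restored, reference, terms):
    """Combine several quality terms into one scalar loss.

    terms is a list of (weight, metric) pairs, where each metric takes
    (restored, reference) and returns a value to be minimized.
    """
    return sum(weight * metric(restored, reference) for weight, metric in terms)
```

A call such as `weighted_loss(restored, reference, [(0.5, l1_distance), (2.0, mse)])` returns the weighted sum of the two terms; in practice each stand-in would be replaced by a differentiable IQA measure in a deep learning framework.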
17:30 | Data Screening based on Image Meta-Information for Emotion Recognition Model PRESENTER: Hyeongju Moon ABSTRACT. Due to the impact of the pandemic and the development of ICT, the use of non-face-to-face or unmanned systems is expanding. As communication in these non-face-to-face situations expands, recognizing expressions from facial images plays a very important role in communication. Because emotions are perceived differently depending on each individual's interpretation, research is being conducted to solve various problems related to the recognition of emotions in facial expressions. However, training a deep learning model on a large image data set is subject to limitations in various experimental environments. Therefore, in this paper, we propose an emotion recognition model that screens data by age and gender based on image meta-information and then trains on the screened data through deep learning. The proposed method utilizes complex image data for Korean emotion recognition provided by AI-Hub. Based on image meta-information, age is classified into groups from the 10s to the 50s and gender is classified as male or female. Emotions are recognized by screening the classified data and building a model for each group through image preprocessing and face detection. As a result of training with the EfficientNet, VGG, and ResNet models, performance on all seven emotions improved for the models trained on the screened data compared to the model trained on the entire data set, and accuracy improved by 5 to 8%. |