Technical Session No 1, Q&A (Presentations 1-6)
| 16:10 | Color Key-Phrases Data-Mined from Twitter Analyzed Using Word-Embedding in Language Models ABSTRACT. Color is an important part of society and culture. The use of colors can be found in almost every aspect of life, from biological indicators to cultural components and product design, amongst others. Research has been conducted on the implications of color uses and color naming. This paper uses Twitter data mined for 11 color key phrases and pre-processed using natural language processing. A separate language model is trained for each color term using a simple neural network. The resulting word embeddings are used to compare and analyze similarity scores between color terms within each language model. Results are visualized and discussed. The findings show that similarities between color key phrases differ based on the underlying language model. While the color terms blue and yellow have the highest cross-similarities, brown, pink, and gray show divergent similarity scores. | 
| 16:20 | Automatic glare region detection using deep-learning-based semantic segmentation PRESENTER: Yuma Takei ABSTRACT. Recently, with the advancement of deep learning technology, there has been a lot of research on automated driving. In automated driving systems, tasks such as detecting obstacles and estimating the possible driving path rely heavily on vision. However, high-luminance light sources such as the sun in the daytime, oncoming traffic at night, and streetlights reduce the visibility of surrounding objects. Therefore, preventing glare is an important issue in automated driving technology. The glare that can be expected when driving includes glare from light sources that emit strong light, such as the sun and the headlights of oncoming vehicles, and glare from reflections off wet roads and mirrors. In this research, we propose an algorithm to automatically detect those glare regions in an image. Specifically, our approach combines a semantic segmentation method based on deep learning and a method based on conventional image processing. The results of applying the proposed method to our large dataset of night-time videos demonstrate that our approach offers higher performance than existing techniques. | 
| 16:30 | IMPACT OF PHASE DETECTION ON ACOUSTIC MEASUREMENTS ACCURACY PRESENTER: Maria Kostina ABSTRACT. The present paper describes the application of dual-frequency probing of a control object, with subsequent correction of the received signal phase, to determine the geometric dimensions of the control object. The accuracy of the dual-frequency method was reported in the authors' previous works. Phase correction significantly increases the accuracy of acoustic control (by more than a factor of 2). Recognizing, for each of the signals, the number of the period in which detection took place eliminates the errors associated with a significant change in the shape of signals of different frequencies, which will extend the range of probing frequencies. | 
| 16:40 | A SUPPORT SYSTEM FOR STIMULATING DISCUSSIONS IN COMBINATION WITH FACE-TO-FACE AND REMOTE ONLINE MEETINGS PRESENTER: Ryota Murai ABSTRACT. Because of the coronavirus pandemic of 2020, remote work using online meeting systems is becoming popular in many organizations. Although face-to-face meetings are expected to gradually return once the pandemic is contained by vaccines, we think that there will be opportunities to use online meeting systems for meetings that combine face-to-face and remote participation. However, online meetings have problems, such as difficulty in sharing awareness between remote and face-to-face participants. Many studies have been conducted to solve such problems; however, most of the existing studies have been conducted for situations where discussions are active, and few have assumed situations where discussions are passive. In this study, we developed and verified a system that shares the thoughts of participants in order to stimulate discussions even in a passive meeting. The system visualizes the participants' degree of understanding with graphs in real time. As a result, the number of utterances by participants increased when the proposed system was used. In addition, there were many conversations in which the face-to-face participants asked the remote participants for their opinions. This suggests that the proposed system can promote speech when discussions are passive. | 
| 16:50 | RECOMMENDING RECIPE FROM FOOD IMAGE BASED ON CNN AND TRANSFORMER SELF-ATTENTION MODEL ABSTRACT. Food recommender systems have attracted growing research interest, as they can not only help people find recipes that they might want to try but also encourage them to raise their self-awareness about having a healthier eating lifestyle. This paper presents research on a recipe recommender system that predicts the ingredients from input food images and, from there, suggests generated recipes to users. The system is a model consisting of a 50-layer convolutional neural network (CNN) called residual network 50 (ResNet-50) and a transformer model with an attention mechanism. The experiment is conducted on a preprocessed dataset of over 250,000 recipes with food images. The system predicts the ingredients in pictures and provides appropriate recipes based on them. This output provides a foundation for improving the system in several directions in the future. | 
| 17:00 | Voronoi-based Flood Area Detection Using Multi-agent Systems PRESENTER: Shaiful Nizam ABSTRACT. In the presented study, an algorithm was developed to support flood detection for connected and disconnected bodies of water. The algorithm aims to distribute drones equally across the flood areas in a stochastic environment. The algorithm is implemented using Voronoi-based approaches and image segmentation. The goal is to provide information regarding flood area distribution to a flood caging system. The Voronoi-based approaches use a multi-modal Gaussian distribution to generate centroids around the flood area. These centroids attract other drones to gather around the detected flood area. Image segmentation assists the Voronoi algorithm by detecting a flood area from the color difference between the ground and the water body. Several scenarios, such as one connected water body and three disconnected water bodies, are simulated and investigated with the algorithm throughout this study. The experiments and results are briefly discussed in this paper. Finally, some conclusions are drawn at the paper’s end. | 
Technical Session No 2, Q&A (Presentations 7-12)
| 17:20 | A method of driving video enhancement via CycleGAN for improving automated object detection PRESENTER: Takumi Fujii ABSTRACT. Over the last decade, the development of autonomous driving technology and driving support systems has seen immense growth, particularly for the goal of ensuring safety. However, at night, the detection accuracy for the situation outside the vehicle is lower than during the daytime due to darkness and glare from other vehicles and light sources (e.g., signs and traffic lights). In this research, we introduce an algorithm that converts night-time road images to daytime brightness to improve detection accuracy. We use "CycleGAN", an image generation algorithm based on machine learning, to convert the night-time image into a day-time image. Specifically, CycleGAN is an unsupervised deep network that searches for an optimal solution by changing a night-time image into a day-time image, and then changing the generated day-time image back into a night-time image, whereupon the two night-time images are compared and the loss is minimized. After converting the image, we detect lanes and objects by using HSV color space information and a separate CNN. We demonstrate that the proposed method can lead to significantly improved detection accuracy for night-time images. | 
| 17:30 | Gaze Tracking Technology for Smart Environments PRESENTER: Yuto Tomita ABSTRACT. Recently, technological innovation has pushed computational devices into mainstream industries and use cases. The gaze tracking system is one of these emerging technologies. Gaze tracking can be used in a variety of situations, including assisting individuals with disabilities, preventing crime and nefarious acts, and commercial uses. However, most of the gaze tracking systems currently deployed need special devices such as three-dimensional depth-sensing cameras or computers with high processing capabilities. In this research, we discuss how to implement a gaze tracking system on mobile devices and its application to daily activities. The Mediapipe framework for multi-platform gaze tracking was used to create the system. A gaze-based remote control system for electrical devices, a concentration analysis tool for observing drivers, and a study helper were developed as possible use cases for this type of system. To control electrical devices, a smart controller for smart home applications named Nature Remo was used. To detect whether users are concentrating, a support vector machine classifier was used. The applications developed in this paper do not need any special devices for deployment except for mobile devices. Using this technology, it will be simpler and more cost-effective to implement gaze tracking. This provides a robust platform for gaze tracking application and technology development in the future. | 
| 17:40 | CARE TRANSITION RECORDS: A SOLUTION APPROACH TOWARDS SEAMLESS DIGITAL PROCESSING PRESENTER: Elisabeth Veronica Mess ABSTRACT. Medical and nursing facilities are under considerable pressure because the number of patients is increasing and nursing staff are scarce. Digitizing administrative care processes could help save time that can be used for personal care. In this paper, the Care Transition Record (CTR) and Care Data Transition Process (CDTP), as part of the German Discharge and Transition Management (GDTM), are studied. The main goal is to digitize the CTR and to optimize and streamline the CDTP. Six field observations (Augsburg University Hospital) and three contextual interviews (one care facility) were conducted to analyze the initial situation. The results were modeled as two BPMN models. While performing the observations, it was found that the duration of the CDTP varied significantly, depending on the facility. Furthermore, the CTR never arrived at the facility before the patient. Additionally, nurses were frequently interrupted while manually transferring nursing-relevant data (e.g., CTR) into the in-house system. These problems lead to increased administrative as well as managerial effort for the nursing staff. A possible digital solution to address these issues is proposed and presented in the form of a system context diagram. | 
| 17:50 | Traffic Anomaly Detection Based on Vehicle's Orientation PRESENTER: Annisa Dea Rachmantya ABSTRACT. In a traffic analysis system, it is important to detect anomalous events in order to avoid accidents. Automatic anomaly detection can help human operators detect anomalous events. This paper proposes an approach for detecting anomalous events in traffic situations based on vehicle direction, by analyzing video gathered from a traffic camera. Vehicle orientation is detected using sparse optical flow. To improve on the sparse optical flow method, background subtraction is applied as a preprocessing step. Using the orientation of each feature point, the One-Class Support Vector Machine method is applied to build a model of normal behavior and classify feature points. The proposed method is able to detect recorded anomalous events such as illegal U-turns by vehicles. | 
| 18:00 | APPLICATION OF DEEP LEARNING TECHNIQUES TO PROBLEMS OF STATISTICAL PHYSICS PRESENTER: Vitalii Kapitan ABSTRACT. Nowadays, methods and techniques of Machine Learning and Deep Learning are being used in various scientific areas. They help to automate calculations without loss of quality. In this paper, the application of a convolutional neural network (CNN) is considered in the context of problems from statistical physics and computer simulation of magnetic films. In the first task, a CNN was used to determine the critical Curie point for the Ising model on a 2D square lattice. The obtained results were compared with classical Monte Carlo methods and the exact solution. Systems of various lattice sizes and the influence of the size effect on the accuracy of the results were considered. The authors also considered the classical two-dimensional Heisenberg model, a spin system with direct short-range exchange, and studied its competition with the Dzyaloshinskii-Moriya interaction. A neural network was applied to the recognition of the Spiral (Sp), Spiral-skyrmion (SpSk), Skyrmion (Sk), Skyrmion-ferromagnetic (SkF), and Ferromagnetic (FM) phases of the Heisenberg spin system with magnetic skyrmions. The advantage of applying a CNN over conventional methods for determining skyrmion phases was revealed. | 
| 18:10 | Image super-resolution via a combination of DCT coefficient estimation and SRGAN PRESENTER: Yushi Yamanaka ABSTRACT. Super-resolution is a technology that improves resolution and image quality by increasing the size (number of pixels) of an image. There are two types of super-resolution techniques: multi-frame super-resolution using multiple images or videos, and single-frame super-resolution using a single input image. Single-frame super-resolution methods include example-based methods and deep-learning-based methods. The latter approach has shown particularly impressive results due to the simplicity of generating training data. However, although deep-learning-based super-resolution is highly effective for edge regions, it performs less effectively for textured regions. Therefore, in this paper, we discuss our preliminary work on performing single-frame super-resolution by using a combination of DCT coefficient generation and deep learning. DCT-based super-resolution is used to improve the textured areas by synthesizing high-frequency coefficients based on noise patterns. As we demonstrate, combining deep learning with this DCT-based technique can potentially yield good performance on both edges and textures, although significant practical limitations remain. | 
Technical Session No 3, Q&A (Presentations 13-17)
| 18:30 | Detection and Classification of Tactile Paving for the Visually Impaired via Deep Learning PRESENTER: Kaito Nagao ABSTRACT. Braille blocks are brightly colored concrete pavers with specific tactile patterns that are installed throughout Japan in indoor and outdoor public facilities to provide crucial navigation support for the visually impaired. Those with weak vision, or partial or complete blindness, make decisions on where to move next by feeling the patterns on the blocks using a walking stick and/or the soles of their feet. In this research, we propose a smart walking stick system that uses vision, lidar, and a deep-learning-based method to support this navigation. We believe that this smart walking stick can provide the visually impaired with a much safer and much more convenient lifestyle than is currently available using a traditional walking stick. Here, we present the camera-based part of the system, which uses a semantic segmentation approach to identify the braille blocks' locations (segmentation) and types (classification) from each frame of input video. As we demonstrate, our approach can provide information about distant upcoming blocks, thus supporting better-informed decisions about where to walk next. We also demonstrate that the proposed method is more accurate and more robust to environmental variations than competing methods. | 
| 18:40 | Mathematical optimization in building images of inspected objects using ellipse properties PRESENTER: Maria Kostina ABSTRACT. The article presents a method that reduces the amount of data transmitted from the receiving and preprocessing block to a personal computer, as well as the amount of digitized information and the time needed to process it. The method is based on properties of the ellipse. The developed data processing algorithm for a system with a multielement sensor was tested in MATLAB. A block diagram and a data processing algorithm for a practical FPGA implementation have been developed. The amount of digitized information has been reduced by a factor of more than 10. | 
| 18:50 | A System To Support Planning of Donor Blood Delivery To Hospitals in Indonesia PRESENTER: Michael Evan Santoso ABSTRACT. Donor blood is an important resource that should be available in hospitals at any time. Consequently, blood products are delivered regularly from blood banks to medical institutions such as hospitals and clinics. Commonly, blood bags are delivered from one bank to several medical institutions. As blood products are perishable, delivery time is crucial to guarantee that no spoilage of products occurs during delivery. Traffic conditions, however, can be highly unpredictable, especially during rush hours, and could severely impact the blood product delivery schedule. Hence, delivery planning is essential to ensure efficient blood distribution among hospitals in Indonesia. This study thus aims to improve time efficiency in blood bag delivery. A Genetic Algorithm is deployed for route finding, and the XGBoost method is used to predict travel time. Moreover, a parametric bootstrap is applied to estimate the range of delivery times. Experiments were conducted with data from Banten Province, Indonesia, and the results suggest that the proposed system provides efficient scheduling by generating optimal delivery routes based on the planned start time and hospital locations. | 
| 19:00 | Robust Adaptive Multi-Agent Coverage Control For a Dynamically Changing Area PRESENTER: Keerati Fungtammasan ABSTRACT. This paper presents an adaptive multi-agent control approach for the coverage of a dynamic region in the presence of time-varying uncertainties. The Centroidal Voronoi Tessellation (CVT) is applied to generate an optimal coverage configuration of a multi-agent system for the dynamic region. We propose an adaptive controller based on function approximation and deep reinforcement learning techniques to drive the system to the generated optimal configuration. The validity of the proposed controller is tested in simulations. | 
| 19:10 | Sign Language Recognition by Machine Learning Using Multimodal Wearable Sensors and RGB Imagery Data PRESENTER: Ignasius Ian Savio Gunawan ABSTRACT. Sign language is a type of communication used by those who are deaf. Their means of communication include facial expressions, hand gestures, and arm movements. Previous research utilized camera sensors to capture images of people reenacting sign language gestures, glove-based sensors to record hand motions, or armband sensors to detect the changes in Surface Electromyography (sEMG) for certain sign language gestures. This paper focuses on evaluating the multi-modal use of sEMG and Inertial Measurement Unit (IMU) sensors in machine-learning-based American Sign Language (ASL) recognition systems. Preliminary experiments demonstrated that arm movements are recognized with high accuracy. Future work should include more ASL movement patterns collected from many subjects. In addition, a promising method of real-time alphabetical ASL recognition based on the skeletonization of RGB images of hand gestures was investigated. |