Program for Thursday, March 14th

09:00-10:20 Session 8A: Virtual, Augmented, and Mixed Reality (3)

09:00	Kenji Mizutani, Ryuto Masuda and Kouu You Combination of environmental coefficients and pedestrian behavior toward autonomous vehicles. ABSTRACT. Pedestrian responses to autonomous vehicles were examined when multiple environmental coefficients were varied in a virtual space. A total of eight scenes were observed using two settings for each of three environmental variables: scene background, lighting conditions, and vehicle type. The results of varying each environmental variable one at a time were consistent with the results of previous studies, indicating that the proposed application works correctly. When multiple environmental variables were combined, some combinations produced pedestrian behavior that did not appear when only one environmental coefficient was varied. Scene background and lighting conditions influenced pedestrians most, followed by vehicle type.
09:20	Taiyo Taguchi and Tomokazu Ishikawa A Verification Experiment Whether Overlaying a CG Image Can Enhance the Illusion of Taste Induced by Food Pairing ABSTRACT. The purpose of this study is to investigate how CG visual presentation can enhance taste illusion in virtual eating. We have developed a system that overlays a CG image on food viewed through an HMD. Food pairings such as sea urchin and pudding with soy sauce, or cream cheese and ice cream with lemon, were selected using a taste sensor, and 11 subjects compared the tastes of the paired and authentic foods, both in reality and in virtual space. The results confirmed that the gustatory illusion was enhanced with CG visual presentation, suggesting a cross-modal impact on taste perception.
09:40	Shogo Nishida, Kei Kanari and Mie Sato An examination of the effects of font size on reading in VR and on a display ABSTRACT. With the spread of VR technology, there has been an increasing number of studies on the readability of text displayed in the VR space. However, differences in the effects of font size between VR and a display under the same conditions have not been fully investigated. In this study, we presented texts in VR and on a display with four different font sizes and evaluated their readability. As a result, in the VR space, the larger the font size, the higher the subjective readability rating. On the display, a font size smaller than that in the VR space was rated more highly for readability.
10:00	Mio Yamada and Takafumi Koike Analysis of Moving Single Pictures to Improve the Sense of Presence ABSTRACT. We have investigated the visual conditions that enhance the sense of presence in moving single pictures. This research is based on an IPQ survey of four moving single pictures created under a combination of two conditions: Condition 1: distant objects are less distinct, and Condition 2: the movement of objects present in a single picture is smooth. The results showed that the picture with both atmospheric perspective and smoothness of movement gave the highest sense of realism. In this study, we conduct quantitative evaluation experiments on various perceptions of realism in moving single pictures and examine the terms that improve the sense of presence.

09:00-10:20 Session 8B: Medical Image Processing

Location: ATI Room

09:00	Pragyan Shrestha, Chun Xie, Hidehiko Shishido, Yuichi Yoshii and Itaru Kitahara 2D-3D Registration Method for X-Ray Image Using 3D Reconstruction based on Deep Neural Network ABSTRACT. This paper proposes a method for registering X-ray images with their 3D CT model by estimating 3D point clouds from X-ray images and their corresponding points on the image. Many conventional methods generate a simulated X-ray image from a 3D CT model and optimize the pose by using similarity metrics between the simulated X-ray and the input X-ray image. On the other hand, deep learning approaches that predict pose information need a canonical coordinate system defined manually on the pre-operative CT to properly utilize the estimated pose. Therefore, we devise a fully automatic registration pipeline that is independent of the coordinate system by recovering 3D point clouds from X-ray images, estimating the corresponding points on the images, and aligning them with the given 3D CT model.
09:20	Thalita Munique Costa, Yoko Usami, Mai Iwaya, Yuka Takezawa, Yuika Natori, Hernan Aguirre and Kiyoshi Tanaka White Blood Cells Classification with YOLOv7: Single and Cascade Classification Approaches for Images Segmented by CellaVision DM96 ABSTRACT. In this work, we study the use of YOLOv7 in the reclassification of blood cell images, segmented by CellaVision™ DM96, into 11 classes, i.e., Band Neutrophil, Segmented Neutrophil, Basophil, Eosinophil, Erythroblast, Thrombocyte, Lymphocyte, Variant Lymphocyte, Metamyelocyte, Monocyte, and Myelocyte, in simple and cascade classification. The classification made by CellaVision™ DM96 achieved an accuracy of 76.44%, simple classification achieved an accuracy of 94.22%, and cascade classification an accuracy of 94.44% for the same database. Both methods proved effective in increasing performance and, especially in the case of cascade classification, reduced the rate of the more relevant mistakes.
09:40	Phuong Thao Nguyen and Hiroshi Watanabe A Real-time Polyp Detection Method Based on GhostAtt-YOLOv8 PRESENTER: Phuong Thao Nguyen ABSTRACT. Convolutional Neural Networks (CNNs) in medical image processing have lately received a lot of interest. Computer-aided polyp detection in gastrointestinal endoscopy has been the subject of research over the past few decades. However, despite significant advances, automatic polyp detection in real time is still an unsolved problem. In this paper, we propose a Deep Learning method for reliable real-time polyp detection on endoscopic images and videos. We improve the performance of the YOLOv8 model by modifying its architecture with Ghost Convolution and Spatial and Channel Attention mechanisms (GhostAtt-YOLOv8). These techniques are integrated into the backbone network to enhance detection results. The proposed method is applied to the Showa University and Nagoya University polyp database (SUN) dataset. Experimental results show that better performance is achieved, with an mAP@50 of 80.13%, compared to the original YOLOv8, and the FPS of our proposed model is 294, faster than the original YOLOv8.
10:00	Ayumu Kubota, Hiroshi Hanaizumi, Yoshiteru Watanabe, Shungo Murai, Ryo Tohara and Nobuhide Kawabe A Computer Aided Diagnosis System for Hallux Valgus ABSTRACT. Doctors have been required to record numerical information that clarifies the basis of their diagnosis in cases of Hallux Valgus. They therefore had to find landmarks in each patient’s X-ray image and calculate the Hallux Valgus (HV) angle. This work was very time-consuming and prevented doctors from seeing many patients, so some kind of computer-aided diagnosis system has been needed. We propose a Segment-Anything-based method for X-ray measurement of HV angles. In the method, we first recognize the 1st proximal phalanx and the 1st metatarsal bone by using Segment Anything. Then, the longitudinal directions of both bones and the difference between them are calculated by applying Principal Component Analysis. In a preliminary experiment using 14 radiographs, the HV angle obtained by the proposed method showed good agreement with independent X-ray measurements by two doctors.

10:20-10:35 Break

10:35-11:35 Keynote 3

Location: F. C. Room

11:35-13:20 Lunch

13:20-17:00 Industry Forum

Location: F. C. Room

13:20-14:00 Session 9: Multimedia

Location: ATI Room

13:20

Eri Yokoyama, Hiroshi Sunaga and Makoto J Hirayama

Creation of a Digital Literary Map of Kawakami Village, Yoshino-gun, Nara Prefecture for Regional Revitalization

ABSTRACT. Osaka Institute of Technology has had a partnership agreement with Kawakami Village in Yoshino-gun, Nara Prefecture since 2010, and students are involved in various activities to support the community. The purpose of this paper is to describe the development of a digital literary map that will contribute to the regional development of Kawakami Village. By building an electronic literary map on our website, we can also support the education of Japanese literature researchers and students, as well as provide information to those who want to learn about the region. The digital literary map is a tool that shows historical places on a map. It also gives geographical and historical overviews, shows literary texts, and provides links to the original images of the texts so that visitors can also check the text of classical books. The system was developed as a study tool for literature and history related to this village and Yoshino, providing access to classical Japanese literature and maps of historical places under the theme of "Kawakami-Village and Yoshino in Literature." The use of digital literary maps opens up opportunities for students to study the creation and transformation of images of famous places through literary works.

13:40

Shuto Kinoshita and Yasushi Yamazaki

Smartphone-based Continuous Authentication based on Flick Input Features using Japanese Free Text

ABSTRACT. With the rapid spread of smartphones, user authentication on smartphones has become essential. However, conventional user authentication methods for smartphones, such as PINs, passwords, and pattern locks, do not authenticate users continuously after the first successful authentication, so there is a risk that an authenticated smartphone might be used improperly by unauthorized individuals. To address this, we proposed a new continuous authentication method based on flick input features using Japanese free text during normal smartphone usage and verified its effectiveness.

14:00-14:15 Break

14:35-15:35 Session 10: Image Recognition & Detection (3)

Location: ATI Room

14:35	Taiyo Nakagawa and Tomoko Ozeki Jewelry Image-to-Image Translation with Consistency Regularization and Data Augmentations ABSTRACT. Image enhancement of jewelry is a difficult task because of the shape of the jewelry, its color, background elements such as shadows and glass stand, as well as the blurring of the boundary between the jewelry and the background and unique light reflections. Our preliminary results indicate that CycleGAN is effective in correcting jewelry images and that background elements in jewelry images adversely affect jewelry image correction. In this study, we propose a method to correct jewelry images with strong background elements. The results show that the target consistency of TC-ShadowGAN is effective in correcting images with strong background elements, and that the removal of background elements is further improved by introducing data augmentation with Balanced Consistency Regularization (BCR) and Dense Consistency Regularization (DCR).
14:55	Yuma Sasaki, Mie Sato and Kei Kanari Effect of visual saliency on emotions while viewing paintings PRESENTER: Yuma Sasaki ABSTRACT. On viewing a painting, emotions such as happiness and sadness are evoked. However, it is unclear whether these emotions are truly aroused or not, and what factors are responsible for the arousal of these emotions. It has been demonstrated that the pupil diameter changes while looking at emotional objects. In this study, using pupillary response as a physiological index of emotion, pupillary response and eye movements were measured when viewing paintings. We analyzed their relationship with the evaluation of the painting (valence, arousal, liking) to examine the influence of lower-order features of the painting on emotions.
15:15	Mika Sakuma and Satoru Fujita Deformation Invariant Palmprint Recognition with Multiple-Resolution Feature Matching ABSTRACT. Palmprint recognition from high-resolution images involves many ridge and minutiae-based features. A high-resolution image of a palmprint contains many small ridges and their crossing points. To identify such features in palmprint images, we need to extract micro feature points from the images and find combinations of the feature points that are consistent with physical constraints. The palm, however, has soft skin and can deform, and, as a result, we cannot assume a linear transformation of the image. Our approach is loosely-coupled feature-pair matching using neural networks. First, we analyze two palm images and extract multiple-resolution features using a convolutional neural network with skip connections. Second, we compare coarse-grained features and roughly find matching parts using a graph neural network. Finally, we explore fine-grained features around the rough matching pairs, and determine the identity according to the number of matching pairs. All of these processes are performed by neural networks. We evaluated the proposed scheme and showed a fine result for palmprint identification.

17:00-17:20 Closing

Location: F. C. Room