IEVC2019: THE 6TH IIEEJ INTERNATIONAL CONFERENCE ON IMAGE ELECTRONICS AND VISUAL COMPUTING
PROGRAM FOR THURSDAY, AUGUST 22ND

13:30-15:30 Session 1A: Computer Vision
Location: Room 1
13:30
Likelihood-Based VKOP Detection with Reliability of Estimated Planes
PRESENTER: Kazuma Uenishi

ABSTRACT. The VKOP detection method allocates 3D keypoints at virtual positions by utilizing the planar surfaces of 3D point clouds. Compared to conventional keypoints, VKOP is particularly effective in environments containing many man-made objects. However, it strongly depends on the performance of the plane estimation method. To improve its repeatability, several methods that evaluate the likelihood of the estimated planar surfaces have been proposed. In this paper, we propose a method that integrates these metrics and compare its performance with that of the conventional methods. We also evaluate the effect of pre-filtering planes and examine the optimal method for detecting VKOP keypoints.

13:50
Robust, Efficient and Deterministic Planes Detection in Unorganized Point Clouds based on Sliding Voxels

ABSTRACT. Plane detection in unorganized point clouds is a fundamental and essential task in 3D computer vision, and a prerequisite for a wide variety of 3D vision tasks such as object recognition and registration. Conventional plane detection methods are extremely slow because they require the computation of point-wise normal vectors. We therefore propose a sliding-voxel-based method that efficiently detects planes via coplanarity weighting and robust refitting. Experiments with simulated and real point clouds confirmed that the proposed method is several orders of magnitude faster, more accurate, and more robust to noise than conventional methods.
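
A minimal sketch of the kind of per-voxel coplanarity weighting such a method might use, assuming points are grouped into a regular voxel grid and each voxel's flatness is scored by PCA; the voxel size, weight formula, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def voxel_coplanarity(points, voxel_size=0.1):
    """For each occupied voxel, fit a plane to its points by PCA and
    return a coplanarity weight in [0, 1] plus the fitted normal.
    Illustrative only: voxel size and weight formula are assumptions."""
    keys = np.floor(points / voxel_size).astype(int)
    results = {}
    for key in set(map(tuple, keys)):
        pts = points[np.all(keys == key, axis=1)]
        if len(pts) < 3:
            continue
        centered = pts - pts.mean(axis=0)
        # Eigenvalues of the covariance: the smallest measures "thickness"
        evals, evecs = np.linalg.eigh(np.cov(centered.T))
        coplanarity = 1.0 - 3.0 * evals[0] / evals.sum()  # ~1 for flat voxels
        results[key] = (coplanarity, evecs[:, 0])          # weight, normal
    return results
```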

14:10
The More, The Better: Evaluation of Model Ensemble Approach for Few-shot Learning
PRESENTER: Toshiki Kikuchi

ABSTRACT. Despite the recent success of neural networks in the visual domain, a large amount of data is needed to train them. Previous works have addressed this issue as few-shot learning. Some methods perform well on few-shot tasks but require complex architectures and/or specialized loss functions. In this paper, we evaluate the performance of an ensemble approach that aggregates a large number of simple models (up to 128) on standard few-shot datasets. Surprisingly, despite its simplicity, our experimental results show that the ensemble approach is competitive with state-of-the-art methods of similar architecture in some settings.
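
The core of such an ensemble can be as simple as averaging the class probabilities of many independently trained models; the sketch below assumes each model exposes a predict_proba-style method, which is an illustrative interface rather than the authors' code.

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the class-probability outputs of many independently
    trained simple models and return the predicted classes."""
    probs = np.mean([m.predict_proba(x) for m in models], axis=0)
    return np.argmax(probs, axis=1)
```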

14:30
Smart Reading Device for Visually Impaired People
PRESENTER: Deepak Rai

ABSTRACT. This paper presents an efficient, user-friendly, cost-effective, real-time automated document reader for visually impaired people, developed using a Raspberry Pi. The setup integrates a complete text read-out system: it captures pictures of text documents in real time with the Raspberry Pi camera, pre-processes the text images, and extracts text from them using the Tesseract library. The extracted text is then converted and read out by a text-to-speech (TTS) synthesizer installed on the Raspberry Pi. We varied the font, size, and style to determine which combination works best for our system.
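
A minimal sketch of the OCR-to-speech step using the commonly paired pytesseract and pyttsx3 libraries; the file name and the choice of TTS library are assumptions, not necessarily the authors' exact toolchain.

```python
from PIL import Image
import pytesseract   # Tesseract OCR wrapper
import pyttsx3       # offline text-to-speech engine

def read_document_aloud(image_path):
    """OCR a captured page image with Tesseract, then read the
    recognized text aloud with a TTS engine."""
    text = pytesseract.image_to_string(Image.open(image_path))
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

read_document_aloud("captured_page.jpg")  # hypothetical file name
```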

14:50
Image Measurement and Accelerated Micro-Geometry Modeling of Human Skins Considering Skin Conditions of Each Part

ABSTRACT. We have been studying the representation of the micro-geometry of human skin using computer graphics technologies. Our previous technique rendered the micro-geometry as polygon models; however, it required a long computation time for polygon generation and produced a large number of triangles. It also could not represent differences in skin condition between face parts, because the micro-geometry was modeled only from images of cheeks. We therefore developed a new implementation that represents the micro-geometry in a shader program with displacement mapping to accelerate this process. Furthermore, we developed a method to measure the micro-geometry parameters from 22 images of different face parts and to interpolate the micro-geometry around the shooting points. This paper describes the processing flow of the image measurement and accelerated modeling of the micro-geometry, and then presents example results.

13:30-15:30 Session 1B: Medical Image Processing
Location: Room 2
13:30
Image Acquisition and Analysis of Microcirculation
PRESENTER: Hideaki Haneishi

ABSTRACT. We have been developing image analysis technologies for microcirculation as well as original optical setups. One application is oximetry: measuring the oxygen saturation (SO2) of the microcirculation is useful in clinical diagnosis, so we developed an SO2 estimation method using sidestream dark-field (SDF) imaging. We performed an in vivo animal experiment under hypoxic stimulation and visualized the SO2 of the local microcirculation. Another application is flow analysis. We constructed a non-contact optical setup for imaging the microcirculation of septic rats. Using the obtained motion pictures, we estimated the flow of red blood cells (RBCs) and investigated the relationship between RBC flow and septic shock.

13:50
Segmentation of Cell Nuclei Using Watershed Transform in the H&E-Stained Histopathological Images
PRESENTER: Cynthia Hayat

ABSTRACT. Image segmentation is an important step in automated image processing. This paper applies the watershed transform to segment nuclei in H&E-stained histopathological images. Color deconvolution, Otsu thresholding, morphological opening and reconstruction, and contour detection are performed before the watershed transform. Experimental results on 12 images show that the proposed method successfully detects and segments the nuclei in almost all of them, and it also handles overlapping nuclei. The color contrast between the nuclei and their surroundings influences the success of the segmentation: nuclei whose color is close to that of their surroundings are more difficult to segment than nuclei with a distinct color contrast.
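
A minimal scikit-image sketch of this kind of pipeline (color deconvolution, Otsu thresholding, opening, then marker-based watershed on a distance transform); the structuring-element size and peak spacing are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.color import rgb2hed
from skimage.filters import threshold_otsu
from skimage.morphology import opening, disk
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def segment_nuclei(rgb):
    """Color deconvolution -> Otsu threshold -> opening -> marker-based
    watershed on the distance transform of the nuclei mask."""
    hematoxylin = rgb2hed(rgb)[..., 0]               # nuclei (H) channel
    mask = hematoxylin > threshold_otsu(hematoxylin)
    mask = opening(mask, disk(3))                    # remove small specks
    distance = ndi.distance_transform_edt(mask)
    coords = peak_local_max(distance, min_distance=10, labels=mask)
    markers = np.zeros(mask.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    return watershed(-distance, markers, mask=mask)  # label image of nuclei
```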

14:10
Binary Malignancy Classification of Skin Tissue Using Reflectance and Texture Features from Macropathology Multi-Spectral Images

ABSTRACT. This study proposes an analysis procedure for macropathology multi-spectral images (macroMSI) to visually highlight grossly malignant regions of skin samples during excision-margin pathological diagnosis. We performed binary malignancy classification on a database of ten high-resolution 7-channel macroMSI tissue samples, captured before and after formalin fixing. We reconstructed spectral reflectance by Wiener estimation and described texture using local binary patterns (LBP). Highlighted malignancy regions were derived from an optimal classifier selected by cross-validated performance. The results show that malignant regions are highlighted fairly accurately and indicate the importance of analyzing unfixed tissue in conjunction with fixed tissue.
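
A brief sketch of the two feature steps named here: a Wiener-style reflectance estimator learned from training pairs, and a uniform LBP histogram as a texture descriptor. The regularization value and variable shapes are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def wiener_matrix(train_reflectance, train_response, noise_var=1e-4):
    """Wiener-style estimator W such that reflectance ~= W @ camera_response,
    learned from training pairs (n, bands) and (n, channels)."""
    R, V = train_reflectance, train_response
    Kvv = V.T @ V + noise_var * np.eye(V.shape[1])   # regularized response covariance
    return (R.T @ V) @ np.linalg.inv(Kvv)

def lbp_histogram(gray_patch, P=8, R=1):
    """Uniform LBP histogram as a texture descriptor for one patch."""
    lbp = local_binary_pattern(gray_patch, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist
```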

14:30
Elastic and Collagen Fibers Segmentation Based on U-Net Deep Learning Using Hematoxylin and Eosin Stained Hyperspectral Images

ABSTRACT. This study proposes a U-net deep learning based method to segment elastic and collagen fiber regions from Hematoxylin and Eosin (H&E) stained pathological images using hyperspectral images. Our method uses U-net as the CNN architecture and feeds it randomly augmented 64x64 patches of H&E-stained hyperspectral images. Ground truth for the segmentation is obtained from Verhoeff's Van Gieson (EVG) stained images, which are commonly used for recognizing elastic and collagen fiber regions. Our model is evaluated with three cross-validations. The segmentation results show that H&E-stained hyperspectral images yield better segmentation than H&E-stained RGB images, as judged both visually and quantitatively against the EVG-stained images.

14:50
Invertebrate Fossil Classification Using Support Vector Machine Method

ABSTRACT. Invertebrate fossils are fossils of organisms without vertebrae; they are abundant and well preserved in various types of rock, originate from many kinds of organisms, and most span long geological time frames. Currently, invertebrate fossils are usually identified with methods involving acetic acid or other chemical compounds, which take time and can damage the fossil itself. This study classifies three types of invertebrate fossils, namely arthropods, brachiopods, and molluscs, based on texture feature extraction (contrast, correlation, energy, and homogeneity) and classification with a Support Vector Machine (SVM). The extracted feature values are then used as input for SVM classification. The highest accuracy achieved in this study is 94.87%, using 39 training samples and 39 test samples.
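
A minimal sketch of GLCM texture features feeding an SVM, using scikit-image and scikit-learn; the distance/angle settings and the variable names for the image and label lists are hypothetical (and graycomatrix is spelled greycomatrix in older scikit-image releases).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(gray_uint8):
    """Contrast, correlation, energy and homogeneity from a GLCM."""
    glcm = graycomatrix(gray_uint8, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "correlation", "energy", "homogeneity")
    return np.array([graycoprops(glcm, p)[0, 0] for p in props])

# Hypothetical usage: `train_images` / `train_labels` are assumed lists of
# grayscale uint8 fossil images and their class labels.
# X = np.stack([glcm_features(img) for img in train_images])
# clf = SVC(kernel="rbf").fit(X, train_labels)
# prediction = clf.predict([glcm_features(test_image)])
```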

15:10
Retinal Authentication Based on Image Registration Using ICP Algorithm
PRESENTER: Yuji Hatanaka

ABSTRACT. This paper describes a retinal authentication technique based on retinal image registration, exploiting the fact that the retinal blood vessel pattern rarely changes over time. The proposed method consists of four steps: 1) blood vessel detection using the top-hat transformation and skeletonization, 2) optic disc detection, 3) two-step registration of the image pair, and 4) authentication using the Jaccard similarity coefficient. The retinal image pairs are roughly aligned using the centers of their optic discs and then registered with the Iterative Closest Point (ICP) algorithm based on the blood vessel skeletons. The accuracy reached 100% on 87 correct pairs and 60,378 incorrect pairs.
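
A brief sketch of the vessel-skeleton extraction and the Jaccard score that such a pipeline might use; the black-hat (dark top-hat) variant and the kernel size are assumptions made here because retinal vessels are darker than the background, and the ICP registration step itself is omitted.

```python
import numpy as np
import cv2
from skimage.morphology import skeletonize

def vessel_skeleton(green_channel):
    """Enhance vessels with a morphological (black) top-hat, threshold
    with Otsu, and thin the result to a one-pixel skeleton."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    enhanced = cv2.morphologyEx(green_channel, cv2.MORPH_BLACKHAT, kernel)
    _, binary = cv2.threshold(enhanced, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return skeletonize(binary > 0)

def jaccard(skel_a, skel_b):
    """Jaccard similarity of two registered binary skeletons."""
    inter = np.logical_and(skel_a, skel_b).sum()
    union = np.logical_or(skel_a, skel_b).sum()
    return inter / union if union else 0.0
```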

13:30-15:30 Session 1C: Imaging Device
Location: Room 3
13:30
Real-World High-Definition Video Archive System with 250 Million-pixel 19K13K Uncompressed Image Recorder

ABSTRACT. This study proposes a high-definition video archive system with which real-world visual information can be recorded without losing detail. We have developed a 19K13K uncompressed image recorder equipped with a 250-million-pixel CMOS image sensor. The proposed video archive production workflow consists of two recording modes: (1) full-screen high-resolution and (2) angle-clip high-speed recording. In mode (1), entire 19K13K uncompressed images are recorded. Experimental results showed that high-definition movies with both wide-angle and close-up views could be produced from the same 19K13K uncompressed image material. In mode (2), only a region of interest (ROI) is recorded. When the ROI resolution is set to 8K4K, the recording speed increases to 24 fps. The selected ROI position can be moved freely in real time, so the best composition can be determined instantaneously with the camera position fixed. Experimental results showed that this function successfully captured the decisive moment of a wild deer climbing a cliff.

13:50
Concept and Prototyping of Large Area Wall Display Using Electronic Paper Tile

ABSTRACT. A novel concept for large-area displays, “e-Tile”, is introduced. A typical e-Tile configuration, in which 100 pixels are mounted on a 100 mm square board, is designed and prototyped. One promising application is an unobtrusive information board, which is far less annoying than conventional vivid LED/LCD displays in public spaces.

14:10
Implementation of Self Localization System Using Frequency Modulated LED Lighting and Omnidirectional Camera for Wheelchair Basketball

ABSTRACT. This paper describes the implementation of our self-localization system for indoor environments based on LED optical frequency modulation. Full-color LEDs are used as markers for position estimation. The characteristic of this system is that the red, green, and blue optical patterns of each LED are frequency-modulated independently and used to carry several kinds of information. Using the information in these optical patterns, the system can acquire the positions of all markers before calibration. In this paper, we conduct experiments that confirm the method of acquiring the information provided by the LED optical patterns in an actual environment.

14:30
AR Display Guidance Services of Indoor Facilities Using 2.4 GHz Band Position/Direction Measuring Prototype

ABSTRACT. Currently, GPS is used to indicate the location of facilities and the paths to them. Because the positioning accuracy of GPS is low indoors, it often cannot be used to indicate the location of indoor facilities or the paths to them. We have therefore developed a system of hand-crafted transmitters and receivers operating with radio waves in the 2.4 GHz band to estimate the current position of an indoor user. Based on the estimation results, the path to the destination can be indicated visually to the user. In addition, by adjusting the frequency band, a smartphone could potentially be used as the receiver.

14:50
Safety Driving Measures for Elderly Drivers by W-DRM System

ABSTRACT. Respiration rate is one of the human vital signs. Accurately measuring it has proved to be a difficult task, and the methods that do succeed are often not suitable for mobile measurement. In this paper, a mobile method of respiration rate measurement is proposed that provides accurate data while remaining easy to access. A two Doppler radar module (W-DRM) system can be used to measure the Doppler fading caused by the motion of the chest cavity. Previous attempts at this have met with limited success due to a high error rate. The method proposed in this paper improves the accuracy of the W-DRM without sacrificing its mobility and the ease with which it can be integrated into devices such as cars.
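
For context, once a Doppler baseband signal reflecting chest motion is available, a respiration rate can be read off its spectral peak; the sketch below is a generic illustration with an assumed respiration band of 0.1-0.5 Hz, not the W-DRM processing chain itself.

```python
import numpy as np

def respiration_rate_bpm(doppler, fs, f_lo=0.1, f_hi=0.5):
    """Estimate breaths per minute from a Doppler baseband signal sampled
    at fs Hz by locating the spectral peak in the respiration band."""
    x = doppler - np.mean(doppler)                 # remove DC offset
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    peak_freq = freqs[band][np.argmax(spectrum[band])]
    return peak_freq * 60.0                        # Hz -> breaths per minute
```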

15:10
Technique for Embed Information in 3D Printed Objects Using Near Infrared Fluorescent Dye

ABSTRACT. This paper presents a novel technique to embed information in a 3D printed object using a near infrared fluorescent dye. Patterns containing a small amount of fluorescent dye are formed inside the object, and these patterns express binary information. We embedded the patterns at two different depths to increase the amount of embedded information. Utilizing the fact that the blur of the pattern image depends on the depth, we recognize the pattern depth from the image using deep learning to read out the embedded information. Experiments demonstrate the feasibility of this technique.

16:00-17:40 Session 2A: Image Processing
Location: Room 1
16:00
Filling Small Spots and Changing the Line Thickness of the Line Drawing by FCN

ABSTRACT. When drawing illustrations and comics, filling small spots with black ink and changing the line thickness give a plastic impression to the line drawings and make the illustration more attractive. We consider processing such line drawings using deep learning, and propose to add a plastic impression to a line drawing using an FCN, a kind of deep learning model.

16:20
A Free Pour Latte Art Support System by Showing Paths of Pouring Milk Using Design Templates

ABSTRACT. When practicing free pour latte art, most learners watch videos that show how to make basic patterns such as Heart, Rosetta, and Tulip. However, it is difficult to understand from videos how to manipulate the milk jug and along which paths to pour the milk. In this paper, we focus on free pour latte art and develop a support system with which learners can design original latte art using design templates and learn the milk-pouring paths shown as animated lines.

16:40
Estimation of Luminance Distribution Around the Sun Based on Analytical Sky Model
PRESENTER: Asato Maekawa

ABSTRACT. Depending on the camera, high-brightness areas around the sun may cause inaccurate brightness measurements. To address this problem, we estimated the brightness distribution around the sun using the brightness distribution of the Hosek sky model. We found that we could estimate the brightness of the sky far from the sun, but not that of the high-brightness area around it. The likely reason is that the contribution of the sun itself is not modeled in the Hosek sky model.

17:00
A Data-Driven Method for Reproducing Artificial Light Sources in Night View Style Transfer Based on Luminance Map
PRESENTER: Xu Wang

ABSTRACT. In this paper, we introduce a data-driven method for reproducing artificial light sources in night view transfer. Our idea is to blend the daytime image with artificial light sources. The idea is similar to image matting, which separates a given image into a foreground and a background image: our method regards the daytime scene as the background and the artificial light sources as the foreground. Since an estimate of the nighttime luminance map corresponding to the daytime scene is also needed, we create a day-to-night dataset to train a pix2pix framework. After computing the artificial light sources based on the luminance map and the luminance-color mapping model, we blend them with the daytime scene and optimize the result according to the predicted nighttime luminance map. The result is constrained by the pix2pix framework but successfully restores the artificial light sources in the night view image. Our primary contribution is to reproduce a set of artificial light sources in the process of nighttime image synthesis. The new method not only presents an easier approach to transforming various daytime images but also makes the processing more controllable through user interaction.
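
The matting-style blend described here reduces to alpha compositing with the luminance map as the matte; a minimal sketch under that assumption (array shapes and normalization are illustrative, not the authors' exact formulation):

```python
import numpy as np

def blend_light_sources(day_rgb, light_rgb, luminance_map):
    """Alpha-composite artificial light sources over a daytime image,
    using a normalized luminance map as the matte (foreground weight).
    All arrays are float32 in [0, 1] with matching height and width."""
    alpha = luminance_map[..., None]               # (H, W, 1) matte
    return alpha * light_rgb + (1.0 - alpha) * day_rgb
```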

17:20
Natural Extension Method of Video Length
PRESENTER: Miki Iwasaki

ABSTRACT. We propose methods for extending the length of music videos. In general, music videos are created by editing and concatenating appropriate parts of multiple source videos; however, it is difficult for musicians to prepare enough source material. In this paper, we present several methods to extend short source videos into long video sequences. By combining image segmentation, repeated play, backward play, and Poisson image editing, natural long video sequences are created even from source videos with non-repetitive motions.

16:00-17:40 Session 2B: Visualization
Location: Room 2
16:00
Generating Route Panoramas for Street Guide Maps

ABSTRACT. Maps are useful for guiding pedestrians in unfamiliar cities. Although conventional maps, such as ordinary 2D maps, 3D CG maps, and Google Street View, have several advantages, they are not suitable guides when walking in narrow streets or in districts without landmarks because of their limited or distorted representations of the landscape. Our goal is to design a more practical and useful map representation. For this purpose, we propose a novel street guide map in which route panoramas along every street are projected onto a 2D map, allowing users to grasp their present location more easily. As a first step, this paper proposes a method to correct several distortions in route panoramas taken by an off-the-shelf camera. Specifically, we correct the lens distortion and the inclination of the landscape in the panoramas.
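
A brief OpenCV sketch of the two corrections named above, assuming the camera's intrinsic matrix, distortion coefficients, and measured roll angle are available as inputs (all assumptions, not values from the paper):

```python
import cv2

def correct_panorama_frame(image, camera_matrix, dist_coeffs, roll_deg):
    """Remove lens distortion with the calibrated camera model, then
    rotate the frame to cancel the measured camera roll (inclination)."""
    undistorted = cv2.undistort(image, camera_matrix, dist_coeffs)
    h, w = undistorted.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), roll_deg, 1.0)
    return cv2.warpAffine(undistorted, rot, (w, h))
```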

16:20
Two-Dimensional Flow Field Visualization Using Hierarchical Poisson Disk Sampling
PRESENTER: Hiroki Watai

ABSTRACT. In flow field visualization, it is important to show various attributes of the flow field, such as flow velocity, direction, orientation, and macro/micro structure. With conventional methods it is difficult to show all of these attributes in one image. We propose an easy-to-understand 2D flow field visualization method that can present multiple attributes in a single image. The main idea is to combine the LIC (line integral convolution) method and the OLIC method (which uses oriented short streamlines). To improve comprehensibility, overlapping streamlines are avoided with a modified hierarchical Poisson disk sampling method.
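
For reference, the spacing constraint behind Poisson disk sampling can be illustrated with naive dart throwing; the hierarchical, streamline-aware variant in the paper is more involved, so the sketch below only shows the basic minimum-distance rule (all parameters are illustrative):

```python
import numpy as np

def poisson_disk_samples(width, height, radius, max_tries=10000, seed=0):
    """Accept a random point only if it lies at least `radius` away from
    every previously accepted point (naive dart throwing)."""
    rng = np.random.default_rng(seed)
    points = []
    for _ in range(max_tries):
        p = rng.uniform((0, 0), (width, height))
        if all(np.hypot(*(p - q)) >= radius for q in points):
            points.append(p)
    return np.array(points)
```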

16:40
Gaze Analysis for Optimization of Advertisement Content Layout

ABSTRACT. There are many advertisements around us, but not all of them attract attention, and some are not effective. It is therefore important to create advertisements that are easily noticed and effective. In this study, we conducted experiments using a gaze-tracking system and analyzed how attention changes when the arrangement of the elements that make up a poster advertisement is changed, and when the placement of the posters is changed. We then clarify the influence of element arrangement and poster placement on gaze, and establish an index for efficient production of highly effective advertisements.

17:00
User Experience Design in Virtual Learning Environment for Complex Projective Geometry

ABSTRACT. With the advent of high-performance graphics and networking technologies that enable us to create virtual worlds networked via the Internet, various virtual environments have been developed to support mathematics education. In environments that are inherently two- or three-dimensional Euclidean, students have discovered and experienced mathematical concepts and processes in much the same ways as they can in real life. We present an immersive virtual environment that allows the user to set environmental limits beyond Euclidean 3-space while providing a good user experience. The problem is that the higher the level of mathematics, the more abstract the visualization tends to become, to the point that only experts with advanced degrees can fathom it. We also show how our figurative approach is essential for bridging the gap between elementary and more sophisticated mathematical visualizations.

17:20
aflak: Visual Programming Environment with Macro Support for Collaborative and Exploratory Astronomical Analysis

ABSTRACT. This paper describes an extendable graphical framework, aflak, which provides a collaborative visualization environment for the analysis of multi-spectral astronomical datasets. aflak allows the astronomer to define and share analytics pipelines through a node-editing interface, in which the user composes a set of built-in transforms (e.g. dataset import, integration, Gaussian fit) over astronomical datasets. Not only is aflak fast and responsive, but its macros can be conveniently exported, imported, and shared among researchers using a custom data interchange format.

16:00-17:40 Session 2C: Neural Network
Location: Room 3
16:00
Analysis of Deep Learning Based Japanese Cursive Style Character Recognition

ABSTRACT. In this study, to promote the translation and digitization of historical documents, we attempted to recognize Japanese classical kuzushiji characters using the dataset released by the Center for Open Data in the Humanities (CODH). Kuzushiji characters are heavily deformed and written in cursive style, so even experts have difficulty recognizing them. Using deep learning, which has undergone remarkable development in the field of image classification, we experimentally analyzed how successfully it can classify kuzushiji characters across more than 1,000 classes. From this analysis, we identified the causes of poor performance for specific characters: (1) hiragana and katakana have a root kanji called jibo, which leads to various shapes for a single character, and (2) the shapes of handwritten characters also differ depending on the writer or the work. Based on this, we found that it is necessary to incorporate specialized knowledge of kuzushiji in addition to improving recognition technologies such as deep learning.

16:20
Sentiment Analysis Using Stacking Based Ensemble Learning of Textual and Image Features on Social Media Posts

ABSTRACT. Social media analysis is currently an important research area due to the tremendous development of web and mobile technology and the trend of people sharing their views, sentiment, and stance on microblogs. This research analyzes text-based user posts together with images as a comprehensive study to understand the sentiments of microblog users, aiming to obtain their opinion toward an entity. In this study, we implemented stacking-based ensemble learning, with deep convolutional neural networks (DCNNs) for both the text and image models. For text processing, GloVe is used as the word vector representation.
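
In stacking, the base models' predictions become features for a meta-classifier; the sketch below assumes the text and image DCNNs output class probabilities on held-out data and uses logistic regression as the meta-model (both assumptions for illustration, not the authors' configuration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_stacking_meta(text_probs, image_probs, labels):
    """Concatenate the class-probability outputs of the text and image
    base models (computed on held-out data) and fit a meta-classifier."""
    meta_features = np.hstack([text_probs, image_probs])
    return LogisticRegression(max_iter=1000).fit(meta_features, labels)

# Inference (hypothetical arrays):
# sentiment = meta.predict(np.hstack([text_probs_new, image_probs_new]))
```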

16:40
Local Branch Ensemble Network: Autonomous Driving System Using End-to-End Deep Learning
PRESENTER: Zelin Zhang

ABSTRACT. Deep neural networks have recently proven their worth in various fields, and solving the problem of autonomous driving through deep learning has become increasingly popular. Although recent task-decomposition systems have achieved great success, we believe a perception-based strategy is more suitable because it is similar to the way humans drive vehicles. In this paper, we propose a perception-based end-to-end autonomous driving system that maps the image captured by a single camera to the steering controls. Instead of using a single indicator, we apply three indicators with a novel network architecture that we call the local branch ensemble network. In addition, we adopt a pre-processing step to make the system more stable. Simulation results demonstrate the validity of the proposed method.

17:00
Applying Curvatures Estimated from 3D Point Clouds to Environment Recognition in Forests Using SegNet
PRESENTER: Takeo Kaneko

ABSTRACT. To generate 3D maps with which monitoring robots can understand their surrounding forest environment, this paper proposes a 3D map generation method that applies curvatures estimated from 3D point clouds to the segmentation results obtained by SegNet. The authors' previous method uses only SegNet, but it cannot achieve sufficient recognition accuracy for scene objects such as the ground to be monitored. The proposed method uses curvatures estimated from 3D point clouds to classify objects in the image into obstacle and ground classes, and these results are then combined with the SegNet output. Experiments using a depth camera show promising results for improving the accuracy for seven object classes.
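
One common way to obtain a curvature-like quantity from a raw point cloud is the "surface variation" of a local neighbourhood's covariance eigenvalues; the sketch below is a generic illustration of that idea (the neighbourhood size and any ground/obstacle threshold are assumptions, not the paper's values):

```python
import numpy as np

def surface_variation(points, query_idx, k=30):
    """Curvature-like score for one point: the smallest eigenvalue of its
    k-nearest-neighbour covariance divided by the eigenvalue sum
    (0 for a perfect plane, up to 1/3 for isotropic noise). Thresholding
    this value could separate flat ground from obstacles."""
    d = np.linalg.norm(points - points[query_idx], axis=1)
    neigh = points[np.argsort(d)[:k]]
    evals = np.linalg.eigvalsh(np.cov((neigh - neigh.mean(axis=0)).T))
    return evals[0] / evals.sum()
```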

17:20
Image Completion of 360-Degree Images by cGAN with Residual Multi-Scale Dilated Convolution

ABSTRACT. In this paper, we are tackling the problem of the entire 360-degree image generation process by viewing one specific aspect of it. We believe that there are two key points to the achieving this 360-degree image completion: firstly, understanding the scene context; for instance, if a road is in front, the road continues on the back side; and secondly, the treatment of the area-wise information bias; for instance, the upper and lower areas are sparse due to the distortion of equirectangular projection, and the center is dense since it is less affected by it. Although the context of the whole image can be understood through using dilated convolutions in a series, such as recent conditional Generative Adversarial Networks (cGAN)-based inpainting methods, these methods cannot simultaneously deal with detailed information. Therefore, we propose a novel generator network with multi-scale dilated convolutions for the area-wise information bias on one 360-degree image and a self-attention block for improving the texture quality. Several experiments show that the proposed generator can better capture the properties of a 360-degree image and that it has the effective architecture for 360-degree image completion.