14:00 | Room Style Estimation for Style-Aware Recommendation ABSTRACT. Interior design is a complex task, as evidenced by the multitude of professionals, websites, and books offering design advice. Such advice is also highly subjective, since different experts may hold different interior design opinions. Our goal is to offer data-driven recommendations for interior design tasks that reflect an individual's room style preferences. We present a style-based image suggestion framework that searches for room ideas and relevant products for a given query image. We train a deep neural network classifier based on a VGG architecture, focusing on high-volume classes with high-agreement samples. The resulting model shows promising results and paves the way to style-aware product recommendation with a holistic understanding of room style. |
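The "high-volume classes with high-agreement samples" curation step described in this abstract can be illustrated with a small sketch. This is not the authors' code; the thresholds, function name, and vote format are assumptions for illustration only.

```python
import numpy as np

def curate_training_set(labels, min_agreement=0.8, min_class_count=50):
    """Keep only samples whose annotator agreement is high and whose
    majority-vote class is frequent enough among the retained samples.
    `labels` is a list of per-sample annotator votes, e.g.
    [["modern", "modern", "rustic"], ...]. Thresholds are hypothetical.
    Returns (kept indices, majority labels for those indices)."""
    majority, agreement = [], []
    for votes in labels:
        vals, counts = np.unique(votes, return_counts=True)
        top = counts.argmax()
        majority.append(vals[top])                 # majority-vote label
        agreement.append(counts[top] / len(votes)) # fraction agreeing
    majority = np.array(majority)
    agreement = np.array(agreement)
    keep_agree = agreement >= min_agreement
    # Count classes only over high-agreement samples, then keep big classes.
    classes, class_counts = np.unique(majority[keep_agree], return_counts=True)
    big = set(classes[class_counts >= min_class_count])
    idx = [i for i in range(len(labels)) if keep_agree[i] and majority[i] in big]
    return idx, majority[idx]
```

The filtered subset would then be used to fine-tune the VGG classifier mentioned in the abstract.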
14:15 | Augmented Reality for Human-Robot Cooperation in Aircraft Assembly ABSTRACT. Augmented Reality (AR) is often discussed as one of the enabling technologies in Industrie 4.0. In this paper, we describe a practical application where Augmented Reality glasses are used not only for assembly assistance, but also as a means of communication to enable the orchestration of a hybrid team consisting of a human worker and two mobile robotic systems. The task of the hybrid team is to rivet so-called stringers onto an aircraft hull. While the two robots do the physically demanding, unergonomic, and possibly hazardous tasks (squeezing and sealing rivets), the human takes over those responsibilities that require experience, multi-sensory sensitivity, and specialist knowledge. We describe the working scenario and the overall architecture, and give design and implementation details on the AR application. |
14:30 | Structuring and inspecting 3D anchors for seismic volume into Hyperknowledge Base in virtual reality ABSTRACT. Seismic data is a source of information that geoscientists use to investigate subsurface regions and look for possible resources to explore. Such data are volumetric and noisy, and thus challenging to visualize. Over the years, these data have motivated research into new computational systems to assist the expert in that endeavor, such as visualization methods, signal processing, and machine learning models, to name a few. We propose a system that helps geologists, geophysicists, and related domain experts interpret seismic data in virtual reality (VR). The system uses a hyperknowledge base (HKBase), which structures regions of interest (ROIs) as anchors carrying semantics from the user to the system and vice versa. For instance, through the HKBase, the user can load and inspect the output of AI systems, or provide new inputs and feedback in the same way. We ran tests with experts to evaluate the system on their tasks and to collect feedback and new insights on how the software could contribute to their processes. Based on our results, we claim to have taken a step forward in VR for the oil & gas industry by creating a valuable experience for the task of seismic interpretation. |
14:45 | Deep Learning on VR-Induced Attention PRESENTER: Gang Li ABSTRACT. Some evidence suggests that virtual reality (VR) approaches may lead to a greater attentional focus than experiencing the same scenarios presented on computer monitors. The aim of this study is to differentiate attention levels captured during a perceptual discrimination task presented on two different viewing platforms, a standard personal computer (PC) monitor and head-mounted-display (HMD) VR, using a well-described electroencephalography (EEG)-based measure (parietal P3b latency) and a deep learning-based measure (that is, EEG features extracted by a compact convolutional neural network, EEGNet, and visualized by a gradient-based relevance attribution method, DeepLIFT). Twenty healthy young adults participated in this perceptual discrimination task, in which, according to a spatial cue, they were required to discriminate either a "Target" or "Distractor" stimulus on the screen of the viewing platform. Experimental results show that the EEGNet-based classification accuracies are highly correlated with the p-values of the statistical analysis of P3b. Also, the visualized EEG features are neurophysiologically interpretable. This study provides the first visualized deep learning-based EEG features captured during an HMD-VR-based attentional task. |
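The parietal P3b latency measure mentioned in this abstract is conventionally computed as the latency of the maximum positive deflection in a post-stimulus window of the event-related potential. A minimal sketch follows; the sampling rate and the 300-600 ms window are conventional assumptions, not values from the paper.

```python
import numpy as np

def p3b_latency(erp, fs=250.0, window=(0.3, 0.6)):
    """Peak latency (in seconds) of the P3b component: the time of the
    maximum positive deflection of a parietal-channel ERP within a
    post-stimulus search window. `fs` (Hz) and `window` (s) are
    conventional assumptions."""
    lo, hi = (int(w * fs) for w in window)
    seg = erp[lo:hi]                     # restrict to the search window
    return (lo + int(np.argmax(seg))) / fs
```

Longer P3b latencies on one viewing platform than the other would indicate a difference in attentional processing between conditions.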
15:00 | Situation-adaptive object grasping recognition in VR environment ABSTRACT. In this paper, we propose a method for recognizing the grasping of virtual objects in a VR environment. The proposed method utilizes the fact that the position and shape of the virtual object to be grasped are known. A camera acquires an image of the user grasping a virtual object, and the posture of the hand is extracted from that image. The obtained hand posture is used to classify whether the action is a grasping action or not. To evaluate the proposed method, we created a new dataset specialized for grasping virtual objects with a bare hand. The dataset covers three shapes and three positions of virtual objects. The recognition rate of the classifier trained on the dataset with specific shapes of virtual objects was 93.18%, and that trained on all shapes of virtual objects was 87.71%. This result shows that the recognition rate was improved by training the classifier on the shape-dependent dataset. |
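The "situation-adaptive" idea above, training a separate classifier per known object shape, can be sketched with a toy nearest-centroid classifier over hand-posture features. This is a hypothetical simplification, not the paper's classifier; the class name, feature format, and labels are assumptions.

```python
import numpy as np

class ShapeConditionedGraspClassifier:
    """Toy stand-in for a shape-dependent classifier: one nearest-centroid
    model per virtual-object shape, with flattened hand-joint positions
    as features. Illustrative only; the paper's method is more involved."""

    def __init__(self):
        self.centroids = {}  # shape -> {label: class centroid}

    def fit(self, shape, X, y):
        # One centroid per label, trained only on this shape's samples.
        self.centroids[shape] = {
            lbl: X[y == lbl].mean(axis=0) for lbl in np.unique(y)
        }

    def predict(self, shape, x):
        # Pick the label whose centroid is closest to the query posture.
        cents = self.centroids[shape]
        return min(cents, key=lambda lbl: np.linalg.norm(x - cents[lbl]))
```

Because the object shape is known in VR, the appropriate per-shape model can always be selected at prediction time, which mirrors the improvement the abstract reports for shape-dependent training.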
15:15 | PRESENTER: Menghe Zhang ABSTRACT. We present an attention training system based on the principles of multitasking training and neurofeedback, which can target PC and VR platforms. Our training system is a video game following the multitasking-training principle, designed for all ages. It adopts a non-invasive electroencephalography (EEG) device, the Emotiv EPOC+, to collect EEG signals. The wavelet packet transform (WPT) is then applied to extract specific components of the EEG signals, and a multi-class support vector machine (SVM) classifies different attention levels. The training system is built with the Unity game engine and can target both desktops and Oculus VR headsets. We also conducted an experiment to preliminarily evaluate the effectiveness of our system. The results show that our system can generally improve users' multitasking abilities and attention levels. |
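The feature-extraction stage described above (decompose EEG into frequency components, then classify attention level) can be sketched with per-band spectral power features. Note this sketch uses FFT band powers as a stand-in for the wavelet packet coefficients the abstract describes, and the sampling rate and band edges are assumptions; the resulting feature vectors are what an SVM stage would consume.

```python
import numpy as np

def band_power_features(eeg, fs=128.0, bands=((4, 8), (8, 13), (13, 30))):
    """Per-channel spectral band powers (theta/alpha/beta by default) as a
    simple stand-in for wavelet-packet features. `eeg` is an array of shape
    (channels, samples); `fs` and `bands` (Hz) are assumptions.
    Returns a flat feature vector of length channels * len(bands)."""
    eeg = np.atleast_2d(eeg)
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    psd = np.abs(np.fft.rfft(eeg, axis=1)) ** 2   # power spectrum per channel
    feats = [psd[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
             for lo, hi in bands]
    return np.concatenate(feats)
```

Feature vectors like these, computed over sliding windows, would be fed to the multi-class SVM to label each window with an attention level.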
15:30 | Remote Environment Exploration with Drone Agent and Haptic Force Feedback ABSTRACT. Camera drones allow exploration of remote scenes that are inaccessible or inappropriate to visit in person. However, these exploration experiences are often limited by the insufficient scene information provided by front-facing cameras, which supply only 2D images or videos. Combining camera drone vision with haptic feedback can augment users' spatial understanding of the remote environment, but such combinations are usually difficult for users to learn and apply, due to the complexity of the system and unwieldy UAV control. Here, we present a new telepresence system for remote environment exploration, with a drone agent controlled by a VR mid-air panel. The drone generates real-time location and landmark details using integrated Simultaneous Localization and Mapping (SLAM). The SLAM point clouds are generated from RGB input, and the results are passed to a Generative Adversarial Network (GAN) to reconstruct the remote scene in real time. The reconstructed objects are then used by haptic devices to provide sophisticated haptic rendering to users. By providing both visual and haptic feedback, our system allows users to examine and explore remote areas without being physically present. We conducted an experiment that confirms the usability of the 3D reconstruction results in haptic feedback rendering. |
14:00 | Introduction PRESENTER: Fabien Danieau |
14:10 | Digital humans: models of behavior and interactivity (keynote) ABSTRACT. As techniques for capturing and generating realistic digital humans become more widely available, the need for realistic movement and behavior becomes more important. The Uncanny Valley effect is more pronounced for moving, as opposed to still, imagery, necessitating higher fidelity motion replication, such as from motion capture, as well as higher fidelity behavior models for synthetic movement. This talk explores my work in modeling both appearance and behavior of digital humans, including capture, rigging, and interactivity. |
14:55 | Temporal Interpolation of Dynamic Digital Humans using Convolutional Neural Networks PRESENTER: Irene Viola ABSTRACT. In recent years, there has been an increased interest in point cloud representation for visualizing digital humans in cross reality. However, due to their voluminous size, point clouds require high bandwidth to be transmitted. In this paper, we propose a temporal interpolation architecture capable of increasing the temporal resolution of dynamic digital humans, represented using point clouds. With this technique, bandwidth savings can be achieved by transmitting dynamic point clouds in a lower temporal resolution, and recreating a higher temporal resolution on the receiving side. Our interpolation architecture works by first downsampling the point clouds to a lower spatial resolution, then estimating scene flow using a newly designed neural network architecture, and finally upsampling the result back to the original spatial resolution. To improve the smoothness of the results, we additionally apply a novel technique called neighbour snapping. To be able to train and test our newly designed network, we created a synthetic point cloud data set of animated human bodies. Results from the evaluation of our architecture through a small-scale user study show the benefits of our method with respect to the state of the art in scene flow estimation for point clouds. Moreover, correlation between our user study and existing objective quality metrics confirm the need for new metrics to accurately predict the visual quality of point cloud contents. |
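The core interpolation idea in the abstract above (estimate per-point scene flow between two transmitted frames, then synthesize an in-between frame) can be sketched with a crude nearest-neighbour flow estimate. This is purely illustrative: the paper uses a learned network for flow estimation plus neighbour snapping, and the function names here are assumptions.

```python
import numpy as np

def nn_scene_flow(pc_a, pc_b):
    """Crude scene-flow estimate: each point in frame A is displaced
    toward its nearest neighbour in frame B. Stand-in for the learned
    flow network described in the abstract. Shapes: (N, 3) and (M, 3)."""
    d = np.linalg.norm(pc_a[:, None, :] - pc_b[None, :, :], axis=2)
    return pc_b[d.argmin(axis=1)] - pc_a

def interpolate_frame(pc_a, pc_b, t=0.5):
    """Synthesize an in-between point cloud by moving each point of
    frame A a fraction t along its estimated flow toward frame B."""
    return pc_a + t * nn_scene_flow(pc_a, pc_b)
```

On the receiving side, transmitting every other frame and reconstructing the missing ones this way is what yields the bandwidth savings the abstract describes.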
15:20 | Automatic Generation of 3D Facial Rigs ABSTRACT. Digital humans are key aspects of the rapidly evolving areas of virtual reality, augmented reality, virtual production, and gaming. Even outside of the entertainment world, they are becoming more and more commonplace in retail, sports, social media, education, health, and many other fields. This talk presents a fully automatic pipeline for generating facial rigs of high geometric and textural quality, automatically rigged with facial blendshapes for animation. The steps of this pipeline, such as photogrammetry, landmarking, retopology, and blendshape transfer, are detailed. Two applications are then showcased: creating fast VR avatars and generating high-quality digital doubles. |
16:15 | The Design Process for Enhancing Visual Expressive Qualities of Characters from Performance Capture into Virtual Reality ABSTRACT. In designing performances for virtual reality one must consider the unique qualities of the VR medium in order to deliver expressive character performance. This means that the design requirements for participant engagement and immersion must evolve to address these new possibilities. Embedding the importance of emotion and expression into the process of making character movement, specifically through strong acting and directing, showcases the need for more attention to expressive human movement to enhance immersive experiences. |
16:30 | Influence of Motion Speed on the Perception of Latency in Avatar Control ABSTRACT. With the dissemination of Head-Mounted Display devices in which users cannot see their body, simulating plausible avatars has become a key challenge. For full-body interaction, avatar simulation and control involves several steps, such as capturing and processing the motion (or intentions) of the user using input interfaces, providing the resulting user state information to the simulation platform, computing a plausible adaptation of the virtual world, rendering the scene, and displaying the multisensory feedback to the user through output interfaces. All these steps imply that the displayed avatar motion appears to users with a delay (or latency) compared to their actual performance. Previous works have shown an impact of this delay on the perception-action loop, with possible impact on Presence and embodiment. In this paper, we explore how the speed of the motion performed when controlling a full-body avatar can impact the way people perceive and react to such a delay. We conducted an experiment where users were asked to follow a moving object with their finger, while embodied in a realistic avatar. We artificially increased the latency by introducing different levels of delay (up to 300 ms) and measured their performance in the mentioned task, as well as their subjective perception of the latency. Our results show that motion speed influenced the perception of latency: we found critical latencies of 80 ms for medium and fast motion speeds, while the critical latency reached 120 ms for a slow motion speed. We also noticed that performance is affected by both latency and motion speed, with higher speeds leading to decreased performance. Interestingly, we also found that performance was affected by latency before the critical latency for medium and fast speeds, but not for the slower speed. These findings could help in designing immersive environments that minimize the effect of latency on user performance, with potential impacts on Presence and embodiment. |
16:45 | Multispectral Illumination in USC ICT's Light Stage X ABSTRACT. USC ICT's computational illumination system Light Stage X has been used for a variety of different techniques: from studio lighting reproduction to high resolution facial scanning. In this talk, I'll describe how adding multispectral LEDs to the system has improved color rendition for a variety of such Light Stage techniques, while also enabling higher resolution facial capture. I will conclude with opportunities for future work on human digitization leveraging multispectral illumination sources. |
17:15 | Automating mass production of digital avatars for VR ABSTRACT. This talk covers how the Vision and Graphics Lab at USC's ICT is leveraging the latest Light Stage technology to build a database of facial scans. Recent movements toward a convergence of visual quality in real-time and offline rendering, in conjunction with the massive rise of deep learning approaches for processing and recreating human data, have drastically simplified the ability to generate realistic avatars for VR; something previously reserved for high-end visual effects studios with a multitude of highly specialized artists and engineers. We have developed a pipeline for scanning, preprocessing, and registering expressive facial scans to automate the building of a database that enables training of machine learning algorithms to generate highly detailed and visually realistic avatars. This presentation focuses on the main obstacles confronted when building such a database and pipeline, aimed specifically at facial scan data but stretching further by combining multiple data sources and providing automatic rigging, animation, and rendering of a massive number of digital avatars. |
17:45 | Wrap up PRESENTER: Fabien Danieau |