09:35 | Exploring Super Resolution Deep Learning Approaches ABSTRACT. Single Image Super Resolution (SISR) is an active area of research within the deep learning community, aiming to enhance the resolution of a low-resolution (LR) observation to produce its high-resolution (HR) counterpart. We begin by examining various loss functions employed in SISR models, analyzing how they affect the performance metrics commonly used to assess these methods. How well different metrics capture the quality and accuracy of the generated HR images is critically evaluated using generated HR samples. Then, we explore the significance of the training datasets, namely how the specificity of these datasets impacts the performance of SISR models. Both quantitative metrics and qualitative assessments of the HR samples are used to gauge the effectiveness of different training datasets. Finally, the paper compares two distinct SISR approaches: the supervised approach, where models are pre-trained on extensive datasets before inference, and the unsupervised approach, which generates HR images from a single LR image without prior training. |
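As a rough illustration of the kind of quantitative metric discussed in this abstract, the sketch below computes the peak signal-to-noise ratio (PSNR) between a reference HR image and a generated one. The abstract does not name the metrics the authors used, so choosing PSNR here is an assumption, and the function name `psnr` and the nested-list image representation are hypothetical choices made for this example only.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as nested lists of pixel rows (an illustrative
    representation, not one taken from the paper).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
    sr = [[50, 55, 60], [60, 80, 61], [75, 63, 59]]
    print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

A higher PSNR means a lower pixel-wise error, which does not always agree with perceived visual quality; this is one reason metrics are usually checked against qualitative inspection of the samples.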
09:55 | Boosting Children’s Reading Motivation with LLM-Generated Story Crossovers PRESENTER: Inês Carmo ABSTRACT. It is widely recognised that reading is essential for children’s education. Nevertheless, the hours children should spend reading are increasingly being replaced by hours in front of tablets, phones, or computer screens. To counter this trend, we developed a tool that allows educators to create stories that combine children’s interests with literary classics. The goal is to motivate reading indirectly through children’s interests. The tool exploits the generative capabilities of ChatGPT to create compelling crossovers between literary classics and children’s interests. The tool was developed under the participatory design paradigm. A user study conducted in Lisbon, Portugal with 20 children shows a positive experience with the tool and an increase in motivation to read the classics after using it. This paper presents the design, development, and evaluation of the tool. |
10:15 | Learning through dialogues with NPCs using generative AI ABSTRACT. The rapid evolution of generative artificial intelligence (GenAI) is revolutionising various areas, including education and the gaming industry. GenAI can create original content to enhance traditional teaching methods, making learning more interactive and personalised. These tools can significantly improve educational outcomes by providing personalised feedback to students and increasing their engagement and motivation. However, the integration of GenAI in education raises ethical concerns, particularly regarding privacy, bias, and the accuracy of AI-generated content, as well as the authenticity and authorship of the work. There is a strong emphasis on the need for robust ethical guidelines and human oversight to mitigate these issues. In our study, we use GenAI to create an NPC with a unique personality and life background and enable learners to interact with the NPC without scripted dialogue, creating an engaging game-based learning environment. The prototype developed was evaluated by a group of sixteen students, and the main results are presented and discussed. |
11:00 | An Immersive Labeling Method for Large Point Clouds (journal track) PRESENTER: Tianfang Lin ABSTRACT. 3D point clouds, such as those produced by 3D scanners, often require labeling (the accurate classification of each point into structural or semantic categories) before they can be used in their intended application. However, in the absence of fully automated methods, such labeling must be performed manually, which can prove extremely time- and labour-intensive. To address this, we present a virtual reality tool for accelerating and improving the manual labeling of very large 3D point clouds. The labeling tool provides a variety of 3D interactions for efficient viewing, selection and labeling of points using the controllers of consumer VR kits. The main contribution of our work is a mixed CPU/GPU-based data structure that supports rendering, selection and labeling with immediate visual feedback at the high frame rates necessary for a convenient VR experience. Our mixed CPU/GPU data structure supports fluid interaction with very large point clouds in VR, which is not possible with existing continuous level-of-detail rendering algorithms. We evaluate our method with 25 users on tasks involving point clouds of up to 50 million points and find convincing results that support the case for VR-based point cloud labeling. |
11:20 | Development of a virtual fitting room integrating computer vision, artificial intelligence and virtual reality technologies ABSTRACT. This article presents a virtual fitting room system that improves customer experience by providing personalized image-based recommendations. The prototype takes a step beyond the state of the art by envisaging the creation of a customizable avatar based on the user's biometric characteristics and clothing style. Facial recognition models were implemented to predict gender and age, and segmentation and classification techniques were used to extract characteristics from the clothing the user is wearing. The work describes the progress and experiences in the development of some prototype modules and the possible methodologies that are being evaluated to develop a real-time web-based adaptation experience to represent the user's appearance and simulate a combination of multiple garments. |
11:40 | Nutritional Insight: Using OCR to Decode Food Labels for Better Health PRESENTER: Tiago Carvalho ABSTRACT. In the modern world, making healthy food choices is increasingly important due to the rise in food-related illnesses. Existing tools, such as Nutri-Score and comprehensive food labels, often pose challenges for many consumers. This paper proposes an application that uses image recognition technologies to read and interpret food labels, thus upgrading current solutions that rely mainly on reading product barcodes. By using advanced optical character recognition and machine learning techniques, the system aims to accurately extract and analyze nutritional information directly from food packaging without relying on a database of pre-registered products. This innovative approach not only increases consumer awareness but also supports personalized diet management for diseases such as diabetes and hypertension and promotes healthier eating habits and better health outcomes. Two minimalist functional prototypes were developed as a result of this work: a desktop application and a mobile application. |
12:00 | Spatial navigation concepts based on pose estimation for a VR-CAVE setting ABSTRACT. This paper presents a technical proof-of-concept for utilizing cost-effective, computer vision-based pose estimation to enable natural and virtual navigation in VR-CAVEs. Utilizing MediaPipe for pose estimation we examined key challenges in applying pose data to navigation concepts. To address issues such as low transmission rate of pose data, joint position fluctuations, and difficulties in capturing fast movements, we implemented a processing pipeline that normalizes extremity lengths, filters pose data with a Kalman filter and classifies hand gestures using minimal training data. We developed a prototype (LCVR-CAVE) to test natural navigation through user-centred perspective adjustments based on filtered head positions, and virtual navigation concepts using gesture-based controls and 3D arm raycasts. The virtual navigation methods include teleportation, panning, flying, and view rotation, activated through specific hand gestures and movement patterns. An initial evaluation indicated that while natural navigation provided an immersive experience, the effectiveness of virtual navigation concepts varied depending on the implemented navigation approach, with low pose sampling rates having the most significant impact. |
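To make the filtering step mentioned in this abstract more concrete, the following is a minimal sketch of smoothing a jittery joint position with a Kalman filter. It assumes a simple scalar random-walk model applied independently to each coordinate; the abstract does not state the filter design the authors actually used, and the names `ScalarKalman` and `smooth_joint`, as well as the variance values, are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model.

    process_var and measurement_var are illustrative tuning values,
    not parameters reported in the paper.
    """

    def __init__(self, process_var=1e-3, measurement_var=1e-2):
        self.q = process_var       # how much the true value may drift per step
        self.r = measurement_var   # how noisy each measurement is
        self.x = None              # current state estimate
        self.p = 1.0               # current estimate variance

    def update(self, z):
        if self.x is None:
            # Initialise the estimate with the first measurement.
            self.x = z
            return self.x
        # Predict: the state is assumed unchanged, uncertainty grows.
        self.p += self.q
        # Update: blend prediction and measurement using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


def smooth_joint(track, **filter_kwargs):
    """Filter a sequence of (x, y, z) joint positions coordinate-wise."""
    filters = None
    smoothed = []
    for point in track:
        if filters is None:
            filters = [ScalarKalman(**filter_kwargs) for _ in point]
        smoothed.append(tuple(f.update(v) for f, v in zip(filters, point)))
    return smoothed


if __name__ == "__main__":
    # A joint that jitters between two positions on every frame.
    noisy = [(0.0, 1.0, 2.0) if i % 2 == 0 else (1.0, 1.0, 2.0) for i in range(60)]
    for point in smooth_joint(noisy)[-3:]:
        print(point)
```

Lowering `process_var` relative to `measurement_var` gives smoother output but more lag, which matters when fast movements must be captured at a low pose sampling rate.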
15:30 | Supporting motion-capture acting with collaborative Mixed Reality (journal track) PRESENTER: Alberto Cannavò ABSTRACT. Technologies such as chroma-key, LED walls, motion capture (mocap), 3D visual storyboards, and simulcams are revolutionizing how films featuring visual effects are produced. Despite their popularity, these technologies have introduced new challenges for actors. An increased workload is faced when digital characters are animated via mocap, since actors are requested to use their imagination to envision what characters see and do on set. This work investigates how Mixed Reality (MR) technology can support actors during mocap sessions by presenting a collaborative MR system named CoMR-MoCap, which allows actors to rehearse scenes by overlaying digital contents onto the real set. Using a Video See-Through Head Mounted Display (VST-HMD), actors can see digital representations of performers in mocap suits and digital scene contents in real time. The system supports collaboration, enabling multiple actors to wear both mocap suits to animate digital characters and VST-HMDs to visualize the digital contents. A user study involving 24 participants compared CoMR-MoCap to the traditional method using physical props and visual cues. The results showed that CoMR-MoCap significantly improved actors' ability to position themselves and direct their gaze, and it offered advantages in terms of usability, spatial and social presence, embodiment, and perceived effectiveness over the traditional method. |
15:50 | BraveHearts AR – A Mobile Game to Reduce Fear in Pediatric Surgery ABSTRACT. The BraveHearts AR project aims to integrate Augmented Reality (AR) with Child Centered Play Therapy, a well-established strategy for mitigating emotions like anxiety, fear and stress, tailored to the specific needs and characteristics of children, the target audience. The development of a game using AR therefore appeared to be an attractive solution to explore, serving as a tool for pediatric patients to reduce negative emotions. More specifically, the game provides users with positive experiences of relaxation, fun and companionship while educating them about the pre-surgical procedures that they will undergo. To develop the best solution possible, research on related subjects was conducted. A prototype of the game was developed in Unity; it is designed for mobile devices and features an intuitive interface. The game comprises an introduction, a memory game focused on the playful aspect, minigames that emphasize learning about three pre-surgical procedures, a mechanism planned to keep players engaged, and an ending. To enhance the clarity of the medical procedures and to prioritize a positive experience, a narrative was carefully written for the game, involving a buddy character that accompanies the player along the way. Throughout the game, the buddy introduces new characters who interact with the player while explaining medical procedures using simple game mechanics. The ambition of developing an AR game capable of transmitting information and educating its players about surgical procedures was fulfilled. Evaluations of the game’s usability were conducted, and the encouraging results obtained so far motivate further testing with the target audience in appropriate settings, i.e., with children who are hospitalized for surgery, to assess the game's capability to reduce the aforementioned negative emotions. |
16:05 | Augmented Furniture: Enhancing Online Shopping with AR and 3D Visualization ABSTRACT. This paper explores the integration of Augmented Reality (AR) and a 3D product viewer to enhance the furniture buying experience in an e-commerce setting. Utilizing ARCore and ARKit alongside Three.js, the project focused on usability and user experience (UX) through a thoughtful design approach. The prototype achieved positive usability testing outcomes by overcoming challenges such as mobile optimisation. This work contributes to advancing e-commerce for furniture stores through immersive technologies and emphasizes the importance of prioritizing UX. Future directions may involve further usability enhancements and scalability across devices. |