CASA 2025: COMPUTER ANIMATION AND SOCIAL AGENTS 2025
PROGRAM

Days: Monday, June 2nd; Tuesday, June 3rd; Wednesday, June 4th

Monday, June 2nd

14:00-15:00 Session 2: How to train large scale 3D human and object foundation models

How to train large scale 3D human and object foundation models by Prof. Gerard Pons-Moll (University of Tübingen, Tübingen AI Center MPII)

Understanding 3D humans interacting with the world has been a long-standing goal in AI and computer vision. A lack of 3D data has been the major barrier to progress. This is changing with the increasing number of 3D datasets featuring images, videos, and multi-view data with 3D annotations, as well as large-scale image foundation models. However, learning models from such sources is non-trivial. Two of the challenges are: 1) datasets are annotated with different 3D skeleton formats and outputs, and 2) image foundation models are 2D, and extracting 3D information from them is hard. I will present solutions to each of these two challenges. I will introduce a universal training procedure to consume any skeleton format, a diffusion-based method tailored to lift foundation models to 3D (humans and also general objects), and a mechanism to probe 3D foundation model features for geometry and texture awareness based on 3D Gaussian splatting reconstruction. I will also show a method to systematically create 3D human benchmarks on demand for evaluation (STAGE).

Location: Amphie 23
15:05-16:20 Session 3A: Fluid & Physical Simulation I

CAVW

Location: Amphie 23
15:05
An Adaptive Boundary Material Point Method with Surface Particle Reconstruction (abstract)
PRESENTER: Haokai Zeng
15:30
Going further with Vertex Block Descent (abstract)
PRESENTER: Bastien Saillant
15:55
Simulation of Ball Levitation with SPH
PRESENTER: Sun-Lay Gagneux
15:05-16:20 Session 3B: AI in Education & Interfaces

CAVW

Location: Amphie 24
15:05
A Retrieval-Augmented Generation System for Accurate and Contextual Historical Analysis: AI-Agent for the Annals of the Joseon Dynasty (abstract)
PRESENTER: Jeongha Lee
15:30
AIKII: An AI-enhanced Knowledge Interactive Interface for Knowledge Representation in Educational Games (abstract)
PRESENTER: Dake Liu
15:55
Toward Fluoroscopy Guided Robotic Needle Insertion for Radio Frequency Ablation (abstract)
PRESENTER: Thuc Long Ha
16:35-18:20 Session 4A: Geometry, Rendering & Mesh Processing

CAVW

Location: Amphie 23
16:35
Fuzzy Sampling with Qualified Uniformity Properties for Implicitly Defined Curves and Surfaces (abstract)
PRESENTER: Mingxiao Hu
17:00
A robust 3D mesh segmentation algorithm with anisotropic sparse embedding (abstract)
PRESENTER: Mengyao Zhang
17:25
ReDACT: Reconstructing Detailed Avatar with Controllable Texture
PRESENTER: Zezheng Chen
17:50
A Real-time Virtual-Real Fusion Rendering Framework in Cloud-Edge Environments (abstract)
PRESENTER: Yuxi Zhou
16:35-18:20 Session 4B: Conversational Agents & Virtual Reality

CAVW

Location: Amphie 24
16:35
Talk with Socrates: Relation Between Perceived Agent Personality and User Personality in LLM-based Natural Language Dialogue Using Virtual Reality (abstract)
PRESENTER: Mehmet Efe Sak
16:58
Path Modeling of Visual Attention, User Perceptions, and Behavior Change Intentions in Conversations with Embodied Agents in VR (abstract)
17:21
MemorIA, an Architecture for Creating Interactive AI Historical Agents in Educational Contexts (abstract)
PRESENTER: Antoine Oger
17:44
Exploring the Impact of Multimodal Long Conversations in VR on Attitudes Towards Behavior Change, Memory Retention, and Cognitive Load (abstract)
PRESENTER: Sagar Vankit
18:07
User Interface for Controlling Crowd in Metaverse Using Spatial Controller (abstract)
PRESENTER: Masaki Oshita
Tuesday, June 3rd

08:30-09:30 Session 5: Keynote: Xubo Yang

Harmonized XR: Seamlessly Bridging Physical and Perceptual Realism by Prof. Xubo Yang (Shanghai Jiao Tong University)

Extended Reality (XR) represents a spectrum of immersive technologies that seamlessly blend the digital and physical worlds, creating environments where users can interact with virtual content as if it were part of their reality. This keynote synthesizes cutting-edge research across visual perception, physical simulation, and interactive rendering to explore how XR can achieve both physical realism (accurate representation of physical phenomena) and perceptual realism (alignment with human visual and sensory perception).

We begin by addressing the challenges of visual fidelity in XR through innovative techniques that enhance occlusion, color accuracy, and rendering efficiency, ensuring that virtual content aligns seamlessly with human perception. Next, we delve into advancements in simulation methodologies that bring unprecedented physical accuracy to virtual environments, enabling the realistic representation of complex phenomena such as fluids, bubbles, and surface tension effects. Finally, we explore interactive experiences that bridge the gap between physical and perceptual realism by optimizing virtual interactions to align with natural human behavior and visual focus.

By integrating these advancements, XR can achieve a harmonious balance between physical and perceptual realism, creating immersive environments that are not only computationally efficient but also deeply engaging and believable. This keynote will highlight the interplay between these dimensions, offering a comprehensive roadmap for the future of XR technologies.

Location: Amphie 23
09:35-10:50 Session 6A: 3D Face and Talking Head Modeling

3 CAVW 1 LNCS

Location: Amphie 23
09:35
Talking Face Generation with Lip and Identity Priors (abstract)
PRESENTER: Jiajie Wu
09:53
Speech-Driven 3D Facial Animation with Regional Attention for Style Capture
PRESENTER: Jiahao Pan
10:11
Coarse-to-Fine 3D Craniofacial Landmark Detection via Heat Kernel Optimization (abstract)
PRESENTER: Xingfei Xue
10:29
GSFaceMorpher: High-Fidelity 3D Face Morphing via Gaussian Splatting (abstract)
PRESENTER: Xiwen Shi
09:35-10:50 Session 6B: Cultural Heritage & Artistic Generation

CAVW

Location: Amphie 24
09:35
Chinese Painting Generation with A Stroke-by-Stroke Renderer and a Semantic Loss (abstract)
PRESENTER: Yuan Ma
10:00
Research on Multi-Feature Fusion Shadow Puppet Motifs Generation Based on CSPMotifsGAN and Cultural Heritage Preservation (abstract)
PRESENTER: Rui Wang
10:25
CLPFusion: A Latent Diffusion Model Framework for Realistic Chinese Landscape Painting Style Transfer (abstract)
PRESENTER: Jiahui Pan
11:05-12:20 Session 7A: Fluid & Physical Simulation II

CAVW

Location: Amphie 23
11:05
Decoupling Density Dynamics: A Neural Operator Framework for Adaptive Multi-Fluid Interactions (abstract)
PRESENTER: Yuhang Xu
11:30
A Control Simulation of Multiple Bubbles for Representing Desired Shapes (abstract)
PRESENTER: Syuhei Sato
11:55
A versatile energy-based SPH surface tension with spatial gradients (abstract)
PRESENTER: Qianwei Wang
11:05-12:20 Session 7B: Human Behavior and Animation in Virtual and Mixed Reality Environments
Location: Amphie 24
11:05
Virtual Guides and Crowd Behaviors: Understanding Evacuation Decision-Making in Virtual Reality
PRESENTER: Ziyuan Feng
11:23
BACH: Bi-stage Data-driven Piano Performance Animation for Controllable Hand motion (abstract)
PRESENTER: Jihui Jiao
11:41
Risk-Aware Pedestrian Behavior Using Reinforcement Learning in Mixed Traffic (abstract)
PRESENTER: Tzu-Yu Chen
11:59
Improving Fidelity of Close Social Interaction Animations in Social VR with a Machine Learning-based Refinement Framework
PRESENTER: Roberta Macaluso
14:00-15:40 Session 8: AniNex Workshop I: Immersive Media, Culture & Education

4 CAVW 1 LNCS

Location: Amphie 23
14:00
Scene-EEGCNN: Visualization of Zen Meditation Experience Based on EEG-Cultural Heritage Integration (abstract)
PRESENTER: Longfei Yang
14:20
Exploring the Therapeutic Potential of VR-Based ASMR Animation: A Comparative Study on Relaxation and Sleep Aid (abstract)
PRESENTER: Jiahao Du
14:40
Immersion Discrepancies in Educational Serious Games Among Children's Age Groups (abstract)
PRESENTER: Yukun Li
15:00
Immersive Video Game Experience through Naturalistic and Emotive Dialogue Agent (abstract)
PRESENTER: Michael Adjeisah
15:15
Photorealistic 3D Head Reconstruction via 2D Gaussians (abstract)
PRESENTER: Anil Bas
16:00-18:00 Session 9: AniNex Workshop II: Simulation, Interaction & Visual Understanding
Location: Amphie 23
16:00
Peridynamics-Based Simulation of Viscoelastic Solids and Granular Materials (abstract)
PRESENTER: Jiamin Wang
16:20
Automating Visual Narratives: Learning Cinematic Camera Perspectives from 3D Human Interaction (abstract)
PRESENTER: Boyuan Cheng
16:40
Intelligent Compilation System for Chinese Character Animation Based on Dynamic Data Sets (abstract)
PRESENTER: Xin Luo
17:00
Unsupervised Salient Object Detection with Pseudo-Labels Refinement (abstract)
PRESENTER: Hao Liu
17:20
Using Large Language Models for Evaluation of Radiological Textual Reports (abstract)
17:32
AssetMask: Mask R-CNN-based approach for Asset detection in railroad track health monitoring (abstract)
PRESENTER: Aradhya Saini
17:44
LLM-Powered VR Nursing Training for Dynamic Risk Assessment (abstract)
PRESENTER: Ehtzaz Chaudhry
Wednesday, June 4th

08:30-09:30 Session 10: Keynote: Jehee Lee

Generative GaitNet and Beyond: Foundational Models for Human Motion Analysis and Simulation by Prof. Jehee Lee (Seoul National University)

Understanding the relationship between human anatomy and motion is fundamental to effective gait analysis, realistic motion simulation, and the creation of human body digital twins. We will begin with Generative GaitNet (SIGGRAPH 2022), a foundational model for human gait that drives a comprehensive full-body musculoskeletal system comprising 304 Hill-type musculotendons. Generative GaitNet is a pre-trained, integrated system of artificial neural networks that operates in a 618-dimensional continuous space defined by anatomical factors (e.g., mass distribution, body proportions, bone deformities, and muscle deficits) and gait parameters (e.g., stride and cadence). Given specific anatomy and gait conditions, the model generates corresponding gait cycles via real-time physics-based simulation. Next, we will discuss Bidirectional GaitNet (SIGGRAPH 2023), which consists of forward and backward models. The forward model predicts the gait pattern of an individual based on their physical characteristics, while the backward model infers physical conditions from observed gait patterns. Finally, we will present MAGNET (Muscle Activation Generation Networks)—another foundational model (SIGGRAPH 2025)—designed to reconstruct full-body muscle activations across a wide range of human motions. We will demonstrate its ability to accurately predict muscle activations from motions captured in video footage. We will conclude by discussing how these foundational models collectively contribute to the development of human body digital twins, and explore their future potential in personalized rehabilitation, surgery planning, and human-centered simulation.

Location: Amphie 23
09:35-10:50 Session 11A: AR/VR for Interaction
Location: Amphie 23
09:35
Perspective Matters: Investigating the Effects of Vibrotactile Mode Design on User Experience in Action-Role Playing Game and Media
PRESENTER: Hongyu Liu
09:53
Exploring Cultural Heritage with AR: The TAM Case Study of Nvshu
PRESENTER: Yejuan Xie
10:11
A Design Study on Contextual and Interactive Serious Games for Children’s Learning of Chinese Character Culture
PRESENTER: Xu Lang
10:29
Summon Arcane: An AI-Driven Pixel Art Game with Interactive Narrative and Immersive Summoning Experience
PRESENTER: Haoxiang Yang
09:35-10:50 Session 11B: Detection & Recognition
Location: Amphie 24
09:35
YOLOv8-HAC: Safety helmet detection model for complex underground coal mine scene (abstract)
PRESENTER: Rui Liu
10:00
STA-TAD: Spatial-Temporal Adapter on ViT for Temporal Action Detection
PRESENTER: Tingwei Wu
10:25
AU-guided Feature Aggregation for Micro-Expression Recognition (abstract)
PRESENTER: Weiqi Xu
11:05-12:20 Session 12A: Cross-Modal and Semantic Representation Learning

4 LNCS

Location: Amphie 23
11:05
Potential Representation Learning for Visible-Infrared Person Re-Identification in Virtual Surveillance Systems
PRESENTER: Haoyuan Du
11:30
Hybrid-Granularity Image-Music Retrieval Using Contrastive Learning between Images and Music
PRESENTER: Xudong He
11:55
Text-driven Tree Modeling via CLIP-based Optimization
PRESENTER: Yudai Ichimura
11:05-12:20 Session 12B: Image Restoration & Enhancement

2 CAVW 2 LNCS

Location: Amphie 24
11:05
UTMCR: 3U-Net Transformer with Multi-Contrastive Regularization for Single Image Dehazing (abstract)
PRESENTER: Hangbin Xu
11:23
SCNet: A Dual-Branch Network for Strong Noisy Image Denoising Based on Swin Transformer and ConvNeXt (abstract)
PRESENTER: Chuchao Lin
11:41
ShadowCraft-NeRF: Occlusion and Shadow Mitigation via SAM-Guided NeRF
PRESENTER: Xun Chen
11:59
Visualizing the Invisible: An Efficient Framework for Microscopic Visualization (abstract)
PRESENTER: Haoran Jia
14:00-15:40 Session 13A: Human Motion & Gesture Synthesis

CAVW

Location: Amphie 23
14:00
RIDGE: Rule-Infused Deep Learning for Realistic Co-Speech Gesture Generation (abstract)
PRESENTER: Ghazanfar Ali
14:20
Precise Motion Inbetweening via Bidirectional Autoregressive Diffusion Models (abstract)
PRESENTER: Jiawen Peng
14:40
Motion In-betweening via Recursive Keyframe Prediction (abstract)
PRESENTER: Rui Zeng
15:00
Interaction with Virtual Objects using Human Pose and Shape Estimation (abstract)
PRESENTER: Hong Son Nguyen
15:20
Motion Style Transfer: Methods, Challenges, and Future Directions
PRESENTER: Siyao Du
14:00-15:40 Session 13B: 3D Reconstruction & Representation

CAVW

Location: Amphie 24
14:00
LGNet: Local-and-Global Feature Adaptive Network for Single Image Two-Hand Reconstruction (abstract)
PRESENTER: Haowei Xue
14:25
Joint-learning: A Robust Segmentation Method for 3D Point Clouds under Label Noise (abstract)
PRESENTER: Tingyun Miao
14:50
Weisfeiler-Lehman kernel augmented product representation for queries on large-scale BIM scenes (abstract)
PRESENTER: Xiaojun Liu
15:15
DTGS: Defocus-Tolerant View Synthesis using Gaussian Splatting (abstract)
PRESENTER: Xinying Dai