I3DA2025: IMMERSIVE AND 3D AUDIO: FROM ARCHITECTURE TO AUTOMOTIVE 2025
PROGRAM FOR THURSDAY, SEPTEMBER 11TH

09:00-09:45 Session Plenary Lecture 2: 'How immersive audio advances humanities research' - Jonathan Berger (30 min + 15 min Q&A)

Jonathan Berger is the Denning Family Provostial Professor in Music at Stanford University. Berger is a composer of a wide range of genres including opera, orchestral, chamber, and electroacoustic music. He is also an active researcher, with expertise in computational music theory, music perception and cognition, psychoacoustics, and sonification. He has published over 70 academic articles in a wide variety of fields relating to music, science, and technology, including relevant work in digital audio processing in Neuron, Frontiers in Psychology, and the Journal of the Audio Engineering Society. Among his awards and commissions are the Guggenheim Fellowship, the Rome Prize, fellowships from the National Endowment for the Arts, and commissions from Lincoln Center Chamber Music Society, the 92nd Street Y, The Spoleto Festival, the Kronos Quartet, and others. Berger is the Principal Investigator of a major grant from the Templeton Religion Trust’s Art Seeking Understanding initiative to study the interplay of architectural acoustics and musical and ritual sound.

Chair:
Cobi van Tonder (University of Bologna, Italy)
09:45-10:45 Session S2: Advancements in Acoustic Studies, 3D Acoustic Measurements and Simulations for Auditoria and Concert Halls
Chairs:
Edoardo Alessio Piana (University of Brescia, Italy)
Luca Battisti (Università di Bologna, Italy)
09:45
Amelia Trematerra (Università della Campania Luigi Vanvitelli, Italy)
Silvana Sukaj (Department of Engineering and Architecture, European University of Tirana (UET), Tirana, Albania)
Giovanni Amadasi (SCS-ControlSys—Vibro-Acoustic, 35011 Padova, Italy)
#50 - Sound absorption measurements of air-filled plastic balloons

ABSTRACT. This paper reports sound absorption measurements of air-filled balloons. Balloons can be used for the simple and economical acoustic correction of rooms in which, for aesthetic or functional reasons, it is not possible to permanently install sound-absorbing material, or where an acoustic correction is needed at short notice. The plastic balloons are of the type used for children's parties, with an average diameter of 10 cm. The acoustic measurements were carried out in an empty room with plaster walls and a suitably long reverberation time. The room was then filled with balloons so as to cover the floor surface. The sound absorption coefficient was obtained from the difference between the reverberation times measured in the empty room and in the room with balloons, and a notable absorption at low frequencies was observed. The absorption coefficient of the balloons was also measured in an impedance tube (Kundt's tube), evaluating the effects of balloons arranged in series and of double balloons, one inserted into the other. This second configuration yields the maximum value of the acoustic absorption coefficient.
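
The reverberation-time difference method described above follows the classic reverberation-room procedure: Sabine's formula gives the equivalent absorption area before and after the treatment, and their difference per unit sample area gives the absorption coefficient. A minimal sketch with illustrative numbers (the room volume, times, and areas below are not from the paper):

```python
# Reverberation-room method: absorption coefficient from two RT measurements.
# All numeric values here are illustrative, not those of the paper.

def equivalent_absorption_area(volume_m3, rt60_s):
    """Sabine: A = 0.161 * V / T, in m^2 of equivalent absorption."""
    return 0.161 * volume_m3 / rt60_s

def absorption_coefficient(volume_m3, rt_empty_s, rt_treated_s, sample_area_m2):
    """Absorption coefficient of the added material from the RT difference."""
    a_empty = equivalent_absorption_area(volume_m3, rt_empty_s)
    a_treated = equivalent_absorption_area(volume_m3, rt_treated_s)
    return (a_treated - a_empty) / sample_area_m2

# Example: a 200 m^3 room whose RT drops from 4.0 s to 2.0 s after
# covering 50 m^2 of floor with balloons.
alpha = absorption_coefficient(200.0, 4.0, 2.0, 50.0)
```

In practice the computation is repeated per frequency band, which is how the low-frequency behaviour noted in the abstract would show up.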

10:00
Octávio Inácio (InAcoustics, Lda., Portugal)
Filipe Martins (InAcoustics, Lda., Portugal)
André McDade (InAcoustics, Lda., Portugal)
Daniel José (InAcoustics, Lda., Portugal)
#86 - ACOUSTICAL DESIGN OF THE SÃO CARLOS NATIONAL THEATRE RESTORATION AND MODERNIZATION PROJECT

ABSTRACT. Following an acoustic measurement campaign carried out in 2022 on the pre-refurbishment conditions of the Teatro Nacional de São Carlos in Lisbon, the construction works for its restoration and modernization have recently started. This paper describes the acoustic design process and the challenges faced in preserving the authenticity and most of the original characteristics of this 1793 opera house while still allowing improvements to be made. Although some modifications were introduced in 1877, 1936-40 and 1992-93, the current intervention is the most profound ever made to this building, addressing a large spectrum of areas ranging from the restoration of original decorations, fire safety and ventilation improvements, to the construction of new rehearsal rooms for the choir and orchestra. The main modifications include the redesign of part of the building to incorporate new, modern rooms for staff and musicians, while in the main opera hall the interventions are subtle but improve comfort and acoustic conditions in the orchestra pit, on stage and in the audience area.

10:15
Edoardo Alessio Piana (University of Brescia, Italy)
Jorge Joaquin Garcia (University of Brescia, Italy)
Diego Tonetti (University of Brescia, Italy)
#88 - ACOUSTIC MODEL OF THE OLD CATHEDRAL OF BRESCIA

ABSTRACT. This research studies the acoustics of the Duomo Vecchio in Brescia both through experimental measurements, which verify the laws of architectural acoustics in situ, and through numerical modelling. The goal is therefore not only the complete acoustic characterisation of the environment but also the evaluation of the calculation algorithm applied to the three-dimensional model. To this end, the basic acoustic parameters are calculated in both ways, so as to describe the acoustic quality and compare the data obtained with the two methods.

10:30
Gino Iannace (Università della Campania Luigi Vanvitelli, Italy)
#68 - De Simone Theatre in the City of Benevento: acoustic measurements

ABSTRACT. The "De Simone" Theatre in the City of Benevento was located inside a religious school and was used for school events. The theatre had wooden seats and was built at the beginning of the 1900s. The building's plan is rectangular, narrow and long, with a single central row and flat, parallel side walls. In the 1990s the entire building was acquired by the Municipality of Benevento and the theatre was renovated, with the construction of a new stage and fabric seats. Today the theatre is used for musical events and conferences. This paper reports a series of acoustic measurements performed inside the theatre in accordance with the ISO 3382 standard.

11:30-13:00 Session S7: MEMORIAL SESSION FOR ANGELO FARINA

In Memoriam: Professor Angelo Farina (1958–2025)

Professor Angelo Farina passed away in March 2025. Prof. Farina was the Chair of the Scientific Committee of I3DA.  A beloved mentor, colleague, and pioneer in the field of acoustics, Professor Farina shaped generations of researchers through his innovative work and inspiring teaching. His groundbreaking contributions spanned many areas of acoustics, including immersive audio, room acoustics, and underwater acoustics. He will be missed dearly by colleagues, students, and friends around the world.

To honor his life, legacy, and outstanding scientific achievements, this memorial session is dedicated to Professor Farina.

The session focuses on the fields where Professor Farina left a lasting impact, including but not limited to: room and architectural acoustics; acoustic measurements and impulse response analysis; auralization and binaural rendering; underwater acoustics; signal processing in audio and acoustics; acoustic modeling and simulation.

Chairs:
Lamberto Tronchin (DA - CIARM, Italy)
Antonella Bevilacqua (University of Parma, Italy)
11:30
Adriano Farina (University of Bologna, Italy)
Angelo Farina (University of Parma, Italy)
Lamberto Tronchin (DA - CIARM, Italy)
#8 - DPA4560 vs Meta Rayban: a binaural comparison

ABSTRACT. Rayban | Meta are the second generation of smart glasses developed by Meta and Luxottica. They are one of the first mass-market all-in-one consumer devices allowing users to record and reproduce sounds binaurally. Traditionally, binaural recording systems use two microphones, one in each ear canal, belonging either to a person or to a dummy head. In both cases, the incoming sound reflects on the body, shoulders, and ear pinnae, thus physically encoding several binaural cues. Rayban | Meta, instead, rely on a 5-microphone array, none of which enters the ear canal; the signals are therefore devoid of the information encoded by the pinnae. The binaural signal is obtained through a beamforming algorithm, about which nothing has been published in the literature. For this reason, we evaluated the quality of the binaural signals through impulse response measurements. Wearing a pair of Rayban | Meta and a set of DPA4560 binaural microphones, we used the exponential sine sweep method, sampling every 10°. Using the Aurora plugins, we obtained values for IACC (Inter Aural Cross Correlation), ITD (Interaural Time Difference) and ILD (Interaural Level Difference). As frequency response tests, especially regarding sound reproduction, are widely available, we focused on the binaural parameters only.
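
The exponential sine sweep (ESS) method mentioned above is the impulse response measurement technique popularised by Angelo Farina himself: a sweep with exponentially rising frequency is played, and the recording is convolved with an amplitude-compensated, time-reversed copy of the sweep to recover the impulse response. A minimal sketch (parameters are illustrative, not those used in the paper):

```python
# Exponential sine sweep (ESS) generation and its inverse filter.
import numpy as np

def ess(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz over `duration` seconds."""
    t = np.arange(int(duration * fs)) / fs
    r = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * duration / r * (np.exp(t * r / duration) - 1))

def inverse_filter(sweep, f1, f2, duration, fs):
    """Time-reversed sweep with an envelope compensating its pink spectrum."""
    t = np.arange(len(sweep)) / fs
    r = np.log(f2 / f1)
    return sweep[::-1] * np.exp(-t * r / duration)

fs = 8000
sweep = ess(50.0, 3000.0, 1.0, fs)
inv = inverse_filter(sweep, 50.0, 3000.0, 1.0, fs)
# Deconvolving the sweep itself collapses it into a band-limited impulse
# located near the end of the sweep (index len(sweep) - 1).
ir = np.convolve(sweep, inv)
peak = int(np.argmax(np.abs(ir)))
```

In a real measurement the recorded response, not the sweep itself, is convolved with `inv`; binaural parameters such as ITD and ILD are then read off the left/right impulse responses.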

11:45
Giacomo Tentoni (CSA Group SpA, Italy)
Alessandro Martinetti (CSA Group SpA, Italy)
Luca Guardigli (University of Bologna, Italy)
Beatrice Turillazzi (University of Bologna, Italy)
Antonella Bevilacqua (University of Parma, Italy)
Adriano Farina (University of Bologna, Italy)
Luca Battisti (CIRI EC, University of Bologna, Italy)
Lamberto Tronchin (DA - CIARM, Italy)
#15 - Two soundscapes in comparison: Piazza Cavour in Rimini and Buckingham Palace Square in London

ABSTRACT. With the Agora project, research in acoustics aims to virtually reconstruct an immersive soundscape experience by combining 3D audio with a panoramic view. The audio station, itinerant across different parts of an urban environment, uses a multichannel microphone with 19 channels evenly distributed over the surface of a spherical array. The output is an audio signal with third-order Ambisonics (O3A) resolution, which renders sound directivity accurately enough to localize the different sound sources moving around the microphone. This paper presents a soundscape comparison between Piazza Cavour in Rimini, Italy, and Buckingham Palace Square in London, UK. These two public spaces are among the places most frequented by tourists for their architectural and landscape attractions. They share limited-traffic zones, accessible only to authorized vehicles, while visitors are free to explore the architecture that dominates the squares. The environmental parameters are also analysed in terms of loudness, sharpness, prominence and roughness, highlighting the differences between the two environments.

12:00
Jacopo Grassi (Department of Engineering, University of Ferrara, Italy)
Nicola Prodi (Department of Engineering, University of Ferrara, Italy)
Matteo Pellegatti (Department of Engineering, University of Ferrara, Italy)
Chiara Visentin (Department of Engineering, University of Ferrara, Italy)
#94 - Preliminary testing of minimum audible angles inside a novel Ambisonics test bench

ABSTRACT. Recently, a virtual reality test bench called the “Diamonds Chamber (DC)” was completed at the Department of Engineering, University of Ferrara, Italy. An accurate reproduction of the sound directional characteristics is crucial for the usability of virtual reality test benches and the evaluation of the minimum audible angle (MAA) is used as one of the measures to qualify such directional rendering capacity. This work describes the testing procedures adopted to accomplish preliminary measurements of azimuth and elevation MAA in the DC, with a method that differs in some respects from previous literature. MAA values for various azimuths lying on the horizontal plane and for 30° elevation are obtained by a 3-AFC adaptive procedure.
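
A 3-AFC adaptive procedure of the kind mentioned above typically pairs a forced-choice task with a transformed up-down rule. The sketch below uses a generic 2-down/1-up rule with a simulated listener; the paper's actual rule, step sizes, and stopping criterion are not specified here, so all of those are illustrative assumptions:

```python
# Generic 2-down/1-up adaptive staircase with a simulated 3-AFC listener.
import random

def run_staircase(true_maa_deg, start_deg=10.0, step_deg=1.0, n_reversals=8):
    """Track the angle where the simulated listener is ~70.7% correct."""
    random.seed(0)
    angle = start_deg
    correct_streak = 0
    last_direction = None
    reversal_angles = []
    while len(reversal_angles) < n_reversals:
        # Simulated 3-AFC listener: always correct above threshold,
        # guesses (1/3 chance of being right) below it.
        correct = angle >= true_maa_deg or random.random() < 1 / 3
        if correct:
            correct_streak += 1
            if correct_streak == 2:          # two correct -> smaller angle
                correct_streak = 0
                if last_direction == "up":
                    reversal_angles.append(angle)
                angle = max(angle - step_deg, 0.0)
                last_direction = "down"
        else:
            correct_streak = 0               # one wrong -> larger angle
            if last_direction == "down":
                reversal_angles.append(angle)
            angle += step_deg
            last_direction = "up"
    # Threshold estimate: mean of the last reversal angles.
    return sum(reversal_angles[-6:]) / len(reversal_angles[-6:])

maa_estimate = run_staircase(true_maa_deg=3.0)
```

With a simulated "true" MAA of 3°, the staircase settles and the reversal average lands near that value.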

12:15
Alberto Boem (University of Trento, Italy)
Samuele Mazzei (University of Trento, Italy)
Luca Turchet (University of Trento, Italy)
#101 - Spatial Audio for WebXR: Perceptual Evaluation of Sound Localization Technologies on the Browser (Online Presentation)

ABSTRACT. This study presents a comparative analysis of three open-source spatial audio systems for WebXR: Web Audio API's PannerNode with HRTF, PannerNode with equalpower, and Google Resonance Audio for A-Frame. We developed a WebXR application featuring a spherical array of 42 virtual speakers to evaluate sound localization accuracy across different directions and positions. In our within-subjects experiment, twelve participants identified the perceived source location of pink noise stimuli by selecting speakers within the virtual environment. Results showed that PannerNode with HRTF significantly outperformed both alternatives in correct target identification, while Google Resonance Audio produced significantly larger error distances. Performance varied by direction, with PannerNode-HRTF particularly effective for back-positioned sources, and position-specific advantages observed for different systems. Despite objective performance differences, participants reported no significant differences in cognitive workload or simulator sickness between implementations. These findings provide empirical evidence that spatial audio implementation significantly affects localization accuracy in WebXR without impacting user comfort. Our methodology establishes a framework for evaluating spatial audio in web-based immersive environments while addressing browser compatibility constraints. This research offers practical guidance for WebXR developers selecting appropriate audio technologies based on their application requirements.
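
The per-trial error in a localization test like this is typically the great-circle angle between the true and the selected speaker direction. A minimal sketch of that metric (how the paper aggregates errors is not specified here):

```python
# Great-circle angular error between two source directions (unit vectors).
import numpy as np

def angular_error_deg(true_dir, chosen_dir):
    """Angle between two unit vectors, in degrees."""
    cosang = np.clip(np.dot(true_dir, chosen_dir), -1.0, 1.0)
    return np.degrees(np.arccos(cosang))

front = np.array([1.0, 0.0, 0.0])
left = np.array([0.0, 1.0, 0.0])
err = angular_error_deg(front, left)   # orthogonal directions -> 90 degrees
```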

12:30
Luca Battisti (CIRI EC, University of Bologna, Italy)
Maria Cristina Tommasino (University of Bologna, Italy)
#69 - X-MCFX: Comparison of partitioning schemes in a non-equal partitioned multi-channel convolver

ABSTRACT. Convolution has become a widely exploited signal operation owing to its many applications in digital signal processing. In audio processing, convolution is used to impose a spectral and/or temporal structure onto a signal by convolving the sound signal with a Room Impulse Response (RIR): the acoustic footprint of a space can thus be completely transferred onto another sound signal. With a multichannel approach, convolution finds even wider application, one example being the rendering of Ambisonics recordings through a multi-channel convolver. A similar concept applies to the mixing phase of audio post-production, where direction-based audio objects are converted to Ambisonics for reproduction on comparable loudspeaker setups. This paper analyses an existing multichannel convolver algorithm. Its efficiency has been evaluated and its performance optimized in terms of computational cost. The results show that different partitioning schemes of the convolution filter can greatly change the computational cost in real-time, multichannel scenarios.
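
The baseline against which non-equal partitioning is usually compared is uniform partitioned convolution: the filter is split into equal blocks, each pre-transformed once, and every audio block is processed with a frequency-domain delay line instead of one huge FFT. A self-contained sketch of that uniform scheme (the paper's non-equal schemes mix block sizes to trade latency against cost):

```python
# Uniform partitioned overlap-save convolution (frequency-domain delay line).
import numpy as np

def partitioned_convolve(x, h, block=64):
    """Convolve signal x with filter h using equal-size filter partitions."""
    n_parts = -(-len(h) // block)                  # number of filter partitions
    H = [np.fft.rfft(h[i*block:(i+1)*block], 2*block) for i in range(n_parts)]
    fdl = [np.zeros(block + 1, dtype=complex) for _ in range(n_parts)]
    n_out = len(x) + len(h) - 1
    n_blocks = -(-n_out // block)
    x_pad = np.concatenate([x, np.zeros(n_blocks*block - len(x))])
    prev = np.zeros(block)
    out = []
    for i in range(n_blocks):
        cur = x_pad[i*block:(i+1)*block]
        X = np.fft.rfft(np.concatenate([prev, cur]))   # 2*block-point FFT
        prev = cur
        fdl = [X] + fdl[:-1]                           # push into the delay line
        acc = sum(Xd * Hd for Xd, Hd in zip(fdl, H))   # multiply-accumulate
        out.append(np.fft.irfft(acc)[block:])          # keep the valid half
    return np.concatenate(out)[:n_out]
```

Each output block costs one forward FFT, one inverse FFT, and `n_parts` complex multiplies, which is why the choice of partitioning scheme dominates the real-time cost.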

12:45
Filippo Fazi (University of Southampton, UK)
Investigating the Physical Limitations of Near-Field Source Encoding with Higher-Order Ambisonics

ABSTRACT. This contribution originates from an exchange with Prof. Angelo Farina and revisits the well-known challenges of encoding and reproducing near-field sources with Higher-Order Ambisonics (HOA). The mathematical explanation of these limitations will be briefly reviewed, but the main emphasis is on building an intuitive understanding of the underlying issues. Through illustrations and animations, the synthesis of plane waves and point sources will be demonstrated as the superposition of spherical standing waves. A key observation is that the energy of the HOA coefficients grows very rapidly with order, becoming unstable once the Ambisonic order N exceeds the ratio between source distance and wavelength, multiplied by 2π. Finally, the use of regularisation (much loved by Prof. Farina) will be briefly highlighted as a practical approach to address this instability issue.
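
The instability threshold quoted above (order N exceeding 2π times distance over wavelength, i.e. N > kr) can be illustrated numerically: the radial terms behind near-field encoding involve spherical Hankel functions, whose magnitude is dominated by the spherical Bessel function of the second kind y_n once n > kr. A scipy-free sketch using the standard upward recurrence (the choice of kr = 5 is illustrative):

```python
# Growth of the near-field radial terms with Ambisonic order n.
import math

def yn_sph(n, x):
    """Spherical Bessel function y_n(x) via (stable) upward recurrence."""
    y_prev = -math.cos(x) / x                       # y_0
    y_cur = -math.cos(x) / x**2 - math.sin(x) / x   # y_1
    if n == 0:
        return y_prev
    for k in range(1, n):
        y_prev, y_cur = y_cur, (2*k + 1) / x * y_cur - y_prev
    return y_cur

kr = 5.0   # e.g. 2*pi * (source distance / wavelength) ~ 5
growth = [abs(yn_sph(n, kr)) for n in range(13)]
# The terms stay O(1/kr) while n < kr, then grow explosively for n > kr:
# this is the instability threshold discussed in the abstract.
```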

14:00-15:30 Session S8: Spatialisation, Binaural Reproduction and Personalisation of Audio in Virtual, Augmented Reality

Technologies such as holography, head-mounted displays, full-dome immersive video projection, kinesthetic communication (haptic technology), transparent monitors, three-dimensional (3D) sound and electronic sensors facilitate sophisticated, interactive environments using augmented and virtual reality. The aim of this augmentation is participant immersion, the ultimate goal of an effective virtual or augmented experience. It is widely held that aurality constitutes an essential part of VR and AR, offering additional detail and a visceral sense to the immersive experience. Aurality encompasses the synthesis, spatialisation, and reception of sound in a virtual world. In this session, we welcome research papers on all the individual aspects of aurality and sound spatialisation, as well as novel engineering efforts and applications that bridge VR and AR development with immersive sound and auralisation.

Chairs:
Athanasios Malamos (Dept of Electrical and Computer Engineering, Hellenic Mediterranean University, Greece)
Luna Valentin (CCRMA, Stanford University, United States)
14:00
Pasquale Mainolfi ("G. Martucci" State Conservatory, Salerno (IT), Italy)
#25 - Real-Time Distance-Extended Binaural Auralization in Hybrid Acoustic Spaces

ABSTRACT. The study presents a model for real-time immersive binaural audio that integrates the management of direct sound and environmental reflections into a unified system, relying entirely on data acquired from real measurements. Specifically, it extends traditional Head-Related Impulse Responses (HRIR), which are typically defined in terms of elevation and azimuth, by introducing distance parameterization and developing a spectral hybridization technique for Room Impulse Responses (RIR). Striking a balance between scientific rigor and perceptual sensitivity, the model enhances spatial rendering through an adaptive algorithm that combines fractional delay, geometric attenuation, and dynamic low-pass filtering. This is achieved via a custom frequency-domain approach that interpolates frequency responses to simulate air absorption, implementing the ISO 9613-1:1993 physical model for atmospheric attenuation. In parallel, the weighted fusion of two or more RIRs in the spectral domain, achieved through nonlinear morphing, enables the creation of hybrid and unique acoustic environments. This additional expressive capability enriches the auditory scene while remaining consistent with the physics of sound, even if not necessarily occurring in nature. To ensure temporal coherence and signal quality in real time, the model employs a kernel blending technique using dynamic convolutional crossfades. This approach effectively eliminates artifacts caused by discontinuities arising from the continuous replacement of convolution kernels while maintaining computational efficiency.
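
The three distance cues named above (propagation delay, geometric attenuation, air-absorption filtering) can be sketched as a single per-source processing chain. This is a heavily simplified illustration: it uses nearest-sample rather than fractional delay, and a one-pole low-pass with made-up cutoff numbers in place of the full ISO 9613-1 atmospheric model the paper implements:

```python
# Simplified distance rendering: delay + 1/r gain + distance-dependent lowpass.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

def apply_distance(signal, distance_m, fs, ref_distance_m=1.0):
    # 1) propagation delay (nearest sample; the paper uses fractional delay)
    delay_samples = int(round(distance_m / SPEED_OF_SOUND * fs))
    delayed = np.concatenate([np.zeros(delay_samples), signal])
    # 2) inverse-distance gain, referenced to 1 m
    gain = ref_distance_m / max(distance_m, ref_distance_m)
    # 3) one-pole low-pass whose cutoff falls with distance (illustrative law)
    cutoff_hz = max(20000.0 / (1.0 + 0.05 * distance_m), 1000.0)
    alpha = np.exp(-2 * np.pi * cutoff_hz / fs)
    out = np.empty_like(delayed)
    state = 0.0
    for i, x in enumerate(delayed):
        state = (1 - alpha) * x + alpha * state
        out[i] = state
    return gain * out

fs = 16000
impulse = np.zeros(64)
impulse[0] = 1.0
near = apply_distance(impulse, 1.0, fs)
far = apply_distance(impulse, 10.0, fs)
```

The far source arrives later, quieter, and duller than the near one, which is the perceptual core of distance parameterization.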

14:15
Johannes Scherzer (spæs - lab for spatial aesthetics in sound Berlin, Germany)
#35 - Rethinking Immersive Sound Design: A Fourfold Model of Emergent Spatial Perception

ABSTRACT. This paper introduces a fourfold model of emergent spatial perception, addressing a gap in spatial audio discourse that often emphasizes localization, simulation fidelity, and acoustic realism while overlooking interpretive, affective, and symbolic dimensions of experience. Building on Sharma’s notion of Shared Perceptual Space, the model specifies how spatial meaning emerges through layered perceptual engagement. The model is organized as a quadrant defined by two intersecting perceptual axes: one contrasts the sensory with the real, the other contrasts the exophenomenal with the endophenomenal. Their intersection yields four perceptual fields: material, aesthetic, lived, and symbolic. These fields show how spatial meaning develops through sound, encompassing physical infrastructure, tonal presence, personal resonance, and cultural framing. Applied to immersive sound design, the model informs perceptual framing, compositional decision-making, and reflexive evaluation. Examples from audio drama and a museum installation demonstrate how immersion emerges differently across perceptual fields, showing that spatial experience is not format-dependent but emerges through perceptual engagement. Shifting attention from rendering techniques to perceptual dramaturgy, the model repositions sound as a medium through which space is composed, and the immersive sound designer as a scenographer of perception: a practitioner who curates spatial meaning through listening, inviting more interpretive and context-sensitive approaches to immersive audio.

14:30
Eito Murakami (CCRMA - Stanford University, United States)
Luna Valentin (CCRMA - Stanford University, United States)
Nima Farzaneh (CCRMA - Stanford University, United States)
Jonathan Berger (CCRMA - Stanford University, United States)
#26 - Ambisonic Virtual Acoustics Playback Toolkit

ABSTRACT. We present the Ambisonic Virtual Acoustics Playback Toolkit, an open-source framework for curating virtual reality scenes to support musicological and archeological studies that examine the effects of room acoustics. This toolkit utilizes Chunreal, the ChucK music programming language in Unreal Engine, and the Spatial Audio Framework (SAF) to compute real-time convolution of first-order ambisonic impulse responses with sound sources (both recorded and live). The toolkit includes features such as binaurally decoding first-order ambisonic signals with head (camera) tracking, dynamically loading 360-degree images and videos, and an interactive graphical user interface for building and loading scenes. The toolkit attempts to provide a streamlined workflow for deploying measured audiovisual assets to VR experiences without the need to directly use a game engine, while offering extensible APIs for developers who wish to customize the features inside the Unreal Engine project. In addition to presenting the functionalities and workflows of the toolkit with implications for both creative and research applications, we present two case studies that utilize the toolkit for presenting their fieldwork materials and conducting acoustic perception experiments.
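
Head-tracked binaural decoding of first-order Ambisonics, as described above, requires counter-rotating the sound field against the listener's head yaw before decoding. For rotation about the vertical axis only the X and Y components mix; W and Z are unaffected. A minimal sketch (the sign convention and component naming are assumptions, not taken from the toolkit):

```python
# Yaw rotation of a first-order Ambisonic (FOA) scene.
import numpy as np

def rotate_foa_yaw(w, x, y, z, yaw_rad):
    """Rotate an FOA scene about the vertical axis by yaw_rad (counter-
    clockwise scene rotation). W (omni) and Z (height) are unchanged."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    x_rot = c * x - s * y
    y_rot = s * x + c * y
    return w, x_rot, y_rot, z

# A frontal source (energy in X) rotated 90 degrees moves to the side (Y).
w, x, y, z = (np.array([1.0]), np.array([1.0]), np.array([0.0]), np.array([0.0]))
w2, x2, y2, z2 = rotate_foa_yaw(w, x, y, z, np.pi / 2)
```

For head tracking one rotates by the negative of the measured head yaw, then applies the static binaural decoder.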

14:45
Rai Sato (Korea Advanced Institute of Science and Technology, South Korea)
Sungyoung Kim (Korea Advanced Institute of Science and Technology, South Korea)
#40 - Perceptual Factors Influencing Listener Preferences in Head-tracked Binaural Renderers
PRESENTER: Rai Sato

ABSTRACT. This study investigated how perceptual factors shape preferences for head-tracked binaural renderers. Ten experienced listeners compared five commercial renderers using three music excerpts binauralized from 22.2-channel recordings. Structural Equation Modeling (SEM) of their ratings demonstrated a higher-order structure, unifying spatial attributes into a single dominant 'Integrated Spatial Quality' factor, distinct from factors for spectral fidelity and head-tracking discomfort. The model as a whole explained 79.2% of the variance in overall preference. Within this model, the Integrated Spatial Quality factor was the overwhelming predictor (beta = 0.750, p < .001). In contrast, spectral fidelity only trended towards significance (p=0.075), and head-tracking discomfort was not a significant predictor. These findings highlight the importance of cohesive spatial rendering over singular acoustic features in the optimal perceptual balance of future binaural systems.

15:00
Richard Foss (Rhodes University, South Africa)
#55 - A Binaural Capability to Mirror a Loudspeaker Configuration (Online Presentation)

ABSTRACT. A client-server immersive sound system has been created for live sound and installations. Rendering algorithms direct the aux sends of the mixer to enable sound source spatialisation across multiple speakers. Live sound engineers using the system required a binaural capability that would model the venue but enable multichannel spatialisation playback and recording away from the venue. This paper describes the virtual speaker design of this binaural capability and analyzes its operation. Aux bus speaker outputs, instead of being routed to speakers, are directed over USB to the server computer. Here they are mixed to binaural outputs, where delays, gains and filters are applied at the cross points of an N speaker x 2 matrix. The binaural outputs are sent to browser-based clients, where they are played out via Web Audio worklets. Reflections are incorporated by adding delayed samples from each speaker to the others and applying distance and absorption factors. Experimentation with various rendering algorithms, speaker configurations, head heights, head widths and reflection gain and delay factors is possible from the clients. The design will be contrasted with an impulse response/convolution approach both in terms of quality and calculation speed.
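
The N-speaker-by-2 matrix of delays and gains described above can be sketched compactly: each virtual speaker feed gets a per-ear delay (a crude ITD) and gain (a crude ILD) before summing into the stereo output. The geometry, head model, and constants below are illustrative placeholders, not the system's actual rendering parameters, and the real system also applies filters and reflections:

```python
# N x 2 virtual-speaker-to-binaural mix with per-crosspoint delay and gain.
import numpy as np

HEAD_RADIUS = 0.0875   # m, generic head model
C = 343.0              # m/s

def crosspoint(az_deg, ear_sign, fs):
    """Delay (samples) and gain for one (speaker, ear) matrix crosspoint.
    ear_sign: +1 for the ear on the +azimuth side, -1 for the other."""
    s = np.sin(np.radians(az_deg)) * ear_sign
    delay_s = HEAD_RADIUS / C * (1.0 - s)   # nearer ear -> shorter path
    gain = 0.5 * (1.0 + 0.6 * s)            # crude head-shadow ILD
    return int(round(delay_s * fs)), gain

def mix_to_binaural(speaker_feeds, speaker_azimuths_deg, fs):
    """Mix N speaker feeds to a stereo pair through an N x 2 matrix."""
    n = max(len(f) for f in speaker_feeds) + int(round(2*HEAD_RADIUS/C*fs)) + 1
    out = np.zeros((2, n))
    for feed, az in zip(speaker_feeds, speaker_azimuths_deg):
        for ch, ear_sign in ((0, +1), (1, -1)):   # ch 0 = left, +az = left side
            d, g = crosspoint(az, ear_sign, fs)
            out[ch, d:d+len(feed)] += g * np.asarray(feed)
    return out

fs = 48000
feed = np.zeros(10)
feed[0] = 1.0
out = mix_to_binaural([feed], [90.0], fs)   # one speaker hard left
```

A hard-left speaker arrives earlier and louder at the left ear, which is the behaviour the crosspoint delays and gains encode.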

15:15
Shurui Zhu (School of Electrical and Computer Engineering, The University of Sydney, Australia)
Minh Nguyen (School of Computer Science, University of Technology Sydney, Australia)
Matteo Mascelloni (School of Electrical and Computer Engineering, The University of Sydney, Australia)
Howe Zhu (School of Architecture, Design and Planning, The University of Sydney, Australia)
Raymond Chia (School of Computer Science, University of Technology Sydney, Australia)
Avinash Singh (School of Computer Science, University of Technology Sydney, Australia)
Chin-Teng Lin (School of Computer Science, University of Technology Sydney, Australia)
Craig Jin (School of Electrical and Computer Engineering, The University of Sydney, Australia)
#47 - Comparing Methods for Generating Binaural Room Impulse Responses for Auditory Navigation in Indoor AR (Online Presentation)
PRESENTER: Shurui Zhu

ABSTRACT. Auditory sensory augmentation has gained increasing attention as a way to enhance spatial awareness and facilitate navigation for individuals with blindness or low vision (BLV), by providing information that extends beyond the immediate reach of a white cane. Recent advances in real-time binaural rendering and wearable devices, such as augmented reality (AR) smart glasses, have opened up new possibilities for delivering spatial audio in mobile contexts. However, it remains unclear which binaural rendering strategies are most effective for auditory sensory augmentation, partly due to a lack of evaluation methods suited to embodied, ambulatory use. To address this, we conducted a controlled study comparing three binaural rendering methods using an auditory navigation task in a complex reverberant environment. The three tested methods include higher-order Ambisonics (HOA), measured Binaural Room Impulse Responses (BRIRs) using a head and torso simulator (HATS), and simulated BRIRs generated with a shoebox-room acoustic model (RoomSim), all based on non-individualized Head-Related Transfer Functions (HRTFs). Blindfolded participants (sample size N = 8) were asked to actively localize spatial sound sources and make informed decisions to follow pre-defined paths based on spatial awareness. We measured their navigational performance as an indicator of how effectively each rendering method supported real-time spatial perception and decision-making. Our results validated the feasibility of an active, ambulatory evaluation for spatial audio, and showed that our shoebox-based room acoustic simulation rendering supported slightly less accurate and efficient navigation performance than HOA-based rendering based on spherical microphone array recordings. Our findings carry important implications for the evaluation and design of spatial audio in auditory sensory augmentation.
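
A shoebox ("RoomSim"-style) simulation of the kind compared above builds the BRIR from image sources: the source is mirrored across the walls, and each image contributes a delayed, attenuated arrival. A minimal first-order sketch (positions are illustrative; the HRTF stage and higher reflection orders are omitted):

```python
# First-order image sources for an axis-aligned shoebox room.
import numpy as np

C = 343.0  # m/s

def first_order_images(src, room):
    """Reflect the source across each of the 6 walls of a shoebox with one
    corner at the origin. src and room are 3-vectors (metres)."""
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = np.array(src, dtype=float)
            img[axis] = 2 * wall - img[axis]
            images.append(img)
    return images

room = np.array([5.0, 4.0, 3.0])
src = np.array([1.0, 2.0, 1.5])
rcv = np.array([4.0, 2.0, 1.5])
delays_ms = sorted(1000 * np.linalg.norm(i - rcv) / C
                   for i in first_order_images(src, room))
```

Convolving each arrival with the HRTF for its incidence direction, and iterating to higher reflection orders, yields the simulated BRIR.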

15:30-16:45 Session S8b: Spatialisation, Binaural Reproduction and Personalisation of Audio in Virtual, Augmented Reality (continued)
Chairs:
Filippo Fazi (University of Southampton, UK)
Valeria Bruschi (UNIVPM, Italy)
15:30
Loris Grossi (Scuola universitaria professionale della Svizzera italiana, Switzerland)
Andrea Quattrini (Scuola universitaria professionale della Svizzera italiana, Switzerland)
Alberto Vancheri (Scuola universitaria professionale della Svizzera italiana, Switzerland)
Tiziano Leidi (Scuola universitaria professionale della Svizzera italiana, Switzerland)
Valeria Bruschi (Università Politecnica delle Marche, Italy)
Stefania Cecchi (Università Politecnica delle Marche, Italy)
#41 - Comparison of HRTF Interpolation Algorithms based on Neural Network

ABSTRACT. Head-related transfer functions (HRTFs) are used in immersive audio rendering applications. In many cases, these functions have to be calculated or measured at many relative positions between head and source, requiring a large amount of time and significant computational resources. The generation of HRTFs therefore becomes crucial, and interpolation procedures can solve this problem. An extensive analysis of an interpolation method based on a neural network is presented, taking into consideration the state of the art and using a real dataset filtered by a refinement procedure. The neural-network interpolation approach is compared with conventional nearest-neighbor interpolation methods. The investigation, based on the differences between interpolated and measured HRTFs, shows promising results for the proposed methodology, in line with traditional interpolation techniques.
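
The conventional nearest-neighbor baseline mentioned above simply returns the measured HRTF whose direction is closest on the sphere to the requested one. A minimal sketch (the measurement grid and the "HRTFs" here are synthetic placeholders):

```python
# Nearest-neighbour HRTF selection by great-circle distance.
import numpy as np

def sph_to_cart(az_deg, el_deg):
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.array([np.cos(el)*np.cos(az), np.cos(el)*np.sin(az), np.sin(el)])

def nearest_hrtf(az_deg, el_deg, measured_dirs, hrtfs):
    """measured_dirs: list of (az, el) pairs; hrtfs: parallel list of filters."""
    target = sph_to_cart(az_deg, el_deg)
    dots = [np.dot(target, sph_to_cart(a, e)) for a, e in measured_dirs]
    return hrtfs[int(np.argmax(dots))]   # max dot product = min angle

dirs = [(a, 0.0) for a in range(0, 360, 15)]   # coarse horizontal grid
fake_hrtfs = list(range(len(dirs)))            # index stands in for a filter
chosen = nearest_hrtf(37.0, 2.0, dirs, fake_hrtfs)
```

A learned interpolator is evaluated against exactly this kind of baseline by comparing both outputs with held-out measured HRTFs.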

15:45
Georgios Daskalakis (Dept of Electrical and Computer Engineering, Hellenic Mediterranean University, Greece)
Athanasios Malamos (Dept of Electrical and Computer Engineering, Hellenic Mediterranean University, Greece)
Don Brutzman (Naval Postgraduate School, Web3D Consortium, United States)
Eftychia Lakka (Dept of Electrical and Computer Engineering, Hellenic Mediterranean University, Greece)
#52 - A Novel Methodology For Sound Spatialisation And VR Acoustics In The Web

ABSTRACT. Spatial audio plays a crucial role in creating immersive experiences in environments like virtual reality (VR), augmented reality (AR), video games, and interactive installations. To make these experiences lifelike and engaging, it is essential for sound to align with visual and environmental cues. However, current methods for spatial audio rendering, especially in interactive applications, often fail to accurately reproduce how sound behaves in real-world spaces. Room acoustics simulations are fundamental for modeling how sound propagates in enclosed spaces and for calculating the Room Impulse Response (RIR) of a virtual space. The RIR of a space allows us to make anechoic sounds sound as if they were reproduced in that space. While techniques like ray tracing and sound field simulation have been developed to simulate sound propagation, their application in web environments has been challenging due to high computational costs. Real-time sound simulation requires significant processing power, especially in complex environments, and existing solutions are often too resource-intensive for web applications. The Web Audio API has enabled some spatial audio features on the web but lacks the capabilities for accurate room acoustics simulation. Efforts to incorporate spatial audio into 3D spaces (using X3D) have sought to overcome this limitation, but creating immersive experiences that perform well across a range of devices, particularly smartphones, remains difficult. This paper presents a novel Web3D implementation that generates Directional Room Impulse Responses (DRIR) for virtual spaces, simulating key acoustic properties like reflections and reverberation using ray tracing. By implementing this algorithm in WebGL for GPU acceleration, this approach achieves real-time performance even on low-end devices. The Web Audio API is then used for auralization, making this solution an efficient way to simulate realistic acoustics on the web without compromising performance.

16:00
Summer Krinsky (Stanford, United States)
#60 - Dynamic Spatial Sidechain for First-Order Ambisonics

ABSTRACT. Much of the clarity and perceived separation in contemporary stereo production can be attributed to dynamic sidechain compression and frequency-selective sidechain attenuation, which enable precise temporal shaping of the auditory scene. New techniques for expressing dynamic relationships across the expanded sound field of 3D audio remain underdeveloped, despite their proven importance in stereo mixing.

We present a novel, rotation-based dynamic spatial sidechain system for First-Order Ambisonics (FOA), enabling energy-driven shaping of directional relationships in immersive audio. Motivated by the time-varying interaction between control and target signals in sidechain compression, our system extends this mechanism into the spatial domain, treating directionality—rather than amplitude—as the primary control dimension.

Both main (target) and sidechain (control) FOA signals are decoded into virtual loudspeaker feeds. From these, we compute instantaneous spatial vectors using energy-weighted summation of loudspeaker directions, analogous to the summation stage in VBAP. By analyzing the angular difference between these vectors, we generate smoothed azimuth and elevation controls to drive a spatial rotator that dynamically steers the main signal relative to the sidechain.

Implemented in Max/MSP, this technique enables real-time, energy-responsive rotation of audio signals in 3D space, with potential applications in VR/AR, spatial music, and immersive media production.
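The energy-weighted summation stage described above can be sketched as follows; this is an illustrative reconstruction, not the authors' code, and the function and variable names are hypothetical:

```python
import math

def energy_vector(feeds, directions):
    """Instantaneous spatial vector from virtual loudspeaker feeds:
    unit direction vectors weighted by per-speaker signal energy and
    normalised by total energy (analogous to the summation in VBAP)."""
    energies = [s * s for s in feeds]
    total = sum(energies)
    if total == 0.0:
        return (0.0, 0.0, 0.0)
    vx = sum(e * d[0] for e, d in zip(energies, directions)) / total
    vy = sum(e * d[1] for e, d in zip(energies, directions)) / total
    vz = sum(e * d[2] for e, d in zip(energies, directions)) / total
    return (vx, vy, vz)

# Square of four horizontal virtual speakers at +/-x and +/-y.
dirs = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
v = energy_vector([1.0, 0.0, 1.0, 0.0], dirs)   # equal energy at +x and +y
azimuth = math.degrees(math.atan2(v[1], v[0]))  # 45.0 degrees
```

Computing such a vector for both the main and sidechain decodes, then taking the angular difference, yields the smoothed azimuth/elevation controls that drive the rotator.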

16:15
César Salvador (Universidad Peruana de Ciencias Aplicadas, Peru)
Jorge Treviño (Tohoku University, Japan)
Shuichi Sakamoto (Tohoku University, Japan)
#71 Spatial Acoustics Library for MATLAB (SALM): A Computational Toolkit for Spatial Audio Processing
PRESENTER: César Salvador

ABSTRACT. The Spatial Acoustics Library for MATLAB (SALM) is a computational toolkit for efficient spatial audio signal processing. Its principal contribution is providing a unified and extensible framework that integrates Fourier transform techniques for spatial filtering. A defining characteristic of SALM is its ability to seamlessly handle both fundamental tasks, such as visualizing spherical harmonic functions, and advanced applications, including array processing in the circular and spherical Fourier transform domains. Key use cases include spatial interpolation of head-related transfer functions and room impulse responses, essential for high-fidelity spatial sound reproduction. Rigorously tested and validated in previous publications by the authors, SALM has demonstrated its reliability for immersive acoustic research and prototyping. By equipping researchers and developers with a structured and versatile library, SALM advances spatial acoustics with applications in binaural rendering for virtual reality and sound field analysis for architectural acoustics. The SALM repository and documentation are available at: https://github.com/cesardsalvador/SpatialAcousticsLibraryMATLAB.

16:30
Serafino Di Rosario (IOA, France)
Clement Royon (--, France)
Sylvain Guitton (--, France)
herisSon – Spatial Room Impulse Response (SRIR) measurement tool

ABSTRACT. Measured spatial room impulse responses are a very useful tool for understanding the behaviour of acoustic spaces, as they can be analysed in both the temporal and spatial domains. herisSon is a tool designed at LINK Acoustique that implements different techniques for analysing and presenting SRIR measurement results, in an updated version compared to the one presented by the author at the Auditorium Acoustics Conference in 2023 [1]. The tool expands the previous research by providing an updated 3D visualisation; the possibility of analysing the 3D response in 1/1-octave bands with a newly proposed visualisation; ISO 3382 parameters; the new acoustical parameters and visualisation proposed by A. Bassuet [2]; and a proposed analysis of directional reverberation in 3D. All these new features have also been used to test auditoriums, amplified music halls, immersive sound systems, and the installation of an electroacoustic system in a concert hall originally conceived for classical music. The paper presents the results of these measurements and discusses the next steps to improve our understanding and presentation of the results.

References: [1] S. Di Rosario et al., "herisSon – an innovative tool for Spatial Room Impulse Response (SRIR) measurements", Proceedings of the Institute of Acoustics, Vol. 45, Pt. 2, 2023. [2] A. Bassuet, "New Acoustical Parameters and Visualization Techniques to analyze the spatial distribution of sound in music spaces", Proceedings of the International Symposium on Room Acoustics (ISRA 2010), 29–31 August 2010, Melbourne, Australia.

16:45-18:30 Session D2: Demo Sessions: #26, #27, #83, #60

#26 Ambisonic Virtual Acoustics Playback Toolkit - Eito Murakami, Luna Valentin, Nima Farzaneh and Jonathan Berger

#27 What did they hear ? Immersion in Chauvet Cave - Luna Valentin, Eito Murakami, Nima Farzaneh and Jonathan Berger

#83 Virtual Acoustics and the Evocation of Awe in Historical Ritual Spaces - Nima Farzaneh, Anna Marie Gruzas, Eito Murakami and Jonathan Berger

#60 Dynamic Spatial Sidechain for First-Order Ambisonics - Summer Krinsky

16:45-18:30 Session P2: Poster Session
Marta Rossi (Abertay University, UK)
Christos Michalakos (Abertay University, UK)
#1 - Athanasius Kircher’s Sonic Playground: An Acoustic Virtual Reality Installation

ABSTRACT. Athanasius Kircher’s Sonic Playground is an interactive virtual reality (VR) installation that explores architectural acoustics through spatial sound simulation and real-time user engagement. Developed in Unreal Engine 5.4.4 using the Meta XR plugin, the project employs acoustic ray tracing for realistic auralisation and six degrees of freedom (6DoF) binaural audio rendering, enabling users to interact with acoustically responsive virtual spaces. Inspired by seventeenth-century polymath Athanasius Kircher, one of the first scholars to formalise the analogies between sound and light propagation, the project reimagines his acoustic theories and studies within a virtual environment. It features a series of four architecturally resonant spaces: a reconstructed model of the Ear of Dionysius, an elliptical domed chamber, a labyrinth, and a minimalist cathedral-like structure. These environments are designed to recreate acoustic phenomena characteristic of historic architecture such as whispering gallery effects, echo layering, and reverberant decay. Users engage with the environments through controller-triggered impulses, theoretical instruments, and live voice input captured via the headset’s microphone, receiving real-time auditory feedback shaped by the scene’s geometry and materials. By combining early modern acoustic theory with contemporary VR design, Athanasius Kircher’s Sonic Playground provides a playful yet acoustically rigorous framework for investigating architectural acoustics and the relationship between spatial form and auditory perception. The project contributes to current work in immersive cultural heritage, virtual acoustics, and design-led research, proposing a model of interactive auralisation that merges scientific inquiry with speculative reconstruction.

Antonella Bevilacqua (University of Parma, Italy)
Lamberto Tronchin (DA - CIARM, Italy)
#2 - Acoustic analysis of a temporary unseated opera theatre: Teatro Sociale in Bellinzona

ABSTRACT. The Teatro Sociale in Bellinzona is a historic and architecturally significant theatre located in the heart of the city, in Switzerland. Acoustic measurements were conducted to assess the room's response in terms of acoustic parameters, in accordance with ISO 3382-1. In addition to traditional microphones (omnidirectional and B-format), a multi-channel microphone (the em-64 by HM-Acoustics) was used to capture the directivity of sound reflections from the hall's boundary surfaces. The results indicate that the acoustic response of the room is suitable for both speech and music performances.

Lamberto Tronchin (DA - CIARM, Italy)
Luca Battisti (CIRI EC, University of Bologna, Italy)
#7 - Acoustic characteristics of Youth Theatre of Piatra-Neamț in Romania

ABSTRACT. Many opera theatres for live spectacles have been measured from an acoustic perspective; small rooms often host prose performances rather than symphonic music, as the space for both orchestra and audience is more limited in small theatres. Nevertheless, small-sized theatres are very active and offer spectacles all year round, thanks to the large audiences that this type of live venue attracts. The room impulse responses (RIRs) of a theatre located in Romania, the Youth Theatre of Piatra-Neamț, have been analyzed. The monaural and binaural acoustic parameters have been evaluated according to ISO 3382-1, showing that the mid-frequency reverberation time is around 0.9 s, meaning that the hall is suitable for both speech and music. To complete this analysis, acoustic maps highlighting the directivity of the reflections bouncing within the room are very useful for detecting which seats offer the greatest envelopment. According to the IACC results, the strongest binaural response is found in the stalls, which aligns with the 3D acoustic map, showing that the direct sound and early reflections reach the probe from many directions.

Luca Battisti (University of Bologna, Italy)
Lamberto Tronchin (DA - CIARM, Italy)
#6 - Acoustic performance of Victor Ion Popa Theatre – Bârlad, Romania

ABSTRACT. Exploring the acoustics of a local theatre with decades of history confers scientific endorsement on the authenticity of its value. Modern measurement techniques such as the exponential sine sweep are critical for the evaluation of acoustics in large spaces; the impulse responses gathered in the Victor Ion Popa Theatre of the Primăria Municipiului Bârlad are analyzed to obtain acoustic parameters following the latest ISO acoustic standards. Alongside the standard parameters, innovative acoustic maps have been computed to visually analyze the interaction of the sound waves with the environment between the stage and the listener.

Luca Battisti (CIRI EC, University of Bologna, Italy)
Lamberto Tronchin (DA - CIARM, Italy)
#13 - Acoustic characteristics of Bacău Theatre in Romania

ABSTRACT. The latest advancements in audio technology allow researchers to refine their computations and assessments of complex volumes such as theatres. The room impulse responses (RIRs) of a small-sized auditorium in Romania, the Bacovia Municipal Theatre of Bacău, have been analyzed. The monaural and binaural acoustic parameters have been evaluated according to ISO 3382-1. The results show that the mid-frequency reverberation time within the theatre is about 1.25 s. Besides the traditional acoustic parameters, the acoustic maps highlight the reflections hitting the probe placed in the central first-tier box, coming from the back of the fly tower and, at later instants, from the box ceiling.

Valeria Bruschi (Università Politecnica delle Marche, Italy)
Stefania Cecchi (Università Politecnica delle Marche, Italy)
Andrea Generosi (Università Pegaso, Italy)
Nefeli Aikaterini Dourou (Università Politecnica delle Marche, Italy)
Maura Mengoni (Università Politecnica delle Marche, Italy)
Susanna Spinsante (Università Politecnica delle Marche, Italy)
#22 - Vehicle Sound Interaction: A Preliminary Study on Driver’s Experience Affected by Immersive Sound Reproduction

ABSTRACT. Immersive audio techniques can create realistic sound environments that match the way humans perceive sound in natural ones. This capability can be applied to the car cabin environment to enhance the user's experience and improve vehicle-driver interaction. In this scenario, our study focuses on the effects these systems have on the driver's experience in terms of emotional responsiveness and degree of immersion. The spatial audio system is realized with a modified version of Recursive Ambiophonic Crosstalk Cancellation, and an in-depth analysis is performed with real-time monitoring based on a multimodal approach that exploits deep learning and data fusion techniques to ensure a comprehensive evaluation of the driver's status. Several experimental results are reported, obtained with a driving simulator equipped with a camera-based driver monitoring system and a physiological acquisition system.

Valeria Bruschi (Università Politecnica delle Marche, Italy)
Stefania Cecchi (Università Politecnica delle Marche, Italy)
Alessandro Terenzi (Università Politecnica delle Marche, Italy)
Maura Mengoni (Università Politecnica delle Marche, Italy)
Andrea Generosi (Università Pegaso, Italy)
#23 - A Preliminary Study on the Effect of Spatial Sound Reproduction based on Physiological Responses and Facial Expressions of the Listener

ABSTRACT. Spatial audio systems have nowadays received great attention, not only from the academic sector but also from the market. Many techniques have been studied for the creation of an immersive scenario, but the development of a large listening zone, where more than one listener can perceive an immersive audio experience, relies on multichannel approaches and sound field technologies. In this context, the presented study aims at an objective analysis of the listeners' sound experience based on their physiological responses and facial expressions, in order to recognize variations in the subjects' mental and physical status and in their emotional reactions. Several experiments have been conducted using an immersive loudspeaker sphere installed in a semi-anechoic chamber, exploiting Ambisonics technology and monitoring the listeners with a camera-based Facial Coding (FAC) system and an advanced multimodal acquisition system measuring different physiological signals.

Valeria Bruschi (Università Politecnica delle Marche, Italy)
Alessandro Terenzi (Università Politecnica delle Marche, Italy)
Nefeli Aikaterini Dourou (Università Politecnica delle Marche, Italy)
Leonardo Gabrielli (Università Politecnica delle Marche, Italy)
Michael Fioretti (Università Politecnica delle Marche, Italy)
Giuseppe Bergamino (Università Politecnica delle Marche, Italy)
Stefania Cecchi (Università Politecnica delle Marche, Italy)
Stefano Squartini (Università Politecnica delle Marche, Italy)
#36 - An Immersive Low-Latency Audio System for Social Interaction with Elderly People

ABSTRACT. The involvement of elderly individuals in social activities is important for their well-being and for keeping their minds active. When these individuals spend a lot of time in solitude, for example due to mobility difficulties or a lack of private or public transportation, it is necessary to intervene in some way directly within their homes. In this context, the purpose of this work is to present a system that engages elderly people in interactive sessions with a social worker, capable of conversing with the person or, even better, involving them in a therapeutic session, be it rehabilitation (e.g. speech or lung therapy) or music therapy. The proposed system is centred on an immersive audio system that creates a realistic communication experience, and it is based on technologies that allow high-bandwidth, low-latency audio and video links. The paper describes a scalable system that can use multichannel approaches such as Ambisonics or binaural techniques, and discusses available technologies for establishing the communication link, highlighting the lack of off-the-shelf solutions. The proposed approach and its usability in different scenarios are described.

Joseph Clarke (Cork School of Music, MTU, Ireland)
Cárthach Ó Nuanáin (Cork School of Music, MTU, Ireland)
#73 - Evaluating Presence in Immersive Virtual Reality Concert Experiences

ABSTRACT. Virtual Reality (VR) concerts using spatial audio offer novel, compelling, and immersive ways of experiencing music performances that move far beyond the realms of existing streaming delivery. Evaluating users' sense of presence and immersion remains central to research in virtual environments in general. These evaluations can inform the development of VR experiences when a co-design or user-centred design approach is implemented, producing more accessible, enjoyable, and impactful experiences. In this paper we summarise the development of an existing VR concert experience in the Irish music tradition and propose a survey that investigates and evaluates it from a presence perspective. Our methodology is outlined, and the results show the positive impact of immersive audio design factors on the subjective experience and sense of presence felt by the participants.

Marco Caniato (University of Applied Science Stuttgart, Germany)
Pia Hofmann (University of Applied Science Stuttgart, Germany)
Kamonwan Iamsam-Ang (University of Applied Science Stuttgart, Germany)
Lukas Loch (University of Applied Science Stuttgart, Germany)
Rabiye Sahin (University of Applied Science Stuttgart, Germany)
Berta Zimmermann (University of Applied Science Stuttgart, Germany)
#80 - Calculation of Reverberation Time in Educational Environments: A Comparison of Analytical Models (Online Poster)

ABSTRACT. This study explores the calculation of reverberation time using various analytical models, with measurements conducted in the classrooms of the University of Applied Science in Stuttgart. The measurements were taken under different conditions of sound absorption, volume, and furnishing to evaluate the accuracy of the models in real-world settings. The results highlight the challenges in accurately predicting reverberation time across different frequencies, underscoring the limitations of traditional analytical models. This work provides a critical overview of existing methodologies, benchmarking the models and highlighting their criticalities, towards more accurate prediction of reverberation time in educational environments.
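Two of the classical analytical models typically compared in such studies are the Sabine and Eyring formulas. A minimal sketch (T in seconds, V in cubic metres, S the total surface area in square metres, alpha the mean absorption coefficient; the 0.161 constant assumes air at room temperature):

```python
import math

def rt_sabine(volume, surface, alpha):
    """Sabine reverberation time: T = 0.161 * V / (S * alpha)."""
    return 0.161 * volume / (surface * alpha)

def rt_eyring(volume, surface, alpha):
    """Eyring reverberation time: T = 0.161 * V / (-S * ln(1 - alpha))."""
    return 0.161 * volume / (-surface * math.log(1.0 - alpha))

# Classroom-scale example: 200 m^3, 220 m^2 of surface, mean absorption 0.25.
t_sab = rt_sabine(200.0, 220.0, 0.25)   # ~0.59 s
t_eyr = rt_eyring(200.0, 220.0, 0.25)   # ~0.51 s
```

The Eyring prediction is always shorter than the Sabine one and the gap grows with the mean absorption coefficient, which is one reason the models diverge in well-damped rooms.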

Angela Guastamacchia (Department of Energy, Politecnico di Torino, Italy)
Arianna Astolfi (Department of Energy, Politecnico di Torino, Italy)
Fabrizio Riente (Department of Electronics and Telecommunications, Politecnico di Torino, Italy)
Pascale Sandmann (Department of Otolaryngology, Head and Neck Surgery, Carl von Ossietzky Universität Oldenburg, Germany)
Volker Hohmann (Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Germany)
Giso Grimm (Department of Medical Physics and Acoustics, Carl von Ossietzky Universität Oldenburg, Germany)
#64 - Interlaboratory comparison of multi-speaker setups for spatialized audio reproduction within clinical settings

ABSTRACT. Recent hearing research has shown that spatialized audio reproduction systems can help bridge the auditory gap between the oversimplified laboratory test conditions used in standard clinical practice and the complex auditory scenarios encountered by hearing-impaired individuals in daily life. However, implementing such complex multi-speaker setups in clinical settings poses challenges due to constraints in space, cost, and system complexity. This study investigated whether, with a limited number of speakers (i.e. 16), it is possible to (i) still achieve satisfactory performance compared to a more complex 45-speaker periphonic (3D) reference setup, and (ii) identify a preferable configuration as a whole, considering both speaker spatial arrangement and rendering technique, among three different proposals: two horizontal (2D) layouts with different speaker densities in the frontal hemisphere, and one 3D layout, each installed in a different laboratory. Subjects were recruited to evaluate each setup in two perceptual tests: sound localization and perceived audio quality, rated using selected items from the Spatial Audio Quality Inventory across two complex auditory scenes. Results revealed no significant differences among the setups regarding perceptual quality ratings. The lowest azimuth localization errors were found for the 2D setup with the higher frontal speaker density, and the lowest elevation errors for the 3D setups. The more complex 3D layout led to lower elevation errors only for sounds above the equatorial plane, while no differences were found for sources below it. Across all setups, tendencies to perceive elevated sounds closer to the equatorial plane, and sounds at 0° elevation as slightly higher, were observed.

Yueheng Li (University of Southampton, UK)
Filippo Fazi (University of Southampton, UK)
#29 - Resolution Upscaling of Spatial Room Impulse Response Based on Elastic Net Regularisation

ABSTRACT. The spatial resolution of a spatial room impulse response (SRIR) measured with a spherical microphone array is fundamentally constrained by the array's spherical harmonic order. Starting from order-limited ambisonic signals, the SRIR is estimated via plane wave decomposition, which involves solving an underdetermined inverse problem. Notably, both the spatial sparsity and the signal-to-noise ratio (SNR) of the SRIR vary over time. To account for this, we introduce an elastic net regularisation framework that combines L1 and L2 penalties. This approach leverages the known sparsity of early-arriving waves to promote sparse solutions via L1 regularisation, while the inclusion of L2 regularisation ensures robustness and interpretability during periods of reduced sparsity. The flexibility of tuning the regularisation parameter and the L1 ratio allows the method to adapt to different SRIR structures. An investigation of applying elastic net regularisation with different parameters to upscale the spatial resolution of SRIRs is carried out in this paper. It is shown that the elastic net consistently outperforms L1 (LASSO) or L2 (Tikhonov) regularisation alone.
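As a rough illustration of the regularisation framework described above (not the authors' implementation; names and parameter values are hypothetical), the elastic-net problem 0.5*||Ax - b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2 can be solved with a simple proximal-gradient (ISTA) iteration, where the L1 term is handled by soft-thresholding:

```python
def soft(z, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    return max(z - t, 0.0) if z > 0 else min(z + t, 0.0)

def elastic_net(A, b, lam1, lam2, step=0.3, iters=20000):
    """Proximal-gradient (ISTA) solver for
    0.5*||Ax - b||^2 + lam1*||x||_1 + 0.5*lam2*||x||^2.
    The step size must satisfy step <= 1 / (||A||^2 + lam2)."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = Ax - b, then gradient of the smooth part.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) + lam2 * x[j]
             for j in range(n)]
        # Gradient step followed by the L1 proximal step.
        x = [soft(x[j] - step * g[j], step * lam1) for j in range(n)]
    return x

# Underdetermined toy system (2 equations, 3 unknowns): the L1 term
# promotes a sparse solution while the L2 term stabilises it.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 0.0]
x = elastic_net(A, b, lam1=0.05, lam2=0.01)
```

Setting lam2 to zero recovers LASSO and lam1 to zero recovers Tikhonov regularisation, which is the sense in which the elastic net interpolates between the two baselines.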

Gino Iannace (University of Campania Luigi Vanvitelli, Italy)
Giuseppe Ciaburro (Faculty of Engineering and Informatics, Pegaso University, Naples, Italy)
Franc Policardi (Electrotechnical Engineering, Ljubljana, Slovenia)
#43 - The city of Benevento and its theaters
PRESENTER: Gino Iannace

ABSTRACT. The city of Benevento is located in southern Italy and has a long history: it was founded by the Samnites, became an important centre of the Roman Empire, and later the capital of the Lombard kingdom; for a thousand years it was part of the Papal State and enjoyed autonomy within the Kingdom of the Two Sicilies until the unification of Italy (1860). In the 1980s, the "Benevento città spettacolo" event was established, with theatrical performances held inside various theatres. The purpose of this paper is to report the acoustic characteristics of the theatres of the city of Benevento where these events took place. Acoustic measurements were carried out inside the theatres. The acoustic characteristics are reported as a function of the volume and the type of theatre.

Lamberto Tronchin (DA - CIARM, Italy)
Ruoran Yan (University of Bologna, Italy)
Kristian Jambrosic (University of Zagreb, Croatia)
Marko Horvat (University of Zagreb, Croatia)
Luca Battisti (CIRI EC, University of Bologna, Italy)
Matteo Fadda (University of Bologna, Italy)
Adriano Farina (University of Bologna, Italy)
Cobi van Tonder (University of Bologna, Italy)
#90 - Comparative Acoustic Survey of Teatro Masini, Faenza: Insights from the 2020 and 2025 Investigations

ABSTRACT. A follow-up acoustic survey of the eighteenth-century Teatro Masini (Faenza) was completed in 2025 to verify changes since the 2020 baseline and to evaluate an expanded measurement layout. Standard ISO 3382-1 data were acquired with omnidirectional and four-channel microphones, while an EM64 spherical array plus panoramic video captured full-sphere impulse responses. The new campaign added a third-tier receiver and a rear-stage sound source, allowing deeper inspection of balcony coverage and dome reflections. Results show mid-band EDT and T20 shortened by 0.15–0.20 s, largely due to renewed velvet panels and the up-stage source’s weaker proscenium returns. Speech clarity (C50) improved, and D50 now exceeds 0.50 above 500 Hz. Music clarity (C80) increased, settling in the 2–6 dB range preferred for operatic detail. Spherical maps confirmed strong early lateral energy from box fronts and a late, focused ceiling return that enhances spatial height. The theatre therefore offers better intelligibility without loss of warmth, and no major acoustic intervention is required—only optional light drapery for speech-centric events.