KEYNOTE
09:10 | KEYNOTE: Technology-Based Real-Time Visual Feedback in the Education of Singers ABSTRACT. Learning to sing requires the acquisition of perceptual-motor skills. The development of such skills is notably facilitated when meaningful visual feedback is provided. The current state of voice science, combined with recent technological advances, has paved the way for visualization of relevant physiological and acoustical events. Nowadays, non-invasive real-time visual displays of breathing behaviors, subglottal pressure, vibratory patterns and acoustical properties of the voice are available to both teachers and voice students. In this presentation, examples of such displays and the associated technological tools will be demonstrated. For example, the RespTrack system for real-time display of abdominal and ribcage movements will be presented (Johan Stark, Columbi Computers, Sweden). The relationship between breathing behavior, lung volume and subglottal pressure will be discussed, as well as its relevance to the education of singers. Visualization of vocal fold vibratory patterns by electroglottography (EG2-PCX2, Glottal Enterprises, USA) and its application to the training of phonation types or register transitions will be presented. Also, the recently developed FonaDyn freeware will be explored for documenting singers’ development (Sten Ternström, Sweden). The usefulness of various spectrographic displays will be discussed. Finally, the possible implementation of all these means in current educational settings will be considered. This keynote is an invited summary presentation of the following article: Lã, F.M.B.; Fiuza, M.B. Real-Time Visual Feedback in Singing Pedagogy: Current Trends and Future Directions. Appl. Sci. 2022, 12, 10781, doi:10.3390/app122110781. (Open Access) |
10:00 | The impact of room acoustics on choristers' performance: from rehearsal space to concert hall ABSTRACT. While there has been extensive research on the acoustic quality of various performance spaces and concert halls, studied from the audience perspective, less work has been published on the musicians' on-stage acoustic impression and its impact on musicality and performance quality. On-stage acoustic conditions vary among performance spaces and, more often than not, between performance and rehearsal spaces. As a result, studies have investigated the adaptation mechanisms that performers develop to match specific acoustic conditions during a performance. This paper discusses the potential impact of acoustic mismatches between rehearsal spaces and concert halls from the perspective of singers and choirs. Building on past research that uses virtual acoustic environments to investigate this mismatch and the way it affects performance, a tool is being designed to virtually place users at various spots within a virtual choir on a virtual stage by augmenting audio recordings with auditory spatialization and room-acoustic cues. Preliminary feedback on the need for this tool, along with results from its alpha-testing phase, is discussed. |
10:15 | Singing voice range profiling toolbox with real-time interaction and its application to make recording data reusable ABSTRACT. The Singing Voice Range Profiling Toolbox is a software suite that provides real-time interaction for the profiling of singing voices. It utilizes sound field measurement, microphone calibration, and the acoustic characteristics of the recording system to analyze and visualize various vocal parameters. Measurements of the recording sound field and background noise provide the information needed to decide on an acceptable recording distance, depending on the directivity pattern and frequency response of the microphones. Voice profiling covers essential parameters such as fundamental frequency (fo), sound pressure level (SPL), cepstral peak prominence (CPP), and EGG-Oq (if EGG is available). In addition, the toolbox provides real-time feedback on the analyzed characteristics, including visualizations of F1 and F2, which are valuable parameters in studying singing voices. The toolbox also includes facilities for training and self-learning, allowing users to gain a deeper understanding of the voice profiling process and improve their skills over time. The Singing Voice Range Profiling Toolbox is thus a valuable tool for voice scientists, recording engineers, and singing voice educators, enabling them to make recording data reusable and further advancing the field of voice research. |
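As an illustration of the measurements the toolbox reports, the following minimal Python sketch extracts two of the listed parameters, fo and SPL, from a recording with librosa. The file name and calibration offset are assumptions, and this is not the toolbox's own implementation.

```python
import numpy as np
import librosa

# Load a recording of a sustained vowel (file name is illustrative).
y, sr = librosa.load("sustained_vowel.wav", sr=44100, mono=True)

# Fundamental frequency (fo) per frame via the pYIN tracker.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr)

# Frame-wise sound pressure level: RMS in dB plus a calibration constant
# that would come from the microphone/sound-field measurement (assumed here).
CAL_DB = 94.0
rms = librosa.feature.rms(y=y)[0]
spl = 20 * np.log10(np.maximum(rms, 1e-9)) + CAL_DB

print(f"median fo: {np.nanmedian(f0):.1f} Hz, median SPL: {np.median(spl):.1f} dB")
```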
SMC Welcome address and first plenary paper session.
13:00 | Structuring music for any sources ABSTRACT. This article describes an interactive notation paradigm, which aids in structuring flexible performances for an arbitrary number of participants and any combination of acoustic or electronic sources. A simple system allows a ‘maestro’ to organise an ensemble and to communicate information to the members by means of an interactive window projected on a surface visible to all (performers and audience). The following text describes the motivation and design of the notation strategy, its implementation in the SuperCollider environment and discusses some compositional, performative and pedagogical issues with reference to a recent work; in this context the ‘system’ is considered to be the ‘piece’ itself. |
13:05 | ABSTRACT. The piano is one of the most popular instruments among people who learn to play music. When playing the piano, the level of loudness is crucial for expressing emotion as well as shaping tempo; these elements convey the expressiveness of a music performance. Detecting the loudness of each note could provide more valuable feedback for music students, helping them improve their performance dynamics, for instance by visualizing loudness levels both for self-learning and for communication between teachers and students. Moreover, given the polyphonic nature of piano music, which often involves parallel melodic streams, determining the loudness of each note is more informative than analyzing the cumulative loudness of a specific time frame. This research proposes a method using a Deep Neural Network (DNN) to estimate note-level MIDI velocity of piano performances from audio input. When score information is available, the DNN is conditioned on it using a Feature-wise Linear Modulation (FiLM) layer. To the best of our knowledge, this is the first attempt to estimate MIDI velocity using a neural network in an end-to-end fashion. The model proposed in this study achieved improved accuracy in both MIDI velocity estimation and estimation error deviation, as well as higher recall for note classification, compared to the DNN model that did not use score information. |
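For readers unfamiliar with FiLM, the following minimal PyTorch sketch shows the general mechanism: a conditioning vector (standing in for a score embedding) predicts a per-channel scale and shift applied to the audio features. Names and dimensions are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: scale and shift feature channels
    using parameters predicted from a conditioning (e.g. score) embedding."""
    def __init__(self, cond_dim: int, feat_dim: int):
        super().__init__()
        self.to_gamma = nn.Linear(cond_dim, feat_dim)  # per-channel scale
        self.to_beta = nn.Linear(cond_dim, feat_dim)   # per-channel shift

    def forward(self, features, cond):
        # features: (batch, time, feat_dim); cond: (batch, cond_dim)
        gamma = self.to_gamma(cond).unsqueeze(1)
        beta = self.to_beta(cond).unsqueeze(1)
        return gamma * features + beta

# Toy usage: modulate encoded audio frames with a score embedding.
film = FiLM(cond_dim=16, feat_dim=128)
audio_feats = torch.randn(4, 200, 128)   # frames from some audio encoder
score_embed = torch.randn(4, 16)         # embedded score context
print(film(audio_feats, score_embed).shape)  # torch.Size([4, 200, 128])
```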
13:10 | Accessible Sonification of Movement: A case in Swedish folk dance ABSTRACT. This study presents a sonification tool – SonifyFOLK – designed for intuitive access by musicians and dancers in their sonic explorations of movements in dance performances. It is implemented as a web-based application to facilitate accessible audio parameter mapping of movement data for non-experts, and is applied and evaluated with Swedish folk musicians and dancers in their exploration of sonifying dance. SonifyFOLK is based on the WebAudioXML Sonification Toolkit and was designed within a group of artists and engineers using artistic goals as drivers for the sound design. The design addresses the challenges of providing an accessible interface for mapping movement data to audio parameters, managing multi-dimensional data and creating audio mapping templates for a contextually grounded sound design. The evaluation documents a diversity of sonification outcomes, reflections by participants that imply curiosity for further work on sonification, as well as the importance of immediate visual and acoustic feedback on parameter choices. |
13:15 | The "Collective Rhythms Toolbox": an audio-visual interface for coupled-oscillator rhythmic generation ABSTRACT. This paper presents a software package called the "Collective Rhythms Toolbox" (CRT), a flexible and responsive audio-visual interface that enables users to investigate the self-synchronizing behaviors of coupled systems. As a class of multi-agent systems, the CRT works with networks of coupled oscillators and a physical model of coupled metronomes, allowing users to explore different sonification routines through real-time parameter modulation. Adjustable coefficient matrices allow for complex coupling topologies that can induce a diverse range of dynamic rhythmic states, and audio-visual feedback facilitates user engagement and interactive flow. In addition, several real-time analysis techniques provide the user with visual information about the state of the system in terms of group synchrony. Ultimately, this paper showcases how parameterizing coupled systems in specific ways allows different computer music and compositional techniques to be carried out through the lens of dynamical systems-based approaches. |
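The kind of network the toolbox parameterizes can be illustrated with a minimal NumPy sketch of a Kuramoto-style coupled-oscillator model with a coupling coefficient matrix and an order-parameter synchrony measure. It is an illustration of the underlying dynamics, not the CRT implementation; all parameter values are assumptions.

```python
import numpy as np

def kuramoto_step(phases, omegas, K, dt=0.01):
    """One Euler step of N coupled phase oscillators.
    K[i, j] is the coupling strength of oscillator j acting on oscillator i."""
    diff = phases[None, :] - phases[:, None]      # pairwise phase differences
    coupling = (K * np.sin(diff)).sum(axis=1)     # summed influence on each oscillator
    return phases + dt * (omegas + coupling)

def order_parameter(phases):
    """Group synchrony in [0, 1]; 1 means fully phase-locked."""
    return np.abs(np.mean(np.exp(1j * phases)))

rng = np.random.default_rng(0)
N = 8
phases = rng.uniform(0, 2 * np.pi, N)
omegas = rng.normal(2 * np.pi, 0.5, N)            # natural frequencies near 1 Hz
K = np.full((N, N), 0.8 / N)                      # uniform all-to-all coupling

for _ in range(5000):                             # 50 seconds of simulated time
    phases = kuramoto_step(phases, omegas, K)
print(f"group synchrony: {order_parameter(phases):.2f}")
```

In a rhythmic-generation setting, each oscillator's phase wrap could trigger a sound event, and the order parameter corresponds to the kind of group-synchrony display described above.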
13:20 | A Programmable Linux-Based FPGA Platform for Audio DSP PRESENTER: Tanguy Risset ABSTRACT. Recent projects have proposed the use of FPGAs (Field-Programmable Gate Arrays) as hardware accelerators for computationally demanding real-time audio Digital Signal Processing (DSP). Most of them involve specific developments that cannot be reused across applications. In this paper, we present an accessible FPGA-based platform optimized for audio applications, programmable with the FAUST language and offering advanced control capabilities. Our system allows fast and simple deployment of DSP hardware accelerators for any Linux audio application on Xilinx FPGA platforms. It combines the Syfala compiler – which can be used to generate FPGA bitstreams directly from a FAUST program – with a ready-made embedded Linux distribution running on the Xilinx Zynq SoC. It enables the compilation of complete audio applications involving various control protocols and approaches such as OSC (Open Sound Control) over Ethernet or Wi-Fi, MIDI, web interfaces running on an HTTPD server, etc. This work opens the door to the integration of hardware accelerators in high-level computer music programming environments such as Pure Data, SuperCollider, etc. |
13:25 | A Comparative Computational Approach to Piano Modeling Analysis ABSTRACT. Piano modeling is a topic of great interest in musical acoustics and sound synthesis. Besides challenges in modeling its mechanism, it is also difficult to understand how far the models are from the actual acoustic instrument and why. Identifying the most prominent aspects of the piano’s sound and evaluating the sound-generation fidelity of associated models are usually addressed with studies based on listening tests. This paper shows how computational methods can provide novel insights into piano analysis and modeling, which can be used to complement perceptual analyses. In particular, our approach identifies audio descriptors that present discriminative differences between types of pianos when these are excited with specific stimuli. The proposed method is used to analyze a collection of recordings from upright acoustic and synthetic pianos, excited with single-played notes, triads, and repeated notes. Results show that the sound generated by the considered types of piano presents major differences in terms of spectral descriptors and constant-Q transform coefficients. |
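As a concrete illustration of the descriptors in question, the following minimal Python sketch computes constant-Q transform coefficients and a few spectral descriptors with librosa. The file name and feature selection are assumptions, not the authors' analysis pipeline.

```python
import numpy as np
import librosa

# Load a recording of a single played note (file name is illustrative).
y, sr = librosa.load("piano_note.wav", sr=44100, mono=True)

# Constant-Q transform magnitudes over the 88 piano keys (A0 upwards).
cqt = np.abs(librosa.cqt(y, sr=sr, fmin=librosa.note_to_hz("A0"), n_bins=88))

# A few spectral descriptors, averaged over time.
centroid = librosa.feature.spectral_centroid(y=y, sr=sr).mean()
flatness = librosa.feature.spectral_flatness(y=y).mean()
rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr).mean()

print("time-averaged CQT coefficients:", cqt.mean(axis=1).shape)  # (88,)
print(f"centroid={centroid:.1f} Hz, flatness={flatness:.3f}, rolloff={rolloff:.1f} Hz")
```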
13:30 | A Real-Time Cochlear Implant Simulator - Design and Evaluation ABSTRACT. This article describes the implementation of a flexible real-time Cochlear Implant (CI) simulator and its preliminary evaluation, which investigates whether a specific set of parameters can simulate the musical experience through CIs using Normal Hearing (NH) subjects. A Melodic Contour Identification (MCI) test is performed with 19 NH subjects to identify melodic contours processed by the simulator. The results showed that the participants' precision in identifying melodic contours decreased as the intervals between notes decreased, showing that the reduced spectral resolution makes smaller changes in pitch harder to identify. These results are in line with other studies that perform MCI tests on subjects with CIs, suggesting that the real-time simulator can successfully mimic the reduced spectral resolution of a CI. This study validates that the implemented simulator, using a pulse-spreading harmonic complex as a carrier for a vocoder, can partially reproduce the musical experience of people with hearing loss who use CI technology. This suggests that the simulator might be used to further examine the characteristics that could enhance the music listening experience for people using CIs. |
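To make the vocoder idea concrete, the following minimal Python sketch implements a heavily simplified channel vocoder: the input is split into log-spaced bands, and each band's envelope is extracted and re-imposed on band-limited noise. The simulator described above uses a pulse-spreading harmonic complex carrier instead, so this is only an illustration of the general scheme; the band count and frequency range are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(x, sr, n_channels=8, fmin=100.0, fmax=7000.0):
    """Split x into log-spaced bands, extract each band's amplitude envelope,
    and re-impose it on band-limited noise (simplified CI-style processing)."""
    edges = np.geomspace(fmin, fmax, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        band = sosfiltfilt(sos, x)
        envelope = np.abs(hilbert(band))                          # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))   # band-limited noise
        out += envelope * carrier
    return out / np.max(np.abs(out))

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)      # toy input: one second of a 440 Hz tone
simulated = noise_vocoder(tone, sr)
```

Fewer channels make pitch changes harder to resolve, which mirrors the reduced spectral resolution discussed above.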
13:35 | Song Popularity Prediction using Ordinal Classification ABSTRACT. Predicting a song's success based on audio descriptors before its release is an important task in the music industry, and it has been tackled in many ways. Most approaches utilize audio descriptors to predict the success of a song, typically captured by either chart positions or listening counts. The popularity prediction task is then modeled either as a regression task, where the popularity metric is predicted precisely, or as a classification task, e.g., by transforming the popularity metric into distinct classes such as hits and non-hits. However, this way of modeling the task neglects that most popularity measures form an ordinal scale: classification ignores the order, while regression assumes that the data is on an interval (or ratio) scale. Therefore, we propose to model the task of popularity prediction as an ordinal classification task. Further, we propose an approach that utilizes the relative order of classes in an ordinal classification setup to predict the popularity (class) of songs. Our approach requires a machine learning model able to predict the relative order of two pieces of music, and hence can flexibly be applied using many types of predictors. Furthermore, we investigate how different ways of mapping the underlying popularity metrics to ordinal classes influence our model. We compare the proposed approach with regression as well as classification models and show its robustness w.r.t. different numbers of ordinal classes and the distribution of the number of songs assigned to them. Additionally, we show that, for some prediction settings, our approach results in better predictive performance than classical regression and classification approaches, while achieving similar predictive performance in other settings. |
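The core ingredient, a model that predicts the relative order of two songs, can be illustrated with a minimal scikit-learn sketch: a binary classifier is trained on feature differences of song pairs, and its scores are then binned into ordinal popularity classes. The synthetic data, features and quartile binning are illustrative assumptions rather than the authors' setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_songs, n_feats = 200, 10
X = rng.normal(size=(n_songs, n_feats))                 # audio descriptors (synthetic)
popularity = X @ rng.normal(size=n_feats) + rng.normal(scale=0.5, size=n_songs)

# Pairwise training data: does song i outrank song j?
i = rng.integers(0, n_songs, 5000)
j = rng.integers(0, n_songs, 5000)
mask = i != j
pair_features = X[i[mask]] - X[j[mask]]
pair_labels = (popularity[i[mask]] > popularity[j[mask]]).astype(int)

clf = LogisticRegression(max_iter=1000).fit(pair_features, pair_labels)

# Score each song against an "average" song, then bin the scores
# into four ordinal popularity classes (quartiles).
scores = clf.decision_function(X - X.mean(axis=0))
classes = np.digitize(scores, np.quantile(scores, [0.25, 0.5, 0.75]))
print(np.bincount(classes))  # number of songs in each ordinal class
```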
13:40 | Sonifying energy consumption using SpecSinGAN ABSTRACT. In this paper we present a system for the sonification of the electricity drawn by different household appliances. The system uses SpecSinGAN as the basis for the sound design, an unconditional generative architecture that takes a single one-shot sound effect (e.g., a fire crackle) and produces novel variations of it. SpecSinGAN is based on single-image generative adversarial networks that learn from the internal distribution of a single training example (in this case the spectrogram of the sound file) to generate novel variations of it, removing the need for a large dataset. In our system, a Python script on a Raspberry Pi receives the data on the electricity drawn by an appliance via a smart plug. The data is then sent to a Pure Data patch via Open Sound Control. The electricity drawn is mapped to the sound of fire, generated in real time in Pure Data by mixing different variations of four fire sounds - a fire crackle, a low-end fire rumble, a mid-level rumble, and a hiss - which were synthesised offline by SpecSinGAN. The result is a dynamic fire sound that is never the same and that grows in intensity with the electricity consumption: the density of the crackles and the level of the rumbles increase as consumption rises. We pilot-tested the system in two households and with different appliances. Results confirm that, from a technical standpoint, the sonification system responds as intended, and that it provides an intuitive auditory display of the energy consumed by different appliances. In particular, this sonification is useful in drawing attention to "invisible" energy consumption. Finally, we discuss these results and future work. |
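The Raspberry Pi side of such a pipeline can be sketched in a few lines of Python using the python-osc package: a power reading is polled, scaled to an intensity value and sent to the Pure Data patch over OSC. The address, port, update rate, scaling and the smart-plug query function are all illustrative assumptions, not the system described above.

```python
import time
from pythonosc.udp_client import SimpleUDPClient

def read_power_watts() -> float:
    """Placeholder for the smart-plug query (vendor APIs differ);
    should return the instantaneous power draw in watts."""
    return 42.0  # illustrative constant

# Pure Data patch assumed to be listening for OSC on this host and port.
client = SimpleUDPClient("192.168.1.50", 9000)

while True:
    watts = read_power_watts()
    intensity = min(watts / 2000.0, 1.0)     # map 0-2000 W to a 0-1 intensity
    client.send_message("/energy/intensity", intensity)
    time.sleep(0.5)
```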
KEYNOTE: Miller Puckette, University of California San Diego, USA
16:00 | Web Applications for Automatic Audio-to-Score Synchronization with Iterative Refinement ABSTRACT. The task of aligning a score to the corresponding audio is a well-studied problem of particular relevance for a number of applications. Having this information allows users to explore the materials in unique ways and build rich interactive experiences. This contribution presents web applications that address the problem by implementing a two-step synchronization process. The first step performs a score-informed alignment, while the second acts as a further refinement, particularly useful for improving a previous manual or semi-automatic synchronization. These web implementations are specifically conceived to work with the IEEE 1599 standard, which allows multiple instances of scores and audio renderings to be mutually synchronized. By adopting web technologies, users are not tied to any specific platform. Evaluations of the performance and current limitations of these processes will be presented. |
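The general alignment idea can be illustrated with a minimal librosa sketch: chroma features of the performance and of a synthesized rendering of the score are aligned with dynamic time warping. This shows the audio-to-score synchronization principle only; it is not the IEEE 1599-based implementation described above, and the file names are assumptions.

```python
import numpy as np
import librosa

hop = 2048
perf, sr = librosa.load("performance.wav", sr=22050)       # recorded performance
score, _ = librosa.load("score_synth.wav", sr=22050)       # synthesized score rendering

# Chroma features are fairly robust to timbre differences between the two versions.
chroma_perf = librosa.feature.chroma_cqt(y=perf, sr=sr, hop_length=hop)
chroma_score = librosa.feature.chroma_cqt(y=score, sr=sr, hop_length=hop)

# Dynamic time warping returns a cumulative cost matrix and a warping path.
D, wp = librosa.sequence.dtw(X=chroma_score, Y=chroma_perf, metric="cosine")

# Convert the path (returned end-to-start) into paired timestamps in seconds:
# column 0 is score time, column 1 is performance time.
times = wp[::-1] * hop / sr
print(times[:5])
```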
16:05 | Developing and evaluating a Musical Attention Control Training game application ABSTRACT. Musical attention control training (MACT) is a Neurologic Music Therapy (NMT) technique to strengthen attention skills for people who may have attention deficits, for instance related to ADHD or Parkinson Disease (PD), activating different parts of the brain and stimulating neural connectivity. While multiple interventions per week would enhance the effect of MACT, attending several sessions a week with a therapist can be challenging. Applied game interventions implementing MACT, which can be played at home, could offer complementary training to the limited number of therapy sessions. While applied games have been shown to facilitate successful interventions for cognitive impairments, to date no game exists based on MACT. We propose a novel approach to research the plausibility of applied games to support NMT, conclude game requirements for the specific needs of People with PD (PwPD), and introduce a game that emulates a MACT session. We carried out a pilot experiment to gauge how users interact with the game and its efficacy in attention control training with non-PD participants, letting them play 10 game intervention sessions within two weeks. Although no significant short-term attention effects were observed in this timeframe, user evaluations and metrics of game performance suggest that gamified MACT could be a promising supplement to conventional MACT for improving attention skills to optimize quality of life of PwPD. |
16:10 | Efficient simulation of acoustic physical models with nonlinear dissipation PRESENTER: Riccardo Russo ABSTRACT. One long-term goal of physics-based sound synthesis and audio effect modeling has been to open the door to models without a counterpart in the real world. Less explored has been the fine-grained adjustment of the constituent physical laws that underpin such models. In this paper, the introduction of a nonlinear damping law into a plate reverberation model is explored through the use of four different functions transferred from the setting of virtual-analog electronics. First, a case study of an oscillator with nonlinear damping is investigated. Results are compared against linear dissipation, illustrating differing spectral characteristics. To solve the systems, a recently proposed numerical solver is employed that entirely avoids the use of iterative routines such as Newton-Raphson for solving nonlinearities, thus allowing very efficient numerical solution. This scheme is then used to simulate a plate reverberation unit, and tests are run to investigate the spectral variations induced by nonlinear damping. Finally, a musical case is presented that includes frequency-dependent damping coefficients. |
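The introductory case study can be pictured with a minimal NumPy sketch: a single oscillator whose damping grows with the velocity, integrated with a simple explicit scheme. The damping law, coefficients and integrator are illustrative assumptions; the paper's non-iterative solver is not reproduced here.

```python
import numpy as np

# Oscillator with state-dependent damping:
#   u'' + 2 * (sigma0 + sigma_nl * u'^2) * u' + omega0^2 * u = 0
sr = 44100
dt = 1.0 / sr
omega0 = 2 * np.pi * 440.0          # 440 Hz natural frequency
sigma0, sigma_nl = 2.0, 1e-6        # linear and nonlinear loss coefficients

u, v = 1.0, 0.0                     # initial displacement and velocity
out = np.zeros(sr)                  # one second of output

for n in range(sr):
    damping = sigma0 + sigma_nl * v * v          # loss increases with velocity
    a = -2.0 * damping * v - omega0 ** 2 * u     # acceleration
    v += dt * a                                  # symplectic Euler update
    u += dt * v
    out[n] = u
```

With the nonlinear term active, large-amplitude portions of the output decay faster than quiet ones, which is the kind of amplitude-dependent behaviour that differing damping laws introduce.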
16:15 | Ding-dong: Meaningful Musical Interactions with Minimal Input ABSTRACT. Digital Musical Instruments have given us the power to create unique musical systems of performance, often for people with no musical experience. The prevalence of gestural interfaces with a high number of parameters and limitless mapping possibilities has blossomed in this context. Yet this same flexibility at times leads to creative paralysis, contrary to the presentation of these interfaces as transparent vessels for untapped musical imaginations. This paper outlines a new work, 'The Doorbell', created to investigate how minimal input might produce meaningful musical results. Building on work around constrained interfaces and one-button controllers, the work affords a fun, performative musical experience using only the input of a single button, encouraging anyone to discover and perform a surprising depth of musical possibilities through a household object. By stripping back input variables and taking advantage of natural musical affordances of the doorbell, 'The Doorbell' questions what elements of the interface offer new areas of exploration for DMIs more generally, and how musical narrative and precomposition are contributing factors to a meaningful musical interaction. |
16:20 | Introducing stateful conditional branching in Ciaramella ABSTRACT. Conditional branching in Synchronous Data Flow (SDF) networks is a long-standing issue, as it clashes with the underlying synchronicity model. For this reason, conditional update of state variables is rarely implemented in data flow programming environments, unlike simpler selection operators that do not execute code conditionally. We propose an extension to SDF theory to represent stateful conditional branching. We prove the effectiveness of such an approach by adding conditional constructs to the Ciaramella programming language without compromising its modular declarative paradigm and while keeping domain-specific optimizations intact. This addition enables easy implementation of common DSP algorithms and helps in writing efficient complex programs. |
CONCERT
Note: For the exact times of the pieces, please refer to the concert schedule.
EVENING CONCERT
The concert ends with the piece "Chopper" by Chris Chafe.
Note: For the exact times of the pieces, please refer to the concert schedule.