

09:00-09:30 Session 25: Keynote
Location: Fanny Hensel Saal
Keynote Speech: Performance space: The spotlight and its implications for performance psychology

ABSTRACT. Professional training in music often lacks repeated exposure to realistic performance situations, with musicians learning all too late (or not at all) how to manage the stresses of performing and the demands of their audiences. This lecture will explore the physiological and psychological differences between practising and performing, specifically examining cardiovascular and neuroendocrine responses in musicians performing under pressure. It will also introduce the Performance Simulator, an innovative facility that operates in two modes: (i) concert and (ii) audition simulation. Initial results demonstrate that the Simulator allows musicians to develop and refine valuable performance skills, including enhancement of communication on stage and effective management of high state anxiety.

09:30-10:45 Session 26: Gesture and communication in music performance I
Location: Fanny Hensel Saal
Rhythm perception in the context of music and dance
SPEAKER: Yi-Huang Su

ABSTRACT. Multisensory cues in a music performance – such as the sounds of the instruments and the musicians’ gestures – are important means of communication amongst ensemble musicians, which can also be used by the audience to form multimodal experiences of music. There is, however, a similar and yet less understood scenario that also involves visual observation of movements and auditory perception of music: dance. As dancers coordinate their movements with the musical rhythms, both streams of information may converge temporally into a multimodal percept from the audience’s perspective. Here, I will present some of my recent investigations in this scenario: How do observers extract temporal information from dancelike movements? Do similar processes as found in auditory rhythm perception also underlie visual perception of structured human movements? How does visually perceived rhythmicity interact with perception of auditory rhythms? In one study, we found that observers make use of the underlying periodicity in movement trajectories for temporal estimation of dancelike movements, which seems analogous to the benefit of a regular beat in processing auditory rhythms. Regarding the cross-modal effects, I will show that observing a humanlike figure bounce periodically induces a visual beat percept, which can modulate or improve beat perception of auditory rhythms in parallel. Furthermore, the profile of multisensory gain suggests the presence of an integrated audiovisual beat in rhythm perception and synchronization. Finally, the extent of temporal integration between auditory and visual rhythms appears to depend on the perceived congruency between the two streams. Together these results reveal an audiovisual interplay in the rhythm domain involving sounds and movements, which may be based on their sensorimotor correspondence in perception. 
It remains to be verified whether musicians, dancers, and the audience may employ (partially) overlapping cross-modal mechanisms for communication, synchronization, and perception.

Enabling Synchronization: Auditory and Visual Modes of Communication during Ensemble Performance
SPEAKER: unknown

ABSTRACT. Ensemble musicians exchange nonverbal auditory and visual cues (e.g. breathing, head nods, changes in tempo/dynamics) during performance to make their intentions more predictable and help enable synchronization. The predictability of performers' intentions is higher in some musical contexts (e.g. within phrases) than in others (e.g. following held notes or long pauses), and recent research suggests that musicians' use of auditory and visual cues may change throughout a performance as the predictability of co-performers' intentions fluctuates. We present two studies that investigate the nature of cues given in high-predictability and low-predictability musical contexts, and that test musicians' abilities to use these cues during duet performance. Study 1 tested pianists' reliance on auditory and visual cues in musical contexts where timing was more or less precisely specified by the score. Pianists performed the secondo part to three duets with recordings of pianists or violinists playing the primo parts, while the presence and absence of auditory and visual signals from the primo were manipulated. Asynchronies between the primo recordings and the participants' secondo performances were then calculated. The results showed increased reliance on visual cues when uncertainty about co-performers' intentions was high (i.e. at re-entry points following long pauses), but a strong reliance on auditory cues otherwise. Study 2 used motion capture to map the head and hand gestures that pianists and violinists use to cue each other in at the starts of pieces. This study is currently ongoing, but forwards-backwards head acceleration is hypothesized to indicate the timing of starting note onsets, and gesture duration is hypothesized to indicate piece tempo.
This research aims to enhance our understanding of which cues and modes of communication are important across different performance contexts, and is expected to benefit ongoing efforts to develop an intelligent accompaniment system capable of responding to human performance cues in real-time.

Perceptual relevance of asynchrony between orchestral instrument groups in two concert halls
SPEAKER: unknown

ABSTRACT. Timing in a music ensemble performance is asynchronous by nature. Asynchrony is generated by the players themselves, and further delays to listeners are introduced by the location and orientation of the instruments on stage. Perfect synchrony might also lead to masking effects. For instance, the harmonics of low-register instruments may be partially masked by high-register instruments. The perceived loudness of the low instruments then depends solely on their low-frequency components, which may be weak. Additional attenuation in the direct sound at low frequencies up to 1 kHz is introduced by the seat-dip effect.

This paper studies the perceptual relevance of asynchrony between three orchestral instrument groups: I) timpani and double basses, II) cellos, and III) other instruments (winds, brass, violins, violas), in two concert halls auralised from measurements. Perfect synchrony between the instruments was obtained by finding an energy threshold for both the instrument and room impulse onsets. Perfect synchrony was compared to 1) the lows (double basses and timpani) played first, with delays of 20 ms (cellos) and 40 ms (other instruments), and 2) the highs (other instruments) played first, with delays of 20 ms (cellos) and 40 ms (double basses and timpani). These delays are within the range observed in ensembles.

Listener preference was investigated with a paired-comparison online listening test using binaural renderings of the two concert halls over headphones. The results were analysed with a probabilistic choice model, showing that listener preference depends on the asynchrony: the lows-first case is the most preferred in both halls, while the highs-first case is the least preferred. The results also imply that the preference for timing depends on the concert hall, but confirming this requires future listening tests with a spatial audio system in order to reproduce the spatial characteristics of the concert halls more accurately.
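The abstract does not name the probabilistic choice model; for paired-comparison preference data, a Bradley-Terry model fitted with Hunter's minorization-maximization iteration is one standard choice. The sketch below uses invented preference counts for the three timing conditions, purely for illustration:

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """Fit Bradley-Terry strengths from a pairwise win-count matrix.

    wins[i, j] = number of times condition i was preferred over j.
    Returns strengths normalised to sum to 1 (MM algorithm, Hunter 2004).
    """
    n = wins.shape[0]
    p = np.ones(n) / n
    for _ in range(n_iter):
        new_p = np.empty(n)
        for i in range(n):
            num = wins[i].sum()                       # total wins of i
            den = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                      for j in range(n) if j != i)    # comparisons involving i
            new_p[i] = num / den
        p = new_p / new_p.sum()
    return p

# Hypothetical preference counts for three timing conditions:
# 0 = lows-first, 1 = perfect synchrony, 2 = highs-first.
wins = np.array([[0., 14, 18],
                 [6.,  0, 13],
                 [2.,  7,  0]])
strengths = bradley_terry(wins)
```

The fitted strengths can be read as preference probabilities: condition i is preferred over condition j with probability p_i / (p_i + p_j).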

Playing slow in reverberant rooms – Examination of a common concept based on empirical data
SPEAKER: unknown

ABSTRACT. During the performance of music, the room acoustical environment has a substantial influence on the player's perception of the music. This presumably affects their way of playing, all the more so if one assumes an inner reference for the sound they want to convey to the audience. Reducing the tempo in very reverberant rooms is a strategy often reported by musicians; it was, for example, recommended by J. J. Quantz in his famous music treatise of 1752. In this paper, the data collected in a field study conducted with a cellist in 7 European concert halls and in a laboratory study conducted with 12 musicians in 14 virtual performance spaces are used to investigate to what extent this strategy is actually followed in practice. A software-based analysis of the recordings of the musicians as well as room acoustical measurements in the halls formed the basis for a statistical analysis of the influence of parameters such as T30, EDT or ST_late on the tempo of the pieces played in each concert hall. The results suggest that the adjustment of tempo strongly depends on the basic tempo of the performed music and that there are different types of strategies adopted by musicians. These are elucidated by statements collected in interviews conducted with the performers during the experiments.

Flexible Score Following: The Piano Music Companion and Beyond
SPEAKER: unknown

ABSTRACT. In our talk we will present a piano music companion that is able to follow and understand (at least to some extent) a live piano performance. Within a few seconds the system is able to identify the piece that is being played, and the position within the piece. It then tracks the performance over time via a robust score following algorithm. Furthermore, the system continuously re-evaluates its current position hypotheses within a database of scores and is capable of detecting arbitrary ‘jumps’ by the performer. The system can be of use in multiple ways, e.g. for piano rehearsal, for live visualisation of music, and for automatic page turning. At the conference we will demonstrate this system live on stage. If possible, we would also like to encourage (hobby-)pianists in the audience to try the companion themselves. Additionally, we will give an outlook on our efforts to extend this approach to classical music in general, including heavily polyphonic orchestral music.

10:30-12:00 Session 27: Poster session III
Location: MDW Aula
Exploring the decay properties of guitar sounds from mobility measurements
SPEAKER: unknown

ABSTRACT. With the goal of providing the instrument maker with useful and fast numerical tools to characterize the finished instrument, we propose a processing system to evaluate the decay properties of guitars from only a few impact measurements. Our method relies on a hybrid synthesis technique first developed by J. Woodhouse (Acta Acustica, 2004). This technique derives synthetic signals of guitar plucks with a very light computational load and makes use of mobility measurements at the bridge. The obtained signal thus includes the complexity and the singularity of the mechanical and acoustical behavior of the guitar body, without having to estimate or model it. In preceding studies (B. David, ISMA, 2014), some preliminary results were obtained. In particular, it was shown that with only a one-dimensional measurement of the mobility it was possible to represent well the decay properties for all the notes of a specific string. This paper extends those results by dealing with different instruments, by comparing the accuracy of the prediction for several strings, and by using two-dimensional measurements of the mobility. The decay properties are studied with the help of the high-resolution ESPRIT method and are summarized by the Energy Decay Curve feature. This leads to a representation of the whole guitar compass as a so-called decay profile, which allows us to assess at once the properties of the instrument and its timbre homogeneity in terms of extinction, and possibly to detect and objectify defects such as the well-known “dead tones”.
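The Energy Decay Curve used to summarize decay is conventionally obtained by Schroeder backward integration of the squared signal, with a decay time read off a linear fit in dB. The minimal numpy sketch below runs on a synthetic single-partial pluck; the sampling rate, decay constant and fit range are illustrative assumptions, not values from the study:

```python
import numpy as np

def energy_decay_curve(h):
    """Schroeder backward integration of a pluck signal h.

    EDC(t) = 10*log10( sum_{k>=t} h[k]^2 / sum_k h[k]^2 ), in dB.
    """
    energy = np.cumsum(h[::-1] ** 2)[::-1]      # remaining energy at each sample
    return 10.0 * np.log10(energy / energy[0])

def decay_time(edc, fs, drop_db=60.0, fit_range=(-5.0, -25.0)):
    """T60-style decay time from a linear fit on a dB range of the EDC."""
    idx = np.where((edc <= fit_range[0]) & (edc >= fit_range[1]))[0]
    t = idx / fs
    slope, _ = np.polyfit(t, edc[idx], 1)       # dB per second (negative)
    return -drop_db / slope

# Synthetic single-partial "pluck": 110 Hz with a 0.25 s amplitude decay.
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
tau = 0.25
h = np.exp(-t / tau) * np.sin(2 * np.pi * 110 * t)
edc = energy_decay_curve(h)
t60 = decay_time(edc, fs)
```

For this signal the EDC falls at roughly 8.7/tau dB per second, so the estimated decay time lands near 1.7 s.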

Comparison of mouthpiece pressure signal and reed bending signal on clarinet and saxophone
SPEAKER: unknown

ABSTRACT. The clarinet and the saxophone have a similar sound excitation principle. For both instruments, a single reed is mounted on a beak-shaped mouthpiece and is excited by the player's blowing. Owing to the different shapes of the resonators, the sound of the cylindrical clarinet contains only the odd-numbered harmonics, whereas the sound of the conical saxophone contains all members of the harmonic series. Measurements on double-reed instruments by Voigt (1975) showed that the closing time of the double reed was constant across all pitches played on the instrument; consequently, only the offset period was modulated. Characteristic frequency gaps in the spectrum (formants) of an instrument's sound were explained by this specific motion pattern of the oscillator (pulse-forming theory, Fricke 1975). Can this theory also be applied to single-reed instruments such as the saxophone and the clarinet? For our measurements on a Bb clarinet and an alto saxophone, synthetic single reeds were equipped with strain-gauge sensors to capture the bending of the reed during sound production. A pressure transducer inside the chamber of the mouthpiece tracked the inner mouthpiece pressure. Two professional players performed a chromatic scale over the whole range of the instrument, either on the clarinet or the saxophone. From the reed-bending measurements, we calculated the ratio between the opening time and the closing time for each played tone. On the clarinet, this ratio was almost constant for all played tones (M = 0.71, SD = 0.09), whereas on the saxophone the ratios showed larger deviations but no clear pattern in relation to the played pitch (M = 0.64, SD = 0.44). Closing times for the tones eb', ab' and b' on the saxophone were much shorter than for the neighboring pitches. Spectrograms of the reed signal and the mouthpiece pressure signal were calculated for the steady-state part of the tones.
For the saxophone, both spectrograms were almost identical, depicting all members of the harmonic series with decreasing amplitude. Against our expectations, we also observed all harmonics in the reed signal of the clarinet, whereas in its mouthpiece pressure signal only the odd harmonics appeared. Overall, these reed-bending measurements indicate that the pulse-forming theory, which is valid for double-reed instruments, cannot be transferred to single-reed instruments like the clarinet or the saxophone, where the closing time varies with the pitch.
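The opening/closing-time ratio reported above can be illustrated with a crude threshold-based estimate on a reed-bending signal; the synthetic signal and its duty cycle below are invented for illustration and are not measured data:

```python
import numpy as np

def open_close_ratio(signal, threshold=0.0):
    """Crude proxy for the opening/closing-time ratio: number of samples
    above the threshold divided by the number of samples below it."""
    open_samples = np.count_nonzero(signal > threshold)
    return open_samples / (len(signal) - open_samples)

# Hypothetical asymmetric pulse: reed "open" for 4 of every 10 samples.
cycle = np.where(np.arange(10) < 4, 1.0, -1.0)
reed = np.tile(cycle, 100)
ratio = open_close_ratio(reed)
```

A real analysis would work cycle by cycle on the band-limited bending signal, but the ratio itself reduces to this kind of duty-cycle measurement.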

Steady State Sound Production and Investigations on Classic Guitars

ABSTRACT. The discussion about the quality of a guitar goes back to the early days of the instrument. Because of its growing popularity in recent decades, numerous experiments and theoretical investigations have been published in order to better understand the instrument and to improve the quality of its tone production.

The possibilities and tools for investigating the functionality and properties of a guitar have developed dramatically in recent years due to the availability of fast and cheap computers. Mathematical procedures and modelling based on finite-element methods make it possible to simulate virtually any instrument.

Here, a more practical approach is presented. The guitar is slightly modified so that the sound is produced in the very same way the string tension acts on the bridge. The guitar under test is excited with steady-state signals or, for frequency-range measurements, with sweep-sine or MLS signals. With these defined test signals, analysis is far easier to accomplish. Comparative frequency responses of famous old guitars and new models are shown, along with the influence of string tension and weight distribution, temperature and humidity. All results are verified by conventional analysis methods.

Investigating chime bar vibrations using high-speed stereophotogrammetry
SPEAKER: unknown

ABSTRACT. Stereophotogrammetry is an optical distance measurement method. Using digital image correlation, the 3D coordinates of control points on a surface are calculated from a stereo-camera recording. The use of high-speed cameras makes it possible to obtain time-resolved displacement data suitable for structural vibration analysis. The technique appears attractive for studying musical instruments under realistic performance conditions: the measurement is non-invasive, and both rigid-body motion and acoustically relevant vibrations are obtained from a single video sequence. However, the spatial resolution is far lower than that of interferometric measurement methods and depends strongly on the specific measurement setup, namely the size of the measurement window, the distance from the measurement object, and the lighting situation. This contribution presents a feasibility study on a double bass chime bar in A (f_0 = 110 Hz) and discusses potential fields of use of the method in musical acoustics research.

Pre-assembly violin auralization—listening to plate-tuning trends and to fine model-adjustments
SPEAKER: Robert Mores

ABSTRACT. Listening to violins before or during the manufacturing process might be desirable for luthiers. Two different approaches are revisited and compared. One approach identifies the mutual dependencies between plate modes, body modes and cavity modes as empirically derived from pre-assembly and post-assembly measurements [Bissinger, J. Acoust. Soc. Am. 132, 465 (2012)]. The derived model uses the critical frequency as an intermediary key parameter to co-define body and cavity modes as well as the radiation efficiency at higher frequencies. The bridge rocking frequency serves as a secondary key parameter to define radiation in the frequency range above 2 kHz. By varying these parameters, the resulting radiation filter makes it possible to listen to trends that translate directly to mode 2 and mode 5 plate tuning, but also to bridge tuning [Bissinger and Mores, J. Acoust. Soc. Am. 137, EL293 (2015)]. The other approach is based on sampling technology. The binaural impulse responses of an existing reference violin are sampled in the preferred listening position in a luthier's shop. Based on this sample, a luthier can modify individual resonances by editing in the frequency domain in order to explore fine adjustments for future models. A method has been developed to preserve the sampling quality while transforming such a partially modified violin spectrum into an audio-processable filter [Türckheim et al., DAFx-10 1-6 (2010)]. By means of real-time processing, a luthier can listen to the virtually modified violin while playing a silent violin. Both systems will be available for hands-on examination during the poster session.

Influence of mouthpiece geometry on saxophone playing
SPEAKER: unknown

ABSTRACT. Saxophonists agree that the detailed geometry of the mouthpiece plays an important role in both the playability and the sound production of the saxophone. The question examined in this paper is whether there is a difference in (1) the radiated sound (in terms of spectral centroid and sound pressure level) and (2) the playability when playing mouthpieces with different internal geometries. The results revealed that the radiated sound is scarcely influenced, but that the playability differs significantly depending on the mouthpiece.

The influence of nicks on the sound properties and the airflow in front of a flue organ pipe
SPEAKER: unknown

ABSTRACT. Nicking is a voicing technique for metal flue organ pipes that was widely used by Czech organ builders in the past. The effect of this restoration intervention is investigated. Alongside sound analysis, the airflow in front of the mouth is studied by means of phase-locked Particle Image Velocimetry. Both measurements were performed during the steady state of the pipe sound. The analyses agree with the organ builders' experience: a depression of harmonics is found in the spectra after the intervention. The airflow analysis shows significant differences in the velocity vector maps, namely a decrease of vector lengths near the flue. Such airflow velocity maps appear to have potential for organ pipe diagnostics.

TappingFriend – an interactive science exhibit for experiencing synchronicity with real and artificial partners
SPEAKER: unknown

ABSTRACT. TappingFriend is an interactive science game that aims to provide a playful experience of synchronization and cooperation between one or two humans and a virtual partner – the maestro. The players tap in time with the maestro on little drums, and the system provides immediate feedback on their synchronization success by showing their taps relative to the maestro’s taps and by counting the taps that were on time, too early, or too late. The aim of the game is to achieve as many on-time taps as possible. The maestro's degree of cooperation differs between play modes, and players experience the different levels of cooperativity while tapping, having to develop different strategies to stay on beat with their fellow players and the virtual partner. The four play modes offer different levels of cooperation by the maestro: in the first play mode, the maestro keeps a strict beat and does not react to either of the two players; thus, the players have to adapt to stay on time. In the second play mode, the maestro changes his tempo, getting faster or slower or both. In the third play mode, the maestro establishes cooperative tapping behavior by employing a simple phase and period correction model: he “listens” to the taps of his fellow players and adapts his tapping tempo and phase to stay as closely together as possible. In the fourth play mode, the maestro cues in with four beats and then leaves the two players on their own. The exhibit implements sensorimotor models of temporal coordination based on current research on synchronization and communication in music ensembles.
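The cooperative third play mode can be sketched with the standard linear phase- and period-correction model from sensorimotor synchronization research; the gain values and the player's steady beat below are illustrative assumptions, not the exhibit's actual parameters:

```python
def maestro_taps(player_taps, period=0.5, alpha=0.5, beta=0.1):
    """Generate maestro tap times that adapt to a player's taps.

    After each tap, the asynchrony (maestro minus player) is reduced by
    phase correction (gain alpha), and the maestro's internal period is
    adjusted by period correction (gain beta).
    """
    t, taps = 0.0, []
    for p in player_taps:
        taps.append(t)
        asyn = t - p                       # positive: maestro tapped late
        t = t + period - alpha * asyn      # phase correction
        period = period - beta * asyn      # period correction
    return taps

# Player keeps a steady 0.6 s beat; maestro starts with a 0.5 s period
# and gradually locks onto the player's tempo and phase.
player = [0.6 * i for i in range(40)]
taps = maestro_taps(player, period=0.5, alpha=0.5, beta=0.1)
```

With 0 < alpha < 2 and a small beta, the asynchrony decays geometrically, so the maestro ends up tapping nearly simultaneously with the player.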

Inton – a system for in-situ measurement of the pipe organ
SPEAKER: unknown

ABSTRACT. Inton is a measurement system for the repeatable, unbiased acoustical documentation and analysis of pipe organs in situ, independent of the actual placement of microphones in the space, according to “The method of the acoustical documentation of pipe organs, version 8&1” developed at MARC Prague. The system consists of a microphone set, microphone pre-amplifiers, A/D converters and laptop computers with special software. The system can be used for the acoustical documentation of pipe organs and the subsequent objective and/or subjective evaluation of recorded sounds and data. Organ builders can use Inton as a tool in various stages of the organ building or restoration process. The system also allows the measurement of room acoustics using the pipe organ as a sound source, without the need for any specialized equipment.

10:45-11:00 Coffee Break
11:00-11:45 Session 28: Gesture and communication in music performance II
Location: Fanny Hensel Saal
Real-time estimation of instrument controls with marker-based IR cameras

ABSTRACT. Scientific analysis and understanding of musical performances is an ambitious challenge at the intersection of a wide array of disciplines ranging from motor learning and the cognitive sciences to music pedagogy. Recently, the availability of technology and methods to measure many aspects of musical performances has allowed a better understanding of the mechanisms behind musical practice. Among these aspects, of special interest in this work is the measurement of instrumental controls. Several methods, each adapted to a specific kind of instrument, have been reported in recent years. Most of these methods are rather intrusive, and in many cases they require data post-processing, so that instrumental controls cannot be computed in real time, which is crucial for some applications. We present a method based on high-speed video cameras that track the position of reflective markers. The main advantages with respect to previous solutions are that 1) the degree of intrusiveness is very low, 2) the instrumental parameters can be computed in real time from the geometrical positions of the markers, and 3) several instruments and performers can be measured simultaneously. The main problem with such optical systems is marker occlusion. Each marker needs to be identified by at least three cameras placed at different angles and planes in order to correctly determine its 3D coordinates. Marker identification is made robust by the use of rigid bodies (RB): a six degrees-of-freedom (6DOF) rigid structure defined by the positions of a set of markers and associated with a local system of coordinates (SoC). The positions of the markers are constant relative to the local SoC, and their global coordinates can be obtained by a simple rotation and translation from the local to the global SoC. Even if some of the markers are occluded, their positions can be reconstructed from the others.
The method has been successfully applied to bowed strings by tracking the positions of the bow and the strings, and it is being adapted to the guitar, which presents extra difficulties as the hands of the performer are flexible skeletons rather than rigid bodies.
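The rigid-body reconstruction described above can be sketched as a Kabsch-style fit of rotation and translation from the visible markers, followed by mapping the occluded marker's local coordinates into the global frame; the marker layout and pose below are invented for illustration:

```python
import numpy as np

def fit_rigid_transform(local_pts, global_pts):
    """Kabsch-style estimate of rotation R and translation t such that
    global ~= R @ local + t, from matched marker coordinates (N x 3)."""
    cl = local_pts.mean(axis=0)
    cg = global_pts.mean(axis=0)
    H = (local_pts - cl).T @ (global_pts - cg)   # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cg - R @ cl
    return R, t

# Local (rigid-body) coordinates of four markers; the fourth is occluded.
local = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
# Ground-truth pose: rotation by 90 degrees about z, plus a translation.
R_true = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 1]])
t_true = np.array([2., 3, 4])
seen = (R_true @ local[:3].T).T + t_true         # three visible markers
R, t = fit_rigid_transform(local[:3], seen)
occluded_global = R @ local[3] + t               # reconstructed position
```

Three non-collinear visible markers determine the 6DOF pose uniquely, which is why the system needs each rigid body only partially visible.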

Piezoelectric film sensors facilitate simultaneous measurement of bowing parameters and bridge vibrations during violin playing
SPEAKER: unknown

ABSTRACT. In violin playing, an important part of the interaction between the player and the instrument is mediated by the bow. The bridge, on the other hand, transfers the bowed-string vibrations to the violin body. For the present investigations, both the action of the bow and the vibrations of the bridge are detected by thin polymer-film sensors with piezoelectric properties. By combining this technology with conventional acoustic and optical means of detection, aspects and measures of playability (especially the minimum bow force) can be related to the respective bridge and body vibrations. Results are discussed within the theoretical framework of Woodhouse and in comparison to the experimental work of Schoonderwaldt and Demoucron.

Towards Bridging the Gap in a Musical Live Performance
SPEAKER: unknown

ABSTRACT. Performances across diverse musical genres conventionally have a clear one-way structure: musicians perform while spectators listen, except when they sing along, for instance. In most cases, the audience’s opportunities for participation are limited to relatively inexpressive forms of interaction such as clapping, swaying and interjecting. By contrast, recently emerging technologies for audience participation allow spectators to collaborate in expressive and targeted ways with performing artists to influence and shape musical live performances in real time. Already, a rich variety of custom-built instruments, devices and systems have been devised for audience participation, with the potential to facilitate richly collaborative performance. The artistic potential of such technology-driven audience participation is high both for musicians and their audiences. Furthermore, it can bridge the gap between the active role of musicians and the passive role of spectators. Participative technologies can qualitatively change the overall experience in new, positive directions for all involved. However, if not considered carefully, audience participation can be annoying, may fail, and may lead to frustration. While the reasons for this can be manifold, we posit that the chances of successful audience participation are greatly improved by well-considered design. To this end, we systematically analysed a large number of existing approaches to audience participation in musical and non-musical domains. In addition, we conducted two case studies at live performances to shed light on conceptual and compositional constraints within the process of designing audience participation. Our insights are presented as a collection of structured design aspects that characterise participatory music performances and their broader contexts.
As a result, we propose the design toolkit "LiveMAP", which stands for “Live Music Audience Participation”, and which supports the design and creation of participatory elements in a musical live performance.

11:45-13:00 Session 29: Timbre perception
Location: Fanny Hensel Saal
SPEAKER: unknown

ABSTRACT. The roughness of violin tones was studied in a psychoacoustic experiment focused on the perception of different types of changes in the time course of sound signals. Differing rough sounds of the open violin G string, played with different bow speeds and forces, were recorded simultaneously with a high-speed video camera and as audio recordings. The audio recordings were used as stimuli in ranking, rating and pair-comparison listening tests. Roughness dissimilarity ratings and verbal attribute descriptions of perceived differences, together with the resulting descriptors, are joined with stimulus positions in an MDS perception space and with the extent of changes in the time courses of the sound signal and string motion. The research revealed a possible multi-dimensionality of perceived roughness. In the results, the “cracked” percept is linked to irregularities in both signals, and the “buzzing” percept to the superposition of very regular waveforms of neighbouring harmonics within one Bark.

Towards the Comparability and Generality of Timbre Space Studies
SPEAKER: Saleh Siddiq

ABSTRACT. Background: The perceptual relations of musical timbres are difficult to assess. So-called timbre spaces (TS) are a concept to depict timbre dissimilarities as spatial distances in a Euclidean space. Since the 1970s, the TS concept, as intuitively accessible as it is, has gained popularity within the scientific community and has been generally accepted. A recent comparison of several TS revealed a lack of consistency among TS studies (Siddiq et al. 2014). This is most likely caused by stimulus sets that differ vastly from study to study. Thus far, instruments were reduced to a single tone, compared at the same pitch, and only (re-)synthesized sounds were used.

Research question: These findings raise the question of whether an empirical meta TS would agree with the results of the original TS studies or confirm the inconsistency.

Methods: Based on the original stimuli used by Grey (1975), Krumhansl (1989) and McAdams et al. (1995), and on additional natural instrument sounds from the Vienna Symphonic Library (VSL), a listening experiment was performed. The obtained dissimilarity matrix was mapped, by means of multidimensional scaling (MDS), into a 3D scatter plot and subsequently structured through hierarchical clustering.
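The MDS step can be illustrated with classical (Torgerson) scaling, which embeds a dissimilarity matrix via double centring and eigendecomposition; the study presumably used a dedicated MDS implementation, and the dissimilarities below are synthetic Euclidean distances, not the experiment's data:

```python
import numpy as np

def classical_mds(D, k=3):
    """Classical (Torgerson) MDS: embed an n x n dissimilarity matrix D
    into k dimensions via double centring and eigendecomposition."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]              # k largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Synthetic dissimilarities: Euclidean distances between five points,
# which classical MDS recovers up to rotation and reflection.
pts = np.array([[0., 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
X = classical_mds(D, k=3)
D_rec = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

For genuinely Euclidean dissimilarities the embedded configuration reproduces the input distances exactly; perceptual data only approximate this, which is what the stress of the MDS solution measures.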

Results: The inconsistency is largely confirmed. Instead of the anticipated instrument clustering (e.g. all trumpets roughly located in the same region), the meta TS yields a clear clustering by stimulus set. Apparently, there is a greater timbral resemblance among the different instrument sounds from the same stimulus set than among the sounds of the same instrument across different stimulus sets. Hence, the timbral differences between the stimulus sets seem to prevail as primary features of timbre discrimination, which in turn significantly impairs the comparability, and thus the generality, of TS studies.

Modelling similarity perception of short music excerpts

ABSTRACT. There is growing evidence that human listeners are able to extract considerable amounts of information from short music audio clips containing complex mixtures of timbres and sounds. The information contained in clips as short as a few hundred milliseconds seems to be sufficient to perform tasks such as genre classification (Gjerdingen & Perrott, 2008; Mace, Wagoner, Teachout, & Hodges, 2012) or artist and song recognition (Krumhansl, 2010). The ability to extract useful, task-related information from short audio clips has also been shown to vary between individuals, and this variability has been the basis for the construction of a sound similarity sorting test (Musil, El-Nusairi, & Müllensiefen, 2013) as part of the Goldsmiths Musical Sophistication test battery (Müllensiefen, Gingras, Musil & Stewart, 2014), in which participants are asked to sort sixteen 800-ms clips into 4 groups by perceived similarity. In this talk we will present data to explain the individual differences in the ability to extract meaningful information from short audio clips and to compare audio extracts on the basis of sound information alone. In addition, we will present two approaches to identifying audio features of the short sound clips that drive listeners' judgements. The first approach (Musil, El-Nusairi, & Müllensiefen, 2013) makes use of timbre features in combination with powerful statistical prediction methods to approximate listener judgements. In contrast, the second approach (Müllensiefen, Siedenburg & McAdams, in prep.) relies on Tversky's theoretically motivated model of human similarity perception (Tversky, 1977) to explain listener judgements, making use of 22 spectro-temporal audio descriptors extracted from the clips with the Timbre Toolbox (Peeters et al., 2011). Non-negative matrix factorization was employed to decompose the clips-by-descriptors matrix into a matrix of binary features, which were then fed into Tversky's ratio model of similarity perception.
Results show the superiority of the second approach: Tversky's similarity model explains a higher proportion of the variance in the listener judgements and requires considerably less parameter tuning. The results are discussed in the context of psychological approaches to similarity perception, which seem to apply well to the perception of musical sound.
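
Tversky's ratio model, which underlies the second approach, is compact enough to sketch. The following is a minimal illustration on hypothetical binary feature vectors; the alpha/beta weighting and the example data are assumptions for illustration, not the authors' actual settings:

```python
import numpy as np

def tversky_similarity(a, b, alpha=0.5, beta=0.5):
    """Tversky's ratio model on binary feature vectors:
    S(a, b) = |A and B| / (|A and B| + alpha*|A minus B| + beta*|B minus A|)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    common = int(np.sum(a & b))     # features shared by both clips
    only_a = int(np.sum(a & ~b))    # features distinctive of clip a
    only_b = int(np.sum(~a & b))    # features distinctive of clip b
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom > 0 else 0.0

# two clips sharing 2 of their 3 active features
clip1 = [1, 1, 0, 1, 0]
clip2 = [1, 1, 1, 0, 0]
print(tversky_similarity(clip1, clip2))  # 2 / (2 + 0.5 + 0.5) = 0.667
```

Setting alpha and beta asymmetrically makes the measure direction-dependent, which is one of the psychologically motivated properties of Tversky's model.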

Comparing recorded and simulated musical instrument sounds: perspectives for a perceptual evaluation
SPEAKER: unknown

ABSTRACT. To better understand the properties of a musical instrument, a common practice is to compare recordings made in controlled situations with a computational model that attempts to recreate those situations. In this study we present the perceptual criteria we applied to evaluate the so-called hummer in two acoustic conditions, with and without first-order reflections. The hummer is a corrugated plastic tube that produces a clear pitch sensation when rotated at specific speeds. Our evaluation was based on estimates of the perceptual descriptors loudness, loudness fluctuation, roughness and fundamental frequency. We discuss why we chose these descriptors, the limitations of our analysis, and the aspects we consider important for extending this approach to the evaluation of other musical instruments.

Investigating the Colloquial Description of Sound by Musicians and Non-Musicians
SPEAKER: Jack Dostal

ABSTRACT. What is meant by the words used in a subjective judgment of sound? Do musicians, scientists, instrument makers and others mean the same things by the same expressions? These groups describe sound using an expansive lexicon of terms (bright, brassy, dark, pointed, muddy, etc.). The same terms and phrases may carry different or inconsistent meanings for these different groups of people, and may even fail to be applied consistently by a single individual across contexts. Relating scientific descriptions of sound to musical descriptions would be easier if the terms themselves were less ambiguous.

To investigate the use of words and phrases in this lexicon, subjects with varying musical and scientific backgrounds are surveyed. The subjects listen to recordings of sounds and music and are asked to describe, in their own colloquial language, the musical qualities and perceived quality differences in these pieces. Some qualitative results of this survey will be described, and some of the more problematic terms used by these various groups to describe sound quality will be identified.

14:00-14:30 Session 30: Voice and auralization
Location: Fanny Hensel Saal
Physical Modeling and Numerical Simulation of Human Phonation
SPEAKER: unknown

ABSTRACT. Human phonation is a complex interaction of fluid mechanics, solid mechanics and acoustics. As the lungs compress, air flows through the larynx, passing the vocal folds, which form a narrow constriction: the glottis. The air flow forces the vocal folds to vibrate, resulting in a pulsating air stream, which is the main sound-generating mechanism of phonation. Hence, our modeling approach is to resolve, within the larynx and adjacent regions, the physical details of the phonation process in space and time by means of partial differential equations (PDEs). Because of limitations in computer resources and current numerical methods, full coupling between all three fields for realistic 3D geometries is currently not feasible. We therefore concentrate on prescribed-flow computations, evaluate the acoustic sources and perform acoustic computations of the generated sound. In this way the fluid-solid interaction problem, whose accuracy critically depends on reliable geometrical and material parameters for all layers of the vocal folds, is circumvented. We apply the open-source program OpenFOAM to solve the 3D incompressible Navier-Stokes equations, and CFS++ (an in-house research code) to compute the acoustic sources as well as the sound propagation. The main findings of our current simulations can be summarized as follows. The dominant acoustic sources of the fundamental frequency and its harmonics are located inside the glottis, and the highest amplitudes are found in a thin layer just above the surface of the vocal folds. For the non-harmonic frequencies, the acoustic sources are concentrated in the vortical decay region. The simulated formant frequencies for the /i/ and /u/ vowels compare well with formant frequencies measured on human subjects. Furthermore, the simulations suggest that the false vocal folds induce an amplification of higher harmonics in the radiated acoustic field.

Auralization for musicians and instrument makers as a tool for evaluating a musical instrument
SPEAKER: unknown

ABSTRACT. When a musician is considering the purchase of a new instrument, the test room often does not have ideal acoustics, and the musician rarely has the opportunity to test the new instrument in a concert or recital hall before purchase. Yet one of the most important criteria when choosing a musical instrument is its sound in a representative performance environment. Auralization techniques can aid the musician in choosing the appropriate instrument: if recorded in a relatively anechoic environment, the dry sound of an instrument can be merged with the acoustic reflections of a performance environment, even in real time while playing the instrument. By reproducing the auralized sound, musicians can listen to their own playing as it would sound in a concert hall, for example, and so determine which instrument is the best one to purchase. Auralization can thus become a service offered to musicians by the instrument maker, in addition to the standard factory showroom.
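
The core operation behind such auralization is the convolution of the dry recording with a room impulse response. A minimal FFT-based sketch; the signals below are toy values, not measured data:

```python
import numpy as np

def auralize(dry_signal, room_ir):
    """Merge a dry (anechoic) recording with a room impulse response by
    linear convolution, the basic operation of auralization."""
    n = len(dry_signal) + len(room_ir) - 1
    # zero-pad to a power of two and convolve in the frequency domain,
    # which is efficient for long measured impulse responses
    N = 1 << (n - 1).bit_length()
    out = np.fft.irfft(np.fft.rfft(dry_signal, N) * np.fft.rfft(room_ir, N), N)
    return out[:n]

dry = np.array([1.0, 0.0, 0.0, 0.0])   # a single click
ir = np.array([1.0, 0.0, 0.5])         # direct sound plus one later reflection
wet = auralize(dry, ir)                # click followed by the reflection
```

For real-time use while playing, the same product is evaluated block by block (overlap-add partitioned convolution) rather than over the whole signal at once.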

14:30-15:00 Session 31: Organ acoustics
Location: Fanny Hensel Saal
Study on the influence of acoustics on organ playing using room enhancement
SPEAKER: unknown

ABSTRACT. A pilot study on the influence of different reverberation conditions on the musical performance of organ players is presented. Using an organ with MIDI output, three organ players were recorded performing the same pieces. A room acoustics enhancement system was used to modify the acoustic conditions of the Detmold concert house in real time.

Since the dynamics and tuning properties of the organ remain constant, the analysis focuses mainly on tempo features such as total duration, break duration and tempo variability, as extracted from the MIDI files. A set of binaural recordings has been obtained in order to relate the performance variations to the acoustic feedback received by the musicians. Finally, the participants are interviewed individually after the experiment to obtain their impressions of the influence of the acoustics.

The results show that reverberation has a direct influence on the musicians, leading to changes in tempo and in the duration of breaks between consecutive notes. However, this relation is conditioned by other factors, such as the character of the piece, the level, the global tempo and the individual player.
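
The tempo features mentioned (total duration, break durations, tempo variability) can be computed directly from note on/off times. A sketch on hypothetical (onset, offset) pairs in seconds, not the study's actual MIDI data:

```python
import statistics

def tempo_features(notes):
    """Extract simple tempo features from (onset, offset) note events:
    total duration, break durations (gaps between consecutive notes) and
    tempo variability (standard deviation of inter-onset intervals)."""
    notes = sorted(notes)
    onsets = [on for on, off in notes]
    total = notes[-1][1] - notes[0][0]
    breaks = [max(0.0, notes[i + 1][0] - notes[i][1])
              for i in range(len(notes) - 1)]
    iois = [onsets[i + 1] - onsets[i] for i in range(len(onsets) - 1)]
    variability = statistics.pstdev(iois) if len(iois) > 1 else 0.0
    return {"total_duration": total,
            "break_durations": breaks,
            "ioi_std": variability}

# three evenly spaced notes: constant tempo, two short breaks
feats = tempo_features([(0.0, 0.4), (0.5, 0.9), (1.0, 1.5)])
```

Comparing such features across acoustic conditions (per player and per piece) is the kind of analysis the MIDI recordings make possible.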

Coupled Organ Pipes and Synchronization - Numerical Investigations and Methods
SPEAKER: unknown

ABSTRACT. We present a new approach to investigating the interaction of two organ pipes numerically. By solving the compressible Navier-Stokes equations under suitable boundary and initial conditions, we can completely retrace the mutual interplay of the nonlinearly coupled system of two organ pipes, which leads to synchronization. We give detailed insights into the implementation and run such complex CFD/CAA simulations using parts of the open-source C++ toolbox OpenFOAM. Our numerical results are in excellent agreement with data from real synchronization experiments with organ pipes. This opens a new window onto the nonlinear fluid-mechanical and aeroacoustic mechanisms of sound generation, sound propagation and acoustic interaction of organ pipes. Of particular interest are the properties and functions of coherent turbulent fluid-mechanical structures inside the organ pipes, such as the oscillating air sheet, the jet and the primary vortex in the lower resonator region, as well as the influence of the upper labium. The techniques shown take a step beyond present research on the interaction of wind-driven musical instruments.
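
The abstract attacks synchronization with full compressible Navier-Stokes simulations; the phenomenon itself can be illustrated with a deliberately swapped-in toy model, a pair of Kuramoto phase oscillators. The frequencies, coupling strength and locking condition below are textbook values for that toy model, not parameters from this work:

```python
import math

def kuramoto_pair(w1, w2, K, dt=1e-3, steps=20000):
    """Two mutually coupled phase oscillators:
    d(theta_i)/dt = w_i + K*sin(theta_j - theta_i).
    They frequency-lock when the detuning satisfies |w1 - w2| <= 2K; the
    phase difference then settles to asin((w1 - w2) / (2K))."""
    th1, th2 = 0.0, 0.0
    for _ in range(steps):  # simple forward-Euler integration
        d1 = w1 + K * math.sin(th2 - th1)
        d2 = w2 + K * math.sin(th1 - th2)
        th1 += dt * d1
        th2 += dt * d2
    return th1 - th2  # final phase difference

# strong coupling: the pair locks with a small constant phase lag
locked = kuramoto_pair(10.2, 10.0, K=0.5)    # settles near asin(0.2) ~ 0.20 rad
# weak coupling: the phase difference keeps drifting
drifting = kuramoto_pair(10.2, 10.0, K=0.01)
```

The CFD simulations resolve the physical coupling (the acoustic field between the pipe mouths) that such phase models only parameterize with a single constant K.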

15:00-16:00 Session 32: Physical modeling tools for musical instruments
Location: Fanny Hensel Saal
Digital Guitar Workshop – A physical modeling software for instrument builders
SPEAKER: unknown

ABSTRACT. Instrument builders have an ideal of sound in mind. Their experience in craftsmanship guides them through the production process, which is essentially a trial-and-error method: each part of the instrument considered relevant to the imagined sound is modified during production. The whole process, from a certain imagination of the sound to the completed guitar and back, represents an inverse problem: ideally, the builder would know which geometry produces the desired sound. The physical-modeling software Digital Guitar Workshop (DGW) is a tool that supports guitar builders in both trial-and-error design and inverse problem solving. To this end, 32 sample guitars were measured, in terms of the geometry of each part and of their radiation, using microphone-array techniques including the minimum-energy back-propagation method. Starting from a preferred guitar in the sample, builders can use a graphical user interface (GUI) to change properties of the instrument in detail, such as the number, size and position of fan bracings, the thickness distributions of top and back plate, sound-hole size and position, bridge size and position, and much else. Builders can pluck a virtual tone on the graphical soundboard, which is computed immediately, to monitor the design process. The physical model of the guitar sound, simplified with respect to longitudinal waves and ribs, is based on the principles published in R. Bader: Computational Mechanics of the Classical Guitar (Springer, 2005). Furthermore, the builder can modify an existing sound in the software by increasing or decreasing the sound strength within a chosen band and bandwidth using an equalizer. The software then proposes a geometry that meets this sound with the best approximation.
In the current software version, this inverse problem is solved by a simple search algorithm over a large sample of pre-modified guitar geometries. The database shows large deviations in sound when the basic shape of the guitar is changed, and small deviations in detail when the fan bracing or plate thickness is changed. Future work will improve the sound-processing algorithm towards a mathematical solution.
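
A search-based inverse solution of the kind described reduces to a nearest-neighbour lookup over precomputed (geometry, spectrum) pairs. A sketch with made-up geometry labels and band magnitudes; nothing below comes from the DGW database itself:

```python
import numpy as np

def best_matching_geometry(target_spectrum, database):
    """Return the stored geometry whose precomputed spectrum is closest,
    in the least-squares sense, to the desired target spectrum."""
    target = np.asarray(target_spectrum, float)
    scored = [(float(np.sum((np.asarray(spec, float) - target) ** 2)), geom)
              for geom, spec in database]
    return min(scored)[1]  # geometry with the smallest squared error

# toy database: geometry description -> magnitudes in three frequency bands
db = [
    ("thin top, light bracing", [0.9, 0.6, 0.3]),
    ("thick top, heavy bracing", [0.5, 0.4, 0.1]),
]
print(best_matching_geometry([0.8, 0.6, 0.3], db))  # "thin top, light bracing"
```

The equalizer step in DGW effectively edits the target spectrum before such a lookup; a "mathematical" inverse solution would replace the discrete search with an optimization over continuous geometry parameters.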

Real-time physical modeling of large instrument geometries using coupled FPGAs

ABSTRACT. A recent methodology for simulating and synthesizing physical models of complete instrument geometries is extended to facilitate the implementation of larger physical models in real time. The existing system uses explicit finite-difference methods to simulate and synthesize physical models of musical instruments on Field Programmable Gate Array (FPGA) hardware. To extend its computational abilities, the system is enhanced with more recent FPGA hardware consisting of two Virtex-7 FPGAs, a 2000T and a 690T, connected to a personal computer via a PCIe interface. A first implementation of a large-scale geometry using explicit finite-difference methods is compared to a specifically adapted pseudo-spectral implementation of a plate model, which is applied to simulate a geometrically correct model of a grand piano soundboard. A central interest of this work lies in its applicability to real-world problems arising in instrument-acoustics research and instrument design; dynamic configurability and controllability of the models is therefore sought. To this end, an input/output protocol is used for real-time adaptation of the physical parameters of each model part. The simulation results of the soundboard model are compared to measurements taken on real grand piano soundboards at different production stages.
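
An explicit finite-difference update of the kind mapped onto such hardware can be sketched in a few lines for the 2D wave equation. A real soundboard model uses a stiff-plate operator and measured material data; the grid size, Courant number and excitation below are illustrative only:

```python
import numpy as np

def fdtd_membrane_step(u, u_prev, courant2):
    """One leapfrog update of the 2D wave equation u_tt = c^2 (u_xx + u_yy)
    on a square grid with clamped edges.  courant2 = (c*dt/dx)^2 must
    satisfy courant2 <= 0.5 for stability of the explicit scheme."""
    # five-point discrete Laplacian
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
    u_next = 2.0 * u - u_prev + courant2 * lap
    # clamped boundary: displacement fixed at zero on all edges
    u_next[0, :] = u_next[-1, :] = u_next[:, 0] = u_next[:, -1] = 0.0
    return u_next

# strike the centre of a 32x32 membrane and run 100 time steps
u = np.zeros((32, 32)); u[16, 16] = 1.0
u_prev = u.copy()
for _ in range(100):
    u, u_prev = fdtd_membrane_step(u, u_prev, courant2=0.4), u
```

Because every grid point is updated independently from its neighbours' previous values, this scheme parallelizes naturally across FPGA logic, which is what makes real-time operation of large geometries feasible.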

Feasibility analysis of real-time physical modeling using WaveCore processor technology on FPGA

ABSTRACT. WaveCore is a scalable many-core processor technology, developed and optimized specifically for acoustical modeling applications. The programmable WaveCore soft-core processor is in principle silicon-technology independent and can hence be targeted at ASIC or FPGA technologies. The WaveCore programming methodology is based on dataflow principles, and the abstraction level of the programming language is close to the mathematical structure of, for instance, finite-difference time-domain schemes. The instruction set of the processor inherently supports delay lines and dataflow-graph constructs; hence, the processor technology is well suited to capturing both digital-waveguide and finite-difference algorithm descriptions. We have analysed the feasibility of mapping 1D and 2D finite-difference models onto this processor technology, taking Matlab reference code as a starting point, and analysed the scalability and mapping characteristics of such models on the WaveCore architecture. Furthermore, we investigated the composability of such models, an important property for creating and mapping complete musical instrument models. One part of the composability analysis was the combination of a digital waveguide (an FDN reverberation model) with finite-difference time-domain models (a primitive six-string instrument model). Our main conclusion is that WaveCore is a promising technology for this application domain. The mapping experiments show high efficiency in terms of FPGA utilization, combined with a programming methodology that matches the mathematical abstraction level of the application domain in a transparent way. We used a standard FPGA board to validate the analysis end to end, as well as WaveCore compiler and simulator results to show the scalability of the processor technology to large models.
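
The digital-waveguide side of such models can be illustrated by its simplest member, the Karplus-Strong plucked string: a recirculating delay line (which WaveCore supports natively in its instruction set) with an averaging loss filter in the loop. The parameters are illustrative and unrelated to the WaveCore reference models:

```python
import numpy as np

def karplus_strong(delay_len, n_samples, seed=0):
    """Simplest digital-waveguide string (Karplus-Strong): a delay line with
    a two-point averaging loss filter in the feedback path.  The delay
    length sets the pitch, roughly sample_rate / delay_len."""
    rng = np.random.default_rng(seed)
    buf = rng.uniform(-1.0, 1.0, delay_len)  # noise burst models the pluck
    out = np.empty(n_samples)
    for i in range(n_samples):
        j = i % delay_len
        out[i] = buf[j]
        # loss filter: average the sample with its neighbour each pass,
        # which gradually damps the noise burst into a decaying tone
        buf[j] = 0.5 * (buf[j] + buf[(j + 1) % delay_len])
    return out

tone = karplus_strong(delay_len=100, n_samples=4000)
```

In a combined model like the one analysed in the paper, such a waveguide loop would feed into, and be driven by, finite-difference sections through shared junction variables.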

Software Simulation of Clarinet Reed Vibrations

ABSTRACT. Electric circuit analysis programs such as MicroCAP [1] are useful for simulating the acoustical and mechanical behaviour of musical instruments. As shown in [2], frequency-dependent characteristics such as the acoustic impedances of wind instruments can be simulated, and pressure and volume flow inside a tube can be displayed graphically [3]. A so-called AC analysis was used for these tasks. In the present paper, the transient response of a purely mechanical device, the clarinet reed, is studied first. MicroCAP can display several parameters on a time scale; for this a different analysis is used, namely TR (transient) analysis. The quasi-static relation between volume flow and pressure difference is the only acoustical-mechanical question dealt with in this paper. The paper explains the electro-mechanical-acoustical analogies on which the simulations are based. Finally, the suitability of the software model is demonstrated by checking the results against the literature [4], [5].

[1] Micro-Cap, Electronic Circuit Analysis Program, Spectrum Software, 1021 South Wolfe Road, Sunnyvale, CA 94086

[2] Schueller, F., Poldy, C.: Improving a G-high Clarinet using Measurement Data and an Electronic Circuit Analysis Program. Proceedings of the 3rd Conference Viennatalk, Sept. 2015

[3] Schueller, F., Poldy, C.: Using software simulation of the tone-hole lattice in clarinet-like systems. Proceedings of the 3rd Conference Viennatalk, Sept. 2015

[4] Avanzini, F., Walstijn, M. v.: Modelling the Mechanical Response of the Reed-Mouthpiece-Lip System of a Clarinet. Part I. A One-Dimensional Distributed Model. Acta Acustica united with Acustica, Vol. 90 (2004), 537–547

[5] Walstijn, M. v., Avanzini, F.: Modelling the Mechanical Response of the Reed-Mouthpiece-Lip System of a Clarinet. Part II. A Lumped Model Approximation. Acta Acustica united with Acustica, Vol. 93 (2007), 435–446
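
The quasi-static volume-flow/pressure-difference relation studied in the paper is commonly written as Bernoulli flow through a reed channel that closes linearly with blowing pressure. The sketch below uses this textbook form with illustrative reed dimensions, not the circuit model of the paper:

```python
import math

def reed_flow(dp, h0=4e-4, w=1.2e-2, k=8e6, rho=1.2):
    """Quasi-static volume flow through a single-reed channel:
    u = w * h(dp) * sqrt(2*dp/rho), with the channel height closing
    linearly with pressure difference, h(dp) = max(h0 - dp/k, 0).
    h0: rest opening [m], w: channel width [m], k: reed stiffness per
    area [Pa/m], rho: air density [kg/m^3].  All values illustrative."""
    if dp <= 0.0:
        return 0.0
    h = max(h0 - dp / k, 0.0)      # reed beats against the lay at dp = h0*k
    return w * h * math.sqrt(2.0 * dp / rho)
```

The flow is zero both at zero pressure difference and at the closing pressure h0*k, with a maximum at one third of the closing pressure; this non-monotonic characteristic is what makes self-sustained oscillation possible.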

16:00-16:15 Coffee Break
16:15-17:30 Session 33: Audio signal processing
Location: Fanny Hensel Saal
Spatial Manipulation of Musical Sound: Informed Source Separation and Respatialization

ABSTRACT. "Active listening" enables the listener to interact with the sound while it is played, as composers of electroacoustic music do. The main manipulation of the musical scene is (re)spatialization: moving sound sources in space.

This is equivalent to source separation: moving all the sources of the scene but one away from the listener separates that source, and once the sources are separate, rendering the corresponding scene (spatial image) from them is straightforward.

Allowing this spatial interaction / source separation from fixed musical pieces with sufficient quality is too challenging a task for classic approaches, since it requires an analysis of the scene with inevitable (and often unacceptable) estimation errors.

We therefore introduced the informed approach, which consists in inaudibly embedding additional information in the signal. This information, coded at a minimal rate, increases the precision of the analysis / separation; the informed approach thus relies on both estimation theory and information theory.

Since the initial presentation at VITA 2010, several informed source separation (ISS) methods have been proposed. Among the best is the one based on spatial filtering (beamforming), with the perceptually coded spectral envelopes of the sources as additional information.

More precisely, the proposed method is realized in an encoder-decoder framework. At the encoder, the spectral envelopes of the (known) original sources are extracted, their frequency resolution is adapted to the critical bands, and their magnitude is logarithmically quantized. These envelopes are then passed on to the decoder with the stereo mixture. At the decoder, the mixture signal is decomposed by time-frequency selective spatial filtering guided by a source activity index, derived from the spectral envelope values.

Real-time manipulation of the sound sources is then possible from initially fixed musical pieces (possibly stored on a medium such as CD), with unequaled, controllable quality.
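
The decoder-side separation can be sketched as envelope-guided time-frequency masking, a Wiener-like soft mask built from the transmitted spectral envelopes. This is a simplification: the actual method combines such an activity index with spatial filtering (beamforming) of the stereo mixture, which is omitted here:

```python
import numpy as np

def envelope_guided_separation(mixture_spec, envelopes):
    """Distribute each time-frequency bin of a mixture spectrogram among
    the sources in proportion to their transmitted spectral-envelope
    energy (a Wiener-like soft mask).
    mixture_spec: (n_freq, n_frames); envelopes: (n_sources, n_freq, n_frames).
    Returns per-source spectrograms of shape (n_sources, n_freq, n_frames)."""
    env = np.asarray(envelopes, float)
    weights = env ** 2                                  # envelope energy
    total = weights.sum(axis=0, keepdims=True)
    masks = np.where(total > 0,
                     weights / np.maximum(total, 1e-12),
                     1.0 / env.shape[0])                # share silent bins equally
    return masks * np.asarray(mixture_spec)
```

Because the masks sum to one in every bin, the separated spectrograms always add back up to the original mixture, so respatialization never loses signal energy.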

Modeling the spectrum structure within the NMF framework. Application to inharmonic sounds.
SPEAKER: unknown

ABSTRACT. Non-negative matrix factorization (NMF) has received much attention during the last decade, with some remarkably successful applications in the fields of audio source separation and automatic music transcription. To that end, prior information and modeling have often been included in the general framework to account for our knowledge of the spectral structure of sounds, such as the harmonicity characteristic of many musical sounds, or the smoothness of their spectral envelope; these help the algorithm approach a relevant solution. NMF decomposes a non-negative time-frequency representation of a musical scene into the product of low-rank non-negative matrices: a matrix of spectral atoms, or templates, and a matrix of their activations in time. But this decomposition is not unique, and the algorithm not infrequently converges to a local minimum of the cost function where the spectral atoms are not easily identifiable or, at least, have to be post-processed, either to separate the contributions of different audio events or to aggregate them in order to recover a single, coherent musical note. This presentation will review techniques we have developed in the past few years for modeling the spectrum structure within the NMF framework, and their application to the analysis of inharmonic sounds such as those of the piano. This allows us to examine some large-scale properties that characterize the state of the instrument, such as the design of its tuning, or the inharmonicity curve along the whole compass.
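
The basic decomposition can be sketched with the classical multiplicative updates for the Euclidean cost (Lee & Seung). The structural priors discussed in the talk, such as harmonicity or spectral smoothness, are not included in this plain version:

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0):
    """Plain NMF with multiplicative updates minimising the Euclidean
    distance: V ~ W @ H with all entries non-negative.  W holds the
    spectral templates, H their activations in time."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 1e-3
    H = rng.random((rank, m)) + 1e-3
    for _ in range(n_iter):
        # multiplicative updates preserve non-negativity by construction
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H
```

The non-uniqueness mentioned in the abstract is visible here directly: any rescaling W @ D, D^-1 @ H with a positive diagonal D gives the same product, and different random seeds can land in different local minima.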

Towards Realistic and Natural Synthesis of Musical Performances: Performer, Instrument and Sound modeling
SPEAKER: Alfonso Perez

ABSTRACT. Imitation of musical performances by a machine is an ambitious challenge involving several disciplines, such as signal processing, musical acoustics and machine learning. The most important techniques focus on modelling either the instrument (physical models) or the sound (signal models), but they lack an explicit representation of the performer. Recently, technology and methods have become available to accurately measure the performer's instrumental controls, and these can be exploited to improve current sound-synthesis models. In this work we present an approach that combines models of the sound, of the instrument and of the performer, in order to generate natural performances with realistic sounds automatically from a musical score. The method uses the violin as a use case and is composed of three layers: the first corresponds to expressivity models, the second is a signal model driven by performer actions, and the third is an acoustic model of the sound-radiation properties of the violin body.

A comparison of single-reed and bowed-string excitations of a hybrid wind instrument
SPEAKER: unknown

ABSTRACT. A hybrid wind instrument is constructed by connecting a theoretical excitation model (such as a real-time computed physical model of a single-reed mouthpiece) to a loudspeaker and a microphone placed at the entrance of a wind-instrument resonator (in our case, a clarinet-like tube). The successful construction of a hybrid wind instrument, and its evaluation with a single-reed physical model, has been demonstrated in previous work. In the present paper, inspired by the analogy between the principal oscillation mechanisms of wind instruments and bowed-string instruments, we introduce the stick-slip mechanism of a bow-string interaction model (the hyperbolic model with absorbed torsional waves) into the hybrid wind instrument setup. First, a dimensionless, reduced-parameter form of this model is proposed, which reveals its (dis)similarities with the single-reed model. Just as with the single-reed model, the hybrid sounds generated with the bow-string interaction model are close to the sounds predicted by a complete simulation of the instrument. However, the hybrid instrument is more easily destabilised at high bowing forces. The bow-string interaction model produces some raucous sounds (characteristic of bowed-string instruments at low bowing speeds), which represents the main perceived timbral difference between it and the single-reed model. Another apparent timbral difference is the odd/even harmonics ratio, which spans a larger range for the single-reed model. Nevertheless, for both models most sound descriptors fall within the same range for a (stable) variety of input parameters, so the differences in timbre remain relatively small. This is supported by the similarity of the two excitation models and by empirical tests with other, more dynamic excitation models.
Finally, a generalised stability condition for the hybrid instrument is obtained: the derivative of the dimensionless nonlinear function characterising the excitation model should stay below unity over its entire operational domain.
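
The stated stability condition (derivative of the dimensionless excitation nonlinearity below unity over its operational domain) can be checked numerically for any candidate excitation model. The test functions below are illustrative stand-ins, not the paper's reed or friction curves:

```python
import numpy as np

def is_hybrid_stable(f, domain, n=1000):
    """Check whether the derivative of the dimensionless excitation
    nonlinearity f stays below unity over the operational domain,
    estimated by finite differences on n sample points."""
    x = np.linspace(domain[0], domain[1], n)
    dfdx = np.gradient(f(x), x)        # numerical derivative of f
    return bool(np.all(dfdx < 1.0))

print(is_hybrid_stable(lambda x: 0.5 * np.tanh(x), (-2.0, 2.0)))  # True
print(is_hybrid_stable(lambda x: 2.0 * x, (-2.0, 2.0)))           # False
```

For an analytic excitation model one would of course bound the derivative symbolically; the numerical check is useful when the nonlinearity is only available as measured or tabulated data.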

Automatic music transcription using spectrogram factorization methods

ABSTRACT. Automatic music transcription (AMT) is defined as the process of converting an acoustic music signal into some form of human- or machine-readable musical notation. It can be divided into several subtasks, which include multi-pitch detection, note onset/offset detection, instrument recognition, pitch/timing quantisation, extraction of rhythmic information, and extraction of dynamics and expressive information. AMT is considered a key enabling technology in music signal processing but despite recent advances it still remains an open problem, especially when considering multiple-instrument music.

A large part of current AMT research focuses on spectrogram factorization methods, which decompose a time-frequency representation of a music signal into a series of note templates and note activations. This has led to music transcription systems that are computationally efficient, robust, and interpretable. In this talk, I will present recent advances in AMT focusing on proposed systems that are able to detect multiple pitches and instruments, and are able to support tuning changes and frequency modulations. Recent work on creating a transcription system that models the temporal evolution of each note as a succession of sound states (such as attack, sustain, and decay) will also be presented.
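
The factorization output typically becomes notation via a simple post-processing step: threshold the activation matrix and group consecutive active frames into note events. A minimal sketch, with an arbitrary threshold and frame duration:

```python
import numpy as np

def activations_to_notes(H, threshold, frame_dur):
    """Convert an activation matrix H (pitches x frames) into note events
    (pitch_index, onset_s, offset_s) by thresholding and grouping runs of
    consecutive active frames."""
    notes = []
    for pitch, row in enumerate(np.asarray(H) >= threshold):
        start = None
        for t, active in enumerate(row):
            if active and start is None:
                start = t                      # note onset
            elif not active and start is not None:
                notes.append((pitch, start * frame_dur, t * frame_dur))
                start = None                   # note offset
        if start is not None:                  # note still sounding at the end
            notes.append((pitch, start * frame_dur, len(row) * frame_dur))
    return notes

H = np.array([[0.0, 0.9, 0.8, 0.1],
              [0.0, 0.0, 0.0, 0.7]])
print(activations_to_notes(H, threshold=0.5, frame_dur=0.5))
```

The sound-state modelling mentioned above (attack, sustain, decay) replaces this hard threshold with a hidden-state sequence per note, which is considerably more robust to activation noise.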

The final part of this talk will address the applicability of AMT methods to fields beyond music signal processing, namely musicology, performance science and music education. Specific examples of the use of AMT technology will be given for problems related to the analysis of temperament, the analysis of non-Western music, and the creation of systems for automated piano tutoring.

17:30-18:00 Session: Farewell
Location: Fanny Hensel Saal