Computer-aided Composition of Spatial Sound Sculpture


This collaborative artistic research project aimed to deepen and artistically explore the notion of spatial sound synthesis, i.e. the extension of sound synthesis algorithms to the spatial domain. This includes concepts such as extended or compound sources and spectral spatialization, as well as the concept of a “synthetic soundscape”, i.e. working with densities in various perceptual dimensions (e.g. frequency vs. time vs. space). One of the artistic motivations was to explore not only the composition of sound expanded in space, but also a synthetic background around the listener that grows progressively denser in distinct, emerging sound layers.

Marlon Schumacher’s research and development extended the spatial sound synthesis framework embedded in the OpenMusic computer-aided composition (CAC) environment with new DSP algorithms and control tools, based on a restructured, modular architecture. To handle the complexity of spatial sound composition and synthesis, Núria Giménez-Comas employed graphical representations of masses and densities to drive the sound synthesis and spatialisation tools, specifically the OMChroma and OMPrisma frameworks.

Giménez-Comas created a number of artworks during this residency project: 1) a « poetic architectural » piece, 2) an installation study and 3) an ambisonics-acousmatic piece. The first and second artworks render close-proximity sound objects and distant textures by combining different spatialisation systems (Ambisonics and Virtual Microphones) in an unorthodox loudspeaker-sculpture configuration, as a promenade installation in which audience members were invited to move freely in space and actively explore the rendered soundscape.

Fig. 1 Pilot loudspeaker configuration by M. Schumacher at MUTProbe2 - HfM Karlsruhe, September 2017 

Research Motivations

Control Strategies 

The control of compound sound sources and spatial textures often requires the specification of large sets of parameters, which can quickly become too numerous to set by hand. One possible approach is to use higher-level models or algorithms that can be adapted to a given user context and that render structured arrays of individual parameter specifications from a small number of higher-level parameters. Thanks to the programming paradigms underlying OpenMusic and Common Lisp (object-oriented programming, the meta-object protocol (MOP), functional programming), a number of new tools and interfaces for meta-programming and manipulation of arrays of spatial control data have been developed. The new function traj-array, for instance, accepts a trajectory, a distribution of 3D points (e.g. a 3D mesh, a point cloud or possibly a sampled “meta”-trajectory) and additional parameters. It applies global manipulations and transforms (e.g. translations and rotations) to compute new instances of trajectories from these meta-specifications. This process is open to recursion (each resulting trajectory can be used as a meta-specification, or a trajectory can be “convolved” with itself) and therefore gives the user the flexibility of traversing various levels of depth. The figure below, for example, illustrates the generation of a structured array of trajectories with individual rotations and translations from a few higher-level parameters.

Fig. 2 The function “traj-parametric” generates a single trajectory; the function “2D-mesh” produces a regular 2D-point-grid; the function “om-sample” turns a continuous line (breakpoint function) into a list of individual values. Using these specifications, the function “traj-array” computes copies of the trajectory at the position of the point-grid, increasingly rotated, as specified by the sampled breakpoint-function at its third inlet. The “3DC-lib” object displays the resulting mesh of trajectories.
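
The principle behind traj-array can be illustrated outside OpenMusic with a minimal Python sketch. This is not the OMPrisma implementation: the names traj_array, mesh_2d and rotate_z are hypothetical, and the choice of rotating each copy about the vertical axis before translating it to its anchor point is an assumption made for illustration.

```python
import math

def rotate_z(point, angle_deg):
    """Rotate a 3D point about the z-axis by angle_deg degrees."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def mesh_2d(nx, ny, spacing=1.0):
    """Regular nx-by-ny grid of points in the z = 0 plane (hypothetical helper)."""
    return [(i * spacing, j * spacing, 0.0) for j in range(ny) for i in range(nx)]

def traj_array(trajectory, anchors, angles_deg):
    """Copy `trajectory` to every anchor point, rotating copy k by angles_deg[k]."""
    if len(anchors) != len(angles_deg):
        raise ValueError("need one rotation angle per anchor point")
    result = []
    for (ax, ay, az), angle in zip(anchors, angles_deg):
        copy = []
        for p in trajectory:
            rx, ry, rz = rotate_z(p, angle)           # rotate around the trajectory origin
            copy.append((rx + ax, ry + ay, rz + az))  # then translate to the anchor
        result.append(copy)
    return result

# A single trajectory: a short straight line along the x-axis.
line = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
grid = mesh_2d(3, 2, spacing=2.0)                                # 6 anchor points
angles = [k * 90.0 / (len(grid) - 1) for k in range(len(grid))]  # 0 to 90 degrees
trajectories = traj_array(line, grid, angles)
```

Because the result is again a list of point lists, any resulting trajectory can be passed back in as the anchors argument, which mirrors the recursive use described above.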

Projection vs. Immersion

A further consideration regarding the artistic motivations (somewhat complementary to the synthesis and control tools) pertains to the capabilities of the auditory display system to deliver the perceptual cues required for making these abstract ideas (extended sources, textures, etc.) perceivable to the human hearing system. As intriguing as the visual analogy of physical shapes with attributes like extension, orientation etc. may be, its application to the auditory domain is limited, as there are important differences between the perceptual modalities which need to be accounted for. This applies perhaps even more to spatial scenes of such complexity, since the auditory system’s evolutionary “heuristic” mechanism is to detect redundancies and commonalities/correlations in order to group and fuse stimuli into the simplest (most plausible) mental representation.

For the presentation of extended sources, such as shapes, areas, and other geometries, perception and localization in all three spatial dimensions (in particular distance) might be essential. Since the auditory perception of depth/distance is already somewhat limited in natural settings, it can become even more difficult in the context of 3D-audio reproduction, e.g. when listeners are constrained to a fixed listening position. In fact, in many conventional approaches a 3D sound scene can be considered a 2D “projection” for an immovable listener fixed at an idealized position (sweet spot), as is the case with most panning-based and simple ambisonics systems. There are findings, however, suggesting that additional locomotive and proprioceptive cues combined with acoustic parallax effects can increase and stabilize the auditory perception of distance and depth. In order to account for this, we employed a spatialization and auditory display system based on virtual microphone control (ViMiC), with loudspeakers distributed throughout the stage/presentation spaces. Each loudspeaker delivers the signal of a virtual microphone corresponding to its physical position (rendering its individual level and time differences) and thus acts as an acoustic “window” onto a virtual scene. This rather unorthodox reproduction format invites listeners to move freely and explore the spatial sound rendering in an enactive rather than passive mode of listening, including the possibility of making use of additional perceptual cues (see e.g. the work of D. Wessel for reference).
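
The individual level and time differences per loudspeaker can be sketched as follows. This is a strongly simplified illustration and not the ViMiC algorithm itself: it assumes a plain 1/r distance attenuation and a propagation delay of r/c, and it omits microphone directivity, reflections and everything else a full virtual microphone model provides. The function name virtual_mic_feeds and the speed of sound value are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C

def virtual_mic_feeds(source, mics, ref_distance=1.0):
    """Return one (gain, delay_seconds) pair per virtual microphone.

    gain follows a 1/r law, clipped to 1.0 inside ref_distance;
    delay is the propagation time from source to microphone.
    """
    feeds = []
    for mic in mics:
        r = math.dist(source, mic)
        gain = ref_distance / max(r, ref_distance)
        delay = r / SPEED_OF_SOUND
        feeds.append((gain, delay))
    return feeds

# Four loudspeakers (= virtual microphones) spread through a room, positions in metres.
speakers = [(0.0, 0.0, 1.5), (4.0, 0.0, 1.5), (0.0, 6.0, 1.5), (4.0, 6.0, 1.5)]
source = (1.0, 0.0, 1.5)
feeds = virtual_mic_feeds(source, speakers)
```

Since every loudspeaker receives its own gain and delay derived from its physical position, a listener walking between the loudspeakers encounters changing level and arrival-time relationships, which is the parallax effect referred to above.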

Scene Virtualization

While creative work with abstract musical material might be independent of physical setups, this is no longer the case as soon as sounds are rendered for specific speaker configurations and venues. The reality of not always having access to a physical loudspeaker setup (even more evident during the pandemic) has revealed the need for solutions for virtualizing physical setups. The “enactive listening mode” in particular requires real-time auralization that can simulate listener movement within a given spatial configuration. In order to develop and explore arbitrary (spaced, on-stage, etc.) loudspeaker configurations, Schumacher developed a standalone application titled “Binauralix”(1), allowing for interactive binaural rendering of spaced loudspeaker setups and the acoustics of the venue. Via a SpatDIF-based protocol, loudspeaker configurations with their respective positions, orientations and directivities can be transmitted as OSC messages from OpenMusic to Binauralix, where they are represented as sources with “apertures”. A virtual “listener” avatar represents the listening position/orientation, which can be moved dynamically in the virtual space, simulating the enactive exploration of the venue.

Fig. 3 Granular synthesis with ViMiC for spaced loudspeaker configuration. Left: the OM patch for deferred-time rendering. Right: the Binauralix application for real-time auralization.

Spatial Sound Synthesis Framework

OMPrisma(2) is a library for the OpenMusic environment, providing the user with high-level control interfaces and a rich collection of spatialization “instruments” (i.e. classes with object-oriented features such as polymorphism and multiple inheritance). These classes dynamically generate code for the Csound language, which acts as the DSP rendering kernel. Over the past 12 years it has been employed in a broad variety of projects and contexts. Drawing from this experience, its architecture has been redesigned following two main considerations:

1 - For the user: introduction of “processor” classes and a class-combination system for daisy-chaining classes (e.g. for source preprocessing and spatialization) into more complex “compound instruments” (underlying DSP graphs). At the time of writing, the collection of instruments has been extended with filters (FIR, IIR), resonators, spectral tools and decorrelators.

2 - For the developer: In order to facilitate user-based design/extension, the internal architecture has been refactored with meta-programming tools for dynamically constructing Csound instruments via the use of User-defined Opcodes (UDOs) and Macros. This architecture provides higher-level “modules” and therefore greater structure and flexibility, allowing for rapid prototyping, free from limitations in terms of internal/external audio channels. 

The OMPrisma library, in conjunction with the OM-SoX(3) library (for multichannel audio analysis/manipulation), constitutes a framework for designing spatial sound processors by combining modular components (e.g. rendering of direct sound, early reflections, late reverberation). The possibility of mixing spatialization techniques (perceptual, physical, signal-based) and of constructing synthesizers and processing classes extends this framework to a new kind of virtual “lutherie”, opening up possibilities for “spatial orchestrations” and blurring traditional concepts of timbre (“instruments”) and space (“spatializers”).




Compositional motivations 

The studies composed as part of this project explore the idea of multiple spatial sound morphologies within and around the audience. For this project an extended version of Virtual Microphone Control (ViMiC) was used as the main rendering method. It provides the flexibility of simulating and combining spatial rendering techniques ranging from time-difference-based approaches (acoustic curtain, holophony / WFS) to coincident ones (e.g. ambisonics, soundfield reproduction at a single point). One of the initial steps of the research was the development, with M. Stroppa and M. Schumacher, of a new Chroma-Prisma class implementing a spectral sound model based on sinusoidal content (partials) plus noise (residual). These sound components and their parameters served as material for transformations and for mapping to spatial re-synthesis algorithms.

First steps - first artwork 

For the piece « Back into Nothingness », Giménez-Comas applied some of the ideas of the research residency to the electronics, as part of the poetic and spatial dramaturgy of the piece: resynthesis as sound « metaphor » and first experiments in spatial granular synthesis.

The idea of « sound metaphor » stems from the resynthesis of poetic « images » from the text: data from concrete sounds is transformed and a somewhat « blurred » sound image is extended in space. This employs the idea of spectral spatialization, for example focusing the noisy parts of a sound in one part of space and expanding the sinusoidal parts along different parallel trajectories.

Giménez-Comas worked with a dialog of two spaces: one on stage, with three speakers placed at different angles, orientations and heights, and the other formed by speakers placed around the audience as a surrounding (immersive) circle. These two spaces could enter into dialog or alternate at certain dramaturgical moments, while the speakers on stage could also enter into dialog with the actress and the choir placed on stage.

Fig. 4 Back into Nothingness, scene by Giuseppe Frigeni at TNP of Villeurbanne (Théâtre National Populaire) 

In this dialog, as well as in the resynthesis of concrete sounds, Giménez-Comas also worked with granular synthesis in space, as immersive “block textures” of sound. Further, granulation in space with CataRT(4) was employed for creating big sound masses, with the position of each grain chosen according to its descriptor information. These masses, containing concrete sounds of very different natures, paralleled the « faits divers », which mix many news items of equally different natures into an over-accumulation of information. Giménez-Comas also carried out first tests of synthetic soundscapes and « synthetic cloud textures », taking the metaphor of « sound clouds » from the poetic text of Laure Gauthier(5).


Spectral spatialization and resynthesis, synthesized textures 

In the first steps of the research, various approaches were validated with informal perceptual tests, applying spatial transformations such as going from one source to multiple sources while manipulating different parameters (e.g. time and trajectory) of a simple spectrum. Further, exploring the artistic possibilities of spatial installations in unorthodox loudspeaker setups, and in continuation of the work on the « synthetic soundscape », Giménez-Comas elaborated different installation studies, both during the residency at ZKM and within the studios of Ircam.

« Installation » developing two ideas: 

in promenade format - nearby sources
synthetic soundscape in ambisonics format 

Each miniature focuses on a specific topic/theme which is reflected in a respective music installation study. Some of the ideas were e.g. spectral spatialization, extended sources, density in space, textures in diverse layerings; integrating spatial parameters with sound synthesis processes from the very beginning. 

For the idea of spectral spatialization, Schumacher developed specific algorithms and tools for realizing an « extended source », e.g. breaking up spectral contents in space by filters or partial tracking, and the algorithmic specification of directivity patterns for individual sound components. Further tools allow breaking a single trajectory apart into multiple trajectories with the same morphology, and controlling complex or dense spectra and textures using meta-specifications to distribute them in space in perceptually pertinent ways.
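
As a rough illustration of breaking up spectral content in space, the following Python sketch assigns each partial of an analysed sound its own azimuth, while the noise residual stays at one fixed position. The mapping used here (log-frequency spread linearly over an arc) is an assumption chosen for simplicity and is not taken from the tools developed in the project; the name spread_partials is hypothetical.

```python
import math

def spread_partials(partials_hz, az_min=-90.0, az_max=90.0):
    """Assign each partial an azimuth (degrees) according to its log-frequency.

    The lowest partial lands at az_min, the highest at az_max.
    """
    logs = [math.log2(f) for f in partials_hz]
    lo, hi = min(logs), max(logs)
    span = hi - lo
    placed = []
    for f, lf in zip(partials_hz, logs):
        t = 0.5 if span == 0 else (lf - lo) / span  # normalised position on the arc
        placed.append((f, az_min + t * (az_max - az_min)))
    return placed

partials = [220.0, 440.0, 880.0]
placed = spread_partials(partials)
noise_azimuth = 180.0  # residual (noise) component kept at one fixed position
```

Each (frequency, azimuth) pair could then drive one component of a compound source, so that a single sound is perceived as spatially extended rather than as a point.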

Núria Giménez-Comas developed tools to elaborate the concept of “density” specifically for the installation studies: spectral density mapped to space, as well as granular textures with multiple distributions of diverse densities in space, i.e. the idea of controlling meta-parameters of synthetic textures (granular and additive). For granular textures she worked on temporal and spatial densities, separating space into individual areas as shown in Figure 5.

Fig. 5 Left: Visualization of sound distributions with different densities in specific “block” areas. Right: Trajectories of sound components for a compound source.
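
The idea of “block” areas with individual densities can be sketched in a few lines of Python. The data layout (rectangular areas with a grains-per-second density) and the function name grains_in_blocks are assumptions for illustration; the actual patches were built in OpenMusic.

```python
import random

def grains_in_blocks(blocks, duration, seed=0):
    """Scatter grain events over rectangular areas with individual densities.

    blocks: list of dicts with 'x' and 'y' ranges (metres) and 'density' (grains per second).
    Returns a time-sorted list of (onset_seconds, x, y) tuples.
    """
    rng = random.Random(seed)
    events = []
    for block in blocks:
        count = round(block["density"] * duration)
        for _ in range(count):
            onset = rng.uniform(0.0, duration)
            x = rng.uniform(*block["x"])
            y = rng.uniform(*block["y"])
            events.append((onset, x, y))
    return sorted(events)

blocks = [
    {"x": (-4.0, -1.0), "y": (0.0, 3.0), "density": 40.0},  # dense area
    {"x": (1.0, 4.0), "y": (0.0, 3.0), "density": 5.0},     # sparse area
]
events = grains_in_blocks(blocks, duration=2.0)
```

Only two meta-parameters per area (its extent and its density) are specified by hand; the individual grain onsets and positions follow from them.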

For additive synthesis, Giménez-Comas worked on spatial density and spectral density, using logistic map data (from the Chaos library of Mikhail Malt) to create clusters around harmonics. One particular interest was exploring the threshold between pitched sounds and noise by controlling the spectral density. The combination of spatialization systems and speaker setups, e.g. ambisonics and ViMiC (Virtual Microphone Control), allowed the synthesis of distinct layers of nearby sources and distant resonances or ambiances. This was further made possible through the specific loudspeaker setup, reminiscent of pillars in a room, which allowed the spectators to explore different auditory perspectives on the same virtual sound scene.
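
The clustering of partials around harmonics with logistic map data can be sketched as follows. The sketch does not use the Chaos library mentioned above; it computes the logistic map directly, and the way the map values are turned into frequency deviations (a linear mapping to a relative detuning range) is an assumption, as are the function names and parameter values.

```python
def logistic_series(x0, r, n):
    """First n values of the logistic map x -> r * x * (1 - x)."""
    values, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values

def harmonic_clusters(f0, n_harmonics, per_cluster, spread, r=3.9, x0=0.5):
    """Place per_cluster partials around each harmonic of f0.

    spread is the maximum relative deviation from the harmonic (0.0 = pure
    harmonics); larger spread and per_cluster values yield denser, noisier spectra.
    """
    chaos = logistic_series(x0, r, n_harmonics * per_cluster)
    partials = []
    for h in range(1, n_harmonics + 1):
        centre = h * f0
        for k in range(per_cluster):
            x = chaos[(h - 1) * per_cluster + k]  # value in [0, 1]
            deviation = (2.0 * x - 1.0) * spread  # mapped to [-spread, +spread]
            partials.append(centre * (1.0 + deviation))
    return partials

partials = harmonic_clusters(f0=110.0, n_harmonics=8, per_cluster=5, spread=0.03)
```

With a small spread and few partials per cluster the result remains clearly pitched; increasing both parameters fills the gaps between harmonics and moves the sound towards noise, which is the threshold explored in the studies.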

Fig. 6 Patch for the synthesis of the textures on the ViMiC setup

The studies that Núria Giménez-Comas has realized at ZKM are:

« à l’intérieur des cloches » : combination of ViMiC and Higher-Order Ambisonics

« between the leaves » : combination of ViMiC and Higher-Order Ambisonics

« synthetic soundscape » : Higher-Order Ambisonics 

In « à l’intérieur des cloches », bell sounds are resynthesized in space following a spectral distribution, in the sense of separating noisy content from sinusoidal components. By further manipulating the sinusoidal structures, blurry images of polyphonic bell sounds in space are created. In addition, the audio feeds for eight loudspeakers are placed as virtual sources into a virtual reverberation (room) using Higher-Order Ambisonics to create a dialogue between both spaces. This piece was presented as a multi-media installation with a video displaying images of Dan Browne’s « Nude descending (after Duchamp) »(6).

Fig. 7 Structure of the resynthesis process between OMChroma and OMPrisma, and the distribution of the partials in space.

In the study « between the leaves », presented in a “promenade format”, i.e. with speakers distributed throughout the hall, Giménez-Comas focused on polyphonic spatial textures (granular synthesis). This granulation has been sculpted by different density areas and by filtering of different layers to obtain mixed textures in space (opposite spectral areas and opposite densities in spatial areas). There is a dialog of two simultaneous spatial areas: eight loudspeakers reproducing a space, and simultaneously being sources in a virtual surrounding space. In this installation Núria Giménez-Comas worked with image fragments of Dan Browne’s installation « An Island is land »(7).



Fig. 8 Installation @ Kubus, ZKM. Images by Dan Browne 

« Synthetic soundscape » is a study in which Giménez-Comas worked specifically with densities in space and in frequency (spectrum): grouping components in synthesized extended sources, between almost white noise and pitched sounds (« nuages sonores »), and addressing the relationship between deferred-time rendering and real-time performance (or real-time mixing/diffusion, listening space). Giménez-Comas elaborated the idea of a synthetic soundscape, inspired by natural sounds but realized entirely via synthesis, also developing metaphoric ideas such as the notion of « synthetic clouds ». For this study, and in particular for the work with distinct layers, the exchanges with Jean Bresson and Thibaut Carpentier were very important, notably for support with the new OM-Spat library (for deferred-time rendering), for visualizing the spatial parameter morphologies in the temporal dimension, and for the connection between deferred-time sound « sculpture » and real-time layers.

Fig. 9 Installation setup @ Kubus, ZKM 

Technical considerations

Being able to work in different studios and spaces and to exchange ideas with the local teams, such as the EAC team of Ircam (Espaces Acoustiques et Cognitifs) and the team of engineers and researchers at ZKM, has been of great value for this project. The work and tests in smaller spaces with closer speaker proximity (like ZKM’s mini dome, Ircam’s studio 1 and studio 5) have been complementary to the work in big spaces, such as the Kubus of ZKM. Changing studios required working with flexible tools and formats, e.g. HOA (encoded files), at times requiring local adaptations and fine-tuning, such as order reduction or conversions from 3D to 2D, for which the technical expertise of in-house engineers and researchers has been an invaluable resource.
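
The kind of local adaptation mentioned above can be illustrated with a short Python sketch. It assumes that the HOA files use ACN channel ordering, in which a full 3D stream of order N has (N + 1)² channels and every lower order occupies the first channels of the stream. For the 3D-to-2D case the sketch only selects the sectoral components (those with |m| = n); any gain correction between 3D and 2D normalisation conventions is left out and would have to be handled by the decoder. The function names are hypothetical and do not describe the tools actually used in the studios.

```python
def channel_count(order):
    """Number of channels of a full 3D Ambisonics stream of the given order."""
    return (order + 1) ** 2

def reduce_order(frames, target_order):
    """Keep only the channels up to target_order (assumes ACN channel ordering).

    frames: list of sample frames, each a list with one value per HOA channel.
    """
    keep = channel_count(target_order)
    if frames and len(frames[0]) < keep:
        raise ValueError("input has fewer channels than the target order requires")
    return [frame[:keep] for frame in frames]

def horizontal_channels(order):
    """ACN indices of the sectoral components (|m| == n) used for a 2D subset."""
    indices = []
    for n in range(order + 1):
        for m in sorted({-n, n}):
            indices.append(n * n + n + m)  # ACN index = n^2 + n + m
    return indices

# One frame of a 3rd-order stream (16 channels), reduced to 1st order (4 channels).
frame = [float(i) for i in range(channel_count(3))]
first_order = reduce_order([frame], 1)
```

For a third-order stream, horizontal_channels(3) selects 7 of the 16 channels, which matches the 2N + 1 channels of a horizontal-only representation.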

Fig. 10 Installation setup @ Studio 5, Ircam 


Lectures and presentations: 

ICST (Zurich),

Hochschule für Musik Stuttgart (Hfmdk), 

ZKM (Karlsruhe)

Ircam Forum (studio live presentation),

CICM (presentation of the research and diffusion),

Musidanse (Paris 8 University),

Ircam Live presentation on spatial synthesis 

Conference contributions: 

  • IRCAM Forum Workshops Hors Les Murs 2021. Montreal, Canada / Online
  • Table ronde / Round table 6: Composing timbre and space Modérateur / Chair—Jonathan Goldman (Université de Montréal)
  • Klingt Gut! KLG 2020/21. 5th international Symposium on Sonic Art and Spatial Audio. Hamburg, Germany / Online
  • “Miniature Sound Sculpture Studies. Exploring Novel Approaches to Musical Spatialization”
  • “Spatialization beyond the Point: New Extensions to the Spatial Sound Synthesis Framework for Computer- Aided Composition”.
  • Journées d'Informatique Musicale 2021. Bordeaux, France / Online
  • “Interactive Experience of Spatial Sound Sculptures: ‘Between the Leaves’ and ‘à l’intérieur des clôches”

  • Artworks:
    • « Back into Nothingness », commission of Ircam-Grame with the “Aide à l’écriture d’une oeuvre musicale originale du Ministère de la Culture et de la Communication”, premiered by Spirito Choir and Anna Clementi at TNP in the Biennale « Musique en scène » of Lyon - Archipel Festival (Geneva)
    • Installation studies and ambisonics acousmatic work: “between the leaves”, “à l’intérieur des cloches” and “Synthetic soundscape” (events at ZKM, September 2018, and at Ircam Forum 19)
    • Performances: « Synthetic soundscape » at La Carène, Brest, and spatialisation system presentation

Sound Excerpts:   

• nothingness-excerpts  


This collaborative artistic research between Marlon Schumacher and Núria Giménez-Comas has explored and developed the notion of spatial sound synthesis from artistic and perceptual viewpoints. The research has inspired a number of artworks and, vice versa, compositional motivations have informed some directions of the research.

From a perceptual perspective, the developed tools might help to better learn and understand how these dimensions (musically mostly treated separately) interact to evoke different sound sensations. Studying the physical correlates and signal processing for higher-level auditory attributes (related to shape/volume, texture, “material”, etc.) can be an invaluable resource for artistic approaches and deepen our understanding of sound perception, specifically for music.

Similar to sound synthesis in creative contexts, the development efforts focused on modularity and ease of use, providing flexible frameworks and facilitating the combination of different spatialization systems. We hope that the project encourages an extension of compositional thought to the combination of spatial processing algorithms and instruments as part of the creation process, in the sense of an “orchestration of the space”.

Following this initial/pilot work, there are many ways and approaches to how the new framework and tools can be employed, suggesting several avenues for further exploration. 

From an artistic perspective, several improvements of the installation idea are envisaged for new versions or future installations, notably the connection with visual aspects and hence the possibility of real-time interactions between space, synthesis and image. Adapting the studies to WFS or other systems is also to be investigated.

For her work on a new « Musique-Fiction » for IRCAM, Núria Giménez-Comas is currently using some of these tools to re-synthesize « natural » immersive soundscapes in 3D. She is also taking the opportunity to pursue new research, connecting these ideas with real-time approaches and exchanging with Jose Miguel Fernandez about the possibilities offered by the software AntesCollider.


We would like to thank all of the people involved in this project, most notably: Jean Bresson, Marco Stroppa, Thibaut Carpentier, Markus Noisternig and the whole Acoustic and Cognitive Spaces team at Ircam, Greg Beller, Paola Palumbo and all the team of the Ircam Forum, Götz Dipper, Benjamin Miller, Ludger Brümmer and the team of ZKM. 


  • F. Bayle, “A propos de l’acousmonium,” Recherche Musicale au GRM, vol. 397, pp. 144–146, 1989.
  • J. Borish, “Extension of the image model to arbitrary polyhedra,” Journal of the Acoustical Society of America, vol. 75, no. 6, pp. 1827–1836, 1984.
  • J. Braasch, N. Peters, and D. L. Valente, “A Loudspeaker-Based Projection Technique for Spatial Music Applications Using Virtual Microphone Control,” Computer Music Journal, vol. 32, no. 3, pp. 55–71, 2008.
  • A. S. Bregman, Auditory Scene Analysis - The Perceptual Organization of Sound. MIT Press: Cambridge, MA, 1994, pp. 1–854.
  • J. Bresson, D. Bouche, T. Carpentier, D. Schwarz, and J. Garcia, “Next-generation Computer-aided Composition Environment: A New Implementation of OpenMusic,” presented at the ICMC, 2017.
  • S. Carlile and J. Leung, “The Perception of Auditory Motion,” Trends in Hearing, vol. 20, no. 1, pp. 233121651664425–19, Feb. 2016.
  • T. Carpentier, N. Barret, R. Gottfried, M. Noisternig, “Holophonic Sound in Ircam’s Concert Hall: Technological and Aesthetic Practices”
  • G. Eckel, R. González-Arroyo, M. Rumori, and D. Pirro, “A Framework for the Choreography of Sound,” presented at the International Computer Music Conference, 2012, pp. 1–8.
  • M. Frank, “How to make Ambisonics sound good,” presented at the Forum Acusticum, Krakow, 2014.
  • J. Garcia, J. Bresson, M. Schumacher, T. Carpentier, and X. Favory, “Tools and Applications for Interactive-Algorithmic Control of Sound Spatialization in OpenMusic,” presented at the inSONIC2015, Aesthetics of Spatial Audio in Sound, Music and Sound Art, Karlsruhe, Germany, 2015. 
  • S. Getzmann, “Auditory motion perception: onset position and motion direction are encoded in discrete processing stages,” European Journal of Neuroscience, vol. 33, no. 7, pp. 1339–1350, Mar. 2011.
  • K. L. Hagan, “Textural Composition: Aesthetics, Techniques, and Spatialization for High-Density Loudspeaker Arrays,” pp. 1–12, Apr. 2017.
  • G. S. Kendall, “The Decorrelation of Audio Signals and Its Impact on Spatial Imagery,” Computer Music Journal, vol. 19, no. 4, pp. 71–87, 1995.
  • S. James, “Spectromorphology and Spatiomorphology of Sound Shapes: audio-rate AEP and DBAP panning of spectra,” May 2015.
  • M. N. Montag, “Wave Field Synthesis in Three Dimensions By Multiple Line Arrays,”
  • C. Roads, Microsound. The MIT Press, 2001.
  • M. Schumacher, “Ab-Tasten: Atomic Sound Modeling with a Computer-Controlled Grand Piano,” in The OM Composer's Book: Volume 3, J. Bresson, G. Assayag, and C. Agon, Eds. Éditions Delatour France / IRCAM— Centre Pompidou, 2016, pp. 341–359.
  • M. Schumacher, “A Framework for Computer-Aided Composition of Space, Gesture, and Sound. Conception, Design, and Applications,” PhD thesis, McGill University, 2016.
  • M. Schumacher and J. Bresson, “Spatial Sound Synthesis in Computer-Aided Composition,” Organised Sound, vol. 15, no. 3, pp. 271–289, Dec. 2010.
  • I. Senna, C. V. Parise, and M. O. Ernst, “Modulation frequency as a cue for auditory speed perception,” Proc. R. Soc. B., vol. 284, no. 1858, pp. 20170673–7, Jul. 2017.
  • X. Serra and J. Smith, “Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition,” Computer Music Journal, vol. 14, no. 4, pp. 12–24, Dec. 1990.
  • J. C. Steinberg and W. B. Snow “Auditory Perspective—Physical Factors,” J. AIEE, vol. 53, pp. 12–15 (1934).
  • J. Stuchlik, “Virtuelle Raumakustik als modularer Ansatz, basierend auf physikalischen, perzeptuellen und signalbasierten Verfahren,” Bachelor Thesis, Institute for Music Informatics and Musicology, Hochschule für Musik Karlsruhe, 2017.
  • G. Theile and H. Wittek, “Principles in Surround Recordings with Height,” pp. 1–12, 2011.
  • M. J. Wohlgemuth, N. B. Kothari, and C. F. Moss, “Action Enhances Acoustic Cues for 3-D Target Localization by Echolocating Bats,” PLoS Biol, vol. 14, no. 9, pp. e1002544–21, Sep. 2016.
  • F. Zotter, M. Frank, G. Marentakis, and A. Sontacchi, “Phantom Source Widening with Deterministic Frequency Dependent Time Delays,” 2011.