Monuments at the limit of the fertile trihedron. A note on extratemporal music and volumetric modelling sound synthesis

Article

Published on 19 Apr. 2023 by francesco_vitale

What remains of music once time is removed from it? Paradoxical as such a question may appear, formulating it may allow us to think more deeply about what music could really be – namely, something that was never meant to be simply listened to. If we suppress time, one could easily argue that we are also suppressing the possibility of music itself; on the other hand, the absence of a temporal dimension opens up the chance to discover a network of underlying structures that touches on the ultimate nature of musical composition (whatever this nature may be). There we can recognize a system of relations that can be inspected and manipulated without regard to the ordering that the unidirectionality of time inevitably imposes. The principal concern of this project is thus to investigate the most radical implications of the notion of hors-temps (“outside time”), elaborated by the architect and composer Iannis Xenakis (1922-2001) precisely to express that which, in the writing or the analytical study of a musical piece, can always be thought without considering the distinction between “before” and “after”.

The hors-temps category concerns that which in music is independent of temporal becoming. This is by no means a strange new feature of Western compositional practice: it has been rooted for centuries in its most elementary operations, such as transposition, repetition, retrogradation, inversion and retrograde inversion. A melody is based on a temporal order of notes, its notation on a spatial order. When we play a written melody, its spatial order is converted into the temporal order of the sounds. But if we want to transform the melodic phrase further with these operations, we must take it “outside time” again, treating it like an oriented planar figure in which the set of notes goes forwards and backwards, ascends and descends without reference to time. The aforementioned manipulations therefore rely not on a temporal order but on a spatial one. Transposition withdraws the melody from time, treating it as a geometric profile of pitches to be subjected to a vertical translation, whereas repetition applies a horizontal translation to it; inversion flips it “upside down” by reflecting it across a horizontal axis; retrogradation, which also operates on the rhythmic contour of the onsets, mirrors it by reflecting it across a vertical axis; and retrograde inversion results in a 180° rotation of the starting material.
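The planar reading of these operations can be made concrete with a minimal sketch. It assumes a melody is reduced to a list of (onset, pitch) points, with onsets in beats starting at 0 and pitches as MIDI numbers; durations are ignored, and all function names are illustrative rather than taken from any existing library.

```python
from typing import List, Tuple

# A note is a point in the score plane: (onset in beats, MIDI pitch).
Note = Tuple[float, int]
Melody = List[Note]

def transpose(melody: Melody, interval: int) -> Melody:
    """Vertical translation: shift every pitch by the same interval."""
    return [(t, p + interval) for t, p in melody]

def repeat(melody: Melody, shift: float) -> Melody:
    """Horizontal translation: shift every onset by the same amount."""
    return [(t + shift, p) for t, p in melody]

def invert(melody: Melody, axis: int) -> Melody:
    """Reflection across the horizontal line pitch = axis."""
    return [(t, 2 * axis - p) for t, p in melody]

def retrograde(melody: Melody) -> Melody:
    """Reflection across a vertical line (assumes the first onset is 0)."""
    end = max(t for t, _ in melody)
    return sorted((end - t, p) for t, p in melody)

def retrograde_inversion(melody: Melody, axis: int) -> Melody:
    """Composition of the two reflections."""
    return retrograde(invert(melody, axis))

def rotate_180(melody: Melody, axis: int) -> Melody:
    """Half-turn about the point (half the last onset, axis pitch)."""
    end = max(t for t, _ in melody)
    return sorted((end - t, 2 * axis - p) for t, p in melody)

theme = [(0.0, 60), (1.0, 62), (2.0, 67)]
# The two reflections combined coincide with a half-turn of the figure.
assert retrograde_inversion(theme, 60) == rotate_180(theme, 60)
```

None of the functions needs to know in which order the notes will eventually be heard: each treats the melody as a fixed figure in the plane.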

In 1977 Xenakis and his collaborators brought into being the prototype of an electronic tool for digital sound synthesis by strictly graphic methods, a hardware workstation baptized with the acronym UPIC (Unité Polyagogique Informatique du CEMAMu, that is, of the Centre d’Etudes de Mathématique et Automatique Musicales, founded by Xenakis himself). Its main interface consisted of a digitizing tablet reminiscent of an architect’s desk, upon which the user could draw evolving frequency trajectories (here called “sonic arcs”) with the aid of an electromagnetic pen. On this tablet, every sonic arc was inscribed with its duration on the abscissa and its pitch on the ordinate. To each arc one could assign a waveform period, a pressure envelope and an overall loudness indication (resembling the score dynamics ranging from pianissimo to fortissimo). The tablet was connected to an analogue-to-digital converter, and the actions performed on it could be visualized on a screen or printed on paper with a copying machine. In substance, with the UPIC technology, organized sound was literally extracted out of time and displayed synoptically on the surface of a board as a series of drawings. Furthermore, this equipment allowed Xenakis to generalize the linear transformations in the plane with compressions and expansions of the sonic arcs (the scale of the UPIC’s millimetric tablet could be changed arbitrarily, so that the sounds could be shortened and stretched at will), as well as rotations through any angle.
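As a rough illustration of the principle, and not of the UPIC's actual hardware or software, a drawn arc can be treated as a list of (time, frequency) breakpoints and rendered by accumulating phase sample by sample. The sketch below assumes strictly increasing breakpoint times, a plain sine as the waveform period, and a constant amplitude in place of the pressure envelope; `render_arc` and `stretch` are hypothetical names.

```python
import math

def interpolate(breakpoints, t):
    """Linear interpolation of a (time, value) breakpoint list at time t."""
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]

def render_arc(arc, sample_rate=8000, amplitude=0.5):
    """Turn a drawn arc [(seconds, Hz), ...] into a list of audio samples."""
    start, end = arc[0][0], arc[-1][0]
    n_samples = int(round((end - start) * sample_rate))
    phase = 0.0
    samples = []
    for i in range(n_samples):
        freq = interpolate(arc, start + i / sample_rate)
        samples.append(amplitude * math.sin(phase))
        # Advance the phase by the instantaneous frequency of the drawing.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples

def stretch(arc, factor):
    """Rescale the time axis of the drawing (expansion if factor > 1)."""
    return [(t * factor, f) for t, f in arc]

arc = [(0.0, 220.0), (0.5, 440.0), (1.0, 330.0)]
one_second = render_arc(arc)
two_seconds = render_arc(stretch(arc, 2.0))
```

Stretching the drawing before rendering lengthens the sound without changing its pitch contour, which is the kind of rescaling described above.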

The graphic scores created on the UPIC were remarkably different from those produced from the beginning of the second half of the twentieth century by the likes of E. Brown, M. Feldman, J. Cage, C. Cardew, M. Kagel, A. Logothetis and many others. In the latter cases the visual aspect is prominent, claiming considerable autonomy from the musical context to which it was linked – as if their authors, maybe unbeknownst to themselves, aspired to be refined, sophisticated draughtsmen rather than composers. What is more, these experiments yield rather nebulous notation systems, authorizing a great amount of arbitrariness and interpretative laxity. Since precise instructions on how to decipher the drawings are often purposefully lacking, the notational symbolism is here necessarily vague and suggestive, and the resulting music always bears a remote, metaphorical connection to its score. By contrast, within the UPIC framework the visual elements are directly translated into sound signals, without any interpretation mediated by analogies, synesthetic associations or other idiosyncratic readings.

Moreover, the UPIC seems to establish a hierarchical privilege of sound over image, dismissing the calligraphic flourishes and the unmotivated aesthetic exuberance of the indirect, analogical symbolism: for Xenakis, drawing was only a means of composing music, not an end in itself. Since graphic synthesis was simply a technique and not a goal, the rendition of images into acoustic waves was understood as a unidirectional conversion, in which the visual aspects could be partially lost in the aural domain into which they had been encoded. In fact, the UPIC allowed a mapping from drawings to sounds, but it was difficult, if not impossible, to recover by spectral analysis the initial images from which the sounds were synthesized. In the first composition to be completed on the UPIC, Mycènes Alpha (1978), undeniably an essential case study in graphic sound synthesis, the score shows only the pitch-versus-time inscriptions, and the spectrum of the music reveals only vaguely the drawings behind it. The fuzziness of their original traces (partly erased, partly superimposed) is arguably due to the heavy presence of aliasing, as well as to the use of complex sound pressure envelopes not shown in the score. The same happens in Xenakis’ later UPIC scores, such as Taurhiphanie (1987-88) or Voyage Absolu des Unari vers Andromède (1989). Now, this obfuscation is to the detriment of the perspicuity of the notation and engenders some major discrepancies between the audible result and the graphical procedures that led to it (although the differences are not as radical as in what we have called the “indirect symbolism”). So the UPIC scores again run the risk of being essentially an approximation or an oblique evocation of the music.

In order to get rid of these inaccuracies, the loudness envelopes should be displayed in the score simultaneously with the pitches and the durations, but this requirement forces us to take a totally new direction, embracing a three-dimensional coordinate system equipped with a “trihedron of reference” (as P. Schaeffer would say) – where x is the frequency axis, y is the amplitude axis and z is the time axis. Since the visual data should be reconstructible from a suitable analysis of the sound without loss of information, how can we achieve full reversibility between music and virtual 3D objects? By that same reversibility, we could bring the notation to a previously unattained rigor and give self-sufficiency to the visual aspect, without hiding it in the sounds any longer. This leads us to ask: what if the notation had exactly the same relevance as the musical output, representing an achievement valuable in its own right, being no longer a hypocritically undeclared ambition (as in the mannerisms of the “indirect symbolism”) or a mere tool instrumental to the composition (as in the UPIC)? These are some of the questions we tried to answer with our volumetric modelling graphic sound synthesis.

First of all, we must begin by elucidating the main reasons for, and advantages of, the volumetric solution. Even if we can always conceive of a spectrogram as a representation of three-dimensional information plotted as a heightmap terrain, there are some strong limitations in using rasters of pixels that encode elevation values, as we shall soon see. Usually, in a spectrogram, the amplitude of the spectrum for each time frame is rendered by the brightness in a grayscale plot, where higher values are mapped onto brighter pixels. The 2D spectrographic image can then be thought of as a view from the top of an irregular 3D surface generated by the sampled signal, with black representing minimum height (or distance from the floor of the surface) and white representing maximum height. If we can convert a sound spectrum into pixels, we can also perform the inverse operation. Encoding images in sound is made possible by simply reversing the method of creating a spectrogram, so that the brightness of a pixel is converted into an amplitude value, and its position in the raster into a frequency and a time value. We can then observe that, while the degree of brightness in a grayscale spectrogram can carry some depth information, the result will be very akin to a low relief, which is still a strong compromise between 2D and 3D. Hence, the choice of volumetric modelling is motivated by the many features that cannot be faithfully represented in a heightmap terrain. For example, cavities and meanderings cannot be shown in a heightmap, because only the elevation data are taken into account, leaving everything below unrepresented. The hollows that would otherwise be the inside of holes, or the underside of arches and protrusions, disappear as if a veil were laid over these objects.
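The pixel-to-sound mapping and its inverse can be sketched in a few lines. This is a deliberately simplified stand-in for a phase vocoder, not the procedure used in the project: each image row is assigned one sinusoid, each column one time frame, and the frequencies (`f_min`, `f_step`) and frame duration are assumed values chosen so that every sinusoid completes a whole number of cycles per frame, which makes the analysis exact without windowing.

```python
import math

def image_to_sound(image, f_min=200.0, f_step=100.0,
                   frame_dur=0.05, sample_rate=8000):
    """image[row][col] holds brightness in 0..1; row 0 is the highest frequency."""
    n_rows, n_cols = len(image), len(image[0])
    frame_len = int(round(frame_dur * sample_rate))
    samples = []
    for col in range(n_cols):
        for i in range(frame_len):
            t = (col * frame_len + i) / sample_rate
            value = 0.0
            for row in range(n_rows):
                freq = f_min + (n_rows - 1 - row) * f_step
                # Brightness becomes the amplitude of this row's sinusoid.
                value += image[row][col] * math.sin(2.0 * math.pi * freq * t)
            samples.append(value / n_rows)
    return samples

def sound_to_image(samples, n_rows, n_cols, f_min=200.0, f_step=100.0,
                   frame_dur=0.05, sample_rate=8000):
    """Recover brightness by correlating each frame with each row's frequency."""
    frame_len = int(round(frame_dur * sample_rate))
    image = []
    for row in range(n_rows):
        freq = f_min + (n_rows - 1 - row) * f_step
        line = []
        for col in range(n_cols):
            frame = samples[col * frame_len:(col + 1) * frame_len]
            w = 2.0 * math.pi * freq / sample_rate
            re = sum(s * math.cos(w * i) for i, s in enumerate(frame))
            im = sum(s * math.sin(w * i) for i, s in enumerate(frame))
            line.append(2.0 * n_rows * math.hypot(re, im) / frame_len)
        image.append(line)
    return image

picture = [[0.0, 0.0, 1.0, 1.0],
           [0.0, 1.0, 1.0, 0.0],
           [1.0, 1.0, 0.0, 0.0]]
recovered = sound_to_image(image_to_sound(picture), 3, 4)
```

Each (row, column) cell stores exactly one brightness value, which is why such a raster can only describe a relief: a second surface lying under the first has nowhere to be written.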

Image 1: Paving the way for the transition from 2D to 3D. In OpenMusic we patched a series of algorithms to write a photograph as sound. We began by A) creating the heightmap and B) storing it in a 1GB4 SDIF file; then C) we checked the content of the resulting file in SDIF-Edit; and D) we synthesized it with the SuperVP phase vocoder engine. As we can see in E), the resulting spectrogram correctly displays the photograph encoded in sound.

Now, if we actually step into the realm of 3D space, we must be able to visualize and manipulate entities of high topological complexity (translating them from and into sound). Thus, it becomes necessary to switch from pixels to voxels, and also from FFT spectra to sinusoidal models made for partial analysis and resynthesis. In a volumetric terrain, partials are easily described by successive sample points, where each point is a voxel, written as a triplet of (positive or negative) real numbers. In this 3D environment, partials are connected series of points, represented as breakpoint functions. Analysis and resynthesis of partials then provide trajectories whose instantaneous frequency (in Hz) and amplitude (linear) values change along the temporal sampling frames. The one-to-one relationship between images and sounds that we encountered in the heightmap terrains is confirmed here, because every voxel corresponds to a sample point of a partial and conversely. This means that the partials’ breakpoints, as temporal indices with matching frequency and intensity parameters, can be understood as the breakpoints of 3D curves, and vice versa. To test the efficacy of this correspondence, we decided to work with a genuinely three-dimensional architecture, full of arching shapes, curving meanders and warped surfaces. We chose Xenakis’ Philips Pavilion for the 1958 Brussels World’s Fair as our touchstone. Having found that the Pavilion’s peculiar skeleton, made out of hyperbolic paraboloid and conoid shell portions, could be rebuilt simply with rectilinear segments, we successfully reconstructed it with our volumetric approach in the three dimensions of time, frequency and amplitude. As a final confirmation, the analyzed sinusoidal model revealed the original architectonic structure with its shapes unaltered.
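The idea that a ruled surface can be written directly as a bundle of partials can be sketched as follows. The example assumes voxels ordered as (frequency in Hz, linear amplitude, time in seconds), builds a patch from straight rulings stretched between two opposite edges of a skew quadrilateral (which, for non-coplanar corners, lie on a hyperbolic paraboloid), and converts each ruling to a breakpoint list and back. The corner values and the dictionary layout are invented for illustration and do not reproduce the Pavilion's geometry or the SDIF and SPEAR file formats.

```python
def lerp(p, q, u):
    """Point at fraction u along the straight segment from p to q."""
    return tuple(a + (b - a) * u for a, b in zip(p, q))

def ruled_surface(a, b, c, d, n_rulings=5, n_points=4):
    """Straight rulings joining edge a->b to edge d->c of a skew quadrilateral."""
    rulings = []
    for i in range(n_rulings):
        u = i / (n_rulings - 1)
        start, end = lerp(a, b, u), lerp(d, c, u)
        rulings.append([lerp(start, end, j / (n_points - 1))
                        for j in range(n_points)])
    return rulings

def voxels_to_partial(voxels):
    """Read each (freq, amp, time) voxel as one breakpoint of a partial."""
    return [{"time": z, "freq": x, "amp": y} for x, y, z in voxels]

def partial_to_voxels(partial):
    """Inverse mapping: each breakpoint becomes a voxel again."""
    return [(bp["freq"], bp["amp"], bp["time"]) for bp in partial]

# Hypothetical corners: two at time 0 s and two at time 2 s, so that every
# ruling advances in time and can be played as a single partial.
a, b = (200.0, 0.1, 0.0), (800.0, 0.5, 0.0)
d, c = (900.0, 0.4, 2.0), (300.0, 0.2, 2.0)

surface = ruled_surface(a, b, c, d)
partials = [voxels_to_partial(ruling) for ruling in surface]
rebuilt = [partial_to_voxels(p) for p in partials]
assert rebuilt == surface  # no voxel is lost or altered in the round trip
```

Because the conversion only renames coordinates, the round trip is exact; any loss in practice would come from the synthesis and analysis stages, which this sketch does not model.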

Image 2: Reconstructing the Philips Pavilion in OpenMusic. Having designed the Philips Pavilion’s ruled surfaces as a 3DC-lib, A) we used its x, y and z Cartesian coordinates to write a text file that was read and exported as a 1TRC SDIF by the software SPEAR; then B) we extracted the SDIF content and displayed its data again as a 3DC-lib, recovering the original 3D object we started with. Finally, C) we synthesized the SDIF with the PM2 additive synthesizer. The sonogram of the resulting sound shows the architectural model from our chosen frontal perspective, but since the Pavilion is a three-dimensional entity, it can be virtually encoded in sound from indefinitely many other points of view; hence, D) having applied a 180-degree rotation about the roll axis, we also exported and synthesized it in SPEAR as seen from a rear perspective.

Having tested the intuitive effectiveness and the reliability of our method, we proceeded to implement the elementary compositional principles (the organization of melodic, intervallic and rhythmic patterns, as well as the assignment of durations and dynamics) through 3D geometric transformations.

For instance, translation along the horizontal (time) axis places the objects in 3D space according to their rhythmic onsets, upon which the distribution of synchronic (intervallic) and diachronic (melodic) pitches depends: objects with the same onset value are treated as notes in a chord, while different onsets displace the objects as notes in a melody. Horizontal scaling, which expands or compresses the size of the object by the z factors (the onset and offset times), determines the duration of each object; vertical scaling, by the x factors (the Hz values of the fundamental and of the last harmonic), determines its pitch; depth scaling, by the y factors (the minimum and maximum linear amplitude thresholds), determines its overall loudness. We recall incidentally that the link between depth and dynamics has historical roots going back to G. Gabrieli’s Sonata pian’e forte (1597). Indeed, dynamics hierarchize acoustic elements just as depth arranges visible elements between foreground, middle ground and background, so that louder sounds appear closer to the perceiver, while softer ones appear farther away.
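A minimal sketch of this mapping, under several assumptions of our own, might look as follows: the building block is a point cloud on a unit sphere with coordinates in [-1, 1]; the note's MIDI pitch gives the fundamental, and a hypothetical `n_harmonics` parameter fixes the upper frequency bound; MIDI velocity divided by 127 gives the maximum linear amplitude, with 0 as the minimum. Output points are ordered (frequency, amplitude, time), following the x, y, z convention of the text.

```python
import math

def midi_to_hz(pitch):
    """Equal-tempered conversion with A4 (MIDI 69) at 440 Hz."""
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)

def unit_sphere(n_lat=4, n_lon=8):
    """Points on a unit sphere, every coordinate lying in [-1, 1]."""
    points = []
    for i in range(n_lat + 1):
        theta = math.pi * i / n_lat
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon
            points.append((math.sin(theta) * math.cos(phi),
                           math.cos(theta),
                           math.sin(theta) * math.sin(phi)))
    return points

def rescale(v, lo, hi):
    """Map v from [-1, 1] onto [lo, hi]."""
    return lo + (v + 1.0) / 2.0 * (hi - lo)

def place_note(points, onset, duration, pitch, velocity, n_harmonics=8):
    """Scale and translate the object so that it occupies one note's extent."""
    f_low = midi_to_hz(pitch)            # x: fundamental ...
    f_high = f_low * n_harmonics         # ... up to the last harmonic (assumed)
    a_max = velocity / 127.0             # y: loudness ceiling (assumed mapping)
    return [(rescale(x, f_low, f_high),
             rescale(y, 0.0, a_max),
             rescale(z, onset, onset + duration)) for x, y, z in points]

sphere = unit_sphere()
# Same onset: the two objects form a chord. Later onset: a melodic step.
c4 = place_note(sphere, onset=0.0, duration=1.0, pitch=60, velocity=64)
e4 = place_note(sphere, onset=0.0, duration=1.0, pitch=64, velocity=64)
g4 = place_note(sphere, onset=1.0, duration=2.0, pitch=67, velocity=100)
```

The sphere is never redrawn: each note is the same object under a different scaling and translation, so the chord and the melodic step differ only in where the time translation puts them.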

Image 3: Basic musical operations as 3D transformations. Using a single sound object (a sphere) as a building block, we can see how the several compressions and expansions of its original size visualize the duration scaling (as horizontal deformations on the time plane) and the pitch scaling (as vertical deformations on the frequency plane), whereas the velocity scaling is expressed by the displacements of the shapes in the amplitude plane. All the scaling and translation information is here derived from MIDI values stored in a chord-seq object.

More importantly, music generated and varied in a 3D space yields results that are simply irreducible to the elementary transformation techniques that have governed compositional strategies until now. We have seen that the variety of techniques for transforming musical material depends entirely on the writing medium within which these transformations are performed. If the manipulations are applied to the space of the staff, the performable “outside time” actions are restricted to vertical and horizontal translation, vertical and horizontal reflection, 180° rotation and their combinations. Unfortunately, Xenakis never fully investigated the real potential of the hors-temps manipulations offered by his invention, since even the UPIC can be seen as a mere revisiting of common notation techniques, relying exclusively on the pitch-versus-time plane. According to this paradigm, imprisoned in the traditional flatness of the score, the idea of a sound studied hors-temps remains confined to two-dimensional space.

Image 4: Views from the Hors-Temps étude n.1 score. This is an overview of the score's 3D landscapes with the corresponding sonogram, which shows them as 2D frontal perspectives.

Therefore, our aim was to produce a concrete example of a new extratemporal music, in which every aspect of the composition, down to the tiniest detail, is observed and shaped by turning 3D sonic objects around in all directions. In these explorations, we discovered a wide array of plastic operations (such as the folding, twisting, bending, smoothing, morphing, carving and rippling of partials), all documented in the electronic piece Hors-Temps étude n.1 and in its accompanying score, written in freely navigable 3D PDF metadata. Here, the visual morphologies influence and are influenced by the acoustic ones, because they presuppose each other in an inextricable interdependence. Accordingly, the piece is based on a process (the volumetric modelling sound synthesis) which at once generates the music and its representation, constituting both the aural content of the composition and the evidence that such composition has taken place, i.e., its notation. Moreover, instead of limiting ourselves to administering the partials and modelling their structures visually, we have constructed entire visual landscapes made of partials; instead of controlling sound graphically, plastically or architectonically, we are drawing, sculpting and even building architectures with sound.
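The text does not specify how these plastic operations are computed, so the following is only one possible reading of two of them, offered as a sketch. It assumes a partial is a list of (frequency in Hz, linear amplitude, time in seconds) voxels, interprets "rippling" as a sinusoidal undulation of frequency along time, and "smoothing" as a three-point moving average of frequency; both function names and both interpretations are assumptions.

```python
import math

def ripple(partial, depth_hz, rate_hz):
    """Undulate the frequency of a partial along time (assumed reading)."""
    return [(f + depth_hz * math.sin(2.0 * math.pi * rate_hz * t), a, t)
            for f, a, t in partial]

def smooth(partial):
    """Three-point moving average of frequency; end points are kept fixed."""
    if len(partial) < 3:
        return list(partial)
    out = [partial[0]]
    for prev, cur, nxt in zip(partial, partial[1:], partial[2:]):
        out.append(((prev[0] + cur[0] + nxt[0]) / 3.0, cur[1], cur[2]))
    out.append(partial[-1])
    return out

# A flat partial at 440 Hz, sampled every quarter of a second.
flat = [(440.0, 0.5, 0.25 * i) for i in range(5)]
rippled = ripple(flat, depth_hz=10.0, rate_hz=1.0)

# A zigzag partial whose corners are rounded off by smoothing.
zigzag = [(100.0 if i % 2 == 0 else 200.0, 0.5, 0.25 * i) for i in range(5)]
rounded = smooth(zigzag)
```

Both operations change the shape of the curve and the sound of the partial in the same gesture, since in this representation the two are the same data.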