Latent Terrain: Dissecting the Latent Space of Neural Audio Autoencoders by Shuoyang Jasper Zheng

Exploring musical affordances of neural networks beyond their task-oriented capabilities and deriving sonic materials for musical expressions.

[Photo: the latent terrain synth interface in use]

Presented by: Shuoyang Jasper Zheng

We present Latent Terrain, an algorithmic approach to dissecting the latent space of a neural audio autoencoder into a two-dimensional plane. Latent Terrain questions the conventional paradigm of dimensionality reduction in creative interactive systems, in which high-dimensional points are projected to low-dimensional ones so that similar objects land near each other. Instead, a terrain generated by our approach presents a mountainous, steep surface that affords greater spectral complexity when navigating the autoencoder's latent space.
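To make the idea concrete, a latent terrain can be thought of as a mapping from two-dimensional surface coordinates to points in the autoencoder's latent space. The sketch below is a minimal, hypothetical illustration of such a coordinate-to-latent mapping in PyTorch; the names, network shape, and latent dimensionality are assumptions for illustration, not the exact algorithm behind Latent Terrain.

```python
# Hypothetical sketch: a small network mapping a 2-D surface coordinate (x, y)
# to a latent vector of a pre-trained audio autoencoder such as RAVE.
import torch
import torch.nn as nn

LATENT_DIM = 8  # assumed latent dimensionality of the autoencoder

class TerrainMap(nn.Module):
    """Maps 2-D terrain coordinates to latent vectors."""
    def __init__(self, latent_dim: int = LATENT_DIM, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        # xy: (N, 2) surface coordinates -> (N, latent_dim) latent vectors
        return self.net(xy)

terrain = TerrainMap()
z = terrain(torch.tensor([[0.25, -0.4]]))  # one surface point -> one latent vector
```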

Building on this, we present Latent Terrain Synthesis, a sound synthesis method in which a waveform is generated by tracing a path across the terrain surface. Latent terrain synthesis aims to help musicians create tailorable, flexible materials for musical expression that leverage the sonic capabilities of neural audio autoencoders such as RAVE.
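The sketch below illustrates this decoding loop under the same assumptions as above: a path of (x, y) points is sampled across the surface, mapped to latent vectors with the hypothetical TerrainMap, and decoded into audio by a TorchScript RAVE export, here assumed to expose a decode() method taking latents shaped (1, latent_dim, n_frames).

```python
# Hypothetical sketch of latent terrain synthesis: decode a path over the terrain.
# Assumptions: "rave.ts" is a placeholder path to a scripted RAVE export, and
# TerrainMap is the hypothetical coordinate-to-latent mapping sketched earlier.
import torch

rave = torch.jit.load("rave.ts")   # pre-trained RAVE model (placeholder path)
terrain = TerrainMap()             # hypothetical terrain mapping

# A simple curved path across the surface, one (x, y) point per latent frame.
n_frames = 512
t = torch.linspace(0.0, 1.0, n_frames).unsqueeze(1)
path = torch.cat([t * 2 - 1, torch.sin(6.28 * t)], dim=1)  # (n_frames, 2)

with torch.no_grad():
    z = terrain(path)              # (n_frames, latent_dim)
    z = z.T.unsqueeze(0)           # reshape to (1, latent_dim, n_frames)
    audio = rave.decode(z)         # decode the latent path into a waveform
```

Since each latent frame decodes to a block of audio samples, the shape of the path across the terrain directly shapes the resulting waveform.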

We provide nn_terrain, a set of Max/MSP externals that work together with nn~ to generate latent terrains for pre-trained RAVE models and let users navigate a terrain in real time.

In this talk, I will first present the technical details behind latent terrain, its workflow, how it integrates with RAVE, and a demo interface using a tablet and stylus. I will also present a recent user study workshop, conducted with co-authors Anna Xambó Sedó and Nick Bryan-Kinns at the Centre for Digital Music, Queen Mary University of London, in which 18 musicians from various backgrounds explored musical affordances and derived sonic materials for musical expression.

Acknowledgment: This work is supported by the UKRI Centre for Doctoral Training in Artificial Intelligence and Music, funded by UK Research and Innovation [grant number EP/S022694/1].