tabula rasa Exploring sound/gesture typo-morphology for enactive computer music performance IRCAM Musical Research Residency 2011 Thomas Grill http://grrrr.org
Contents • Introduction – concepts • Implementation – current status • Implications – outlook
Enactive musical performance with recorded sounds • Make sampled sounds within a corpus accessible in an intuitive manner, e.g. for live performance • The resulting sound should preserve its bond to the source material
Enactive musical performance with recorded sounds • Organization – by which criteria? • Interfacing – visual, haptic, auditory? • Controlling and synthesizing – how to preserve qualities of corpus elements / achieve a predictable stream of sound?
Enactive musical performance with recorded sounds • Design the interface so that it is immediately self-explanatory ➔ enactivity • Use principles of cross-modality ➔ embodiment ‣ Associate perceptual characteristics of corpus sounds with characteristics of interaction
In the mines of matter / 2ex (2010)
Enactive musical performance with recorded sounds • Analyze hand gestures and surface interaction sounds • Associate interaction traces with elements in sound corpus ( spectral and temporal ) • Synthesize audio stream, corresponding to both the corpus and the real-time gestures
Musical interface • Expressive (sonic complexity) • Withstand and process large dynamic range (very soft to brute force) • High resolution ("analog" feel) • No obstruction between performer & audience • Portable & stage compatible (lighting, beer)
Musical interface – bonus • Use different (exchangeable / modular) surfaces • Tile surfaces for larger area • Augment interaction using preparation – objects to interact with • Use interaction position for selection of sound material
Musical interface – construction (figure): interaction surface ~40–50 cm × 30 cm • layers, top to bottom: flexible, microstructured polyacrylic interaction surface (<1 mm) – foam spacing (~10 mm) – MDF base (3 mm) – foam rubber antislip (2 mm) • sensing: contact mics feeding mic pre-amps and ADC; force-sensitive sensors read by an Arduino board (ADC)
Musical interface – Sensing – • 4x pickup microphones provide a rich sonic trace of surface interaction • A matrix of force sensors provides additional data on interaction strength • Combined, they provide data on the interaction position
Musical interface
Musical interface – Force sensors – • 4x4 sensors – the Arduino Mega has 16 analog inputs (10 bit, sampled at up to ~10 kHz) • FSR sensor range ~100 g – 10 kg (resolution 10 g) • 2D interpolation (e.g. with peak picking) should allow position detection
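The peak-picking / 2D-interpolation idea for the 4x4 FSR matrix can be sketched as follows: pick the strongest sensor, then refine with a force-weighted centroid over its neighborhood. This is a minimal numpy sketch; the function name and the sensor pitch `pitch_cm` are illustrative assumptions, not part of the actual system.

```python
import numpy as np

def fsr_position(grid, pitch_cm=2.5):
    """Estimate the interaction position on a 4x4 FSR grid.

    Peak picking finds the strongest sensor; a force-weighted centroid
    over its 3x3 neighborhood interpolates between sensor positions.
    `pitch_cm` (sensor spacing) is an assumed value for illustration.
    """
    grid = np.asarray(grid, dtype=float)
    iy, ix = np.unravel_index(np.argmax(grid), grid.shape)
    y0, y1 = max(iy - 1, 0), min(iy + 2, grid.shape[0])
    x0, x1 = max(ix - 1, 0), min(ix + 2, grid.shape[1])
    patch = grid[y0:y1, x0:x1]
    total = patch.sum()
    if total <= 0:
        return None  # no touch detected
    ys, xs = np.mgrid[y0:y1, x0:x1]
    cy = (ys * patch).sum() / total
    cx = (xs * patch).sum() / total
    return cx * pitch_cm, cy * pitch_cm
```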
Musical interface – Contact microphones – • Piezo discs exhibit large tolerances in their resonance modes • Larger piezo discs have lower fundamental resonances and are more sensitive • Background noise / hum • 2D position detection through amplitudes • Alternative: more expensive contact mics
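The amplitude-based 2D position detection from the four contact mics could, at its simplest, be an amplitude-weighted centroid of the mic positions. A crude sketch under strong assumptions (attenuation in the plate is certainly not linear; function name and coordinates are illustrative):

```python
import numpy as np

def mic_position(amps, mic_xy):
    """Rough 2D position from the RMS amplitudes of the contact mics,
    computed as an amplitude-weighted centroid of the mic positions.
    This ignores the real, non-linear attenuation behavior of the plate."""
    amps = np.asarray(amps, dtype=float)
    w = amps / amps.sum()                   # normalize to weights
    return tuple(w @ np.asarray(mic_xy, dtype=float))
```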
Gesture segmentation • GestureFollower always follows corpus phrases from the start ‣ Necessity for meaningful segmentation • The interaction sound is very noisy material, plus background noise / hum ‣ Use a combination of onset detection and statistical segmentation
Gesture segmentation • Onset detection (Jean-Philippe Lambert): Σᵢ [ log(melbandᵢ(t)) − log(melbandᵢ(t−1)) ] ≥ threshold • Statistical segmentation (after Arnaud Dessein): Kullback–Leibler divergence D_KL between Gaussians (diagonal covariances) fitted to a past and a lookahead window
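The two segmentation ingredients above can be sketched in a few lines of numpy. The threshold value and function names are illustrative assumptions; the actual system implements these in FTM.

```python
import numpy as np

def onset_frames(melbands, threshold=1.0, eps=1e-10):
    """Onset detection on a (frames x bands) mel-band energy matrix:
    a frame is an onset when the summed positive log-energy rise
    across bands reaches `threshold` (value assumed for illustration)."""
    logm = np.log(np.maximum(melbands, eps))
    flux = np.maximum(np.diff(logm, axis=0), 0.0).sum(axis=1)
    return np.where(flux >= threshold)[0] + 1

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL divergence D_KL(p || q) between two Gaussians with diagonal
    covariances, e.g. fitted to 'past' and 'lookahead' feature windows;
    a peak in this divergence suggests a segment boundary."""
    return 0.5 * np.sum(np.log(var_q / var_p)
                        + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
```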
Gesture segmentation
Gesture tracing • GestureFollower can process ~100 parallel phrases at a time • A typical sound corpus consists of thousands of sample segments • Phrase lengths of interaction data and corpus don't match – what to do when a corpus phrase ends before the interaction gesture is finished?
Anticipation • We need heuristics to reduce the number of parallel phrases taken into account by GF • We need to be able to (reasonably) continue a prematurely ended corpus phrase ‣ Employ classification of whole segments to predict possibly following segments (Markov Model)
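The Markov-model anticipation amounts to looking up the most probable follow-up segment classes in a transition matrix. A minimal sketch, assuming a row-stochastic matrix over segment classes (names and shapes are illustrative):

```python
import numpy as np

def continuation_candidates(trans, current, top_k=3):
    """Given a segment-class transition matrix `trans` (rows: current
    class, columns: next class, each row summing to 1), return the most
    probable follow-up classes with their probabilities -- the basis for
    continuing a corpus phrase that ends before the gesture does."""
    probs = np.asarray(trans, dtype=float)[current]
    order = np.argsort(probs)[::-1][:top_k]      # highest probability first
    return [(int(i), float(probs[i])) for i in order if probs[i] > 0]
```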
Analysis – interaction sound (diagram): Interaction sound → Segmentation → Feature analysis → Whitening → Front GMM / Back GMM → front and back posteriors → Transition probabilities
Gaussian mixture model (diagram): segmented audio → features → GMM → posteriors
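The posterior step of the diagram above can be sketched in plain numpy for diagonal-covariance components; in the actual system the models are trained with scikits.learn, and all names here are illustrative.

```python
import numpy as np

def gmm_posteriors(X, means, variances, weights):
    """Posterior probability of each diagonal-covariance Gaussian
    component for each feature frame in X (frames x dims).
    `means`/`variances` are (components x dims), `weights` (components,)."""
    X = np.asarray(X, dtype=float)[:, None, :]        # frames x 1 x dims
    mu = np.asarray(means, dtype=float)[None, :, :]   # 1 x comps x dims
    var = np.asarray(variances, dtype=float)[None, :, :]
    # per-frame, per-component log likelihood of the diagonal Gaussian
    log_like = -0.5 * np.sum((X - mu) ** 2 / var + np.log(2 * np.pi * var), axis=2)
    log_post = log_like + np.log(np.asarray(weights, dtype=float))
    log_post -= log_post.max(axis=1, keepdims=True)   # stabilize before exp
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)
```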
Transition probabilities (diagram): probabilities between the backs of preceding segments and the fronts of following segments
Analysis – sound corpus (diagram): Corpus sound → Segmentation → Feature analysis → Whitening → Front GMM / Back GMM → front and back posteriors → Transition probabilities
Analysis – sound corpus (diagram, cont.): the corpus posteriors are downsampled and fed, together with the transition probabilities, into gesture learning
Live – gesture following (diagram): Interaction sound → Segmentation → Feature analysis → Whitening → GMM → posteriors, multiplied with the corpus posteriors and transition probabilities → downsampling → candidate segments → gesture following
Live – sound synthesis • Candidate segments come with individual probabilities ‣ The list can be ordered by probability, or the probabilities used to seed GestureFollower ‣ Thresholds can be applied to the covered total probability or to the maximum number of candidates
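The thresholding scheme above can be sketched as follows: order candidates by probability and cut the list at a maximum count or once enough total probability is covered. Function name and threshold values are illustrative assumptions.

```python
def select_candidates(segments, probs, max_n=10, min_total=0.9):
    """Order candidate segments by probability; stop once `max_n`
    candidates are chosen or the covered total probability reaches
    `min_total` (both threshold values assumed for illustration)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append((segments[i], probs[i]))
        total += probs[i]
        if len(chosen) >= max_n or total >= min_total:
            break
    return chosen
```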
Live – gesture following (diagram, cont.): the candidate segments feed gesture following against the corpus sound, which in turn drives gesture synthesis
Live – sound synthesis • GestureFollower delivers (for each input frame) individual phrase likelihoods, time positions and speed estimates ‣ We need to take varying tracing speeds into account for synthesis ‣ Use a speed-variable (granular) synthesis technique
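The core of speed-variable playback can be sketched as reading a sample buffer at a time-varying rate with linear interpolation; a real granular implementation would overlap windowed grains, so this is only a minimal stand-in with illustrative names.

```python
import numpy as np

def play_varispeed(buf, speeds, hop=1.0):
    """Resample a mono buffer with a time-varying playback speed via
    linear interpolation -- a minimal stand-in for the speed-variable
    (granular) synthesis driven by the tracer's speed estimates."""
    pos, out = 0.0, []
    for s in speeds:
        i = int(pos)
        if i + 1 >= len(buf):
            break                      # reached the end of the buffer
        frac = pos - i
        out.append((1 - frac) * buf[i] + frac * buf[i + 1])
        pos += s * hop                 # advance by the current speed
    return np.array(out)
```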
Implementation • The whole system is based on Max • Feature analysis and segmentation are done using FTM • Model data (GMMs, transition matrix), corpus data (segmentation and posteriors) and GestureFollower data are stored using MuBu • Clustering and anticipation are done in Python / numpy / scikits.learn by means of py/pyext
What is missing? • Speed-variable sample playing • Integrate force sensor data e.g. for material selection (position) and dynamics processing (strength) • Finish interface hardware
Future improvements – software – • Use higher-level (perceptual) features to describe sound characteristics • Tune segmentation algorithm • Evaluate anticipation strategy (clustering method, front/back model, probability thresholds) • Find out feasible feature rate for GestureFollower • Seed GestureFollower with candidate probabilities
Future improvements – hardware – • Use clamp-frame mechanism to allow exchangeable surfaces (if possible) • Explore drum-skin-like surfaces (not damped) • Tactile feedback (potential cooperation with ISIR haptics lab at Paris 6) • Visual feedback (projection on surface)
Artistic implications • Staging / ways to interact (novel instrument) • How to radiate sound? • Linking performance concepts to sound, using additional objects and respective sound corpus
Thanks! See you!