  1. tabula rasa Exploring sound/gesture typo-morphology for enactive computer music performance IRCAM Musical Research Residency 2011 Thomas Grill http://grrrr.org

  2. Contents • Introduction – concepts • Implementation – current status • Implications – outlook

  3. Enactive musical performance with recorded sounds • Make sampled sounds within a corpus accessible in an intuitive manner, e.g. for live performance • The resulting sound should preserve its bond to the source material

  4. Enactive musical performance with recorded sounds • Organization – by which criteria? • Interfacing – visual, haptic, auditory? • Controlling and synthesizing – how to preserve qualities of corpus elements / achieve a predictable stream of sound?

  5. Enactive musical performance with recorded sounds • Design the interface so that it is immediately self-explanatory ➔ enactivity • Use principles of cross-modality ➔ embodiment ‣ Associate perceptual characteristics of corpus sounds with characteristics of the interaction

  6. In the mines of matter / 2ex (2010)

  7. Enactive musical performance with recorded sounds • Analyze hand gestures and surface interaction sounds • Associate interaction traces with elements in the sound corpus (spectral and temporal) • Synthesize an audio stream corresponding to both the corpus and the real-time gestures

  8. Musical interface • Expressive (sonic complexity) • Withstand and process large dynamic range (very soft to brute force) • High resolution ("analog" feel) • No obstruction between performer & audience • Portable & stage compatible (lighting, beer)

  9. Musical interface – bonus • Use different (exchangeable / modular) surfaces • Tile surfaces for larger area • Augment interaction using preparation – objects to interact with • Use interaction position for selection of sound material

  10. Musical interface – construction (interaction surface ca. 40–50 cm × 30 cm) • Polyacrylic interaction surface, flexible, microstructured (<1 mm) • Contact mics feeding mic pre-amps and ADC • Foam spacing (~10 mm) • MDF base (3 mm) • Force-sensitive sensors read by an Arduino board (ADC) • Foam rubber anti-slip layer (2 mm)

  11. Musical interface – Sensors – • 4x pickup microphones provide a rich sonic trace of the surface interaction • A matrix of force sensors provides additional data on interaction strength • Combined: data on the interaction position

  12. Musical interface

  13. Musical interface – Force sensors – • 4x4 sensors – the Arduino Mega has 16 analog inputs (10 bits @ max. 10 kHz) • FSR sensor range ~100 g – 10 kg (resolution 10 g) • 2D interpolation (e.g. with peak picking) should allow position detection, as sketched below
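A minimal numpy sketch of how peak picking plus 2D interpolation could work on a 4x4 FSR frame; the function name, the neighbourhood radius and the centroid refinement are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np

def fsr_position(frame, radius=1):
    """Estimate the interaction position on a 4x4 FSR grid.

    frame: (4, 4) array of calibrated force readings (e.g. grams).
    Picks the strongest cell, then refines it with a force-weighted
    centroid over the surrounding neighbourhood -- one plausible
    reading of the "2D interpolation with peak picking" on the slide.
    """
    iy, ix = np.unravel_index(np.argmax(frame), frame.shape)
    y0, y1 = max(iy - radius, 0), min(iy + radius + 1, frame.shape[0])
    x0, x1 = max(ix - radius, 0), min(ix + radius + 1, frame.shape[1])
    patch = frame[y0:y1, x0:x1]
    ys, xs = np.mgrid[y0:y1, x0:x1]
    total = patch.sum()
    if total <= 0:
        return None  # no touch detected
    return (float((xs * patch).sum() / total),
            float((ys * patch).sum() / total))
```

The force-weighted centroid around the peak cell is what makes sub-cell resolution possible from only 16 sensors.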

  14. Musical interface – Contact microphones – • Piezo discs exhibit large tolerances in their resonance modes • Larger piezo discs have lower fundamental resonances and are more sensitive • Background noise / hum • 2D position detection through amplitudes • Alternative: more expensive contact mics
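For the amplitude-based 2D position detection mentioned in this slide, one conceivable approach is an amplitude-weighted centroid over the mic positions; the corner placement and the RMS weighting below are illustrative assumptions:

```python
import numpy as np

# Hypothetical corner coordinates of the four pickup mics on a
# unit-square surface; actual placement depends on the hardware.
MIC_POS = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])

def mic_position(block):
    """Rough 2D position estimate from relative mic amplitudes.

    block: (4, n) array, one short audio block per contact mic.
    Per-mic RMS serves as a crude proximity weight.
    """
    rms = np.sqrt((block ** 2).mean(axis=1))
    w = rms / rms.sum() if rms.sum() > 0 else np.full(4, 0.25)
    return MIC_POS.T @ w  # amplitude-weighted centroid (x, y)
```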

  15. Gesture segmentation • GestureFollower always follows corpus phrases from the start ‣ meaningful segmentation is a necessity • The interaction sound is very noisy material, plus background noise / hum ‣ Use a combination of onset detection and statistical segmentation

  16. Gesture segmentation • Onset detection (Jean-Philippe Lambert): Σ_i [ log(melband_i(t)) − log(melband_i(t−1)) ] ≥ threshold • Statistical segmentation (after Arnaud Dessein): Kullback-Leibler divergence D_KL between Gaussians (diagonal covariances) fitted to past and lookahead windows
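Both criteria are simple enough to sketch in Python; the epsilon guard, the frame-to-frame flux formulation and the closed-form diagonal-Gaussian KL are assumptions consistent with the slide, not code from the residency:

```python
import numpy as np

def onset(melbands_now, melbands_prev, threshold):
    """Log mel-band flux onset test, as in the formula above."""
    flux = np.sum(np.log(melbands_now + 1e-12)
                  - np.log(melbands_prev + 1e-12))
    return flux >= threshold

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form D_KL(p || q) between diagonal-covariance Gaussians,
    the distance used to compare the 'past' and 'lookahead' windows
    in the statistical segmentation."""
    return 0.5 * np.sum(np.log(var_q / var_p)
                        + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)
```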

  17. Gesture segmentation

  18. Gesture tracing • GestureFollower can process ~100 parallel phrases at a time • A typical sound corpus consists of thousands of sample segments • The phrase lengths of interaction data and corpus don't match – what to do when a corpus phrase ends before the interaction gesture is finished?

  19. Anticipation • We need heuristics to reduce the number of parallel phrases taken into account by GestureFollower • We need to be able to (reasonably) continue a prematurely ended corpus phrase ‣ Employ classification of whole segments to predict possibly following segments (Markov model), as sketched below
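A minimal sketch of that Markov-model anticipation: given a transition matrix learned over segment classes, return the most probable continuations. Names and shapes are illustrative:

```python
import numpy as np

def predict_next(label, trans, top_k=5):
    """Anticipate likely following segment classes.

    trans: (K, K) row-stochastic matrix, trans[i, j] = P(next class j
    | current class i), learned from the corpus segment sequence.
    Returns the top_k candidate classes with their probabilities.
    """
    p = trans[label]
    order = np.argsort(p)[::-1][:top_k]
    return list(zip(order.tolist(), p[order].tolist()))
```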

  20. Analysis – interaction sound: Interaction sound → Segmentation → Feature analysis → Whitening → Front GMM / Back GMM → Posteriors → Transition probabilities

  21. Gaussian mixture model: Segmented audio → Features → GMM → Posteriors
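A sketch of that step using the modern scikit-learn API (the slides mention the older scikits.learn); the feature dimensionality, the number of components and the diagonal covariance type are placeholder guesses:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative only: feats stands in for an (n_segments, n_features)
# matrix of whitened segment features.
feats = np.random.randn(200, 12)

gmm = GaussianMixture(n_components=8, covariance_type="diag",
                      random_state=0).fit(feats)
posteriors = gmm.predict_proba(feats)  # (n_segments, 8) responsibilities
```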

  22. Transition probabilities between the back of the preceding segment and the front of the following segment
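Assuming each segment is labeled by the dominant class of its front and back GMM posteriors, the transition table can be estimated by counting back-to-front successions; a hypothetical sketch:

```python
import numpy as np

def transition_matrix(back_labels, front_labels, n_classes):
    """Count transitions from the 'back' class of each segment to the
    'front' class of the segment that follows it, then normalize rows
    to probabilities."""
    counts = np.zeros((n_classes, n_classes))
    for b, f in zip(back_labels[:-1], front_labels[1:]):
        counts[b, f] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts),
                     where=rows > 0)
```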

  23. Analysis – sound corpus: Corpus sound → Segmentation → Feature analysis → Whitening → Front GMM / Back GMM → Posteriors → Transition probabilities

  24. Analysis – sound corpus: Corpus sound → Segmentation → Feature analysis → Whitening → Front/Back GMMs → Posteriors → Downsampling → Gesture learning, plus Transition probabilities

  25. Live – gesture following: Interaction sound → Segmentation → Feature analysis → Whitening → Back GMM → Posteriors × Transition probabilities → Downsampling; matched against corpus posteriors and transition probabilities → Candidate segments → Gesture following

  26. Live – sound synthesis • Candidate segments come along with individual probabilities ‣ The list can be ordered by probability, or the probabilities can be used to seed GestureFollower ‣ Thresholds can be applied to the covered total probability or to the maximum number of candidates, as sketched below
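One way such candidate selection could look, combining the live posteriors with the corpus data from the previous slides and cutting the ranked list by probability mass or count; all shapes and names are illustrative assumptions:

```python
import numpy as np

def select_candidates(live_post, corpus_post, trans_prior,
                      total_prob=0.9, max_n=20):
    """Rank corpus segments against the live interaction.

    live_post:   (K,) live 'back' GMM posterior
    corpus_post: (n_segments, K) per-segment corpus posteriors
    trans_prior: (n_segments,) prior from the transition model
    Cuts the ranked list by covered probability mass (total_prob)
    or by a maximum number of candidates (max_n).
    """
    score = (corpus_post @ live_post) * trans_prior
    score = score / score.sum()
    order = np.argsort(score)[::-1][:max_n]
    csum = np.cumsum(score[order])
    n = int(np.searchsorted(csum, total_prob)) + 1  # keep at least one
    return order[:n], score[order[:n]]
```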

  27. Live – gesture following: Interaction sound → Segmentation → Feature analysis → Whitening → Back GMM → Posteriors × Transition probabilities → Downsampling; matched against the corpus sound's posteriors → Candidate segments → Gesture following → Gesture synthesis

  28. Live – sound synthesis • GestureFollower delivers (for each input frame) individual phrase likelihoods, time positions and speed estimates ‣ Synthesis needs to take the varying tracing speeds into account ‣ Use a speed-variable (granular) synthesis technique, as sketched below
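A bare-bones numpy illustration of speed-variable playback by overlap-add granulation; grain size, hop and windowing are arbitrary choices, and the project's actual synthesis engine is not specified beyond "granular":

```python
import numpy as np

def granular_play(samples, speed, grain=1024, hop=256):
    """Speed-variable playback by overlap-add granulation.

    Advances the read position by speed * hop per output hop, so a
    speed estimate can stretch or compress a corpus segment while the
    grains themselves keep their original pitch.
    """
    win = np.hanning(grain)
    n_out = int(len(samples) / max(speed, 1e-6))
    out = np.zeros(n_out + grain)
    pos = 0.0
    for start in range(0, n_out, hop):
        i = int(pos)
        g = samples[i:i + grain]
        out[start:start + len(g)] += g * win[:len(g)]
        pos = min(pos + speed * hop, len(samples) - 1)
    return out[:n_out]
```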

  29. Implementation • The whole system is based on Max • Feature analysis and segmentation are done using FTM • Model data (GMMs, transition matrix), corpus data (segmentation and posteriors) and GestureFollower data are stored using MuBu • Clustering and anticipation are done in Python / numpy / scikits.learn by means of py/pyext

  30. What is missing? • Speed-variable sample playback • Integrate force-sensor data, e.g. for material selection (position) and dynamics processing (strength) • Finish the interface hardware

  31. Future improvements – software – • Use higher-level (perceptual) features to describe sound characteristics • Tune the segmentation algorithm • Evaluate the anticipation strategy (clustering method, front/back model, probability thresholds) • Determine a feasible feature rate for GestureFollower • Seed GestureFollower with candidate probabilities

  32. Future improvements – hardware – • Use clamp-frame mechanism to allow exchangeable surfaces (if possible) • Explore drum-skin-like surfaces (not damped) • Tactile feedback (potential cooperation with ISIR haptics lab at Paris 6) • Visual feedback (projection on surface)

  33. Artistic implications • Staging / ways to interact (novel instrument) • How to radiate the sound? • Linking performance concepts to sound, using additional objects and a corresponding sound corpus

  34. Thanks! See you!
