matching the analysis scheme to the signal

Matching the Analysis Scheme to the Signal (Fritz Menzer)



  1. Time-Frequency Analysis for Audio Workshop: Matching the Analysis Scheme to the Signal. Fritz Menzer (fritz.menzer@epfl.ch), Communication Systems, 5th year, Ecole Polytechnique Fédérale de Lausanne. 15th April, 2004

  2. Overview
     1 Introduction
     2 Perfect Reconstruction - who cares?
       2.1 Definition of perfect reconstruction
       2.2 Do we need perfect reconstruction?
     3 Harmonic Band Wavelet Transform
       3.1 Coefficient modeling
       3.2 Advantages / Drawbacks
     4 From HBWT to inharmonic sound modeling
       4.1 Taking filters from different PR filterbanks
       4.2 Why aliasing is not a problem
       4.3 Method Overview
       4.4 Sounds
     5 Time-Frequency Analysis and Granular Synthesis
       5.1 Time-domain effects
       5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition
     A References

  3. 1 Introduction
     • If you know what you're looking at, you can examine it more precisely.

  4. 2 Perfect Reconstruction - who cares? 2.1 Definition of perfect reconstruction
     • Definition: a Perfect Reconstruction (PR) method provides a direct transform $T$ and an inverse transform $T^{-1}$ such that, for any signal $s$, $T^{-1}(T(s)) = s$.
     • FFT-based methods, cosine modulated filterbanks and wavelet transforms are usually PR methods.
     • Simple operations like filtering or distortion do not necessarily allow PR (i.e. it may be impossible to find $T^{-1}$). Example: quantisation does not allow the original signal to be reconstructed perfectly.
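The definition can be checked numerically. The following is a minimal sketch, not taken from the slides: it uses a naive DFT as the transform pair, and the helper names (`dft`, `idft`, `quantise`, `max_error`), the test signal and the quantisation step are all illustrative assumptions.

```python
import cmath

def dft(s):
    """Direct transform T: naive O(N^2) discrete Fourier transform."""
    n = len(s)
    return [sum(s[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(c):
    """Inverse transform T^-1 of dft(); returns the real part of each sample."""
    n = len(c)
    return [sum(c[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def quantise(s, step=0.25):
    """Round every sample to a multiple of `step` (hypothetical step size).
    Different inputs can map to the same output, so no inverse exists."""
    return [step * round(x / step) for x in s]

def max_error(a, b):
    """Largest absolute sample difference between two signals."""
    return max(abs(x - y) for x, y in zip(a, b))

# Arbitrary example signal (an assumption, not data from the slides).
signal = [0.1, 0.7, -0.3, 0.45, -0.9, 0.2, 0.05, -0.6]

pr_error = max_error(signal, idft(dft(signal)))  # floating-point rounding only
q_error = max_error(signal, quantise(signal))    # information has been lost
```

Here `pr_error` stays at the level of floating-point rounding, whereas `q_error` is of the order of the quantisation step, which is what separates a PR method from a lossy operation.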

  5. 2.2 Do we need perfect reconstruction? [Figure: two signals, "Noise" and "Noise, down- and upsampled by 4", each plotted against samples and against frequency [kHz].]
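The comparison of noise with the same noise down- and upsampled by 4 can be reproduced with a small experiment. This is a sketch under stated assumptions: the slide does not say how the resampling was done, so plain decimation followed by zero insertion (no filters) is assumed here, and the names `dft`, `down_up`, `N` and `M` are illustrative.

```python
import cmath
import random

def dft(s):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(s)
    return [sum(s[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def down_up(s, m):
    """Assumed operation: keep every m-th sample, put zeros in between."""
    return [x if i % m == 0 else 0.0 for i, x in enumerate(s)]

random.seed(0)
N, M = 32, 4  # signal length and resampling factor (N divisible by M)
noise = [random.gauss(0.0, 1.0) for _ in range(N)]
processed = down_up(noise, M)

X = dft(noise)
Y = dft(processed)

# Aliasing: each output bin is the average of M input bins spaced N/M apart.
predicted = [sum(X[(k + r * (N // M)) % N] for r in range(M)) / M
             for k in range(N)]
alias_error = max(abs(y - p) for y, p in zip(Y, predicted))
```

The processed signal clearly differs from the input, so this operation is not PR. Its spectrum, however, is an average of shifted copies of the noise spectrum, and for white noise that is still flat on average, only lower in level.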

  6. Do we need perfect reconstruction?
     • Not needed for:
       – Modifying a signal
       – Handling noise
       – If the nature of the signal is known
     • Why use PR methods for compression?
       – Generality (ideally any signal can be treated)
       – Localising the source of errors!

  7. 3 Harmonic Band Wavelet Transform (Polotti and Evangelista, 2000) [Block diagram: the input x(n) feeds P channels with filters g_0(k), g_1(k), ..., g_{P-1}(k) and downsampling by P; each channel is followed by a wavelet tree of φ(k) and ψ(k) filters with downsampling by 2. The outputs are labelled "DC Comp." for channel 0 and "Sinusoidal Part" for the other channels.]

  8. [Figure: time-frequency plot; axes: time [sec] (0 to 1) and frequency [Hz] (0 to 10000).]

  9. [Figure: time-frequency plot; axes: time [sec] (0 to 1) and frequency [Hz] (0 to 10000).]

  10. 3.1 Coefficient modeling
      • Wavelet Transform
      • Model the scale residual sinusoidally
      • Model the wavelet coefficients using LPC
      [Figure: magnitude responses for N = 4: the wavelets |Ψ_{n,0}(ω)| for n = 1, 2, 3, 4 and the scale residual |Φ_{4,0}(ω)|, plotted over ω from 0 to 3.5.]
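Modelling coefficients with LPC amounts to fitting a short linear predictor to each coefficient sequence. The slides do not say which LPC estimator or model order is used, so the following is only a sketch that assumes the autocorrelation method with the Levinson-Durbin recursion. The function names and the synthetic AR(2) test sequence are illustrative stand-ins, not real wavelet coefficients.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag]."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n)) / n
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Levinson-Durbin recursion.
    Returns (a, err) with x[t] ~ sum(a[i] * x[t - 1 - i]) and err the
    remaining prediction-error energy."""
    r = autocorr(x, order)
    a = []
    err = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[i - 1 - j] for j in range(i)] + [k]
        err *= (1.0 - k * k)
    return a, err

# Synthetic stand-in data: x[t] = 1.2 x[t-1] - 0.5 x[t-2] + white noise.
random.seed(1)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(1.2 * x[-1] - 0.5 * x[-2] + random.gauss(0.0, 1.0))

coeffs, residual_energy = lpc(x, 2)  # coeffs should be close to [1.2, -0.5]
```

With such a model, a coefficient sequence is summarised by a handful of predictor coefficients plus the energy of the excitation noise, which is why LPC suits the noise-like part of the signal.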

  11. 3.2 Advantages / Drawbacks
      + Meaningful adaptation of frequency and time resolution ⇒ visually better resolution
      + Reasonable model for the coefficients
      − Works only for monophonic, harmonic sounds
      − No model for the transients

  12. 4 From HBWT to inharmonic sound modeling [Figure: time-frequency plot; axes: time [sec] (1 to 7) and frequency [Hz] (200 to 1800).]

  13. 4.1 Taking filters from different PR filterbanks [Figure: two frequency axes (ω), each marked with the 1st, 2nd, 3rd, ... partials.]

  14. 4.2 Why aliasing is not a problem
      If a sinusoid of the form $\sin\left(\frac{\hat{k}\pi}{P}\,t + \varphi\right)$ is the input to a P-channel cosine modulated filterbank, only two bands will output nonzero coefficients:
      $$\left|H_k\left(e^{j\hat{k}\pi/P}\right)\right| \neq 0 \iff k \in \{\hat{k}-1,\ \hat{k}\}$$
      [Figure: frequency axis ω with the partial's frequency marked.]
      ⇒ There is no aliasing of the sinusoidal part, but only of the part that we model as noise!

  15. 4.3 Method Overview
      Analysis (in order):
      • Analyse the signal, find N partials and determine the filterbank
      • Calculate 2N sets of filterbank coefficients + residual
      • Calculate the wavelet transform (WT) of the filterbank coefficients
      • Model the WT coefficients sinusoidally and with LPC
      Synthesis (in order):
      • Reconstruct the WT coefficients
      • Perform the inverse wavelet transform to get the filterbank coefficients
      • Apply the inverse filterbank
      • Add the residual (or not)

  16. 4.4 Sounds
      • Original Gong
      • Reconstructed from the Filterbank Coefficients
      • Synthesized from model parameters
      • 1 octave pitch-shifted Gong
      • Time-stretched Gong
      • Sinusoidal-only Gong
      • First wavelet scale only
      • Harmonic Gong

  17. 5 Time-Frequency Analysis and Granular Synthesis
      • Any Time-Frequency Transform implements a sort of Granular Synthesis.
      • Each coefficient corresponds to a grain.
      • Grains are played at precise instants (instead of randomly).
      • To produce a grain, set all coefficients to zero, except one that will be set to one. Then perform the inverse transform.
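The recipe for producing a grain works with any inverse transform. Below is a minimal sketch that uses an orthonormal Haar wavelet transform as a stand-in; the transforms in the slides (STFT, cosine modulated filterbank, HBWT) are different, and the function names and coefficient layout are assumptions made for illustration.

```python
import math

def inverse_haar(coeffs):
    """Inverse orthonormal Haar wavelet transform.
    len(coeffs) must be a power of two; assumed layout:
    [scaling coefficient, coarsest detail, ..., finest details]."""
    out = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(out)]
        nxt = []
        for a, d in zip(out, details):
            # Each (approximation, detail) pair expands into two samples.
            nxt.append((a + d) / math.sqrt(2.0))
            nxt.append((a - d) / math.sqrt(2.0))
        pos += len(out)
        out = nxt
    return out

def grain(inverse_transform, n_coeffs, index):
    """Set all coefficients to zero except one set to one, then invert."""
    coeffs = [0.0] * n_coeffs
    coeffs[index] = 1.0
    return inverse_transform(coeffs)

# One grain per coefficient of an 8-point transform.
grains = [grain(inverse_haar, 8, i) for i in range(8)]

# Number of nonzero samples per grain: coarse scales give long grains.
durations = [sum(1 for v in g if v != 0.0) for g in grains]
```

In this toy case the grains of the coarse scales span the whole block while those of the finest scale last only two samples, which illustrates how the duration of a grain depends on the scale of its coefficient.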

  18. Windowed FFT (STFT) grain [Figure: grain waveform; x-axis: time [msec] (0 to 12), amplitude scale ×10^−3.]

  19. Cosine Modulated Filterbank grain [Figure: grain waveform; x-axis: time [msec] (0 to 5).]

  20. Full-tree wavelet "grain" [Figure: grain waveform; x-axis: time [msec] (0 to 60).]

  21. HBWT grain (noise part) [Figure: grain waveform; x-axis: time [msec] (0 to 16).]

  22. HBWT grain (sinusoidal part) [Figure: grain waveform; x-axis: time [msec] (0 to 160).]

  23. 5.1 Time-domain effects [Figure: two waveforms over time [msec] (0 to 7), titled "Channel 8: one grain played continuously" and "Channel 9: one grain played continuously".]

  24. 5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition [Figure: time-frequency plot; axes: time [sec] (0.5 to 5.5) and frequency [Hz] (up to 2.2 × 10^4).]

  25. A References
      • Article on Harmonic Band Wavelet Transform by Polotti and Evangelista: http://lcavwww.epfl.ch/publications/publications/2000/PolottiE00b.pdf
      • DAFx 2002 paper on adaptation to inharmonic sounds: http://lcavwww.epfl.ch/publications/publications/2002/PolottiE02.pdf
      • Some material (presentation slides, Matlab functions and pure data objects): http://www.xsmusic.ch/
