
Frames for Psychoacoustics: Erblet transform and perceptual sparsity (Peter Balazs)



  1. Frames for Psychoacoustics: Erblet transform and perceptual sparsity. Peter Balazs, joint work with T. Necciari, B. Laback, N. Holighaus, D. Stoeva, ... Acoustics Research Institute (ARI), Austrian Academy of Sciences, Vienna. February Fourier Talks 2014. Outline: ARI; Frame Theory; Multipliers; Perceptual Sparsity by Irrelevance; Conclusions.

  2. Acoustics Research Institute (ARI)

  3. ARI: Interdisciplinary research in acoustics, integrating acoustic phonetics, psychoacoustics and computational physics, based on a solid mathematical background. Excellence through Synergy.

  4. ARI

  5. Advantage of a strong mathematical background: application-oriented mathematics; true interdisciplinarity; synergy; novel methods.

  6. Overview: (1) Acoustics Research Institute (ARI); (2) Frame Theory: Time-Frequency Representation, Non-stationary Gabor Transform, ERBlets; (3) Frame Multipliers: Mathematical Background; (4) Perceptual Sparsity by Irrelevance; (5) Conclusions.

  7. Signal Representations: Time-Frequency Analysis and Frames

  8. Spectrogram

  9. Short Time Fourier Transformation (STFT). Definition (see e.g. [Gröchenig, 2001]): Let $f, g \neq 0$ in $L^2(\mathbb{R}^d)$, then we call $V_g f(\tau, \omega) = \int_{\mathbb{R}^d} f(x)\, \overline{g(x - \tau)}\, e^{-2\pi i \omega \cdot x}\, dx$ the Short Time Fourier Transformation (STFT) of the signal $f$ with the window $g$. The sampled version is the Gabor transform: $f \mapsto V_g(f)(a \cdot k, b \cdot l) = \langle f, g_{k,l} \rangle$, where $g_{k,l}(t) = g(t - ka)\, e^{i 2\pi l b t}$. When is perfect reconstruction possible?
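
The sampled formula can be tried out on a finite signal. The following is a minimal sketch, not the speaker's implementation: it assumes a cyclic discrete setting (signal length `L`, time step `a`, `M` frequency channels, so that the frequency step is `L / M` bins), and the function name `gabor_coefficients` and the Hann-like window are hypothetical choices made for illustration.

```python
import cmath
import math


def gabor_coefficients(f, g, a, M):
    """Return c[k][l] = <f, g_{k,l}> on the cyclic group Z_L.

    Assumed discretisation: g_{k,l}[t] = g[(t - k*a) mod L] * exp(2*pi*i*l*b*t/L)
    with frequency step b = L / M (a and M must divide L).
    """
    L = len(f)
    b = L // M
    coeffs = []
    for k in range(L // a):
        row = []
        for l in range(M):
            acc = 0j
            for t in range(L):
                atom = g[(t - k * a) % L] * cmath.exp(2j * math.pi * l * b * t / L)
                acc += f[t] * atom.conjugate()  # inner product conjugates the atom
            row.append(acc)
        coeffs.append(row)
    return coeffs


L, a, M = 32, 4, 8
# Hann-like window supported on the first 8 samples, zero elsewhere.
g = [math.sin(math.pi * (t + 0.5) / 8) ** 2 if t < 8 else 0.0 for t in range(L)]
# Pure tone at DFT bin 8, which is channel l = 8 / (L / M) = 2.
f = [cmath.exp(2j * math.pi * 8 * t / L) for t in range(L)]

c = gabor_coefficients(f, g, a, M)
# For every time index k, find the channel with the largest magnitude.
peak_channels = [max(range(M), key=lambda l: abs(row[l])) for row in c]
```

With these parameters there are `(L / a) * M = 64` coefficients for a signal of 32 samples, so the representation is redundant by a factor of 2; whether it can be inverted is exactly the question the frame definition answers.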

  10. Frames. Definition: The (countable) sequence $\Psi = (\psi_k \mid k \in K)$ is called a frame for the Hilbert space $\mathcal{H}$ if constants $A > 0$ and $B < \infty$ exist such that $A \cdot \|f\|_{\mathcal{H}}^2 \le \sum_k |\langle f, \psi_k \rangle|^2 \le B \cdot \|f\|_{\mathcal{H}}^2$, $\forall f \in \mathcal{H}$. [Duffin and Schaeffer, 1952, Daubechies et al., 1986] Beautiful abstract mathematical setting: frames are a generalization of bases; they can be overcomplete, allowing redundant representations. Active field of research in mathematics!
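
The frame inequality can be checked numerically on a small example. The sketch below is an illustration only and assumes a frame that does not appear in the slides: three unit vectors in $\mathbb{R}^2$ at 120 degree spacing (often called the Mercedes-Benz frame), which is overcomplete and tight with $A = B = 3/2$. The helper name `frame_energy` is hypothetical.

```python
import math
import random


def frame_energy(f, frame):
    """Sum over k of |<f, psi_k>|^2 for vectors in R^2."""
    return sum((f[0] * p[0] + f[1] * p[1]) ** 2 for p in frame)


# Three unit vectors at 120 degree spacing: 3 elements in a 2-dimensional space.
mercedes = [
    (math.cos(2 * math.pi * k / 3 + math.pi / 2),
     math.sin(2 * math.pi * k / 3 + math.pi / 2))
    for k in range(3)
]

# Estimate the frame bounds as the extreme values of energy / ||f||^2.
random.seed(0)
ratios = []
for _ in range(1000):
    f = (random.uniform(-1, 1), random.uniform(-1, 1))
    norm_sq = f[0] ** 2 + f[1] ** 2
    if norm_sq > 1e-6:  # skip vectors too close to zero
        ratios.append(frame_energy(f, mercedes) / norm_sq)

A_est, B_est = min(ratios), max(ratios)
```

Both estimates coincide at 1.5 up to rounding, which is the tight-frame case. For a general frame the two values differ and their ratio indicates how well conditioned the reconstruction is.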

  11. Frame Theory. Interesting for applications: Much more freedom. Finding and constructing frames can be easier and faster. Some advantageous side constraints can only be fulfilled for frames. Perfect reconstruction is guaranteed with the 'canonical dual frame' $\tilde{\psi}_k = S^{-1} \psi_k$: $f = \sum_k \langle f, \psi_k \rangle \tilde{\psi}_k = \sum_k \langle f, \tilde{\psi}_k \rangle \psi_k$, where $S$ is the frame operator $S f = \sum_k \langle f, \psi_k \rangle \psi_k$.
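
The reconstruction formula can be followed step by step in $\mathbb{R}^2$. This is a minimal sketch under stated assumptions: the three-element frame and the test vector are made up for illustration, and the helper names (`frame_operator`, `invert_2x2`, `apply`, `dot`) are hypothetical.

```python
def dot(u, v):
    """Inner product in R^2."""
    return u[0] * v[0] + u[1] * v[1]


def frame_operator(frame):
    """Matrix of S f = sum_k <f, psi_k> psi_k, i.e. the sum of psi_k psi_k^T."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in frame:
        for i in range(2):
            for j in range(2):
                s[i][j] += p[i] * p[j]
    return s


def invert_2x2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def apply(m, v):
    """Matrix-vector product in R^2."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


# A redundant, non-tight frame for R^2 (three vectors, not orthogonal).
frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
S_inv = invert_2x2(frame_operator(frame))
dual = [apply(S_inv, p) for p in frame]  # canonical dual: S^{-1} psi_k

f = (3.0, -2.0)

# f = sum_k <f, psi_k> dual_k
coeffs = [dot(f, p) for p in frame]
recon_a = tuple(sum(c * d[i] for c, d in zip(coeffs, dual)) for i in range(2))

# f = sum_k <f, dual_k> psi_k
dual_coeffs = [dot(f, d) for d in dual]
recon_b = tuple(sum(c * p[i] for c, p in zip(dual_coeffs, frame)) for i in range(2))
```

Both expansions return the original vector up to rounding, even though the three analysis coefficients are not independent.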

  12. Frame Theory: Non-stationary Gabor transform

  13. Non-stationary Gabor Transform. Limitations of standard Gabor analysis: the quality of the representation depends strongly on the window choice, but the optimal window choice is different for different signal components. [Figure: two spectrograms, 'Standard Gabor - Wide window' and 'Standard Gabor - Narrow window'; axes: Time (s), Frequency (Hz).]

  14. Non-stationary Gabor Transform. Our proposition [Balazs et al., 2011]: a simple extension that reduces this limitation by using windows evolving over time.

  15. Non-stationary Gabor Transform. Given a sequence of windows $(g_n)_{n \in \mathbb{Z}}$ of $L^2(\mathbb{R})$ and sequences of real numbers $(a_n)_{n \in \mathbb{Z}}$ and $(b_n)_{n \in \mathbb{Z}}$, the non-stationary Gabor transform (NSGT) elements are defined, for $(m, n) \in \mathbb{Z}^2$, by $g_{m,n}(t) = g_n(t - n a_n)\, e^{i 2\pi m b_n t}$. The regular structure in frequency allows an FFT implementation. An analogous construction in the frequency domain allows easy implementation of, e.g., wavelet frames or an invertible constant-Q transform (CQT) [Velasco et al., 2011].

  16. Non-stationary Gabor Transform. Sampling grid example. [Figure: sampling grid in the time-frequency plane with the corresponding windows; axes: Time, Frequency.]

  17. Non-stationary Gabor Transform. Frame theory allows perfect reconstruction, which is particularly efficient in the 'painless' case. Theorem: For every $n \in \mathbb{Z}$, let the function $g_n \in L^2(\mathbb{R})$ be compactly supported with $\operatorname{supp}(g_n) \subseteq [c_n, d_n]$ such that $d_n - c_n \le \frac{1}{b_n}$. The system of functions $g_{m,n}$ forms a frame for $L^2(\mathbb{R})$ if and only if there exist $A > 0$ and $B < \infty$ such that $A \le \sum_n \frac{1}{b_n} |g_n(t - n a_n)|^2 \le B$. In this case, the canonical dual frame has the same structure and is given by $\tilde{g}_{m,n}(t) = \frac{g_n(t - n a_n)}{\sum_k \frac{1}{b_k} |g_k(t - k a_k)|^2}\, e^{2\pi i m b_n t}$. (1)
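
The painless case can be reproduced in a finite, cyclic setting, where the frame operator becomes a pointwise multiplication and the dual windows of equation (1) amount to a division by its diagonal. The code below is a sketch under assumptions that are not taken from the slides: signals live on $\mathbb{Z}_L$, each window is stored as a pair `(start, samples)`, the number of frequency channels equals the window length (the discrete counterpart of $d_n - c_n \le 1/b_n$), and all function names are hypothetical.

```python
import cmath
import math


def positive_window(length):
    """Hann-like window sampled at half-integer offsets, so every sample is > 0."""
    return [math.sin(math.pi * (j + 0.5) / length) ** 2 for j in range(length)]


def nsgt_analysis(f, windows):
    """Coefficients c[n][m] = <f, g_{m,n}> for real windows given as (start, samples)."""
    L = len(f)
    coeffs = []
    for start, g in windows:
        M = len(g)  # painless case: M channels for a window of M samples
        coeffs.append([
            sum(f[(start + j) % L] * g[j]
                * cmath.exp(-2j * math.pi * m * (start + j) / M)
                for j in range(M))
            for m in range(M)
        ])
    return coeffs


def nsgt_synthesis(coeffs, windows, L):
    """Reconstruct by dividing by the diagonal frame operator (canonical dual)."""
    # Diagonal of S: sum_n M_n * |g_n|^2 at each time index.
    diagonal = [0.0] * L
    for start, g in windows:
        for j, value in enumerate(g):
            diagonal[(start + j) % L] += len(g) * value ** 2
    signal = [0j] * L
    for (start, g), c in zip(windows, coeffs):
        M = len(g)
        for j in range(M):
            t = start + j
            inverse_dft = sum(c[m] * cmath.exp(2j * math.pi * m * t / M)
                              for m in range(M))
            signal[t % L] += g[j] * inverse_dft
    return [signal[t] / diagonal[t] for t in range(L)]


L = 24
# Short windows at the beginning and end, one long window in the middle.
windows = [(0, positive_window(8)), (4, positive_window(8)),
           (8, positive_window(16)), (20, positive_window(8))]
f = [math.sin(2 * math.pi * 3 * t / L) + 0.5 * math.cos(2 * math.pi * 7 * t / L)
     for t in range(L)]

coeffs = nsgt_analysis(f, windows)
reconstruction = nsgt_synthesis(coeffs, windows, L)
max_error = max(abs(x - y) for x, y in zip(f, reconstruction))
```

The frame condition here reduces to the diagonal being bounded away from zero, which holds because the window supports cover all 24 samples. No matrix inversion is needed, which is why the painless case is cheap.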

  18. Non-stationary Gabor Transform. Bird vocalization example. [Figure: three spectrograms, 'Standard Gabor - Wide window', 'Standard Gabor - Narrow window' and 'Nonstationary Gabor'; axes: Time (s), Frequency (Hz).] For an overview of adapted and adaptive time-frequency representations, see [Balazs et al., 2013].

  19. ERBlets. Non-stationary Gabor transform adapted to human auditory perception [Necciari et al., 2013]. [Figure: ERB-scale.]

  20. ERBlets. Non-stationary Gabor transform adapted to human auditory perception [Necciari et al., 2013]. Relative reconstruction error: $< 10^{-15}$. Implementation in LTFAT [Soendergaard et al., 2012].
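
The slides do not state which ERB formula is used. As a rough sketch of what adapting a frequency grid to an ERB scale can look like, the code below assumes the commonly used Glasberg and Moore approximation, $\mathrm{ERB}(f) = 24.7\,(4.37 f / 1000 + 1)$ Hz, with the matching ERB-number scale. It is not the LTFAT implementation, and all function names are hypothetical.

```python
import math


def erb_bandwidth(freq_hz):
    """Equivalent rectangular bandwidth in Hz (assumed Glasberg-Moore form)."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def hz_to_erb_number(freq_hz):
    """Map a frequency in Hz to the ERB-number scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)


def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) / 0.00437


def erb_spaced_centres(f_min, f_max, per_erb=1):
    """Centre frequencies spaced uniformly on the ERB-number scale."""
    lo, hi = hz_to_erb_number(f_min), hz_to_erb_number(f_max)
    count = int(math.floor((hi - lo) * per_erb)) + 1
    return [erb_number_to_hz(lo + i / per_erb) for i in range(count)]


centres = erb_spaced_centres(50.0, 8000.0)
bandwidths = [erb_bandwidth(fc) for fc in centres]
```

The centre frequencies are dense at low frequencies and sparse at high ones, with bandwidths growing accordingly. Windows defined in the frequency domain with such centres and widths would give a non-uniform grid of the kind the slide describes.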

  21. ERBlets: Filterbank.
