
Filter Banks: SPEECH RECOGNITION 40833 - PowerPoint PPT Presentation


  1. Filter Banks: SPEECH RECOGNITION 40833

  2. Spectral Analysis Models: (a) pattern recognition and (b) acoustic-phonetic approaches to speech recognition

  3. Spectral Analysis Models: LPC analysis model

  4. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Complete bank-of-filters analysis model

  5. THE BANK-OF-FILTERS FRONT-END PROCESSOR

  6. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Typical waveforms and spectra for analysis of a pure sinusoid in the filter-bank model

  7. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Typical waveforms and spectra of a voiced speech signal in the bank-of-filters analysis model

  8. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Ideal (a) and realistic (b) set of filter responses of a Q-channel filter bank covering the frequency range $F_s/N$ to $(Q + 1/2)F_s/N$

  9. Types of Filter Bank Used for Speech Recognition: $f_i = \frac{F_s}{N}\, i$, $1 \le i \le Q$; $Q \le N/2$; $b_i \ge \frac{F_s}{N}$
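Reading the relations on slide 9 as the uniform filter bank (center frequencies spaced F_s/N apart, at most N/2 channels, each bandwidth at least F_s/N), the center frequencies can be tabulated with a minimal sketch like the one below. The function name and the numeric values (8 kHz sampling rate, N = 32, Q = 15) are assumptions chosen for illustration; they do not come from the slides.

```python
def uniform_filter_bank(fs_hz, n, q):
    """Return the Q center frequencies (Hz) and the minimum bandwidth (Hz)."""
    # The slide's constraint: the number of channels Q cannot exceed N/2.
    if not 1 <= q <= n // 2:
        raise ValueError("Q must satisfy 1 <= Q <= N/2")
    spacing_hz = fs_hz / n                                   # F_s / N
    centers_hz = [spacing_hz * i for i in range(1, q + 1)]   # f_i = (F_s/N) i
    return centers_hz, spacing_hz


# Hypothetical values, chosen only for illustration.
centers, min_bandwidth = uniform_filter_bank(fs_hz=8000, n=32, q=15)
print(centers[0], centers[-1], min_bandwidth)
```

With these assumed values the channels sit 250 Hz apart, and each filter needs a bandwidth of at least 250 Hz for adjacent channels to cover the band without gaps.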

  10. Non-uniform Filter Banks: $b_1 = c$; $b_i = \alpha\, b_{i-1}$, $2 \le i \le Q$; $f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$

  11. Nonuniform Filter Banks: Filter 1: $f_1 = 300$ Hz, $b_1 = 200$ Hz; Filter 2: $f_2 = 600$ Hz, $b_2 = 400$ Hz; Filter 3: $f_3 = 1200$ Hz, $b_3 = 800$ Hz; Filter 4: $f_4 = 2400$ Hz, $b_4 = 1600$ Hz
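The four filters on slide 11 follow from the recursion on slide 10 if the first filter has f_1 = 300 Hz and b_1 = 200 Hz and the growth factor is taken to be α = 2 (an octave spacing; the value of α is inferred from the listed bandwidths, not stated on the slide). A minimal sketch, with a hypothetical function name:

```python
def nonuniform_filter_bank(f1_hz, c_hz, alpha, q):
    """Return (center, bandwidth) pairs in Hz for a Q-channel nonuniform bank."""
    # b_1 = c and b_i = alpha * b_{i-1}, so b_i = c * alpha**(i - 1).
    bandwidths = [c_hz * alpha ** i for i in range(q)]
    bank = []
    for i, b_i in enumerate(bandwidths):
        # f_i = f_1 + (sum of all earlier bandwidths) + (b_i - b_1) / 2
        f_i = f1_hz + sum(bandwidths[:i]) + (b_i - bandwidths[0]) / 2
        bank.append((f_i, b_i))
    return bank


octave_bank = nonuniform_filter_bank(f1_hz=300, c_hz=200, alpha=2, q=4)
print(octave_bank)
# [(300.0, 200), (600.0, 400), (1200.0, 800), (2400.0, 1600)]
```

Each filter's band edges are f_i ± b_i/2, so the four bands tile 200 to 3200 Hz with no gaps or overlap.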

  12. Types of Filter Bank Used for Speech Recognition: Ideal specification of a 4-channel octave band-filter bank (a), a 12-channel third-octave band filter bank (b), and a 7-channel critical band scale filter bank (c)

  13. Types of Filter Bank Used for Speech Recognition: Ideal specification of a 4-channel octave band-filter bank (a), a 12-channel third-octave band filter bank (b), and a 7-channel critical band scale filter bank (c) covering the telephone bandwidth range (200-3200 Hz). The variation of bandwidth with frequency for the perceptually based critical band scale.

  14. Implementations of Filter Banks: Instead of direct convolution, which is computationally expensive, each bandpass filter impulse response $h_i(n)$ is assumed to be a fixed lowpass window $w(n)$ modulated by the complex exponential $e^{j\omega_i n}$, that is, $h_i(n) = w(n)\, e^{j\omega_i n}$.
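With the modulated-window form on slide 14, the output of the i-th bandpass filter equals the short-time Fourier transform at ω_i multiplied by e^{jω_i n}, so the two have the same magnitude. The sketch below checks this numerically. It assumes the standard STFT definition S_n(e^{jω}) = Σ_m s(m) w(n−m) e^{−jωm}; the Hamming window, the 1 kHz test tone, the 8 kHz sampling rate, and all function names are illustrative assumptions, not taken from the slides.

```python
import cmath
import math


def hamming(length):
    """Hamming window of the given length (one common lowpass window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))
            for k in range(length)]


def bandpass_output(s, w, omega, n):
    """x_i(n) = sum_m s(m) h_i(n - m), with h_i(k) = w(k) e^{j omega k}."""
    total = 0j
    for m, sample in enumerate(s):
        k = n - m
        if 0 <= k < len(w):
            total += sample * w[k] * cmath.exp(1j * omega * k)
    return total


def stft(s, w, omega, n):
    """S_n(e^{j omega}) = sum_m s(m) w(n - m) e^{-j omega m}."""
    total = 0j
    for m, sample in enumerate(s):
        k = n - m
        if 0 <= k < len(w):
            total += sample * w[k] * cmath.exp(-1j * omega * m)
    return total


# Hypothetical test setup: a 1 kHz cosine sampled at 8 kHz.
fs_hz = 8000.0
omega_i = 2 * math.pi * 1000.0 / fs_hz
signal = [math.cos(omega_i * m) for m in range(64)]
window = hamming(21)
n = 40

direct = bandpass_output(signal, window, omega_i, n)
via_stft = cmath.exp(1j * omega_i * n) * stft(signal, window, omega_i, n)
print(abs(direct - via_stft))  # difference is at floating-point rounding level
```

The identity holds because w(n−m) e^{jω_i(n−m)} factors into e^{jω_i n} times w(n−m) e^{−jω_i m}.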

  15. Implementations of Filter Banks: The signals $s(m)$ and $w(n-m)$ used in evaluation of the short-time Fourier transform

  16. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of voiced speech. [Plots: value vs. sample; log magnitude (dB) vs. frequency]

  17. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of voiced speech. [Plots: value vs. sample; log magnitude (dB) vs. frequency]

  18. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of unvoiced speech. [Plots: value vs. sample; log magnitude (dB) vs. frequency]

  19. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of unvoiced speech. [Plots: value vs. sample; log magnitude (dB) vs. frequency]
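The window lengths quoted on slides 16 to 19 (500 points in 50 msec, 50 points in 5 msec) imply a 10 kHz sampling rate. As a rough guide to how the two windows differ in frequency resolution, the sketch below uses the common rule of thumb that an L-point Hamming window has a null-to-null main-lobe width of about 8π/L radians, or 4·F_s/L Hz. The rule of thumb and the function names are additions for illustration, not part of the slides.

```python
def sampling_rate_hz(points, duration_ms):
    """Sampling rate implied by a window of `points` samples lasting `duration_ms`."""
    return points * 1000 / duration_ms


def hamming_main_lobe_hz(points, fs_hz):
    """Approximate null-to-null main-lobe width of a Hamming window, in Hz."""
    return 4 * fs_hz / points


fs = sampling_rate_hz(500, 50)            # same rate follows from 50 points / 5 msec
long_width = hamming_main_lobe_hz(500, fs)
short_width = hamming_main_lobe_hz(50, fs)
print(fs, long_width, short_width)
```

Under this approximation the long window resolves spectral detail about ten times finer than the short window, at the cost of averaging over ten times as much signal.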

  20. Linear Filter Interpretation of the STFT: [Block diagram: $s(n)$ is multiplied by $e^{-j\omega_i n}$ to give $\tilde{s}(n)$, which is filtered by $w(n)$ to produce $S_n(e^{j\omega_i})$]

  21. FFT Implementation of a Uniform Filter Bank

  22. Direct implementation of an arbitrary filter bank: [Block diagram: the input $s(n)$ feeds $Q$ parallel filters $h_1(n), h_2(n), \ldots, h_Q(n)$, producing the outputs $X_1(n), X_2(n), \ldots, X_Q(n)$]

  23. Nonuniform FIR Filter Bank Implementations: Two arbitrary nonuniform filter-bank filter specifications consisting of either 3 bands (part a) or 7 bands (part b).

  24. Tree Structure Realizations of Nonuniform Filter Banks

  25. Practical Examples of Speech-Recognition Filter Banks: [Plots: value vs. time in samples; magnitude (dB) vs. frequency (kHz)]

  26. Practical Examples of Speech-Recognition Filter Banks: Window sequence, w(n), (part a), the individual filter response (part b), and the composite response (part c) of a Q = 15 channel, uniform filter bank, designed using a 101-point Kaiser window smoothed lowpass window (after Dautrich et al.). [Plot: magnitude (dB) vs. frequency (kHz)]

  27. Practical Examples of Speech-Recognition Filter Banks: [Plots: value vs. time in samples; magnitude (dB) vs. frequency (kHz)]

  28. Practical Examples of Speech-Recognition Filter Banks: Window sequence, w(n), (part a), the individual filter response (part b), and the composite response (part c) of a Q = 15 channel, uniform filter bank, designed using a 101-point Kaiser window directly as the lowpass window (after Dautrich et al.). [Plot: magnitude (dB) vs. frequency (kHz)]

  29. Generalizations of Filter-Bank Analyzer

  30. Generalizations of Filter-Bank Analyzer

  31. Generalizations of Filter-Bank Analyzer

  32. Generalizations of Filter-Bank Analyzer
