1 Filter Banks SPEECH RECOGNITION 40833
2 Spectral Analysis Models (a) Pattern-recognition and (b) acoustic-phonetic approaches to speech recognition
3 Spectral Analysis Models LPC analysis model
4 THE BANK-OF-FILTERS FRONT-END PROCESSOR Complete bank-of-filters analysis model
5 THE BANK-OF-FILTERS FRONT-END PROCESSOR
6 THE BANK-OF-FILTERS FRONT-END PROCESSOR Typical waveforms and spectra for analysis of a pure sinusoid in the filter-bank model
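The effect of one filter-bank channel on a pure sinusoid can be sketched numerically. The sketch below assumes the channel follows its bandpass filter with a full-wave rectifier and a lowpass filter; the slide text does not spell this out, so treat the rectifier, the moving-average lowpass, the sampling rate `FS`, and the tone frequency `TONE_HZ` as illustrative assumptions. Rectifying the tone moves its energy to DC and to even harmonics, and the lowpass filter then keeps only the slowly varying level.

```python
import numpy as np

FS = 10_000        # Hz, hypothetical sampling rate
TONE_HZ = 1_000    # Hz, hypothetical tone inside one channel's passband

n = np.arange(1000)                         # exactly 100 periods of the tone
tone = np.cos(2 * np.pi * TONE_HZ * n / FS)

# Assumed nonlinearity: full-wave rectifier.
rectified = np.abs(tone)

# Spectrum of the rectified tone; bin spacing is FS / len(n) = 10 Hz.
spectrum = np.abs(np.fft.rfft(rectified))
dc = spectrum[0]            # 0 Hz
at_tone = spectrum[100]     # 1000 Hz, the original tone frequency
at_double = spectrum[200]   # 2000 Hz, the second harmonic

# Assumed lowpass filter: 50-sample moving average (5 ms at 10 kHz).
envelope = np.convolve(rectified, np.ones(50) / 50, mode="valid")
```

In this setup the component at the tone frequency vanishes after rectification, while the DC term and the 2000 Hz term carry the energy; the smoothed `envelope` is a constant level proportional to the tone amplitude.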
7 THE BANK-OF-FILTERS FRONT-END PROCESSOR Typical waveforms and spectra of a voiced speech signal in the bank-of-filters analysis model
8 THE BANK-OF-FILTERS FRONT-END PROCESSOR Ideal (a) and realistic (b) sets of filter responses of a Q-channel filter bank covering the frequency range $F_s/N$ to $(Q + 1/2)F_s/N$
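9 Types of Filter Bank Used for Speech Recognition
Uniform filter bank (center frequencies $f_i$, bandwidths $b_i$, sampling rate $F_s$):
$f_i = \frac{F_s}{N}\, i, \quad 1 \le i \le Q$
$Q \le N/2$
$b_i \ge \frac{F_s}{N}$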
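As a small numerical illustration of the uniform spacing rule $f_i = (F_s/N)\,i$ with $Q \le N/2$, the following sketch computes the center frequencies. The function name and the example values (10 kHz sampling rate, $N = 32$, $Q = 15$) are hypothetical and chosen only for illustration.

```python
import numpy as np

def uniform_center_frequencies(fs, n, q):
    """Center frequencies f_i = (fs / n) * i for i = 1..q, requiring q <= n / 2."""
    if q > n / 2:
        raise ValueError("q must not exceed n / 2")
    i = np.arange(1, q + 1)
    return fs / n * i

# Illustrative values only: Fs = 10 kHz, N = 32, Q = 15.
centers = uniform_center_frequencies(10_000.0, 32, 15)
min_bandwidth = 10_000.0 / 32   # lower bound on each b_i, i.e. Fs / N
```

With these values the centers run from 312.5 Hz to 4687.5 Hz in steps of 312.5 Hz, and each bandwidth must be at least 312.5 Hz so that adjacent filters leave no gap.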
10 Non-uniform Filter Banks
$b_1 = c$
$b_i = \alpha\, b_{i-1}, \quad 2 \le i \le Q$
$f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$
11 Nonuniform Filter Banks
Filter 1: $f_1 = 300$ Hz, $b_1 = 200$ Hz
Filter 2: $f_2 = 600$ Hz, $b_2 = 400$ Hz
Filter 3: $f_3 = 1200$ Hz, $b_3 = 800$ Hz
Filter 4: $f_4 = 2400$ Hz, $b_4 = 1600$ Hz
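The four-filter example can be reproduced from the recursion $b_1 = c$, $b_i = \alpha\, b_{i-1}$, $f_i = f_1 + \sum_{j=1}^{i-1} b_j + (b_i - b_1)/2$ with $f_1 = 300$ Hz, $c = 200$ Hz, and $\alpha = 2$ (an octave spacing). The function name below is hypothetical; this is a minimal sketch of the arithmetic, not a filter design routine.

```python
def nonuniform_bank(f1, c, alpha, q):
    """Return (centers, bandwidths) for a q-channel bank with b_i = alpha * b_{i-1}."""
    bandwidths = [c]
    for _ in range(2, q + 1):
        bandwidths.append(alpha * bandwidths[-1])
    centers = []
    for i in range(1, q + 1):
        b_i = bandwidths[i - 1]
        # f_i = f_1 + sum of the bandwidths below filter i + (b_i - b_1) / 2
        centers.append(f1 + sum(bandwidths[: i - 1]) + (b_i - c) / 2)
    return centers, bandwidths

centers, bandwidths = nonuniform_bank(f1=300.0, c=200.0, alpha=2.0, q=4)
# centers    -> [300.0, 600.0, 1200.0, 2400.0]
# bandwidths -> [200.0, 400.0, 800.0, 1600.0]
```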
12 Types of Filter Bank Used for Speech Recognition Ideal specification of a 4-channel octave-band filter bank (a), a 12-channel third-octave-band filter bank (b), and a 7-channel critical-band-scale filter bank (c) covering the telephone bandwidth range (200-3200 Hz)
13 Types of Filter Bank Used for Speech Recognition The variation of bandwidth with frequency for the perceptually based critical band scale
14 Implementations of Filter Banks Instead of direct convolution, which is computationally expensive, each bandpass filter impulse response $h_i(n)$ is assumed to be a fixed lowpass window, $w(n)$, modulated by a complex exponential:
$h_i(n) = w(n)\, e^{j\omega_i n}$
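A quick way to see why modulating a lowpass window gives a bandpass filter is to build $h_i(n) = w(n)\,e^{j\omega_i n}$ and look at where its magnitude response peaks. In the sketch below the Hamming window, its length, and the center frequency are hypothetical choices made for illustration.

```python
import numpy as np

def modulated_bandpass(window, omega_i):
    """Impulse response h_i(n) = w(n) * exp(j * omega_i * n)."""
    n = np.arange(len(window))
    return window * np.exp(1j * omega_i * n)

def magnitude_response(h, num_points=1024):
    """Magnitude of the DTFT of h sampled at num_points frequencies in [0, 2*pi)."""
    return np.abs(np.fft.fft(h, num_points))

window = np.hamming(101)        # hypothetical lowpass window w(n)
omega_i = 2 * np.pi * 0.125     # hypothetical center frequency in rad/sample
h_i = modulated_bandpass(window, omega_i)
response = magnitude_response(h_i)
peak_bin = int(np.argmax(response))   # 0.125 * 1024 = bin 128
```

The response is the lowpass response of $w(n)$ shifted from zero frequency to $\omega_i$, so the same window can serve every channel and only the modulation frequency changes.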
15 Implementations of Filter Banks The signals s(m) and w(n-m) used in evaluation of the short-time Fourier transform
16 Frequency Domain Interpretation of the Short-Time Fourier Transform Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of voiced speech (panels: value versus sample; log magnitude in dB versus frequency)
17 Frequency Domain Interpretation of the Short-Time Fourier Transform Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of voiced speech (panels: value versus sample; log magnitude in dB versus frequency)
18 Frequency Domain Interpretation of the Short-Time Fourier Transform Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of unvoiced speech (panels: value versus sample; log magnitude in dB versus frequency)
19 Frequency Domain Interpretation of the Short-Time Fourier Transform Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of unvoiced speech (panels: value versus sample; log magnitude in dB versus frequency)
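The trade-off between the long and short Hamming windows can be illustrated on a synthetic signal. The sketch below uses a single 1 kHz tone as a stand-in for speech (an assumption; the slides use real voiced and unvoiced segments) and a 10 kHz sampling rate, which is what "500 points or 50 msec" implies. It measures how wide the spectral peak is for each window length; the helper names are hypothetical.

```python
import numpy as np

FS = 10_000  # Hz; implied by "500 points or 50 msec" in the captions

def log_magnitude_spectrum(frame, nfft=4096):
    """Hamming-window a frame; return (frequencies in Hz, log magnitude in dB)."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed, nfft)
    log_mag = 20 * np.log10(np.abs(spectrum) + 1e-12)
    freqs = np.fft.rfftfreq(nfft, d=1 / FS)
    return freqs, log_mag

def width_within_6db(freqs, log_mag):
    """Width in Hz of the band lying within 6 dB of the spectral peak."""
    near_peak = freqs[log_mag >= log_mag.max() - 6.0]
    return near_peak.max() - near_peak.min()

n = np.arange(500)
tone = np.cos(2 * np.pi * 1000 * n / FS)   # synthetic stand-in for speech

long_width = width_within_6db(*log_magnitude_spectrum(tone[:500]))   # 50 msec
short_width = width_within_6db(*log_magnitude_spectrum(tone[:50]))   # 5 msec
```

The 500-point window gives a narrow peak, fine enough to separate individual pitch harmonics in voiced speech, while the 50-point window smears the peak over roughly ten times the bandwidth and so shows only the broad spectral envelope.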
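20 Linear Filter Interpretation of the STFT
Block diagram: $s(n)$ is multiplied by $e^{-j\omega_i n}$ to give $\tilde{s}(n)$, which is passed through the lowpass filter $w(n)$ to give $S_n(e^{j\omega_i})$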
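The linear filter interpretation can be checked numerically: evaluating the short-time Fourier transform sum $S_n(e^{j\omega_i}) = \sum_m s(m)\, w(n-m)\, e^{-j\omega_i m}$ directly should give the same value as demodulating $s(n)$ by $e^{-j\omega_i n}$ and then convolving with $w(n)$. The sketch below assumes a causal window defined on $0 \le n < L$; the test signal, window, and frequency are arbitrary illustrative choices.

```python
import numpy as np

def stft_direct(s, w, omega_i, n):
    """Evaluate sum_m s(m) w(n - m) exp(-j omega_i m) at a single time index n."""
    total = 0j
    for m in range(len(s)):
        k = n - m
        if 0 <= k < len(w):          # w is zero outside 0 .. len(w) - 1
            total += s[m] * w[k] * np.exp(-1j * omega_i * m)
    return total

def stft_as_lowpass_filter(s, w, omega_i):
    """Demodulate s(n) by exp(-j omega_i n), then filter the result with w(n)."""
    m = np.arange(len(s))
    s_tilde = s * np.exp(-1j * omega_i * m)
    return np.convolve(s_tilde, w)   # full convolution; element n is S_n

rng = np.random.default_rng(0)
s = rng.standard_normal(200)         # arbitrary test signal
w = np.hamming(31)                   # arbitrary lowpass window
omega_i = 2 * np.pi * 0.1
filtered = stft_as_lowpass_filter(s, w, omega_i)
direct = stft_direct(s, w, omega_i, n=100)
```

Here `filtered[100]` and `direct` agree to within floating-point rounding, which is the equivalence the block diagram expresses.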
21 FFT Implementation of a Uniform Filter Bank FFT implementation of a uniform filter bank
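One common way to realize a uniform filter bank with an FFT, sketched here under the assumption that the channel frequencies are $\omega_k = 2\pi k/N$, is to window the signal, fold the windowed segment modulo $N$ (time aliasing), and take a single $N$-point FFT. That yields all $N$ channel outputs $S_n(e^{j\omega_k})$ at time $n$ at once, even when the window is longer than $N$. The function names and test values are hypothetical.

```python
import numpy as np

def filter_bank_fft(s, w, n, num_channels):
    """All channel outputs S_n(e^{j 2 pi k / N}) at time n, using one N-point FFT."""
    length = len(w)
    m = np.arange(n - length + 1, n + 1)      # samples covered by w(n - m)
    if m[0] < 0 or m[-1] >= len(s):
        raise ValueError("window does not fit inside the signal at this n")
    x = s[m] * w[n - m]                       # windowed segment
    folded = np.zeros(num_channels, dtype=complex)
    np.add.at(folded, m % num_channels, x)    # fold by absolute time index mod N
    return np.fft.fft(folded)

def filter_bank_direct(s, w, n, num_channels):
    """Reference: evaluate the STFT sum separately for every channel k."""
    length = len(w)
    m = np.arange(n - length + 1, n + 1)
    k = np.arange(num_channels)[:, None]
    kernel = np.exp(-2j * np.pi * k * m / num_channels)
    return np.sum(s[m] * w[n - m] * kernel, axis=1)

rng = np.random.default_rng(1)
s = rng.standard_normal(400)
w = np.hamming(101)                           # longer than N on purpose
fast = filter_bank_fft(s, w, n=250, num_channels=32)
slow = filter_bank_direct(s, w, n=250, num_channels=32)
```

The folding step works because $e^{-j2\pi k m/N}$ is periodic in $m$ with period $N$, so samples that are $N$ apart can be summed before the transform.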
22 Direct implementation of an arbitrary filter bank
Block diagram: $s(n)$ drives $Q$ parallel filters $h_1(n), h_2(n), \ldots, h_Q(n)$, producing the outputs $X_1(n), X_2(n), \ldots, X_Q(n)$
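The direct form is simply one convolution per channel, $X_i(n) = \sum_m h_i(m)\, s(n-m)$. A minimal sketch follows; the two short FIR filters are hypothetical stand-ins for $h_1(n), \ldots, h_Q(n)$ and are not taken from the slides.

```python
import numpy as np

def direct_filter_bank(s, filters):
    """Return X_i(n) = sum_m h_i(m) s(n - m) for every filter h_i in the bank."""
    return [np.convolve(s, h_i) for h_i in filters]

# Two hypothetical FIR filters standing in for h_1(n) ... h_Q(n).
h_1 = np.array([0.25, 0.5, 0.25])      # crude lowpass
h_2 = np.array([-0.25, 0.5, -0.25])    # crude highpass

impulse = np.zeros(8)
impulse[0] = 1.0
outputs = direct_filter_bank(impulse, [h_1, h_2])

# Cost of the direct form: one multiply per filter tap per output sample.
multiplies_per_sample = sum(len(h) for h in (h_1, h_2))
```

Feeding a unit impulse returns each filter's impulse response, and the multiply count grows with the total number of taps across all channels, which is why the direct form becomes expensive for long filters or many channels.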
23 Nonuniform FIR Filter Bank Implementations Two arbitrary nonuniform filter-bank filter specifications consisting of either 3 bands (part a) or 7 bands (part b).
24 Tree Structure Realizations of Nonuniform Filter Banks
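A tree structure builds nonuniform bands by repeatedly splitting a band in two and decimating. The sketch below is one illustrative realization only: it assumes the simplest possible two-band split (Haar sum and difference filters followed by decimation by 2) and always re-splits the low band, which gives octave-spaced bands. The slides do not specify the half-band filters, so the filter choice and the function names are assumptions.

```python
import numpy as np

def split_two_bands(x):
    """Split x into low and high half-bands (Haar filters), decimated by 2."""
    x = x[: len(x) // 2 * 2]                 # drop a trailing odd sample
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2)          # lowpass branch
    high = (even - odd) / np.sqrt(2)         # highpass branch
    return low, high

def octave_tree(x, levels):
    """Re-split the low band at each level; return bands, highest band first."""
    bands = []
    low = np.asarray(x, dtype=float)
    for _ in range(levels):
        low, high = split_two_bands(low)
        bands.append(high)
    bands.append(low)
    return bands

rng = np.random.default_rng(2)
signal = rng.standard_normal(64)
bands = octave_tree(signal, levels=3)
band_lengths = [len(b) for b in bands]       # 32, 16, 8, 8 samples
```

Each level halves the sampling rate of the band it passes down, so the lower, narrower bands are computed at lower rates; a practical design would replace the Haar pair with sharper half-band filters.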
25 Practical Examples of Speech-Recognition Filter Banks (panels: value versus time in samples; magnitude in dB versus frequency in kHz)
26 Practical Examples of Speech-Recognition Filter Banks Window sequence, w(n) (part a), the individual filter responses (part b), and the composite response (part c) of a Q = 15 channel uniform filter bank, designed using a 101-point Kaiser window smoothed lowpass window (after Dautrich et al.). (panel axes: magnitude in dB versus frequency in kHz)
27 Practical Examples of Speech-Recognition Filter Banks (panels: value versus time in samples; magnitude in dB versus frequency in kHz)
28 Practical Examples of Speech-Recognition Filter Banks Window sequence, w(n) (part a), the individual filter responses (part b), and the composite response (part c) of a Q = 15 channel uniform filter bank, designed using a 101-point Kaiser window directly as the lowpass window (after Dautrich et al.). (panel axes: magnitude in dB versus frequency in kHz)
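The second design, with the Kaiser window used directly as the lowpass window, can be approximated in a few lines. Only Q = 15 and the 101-point length come from the captions; the number of uniform bands `N`, the Kaiser shape parameter `BETA`, and the definition of the composite response as the sum of the individual magnitude responses are assumptions made for this sketch and may differ from the original design.

```python
import numpy as np

Q = 15             # number of channels (from the captions)
WINDOW_LEN = 101   # Kaiser window length (from the captions)
N = 32             # hypothetical: uniform bands around the unit circle
BETA = 6.0         # hypothetical Kaiser shape parameter
NFFT = 1024

w = np.kaiser(WINDOW_LEN, BETA)
w = w / w.sum()                     # unit gain at the center of each band
n = np.arange(WINDOW_LEN)

# Individual responses |H_i| for h_i(n) = w(n) exp(j 2 pi i n / N), i = 1..Q.
responses = np.empty((Q, NFFT // 2 + 1))
for i in range(1, Q + 1):
    h_i = w * np.exp(2j * np.pi * i * n / N)
    responses[i - 1] = np.abs(np.fft.fft(h_i, NFFT))[: NFFT // 2 + 1]

# Assumed definition of the composite response: sum of magnitude responses.
composite = responses.sum(axis=0)
peak_bins = responses.argmax(axis=1)   # channel i peaks at bin i * NFFT / N
```

Plotting `20 * np.log10(composite)` between the first and last center frequencies shows how much the overall response dips between adjacent channels, which is the quantity the composite-response panels compare across the two window designs.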
29 Generalizations of Filter-Bank Analyzer
30 Generalizations of Filter-Bank Analyzer
31 Generalizations of Filter-Bank Analyzer
32 Generalizations of Filter-Bank Analyzer