Analysis of speech Dr. Anil Kumar Vuppala IIIT Hyderabad
Analysis of speech • Representing the speech signal on a digital computer: sampling and quantization • Representing the information present in speech: extraction of parameters • The method of analysis is application dependent
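As a rough illustration of sampling and quantization, the sketch below samples a unit-amplitude sine and rounds each sample to one of 2^B uniform levels. The function name, the mid-rise quantizer, and the test-tone parameters are assumptions made for this example and are not taken from the slides.

```python
import math

def sample_and_quantize(freq_hz, fs_hz, n_samples, n_bits):
    """Sample a unit-amplitude sine at fs_hz and quantize it to n_bits (illustrative)."""
    levels = 2 ** n_bits
    step = 2.0 / levels  # uniform step size over the amplitude range [-1, 1]
    quantized = []
    for n in range(n_samples):
        x = math.sin(2.0 * math.pi * freq_hz * n / fs_hz)  # sampling at t = n / fs
        index = math.floor(x / step)                        # mid-rise quantizer cell
        index = max(-levels // 2, min(levels // 2 - 1, index))  # clamp to valid cells
        quantized.append((index + 0.5) * step)              # reconstruct at cell centre
    return quantized

# Assumed example: 100 Hz tone, 8 kHz sampling rate, 3-bit quantizer.
q = sample_and_quantize(100.0, 8000.0, 80, 3)
```

With B bits the quantization error of an in-range sample is at most half a step, so each extra bit halves the worst-case error.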
Types of analysis based on segment duration • Segmental (10 – 50 ms): short-time spectrum, formants, pitch • Subsegmental (1 – 5 ms): excitation source characteristics, glottal closure • Suprasegmental (> 100 ms): prosodic features such as intonation, duration, energy contour
Preprocessing: Preemphasis • Primarily used to emphasize high-frequency components relative to low-frequency components • High-pass filtering that flattens the spectral envelope (reduces spectral tilt) • Difference equation: $y(n) = s(n) - a\,s(n-1)$ • Transfer function: $H(z) = \frac{Y(z)}{S(z)} = 1 - a z^{-1}$
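The preemphasis difference equation can be sketched directly as a first-order filter. The default a = 0.97 and the zero initial condition s(-1) = 0 are assumptions for this example; the slide does not fix a value for a.

```python
def preemphasize(s, a=0.97):
    """Apply y(n) = s(n) - a * s(n-1), assuming s(-1) = 0."""
    y = []
    prev = 0.0  # assumed initial condition s(-1) = 0
    for x in s:
        y.append(x - a * prev)
        prev = x
    return y

# A constant (low-frequency) input is attenuated after the first sample,
# while an alternating (high-frequency) input is boosted.
dc = preemphasize([1.0, 1.0, 1.0, 1.0], a=0.5)
alt = preemphasize([1.0, -1.0, 1.0, -1.0], a=0.5)
```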
Short-time Analysis • The speech signal is quasi-stationary • Block processing or short-time analysis • Issues: window shape and size • Methods: short-time spectrum analysis, filter bank analysis, spectrographic analysis, linear prediction analysis, cepstral analysis
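Block processing can be sketched as cutting the signal into overlapping windowed frames and computing a quantity per frame, here the short-time energy. The Hamming window, the 25 ms frame (200 samples), the 10 ms hop (80 samples), and the 8 kHz sampling rate are assumptions chosen for illustration.

```python
import math

def hamming(n_len):
    """Hamming window of length n_len."""
    if n_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_len - 1)) for n in range(n_len)]

def windowed_frames(signal, frame_len, hop):
    """Split signal into overlapping frames and apply a Hamming window to each."""
    w = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * w[n] for n in range(frame_len)])
    return frames

def short_time_energy(signal, frame_len, hop):
    """Energy of each windowed frame."""
    return [sum(v * v for v in frame) for frame in windowed_frames(signal, frame_len, hop)]

# Assumed example: 50 ms of silence followed by 50 ms of a 200 Hz tone at 8 kHz.
fs = 8000
signal = [0.0] * 400 + [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(400)]
energy = short_time_energy(signal, frame_len=200, hop=80)
```

A longer window gives finer frequency resolution but blurs rapid changes in time, which is the window-size trade-off named on the slide.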
Filter bank analysis: Nonlinear frequency scales • The human ear is frequency selective • Higher resolution at low frequencies, lower resolution at high frequencies
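One commonly used nonlinear scale is the mel scale; the slide does not name a specific scale, so its use here is an assumption. The sketch places filter centre frequencies uniformly on the mel axis using the widely quoted formula mel = 2595 log10(1 + f/700), which makes them dense at low frequencies and sparse at high frequencies.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (common 2595 * log10(1 + f/700) formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_low_hz, f_high_hz):
    """Centre frequencies (Hz) of n_filters spaced uniformly on the mel scale."""
    m_low, m_high = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + (k + 1) * step) for k in range(n_filters)]

# Assumed example: 10 filters covering 0 to 4 kHz.
centers = mel_center_frequencies(10, 0.0, 4000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

The gaps between neighbouring centres grow with frequency, mirroring the ear's coarser resolution at high frequencies.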
Spectrographic Analysis Narrowband and Wideband
Linear prediction analysis • The LP residual gives an estimate of the excitation source • Normalized LP error (residual-to-signal energy ratio) is useful in the analysis of different sounds and in voiced/nonvoiced (V/NV) detection • Peaks in the Hilbert envelope of the residual signal correspond to the glottal closure instants (GCIs)
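A minimal sketch of LP analysis, assuming the autocorrelation method solved with the Levinson-Durbin recursion (the slides do not say which method is used). It returns the prediction-error filter, the LP residual obtained by inverse filtering, and the normalized error. The function names and the synthetic second-order test signal are illustrative assumptions; the Hilbert envelope step is not covered here.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err): a[0] = 1, A(z) = sum_k a[k] z^-k, err = final prediction error."""
    if r[0] <= 0.0:
        raise ValueError("frame has zero energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def lp_residual(x, a):
    """Inverse-filter x with A(z): e(n) = sum_k a[k] * x(n - k)."""
    p = len(a) - 1
    return [sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0) for n in range(len(x))]

# Assumed test signal: impulse response of 1 / (1 - 1.3 z^-1 + 0.6 z^-2).
h = []
for n in range(400):
    v = 1.0 if n == 0 else 0.0
    if n >= 1:
        v += 1.3 * h[n - 1]
    if n >= 2:
        v -= 0.6 * h[n - 2]
    h.append(v)

r = autocorr(h, 2)
a, err = levinson_durbin(r, 2)
residual = lp_residual(h, a)
normalized_error = err / r[0]   # residual-to-signal energy ratio
```

For this all-pole signal the residual collapses to the single excitation impulse, which is the sense in which the residual estimates the excitation source.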
Spectral Envelope via LP analysis
Cepstral Analysis • The cepstrum is computed as the IDFT of the log-magnitude spectrum • Helps separate system and source information • Provides a compact representation of the spectral envelope • Can be evaluated from the short-time (DFT) spectrum or the LP spectrum
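The definition above (IDFT of the log-magnitude spectrum) can be sketched with a naive DFT, which is adequate only for short frames. The function names, the log floor of 1e-12, and the simple low-quefrency lifter are assumptions for illustration; a practical implementation would use an FFT.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) for k in range(n)]

def idft(spectrum):
    """Naive O(N^2) inverse DFT."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: IDFT of the log-magnitude spectrum (magnitude floored to avoid log 0)."""
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [c.real for c in idft(log_mag)]

def low_time_lifter(cepstrum, n_keep):
    """Keep the low-quefrency part (and its mirror image), zero the rest."""
    n = len(cepstrum)
    return [c if (i < n_keep or i > n - n_keep) else 0.0 for i, c in enumerate(cepstrum)]

# Assumed example: an impulse scaled by 2 has a flat spectrum of magnitude 2,
# so only the zeroth cepstral coefficient (log 2) is nonzero.
c = real_cepstrum([2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
kept = low_time_lifter([1, 1, 1, 1, 1, 1, 1, 1], 3)
```

Keeping only the low-quefrency coefficients retains the slowly varying spectral envelope (system), while the discarded high-quefrency part carries the fine structure due to the source.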
Thank you