Lecture-oct4-a
03 October 2010 11:20

Cepstral analysis in speech processing

From the speech production model, we have:

s[n] = (p[n] * g[n] + u[n]) * v[n] * r[n]     (* denotes convolution)

p[n] => periodic impulse train
u[n] => random white noise
g[n] => glottal filter impulse response
v[n] => vocal tract impulse response
r[n] => lip radiation system impulse response

Consider voiced speech:

s[n] = p[n] * g[n] * v[n] * r[n] = p[n] * h[n], where h[n] = g[n] * v[n] * r[n]
=> S(z) = P(z)H(z), where H(z) = G(z)V(z)R(z)

Taking the logarithm gives log S(z) = log P(z) + log H(z), so the convolved components p[n] and h[n] are additive in the complex cepstrum.

H(z) will give a complex cepstrum
• which is non-zero for both positive and negative time
• which decays rapidly for large |n|

P(z) gives a complex cepstrum consisting of decaying impulses at multiples of the pitch period.

The real cepstrum is the even part of the complex cepstrum.

[Figure: screen clipping taken 25-09-2013, 15:59]

Class A Page 1
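The properties above can be checked numerically. The following is a minimal sketch, not the method used in the lecture: the test signals (a decaying impulse train with an assumed period of 80 samples and an assumed decaying exponential standing in for h[n]) and the function names are invented for illustration. Both signals are minimum phase, so the unwrapped phase has no linear term to remove; a general-purpose complex cepstrum would need that extra step. The FFT length of 1024 only approximates the true cepstrum.

```python
import numpy as np

def complex_cepstrum(x, nfft):
    """Inverse DFT of log|X| + j*unwrapped phase (assumes no linear-phase term)."""
    spectrum = np.fft.fft(x, nfft)
    log_spectrum = np.log(np.abs(spectrum)) + 1j * np.unwrap(np.angle(spectrum))
    return np.fft.ifft(log_spectrum).real

def real_cepstrum(x, nfft):
    """Inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

def even_part(c):
    """Even part under circular indexing: 0.5 * (c[n] + c[-n mod N])."""
    reversed_c = np.roll(c[::-1], 1)
    return 0.5 * (c + reversed_c)

nfft, period = 1024, 80                 # assumed values for the demonstration
h = 0.9 ** np.arange(64)                # stand-in for h[n] = g[n] * v[n] * r[n]
p = np.zeros(4 * period + 1)
p[::period] = 0.6 ** np.arange(5)       # decaying impulse train, stand-in for p[n]
s = np.convolve(p, h)                   # s[n] = p[n] * h[n]

p_hat = complex_cepstrum(p, nfft)
h_hat = complex_cepstrum(h, nfft)
s_hat = complex_cepstrum(s, nfft)

# Convolution in time becomes addition in the complex cepstrum.
additive = np.allclose(s_hat, p_hat + h_hat)

# The real cepstrum equals the even part of the complex cepstrum.
even_matches = np.allclose(even_part(s_hat), real_cepstrum(s, nfft))

# The largest excitation peak sits at the pitch period.
first_pitch_peak = 1 + int(np.argmax(p_hat[1:nfft // 2]))

print(additive, even_matches, first_pitch_peak)
```

For these minimum-phase test signals the complex cepstrum is one-sided; a real vocal tract response with zeros outside the unit circle would also contribute at negative time, as stated in the notes.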
[Figure: screen clipping taken 25-09-2013, 16:00, from cepstrum*murphy.pdf]

Example of some real cepstra:

[Figure: screen clipping taken 03-10-2010, 11:48, from Oppenheim and Schafer, Discrete-time Signal Processing, PHI, 1989]

The example suggests that a window applied to the cepstrum can separate the two components.
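Windowing the cepstrum ("liftering") can be sketched as follows. This is an illustrative example under assumed values, not taken from the figures: the synthetic signal, the cutoff of 40 samples and the helper names are all hypothetical. A low-time window keeps the slowly varying component, and transforming back gives a smoothed log spectrum that should lie close to log|H|.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """Inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

def low_time_lifter(cepstrum, cutoff):
    """Keep quefrencies |n| < cutoff (circular indexing) and zero the rest."""
    nfft = len(cepstrum)
    window = np.zeros(nfft)
    window[:cutoff] = 1.0                # n = 0 .. cutoff-1
    window[nfft - cutoff + 1:] = 1.0     # n = -(cutoff-1) .. -1
    return cepstrum * window

nfft, period, cutoff = 1024, 80, 40      # assumed values; cutoff < pitch period
h = 0.9 ** np.arange(64)                 # stand-in for the vocal tract component
p = np.zeros(4 * period + 1)
p[::period] = 0.6 ** np.arange(5)        # stand-in for the excitation component
s = np.convolve(p, h)

c_s = real_cepstrum(s, nfft)

# Low-time part -> smoothed log magnitude spectrum (real because the
# liftered cepstrum is even).
log_envelope = np.fft.fft(low_time_lifter(c_s, cutoff)).real

# Compare with the log magnitude of the known h[n].
log_h = np.log(np.abs(np.fft.fft(h, nfft)))
max_error = np.max(np.abs(log_envelope - log_h))

print(max_error)
```

The cutoff has to sit below the first excitation peak (the pitch period) and above the region where the slowly varying component is still significant, so in practice it depends on the speaker's pitch range.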
Lecture-oct4-c
03 October 2010 12:43

Speech parameter estimation

Short-time analysis needed for:
• Formant estimation
• Pitch and voicing detection

The low-quefrency part of the cepstrum corresponds primarily to the vocal tract, glottal shaping and radiation. The high-quefrency part is due primarily to the excitation.

[Figure: part of "chase"; y-axis: increasing time (from O&S, DT signal processing, PHI, 1989)]
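A short-time pitch estimate based on the high-quefrency part might look like the sketch below. Everything specific here is an assumption for illustration: the 8 kHz sampling rate, the 400-sample Hamming-windowed frame, the 70 to 400 Hz search range, the synthetic "voiced" signal and the function name cepstral_pitch. A voicing decision would additionally compare the returned peak height against a threshold that has to be tuned on real data, which is not attempted here.

```python
import numpy as np

def cepstral_pitch(frame, fs, nfft=1024, f_min=70.0, f_max=400.0):
    """Return (pitch estimate in Hz, cepstral peak height) for one frame."""
    windowed = frame * np.hamming(len(frame))
    # Small constant guards against log(0) in deep spectral dips.
    log_mag = np.log(np.abs(np.fft.fft(windowed, nfft)) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real
    lo = int(fs / f_max)                 # shortest pitch period searched
    hi = int(fs / f_min)                 # longest pitch period searched
    lag = lo + int(np.argmax(cepstrum[lo:hi + 1]))
    return fs / lag, cepstrum[lag]

# Synthetic voiced signal: impulse train (100 Hz at fs = 8 kHz) through a
# decaying exponential standing in for the vocal tract response.
fs, period = 8000, 80
excitation = np.zeros(1600)
excitation[::period] = 1.0
h = 0.9 ** np.arange(64)
speech = np.convolve(excitation, h)[:1600]

frame = speech[360:760]                  # one 50 ms analysis frame
f0, peak = cepstral_pitch(frame, fs)

print(f0, peak)
```

Repeating this over successive overlapping frames gives a pitch track; the low-quefrency part of the same cepstrum can be used for the spectral envelope and hence formant estimation.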