Speech Signal Representations Part 1: Digital Signal Processing

  1. Speech Signal Representations Part 1: Digital Signal Processing
     Hsin-min Wang
     References:
     [1] X. Huang et al., Spoken Language Processing, Chapters 5-6
     [2] J. R. Deller et al., Discrete-Time Processing of Speech Signals, Chapters 4-6
     [3] J. W. Picone, "Signal modeling techniques in speech recognition," Proceedings of the IEEE, September 1993, pp. 1215-1247

  2. Introduction
     Current speech recognition systems are mainly composed of:
     - A front-end feature extractor (feature extraction module)
       • Discovers salient characteristics suited for classification
       • Based on scientific and/or heuristic knowledge about the patterns to recognize
     - A back-end classifier (classification module)
       • Sets class boundaries accurately in the feature space
       • Statistically designed according to Bayes' decision theory

  3. Analog Signal to Digital Signal
     Analog signal $x_a(t)$ → discrete-time signal (or digital signal) $x[n]$
     - Digital signal: a discrete-time signal with discrete amplitude
     - $x[n] = x_a(nT)$, i.e. $x_a(t)$ sampled at $t = nT$, where $T$ is the sampling period
     - $F_s = 1/T$ is the sampling rate
     - Example: sampling period $T = 125\ \mu\text{s}$ ⇒ sampling rate $F_s = 8$ kHz
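
The relation $x[n] = x_a(nT)$ can be sketched in a few lines of Python. This is only an illustration: the 1 kHz cosine used as the analog signal and the helper name `sample` are assumptions, not part of the slides. Only the 125 μs sampling period comes from the slide.

```python
import math

def sample(x_a, T, num_samples):
    """Return x[n] = x_a(n*T) for n = 0 .. num_samples-1."""
    return [x_a(n * T) for n in range(num_samples)]

T = 125e-6        # sampling period from the slide: 125 microseconds
Fs = 1.0 / T      # sampling rate in Hz (8 kHz)

# Hypothetical analog signal (an assumption): a 1 kHz cosine.
def x_a(t):
    return math.cos(2 * math.pi * 1000.0 * t)

x = sample(x_a, T, 8)   # eight samples = one full cycle of the 1 kHz tone
```

At 8 kHz a 1 kHz tone spans exactly eight samples per cycle, so `x[4]` lands on the negative peak of the cosine.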

  4. Two Main Approaches to Digital Signal Processing
     - Filtering: signal in → filter → signal out, $x[n] \to y[n]$
       • Amplify or attenuate some frequency components of $x[n]$
     - Parameter extraction: signal in → parameter extraction → parameters out
       • $x[n] \to$ a sequence of parameter vectors $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_L$, where $\mathbf{c}_l = [c_{l1}, c_{l2}, \ldots, c_{lm}]^T$
       • e.g.: 1. Spectrum estimation 2. Parameters for recognition

  5. 5.1 Digital Signals and Systems

  6. Sinusoidal Signals
     $x[n] = A\cos(\omega n + \phi)$
     - $A$: amplitude
     - $\omega$: angular frequency, $\omega = 2\pi f$
     - $f$: normalized frequency, $0 \le f \le 1$
     - $\phi$: phase
     Example: $x[n] = A\cos(\omega n - \pi/2)$ with period $T = 25$ samples, so the frequency is $f = 1/25 = 0.04$

  7. Sinusoidal Signals – periodic vs. non-periodic
     $x[n]$ is periodic with period $N$ if and only if $x[n] = x[n+N]$:
     $A\cos(\omega(n+N) + \phi) = A\cos(\omega n + \phi)$
     $\Rightarrow \omega N = 2\pi k$ for some integer $k$, i.e. $\omega = 2\pi k / N$
     $A\cos(\omega n + \phi)$ is therefore not periodic for every value of $\omega$: a period exists only when $\omega / 2\pi$ is a rational number.
     Examples:
     - $x_1[n] = \cos(\pi n / 4)$ is periodic with period $N = 8$
     - $x_2[n] = \cos(3\pi n / 8)$ is periodic with period $N = 16$
     - $x_3[n] = \cos(n)$ is not periodic

  8. Sinusoidal Signals – periodic vs. non-periodic (cont.)
     - $x_1[n] = \cos(\pi n / 4)$
       $\cos\left(\frac{\pi}{4}(n + N_1)\right) = \cos\left(\frac{\pi}{4} n + \frac{\pi}{4} N_1\right)$
       $\Rightarrow \frac{\pi}{4} N_1 = 2\pi \cdot k \Rightarrow N_1 = 8k$ (both $N_1$ and $k$ are integers)
       $\therefore$ period $N_1 = 8$
     - $x_2[n] = \cos(3\pi n / 8)$
       $\cos\left(\frac{3\pi}{8}(n + N_2)\right) = \cos\left(\frac{3\pi}{8} n + \frac{3\pi}{8} N_2\right)$
       $\Rightarrow \frac{3\pi}{8} N_2 = 2\pi \cdot k \Rightarrow N_2 = \frac{16}{3} k$ (both $N_2$ and $k$ are integers)
       $\therefore$ period $N_2 = 16$ (with $k = 3$)
     - $x_3[n] = \cos(n)$
       $\cos(n + N_3) = \cos(n) \Rightarrow N_3 = 2\pi \cdot k$
       No nonzero integers $N_3$ and $k$ satisfy this equation, because $\pi$ is irrational
       $\Rightarrow$ non-periodic
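
The periodicity test above reduces to a statement about the normalized frequency $f = \omega / 2\pi$: if $f = p/q$ in lowest terms, the smallest period is $N = q$. A minimal sketch, assuming $f$ is supplied as an exact rational (the function name `period_of_cosine` is hypothetical):

```python
from fractions import Fraction

def period_of_cosine(f):
    """Smallest N > 0 such that f*N is an integer.

    f is the normalized frequency omega / (2*pi), given as an exact
    rational. Fraction reduces to lowest terms, so the period is simply
    the denominator.
    """
    return Fraction(f).denominator

n1 = period_of_cosine(Fraction(1, 8))    # cos(pi*n/4):   f = 1/8
n2 = period_of_cosine(Fraction(3, 16))   # cos(3*pi*n/8): f = 3/16
```

For $\cos(n)$ the normalized frequency is $1/(2\pi)$, which is irrational and has no exact `Fraction` representation, matching the conclusion that no period exists.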

  9. Sinusoidal Signals – complex exponential expression
     - A complex number $z$ can be expressed in Cartesian form: $z = x + jy$, with $j = \sqrt{-1}$
     - Euler's relation: $e^{j\phi} = \cos\phi + j\sin\phi$
     - The complex number can also be expressed in polar form: $z = A e^{j\phi}$, where $A$ is the amplitude and $\phi$ is the phase, so that $x = A\cos(\phi)$ and $y = A\sin(\phi)$
     - A sinusoidal signal can be expressed as the real part of the corresponding complex exponential:
       $x[n] = A\cos(\omega n + \phi) = \mathrm{Re}\{A e^{j(\omega n + \phi)}\}$
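
The identity $A\cos(\omega n + \phi) = \mathrm{Re}\{A e^{j(\omega n + \phi)}\}$ can be checked numerically with Python's `cmath` module. The function names and the parameter values below are illustrative assumptions.

```python
import cmath
import math

def sinusoid(A, omega, phi, n):
    """Direct evaluation of A*cos(omega*n + phi)."""
    return A * math.cos(omega * n + phi)

def sinusoid_via_exponential(A, omega, phi, n):
    """Real part of the complex exponential A*e^{j(omega*n + phi)}."""
    return (A * cmath.exp(1j * (omega * n + phi))).real

# Polar form A*e^{j*phi} converted to Cartesian form x + j*y.
z = cmath.rect(2.0, math.pi / 3)   # x = 2*cos(pi/3), y = 2*sin(pi/3)
```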

  10. Sinusoidal Signals – sum of two signals
      The sum of two complex exponential signals with the same frequency:
      $A_0 e^{j(\omega n + \phi_0)} + A_1 e^{j(\omega n + \phi_1)}$
      $= e^{j\omega n}\left(A_0 e^{j\phi_0} + A_1 e^{j\phi_1}\right)$
      $= e^{j\omega n} A e^{j\phi}$
      $= A e^{j(\omega n + \phi)}$, where $A$, $A_0$ and $A_1$ are real numbers
      - Taking the real part:
        $A_0\cos(\omega n + \phi_0) + A_1\cos(\omega n + \phi_1) = A\cos(\omega n + \phi)$
      The sum of N sinusoids of the same frequency is another sinusoid of the same frequency.
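
The derivation says that adding same-frequency sinusoids amounts to adding their phasors $A_i e^{j\phi_i}$. A small sketch under that reading (the function name `add_sinusoids` and the test values are assumptions):

```python
import cmath
import math

def add_sinusoids(A0, phi0, A1, phi1):
    """Return (A, phi) with A0*cos(wn+phi0) + A1*cos(wn+phi1) = A*cos(wn+phi).

    The two phasors A0*e^{j*phi0} and A1*e^{j*phi1} are added as complex
    numbers; the magnitude and angle of the sum give A and phi.
    """
    z = cmath.rect(A0, phi0) + cmath.rect(A1, phi1)
    return abs(z), cmath.phase(z)

# cos(wn) + cos(wn + pi/2) collapses to a single sinusoid.
A, phi = add_sinusoids(1.0, 0.0, 1.0, math.pi / 2)
```

The result does not depend on $\omega$, since the common factor $e^{j\omega n}$ is pulled out before the addition.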

  11. Some Digital Signals

  12. Some Digital Signals – (cont.)
      Any sequence $x[n]$ can be represented as a sum of shifted and scaled unit impulse sequences (signals):
      $x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n-k]$
      where $x[k]$ is the scale (weight) and $\delta[n-k]$ is the time-shifted unit impulse sequence.
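
A direct transcription of this sum for a finite sequence, as a sanity check rather than a practical routine. The sequence is assumed to be zero outside $k = 0, \ldots, \text{len}(x)-1$, and the names `delta` and `rebuild` are hypothetical.

```python
def delta(n):
    """Unit impulse: 1 at n = 0, 0 elsewhere."""
    return 1 if n == 0 else 0

def rebuild(x, n):
    """Evaluate sum_k x[k]*delta[n-k] for a finite x defined on k = 0..len(x)-1."""
    return sum(x[k] * delta(n - k) for k in range(len(x)))

x = [3, 2, 1]
reconstructed = [rebuild(x, n) for n in range(len(x))]
```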

  13. Digital Systems
      A digital system $T$ is a system that, given an input signal $x[n]$, generates an output signal $y[n]$:
      $y[n] = T\{x[n]\}$
      Properties of digital systems:
      - Linear: $T\{a x_1[n] + b x_2[n]\} = a T\{x_1[n]\} + b T\{x_2[n]\}$
        • A linear combination of inputs maps to the same linear combination of outputs
      - Time-invariant: $y[n - n_0] = T\{x[n - n_0]\}$
        • A time shift in the input by $n_0$ samples gives a shift in the output by $n_0$ samples
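
Both properties can be probed numerically on finite sequences. The three systems below (a first difference, a squarer, and a ramp gain) are hypothetical examples chosen for illustration, and sequences are assumed to start at $n = 0$ and be zero elsewhere. A check on particular inputs can refute a property but cannot prove it in general.

```python
def first_difference(x):
    """y[n] = x[n] - x[n-1], with zeros assumed outside x (one extra output sample)."""
    padded = [0] + list(x) + [0]
    return [padded[i] - padded[i - 1] for i in range(1, len(padded))]

def square(x):
    """y[n] = x[n]^2: time-invariant but not linear."""
    return [v * v for v in x]

def ramp_gain(x):
    """y[n] = n * x[n]: linear but not time-invariant."""
    return [n * v for n, v in enumerate(x)]

def is_linear_on(T, x1, x2, a, b):
    """Compare T{a*x1 + b*x2} with a*T{x1} + b*T{x2} for equal-length inputs."""
    lhs = T([a * u + b * v for u, v in zip(x1, x2)])
    rhs = [a * p + b * q for p, q in zip(T(x1), T(x2))]
    return lhs == rhs

def is_time_invariant_on(T, x, n0):
    """Compare the response to a delayed input with the delayed response."""
    return T([0] * n0 + list(x)) == [0] * n0 + T(x)
```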

  14. LTI Systems
      Linear time-invariant (LTI): the system output can be expressed as a convolution of the input $x[n]$ and the impulse response $h[n]$.
      - Impulse response: unit impulse $\delta[n]$ → digital system → $h[n]$, i.e. $\delta[n] \xrightarrow{T} h[n]$
      - Time invariance: $\delta[n-k] \xrightarrow{T} h[n-k]$
      Derivation:
      $x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n-k]$ (scaled, time-shifted unit impulse sequences)
      $\Rightarrow T\{x[n]\} = T\left\{\sum_{k=-\infty}^{\infty} x[k]\,\delta[n-k]\right\}$
      $= \sum_{k=-\infty}^{\infty} x[k]\,T\{\delta[n-k]\}$ (linear)
      $= \sum_{k=-\infty}^{\infty} x[k]\,h[n-k]$ (time-invariant)
      $= x[n] * h[n]$ (convolution)

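  15. LTI Systems (cont.)
      [Figure: the output of an LTI system built as a sum of scaled, shifted impulse responses. Values as read from the flattened figure.]
      - Impulse response: $\delta[n] \to h[n] = \{1, 3, -2\}$ for $n = 0, 1, 2$ (length $M = 3$)
      - Input: $x[n] = \{3, 2, 1\}$ for $n = 0, 1, 2$ (length $L = 3$); output length $= L + M - 1 = 5$
      - $3\cdot\delta[n] \to 3\cdot h[n] = \{3, 9, -6\}$ at $n = 0, 1, 2$
      - $2\cdot\delta[n-1] \to 2\cdot h[n-1] = \{2, 6, -4\}$ at $n = 1, 2, 3$
      - $1\cdot\delta[n-2] \to 1\cdot h[n-2] = \{1, 3, -2\}$ at $n = 2, 3, 4$
      - Sum up: $y[n] = \{3, 11, 1, -1, -2\}$ for $n = 0, \ldots, 4$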

  16. LTI Systems - convolution
      - Reflect $h[k]$ about the origin ($\to h[-k]$)
      - Slide ($h[-k] \to h[-k+n]$, or $h[-(k-n)]$) and multiply with $x[k]$
      - Sum up
      [Figure: block diagram of $x[k]$ and $h[k]$ passing through reflect, slide, multiply and sum up]

  17. [Figure: convolution by reflect, slide, multiply and sum up. Values as read from the flattened figure.]
      $y[n] = x[n] * h[n] = \sum_{k=-\infty}^{\infty} x[k]\,h[n-k]$
      with $h[k] = \{1, 3, -2\}$ and $x[k] = \{3, 2, 1\}$ for $k = 0, 1, 2$
      - Reflect: $h[-k]$
      - $n = 0$: $h[-k]$, $y[0] = 3$
      - $n = 1$: $h[-k+1]$, $y[1] = 11$
      - $n = 2$: $h[-k+2]$, $y[2] = 1$
      - $n = 3$: $h[-k+3]$, $y[3] = -1$
      - $n = 4$: $h[-k+4]$, $y[4] = -2$
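
The reflect, slide, multiply and sum procedure can be written as a direct double loop. The sequences `x = [3, 2, 1]` and `h = [1, 3, -2]` are my reading of the flattened figure values and should be treated as an assumption, as should the function name `convolve`.

```python
def convolve(x, h):
    """y[n] = sum_k x[k] * h[n-k] for finite sequences that start at n = 0.

    Each input sample x[k] contributes a copy of h scaled by x[k] and
    shifted by k samples; the contributions are summed into y.
    """
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y

x = [3, 2, 1]      # input, length L = 3 (assumed from the figure)
h = [1, 3, -2]     # impulse response, length M = 3 (assumed from the figure)
y = convolve(x, h)  # length L + M - 1 = 5
```

With these values the result is `[3, 11, 1, -1, -2]`: for instance, `y[1] = x[0]*h[1] + x[1]*h[0] = 9 + 2 = 11`.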

  18. LTI Systems – convolution (cont.)
      Convolution is commutative and distributive:
      - Commutation: $x[n] * h_1[n] * h_2[n] = x[n] * h_2[n] * h_1[n]$
        • Cascading $h_1[n]$ then $h_2[n]$ is equivalent to cascading $h_2[n]$ then $h_1[n]$
      - Distribution: $x[n] * (h_1[n] + h_2[n]) = x[n] * h_1[n] + x[n] * h_2[n]$
        • Connecting $h_1[n]$ and $h_2[n]$ in parallel is equivalent to the single system $h_1[n] + h_2[n]$
      - $y[n] = x[n] * h[n] = h[n] * x[n]$, i.e. $\sum_{k=-\infty}^{\infty} x[k]\,h[n-k] = \sum_{k=-\infty}^{\infty} h[k]\,x[n-k]$
      Types of impulse response:
      - An impulse response with finite duration: Finite-Impulse Response (FIR)
      - An impulse response with infinite duration: Infinite-Impulse Response (IIR)
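
Both properties are easy to confirm on small integer sequences, where the arithmetic is exact. The sequences and helper names below are illustrative assumptions.

```python
def convolve(x, h):
    """Finite convolution y[n] = sum_k x[k] * h[n-k], sequences starting at n = 0."""
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y

def add(a, b):
    """Sample-wise sum of two sequences, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

x = [3, 2, 1]
h1 = [1, 3, -2]
h2 = [1, -1]

cascade_12 = convolve(convolve(x, h1), h2)                 # h1 first, then h2
cascade_21 = convolve(convolve(x, h2), h1)                 # h2 first, then h1
parallel = add(convolve(x, h1), convolve(x, h2))           # two branches summed
combined = convolve(x, add(h1, h2))                        # single system h1 + h2
```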

  19. 5.2 Continuous-Frequency Transforms

  20. Discrete-Time Fourier Transform (DTFT)
      Input $x[n] = e^{j\omega_0 n}$ → system $h[n]$ → $y[n] = ?$
      $y[n] = \sum_{k=-\infty}^{\infty} h[k]\,e^{j\omega_0 (n-k)} = e^{j\omega_0 n} \sum_{k=-\infty}^{\infty} h[k]\,e^{-j\omega_0 k} = e^{j\omega_0 n}\,H(e^{j\omega_0})$
      When the input is a complex exponential, the output is another complex exponential of the same frequency, with its amplitude multiplied by the complex quantity $H(e^{j\omega_0})$.
      $H(e^{j\omega}) = \sum_{n=-\infty}^{\infty} h[n]\,e^{-j\omega n}$
      is the discrete-time Fourier transform of $h[n]$.
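
For a finite impulse response the DTFT sum has finitely many terms, so the eigenfunction property can be checked directly. The impulse response `h`, the frequency `omega0`, and the function names are assumptions made for this sketch.

```python
import cmath

def dtft(h, omega):
    """H(e^{j*omega}) = sum_n h[n] * e^{-j*omega*n} for a finite h starting at n = 0."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))

def response_to_exponential(h, omega0, n):
    """y[n] = sum_k h[k] * e^{j*omega0*(n-k)}: the convolution of h with e^{j*omega0*n}."""
    return sum(hk * cmath.exp(1j * omega0 * (n - k)) for k, hk in enumerate(h))

h = [1, 3, -2]     # hypothetical FIR impulse response
omega0 = 0.7       # hypothetical input frequency in radians per sample
H0 = dtft(h, omega0)
```

For every `n`, `response_to_exponential(h, omega0, n)` should equal `H0 * cmath.exp(1j * omega0 * n)` up to rounding error, which is exactly the statement on the slide.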

  21. Discrete-Time Fourier Transform (cont.)
      The discrete-time Fourier transform of $h[n]$, $H(e^{j\omega})$, is a periodic function of $\omega$ with period $2\pi$.
      - One period can fully describe it, typically $-\pi < \omega < \pi$
      - $H(e^{j\omega})$ is a complex function of $\omega$; it can be expressed as:
        • Cartesian form: $H(e^{j\omega}) = H_r(e^{j\omega}) + j H_i(e^{j\omega})$ (real part and imaginary part)
        • Polar form: $H(e^{j\omega}) = |H(e^{j\omega})|\,e^{j\angle H(e^{j\omega})}$ (magnitude and phase)
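
The $2\pi$ periodicity and the two representations can be illustrated on a finite impulse response. The sequence `h` and the evaluation frequency are assumptions chosen for the example.

```python
import cmath
import math

def dtft(h, omega):
    """H(e^{j*omega}) = sum_n h[n] * e^{-j*omega*n} for a finite h starting at n = 0."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))

h = [1, 3, -2]                          # hypothetical impulse response
H = dtft(h, 0.7)
H_shifted = dtft(h, 0.7 + 2 * math.pi)  # same value one period later

# Cartesian form: real and imaginary parts.
H_r, H_i = H.real, H.imag
# Polar form: magnitude and phase.
magnitude, phase = abs(H), cmath.phase(H)
```

The periodicity follows from $e^{-j(\omega + 2\pi)n} = e^{-j\omega n}$ for integer $n$, so every term of the sum is unchanged.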
