Lecture 8: Pattern Classification (N. Morgan / B. Gold)

Lecture on Pattern Recognition, EE 225D. University of California, Berkeley, College of Engineering, Department of Electrical Engineering and Computer Sciences. Professors: N. Morgan / B. Gold. EE225D, Spring 1999. Pattern Classification, Lecture 8.


  1. Pattern Classification (Lecture 8, EE 225D, Spring 1999)

  2. Speech Pattern Recognition
     •Soft pattern classification plus temporal sequence integration
     •Supervised pattern classification: class labels used in training
     •Unsupervised pattern classification: class labels not available or not used

  3. [Block diagram] Pattern → Feature Extraction → Feature Vector $(x_1, x_2, \ldots, x_d)$ → Classification → $\omega_k$, $1 \le k \le K$

  4. Training and testing
     •Training: learning parameters of classifier
     •Testing: classify independent test set, compare with labels and score
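The training and testing steps above can be sketched as follows. This is a minimal illustration, not the lecture's own procedure: the classifier (nearest class mean on a single feature), the function names, and the data values are all assumptions chosen to keep the example short.

```python
def train_class_means(samples, labels):
    """Training: learn one parameter per class (the mean of its samples)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(x, means):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda y: abs(x - means[y]))

def accuracy(samples, labels, means):
    """Testing: classify each sample, compare with its label, and score."""
    correct = sum(classify(x, means) == y for x, y in zip(samples, labels))
    return correct / len(labels)

# Hypothetical one-dimensional feature values (illustrative only).
train_x = [1.0, 1.2, 0.8, 3.0, 3.3, 2.9]
train_y = ["low", "low", "low", "high", "high", "high"]
# Independent test set, never seen during training.
test_x = [0.9, 1.4, 2.2, 3.1]
test_y = ["low", "low", "low", "high"]

means = train_class_means(train_x, train_y)
print("training accuracy:", accuracy(train_x, train_y, means))
print("test accuracy:", accuracy(test_x, test_y, means))
```

Scoring on the held-out set rather than the training set is the point of the sketch: here one test sample falls on the wrong side of the learned boundary, so the test score is lower than the training score.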



  7. Feature Extraction Criteria
     •Class discrimination
     •Generalization
     •Parsimony (efficiency)

  8. [Figure: two plots of energy $E(t)$ versus time $t$] Plosive + vowel energies for 2 different gains

  9. $\frac{\partial}{\partial t} \log\big(C\,E(t)\big) = \frac{\partial}{\partial t} \log C + \frac{\partial}{\partial t} \log E(t) = \frac{\partial}{\partial t} \log E(t)$
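The identity above says that the time derivative of log energy does not depend on a constant gain C. A small numerical check, assuming a made-up sampled energy contour and a finite-difference approximation of the derivative (both are illustrative choices, not taken from the slides):

```python
import math

def log_energy_slope(energy, dt=1.0):
    """Finite-difference estimate of d/dt log E(t) for a sampled contour."""
    log_e = [math.log(e) for e in energy]
    return [(b - a) / dt for a, b in zip(log_e, log_e[1:])]

# Hypothetical energy contour in arbitrary units.
energy = [0.01, 0.02, 0.50, 0.30, 0.80, 1.00, 0.90]
gain = 4.0  # assumed constant gain C
scaled = [gain * e for e in energy]

slope_original = log_energy_slope(energy)
slope_scaled = log_energy_slope(scaled)

# The largest difference between the two slope sequences is rounding error only.
max_diff = max(abs(a - b) for a, b in zip(slope_original, slope_scaled))
print("slopes agree:", max_diff < 1e-9)
```

The raw energies differ by a factor of 4 at every sample, while the log-energy slopes coincide, which is why such a feature is attractive when recording level varies.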

  10. Feature Vector Size
      •Best representations for discrimination on the training set are large (highly dimensioned)
      •Best representations for generalization to the test set are (typically) succinct

  11. Dimensionality Reduction
      •Principal components (i.e., SVD, KL transform, eigenanalysis ...)
      •Linear Discriminant Analysis (LDA)
      •Application-specific knowledge
      •Feature Selection via PR Evaluation
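As an illustration of the first item, principal components, here is a minimal sketch that reduces two-dimensional points to one dimension. It assumes the 2-D special case, where the direction of largest variance has a closed form in terms of the covariance entries; the function names and data are hypothetical, and a general implementation would use an eigendecomposition or SVD instead.

```python
import math

def principal_axis_2d(points):
    """Return (mean, unit direction of largest variance) for 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Variance along angle theta is maximized when tan(2*theta) = 2*cxy / (cxx - cyy).
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def project(points, center, direction):
    """Project centered points onto the direction: one number per point."""
    return [(p[0] - center[0]) * direction[0] + (p[1] - center[1]) * direction[1]
            for p in points]

# Hypothetical points lying close to the line y = x.
points = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]]
center, direction = principal_axis_2d(points)
reduced = project(points, center, direction)
print("direction:", direction)
print("1-D representation:", reduced)
```

Note that this criterion looks only at variance and ignores class labels, which is the limitation for supervised problems that LDA is meant to address.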

  12. [Figure: scatter plot of two classes, marked x and o, in the feature plane with axes $f_1$ and $f_2$]


  14. PR Methods
      •Minimum Distance
      •Discriminant Functions
      •Linear Discriminant
      •Nonlinear Discriminant (e.g., quadratic, neural networks)
      •Statistical Discriminant Functions

  15. Minimum Distance
      •Vector or matrix representing element
      •Define a distance function
      •Choose the class of stored element closest to new input
      •Choice of distance equivalent to implicit statistical assumptions
      •For speech, temporal variability complicates this

  16. $z_i$ = template vector (prototype), $x$ = input vector. Choose $i$ to minimize distance:
      $\arg\min_i \, (x - z_i)^T (x - z_i) = \arg\min_i \left( x^T x + z_i^T z_i - 2 x^T z_i \right)$
      $= \arg\max_i \left[ -\left( z_i^T z_i - 2 x^T z_i \right) \right] = \arg\max_i \left( x^T z_i - \tfrac{1}{2} z_i^T z_i \right)$
      If $z_i^T z_i = 1$ for all $i$ $\Rightarrow$ $\arg\max_i \, x^T z_i$
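The derivation above can be checked numerically: picking the template with the smallest squared Euclidean distance should give the same index as picking the largest value of x^T z_i - (1/2) z_i^T z_i. The sketch below assumes small made-up templates and inputs; the function names are hypothetical.

```python
def dot(a, b):
    """Inner product a^T b of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def nearest_template(x, templates):
    """Index i minimizing the squared distance (x - z_i)^T (x - z_i)."""
    def sq_dist(z):
        diff = [xi - zi for xi, zi in zip(x, z)]
        return dot(diff, diff)
    return min(range(len(templates)), key=lambda i: sq_dist(templates[i]))

def best_discriminant(x, templates):
    """Index i maximizing x^T z_i - 0.5 * z_i^T z_i."""
    def score(z):
        return dot(x, z) - 0.5 * dot(z, z)
    return max(range(len(templates)), key=lambda i: score(templates[i]))

# Hypothetical templates (one per class) and input vectors.
templates = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
inputs = [[0.9, 0.2], [0.1, 1.5], [2.5, 2.0], [1.0, 1.0]]

for x in inputs:
    print(x, nearest_template(x, templates), best_discriminant(x, templates))
```

The x^T x term is dropped because it is the same for every template, so the second form needs only one inner product per template plus a precomputed bias.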

  17. Problems with Min Distance
      •Proper scaling of dimensions (size, discrimination)
      •High-dimensional spaces are sparsely sampled

  18. Decision Rule for Min Distance
      •Nearest Neighbor (NN): in the limit of infinite samples, at most twice the error of the optimum classifier
      •k-Nearest Neighbor (kNN)
      •Lots of storage for large problems; potentially large searches
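The nearest-neighbor rules above can be sketched as follows. This is a brute-force version under stated assumptions: squared Euclidean distance, majority vote among the k closest stored samples, and hypothetical data and names. Setting k=1 gives the plain NN rule.

```python
from collections import Counter

def knn_classify(x, samples, labels, k=3):
    """Vote among the k stored samples closest to x."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    # Every stored sample is examined: this is the storage and search cost.
    order = sorted(range(len(samples)), key=lambda i: sq_dist(x, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled samples forming two clusters.
samples = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify([0.5, 0.5], samples, labels, k=3))
print(knn_classify([5.5, 5.5], samples, labels, k=3))
print(knn_classify([4.0, 4.0], samples, labels, k=1))
```

The full sort over all stored samples makes the cost of each query grow with the size of the training set, which is the practical concern raised in the last bullet.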

  19. Some Opinions
      •Better to throw away bad data than to reduce its weight
      •Dimensionality reduction based on variance is often a bad choice for supervised pattern recognition

  20. Discriminant Analysis
      •Discriminant functions max for correct class, min for others
      •Decision surface between classes
      •Linear decision surface for 2 dimensions is a line, for 3 is a plane; generally called a hyperplane
      •For 2 classes, surface at $w^T x + w_0 = 0$
      •2-class quadratic case, surface at $x^T W x + w^T x + w_0 = 0$
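To make the two decision surfaces concrete, the sketch below evaluates a linear and a quadratic two-class discriminant and uses the sign of the result to pick a side. The weights are arbitrary illustrative values (not trained and not from the lecture), and the function names are hypothetical.

```python
def linear_discriminant(x, w, w0):
    """g(x) = w^T x + w0; the decision surface is g(x) = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def quadratic_discriminant(x, W, w, w0):
    """g(x) = x^T W x + w^T x + w0; the decision surface is g(x) = 0."""
    quad = sum(x[r] * W[r][c] * x[c]
               for r in range(len(x)) for c in range(len(x)))
    return quad + linear_discriminant(x, w, w0)

# Assumed linear surface: the line x1 + x2 = 1.
w_lin, w0_lin = [1.0, 1.0], -1.0
# Assumed quadratic surface: the unit circle x1^2 + x2^2 = 1.
W_quad, w_quad, w0_quad = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], -1.0

for x in ([0.0, 0.0], [1.0, 1.0], [2.0, 0.0]):
    g_lin = linear_discriminant(x, w_lin, w0_lin)
    g_quad = quadratic_discriminant(x, W_quad, w_quad, w0_quad)
    print(x, "linear side:", g_lin > 0, "quadratic side:", g_quad > 0)
```

With two features the linear surface is a line and the quadratic surface here is a circle; a point exactly on either surface gives g(x) = 0.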


  22. Training Discriminant Functions
      •Minimum distance
      •Fisher linear discriminant
      •Gradient learning
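Of the three training approaches listed, the Fisher linear discriminant has a compact closed form, sketched here for two classes in two dimensions. The sketch uses the standard textbook form w = S_W^{-1}(m_a - m_b), which the slides do not spell out, together with an assumed threshold at the midpoint of the projected class means. The data and function names are hypothetical.

```python
def mean_2d(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter_2d(points, m):
    """Sum of outer products of the deviations from the class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for r in range(2):
            for c in range(2):
                s[r][c] += d[r] * d[c]
    return s

def fisher_train(class_a, class_b):
    """Return (w, threshold) with w = S_W^{-1} (m_a - m_b)."""
    ma, mb = mean_2d(class_a), mean_2d(class_b)
    sa, sb = scatter_2d(class_a, ma), scatter_2d(class_b, mb)
    sw = [[sa[r][c] + sb[r][c] for c in range(2)] for r in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]  # assumed nonzero
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Apply the 2x2 inverse of S_W to the mean difference.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (sw[0][0] * diff[1] - sw[1][0] * diff[0]) / det]
    # Assumed threshold: midpoint of the two projected class means.
    threshold = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, threshold

def fisher_classify(x, w, threshold):
    return "a" if w[0] * x[0] + w[1] * x[1] > threshold else "b"

# Hypothetical training data for two classes.
class_a = [[1, 2], [2, 3], [3, 3], [2, 1]]
class_b = [[6, 5], [7, 7], [8, 6], [7, 5]]
w, threshold = fisher_train(class_a, class_b)
print("w:", w, "threshold:", threshold)
print(fisher_classify([2, 2], w, threshold), fisher_classify([7, 6], w, threshold))
```

Unlike gradient learning, this needs no iteration, but it does require the within-class scatter matrix to be invertible.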
