  1. Department of Computer Science, CSCI 5622: Machine Learning. Chenhao Tan. Lecture 16: Dimensionality Reduction. Slides adapted from Jordan Boyd-Graber, Chris Ketelsen

  2. Midterm: A. Review session B. Flipped classroom C. Go over the example midterm D. Clustering!

  3. Learning objectives • Understand what unsupervised learning is for • Learn principal component analysis • Learn singular value decomposition

  4. Supervised learning: Data: X; Labels: Y. Unsupervised learning: Data: X.

  5. Supervised learning: Data: X; Labels: Y. Unsupervised learning: Data: X; Latent structure: Z.

  6. When do we need unsupervised learning?

  7. When do we need unsupervised learning? • Acquiring labels is expensive • You may not even know what labels to acquire

  8. When do we need unsupervised learning? • Exploratory data analysis • Learn patterns/representations that can be useful for supervised learning (representation learning) • Generate data • …

  9. When do we need unsupervised learning? https://qz.com/1090267/artificial-intelligence-can-now-show-you-how-those-pants-will-fit/

  10.–11. Unsupervised learning • Dimensionality reduction • Clustering • Topic modeling

  12. Principal Component Analysis - Motivation

  13. Principal Component Analysis - Motivation: Data's features are almost certainly correlated.

  14. Principal Component Analysis - Motivation: This makes it hard to see hidden structure.

  15. Principal Component Analysis - Motivation: To make this easier, let's try to reduce the data to one dimension.

  16. Principal Component Analysis - Motivation: We need to shift our perspective: change the definition of up-down-left-right; choose new features as linear combinations of old features; a change of feature basis.

  17. Principal Component Analysis - Motivation: We need to shift our perspective: change the definition of up-down-left-right; choose new features as linear combinations of old features; a change of feature basis. Important: center and normalize the data before performing PCA. We will assume that this has already been done in this lecture.
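
Since the slides only state that requirement, here is a minimal preprocessing sketch (my own illustrative code, not from the lecture): center each feature to zero mean and scale it to unit variance before running PCA.

```python
import numpy as np

def center_and_normalize(X):
    """Standardize the columns of X: zero mean and (roughly) unit variance.

    X is an (n_samples, n_features) array; the returned mean and std let you
    apply the same transform to held-out data later.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features untouched instead of dividing by zero
    return (X - mean) / std, mean, std
```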

  18. Principal Component Analysis - Motivation: Proceed incrementally: • What if we could choose only one combination to describe the data? • Which combination leads to the least loss of information? • Once we've found that one, look for another one, perpendicular to the first, that retains the next most information. • Repeat until done (or good enough)

  19.–31. Principal Component Analysis - Motivation (figure-only slides; no text to transcribe)

  32. Principal Component Analysis - Motivation: The best vector to project onto is called the 1st principal component. What properties should it have?

  33. Principal Component Analysis - Motivation: The best vector to project onto is called the 1st principal component. What properties should it have? • Should capture the largest variance in the data • Should probably be a unit vector

  34. Principal Component Analysis - Motivation: The best vector to project onto is called the 1st principal component. What properties should it have? • Should capture the largest variance in the data • Should probably be a unit vector After we've found the first, look for the second, which: • Captures the largest amount of leftover variance • Should probably be a unit vector • Should be orthogonal to the one that came before it
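
A hedged sketch of those properties (illustrative NumPy, not the lecture's code): the principal components are the unit-norm eigenvectors of the sample covariance matrix, ordered by how much variance they capture, and they come out mutually orthogonal.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])  # correlated toy data
Xc = X - X.mean(axis=0)                        # centered, as slide 17 assumes

cov = np.cov(Xc, rowvar=False)                 # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)         # symmetric matrix -> real eigenpairs, ascending
order = np.argsort(eigvals)[::-1]              # re-sort by captured variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1, pc2 = eigvecs[:, 0], eigvecs[:, 1]
print(np.linalg.norm(pc1), np.linalg.norm(pc2))  # both ~1.0: unit vectors
print(pc1 @ pc2)                                  # ~0.0: orthogonal
print(eigvals)                                    # variance captured by each component
```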

  35.–36. Principal Component Analysis - Motivation (figure-only slides; no text to transcribe)

  37. Principal Component Analysis - Motivation: Main idea: The principal components give a new perpendicular coordinate system in which to view the data, where each principal component describes successively less information.

  38. Principal Component Analysis - Motivation: Main idea: The principal components give a new perpendicular coordinate system in which to view the data, where each principal component describes successively less information. So far, all we've done is a change of basis on the feature space. But when do we reduce the dimension?
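
A small sketch of the "just a change of basis" point (again illustrative, not from the slides): projecting onto all of the principal components only rotates the data, so it can be rotated back exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
Xc = rng.normal(size=(200, 3))
Xc = Xc - Xc.mean(axis=0)                      # centered data

_, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W = eigvecs                                    # all principal components, one per column
Z = Xc @ W                                     # the data in the new (principal) coordinates
print(np.allclose(Z @ W.T, Xc))                # True: a pure change of basis loses nothing
```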

  39. Principal Component Analysis - Motivation: But when do we reduce the dimension? Picture data points in a 3D feature space. What if the points lay mostly along a single vector?

  40. Principal Component Analysis - Motivation: The other two principal components are still there, but they do not carry much information.

  41. Principal Component Analysis - Motivation: The other two principal components are still there, but they do not carry much information. Throw them away and work with the low-dimensional representation! Reduce 3D data to 1D.
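
A hedged sketch of that 3D-to-1D reduction (the synthetic data and names are mine): keep only the top principal component, project onto it, and reconstruct; the error is small exactly when the points really do lie mostly along one direction.

```python
import numpy as np

rng = np.random.default_rng(1)
t = rng.normal(size=(500, 1))
direction = np.array([[1.0, 2.0, -0.5]])
X = t @ direction + 0.05 * rng.normal(size=(500, 3))  # 3D points scattered near one line
Xc = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
w1 = eigvecs[:, -1]                    # top principal component (largest eigenvalue)
z = Xc @ w1                            # the 1D representation of each point
X_approx = np.outer(z, w1)             # reconstruction using only that one component
print(np.mean((Xc - X_approx) ** 2))   # small mean squared reconstruction error
```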

  42.–45. Principal Component Analysis – The How (equation/figure-only slides; no text to transcribe)

  46.–47. Principal Component Analysis – The How: But how do we find w?
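
The derivation slides themselves are equation-only in this transcript, so here is the standard variance-maximization argument written out (standard textbook material, not copied from the slides): maximize the variance of the centered data projected onto a unit vector w; the normalization constant in the covariance does not affect the maximizer.

```latex
% Find the unit vector w that maximizes the variance of the projected data Xw:
\max_{w}\; w^{\top} \Sigma\, w
\quad \text{subject to} \quad \lVert w \rVert_2 = 1,
\qquad \Sigma = \tfrac{1}{n} X^{\top} X .

% Lagrangian: L(w, \lambda) = w^{\top} \Sigma w - \lambda (w^{\top} w - 1).
% Setting \nabla_w L = 0 yields the eigenvalue problem
\Sigma w = \lambda w ,
% and the achieved variance is w^{\top} \Sigma w = \lambda, so the maximizer is
% the eigenvector of \Sigma with the largest eigenvalue: the first principal
% component. Subsequent components repeat the argument in the subspace
% orthogonal to the components already chosen.
```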

  48.–59. Principal Component Analysis – The How (equation/figure-only slides; no text to transcribe)

  60. PCA – Dimensionality reduction Questions: • How do we reduce dimensionality? • How many components should we keep?
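
One common, hedged answer to "how many components should we keep" (the 95% threshold below is an arbitrary illustrative choice, not from the lecture): rank components by explained variance and keep the smallest number whose cumulative share clears the target.

```python
import numpy as np

def choose_k(eigvals, threshold=0.95):
    """Smallest k whose top-k components explain at least `threshold` of the variance.

    eigvals: covariance eigenvalues sorted in descending order.
    """
    ratios = eigvals / eigvals.sum()        # fraction of total variance per component
    cumulative = np.cumsum(ratios)
    return int(np.searchsorted(cumulative, threshold) + 1)

spectrum = np.array([4.2, 2.1, 0.4, 0.2, 0.1])  # made-up eigenvalue spectrum
print(choose_k(spectrum))                        # -> 3 (first three cover >= 95%)
```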

  61.–62. PCA – Dimensionality reduction (figure-only slides; no text to transcribe)

  63. Quiz

  64.–69. PCA - applications (figure-only slides; no text to transcribe)

  70. Connecting PCA and SVD
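
The slide body for this part is not in the transcript, but the standard connection can be sketched as follows (illustrative code; variable names are mine): the right singular vectors of the centered data matrix are the principal components, and the squared singular values divided by n − 1 are the covariance eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))   # correlated toy data
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]         # descending order

# Route 2: SVD of the centered data matrix
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

print(np.allclose(S**2 / (n - 1), eigvals))                # same spectrum
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T)))          # same components, up to sign
```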

  71. SVD Applications
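
Again the slide content isn't in the transcript; one classic SVD application worth sketching is low-rank approximation (e.g., compressing a matrix or image), since the truncated SVD gives the best rank-k approximation in the least-squares sense (Eckart–Young). Illustrative code:

```python
import numpy as np

def low_rank_approx(A, k):
    """Best rank-k approximation of A in the Frobenius norm, via truncated SVD."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

A = np.random.default_rng(3).normal(size=(50, 30))
for k in (1, 5, 20):
    rel_err = np.linalg.norm(A - low_rank_approx(A, k)) / np.linalg.norm(A)
    print(k, round(rel_err, 3))    # relative error shrinks as the rank k grows
```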

  72. Wrap up Dimensionality reduction can be a useful way to • explore data • visualize data • represent data
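
As a practical, hedged example of the "visualize data" use case (using scikit-learn and matplotlib, which the lecture may or may not have used): project the 64-dimensional digits dataset down to 2D and plot it.

```python
# Illustrative example, not code from the lecture; assumes scikit-learn and matplotlib.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()                               # 8x8 images flattened to 64 features
Z = PCA(n_components=2).fit_transform(digits.data)   # PCA handles centering internally

plt.scatter(Z[:, 0], Z[:, 1], c=digits.target, cmap="tab10", s=8)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.title("Digits projected onto the first two principal components")
plt.show()
```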
