  1. Practical Unsupervised Learning INFO/CS 4300, Spring 2016 Jack Hessel

  2. Unsupervised Learning is Cool!

  3. But how can we use this in our projects?

  4. But first, let’s look at our dataset...

  5–12. Data Dimensionality! [Scatter plot, built up over several slides: Probability of Lung Cancer Developing (y-axis) vs. Cigarettes Consumed Per Day (x-axis). The points fall roughly along a single line, and the final build is labeled "1 Dimension!": even though every point has two coordinates, the data effectively lives in one dimension.]
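(Not part of the original deck.) A minimal sketch of this idea, assuming scikit-learn and made-up numbers in the spirit of the plot: generate 2-D points that lie near a line, then confirm with PCA (closely related to the SVD used later in the deck) that nearly all of the variance sits along one direction.

```python
# Hypothetical data in the spirit of the slide's scatter plot:
# 2-D points (cigarettes/day, lung cancer probability) near a line.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
cigarettes = rng.uniform(0, 40, size=200)                 # x-axis
risk = 0.02 * cigarettes + rng.normal(0, 0.05, size=200)  # y-axis, roughly linear in x

points = np.column_stack([cigarettes, risk])
pca = PCA(n_components=2).fit(points)

# The first principal direction captures almost all of the variance:
# the "2-D" data effectively lives in 1 dimension.
print(pca.explained_variance_ratio_)
```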

  13–20. Words and documents are the same way... [Diagram, built up over several slides: a |D| x |V| tf-idf matrix, with documents (one marked "Pineapple") plotted as points in |V|-dimensional space; a rotated note recalls where the "Pineapple" documents were. Annotation: (but, really -- a low dimensional subspace…)]
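(Not in the deck.) A minimal sketch of building the |D| x |V| tf-idf matrix with scikit-learn; the documents are made up:

```python
# Each document becomes one row of a |D| x |V| tf-idf matrix.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "pineapple pizza is a controversial topic",
    "grilled pineapple goes well with pork",
    "the stock market fell sharply today",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)  # sparse matrix, shape (|D|, |V|)
print(X.shape)
```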

  21–24. Key questions in unsupervised NLP (revealed one per slide):
  1. How many dimensions does our dataset actually live in? (see the sketch below)
  2. How do we project our data down to those dimensions?
  3. Does any of this stuff actually do anything for our projects?
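(Not in the deck.) For question 1, one rough but practical answer is to check how much variance each extra dimension explains; a sketch, reusing the tf-idf matrix X from above:

```python
# Inspect the explained variance of the top singular directions and look
# for an "elbow" where extra dimensions stop adding much.
import numpy as np
from sklearn.decomposition import TruncatedSVD

k_max = min(50, X.shape[0] - 1, X.shape[1] - 1)  # cap for the toy corpus
svd = TruncatedSVD(n_components=k_max).fit(X)
print(np.cumsum(svd.explained_variance_ratio_))
```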

  25–28. Key tool in Linear Algebra, NLP, Machine Learning, Data Science, Computer Vision, Algorithms, Matrix Computations, Optimization, Statistics... [repeated over four slides for emphasis; the tool, introduced on the next slides, is the singular value decomposition (SVD)]

  29–34. [Diagram, built up over several slides: the |D| x |V| tf-idf matrix X written as X = U Σ Vᵀ. First X is shown as a sum of rank-1 pieces (a |D| x 1 column times a 1 x |V| row); then the top k such pieces are kept, giving U (|D| x k), a k x k diagonal matrix Σ of singular values, and Vᵀ (k x |V|).]
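(Not in the deck.) The same factorization in code, assuming the toy tf-idf matrix X from earlier; densifying is only sensible for small corpora:

```python
# Full SVD, then keep only the top-k singular directions to get the
# best rank-k approximation of X.
import numpy as np

X_dense = X.toarray()                        # |D| x |V|, small corpora only
U, s, Vt = np.linalg.svd(X_dense, full_matrices=False)

k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # (|D| x k)(k x k)(k x |V|)
print(np.linalg.norm(X_dense - X_k))         # error shrinks as k grows
```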

  35. Key questions in unsupervised NLP: 1. How many dimensions does our dataset actually live in? 2. How do we project our data down to those dimensions?

  36. Enough talk, time for some magic.

  37–39. [Diagram, built up over several slides: the |D| x |V| tf-idf matrix decomposed as a sum of low-rank pieces] Latent Semantic Indexing (LSI) = Latent Semantic Analysis (LSA): the truncated SVD of the tf-idf matrix, under two names.
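(Not in the deck.) In scikit-learn terms, LSI/LSA amounts to a truncated SVD of the tf-idf matrix; a sketch reusing X from above:

```python
# Truncated SVD of the tf-idf matrix = LSI/LSA document vectors.
from sklearn.decomposition import TruncatedSVD

lsi = TruncatedSVD(n_components=2)  # k = 2 latent dimensions for the toy corpus
doc_vecs = lsi.fit_transform(X)     # |D| x k document representations
print(doc_vecs)
```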

  40. As a side note... Great first NLP paper to read! Highly accessible :-) "Indexing by latent semantic analysis", Deerwester et al., 1990. [Photo: Scott Deerwester]

  41–44. What are topic models? [Diagram: the |D| x |V| tf-idf matrix factored into a |D| x k matrix times a k x |V| matrix] Three classic instances, added one per slide: Latent semantic indexing (Deerwester et al. 1990), Non-negative matrix factorization (Lee and Seung 1999), Latent Dirichlet allocation (Blei et al. 2003).
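(Not in the deck.) Minimal sketches of the other two factorizations named on the slide, via scikit-learn; note that LDA is conventionally fit on raw counts rather than tf-idf:

```python
from sklearn.decomposition import NMF, LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

counts = CountVectorizer().fit_transform(docs)  # raw counts for LDA

nmf = NMF(n_components=2).fit(X)                             # on tf-idf
lda = LatentDirichletAllocation(n_components=2).fit(counts)  # on counts

print(nmf.transform(X).shape)       # |D| x k document-topic weights
print(lda.transform(counts).shape)  # |D| x k topic proportions
```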

  45–48. Why do we care?? [Diagram: the |D| x k document-topic matrix] Interpretable, small number of features for text classification!
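(Not in the deck.) A sketch of the point: the k topic weights make a small, interpretable feature set you can hand to a linear classifier. The labels here are hypothetical:

```python
# Classify documents using k topic features instead of |V| tf-idf features.
from sklearn.linear_model import LogisticRegression

labels = [1, 1, 0]           # hypothetical: 1 = food, 0 = finance
features = lsi.transform(X)  # |D| x k, from the LSI sketch above

clf = LogisticRegression().fit(features, labels)
print(clf.predict(features))
```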

  49–53. Document length *matters a lot*. Different regimes of supervised NLP (Jack's opinions only! Lots of caveats!): [Diagram, built up over several slides: a spectrum from fewer words to more words, with the boundary around 50-100 words per document. Topic models fail on the short side and work on the long side.] Naive Bayes or n-gram features + a linear classifier are almost always pretty good in practice :-)
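(Not in the deck.) A self-contained sketch of that recommended baseline, n-gram counts plus Naive Bayes, on made-up data:

```python
# n-gram features + a simple linear classifier: a strong baseline,
# especially for shorter documents.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "pineapple pizza is a controversial topic",
    "grilled pineapple goes well with pork",
    "the stock market fell sharply today",
]
train_labels = [1, 1, 0]  # hypothetical: 1 = food, 0 = finance

baseline = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),  # unigrams + bigrams
    MultinomialNB(),
)
baseline.fit(train_docs, train_labels)
print(baseline.predict(["pineapple on pizza?"]))
```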

  54. So can we see if our Kickstarter will be successful?
