
Big Picture Machine Learning 10701/15781 Carlos Guestrin Carnegie - PowerPoint PPT Presentation

Big Picture. Machine Learning 10701/15781, Carlos Guestrin, Carnegie Mellon University, March 2nd, 2005. What you have learned thus far: Learning is function approximation, Point estimation, Regression, Naïve Bayes, Logistic


  1. Big Picture. Machine Learning – 10701/15781. Carlos Guestrin, Carnegie Mellon University. March 2nd, 2005

  2. What you have learned thus far
     • Learning is function approximation
     • Point estimation
     • Regression
     • Naïve Bayes
     • Logistic regression
     • Bias-Variance tradeoff
     • Neural nets
     • Decision trees
     • Cross validation
     • Boosting
     • Instance-based learning
     • SVMs
     • Kernel trick
     • PAC learning
     • VC dimension
     • Margin bounds
     • Mistake bounds

  3. Review material in terms of…
     • Types of learning problems
     • Hypothesis spaces
     • Loss functions
     • Optimization algorithms

  4. Text Classification: Company home page vs Personal home page vs University home page vs …

  5. Function fitting [Figure: lab floor plan with numbered sensor locations; caption: Temperature data]

  6. Monitoring a complex system
     • Reverse water gas shift system (RWGS)
     • Learn model of system from data
     • Use model to predict behavior and detect faults

  7. Types of learning problems
     • Classification
     • Regression
     • Density estimation
     [Figure: surface plot; labels: Input – Features, Output?]

  8. The learning problem [Diagram labels: Data <x_1,…,x_n,y>; Learning task; Features/Function approximator; Loss function; Optimization algorithm; Learned function]

  9. Comparing learning algorithms
     • Hypothesis space
     • Loss function
     • Optimization algorithm
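The three axes of comparison can be made concrete with a minimal sketch. It assumes a one-dimensional linear hypothesis space, squared-error loss, and plain gradient descent as the optimizer; the function name `fit_linear`, the learning rate, the step count, and the toy data are illustrative choices, not taken from the slides.

```python
def fit_linear(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error.

    Hypothesis space: lines w*x + b.
    Loss function: mean squared error.
    Optimization algorithm: batch gradient descent.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_linear(xs, ys)
```

Swapping any one of the three pieces (for example, a different loss with the same hypothesis space) gives a different learning algorithm, which is the comparison the following slides walk through.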

  10. Naïve Bayes versus Logistic regression [Panels: Naïve Bayes | Logistic regression]

  11. Naïve Bayes versus Logistic regression – Classification as density estimation
      • Choose class with highest probability
      • In addition to class, we get certainty measure
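As a rough illustration of "choose the class with highest probability", the sketch below computes a Naïve Bayes posterior over two classes for binary features and returns both the predicted class and its probability (the certainty measure). The class names, priors, and conditional probabilities are made-up numbers, and `naive_bayes_posterior` is a hypothetical helper, not code from the lecture.

```python
import math


def naive_bayes_posterior(x, priors, cond):
    """Return P(class | x) for binary features x under the Naive Bayes assumption.

    priors[c] is P(class = c); cond[c][i] is P(x_i = 1 | class = c).
    """
    log_joint = {}
    for c, prior in priors.items():
        lp = math.log(prior)
        for xi, p in zip(x, cond[c]):
            # Features are assumed conditionally independent given the class.
            lp += math.log(p if xi == 1 else 1.0 - p)
        log_joint[c] = lp
    # Normalize in log space for numerical stability.
    m = max(log_joint.values())
    unnorm = {c: math.exp(v - m) for c, v in log_joint.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Hypothetical two-class, two-feature model.
priors = {"company": 0.5, "personal": 0.5}
cond = {"company": [0.8, 0.1], "personal": [0.2, 0.7]}

post = naive_bayes_posterior([1, 0], priors, cond)
label = max(post, key=post.get)   # class with highest posterior probability
certainty = post[label]           # the accompanying certainty measure
```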

  12. Logistic regression versus Boosting
      Classifier: [formulas for each method]
      Loss: Logistic regression uses Log-loss; Boosting uses Exponential-loss
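The two losses can be compared as functions of the margin m = y·f(x), with labels y in {-1, +1}. The sketch below uses the standard textbook forms, ln(1 + exp(-m)) for log-loss and exp(-m) for exponential loss; the function names and the sample margins are illustrative assumptions.

```python
import math


def log_loss(margin):
    """Logistic (log) loss ln(1 + exp(-m)) for margin m = y * f(x)."""
    return math.log1p(math.exp(-margin))


def exp_loss(margin):
    """Exponential loss exp(-m) for margin m = y * f(x)."""
    return math.exp(-margin)


# Negative margins are misclassified points, positive margins are correct ones.
for m in (-2.0, 0.0, 2.0):
    print(f"margin={m:+.1f}  log-loss={log_loss(m):.4f}  exp-loss={exp_loss(m):.4f}")
```

For badly misclassified points (large negative margin) the exponential loss grows exponentially while the log-loss grows roughly linearly, so the exponential loss penalizes outliers much more heavily.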

  13. Linear classifiers – Logistic regression versus SVMs [Figure: separating hyperplane w·x + b = 0]

  14. What’s the difference between SVMs and Logistic Regression? (Revisited again)

                                              SVMs          Logistic Regression
      Loss function                           Hinge loss    Log-loss
      High dimensional features with kernels  Yes!          Yes!
      Solution sparse                         Often yes!    Almost always no!
      Type of learning
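One way to see why the SVM solution is often sparse while the logistic regression solution almost never is: the hinge loss is exactly zero for every point with margin at least 1, whereas the log-loss is strictly positive for every point. The sketch below checks this on a few hypothetical margins, using the standard forms max(0, 1 - m) and ln(1 + exp(-m)).

```python
import math


def hinge_loss(margin):
    """Hinge loss max(0, 1 - m) for margin m = y * f(x)."""
    return max(0.0, 1.0 - margin)


def log_loss(margin):
    """Logistic (log) loss ln(1 + exp(-m)) for margin m = y * f(x)."""
    return math.log1p(math.exp(-margin))


# Hypothetical margins for four training points.
margins = [-1.0, 0.5, 1.0, 3.0]

# Points with zero hinge loss do not push on the SVM objective.
zero_hinge = [m for m in margins if hinge_loss(m) == 0.0]

# Every point contributes some log-loss, however small.
all_positive_log = all(log_loss(m) > 0.0 for m in margins)
```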

  15. SVMs and instance-based learning [Diagram labels: Data <x_1,…,x_n,y>; SVMs, "Classify as"; Instance based learning, "Classify as"]

  16. Instance-based learning versus Decision trees [Panels: Decision trees | 1-Nearest neighbor]
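For reference, 1-nearest neighbor can be sketched in a few lines: store the training data and label a query with the label of its closest stored point. Squared Euclidean distance, the helper name `nearest_neighbor_predict`, and the toy points are assumptions made for illustration.

```python
def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to `query`.

    `train` is a list of (features, label) pairs; distance is squared Euclidean.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label


# Hypothetical 2-D training set with two classes.
train = [
    ((0.0, 0.0), "A"),
    ((0.0, 1.0), "A"),
    ((3.0, 3.0), "B"),
    ((4.0, 3.0), "B"),
]

pred_near_a = nearest_neighbor_predict(train, (0.2, 0.4))
pred_near_b = nearest_neighbor_predict(train, (3.6, 2.9))
```

There is no training step beyond storing the data; all the work happens at query time, in contrast to a decision tree, which builds its partition of the input space up front.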

  17. Logistic regression versus Neural nets [Panels: Logistic regression | Neural Nets]

  18. Linear regression versus Kernel regression [Panels: Linear Regression | Kernel regression | Kernel-weighted linear regression]
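A minimal sketch of kernel regression, assuming the usual kernel-weighted average (Nadaraya-Watson) form with a Gaussian kernel: the prediction at a query point is the average of the training targets, weighted by how close each training input is. The bandwidth value, the function name, and the data are illustrative assumptions.

```python
import math


def kernel_regression(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` as a Gaussian-kernel-weighted average of ys."""
    weights = [
        math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2))
        for x in xs
    ]
    # Note: if the query is extremely far from all xs, the weights can
    # underflow to zero; this sketch does not guard against that case.
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical 1-D data.
xs = [0.0, 2.0]
ys = [0.0, 4.0]

midpoint_pred = kernel_regression(xs, ys, query=1.0)       # equal weights
near_left_pred = kernel_regression(xs, ys, query=0.1)      # dominated by x = 0
```

Unlike linear regression, no global weight vector is fit; the prediction is recomputed from the stored data at each query.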

  19. Kernel-weighted linear regression: Local basis functions for each region. Kernels average between regions. [Figure: lab floor plan with numbered sensor locations]

  20. SVM regression [Figure: regression line w·x + b with tube boundaries w·x + b − ε and w·x + b + ε]
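The ε-tube around w·x + b corresponds to the ε-insensitive loss commonly used for SVM regression: predictions within ε of the target cost nothing, and errors beyond that are penalized linearly. The sketch below assumes that standard form; the function name and the value of ε are illustrative.

```python
def eps_insensitive_loss(prediction, target, eps=0.1):
    """Return max(0, |prediction - target| - eps)."""
    return max(0.0, abs(prediction - target) - eps)


# Inside the tube: no loss.
inside = eps_insensitive_loss(1.05, 1.0, eps=0.1)

# Outside the tube: loss grows linearly with the distance beyond eps.
outside = eps_insensitive_loss(1.5, 1.0, eps=0.1)
```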

  21. BIG PICTURE (a few points of comparison)

      Legend, learning task: DE = density estimation, Cl = Classification, Reg = Regression
      Legend, loss function: LL = Log-loss/MLE, Mrg = Margin-based, RMS = Squared error

      Boosting                 Cl, exp-loss
      Naïve Bayes              DE, LL
      Logistic regression      DE, LL
      SVMs                     Cl, Mrg
      SVM regression           Reg, Mrg
      kernel regression        Reg, RMS
      linear regression        Reg, RMS
      Neural Nets              DE,Cl,Reg,RMS
      Instance-based Learning  DE,Cl,Reg
      Decision trees           DE,Cl,Reg

      This is a very incomplete view!!!
