
Statistical Machine-Learning: General framework + Supervised Learning



  1. Apprentissage Artificiel (Statistical Machine-Learning)
     General framework + Supervised Learning

     Pr. Fabien MOUTARDE
     Center for Robotics, MINES ParisTech, PSL Université Paris
     Fabien.Moutarde@mines-paristech.fr
     http://people.mines-paristech.fr/fabien.moutarde

     Statistical Machine-Learning: framework + supervised ML, Pr Fabien MOUTARDE, Center for Robotics, MINES ParisTech, PSL, Nov.2019

     Outline
     • Intro: What is Statistical Machine-Learning?
     • Typology of Machine-Learning
     • General formalism for SUPERVISED Learning
     • Evaluating learnt models: metrics for CLASSIFICATION
     • Generalization vs. overfitting

  2. What is Statistical Machine-Learning?
     [Figure: diagram relating ARTIFICIAL INTELLIGENCE, STATISTICS, OPTIMIZATION, Data analysis and Machine Learning]

     Statistical Machine-Learning
     • One of many sub-fields of Artificial Intelligence
     • Application of optimization methods to statistical modelling
     • Data-driven mathematical modelling, for automated classification, regression, partitioning/clustering, or decision/behavior rules
     [Figure: Clustering]

  3. Real-world examples of Machine-Learning applications
     • Handwritten character recognition
       [Figure: images of handwritten digits "3" and "6" passed to a handwritten digit recognition system, which outputs 3 and 6]
     • Visual recognition of object categories
       [Figure: pedestrian recognition system separating "pedestrians" from "non-pedestrians"]
     • Speech recognition
     • Multi-factorial forecasting
     • Natural Language understanding
     • Playing GO!
     • MANY MANY MORE…

     One of the simplest ML algorithms: Least Squares Linear Regression
     • Model: (straight) line $y = ax + b$ (2 parameters, $a$ and $b$)
     • Data: $n$ points with target values, $(x_i, y_i) \in \mathbb{R}^2$
     • Cost function: sum of squared deviations from the line, $K = \sum_i (y_i - a x_i - b)^2$
     • Algorithm: direct (or iterative) solving of the linear system

       \[
       \begin{pmatrix}
         \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i \\
         \sum_{i=1}^{n} x_i   & n
       \end{pmatrix}
       \begin{pmatrix} a \\ b \end{pmatrix}
       =
       \begin{pmatrix}
         \sum_{i=1}^{n} x_i y_i \\
         \sum_{i=1}^{n} y_i
       \end{pmatrix}
       \]

     [Question: Where does this equation come from?]
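The linear system for the least-squares line comes from setting the partial derivatives of K with respect to a and b to zero. As a minimal sketch of the "direct" solution, the 2x2 system can be solved with Cramer's rule. The function name `fit_line` and the sample data are illustrative assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b by solving the 2x2 normal equations."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # System: [[sxx, sx], [sx, n]] @ [a, b] = [sxy, sy], solved by Cramer's rule.
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are identical: the line is not unique")
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Illustrative data lying exactly on y = 2x + 1.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

On data that lies exactly on a line, the fit recovers that line's slope and intercept; with noisy data it returns the line minimizing K.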

  4. Regression vs. classification

     Regression
     • Points = examples (input, target output) → curve = regression
     • Continuous output(s)

     Classification
     • Input = point position
     • Output = class label (♦ = -1, + = +1)
     • Function label = f(x) (and separation boundary)
     • Discrete output(s)

     Simplest classification method: Nearest Neighbors algorithm
     [Figure: principle of Nearest Neighbors (kNN) for classification]
     [What are the main drawbacks of this method?]
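The nearest-neighbors principle can be sketched in a few lines: a query point receives the majority label among its k closest training examples. This is a minimal illustration assuming Euclidean distance and majority voting. The name `knn_predict` and the toy data are hypothetical, not from the slides.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs, each point a tuple of floats.
    """
    # Sort all training examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Illustrative 2-D data with class labels -1 and +1.
train = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((1.0, 0.0), -1),
         ((5.0, 5.0), +1), ((5.0, 6.0), +1), ((6.0, 5.0), +1)]
label_near_origin = knn_predict(train, (0.5, 0.5), k=3)
label_far = knn_predict(train, (5.5, 5.5), k=3)
```

Note that this sketch stores the whole training set and scans all of it for every prediction, which is one place to start when thinking about the drawbacks of the method.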

  5. Supervised vs Unsupervised learning

     Learning is called "supervised" when there are "target" values for every example in the training dataset:
     examples = (input, output) pairs = $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$
     The goal is to build a (generally non-linear) approximate model for interpolation, in order to be able to GENERALIZE to input values other than those in the training set.

     Learning is called "unsupervised" when there are NO target values:
     dataset = $\{x_1, x_2, \dots, x_n\}$
     The goal is typically either to do data mining (unveil structure in the distribution of examples in input space), or to find an output maximizing a given evaluation function.

  6. Machine-Learning TYPOLOGY
     [Figure: typology of Machine-Learning]

     SUPERVISED LEARNING: regression or classification

     Regression
     • Examples $\{(x_i, y_i),\ i = 1, \dots, N\}$, with $x_i$ = input and $y_i$ = target output
     • → Infer: curve = regression, $y \approx h(x)$
     • y: continuous output(s)

     Classification
     • Input $\{x_i,\ i = 1, \dots, N\}$ = point positions
     • Output = class label (♦ = -1, + = +1)
     • → Infer: label = h(x) (and separation boundary)
     • y: discrete output(s)

  7. UNSUPERVISED LEARNING: Clustering vs. Generative model

     Clustering
     • Points = examples → partitioning into "groups" (colors) based on similarity

     Generative model
     • From examples $x_n$, estimate the PROBABILITY DISTRIBUTION $p(x)$
     • → Can GENERATE new examples SIMILAR to those in the training set

     Reinforcement Learning (RL)
     • Goal: find a "policy" $a_t = \pi(s_t)$ that maximizes [formula missing from the extracted slide]
     • Typical use of RL: learn a BEHAVIOR
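To make the generative-model idea concrete, here is a deliberately simple sketch: p(x) is assumed to be a one-dimensional Gaussian, its two parameters are estimated from the examples, and new examples are drawn from the fitted distribution. The Gaussian assumption, the function names `fit_gaussian` and `generate`, and the data are illustrative choices, not from the slides.

```python
import random
import statistics


def fit_gaussian(xs):
    """Estimate p(x) as a 1-D Gaussian: return (mean, standard deviation)."""
    return statistics.fmean(xs), statistics.pstdev(xs)


def generate(mu, sigma, count, seed=0):
    """Draw `count` new examples from the fitted Gaussian."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(count)]


# Illustrative unlabeled dataset {x_1, ..., x_n}.
examples = [4.8, 5.1, 5.0, 4.9, 5.2]
mu, sigma = fit_gaussian(examples)
new_examples = generate(mu, sigma, count=3)
```

Real generative models use far richer families than a single Gaussian, but the two steps (estimate the distribution, then sample from it) are the same.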

  8. Many different supervised ML approaches & algorithms
     • Linear regressions
     • Decision trees (ID3 or CART algorithms)
     • Bayesian (probabilistic) methods
     • …
     • Multi-layer neural networks trained with gradient backpropagation
     • Support Vector Machines
     • Boosting of "weak" classifiers
     • Random forests
     • Deep Learning (Convolutional Neural Networks,…)
     • …

  9. Supervised learning
     [Figure: examples (input-output) $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ → LEARNING ALGORITHM (usually based on an optimization technique) → $h^* \in H$ such that $h^*(x_i) \approx y_i$]
     • $H$: (parameterized) family of mathematical models
     • Hyper-parameters for the training algorithm
     • In most cases, $h^* = \arg\min_{h \in H} K\big(h, \{(x_i, y_i)\}\big)$
       where K = cost, $K = \sum_i \mathrm{loss}(h(x_i), y_i)$ [+ regularization term]
       and loss = $\| h(x_i) - y_i \|^2$

     Cost function and loss function
     Most supervised Machine-Learning algorithms work by minimizing a "cost function".
     • The cost function is generally the sum (or, up to a constant factor 1/n, the average) over all training examples of a "loss function":
       $K = \sum_i \mathrm{loss}(h(x_i), y_i)$ (+ sometimes an additional "regularization" term)
     • The loss function is usually some measure of the difference between the target value and the prediction output by the learnt model
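The cost-minimization formalism can be written out directly as code. The sketch below assumes a squared loss, an L2 penalty on the parameters as one common choice of regularization term (the slides do not specify one), and a small finite family H so that the argmin can be taken by exhaustive search. All names (`cost`, `best_in_family`, `make_line`) are hypothetical.

```python
def squared_loss(prediction, target):
    return (prediction - target) ** 2


def cost(h, examples, params=(), reg_strength=0.0):
    """K = sum_i loss(h(x_i), y_i) + reg_strength * sum of squared parameters."""
    data_term = sum(squared_loss(h(x), y) for x, y in examples)
    reg_term = reg_strength * sum(p * p for p in params)  # assumed L2 penalty
    return data_term + reg_term


def best_in_family(candidates, examples, reg_strength=0.0):
    """h* = argmin over a finite family H, given as (params, h) pairs."""
    return min(candidates,
               key=lambda c: cost(c[1], examples, c[0], reg_strength))


def make_line(a):
    """Model y = a*x with a single parameter a."""
    return lambda x: a * x


# Illustrative examples lying on y = 2x, and a family of four candidate slopes.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
family = [((a,), make_line(a)) for a in (0.0, 1.0, 2.0, 3.0)]
best_params, h_star = best_in_family(family, examples)
```

In practice H is a continuous family and the argmin is found by an optimization technique such as gradient descent rather than by enumeration.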

  10. Linear Multivariate Regression
      [From slide by ]

      Logistic Multivariate Regression
      If the target output is binary (classification)
      [From slide by ]
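The formulas for these two slides did not survive extraction, so the following is only a sketch of the standard textbook formulation of logistic regression, assumed here rather than taken from the slides: the model is P(y=1|x) = sigmoid(w·x + b), trained by batch gradient descent on the average log-loss. Function names, learning rate, epoch count and data are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Model output: estimated probability that the label of x is 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit w and b by batch gradient descent on the average log-loss.

    `xs` is a list of feature tuples, `ys` a list of 0/1 targets.
    """
    dim = len(xs[0])
    n = len(xs)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = predict_proba(w, b, x) - y  # d(log-loss)/d(linear score)
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Illustrative binary classification data in 2-D.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
p_low = predict_proba(w, b, (0.0, 0.0))
p_high = predict_proba(w, b, (4.0, 4.0))
```

Thresholding the predicted probability at 0.5 turns the model into a binary classifier with a linear separation boundary.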

  11. Usual two distinct phases of supervised Machine-Learning
      [Figure: Training: labelled examples (pedestrians vs. "non-pedestrians", cars vs. "non-cars") are fed to a STATISTICAL MACHINE-LEARNING ALGORITHM, which produces a CLASSIFIER. Recognition: the CLASSIFIER maps an input to a category (class).]
