 
MA2823: Introduction to Machine Learning
CentraleSupélec, Fall 2017
Chloé-Agathe Azencott
Centre for Computational Biology, Mines ParisTech
chloe-agathe.azencott@mines-paristech.fr
● Course material & contact
http://tinyurl.com/ma2823-2017
chloe-agathe.azencott@mines-paristech.fr
Slides thanks to Ethem Alpaydin, Matthew Blaschko, Trevor Hastie, Rob Tibshirani and Jean-Philippe Vert.
What is (Machine) Learning?
Why Learn?
● Learning: modifying a behavior based on experience [F. Benureau]
● Machine learning: programming computers to
– model phenomena
– by means of optimizing an objective function
– using example data.
Why Learn?
● There is no need to “learn” to calculate payroll.
● Learning is used when
– Human expertise does not exist (bioinformatics);
– Humans are unable to explain their expertise (speech recognition, computer vision);
– Complex solutions change in time (routing computer networks).
[Diagram: Classical program: Data + Rules → Answers. Machine learning program: Data + Answers → Rules.]
What about AI?
Artificial Intelligence
ML is a subfield of Artificial Intelligence
– A system that lives in a changing environment must have the ability to learn in order to adapt.
– ML algorithms are building blocks that make computers behave more intelligently by generalizing rather than merely storing and retrieving data (as a database system would).
Learning objectives
● Define machine learning
● Given a problem
– Decide whether it can be solved with machine learning
– Decide as what type of machine learning problem you can formalize it (unsupervised: clustering, dimension reduction; supervised: classification, regression)
– Describe it formally in terms of design matrix, features, samples, and possibly target.
● Define a loss function (supervised setting)
● Define generalization.
What is machine learning?
● Learning general models from particular examples (data)
– Data is (mostly) cheap and abundant;
– Knowledge is expensive and scarce.
● Example in retail: from customer transactions to consumer behavior
People who bought “Game of Thrones” also bought “Lord of the Rings” [amazon.com]
● Goal: build a model that is a good and useful approximation to the data.
What is machine learning?
● Optimizing a performance criterion using example data or past experience.
● Role of Statistics: build mathematical models to make inference from a sample.
● Role of Computer Science: efficient algorithms to
– Solve the optimization problem;
– Represent and evaluate the model for inference.
Zoo of ML Problems
Unsupervised learning
Learn a new representation of the data
[Diagram: Data (images, text, measurements, omics data...), an n × p matrix X → ML algo → Data!]
Dimensionality reduction
Find a lower-dimensional representation
[Diagram: Data (images, text, measurements, omics data...), an n × p matrix X → ML algo → Data, an n × m matrix X]
– Reduce storage space & computational time
– Remove redundancies
– Visualization (in 2 or 3 dimensions) and interpretability.
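As a rough illustration of going from p features to m features, the sketch below projects each sample onto the single direction of largest variance (so m = 1). This is the idea behind principal component analysis, which the syllabus lists as a lab topic; the slide itself does not name a method, so the choice of technique, the function names, the power-iteration shortcut, and the toy data are all assumptions made for this example.

```python
import math

def first_principal_direction(X, n_iter=100):
    """Estimate the direction of largest variance by power iteration."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in X]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Arbitrary start vector; assumed not orthogonal to the leading direction.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed; try another start vector")
        v = [c / norm for c in w]
    return means, v

def project(X, means, direction):
    """Map each p-dimensional sample to one coordinate along `direction`."""
    return [sum((x_j - m_j) * d_j for x_j, m_j, d_j in zip(row, means, direction))
            for row in X]

# Toy data (made up): n = 4 samples, p = 2 perfectly redundant features.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
means, direction = first_principal_direction(X)
Z = project(X, means, direction)  # n values instead of n x p
```

Because the two toy features carry the same information, one coordinate per sample is enough to recover the data, which is the "remove redundancies" point in miniature. In practice a library routine would be the usual choice over hand-written power iteration.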
Clustering
Group similar data points together
[Diagram: Data → ML algo]
– Understand general characteristics of the data;
– Infer some properties of an object based on how it relates to other objects.
Clustering: applications
– Customer segmentation: find groups of customers with similar buying behaviors.
– Topic modeling: group documents based on the words they contain to identify common topics.
– Image compression: find groups of similar pixels that can be easily summarized.
– Disease subtyping (cancer, mental health): find groups of patients with similar pathologies (at the molecular or symptom level).
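To make "group similar data points together" concrete, here is a minimal sketch of one common clustering procedure, k-means (Lloyd's algorithm). The slides do not name an algorithm at this point, so the choice of k-means, the function name, the fixed initial centroids, and the toy points are assumptions for illustration only.

```python
import math

def kmeans(points, initial_centroids, n_iter=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(n_iter):
        # Assignment step: each point joins its closest centroid.
        clusters = [[] for _ in centroids]
        for pt in points:
            j = min(range(len(centroids)),
                    key=lambda k: math.dist(pt, centroids[k]))
            clusters[j].append(pt)
        # Update step: recompute each centroid as the mean of its members.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    labels = [min(range(len(centroids)),
                  key=lambda k: math.dist(pt, centroids[k]))
              for pt in points]
    return centroids, labels

# Toy data (made up): two well-separated groups in 2 dimensions.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Note that no labels are given to the algorithm: the cluster indices in `labels` are found from the geometry of the data alone, and the result depends on the initial centroids.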
Supervised learning
Make predictions
[Diagram: Data (n × p matrix X) and Labels (vector y of length n) → ML algo → Predictor (decision function)]
Classification
Make discrete predictions
[Diagram: Data + Labels → ML algo → Predictor]
– Binary classification
– Multi-class classification
Classification
[Figure: dogs and cats plotted along two axes, “good eater” and “human contact”. Training set D: the same points marked as positive (+) and negative (-) examples.]
Classification: Applications
– Face recognition: identify faces independently of pose, lighting, occlusion (glasses, beard), make-up, hair style.
– Vehicle identification (self-driving cars)
– Character recognition: read letters or digits independently of different handwriting styles.
– Sound recognition: which language is spoken? Who wrote this music? What type of bird is this?
– Spam detection
– Precision medicine: does this sample come from a sick or healthy person? Will this drug work on this patient?
Regression
Make continuous predictions
[Diagram: Data + Labels → ML algo → Predictor]
Regression
[Figure: train occupancy plotted against time of day]
Regression: Applications
– Click prediction: how many people will click on this ad? Comment on this post? Share this article on social media?
– Load prediction: how many users will my service have at a given time?
– Algorithmic trading: what will the price of this share be?
– Drug development: what is the binding affinity between this drug candidate and its target? What is the sensitivity of the tumor to this drug?
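A minimal sketch of a regression predictor, assuming the simplest possible model: a straight line fitted by ordinary least squares to one feature. The slides do not prescribe this model here (linear regression appears later in the syllabus), and the function names and numbers below are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Continuous prediction for a new input x."""
    return slope * x + intercept

# Toy data (made up): the targets lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
y_new = predict(4.0, slope, intercept)
```

Unlike a classifier, the predictor returns a real number, so it can answer "how many" or "how much" questions such as the ones listed above. A straight line is only one possible hypothesis class; a quantity like train occupancy over a day would likely need something more flexible.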
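Supervised learning setting
[Diagram: data matrix (design matrix) X with n rows and p columns, and a vector y of length n]
– Columns of X (p): features, variables, descriptors, attributes
– Rows of X (n): samples, observations, data points
– y: outcome, target, label
● Binary classification: each label takes one of two values
● Multi-class classification: each label takes one of more than two values
● Regression: each label is a real number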
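The vocabulary above can be pinned down with a tiny example. The sketch below stores a design matrix as a list of rows and shows one target vector for each of the three supervised settings; all values and variable names are invented for illustration.

```python
# Design matrix: n = 4 samples (rows), p = 2 features (columns).
X = [
    [1.0, 0.2],
    [0.9, 0.1],
    [0.2, 0.8],
    [0.1, 0.9],
]

# One target per sample; the type of target defines the problem.
y_binary = [1, 1, 0, 0]                        # binary classification
y_multiclass = ["dog", "dog", "cat", "bird"]   # multi-class classification
y_regression = [3.2, 2.9, 0.4, 0.1]            # regression

n = len(X)      # number of samples (observations, data points)
p = len(X[0])   # number of features (variables, descriptors, attributes)

def check_shapes(X, y):
    """True if all rows have the same length and there is one target per sample."""
    return all(len(row) == len(X[0]) for row in X) and len(y) == len(X)
```

The same X can be paired with any of the three target vectors; only y changes between a classification and a regression problem.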
Hypothesis class
● Hypothesis class
– The space of possible decision functions we are considering
– Chosen based on our beliefs about the problem
[Figure: family cars and non-family cars plotted by x1: Price and x2: Engine power]
What shape do you think the discriminant should take?
Hypothesis class
● Hypothesis class
– Belief: the decision function is a rectangle
[Figure: in the (x1: Price, x2: Engine power) plane, family cars lie inside the rectangle p1 ≤ x1 ≤ p2, e1 ≤ x2 ≤ e2; non-family cars lie outside]
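If the decision function is believed to be an axis-aligned rectangle, every hypothesis in the class is described by four numbers: the price bounds p1, p2 and the engine-power bounds e1, e2 from the figure. A minimal sketch follows; the class name and the numeric bounds are hypothetical, chosen only to make the example run.

```python
from dataclasses import dataclass

@dataclass
class RectangleHypothesis:
    """One member of the hypothesis class: an axis-aligned rectangle."""
    p1: float  # lower bound on price
    p2: float  # upper bound on price
    e1: float  # lower bound on engine power
    e2: float  # upper bound on engine power

    def predict(self, price, engine_power):
        """True (family car) if the point falls inside the rectangle."""
        return (self.p1 <= price <= self.p2
                and self.e1 <= engine_power <= self.e2)

# Hypothetical bounds, in arbitrary units.
h = RectangleHypothesis(p1=15_000, p2=30_000, e1=80, e2=150)
inside = h.predict(price=20_000, engine_power=100)
outside = h.predict(price=60_000, engine_power=300)
```

Learning within this class then amounts to choosing the four parameters; a different belief (for instance, a linear boundary) would give a different class with different parameters.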
Loss function
● Loss function (or cost function, or risk): quantifies how far the decision function is from the truth (= oracle).
● E.g. [examples not recovered]
● Empirical risk on dataset D [formula not recovered]
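The slide's own examples and formula did not survive, so the sketch below substitutes two widely used losses (0/1 loss for classification, squared error for regression) and takes the empirical risk to be the average loss over the dataset, which is the usual definition. These specific choices, the function names, and the toy data are assumptions, not necessarily what the slide showed.

```python
def zero_one_loss(y_true, y_pred):
    """1 for a wrong discrete prediction, 0 for a correct one."""
    return 0 if y_true == y_pred else 1

def squared_loss(y_true, y_pred):
    """Squared difference between a real-valued target and its prediction."""
    return (y_true - y_pred) ** 2

def empirical_risk(loss, f, dataset):
    """Average loss of decision function f over (x, y) pairs in the dataset."""
    return sum(loss(y, f(x)) for x, y in dataset) / len(dataset)

# Toy dataset D (made up) of (x, y) pairs and a threshold decision function.
D = [(1, 1), (2, 0), (3, 1), (4, 1)]

def f(x):
    return 1 if x >= 3 else 0

risk = empirical_risk(zero_one_loss, f, D)
```

With the 0/1 loss, the empirical risk is simply the fraction of samples in D that f gets wrong.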
Supervised learning: 3 ingredients
A good and useful approximation
● Choose a hypothesis class
– Parametric methods, e.g. [formula not recovered]
– Non-parametric methods, e.g. f(x) is the label of the point closest to x.
● Choose a loss function L
– Empirical error: [formula not recovered]
● Choose an optimization procedure
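The non-parametric example, "f(x) is the label of the point closest to x", can be written in a few lines. The sketch below assumes Euclidean distance as the notion of "closest" (the slide does not specify one) and uses an invented training set and function name.

```python
import math

def nearest_neighbor_predict(x, training_set):
    """Return the label of the training point closest to x.

    training_set is a list of (point, label) pairs; distance is Euclidean,
    which is an assumption made for this sketch.
    """
    _, closest_label = min(training_set,
                           key=lambda pair: math.dist(pair[0], x))
    return closest_label

# Toy training set (made up): two labeled points in 2 dimensions.
training_set = [((0.0, 0.0), "cat"), ((1.0, 1.0), "dog")]
label = nearest_neighbor_predict((0.9, 0.8), training_set)
```

There are no parameters to fit: the "model" is the training set itself, which is what makes the method non-parametric.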
Generalization
A good and useful approximation
● It’s easy to build a model that performs well on the training data
● But how well will it perform on new data?
● “Predictions are hard, especially about the future” (Niels Bohr).
– Learn models that generalize well.
– Evaluate whether models generalize well.
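The gap between training performance and performance on new data can be seen with a model that memorizes its training set. The sketch below uses a nearest-neighbor rule on one feature, with one deliberately mislabeled training point; the data, the underlying rule (label 1 when x > 3.5), and the function names are all invented for illustration.

```python
def nearest_label(x, train):
    """Label of the training point closest to x (1-dimensional inputs)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def error_rate(predict, data):
    """Fraction of (x, y) pairs for which predict(x) differs from y."""
    return sum(predict(x) != y for x, y in data) / len(data)

# Training data: the point at x = 2.0 carries a noisy (wrong) label.
train = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 0), (4.0, 1), (5.0, 1)]
# New data, labeled by the underlying rule without noise.
test = [(0.4, 0), (1.8, 0), (2.2, 0), (3.4, 0), (4.6, 1), (5.5, 1)]

def predict(x):
    return nearest_label(x, train)

train_error = error_rate(predict, train)
test_error = error_rate(predict, test)
```

The model reproduces every training label, including the noisy one, so its training error says little about how it behaves on new points near x = 2. Measuring error on data that was not used for fitting is one simple way to evaluate generalization.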
[Figure: machine learning and related fields and terms: Artificial intelligence, Electrical engineering, Signal processing, Pattern recognition, Engineering, Knowledge discovery in databases, Optimization, Computer science, Data mining, Inference, Big data, Discriminant analysis, Business, Statistics, Data science, Induction. Source: http://www.kdnuggets.com/2016/11/machine-learning-vs-statistics.html]
Learning objectives
After this course, you should be able to
– Identify problems that can be solved by machine learning;
– Formulate your problem in machine learning terms;
– Given such a problem, identify and apply the most appropriate classical algorithm(s);
– Implement some of these algorithms yourself;
– Evaluate and compare machine learning algorithms for a particular task.
Course Syllabus
● Sep 29
1. Introduction
2. Convex optimization
● Oct 2
3. Dimensionality reduction
Lab: Principal component analysis + Jupyter, pandas, and scikit-learn
● Oct 6
4. Model selection
Lab: Convex optimization with scipy.optimize
● Oct 13
5. Bayesian decision theory
Lab: Intro to Kaggle challenge
● Oct 20
6. Linear regression
Lab: Linear regression
● Nov 10
7. Regularized linear regression
Lab: Regularized linear regression
● Nov 17
8. Nearest-neighbor approaches
Lab: Nearest-neighbor approaches
● Nov 24
9. Tree-based approaches
Lab: Tree-based approaches
● Dec 01
10. Support vector machines
Lab: Support vector machines
● Dec 08
11. Neural networks
Deep learning (Joseph Boyd) + Bioimage informatics applications (Peter Naylor)
● Dec 15
12. Clustering
Lab: Clustering