
Introduction Lecturer: Prof. Aude Billard (aude.billard@epfl.ch)



  1. MACHINE LEARNING – 2013 Introduction. Lecturer: Prof. Aude Billard (aude.billard@epfl.ch). Assistants: Dr. Basilio Noris, Nicolas Sommer (basilio.noris@epfl.ch; n.sommer@epfl.ch)

  2. MACHINE LEARNING – 2013 Practicalities. The class alternates between: • Lectures: 9h15-11h00 + Exercises: 11h15-13h00 (in room MEB331) • Practicals: 9h15-13h00 (in room GRC02)

  3. MACHINE LEARNING – 2013 Class Timetable: http://lasa.epfl.ch/teaching/lectures/ML_Phd/index.php

  4. MACHINE LEARNING – 2013 Practicalities. Website of the class: http://lasa.epfl.ch/teaching/lectures/ML_Phd. The lecture notes "Machine Learning Techniques" are available at the Librairie Polytechnique. The course covers selected chapters of the lecture notes; see the website.

  5. MACHINE LEARNING – 2013 Grading. 50% of the grade is based on personal work, with a choice between: 1. A mini-project implementing an algorithm and evaluating its performance and sensitivity to parameter choices (to be done individually), OR 2. A literature survey on a topic chosen from a list provided in class (can be done in a team of two people). Count ~25-30 hours of personal work, i.e. about one week. The other 50% is based on a final oral exam: 20 minutes of preparation, then 20 minutes of answers at the blackboard (closed book, but you may bring one recto-verso A4 page of personal notes).

  6. MACHINE LEARNING – 2013 Prerequisites: Linear Algebra, Probabilities and Statistics. Basics in ML can be an advantage (otherwise catch up with the lecture notes on the website: http://lasa.epfl.ch/teaching/lectures/ML_Phd/)

  7. MACHINE LEARNING – 2013 Syllabus. Compulsory reading of the background chapters before class!

  8. MACHINE LEARNING – 2013 Today’s class format: • Examples of ML applications • Taxonomy and basic concepts of ML • Brief recap of basic maths for the class • Overview of practicals

  9. MACHINE LEARNING – 2013 What is Machine Learning to you? What do you think it is used for? Why are you taking this class?!

  10. MACHINE LEARNING – 2013 Machine Learning, a definition. "Machine Learning is the field of scientific study that concentrates on induction algorithms and on other algorithms that can be said to 'learn.'" (Machine Learning Journal, Kluwer Academic) "Machine Learning is an area of artificial intelligence involving developing techniques to allow computers to 'learn'. More specifically, machine learning is a method for creating computer programs by the analysis of data sets, rather than the intuition of engineers. Machine learning overlaps heavily with statistics, since both fields study the analysis of data." (Webster Dictionary) "Machine learning is a branch of statistics and computer science, which studies algorithms and architectures that learn from data sets." (WordIQ)

  11. MACHINE LEARNING – 2012 What is Machine Learning? Machine Learning encompasses a large set of algorithms that aim at inferring information from what is hidden. A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, "On separation of semitransparent dynamic images from static background", Proc. Intl. Conf. on Independent Component Analysis and Blind Signal Separation, pp. 934-940, 2006.

  12. MACHINE LEARNING – 2012 What is Machine Learning? The strength of ML algorithms is that they can be applied to arbitrary sets of data: they can recognize patterns in data from various sources, e.g. recognizing human speech. Shown here is the waveform produced when uttering the word "alright".

  13. MACHINE LEARNING – 2012 What is Machine Learning? The same note played by an oboe and by a piano.

  14. MACHINE LEARNING – 2013 What is Machine Learning? What is sometimes impossible for humans to see is easy for ML to pick up. Demo Eyes-No-Gaze; Demo Eyes-With-Gazes.

  15. MACHINE LEARNING – 2013 What is Machine Learning? ML algorithms make inferences by analyzing a set of signals or data points. Support Vector Regression; Demo PCA: Wrinkles, Eyelids and Eyelashes. Noris et al., 2011, Computer Vision and Image Understanding.

  16. MACHINE LEARNING – 2013 What is Machine Learning? Conversely, things that seem evident to humans may require more than one ML tool, and also some intuition for encoding the data. There is an ambiguity: the two sets of images are differentiable by both orientation and color. Orientation is spurious information coming from a poor choice of training data; color is the feature we are trying to teach the algorithm.

  17. MACHINE LEARNING – 2012 What is Machine Learning? Conversely, things that seem evident to humans may require more than one ML tool, and also some intuition for encoding the data. A good training set must provide enough information for the algorithm to do proper inference. Here, one must provide images of the two pens in the same set of orientations.

  18. MACHINE LEARNING – 2013 Learning versus Memorization. Learning implies generalizing. Generalizing consists of extracting key features from the data, matching them across data (to find resemblances), and storing a generalized representation of the data features that accounts best (according to a given metric) for all the small differences across data. Classification and clustering techniques are examples of methods that generalize by categorizing the data. Generalizing is the opposite of memorizing, and one often must find a tradeoff between over-generalizing, hence losing information about the data, and overfitting, i.e. keeping more information than required. Generalization is particularly important in order to reduce the influence of noise introduced in the variability of the data.
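The generalization-versus-memorization tradeoff can be illustrated with a small curve-fitting sketch (my own toy example, not from the lecture): a low-degree polynomial generalizes from noisy samples of a linear trend, while a degree-9 polynomial interpolates (memorizes) the training noise and does worse on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an underlying linear trend y = 2x.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + 0.2 * rng.standard_normal(10)
x_test = np.linspace(0.05, 0.95, 10)   # held-out points
y_test = 2 * x_test                    # noise-free ground truth

# A low-degree fit generalizes; a degree-9 fit memorizes the noise.
simple = np.polyfit(x_train, y_train, 1)
complex_ = np.polyfit(x_train, y_train, 9)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

print(mse(simple, x_test, y_test))    # small: captures the trend
print(mse(complex_, x_test, y_test))  # larger: fit the noise, not the signal
```

The metric here (mean squared error on held-out points) is one choice of the "given metric" mentioned above.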

  19. MACHINE LEARNING – 2013 Taxonomy in ML • Supervised learning – where the algorithm learns a function or model that best maps a set of inputs to a set of desired outputs. • Reinforcement learning – where the algorithm learns a policy or model of the set of transitions across a discrete set of input-output states (Markovian world) in order to maximize a reward value (external reinforcement). • Unsupervised learning – where the algorithm learns a model that best represents a set of inputs without any feedback (no desired output, no external reinforcement). • Learning to learn – where the algorithm learns its own inductive bias based on previous experiences.
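As a minimal illustration of the supervised case (a toy sketch of my own, not course material): a 1-nearest-neighbor classifier learns the input-to-output mapping simply by storing labeled examples and predicting the label of the closest one.

```python
import math

def nearest_neighbor_predict(train, query):
    """Supervised learning at its simplest: return the label of the
    training input closest to the query (1-nearest-neighbor)."""
    best_label, best_dist = None, math.inf
    for x, label in train:
        d = math.dist(x, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Toy labeled data: two clusters in 2-D, labels "A" and "B".
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(nearest_neighbor_predict(train, (0.05, 0.1)))  # "A"
print(nearest_neighbor_predict(train, (0.95, 1.0)))  # "B"
```

Note that this extreme memorizer generalizes poorly far from the training data, tying back to the previous slide's tradeoff.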

  20. MACHINE LEARNING – 2013 Examples of ML Applications

  21. MACHINE LEARNING – 2013 Structure Discovery. Raw data: trying to find some structure in the data…

  22. MACHINE LEARNING – 2013 Structure Discovery: example. Methods for spectral analysis, such as linear/kernel PCA, CCA and ICA, aim at finding hidden structure in the data. Projection of handwritten digits: kernel PCA projections extract some of the texture better and are less sensitive to noise than linear PCA, which boosts reconstruction and recognition of digits (Mika et al., NIPS 2000).

  23. MACHINE LEARNING – 2013 Structure Discovery: example. Methods for spectral analysis, such as linear/kernel PCA, CCA and ICA, aim at finding hidden structure in the data. Person identification task. Top row: query image and 10 candidates in the gallery set. Bottom row: projections of the query image onto the pre-learned (through kernel PCA) appearance manifold of the 10 candidates. Yang et al., "Person Reidentification by Kernel PCA Based Appearance Learning", Canadian Conf. on Computer and Robot Vision (2011).

  24. MACHINE LEARNING – 2013 Structure Discovery. Spectral analysis proceeds by either projecting or lifting the data into a lower, respectively higher, dimensional space. In each projection, groups of datapoints appear more similar than in the original space. Looking at each projection separately allows one to determine which feature each group of datapoints shares. This can be used in different ways: to discard outliers by selecting only the datapoints that have most features in common; to group datapoints according to shared features; to rank features according to how frequently they appear. [Figure: projections in feature space, x → F(x)]
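The projection case can be sketched concretely (an illustrative example of my own, assuming NumPy): linear PCA projects centered data onto the eigenvectors of the sample covariance with the largest eigenvalues, so that a few coordinates capture most of the variance.

```python
import numpy as np

def pca_project(X, n_components):
    """Linear PCA: project centered data onto the top principal
    directions (eigenvectors of the sample covariance with the
    largest eigenvalues)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return Xc @ top

rng = np.random.default_rng(1)
# 3-D data that mostly varies along one hidden direction.
X = rng.standard_normal((200, 1)) @ np.array([[3.0, 2.0, 1.0]]) \
    + 0.1 * rng.standard_normal((200, 3))
Z = pca_project(X, 1)
print(Z.shape)  # (200, 1): one coordinate captures most of the variance
```

Datapoints that project far from the rest along these directions are candidate outliers, as described above.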

  25. MACHINE LEARNING – 2013 Structure Discovery. In this class, we will briefly review some of the key novel algorithms for spectral analysis, including: kernel PCA, whose non-linear projections find wide application in a variety of domains; kernel CCA (Canonical Correlation Analysis), a generalization of kernel PCA to comparisons across domains, e.g. combining visual and auditory information; and kernel ICA, which attempts to solve more complex blind source decompositions using non-linear projections.
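To make the kernel variant concrete, here is a hedged sketch of kernel PCA with an RBF kernel (function names and parameter choices are my own, not the course's): one centers the kernel matrix in feature space and projects onto its leading eigenvectors.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=0.5):
    """Sketch of kernel PCA: build an RBF kernel matrix, center it
    in feature space, and project onto its leading eigenvectors."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)
    n = len(X)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one   # center in feature space
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                           # projection of each point

# Two concentric rings: structure that linear PCA cannot separate.
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
ring = np.c_[np.cos(t), np.sin(t)]
X = np.vstack([ring, 3 * ring])
Z = rbf_kernel_pca(X, n_components=2)
print(Z.shape)  # (80, 2)
```

The non-linear kernel map is what lets the projection reflect the ring structure rather than only linear directions of variance.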

  26. MACHINE LEARNING – 2012 Clustering. Clustering encompasses a large set of methods that try to find groups of datapoints that are similar in some way. Hierarchical clustering builds a tree-like structure by pairing datapoints according to increasing levels of similarity.

  27. MACHINE LEARNING – 2013 Clustering: example. Hierarchical clustering can be used with arbitrary sets of data. Example: hierarchical clustering to discover similar temporal patterns of crimes across districts in India. Chandra et al., "A Multivariate Time Series Clustering Approach for Crime Trends Prediction", IEEE SMC 2008.
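The pairing-by-increasing-similarity idea behind hierarchical clustering can be sketched in a few lines (a toy single-linkage version of my own, not the method used in the paper above): repeatedly merge the two closest clusters, where cluster distance is the smallest pairwise point distance.

```python
import math

def single_linkage(points, n_clusters):
    """Agglomerative (hierarchical) clustering, single linkage:
    start with one cluster per point and repeatedly merge the two
    closest clusters until only n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))   # merge the closest pair
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(sorted(len(c) for c in single_linkage(pts, 3)))  # [1, 2, 2]
```

Recording the order and distance of the merges yields the tree-like structure (dendrogram) mentioned on the previous slide.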
