
Introduction to Machine Learning CMU-10701 2. MLE, MAP What happened last time?

Introduction to Machine Learning CMU-10701 2. MLE, MAP. What happened last time? Barnabás Póczos & Aarti Singh, 2014 Spring. Administration: Piazza: Please use it! Blackboard is ready. Self assessment questions?


  1. Introduction to Machine Learning CMU-10701 2. MLE, MAP What happened last time? Barnabás Póczos & Aarti Singh 2014 Spring

  2. Administration. Piazza: … Please use it! Blackboard is ready. Self assessment questions? Slides are online. HW questions next week. Feedback is important! Recitation: This Wednesday at 6pm (prob theory).

  3. Independence. Independent random variables: Y and X don’t contain information about each other. Observing Y doesn’t help predict X.
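
The slide's formula is not preserved in this transcript. The standard textbook definition of independence, which matches the verbal description above, can be written as:

```latex
P(X, Y) = P(X)\,P(Y)
\quad\Longleftrightarrow\quad
P(X \mid Y) = P(X) \quad \text{(whenever } P(Y) > 0\text{)}
```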

  4. Dependent / Independent. [Figure: two plots of Y against X, one labeled “Independent X,Y” and one labeled “Dependent X,Y”]

  5. Conditionally Independent. Conditionally independent: knowing Z makes X and Y independent. Examples: Dependent: shoe size and reading skills. Conditionally independent: shoe size and reading skills given age.
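
The shoe-size example can be illustrated with a small simulation. This is only a sketch under invented assumptions: the age range, the linear relationships, and the noise levels are hypothetical numbers chosen for illustration, and sample correlation is used as a rough proxy for dependence (zero correlation does not imply independence in general).

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=20000, seed=0):
    """Draw (age, shoe_size, reading_skill); all coefficients are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(5, 12)                 # Z: age in years
        shoe = 0.8 * age + rng.gauss(0, 1)       # X depends only on age
        reading = 10 * age + rng.gauss(0, 10)    # Y depends only on age
        rows.append((age, shoe, reading))
    return rows

rows = simulate()

# Marginally, shoe size and reading skill are strongly correlated.
marginal = pearson([r[1] for r in rows], [r[2] for r in rows])

# Conditioned on a fixed age, the correlation is close to zero.
at_age_8 = [r for r in rows if r[0] == 8]
conditional = pearson([r[1] for r in at_age_8], [r[2] for r in at_age_8])

print(f"correlation overall:      {marginal:.2f}")
print(f"correlation given age 8:  {conditional:.2f}")
```

Both quantities depend on age and on nothing else, so fixing the age removes the apparent link between them.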

  6. Our first machine learning problem: Parameter estimation: MLE, MAP

  7. MLE for Bernoulli distribution. Data, D = [figure: observed coin flips]. P(Heads) = θ, P(Tails) = 1 − θ. “Frequency of heads”: the estimated probability is 3/5. MLE: choose θ that maximizes the probability of observed data.
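
As a minimal sketch of the "frequency of heads" estimate: the flip sequence below is hypothetical (the slide's data is not preserved here), chosen only so that it contains 3 heads in 5 flips, and `bernoulli_mle` is an illustrative name.

```python
def bernoulli_mle(flips):
    """Return the MLE of P(Heads): the fraction of flips equal to 'H'."""
    if not flips:
        raise ValueError("need at least one flip")
    heads = sum(1 for f in flips if f == "H")
    return heads / len(flips)

# Hypothetical data: 3 heads and 2 tails.
flips = ["H", "H", "T", "H", "T"]
theta_hat = bernoulli_mle(flips)
print(theta_hat)
```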

  8. Maximum Likelihood Estimation. MLE: choose θ that maximizes the probability of observed data, assuming the draws are independent and identically distributed.
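
The derivation itself is not preserved in this transcript. A standard version for the coin example is sketched below; the symbols \alpha_H and \alpha_T (the number of observed heads and tails) and X_i (the i-th flip) are notation assumed here, not taken from the slide.

```latex
\begin{aligned}
\hat{\theta}_{\mathrm{MLE}}
  &= \arg\max_{\theta} P(D \mid \theta) \\
  &= \arg\max_{\theta} \prod_{i=1}^{n} P(X_i \mid \theta)
     && \text{(independent draws)} \\
  &= \arg\max_{\theta} \theta^{\alpha_H} (1-\theta)^{\alpha_T}
     && \text{(identically distributed)} \\[4pt]
\frac{d}{d\theta}\Bigl[\alpha_H \log\theta + \alpha_T \log(1-\theta)\Bigr]
  &= \frac{\alpha_H}{\theta} - \frac{\alpha_T}{1-\theta} = 0
  \;\;\Longrightarrow\;\;
  \hat{\theta}_{\mathrm{MLE}} = \frac{\alpha_H}{\alpha_H + \alpha_T}
\end{aligned}
```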

  9. How good is this estimator? I want to know the coin parameter θ ∈ [0,1] within ε = 0.1 error, with probability at least 1 − δ = 0.95. How many flips do I need?
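
One common way to answer this question uses Hoeffding's inequality (slide 13): for n i.i.d. flips, P(|θ̂ − θ| ≥ ε) ≤ 2·exp(−2nε²), so requiring the right-hand side to be at most δ gives n ≥ ln(2/δ) / (2ε²). The sketch below assumes this bound is the intended route; `flips_needed` is an illustrative name.

```python
import math

def flips_needed(eps, delta):
    """Smallest n with 2 * exp(-2 * n * eps**2) <= delta (Hoeffding bound)."""
    if not (eps > 0 and 0 < delta < 1):
        raise ValueError("need eps > 0 and 0 < delta < 1")
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

n = flips_needed(eps=0.1, delta=0.05)
print(n)  # 185
```

The bound is sufficient rather than tight: it holds for any value of θ, so the number of flips actually required for a particular coin may be smaller.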

  10. Rolling a Dice, Estimation of parameters θ1, θ2, …, θ6. [Figure: four panels labeled 12, 24, 60, 120] Does the MLE estimate (relative frequencies) converge to the right value? How fast does it converge?
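
The experiment can be reproduced with a short simulation. This sketch assumes a fair die and assumes that the numbers 12, 24, 60, and 120 on the slide are the numbers of rolls; `relative_frequencies` is an illustrative name.

```python
import random
from collections import Counter

def relative_frequencies(n_rolls, rng):
    """MLE of (theta_1, ..., theta_6): fraction of rolls showing each face."""
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return [counts[face] / n_rolls for face in range(1, 7)]

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (12, 24, 60, 120):
    freqs = relative_frequencies(n, rng)
    print(n, [round(f, 2) for f in freqs])

# With many more rolls, every estimate should be close to 1/6.
many = relative_frequencies(60000, random.Random(1))
```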

  11. Rolling a Dice, Calculating the Empirical Average. Does the empirical average converge to the true mean? How fast does it converge?

  12. Rolling a Dice, Calculating the Empirical Average. 5 sample traces. How fast do they converge to the true mean?
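
A rough way to generate such traces, assuming a fair six-sided die (true mean 3.5): the trace length of 1000 rolls and the seeds are arbitrary choices made for this sketch, not values from the slide.

```python
import random

def running_average(n_rolls, rng):
    """Return the empirical average after each of n_rolls fair-die rolls."""
    total = 0
    trace = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        trace.append(total / i)
    return trace

TRUE_MEAN = 3.5  # (1 + 2 + ... + 6) / 6

# Five independent sample traces, one per seed.
traces = [running_average(1000, random.Random(seed)) for seed in range(5)]

for k, trace in enumerate(traces):
    errors = [abs(trace[n - 1] - TRUE_MEAN) for n in (10, 100, 1000)]
    print(f"trace {k}: |error| after 10/100/1000 rolls =",
          [round(e, 3) for e in errors])
```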

  13. Hoeffding’s inequality (1963). The bound contains only the range of the variables, not their variances.
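
The inequality itself is not preserved in this transcript. Its standard two-sided form is given below; the symbols a_i, b_i, and \bar{X}_n are notation assumed here.

```latex
\text{Let } X_1, \dots, X_n \text{ be independent with } X_i \in [a_i, b_i],
\quad \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i .
\quad \text{Then for every } \varepsilon > 0:

P\bigl( \lvert \bar{X}_n - \mathbb{E}[\bar{X}_n] \rvert \ge \varepsilon \bigr)
  \;\le\; 2 \exp\!\left( - \frac{2 n^2 \varepsilon^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right)
```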

  14. “Convergence rate” for LLN from Hoeffding. From Hoeffding: Convergence rate
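
The slide's formulas are missing here. One standard reading, assuming n i.i.d. variables with values in [a, b]: setting the Hoeffding bound 2·exp(−2nε²/(b−a)²) equal to δ and solving for ε shows that, with probability at least 1 − δ, the empirical average is within (b−a)·sqrt(ln(2/δ)/(2n)) of the true mean, so the deviation shrinks like 1/√n. The sketch below evaluates that radius; `hoeffding_radius` is an illustrative name.

```python
import math

def hoeffding_radius(n, delta, a=0.0, b=1.0):
    """Deviation eps such that |mean_n - mu| <= eps holds w.p. >= 1 - delta."""
    if n <= 0 or not (0 < delta < 1) or b <= a:
        raise ValueError("need n > 0, 0 < delta < 1, and a < b")
    return (b - a) * math.sqrt(math.log(2 / delta) / (2 * n))

# Fair die, values in [1, 6]: quadrupling n halves the radius.
for n in (100, 400, 1600):
    print(n, round(hoeffding_radius(n, delta=0.05, a=1, b=6), 3))
```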

  15. Introduction to Machine Learning CMU-10701 Stochastic Convergence and Tail Bounds Barnabás Póczos
