Lecture 7: Probability Review (cont’d.), Maximum Likelihood Estimation (MLE)

Aykut Erdem, November 2018, Hacettepe University


  1. Lecture 7:
     − Probability Review (cont’d.)
     − Maximum Likelihood Estimation (MLE)
     Aykut Erdem, November 2018, Hacettepe University

  2. Administrative
     • Assignment 2 will be out tonight
       − It is due November 24 (i.e. in 2 weeks)
       − You will implement a Naive Bayes classifier for fake news detection

  3. Administrative
     • Project proposal due November 16
     • A half-page description:
       − problem to be investigated,
       − why it is interesting,
       − what data you will use,
       − related work.

  4. Deadlines in the syllabus are closer than they appear

  5. Today
     • Probabilities
       - Dependence, Independence, Conditional Independence
     • Parameter estimation
       - Maximum Likelihood Estimation (MLE)
       - Maximum a Posteriori (MAP)

  6. Last time… Sample space
     Def: A sample space Ω is the set of all possible outcomes of a (conceptual or physical) random experiment. (Ω can be finite or infinite.)
     Examples:
     • Ω may be the set of all possible outcomes of a dice roll (1,2,3,4,5,6)
     • Pages of a book opened randomly (1-157)
     • Real numbers for temperature, location, time, etc.
     slide by Barnabás Póczos & Alex Smola

  7. Last time… Events
     We will ask the question: What is the probability of a particular event?
     Def: Event A is a subset of the sample space Ω
     Examples: What is the probability of
     - the book being open at an odd-numbered page
     - rolling a number less than 4 on a die
     - a random person’s height X satisfying a < X < b

  8. Last time… Probability
     Def: Probability P(A), the probability that event (subset) A happens, is a function that maps the event A onto the interval [0, 1]. P(A) is also called the probability measure of A.
     Example: What is the probability that the number on the dice is 2 or 4?
     [Figure: the sample space split into the outcomes in which A is true (2, 4) and the outcomes in which A is false (1, 3, 5, 6). P(A) is the volume of the area.]
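
The dice example can be checked by counting outcomes. The sketch below assumes a fair six-sided die, so every outcome is equally likely; the names `omega` and `prob` are illustrative and not taken from the slides.

```python
from fractions import Fraction

# Sample space of one roll of a six-sided die.
# Assumption: the die is fair, so all six outcomes are equally likely.
omega = {1, 2, 3, 4, 5, 6}


def prob(event, sample_space=omega):
    """P(event) under equally likely outcomes: |event ∩ Ω| / |Ω|."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4}              # event: "the number on the die is 2 or 4"
print(prob(A))          # 1/3
print(prob(omega - A))  # 2/3, the outcomes in which A is false
```

Under the fairness assumption, P(A) and the probability of its complement sum to 1, as expected of a probability measure.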

  9. Last time… Kolmogorov Axioms and Consequences
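
The formulas of this slide are not part of the transcript. For reference, the usual textbook statement of the axioms and a few standard consequences is given below; the original slide may have listed them differently.

```latex
\begin{align*}
&\text{(1) } P(A) \ge 0 \text{ for every event } A \subseteq \Omega \\
&\text{(2) } P(\Omega) = 1 \\
&\text{(3) } P\Big(\bigcup_{i} A_i\Big) = \sum_{i} P(A_i)
   \text{ for pairwise disjoint events } A_1, A_2, \dots \\[4pt]
&\text{Consequences: } P(\emptyset) = 0, \qquad P(A^{c}) = 1 - P(A), \qquad
   P(A \cup B) = P(A) + P(B) - P(A \cap B)
\end{align*}
```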

  10. Last time… Venn Diagram
      [Figure: Venn diagram of two overlapping events A and B]
      P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

  11. Last time… Random Variables
      Def: A real-valued random variable is a function of the outcome of a randomized experiment.
      Discrete random variable examples (Ω is discrete):
      • X(ω) = True if a randomly drawn person (ω) from our class (Ω) is female
      • X(ω) = The hometown of a randomly drawn person (ω) from our class (Ω)

  12. Last time… Discrete Distributions
      • Bernoulli distribution: Ber(p)
      • Binomial distribution: Bin(n,p)
        Suppose a coin with head prob. p is tossed n times. What is the probability of getting k heads and n-k tails?
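
The slide's formulas are not in the transcript. A minimal sketch of the standard probability mass functions follows; the function names are illustrative, and the tosses are assumed to be independent with the same head probability p.

```python
from math import comb  # available in Python 3.8+


def bernoulli_pmf(x, p):
    """P(X = x) for X ~ Ber(p), with x in {0, 1} (1 = head)."""
    return p if x == 1 else 1 - p


def binomial_pmf(k, n, p):
    """P(k heads in n independent tosses) for Bin(n, p):
    C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 3 heads in 5 tosses of a fair coin.
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```

The factor C(n, k) counts the orderings in which the k heads can occur; each such ordering has probability p^k (1 - p)^(n - k).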

  16. Last time… Conditional Probability
      P(X|Y) = fraction of worlds in which event X is true given that event Y is true.
      [Figure: Venn diagram of X, Y and X ∩ Y]
      Table of joint probabilities (columns as extracted: No Flu, Flu):
      Headache:    1/80, 7/80
      No Headache: 1/80, 71/80
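
As a small illustration of the "fraction of worlds" reading, the sketch below computes a conditional probability by counting. It uses a fair die rather than the flu table, since the table's layout is ambiguous in the transcript; the events and helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Assumption: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(x, y):
    """P(X | Y) = P(X ∩ Y) / P(Y); undefined when P(Y) = 0."""
    if prob(y) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(x & y) / prob(y)


X = {2, 4}     # "the number is 2 or 4"
Y = {2, 4, 6}  # "the number is even"

print(prob(X))          # 1/3
print(cond_prob(X, Y))  # 2/3: among the worlds where Y holds, X holds in two of three
```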

  18. Independence
      Independent random variables: Y and X don’t contain information about each other.
      Observing Y doesn’t help in predicting X. Observing X doesn’t help in predicting Y.
      Examples:
      Independent: Winning on roulette this week and next week.
      Dependent: Russian roulette
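
"Don't contain information about each other" corresponds to the joint distribution factorizing as P(X = x, Y = y) = P(X = x) P(Y = y) for all x, y. The sketch below checks that condition on two small joint tables; both tables are hypothetical examples, not data from the lecture.

```python
from fractions import Fraction
from itertools import product


def marginals(joint):
    """Marginal distributions of X and Y from a joint table {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py


def is_independent(joint):
    """True iff P(x, y) == P(x) * P(y) for every pair (x, y)."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(px, py))


# Hypothetical example 1: two separate flips of a fair coin.
two_flips = {(x, y): Fraction(1, 4) for x, y in product("HT", repeat=2)}

# Hypothetical example 2: Y is always a copy of X.
copied = {("H", "H"): Fraction(1, 2), ("T", "T"): Fraction(1, 2)}

print(is_independent(two_flips))  # True
print(is_independent(copied))     # False
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.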

  19. Dependent / Independent
      [Figure: two plots of Y against X, one labeled "Independent X,Y" and one labeled "Dependent X,Y"]

  20. Conditionally Independent
      Conditionally independent: Knowing Z makes X and Y independent
      Examples:
      Dependent: shoe size of children and reading skills
      Conditionally independent: shoe size of children and reading skills given age
      Storks deliver babies: A highly statistically significant correlation exists between stork populations and human birth rates across Europe.

  21. Conditionally Independent
      • London taxi drivers: A survey has pointed out a positive and significant correlation between the number of accidents and wearing coats. They concluded that coats could hinder movements of drivers and be the cause of accidents. A new law was prepared to prohibit drivers from wearing coats when driving. Finally, another study pointed out that people wear coats when it rains…

  22. Correlation ≠ Causation
      The number of people who drowned by falling into a swimming pool correlates with the number of films Nicolas Cage appeared in.
      Correlation: 0.666004

  23. Conditional Independence
      Formally: X is conditionally independent of Y given Z. (The formal definition and its equivalent form are equations on the original slide.)
      Note: this does NOT mean Thunder is independent of Rain. But given Lightning, knowing Rain doesn’t give more info about Thunder.
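
The equations of this slide are missing from the transcript. The standard definition, written in the same form as the phone example later in the lecture, is sketched below; the Thunder/Rain/Lightning line is an assumed reconstruction of the example the note refers to.

```latex
\begin{align*}
X \perp Y \mid Z
  \;&\iff\; P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)
     \quad \text{for all } x, y, z \text{ with } P(Y = y, Z = z) > 0 \\
  \;&\iff\; P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)\, P(Y = y \mid Z = z)
     \quad \text{for all } x, y, z \text{ with } P(Z = z) > 0 \\[4pt]
\text{Example: }\; & P(\text{Thunder} \mid \text{Rain}, \text{Lightning})
     = P(\text{Thunder} \mid \text{Lightning})
\end{align*}
```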

  26. Conditional vs. Marginal Independence
      • C calls A and B separately and tells them a number n ∈ {1,...,10}
      • Due to noise in the phone, A and B each imperfectly (and independently) draw a conclusion about what the number was.
      • A thinks the number was n_a and B thinks it was n_b.
      • Are n_a and n_b marginally independent?
        - No, we expect e.g. P(n_a = 1 | n_b = 1) > P(n_a = 1)
      • Are n_a and n_b conditionally independent given n?
        - Yes, because if we know the true number, the outcomes n_a and n_b are purely determined by the noise in each phone.
          P(n_a = 1 | n_b = 1, n = 2) = P(n_a = 1 | n = 2)
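
The two claims of the phone example can be verified exactly on a small model. The sketch below assumes that n is uniform on {1,...,10} and that each listener hears the right number with probability 4/5 and otherwise any of the other nine numbers with equal probability; this noise model and all names are assumptions made for illustration, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

NUMBERS = range(1, 11)
P_CORRECT = Fraction(4, 5)  # assumed probability of hearing the number correctly


def hear(heard, n):
    """P(listener concludes `heard` | true number is n) under the assumed noise model."""
    return P_CORRECT if heard == n else (1 - P_CORRECT) / 9


# Joint distribution: P(n, n_a, n_b) = P(n) * P(n_a | n) * P(n_b | n).
joint = {
    (n, a, b): Fraction(1, 10) * hear(a, n) * hear(b, n)
    for n, a, b in product(NUMBERS, repeat=3)
}


def prob(pred):
    """Probability of the event described by pred(n, n_a, n_b)."""
    return sum(p for (n, a, b), p in joint.items() if pred(n, a, b))


# Marginal dependence: P(n_a = 1 | n_b = 1) versus P(n_a = 1).
p_a1 = prob(lambda n, a, b: a == 1)
p_a1_given_b1 = prob(lambda n, a, b: a == 1 and b == 1) / prob(lambda n, a, b: b == 1)

# Conditional independence given n: P(n_a = 1 | n_b = 1, n = 2) versus P(n_a = 1 | n = 2).
p_a1_given_b1_n2 = (
    prob(lambda n, a, b: a == 1 and b == 1 and n == 2)
    / prob(lambda n, a, b: b == 1 and n == 2)
)
p_a1_given_n2 = prob(lambda n, a, b: a == 1 and n == 2) / prob(lambda n, a, b: n == 2)

print(p_a1, p_a1_given_b1)              # 1/10 29/45
print(p_a1_given_b1_n2, p_a1_given_n2)  # 1/45 1/45
```

Under this model, learning that B heard 1 raises the probability that A heard 1 from 1/10 to 29/45, because it makes n = 1 more likely. Once n is known, B's conclusion carries no further information about A's.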

  27. Parameter estimation: MLE, MAP
      Estimating Probabilities

  28. Flipping a Coin
      I have a coin. If I flip it, what’s the probability that it will fall with the head up?
      Let us flip it a few times to estimate the probability.
      The estimated probability is: 3/5 (“Frequency of heads”)

  32. Flipping a Coin
      The estimated probability is: 3/5 (“Frequency of heads”)
      Questions:
      (1) Why frequency of heads???
      (2) How good is this estimation???
      (3) Why is this a machine learning problem???
      We are going to answer these questions

  33. Question (1) Why frequency of heads???
      • Frequency of heads is exactly the maximum likelihood estimator for this problem
      • MLE has nice properties (interpretation, statistical guarantees, simple)
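
The claim that the frequency of heads is the maximum likelihood estimator can be checked numerically. The sketch below assumes independent flips with a common head probability θ and uses a hypothetical sequence with 3 heads in 5 flips, matching the 3/5 estimate; the actual order of the flips on the slides is not in the transcript.

```python
import math

# Hypothetical flip sequence with 3 heads and 2 tails (order is an assumption).
flips = "HTHHT"


def log_likelihood(theta, flips):
    """Log-likelihood of i.i.d. Bernoulli(theta) flips: k*log(theta) + (n-k)*log(1-theta)."""
    heads = flips.count("H")
    tails = len(flips) - heads
    return heads * math.log(theta) + tails * math.log(1 - theta)


# Crude grid search over theta in (0, 1); enough to locate the maximizer.
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, flips))

print(theta_hat)                       # 0.6
print(flips.count("H") / len(flips))   # 0.6, the frequency of heads
```

The grid search is only a sanity check; the maximizer can also be found in closed form by setting the derivative of the log-likelihood to zero.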

  34. Maximum Likelihood Estimation
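
For the coin example, the estimator can be derived in a few lines. This is the standard derivation, assuming n independent flips x_1, ..., x_n with x_i = 1 for a head and a common head probability θ; the notation is an assumption and may differ from the slides that follow in the lecture.

```latex
\begin{align*}
L(\theta) &= \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}
           = \theta^{k} (1-\theta)^{n-k},
           \qquad k = \sum_{i=1}^{n} x_i \\
\log L(\theta) &= k \log\theta + (n-k)\log(1-\theta) \\
\frac{d}{d\theta}\log L(\theta) &= \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0
  \;\Longrightarrow\; \hat{\theta}_{\mathrm{MLE}} = \frac{k}{n}
\end{align*}
```

With k = 3 heads in n = 5 flips this gives 3/5, the frequency of heads.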
