Probability Review CMSC 473/673 UMBC Some slides adapted from 3SLP, Jason Eisner
Probability Prerequisites: basic probability axioms and definitions, joint probability, probabilistic independence, marginal probability, definition of conditional probability, Bayes rule, probability chain rule, and the expected value (of a function) of a random variable.
Interpretations of Probability. Past performance: 58% of the past 100 flips were heads. Hypothetical performance: if I flipped the coin in many parallel universes… Subjective strength of belief: would pay up to 58 cents for a chance to win $1. Output of some computable formula? p(heads) vs. q(heads).
(Most) Probability Axioms. p(everything) = 1. p(nothing) = p(∅) = 0 (the empty set). p(A) ≤ p(B) when A ⊆ B. p(A ∪ B) = p(A) + p(B) when A ∩ B = ∅; otherwise p(A ∪ B) ≠ p(A) + p(B). In general, p(A ∪ B) = p(A) + p(B) − p(A ∩ B).
Examining p(everything) = 1. If p(everything) = 1, and you can break everything into N unique items y₁, y₂, …, y_N, then each pair y_j and y_k is disjoint (y_j ∩ y_k = ∅), and because everything is the union of y₁, y₂, …, y_N:

p(everything) = Σ_{j=1}^{N} p(y_j) = 1
A Very Important Concept to Remember: the probabilities of all unique (disjoint) items y₁, y₂, …, y_N must sum to 1:

p(everything) = Σ_{j=1}^{N} p(y_j) = 1
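To see the sum-to-one constraint concretely, here is a minimal Python sketch (the counts are made up for illustration): normalizing raw counts over disjoint outcomes gives a distribution whose values sum to 1.

# Minimal sketch (made-up counts): normalize counts over disjoint outcomes
# into a probability distribution whose values sum to 1.
counts = {"heads": 58, "tails": 42}

total = sum(counts.values())
p = {outcome: c / total for outcome, c in counts.items()}

print(p)                # {'heads': 0.58, 'tails': 0.42}
print(sum(p.values()))  # ~1.0 -- p(everything) = sum_j p(y_j) = 1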
Probabilities and Random Variables. Random variables: variables that represent the possible outcomes of some random “process”. Example #1: a (weighted) coin that can come up heads or tails; X is a random variable denoting the possible outcomes, X = HEADS or X = TAILS.
Distribution Notation. If X is a random variable and G is a distribution: X ∼ G means X is distributed according to (“sampled from”) G. G often has parameters θ = (θ₁, θ₂, …, θ_N) that govern its “shape”; formally written as X ∼ G(θ). If X₁, X₂, …, X_N are all independently sampled from G(θ), they are independently and identically distributed (i.i.d.).
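A small sketch of this notation in code, assuming (purely for illustration) that G is a categorical distribution over coin outcomes with parameters θ = (0.58, 0.42): each independent call draws one sample, and together the draws are i.i.d.

import random

# Hypothetical example: G is a categorical distribution over coin outcomes
# with parameters theta = (0.58, 0.42). Each call draws one sample;
# repeated independent calls give X_1, ..., X_N i.i.d. ~ G(theta).
outcomes = ["HEADS", "TAILS"]
theta = [0.58, 0.42]

samples = [random.choices(outcomes, weights=theta, k=1)[0] for _ in range(10)]
print(samples)  # e.g. ['HEADS', 'TAILS', 'HEADS', ...]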
Probability Prerequisites (next: joint probability).
Joint Probability: the probability that multiple things “happen together”: p(x, y), p(x, y, z), p(x, y, w, z). Symmetric: p(x, y) = p(y, x). Form a table based on outcomes; the sum across cells = 1:

p(x, y)      Y=0   Y=1
X="cat"      .04   .32
X="dog"      .2    .04
X="bird"     .1    .1
X="human"    .1    .1
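A quick sketch of that table as a Python dictionary keyed by (x, y) pairs, just to confirm the cells sum to 1:

# The joint table from the slide, keyed by (x, y) pairs.
p_xy = {
    ("cat",   0): 0.04, ("cat",   1): 0.32,
    ("dog",   0): 0.20, ("dog",   1): 0.04,
    ("bird",  0): 0.10, ("bird",  1): 0.10,
    ("human", 0): 0.10, ("human", 1): 0.10,
}

# p(x, y) = p(y, x): listing the variables in a different order does not
# change the probability of the joint event.
print(sum(p_xy.values()))  # ~1.0 -- the cells of the joint table sum to 1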
Joint Probabilities: what happens as we add conjuncts? On the scale from 0 to 1, p(A) ≥ p(A, B) ≥ p(A, B, C) ≥ p(A, B, C, D) ≥ p(A, B, C, D, E): each added conjunct can only keep the probability the same or push it lower.
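A small numeric illustration, assuming (purely for the example) that the conjuncts are independent flips of the 58%-heads coin from earlier: each added conjunct multiplies in a factor of at most 1, so the joint probability can only shrink or stay the same.

# Assumed setup: independent flips of a 58%-heads coin. Each extra conjunct
# ("and the next flip is also heads") multiplies in a factor <= 1.
p_heads = 0.58
for n in range(1, 6):
    print(n, "heads in a row:", p_heads ** n)
# 1 heads in a row: 0.58
# 2 heads in a row: ~0.336
# ...
# 5 heads in a row: ~0.066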
A Note on Notation. p(X ∪ Y) is p(X INCLUSIVE-OR Y); p(X, Y) is p(X AND Y). p(X, Y) = p(Y, X), except when order matters (this should be obvious from context).
Probability Prerequisites (next: probabilistic independence).
Probabilistic Independence: independence is when events can occur and not impact the probability of other events. Formally: p(x, y) = p(x) · p(y). Generalizable to more than 2 random variables. Q: Are the results of flipping the same coin twice in succession independent? A: Yes (assuming no weird effects).
Q: Are the A and B from the earlier Venn diagram independent? A: No (work it out from p(A, B) and the axioms).
Q: Are X and Y independent in the joint table below? A: No (find the marginal probabilities p(x) and p(y) and compare; see the sketch after the table).

p(x, y)      Y=0   Y=1
X="cat"      .04   .32
X="dog"      .2    .04
X="bird"     .1    .1
X="human"    .1    .1
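A sketch of that check in Python, reusing the joint table from above (redefined here so the snippet stands alone): compute the marginals by summing out the other variable, then compare each cell against the product p(x) · p(y).

from collections import defaultdict

# Joint table from the slide, keyed by (x, y).
p_xy = {("cat", 0): 0.04, ("cat", 1): 0.32, ("dog", 0): 0.20, ("dog", 1): 0.04,
        ("bird", 0): 0.10, ("bird", 1): 0.10, ("human", 0): 0.10, ("human", 1): 0.10}

# Marginals (previewed on the next slides): sum the joint over the other variable.
p_x, p_y = defaultdict(float), defaultdict(float)
for (x, y), p in p_xy.items():
    p_x[x] += p
    p_y[y] += p

# Independence requires p(x, y) = p(x) * p(y) for every cell.
print(all(abs(p - p_x[x] * p_y[y]) < 1e-9 for (x, y), p in p_xy.items()))
# False: e.g. p(cat, Y=1) = 0.32, but p(cat) * p(Y=1) = 0.36 * 0.56 ≈ 0.20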
Probability Prerequisites (next: marginal probability).
Marginal(ized) Probability: The Discrete Case. (Figure: the event y is carved into the disjoint pieces x₁ & y, x₂ & y, x₃ & y, x₄ & y.) Consider the mutually exclusive ways that different values of x could occur with y. Q: How do we write this in terms of joint probabilities? A:

p(y) = Σ_x p(x, y)
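A minimal sketch of that sum, with made-up numbers standing in for the figure's four disjoint pieces x_i & y:

# Made-up joint probabilities for the figure's disjoint pieces x_i & y.
p_x_and_y = {"x1": 0.05, "x2": 0.15, "x3": 0.20, "x4": 0.10}

# Marginalizing out x: p(y) = sum_x p(x, y).
p_y = sum(p_x_and_y.values())
print(p_y)  # 0.5 -- the probability of y, however it co-occurred with x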
Probability Prerequisites (next: conditional probability).
Conditional Probability:

p(X | Y) = p(X, Y) / p(Y)

where p(Y) is the marginal probability of Y: p(Y) = Σ_x p(X = x, Y). Conditional probabilities are probabilities.
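A sketch of the definition applied to the joint table from the earlier slides (redefined here): dividing the Y = 1 cells by the marginal p(Y = 1) gives a conditional distribution over x that itself sums to 1.

# Joint table from the earlier slides, keyed by (x, y).
p_xy = {("cat", 0): 0.04, ("cat", 1): 0.32, ("dog", 0): 0.20, ("dog", 1): 0.04,
        ("bird", 0): 0.10, ("bird", 1): 0.10, ("human", 0): 0.10, ("human", 1): 0.10}

# p(x | Y=1) = p(x, Y=1) / p(Y=1), where p(Y=1) is a marginal probability.
p_y1 = sum(p for (x, y), p in p_xy.items() if y == 1)             # ~0.56
p_x_given_y1 = {x: p / p_y1 for (x, y), p in p_xy.items() if y == 1}

print(p_x_given_y1)                # {'cat': ~0.571, 'dog': ~0.071, 'bird': ~0.179, 'human': ~0.179}
print(sum(p_x_given_y1.values()))  # ~1.0: conditional probabilities are probabilities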
Revisiting Marginal Probability: The Discrete Case. (Same figure: y split into x₁ & y, …, x₄ & y.)

p(y) = Σ_x p(x, y) = Σ_x p(x) p(y | x)
Probability Prerequisites (next: Bayes rule).
Deriving Bayes Rule. Start with the conditional: p(X | Y) = p(X, Y) / p(Y). Solve for the joint: p(X, Y) = p(X | Y) p(Y). Since p(x, y) = p(y, x), the joint can equally be written p(Y | X) p(X), so

p(X | Y) = p(Y | X) · p(X) / p(Y)
Bayes Rule:

p(X | Y) = p(Y | X) · p(X) / p(Y)

where p(X | Y) is the posterior probability, p(Y | X) the likelihood, p(X) the prior probability, and p(Y) the marginal likelihood (probability).
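A sketch of Bayes rule on a hypothetical scenario (the setup and numbers are assumptions for illustration): a coin is either fair or the 58%-heads weighted coin, each with prior probability 0.5, and we observe a single HEADS.

# Hypothetical: two possible coins with equal priors; observe one HEADS.
prior = {"fair": 0.5, "weighted": 0.5}                 # p(coin)
likelihood_heads = {"fair": 0.5, "weighted": 0.58}     # p(HEADS | coin)

# Marginal likelihood p(HEADS) = sum over coins of p(HEADS | coin) * p(coin).
marginal = sum(prior[c] * likelihood_heads[c] for c in prior)

# Posterior p(coin | HEADS) = p(HEADS | coin) * p(coin) / p(HEADS).
posterior = {c: likelihood_heads[c] * prior[c] / marginal for c in prior}
print(posterior)  # {'fair': ~0.463, 'weighted': ~0.537}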