A Few Useful Formulas

• For any events A and B:
  – P(AB) = P(BA)   (Commutativity)
  – P(AB) = P(A | B) P(B) = P(B | A) P(A)   (Chain rule)
  – P(AB^c) = P(A) – P(AB)   (Intersection)
  – P(AB) ≥ P(A) + P(B) – 1   (Bonferroni)

Generality of Conditional Probability

• For any events A, B, and E, you can condition consistently on E, and the formulas above still hold:
  – P(AB | E) = P(BA | E)
  – P(AB | E) = P(A | BE) P(B | E) = P(B | AE) P(A | E)
  – P(A | BE) = P(B | AE) P(A | E) / P(B | E)   (Bayes Thm.)
• Can think of E as "everything you already know"
• Formally, P(· | E) satisfies the 3 axioms of probability

Dissecting Bayes Theorem

• Recall Bayes Theorem (common form):

  P(H | E) = P(E | H) P(H) / P(E)

  where P(H | E) is the "posterior", P(E | H) is the "likelihood", and P(H) is the "prior"
• Odds(H | E):

  P(H | E) / P(H^c | E) = [P(E | H) P(H)] / [P(E | H^c) P(H^c)]

  – Shows how the odds of H change when evidence E is observed
  – Note that P(E) cancels out in the odds formulation
• This is a form of probabilistic inference (a numeric sketch appears after this section)

It Always Comes Back to Dice

• Roll two 6-sided dice, yielding values D_1 and D_2
  – Let E be the event: D_1 = 1
  – Let F be the event: D_2 = 1
• What are P(E), P(F), and P(EF)?
  – P(E) = 1/6, P(F) = 1/6, P(EF) = 1/36
  – P(EF) = P(E) P(F), so E and F are independent
• Let G be the event: D_1 + D_2 = 5, i.e., {(1, 4), (2, 3), (3, 2), (4, 1)}
• What are P(E), P(G), and P(EG)?
  – P(E) = 1/6, P(G) = 4/36 = 1/9, P(EG) = 1/36
  – P(EG) ≠ P(E) P(G), so E and G are dependent (checked by brute force in the second sketch below)

Independence

• Two events E and F are called independent if:

  P(EF) = P(E) P(F)

  or, equivalently: P(E | F) = P(E)
• Otherwise, they are called dependent events
• Three events E, F, and G are independent if:
  P(EFG) = P(E) P(F) P(G), and
  P(EF) = P(E) P(F), and
  P(EG) = P(E) P(G), and
  P(FG) = P(F) P(G)
• Intuitively, if E and F are independent, knowing whether F holds gives us no information about E

Let's Do a Proof

• Given independent events E and F, prove: P(E | F) = P(E | F^c)
• Proof:

  P(EF^c) = P(E) – P(EF) = P(E) – P(E) P(F) = P(E) [1 – P(F)] = P(E) P(F^c)

  So E and F^c are independent, implying that P(E | F^c) = P(E) = P(E | F)
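To make the odds formulation concrete, here is a minimal Python sketch (my addition, not from the slides; the prior and likelihood values are arbitrary assumptions chosen for illustration). It confirms that multiplying the prior odds by the likelihood ratio yields the same posterior as the common form, with P(E) never computed on the odds path.

```python
from fractions import Fraction

# Hypothetical numbers (assumptions, not from the slides).
p_H = Fraction(1, 100)           # prior P(H)
p_E_given_H = Fraction(9, 10)    # likelihood P(E | H)
p_E_given_Hc = Fraction(1, 10)   # P(E | H^c)

# Common form: P(H | E) = P(E | H) P(H) / P(E), with P(E) by total probability.
p_E = p_E_given_H * p_H + p_E_given_Hc * (1 - p_H)
posterior = p_E_given_H * p_H / p_E

# Odds form: posterior odds = likelihood ratio * prior odds; P(E) never appears.
posterior_odds = (p_E_given_H / p_E_given_Hc) * (p_H / (1 - p_H))

print(posterior)                              # 1/12
print(posterior_odds / (1 + posterior_odds))  # 1/12, same value
```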

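And a brute-force check of the dice claims above, enumerating all 36 equally likely outcomes (a sketch; the helper `prob` and the event predicates are names I introduce here):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two 6-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

E = lambda o: o[0] == 1    # D_1 = 1
F = lambda o: o[1] == 1    # D_2 = 1
G = lambda o: sum(o) == 5  # D_1 + D_2 = 5

print(prob(lambda o: E(o) and F(o)) == prob(E) * prob(F))  # True:  independent
print(prob(lambda o: E(o) and G(o)) == prob(E) * prob(G))  # False: dependent
```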
Generalized Independence

• General definition of independence:
  Events E_1, E_2, ..., E_n are independent if for every subset E_1', E_2', ..., E_r' (where r ≤ n) it holds that:

  P(E_1' E_2' ... E_r') = P(E_1') P(E_2') ... P(E_r')

• Example: the outcomes of n separate flips of a coin are all independent of one another
  – Each flip in this case is called a "trial" of the experiment

Two Dice

• Roll two 6-sided dice, yielding values D_1 and D_2
  – Let E be the event: D_1 = 1
  – Let F be the event: D_2 = 6
  – Are E and F independent? Yes!
    P(E) = 1/6, P(F) = 1/6, P(EF) = 1/36 [roll (1, 6)]
• Let G be the event: D_1 + D_2 = 7
  – Are E and G independent? Yes!
    P(E) = 1/6, P(G) = 1/6, P(EG) = 1/36 [roll (1, 6)]
  – Are F and G independent? Yes!
    P(F) = 1/6, P(G) = 1/6, P(FG) = 1/36 [roll (1, 6)]
  – Are E, F, and G independent? No!
    P(EFG) = 1/36 ≠ 1/216 = (1/6)(1/6)(1/6)

Coin Flips

• Say a coin comes up heads with probability p
  – Each coin flip is an independent trial
• P(n heads on n coin flips) = p^n
• P(n tails on n coin flips) = (1 – p)^n
• P(first k heads, then n – k tails) = p^k (1 – p)^(n–k)
• P(exactly k heads on n coin flips) = C(n, k) p^k (1 – p)^(n–k)   (evaluated in a sketch after this section)

Generating Random Bits

• A computer produces a series of random bits, with probability p of producing a 1
  – Each bit generated is an independent trial
  – E = first n bits are 1's, followed by a 0
  – What is P(E)?
• Solution
  – P(first n 1's) = P(1st bit = 1) P(2nd bit = 1) ... P(nth bit = 1) = p^n
  – P((n+1)st bit = 0) = 1 – p
  – P(E) = P(first n 1's) P((n+1)st bit = 0) = p^n (1 – p)

Hash Tables

• m strings are hashed (equally randomly) into a hash table with n buckets
  – Each string hashed is an independent trial
  – E = at least one string hashed to the first bucket
  – What is P(E)?
• Solution
  – F_i = string i not hashed into the first bucket (where 1 ≤ i ≤ m)
  – P(F_i) = 1 – 1/n = (n – 1)/n   (for all 1 ≤ i ≤ m)
  – Event F_1 F_2 ... F_m = no strings hashed to the first bucket
  – P(E) = 1 – P(F_1 F_2 ... F_m) = 1 – P(F_1) P(F_2) ... P(F_m) = 1 – ((n – 1)/n)^m
  – Similar to ≥ 1 of m people having the same birthday as you (this formula is checked by simulation in a sketch below)

Yet More Hash Table Fun

• m strings are hashed (unequally) into a hash table with n buckets
  – Each string hashed is an independent trial, with probability p_i of getting hashed to bucket i
  – E = at least 1 of buckets 1 to k has ≥ 1 string hashed to it
  – What is P(E)?
• Solution
  – F_i = at least one string hashed into the i-th bucket
  – P(E) = P(F_1 ∪ F_2 ∪ ... ∪ F_k) = 1 – P((F_1 ∪ F_2 ∪ ... ∪ F_k)^c)
         = 1 – P(F_1^c F_2^c ... F_k^c)   (DeMorgan's Law)
  – P(F_1^c F_2^c ... F_k^c) = P(no strings hashed to buckets 1 to k) = (1 – p_1 – p_2 – ... – p_k)^m
  – P(E) = 1 – (1 – p_1 – p_2 – ... – p_k)^m
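A small sketch of the exactly-k-heads formula above (my addition; the function name and the n, k, p values are illustrative assumptions). Summing over all k is a quick sanity check that the probabilities form a distribution.

```python
from math import comb

def p_exactly_k_heads(n: int, k: int, p: float) -> float:
    """P(exactly k heads on n independent flips) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
# The n+1 possible head counts partition the sample space, so these sum to 1.
assert abs(sum(p_exactly_k_heads(n, k, p) for k in range(n + 1)) - 1) < 1e-12
print(p_exactly_k_heads(n, 3, p))  # P(exactly 3 heads on 10 flips, p = 0.3)
```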

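And a sketch comparing the first-bucket closed form 1 – ((n – 1)/n)^m against a Monte Carlo estimate (my addition; the function names, m, n, and trial count are illustrative choices):

```python
import random

def p_first_bucket_hit(m: int, n: int) -> float:
    """Closed form from above: P(>= 1 of m strings lands in bucket 0), n buckets."""
    return 1 - ((n - 1) / n) ** m

def estimate(m: int, n: int, trials: int = 100_000) -> float:
    """Monte Carlo: hash m strings uniformly into n buckets, count bucket-0 hits."""
    hits = sum(
        any(random.randrange(n) == 0 for _ in range(m))
        for _ in range(trials)
    )
    return hits / trials

m, n = 20, 10
print(p_first_bucket_hit(m, n), estimate(m, n))  # both close to 0.878
```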
No, Really, It's More Hash Table Fun

• m strings are hashed (unequally) into a hash table with n buckets
  – Each string hashed is an independent trial, with probability p_i of getting hashed to bucket i
  – E = each of buckets 1 to k has ≥ 1 string hashed to it
• Solution (implemented in a sketch after this section)
  – F_i = at least one string hashed into the i-th bucket
  – P(E) = P(F_1 F_2 ... F_k) = 1 – P((F_1 F_2 ... F_k)^c)
         = 1 – P(F_1^c ∪ F_2^c ∪ ... ∪ F_k^c)   (DeMorgan's Law)
  – By inclusion-exclusion over subsets of {1, ..., k}:

    P(E) = 1 – Σ_{r=1}^{k} (–1)^(r+1) Σ_{i_1 < i_2 < ... < i_r} P(F_{i_1}^c F_{i_2}^c ... F_{i_r}^c)

    where P(F_{i_1}^c F_{i_2}^c ... F_{i_r}^c) = (1 – p_{i_1} – p_{i_2} – ... – p_{i_r})^m

Sending Messages Through a Network

• Consider the following parallel network:
  [Figure: n routers with probabilities p_1, p_2, ..., p_n connected in parallel between nodes A and B]
  – n independent routers, each with probability p_i of functioning (where 1 ≤ i ≤ n)
  – E = a functional path from A to B exists. What is P(E)?
• Solution (see the sketch below):

  P(E) = 1 – P(all routers fail) = 1 – (1 – p_1)(1 – p_2)...(1 – p_n) = 1 – Π_{i=1}^{n} (1 – p_i)

Reminder of Geometric Series

• Geometric series: Σ_{i=0}^{n} x^i = x^0 + x^1 + x^2 + x^3 + ... + x^n
• From your "Calculation Reference" handout:

  Σ_{i=0}^{n} x^i = (1 – x^{n+1}) / (1 – x)

• As n → ∞ with |x| < 1, the term x^{n+1} → 0, so:

  Σ_{i=0}^{∞} x^i = 1 / (1 – x)

Simplified Craps

• Two 6-sided dice are repeatedly rolled (each roll is an independent trial)
  – E = a 5 is rolled before a 7 is rolled
  – What is P(E)?
• Solution (checked by simulation below)
  – F_n = no 5 or 7 rolled in the first n – 1 trials, and a 5 rolled on the n-th trial
  – P(5 on any trial) = 4/36, P(7 on any trial) = 6/36
  – P(F_n) = (1 – 10/36)^(n–1) (4/36) = (26/36)^(n–1) (4/36)
  – Since the F_n are mutually exclusive:

    P(E) = P(∪_{n=1}^{∞} F_n) = Σ_{n=1}^{∞} P(F_n) = Σ_{n=1}^{∞} (26/36)^(n–1) (4/36)
         = (4/36) · 1 / (1 – 26/36) = (4/36)(36/10) = 2/5

DNA Paternity Testing

• A child is born with the (A, a) gene pair (event B_{A,a})
  – The mother has the (A, A) gene pair
  – Two possible fathers: M_1: (a, a) and M_2: (a, A)
  – P(M_1) = p, P(M_2) = 1 – p
  – What is P(M_1 | B_{A,a})?
• Solution (a sketch follows at the end)
  – P(M_1 | B_{A,a}) = P(M_1 B_{A,a}) / P(B_{A,a})
    = P(B_{A,a} | M_1) P(M_1) / [P(B_{A,a} | M_1) P(M_1) + P(B_{A,a} | M_2) P(M_2)]
    = (1 · p) / (1 · p + (1/2)(1 – p)) = 2p / (1 + p)
  – Since 2p / (1 + p) > p for 0 < p < 1, M_1 is more likely to be the father than he was before: P(M_1 | B_{A,a}) > P(M_1)
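The inclusion-exclusion sum above is easy to get wrong by hand, so here is a direct Python transcription (my addition; the function name and the example distribution `ps` are illustrative assumptions):

```python
from itertools import combinations

def p_all_buckets_hit(ps, k, m):
    """P(each of buckets 0..k-1 gets >= 1 of m strings), by inclusion-exclusion.

    ps[i] = probability a string hashes to bucket i (independent trials), using
    P(F_{i1}^c ... F_{ir}^c) = (1 - p_{i1} - ... - p_{ir})^m from above.
    """
    total = 0.0
    for r in range(1, k + 1):
        sign = (-1) ** (r + 1)
        for subset in combinations(range(k), r):
            total += sign * (1 - sum(ps[i] for i in subset)) ** m
    return 1 - total

# Hypothetical distribution over 4 buckets; require buckets 0 and 1 nonempty.
ps = [0.4, 0.3, 0.2, 0.1]
print(p_all_buckets_hit(ps, k=2, m=10))  # ~0.966
```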

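A one-function sketch of the parallel-network result (my addition; the router reliabilities are made-up values):

```python
def p_path_exists(ps):
    """P(>= 1 of n independent parallel routers works) = 1 - prod(1 - p_i)."""
    prob_all_fail = 1.0
    for p in ps:
        prob_all_fail *= 1 - p
    return 1 - prob_all_fail

print(p_path_exists([0.9, 0.8, 0.5]))  # 1 - (0.1)(0.2)(0.5) ≈ 0.99
```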

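A Monte Carlo sanity check of the simplified-craps answer (my addition; the function name and trial count are arbitrary choices). Estimates should hover around 2/5 = 0.4:

```python
import random

def five_before_seven(trials: int = 200_000) -> float:
    """Estimate P(a 5 is rolled before a 7) with repeated two-dice rolls."""
    wins = 0
    for _ in range(trials):
        while True:
            total = random.randint(1, 6) + random.randint(1, 6)
            if total == 5:   # E happens on this sequence of rolls
                wins += 1
                break
            if total == 7:   # 7 came first: E fails
                break
    return wins / trials

print(five_before_seven())  # ~0.4, matching the geometric-series calculation
```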

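Finally, a sketch of the paternity posterior (my addition; the prior p = 1/2 is an arbitrary example value). Exact fractions make the 2p / (1 + p) form easy to confirm:

```python
from fractions import Fraction

def posterior_m1(p: Fraction) -> Fraction:
    """P(M_1 | B_Aa) via Bayes Theorem.

    The mother is (A, A), so the child's 'a' must come from the father:
    an (a, a) father passes 'a' with probability 1, an (a, A) father with 1/2.
    """
    like_m1 = Fraction(1)     # P(B_Aa | M_1)
    like_m2 = Fraction(1, 2)  # P(B_Aa | M_2)
    return like_m1 * p / (like_m1 * p + like_m2 * (1 - p))

p = Fraction(1, 2)  # example prior P(M_1)
print(posterior_m1(p), 2 * p / (1 + p))  # both 2/3; posterior exceeds the prior
```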