Example II: medical imaging We need non-invasive ways to explore/diagnose patients: e.g., ultrasound, CT scan, MRI, etc. Ultrasound (very simplified!): a probe sends many sound pulses that travel into your body; they hit boundaries between tissues and get reflected; and an image is created using the times the echoes take to return to the probe. A given tissue produces specific echoes. The machine "inverted" this indirect (and incomplete!) information and dealt with the (random) instrumental errors: a statistical inverse problem. Does it do better as the instrumental errors decrease?
Outline
1 Introduction: Mathematical Statistics; Examples
2 Seeing the unseen: Coin flips; Statistical inverse problems
Coin flips: experiment I Let us simplify things and flip some coins: with your phones, go to https://albertococacabrero.wordpress.com/openday/ or google "Alberto J Coca Cambridge" and add openday/ to the end of my URL.
Coin flips: experiment I, MLE We do not know the probability p ∈ [0, 1] of this coin landing Heads (typically p = 1/2, i.e. a fair coin). How to guess p from our data? The "frequentist" guess is p̂ = #Heads / #flips. Is this a sensible guess? Yes: each coin flip has options and probabilities given by

Options:   Heads   Tails
Probabs.:  p       1 − p

If our data is Heads, Tails, Heads, it has probability or likelihood L = p(1 − p)p = p^2(1 − p); and, more generally, if we flip n coins and obtain m Heads, the likelihood of our data is L(p, m, n) = p^m (1 − p)^(n − m). Homework: p̂ = m/n maximises q ↦ L(q, m, n) over q ∈ [0, 1]! Indeed, p̂ is called the Maximum Likelihood Estimator (MLE).
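A minimal numerical check of the homework claim (a sketch, not part of the slides; the data m = 2 Heads in n = 3 flips and the grid search are purely illustrative): the maximiser of q ↦ L(q, m, n) over [0, 1] is the frequentist guess m/n.

import numpy as np

def likelihood(p, m, n):
    """Probability of observing m Heads in n independent flips in a fixed order."""
    return p**m * (1 - p)**(n - m)

m, n = 2, 3                       # data: Heads, Tails, Heads
grid = np.linspace(0, 1, 100001)  # fine grid of candidate values q in [0, 1]
q_best = grid[np.argmax(likelihood(grid, m, n))]
print(q_best)                     # ~ 0.6667, i.e. the frequentist guess m/n = 2/3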
Coin flips: experiment I, MLE The MLE p̂ = m/n enjoys mathematically desirable properties: e.g.,

Law of Large Numbers (LLN): It holds that p̂ → p as n → ∞.

How fast does Error = |p̂ − p| → 0? Mathematical results guarantee it cannot be faster than 1/√n. Note that if Error ≈ 1/n^a = n^(−a) for some a > 0, then log Error ≈ −a log n. Hence, to find the value of a, plot x = log n vs. y = log Error ≈ −ax and compute the slope. The plot suggests a = 1/2, i.e. Error ≈ 1/√n! Thus, the MLE is optimal in convergence rates! (It is optimal in other senses too, in view of, e.g., the Central Limit Theorem, but there is no time to explain this.) Other estimators are optimal too:
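The plot can be reproduced with a short simulation; the following sketch (with an assumed true value p = 0.3, sample sizes up to 10^5 and the error averaged over repeated experiments) regresses log(Error) on log(n) and should give a slope close to −1/2.

import numpy as np

rng = np.random.default_rng(0)
p = 0.3                                             # true (unknown) Heads probability
ns = np.unique(np.logspace(2, 5, 15).astype(int))   # sample sizes from 10^2 to 10^5
n_rep = 500                                         # repetitions used to average the error

errors = []
for n in ns:
    p_hats = rng.binomial(n, p, size=n_rep) / n           # MLE over many repeated experiments
    errors.append(np.sqrt(np.mean((p_hats - p) ** 2)))    # root-mean-square error

# slope of log(Error) vs. log(n); should be close to -1/2
slope = np.polyfit(np.log(ns), np.log(errors), 1)[0]
print(slope)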
Coin flips: experiment I, Bayes "Bayesian" method: before conducting the experiment, guess probabilities for the unknown p, i.e., Prob(p = q | no data) = G(q) with G given by, e.g.,

Green: I am (obviously!) respectable and the coin is probably fair;
Red: I am (absolutely) not respectable and the coin is unfair;
Blue: I would rather not guess, to avoid confrontations...

The initial guess G(q) evolves through Bayes' rule as we get the data, giving

B(q) = L(q, m, n) G(q) / ∫₀¹ L(r, m, n) G(r) dr,   q ∈ [0, 1].

The normalisation ensures ∫₀¹ B(q) dq = ∫₀¹ Prob(p = q | data) dq = 1.
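As a rough illustration (a sketch only: the three priors below are Beta-like stand-ins I chose, not the slides' exact green/red/blue curves, and the data 7 Heads in 10 flips is assumed), Bayes' rule can be computed directly on a grid:

import numpy as np

q = np.linspace(0, 1, 2001)
dq = q[1] - q[0]

def posterior(G, m, n):
    """Bayes' rule: B(q) proportional to L(q, m, n) * G(q), normalised to integrate to 1."""
    unnorm = q**m * (1 - q)**(n - m) * G
    return unnorm / (np.sum(unnorm) * dq)   # numerical version of dividing by the integral

priors = {
    "green (coin probably fair)":   q**20 * (1 - q)**20,   # peaked around 1/2
    "red (coin probably unfair)":   q**20 + (1 - q)**20,   # mass near 0 and 1
    "blue (no opinion)":            np.ones_like(q),       # flat prior
}

m, n = 7, 10   # e.g. 7 Heads in 10 flips (assumed data)
for name, G in priors.items():
    B = posterior(G, m, n)
    print(name, "posterior mean:", np.sum(q * B) * dq)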
Coin flips: experiment I, Bayes How does the data-update of G given by B look? B gives much more information than p̂! It does so without having to maximise the likelihood! Optimal!

Bernstein–von Mises Theorem (BvM): If G(p) > 0, then B(q) ≈ p̂ + (1/√n)·√(p̂(1 − p̂))·N(0, 1)(q) as n → ∞. (Here N(0, 1) is the "Bell curve".)

Indeed, the Bayesian method is extensively used in practice. However, it is much harder to analyse mathematically: no-free-lunch principle!
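A sketch of how one could check BvM numerically, under assumed values for the true p, the sample size and the prior: for large n the grid posterior should be close to the Bell curve centred at p̂ with standard deviation √(p̂(1 − p̂)/n), whatever (positive) prior we started from.

import numpy as np

rng = np.random.default_rng(1)
p_true, n = 0.3, 50000
m = rng.binomial(n, p_true)              # number of Heads in n flips
p_hat = m / n

q = np.linspace(0.25, 0.35, 4001)        # grid around p_hat
dq = q[1] - q[0]

G = q**20 * (1 - q)**20                            # a "coin probably fair" prior (assumption)
logL = m * np.log(q) + (n - m) * np.log(1 - q)     # log-likelihood, to avoid underflow
unnorm = np.exp(logL - logL.max()) * G
B = unnorm / (np.sum(unnorm) * dq)                 # grid posterior

sigma = np.sqrt(p_hat * (1 - p_hat) / n)           # BvM standard deviation
bell = np.exp(-(q - p_hat)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
print("max pointwise difference posterior vs. Bell curve:", np.max(np.abs(B - bell)))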
Coin flips: experiment II, MLE Let H = Heads = 0 and T = Tails = 1. Now we do not see the coin flips directly but "their sum" every 2 tosses. E.g., if the coin flips are H, H, T, H, H, T, T, T, the data is 0 (= 0+0), 1 (= 1+0), 1 (= 0+1), 2 (= 1+1). How to guess the probability p of the underlying coin landing Heads now? I.e., how to see the unseen? Each sum of a pair of coin flips has options and probabilities

Options:   0     1           2
Probabs.:  p^2   2p(1 − p)   (1 − p)^2

Easy guesses are given by "inverting table entries": e.g., √(#0s / #data). This cannot be optimal as it does not use all the information from the data. If n₀ = #0s, n₁ = #1s and n₂ = #2s, the likelihood of the data is L(p, n₀, n₁, n₂) = p^(2n₀) (2p(1 − p))^(n₁) (1 − p)^(2n₂). The MLE is p̂ = (1 + (n₀ − n₂)/n)/2. The MLE "inverts the table entries" optimally, enjoying the same properties as before (and more): as n → ∞, p̂ → p with Error = |p̂ − p| ≈ 1/√n.
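The closed-form MLE can be checked against a direct grid maximisation of the likelihood; the simulation below is only a sketch with assumed values p = 0.7 and 1000 observed pairs.

import numpy as np

rng = np.random.default_rng(2)
p_true, n_pairs = 0.7, 1000
flips = (rng.random((n_pairs, 2)) >= p_true).astype(int)   # H = 0 (prob p_true), T = 1
sums = flips.sum(axis=1)                                   # observed data: 0, 1 or 2
n0, n1, n2 = np.sum(sums == 0), np.sum(sums == 1), np.sum(sums == 2)

# closed-form MLE from the slides
p_hat = (1 + (n0 - n2) / n_pairs) / 2

# numerical maximisation of the likelihood on a grid
grid = np.linspace(1e-6, 1 - 1e-6, 100001)
logL = 2 * n0 * np.log(grid) + n1 * np.log(2 * grid * (1 - grid)) + 2 * n2 * np.log(1 - grid)
print(p_hat, grid[np.argmax(logL)])   # the two should agree up to the grid resolution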
Coin flips: experiment II, Bayes Now we appreciate further the superiority of the Bayesian method: with the same initial guess G(q) as before, we again take

B(q) = L(q, n₀, n₁, n₂) G(q) / ∫₀¹ L(r, n₀, n₁, n₂) G(r) dr,   q ∈ [0, 1].

B in action (p = 2/3 ≈ 67%). Again, B gives much more information than p̂! B is much easier than "inverting table entries" (maximising the likelihood)! In fact, we input the table entries and B "inverts" them automatically and optimally: a BvM Theorem holds!
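To stress that only the likelihood changes, here is a sketch of the same grid update with the experiment II likelihood, an assumed flat prior and assumed counts:

import numpy as np

q = np.linspace(0, 1, 2001)
dq = q[1] - q[0]
G = np.ones_like(q)                       # e.g. the flat ("blue") prior

n0, n1, n2 = 2, 5, 3                      # counts of observed sums 0, 1 and 2 (assumed data)
L = q**(2 * n0) * (2 * q * (1 - q))**n1 * (1 - q)**(2 * n2)
B = L * G / (np.sum(L * G) * dq)          # Bayes' rule, normalised on the grid
print("posterior mean:", np.sum(q * B) * dq)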
Coin flips: experiment III, MLE We do not see the coin flips directly, but flip a second coin with Heads-probability q_H ∈ [0, 1]: if it lands H (resp. T), we observe the "sum" of 2 (resp. 4) tosses of the first coin. E.g., if the second coin's flips are 0, 1, 0 (not observed!) and the first coin's flips are H, H, T, H, H, T, T, T, the data is 0 (= 0+0), 2 (= 1+0+0+1), 2 (= 1+1). If q_H is known, how to guess p now? I.e., how to see the unseen? Each (random) sum of coin flips has options 0, 1, ..., 4 with probabilities

p₀ = q_H p^2 + (1 − q_H) p^4,
p₁ = q_H 2p(1 − p) + (1 − q_H) 4p^3(1 − p),
...,
p₄ = (1 − q_H)(1 − p)^4.

A guess is obtained by inverting p₄, i.e. 1 − (#4s / ((1 − q_H) · #data))^(1/4). Not optimal! If n_j = #js, the likelihood of the data is L(p, n₀, ..., n₄) = ∏_{j=0}^{4} p_j(p)^(n_j). A unique maximiser of p ↦ L(p, n₀, ..., n₄) exists (the MLE) but not in closed form: "inverting table entries" explicitly is too hard! The implicit MLE still enjoys the same desirable properties.
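Since no closed form exists, the MLE can be found numerically; the sketch below maximises the log-likelihood ∑_j n_j log p_j(p) on a grid, with q_H and the counts n₀, ..., n₄ chosen purely for illustration.

import numpy as np

q_H = 0.5
counts = np.array([10, 25, 35, 20, 10])   # n_0, ..., n_4: how often each sum was observed

def cell_probs(p):
    """The table entries p_0(p), ..., p_4(p): mix the sums of 2 and of 4 tosses (Tails prob = 1 - p)."""
    t = 1 - p                                                        # the sums count Tails
    two = np.array([p**2, 2*p*t, t**2, 0.0, 0.0])                    # sum of 2 tosses on 0..4
    four = np.array([p**4, 4*p**3*t, 6*p**2*t**2, 4*p*t**3, t**4])   # sum of 4 tosses on 0..4
    return q_H * two + (1 - q_H) * four

grid = np.linspace(1e-4, 1 - 1e-4, 10001)
logL = np.array([np.sum(counts * np.log(cell_probs(p))) for p in grid])
print("numerical MLE:", grid[np.argmax(logL)])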
Coin flips: experiment III, Bayes The superiority of the Bayesian method in this experiment is clear: