Example II: medical imaging We need non-invasive ways to explore/diagnose patients: e.g., ultrasound, CT scan, MRI, etc. Ultrasound (very simplified!): a probe sends many sound pulses that travel into your body; they hit boundaries between tissues and get reflected; and an image is created using the times the echoes take to return to the probe. A given tissue produces specific echoes. The machine "inverted" this indirect (and incomplete!) information and dealt with the (random) instrumental errors: a statistical inverse problem. Does it do better as the instrumental errors decrease?
Outline
1 Introduction: Mathematical Statistics; Examples
2 Seeing the unseen: Coin flips; Statistical inverse problems
Coin flips: experiment I Let us simplify things and flip some coins: with your phones, go to https://albertococacabrero.wordpress.com/openday/ or google "Alberto J Coca Cambridge" and add openday/ to the end of my URL.
Coin flips: experiment I, MLE We do not know the probability p ∈ [0, 1] of this coin landing Heads (typically p = 1/2, i.e. a fair coin). How to guess p from our data? The "frequentist" guess is p̂ = #Heads / #flips. Is this a sensible guess? Yes: each coin flip has options and probabilities given by

Options:   Heads   Tails
Probabs.:  p       1 − p

If our data is Heads, Tails, Heads, it has probability or likelihood L = p(1 − p)p = p^2(1 − p); and, more generally, if we flip n coins and obtain m Heads, the likelihood of our data is L(p, m, n) = p^m (1 − p)^(n − m). Homework: p̂ = m/n maximises q ↦ L(q, m, n) over q ∈ [0, 1]! Indeed, p̂ is called the Maximum Likelihood Estimator (MLE).
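A minimal numerical check of the homework claim (a sketch, not part of the slides; the data m = 2 Heads in n = 3 flips and the grid search are purely illustrative): the maximiser of q ↦ L(q, m, n) over [0, 1] is the frequentist guess m/n.

import numpy as np

def likelihood(p, m, n):
    """Probability of observing m Heads in n independent flips in a fixed order."""
    return p**m * (1 - p)**(n - m)

m, n = 2, 3                       # data: Heads, Tails, Heads
grid = np.linspace(0, 1, 100001)  # fine grid of candidate values q in [0, 1]
q_best = grid[np.argmax(likelihood(grid, m, n))]
print(q_best)                     # ~ 0.6667, i.e. the frequentist guess m/n = 2/3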
Coin flips: experiment I, MLE The MLE p̂ = m/n enjoys mathematically desirable properties: e.g.,

Law of Large Numbers (LLN): It holds that p̂ → p as n → ∞.

How fast does Error = |p̂ − p| → 0? Mathematical results guarantee it cannot be faster than 1/√n. Note that if Error ≈ 1/n^a = n^(−a) for some a > 0, then log Error ≈ −a log n. Hence, to find the value of a, plot x = log n vs. y = log Error ≈ −ax and compute the slope. The plot suggests a = 1/2, i.e. Error ≈ 1/√n! Thus, the MLE is optimal in convergence rates! (It is optimal in other senses too, in view of, e.g., the Central Limit Theorem, but there is no time to explain this.) Other estimators are optimal too:
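The plot can be reproduced with a short simulation; the following sketch (with an assumed true value p = 0.3, sample sizes up to 10^5 and the error averaged over repeated experiments) regresses log(Error) on log(n) and should give a slope close to −1/2.

import numpy as np

rng = np.random.default_rng(0)
p = 0.3                                             # true (unknown) Heads probability
ns = np.unique(np.logspace(2, 5, 15).astype(int))   # sample sizes from 10^2 to 10^5
n_rep = 500                                         # repetitions used to average the error

errors = []
for n in ns:
    p_hats = rng.binomial(n, p, size=n_rep) / n           # MLE over many repeated experiments
    errors.append(np.sqrt(np.mean((p_hats - p) ** 2)))    # root-mean-square error

# slope of log(Error) vs. log(n); should be close to -1/2
slope = np.polyfit(np.log(ns), np.log(errors), 1)[0]
print(slope)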
Coin flips: experiment I, Bayes "Bayesian" method: before conducting the experiment, guess probabilities for the unknown p, i.e., Prob(p = q | no data) = G(q) with G given by, e.g.,

Green: I am (obviously!) respectable and the coin is probably fair;
Red: I am (absolutely) not respectable and the coin is unfair;
Blue: I would rather not guess, to avoid confrontations...

The initial guess G(q) evolves through Bayes' rule as we get the data, giving

B(q) = L(q, m, n) G(q) / ∫₀¹ L(r, m, n) G(r) dr,   q ∈ [0, 1].

The normalisation ensures ∫₀¹ B(q) dq = ∫₀¹ Prob(p = q | data) dq = 1.
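As a rough illustration (a sketch only: the three priors below are Beta-like stand-ins I chose, not the slides' exact green/red/blue curves, and the data 7 Heads in 10 flips is assumed), Bayes' rule can be computed directly on a grid:

import numpy as np

q = np.linspace(0, 1, 2001)
dq = q[1] - q[0]

def posterior(G, m, n):
    """Bayes' rule: B(q) proportional to L(q, m, n) * G(q), normalised to integrate to 1."""
    unnorm = q**m * (1 - q)**(n - m) * G
    return unnorm / (np.sum(unnorm) * dq)   # numerical version of dividing by the integral

priors = {
    "green (coin probably fair)":   q**20 * (1 - q)**20,   # peaked around 1/2
    "red (coin probably unfair)":   q**20 + (1 - q)**20,   # mass near 0 and 1
    "blue (no opinion)":            np.ones_like(q),       # flat prior
}

m, n = 7, 10   # e.g. 7 Heads in 10 flips (assumed data)
for name, G in priors.items():
    B = posterior(G, m, n)
    print(name, "posterior mean:", np.sum(q * B) * dq)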
Coin flips: experiment I, Bayes How does the data-update of G given by B look? B gives much more information than p̂! It does so without having to maximise the likelihood! Optimal!

Bernstein–von Mises Theorem (BvM): If G(p) > 0, then B(q) ≈ p̂ + (1/√n)·√(p̂(1 − p̂))·N(0, 1)(q) as n → ∞. (Here N(0, 1) is the "Bell curve".)

Indeed, the Bayesian method is extensively used in practice. However, it is much harder to analyse mathematically: no-free-lunch principle!
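A sketch of how one could check BvM numerically, under assumed values for the true p, the sample size and the prior: for large n the grid posterior should be close to the Bell curve centred at p̂ with standard deviation √(p̂(1 − p̂)/n), whatever (positive) prior we started from.

import numpy as np

rng = np.random.default_rng(1)
p_true, n = 0.3, 50000
m = rng.binomial(n, p_true)              # number of Heads in n flips
p_hat = m / n

q = np.linspace(0.25, 0.35, 4001)        # grid around p_hat
dq = q[1] - q[0]

G = q**20 * (1 - q)**20                            # a "coin probably fair" prior (assumption)
logL = m * np.log(q) + (n - m) * np.log(1 - q)     # log-likelihood, to avoid underflow
unnorm = np.exp(logL - logL.max()) * G
B = unnorm / (np.sum(unnorm) * dq)                 # grid posterior

sigma = np.sqrt(p_hat * (1 - p_hat) / n)           # BvM standard deviation
bell = np.exp(-(q - p_hat)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
print("max pointwise difference posterior vs. Bell curve:", np.max(np.abs(B - bell)))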
Coin flips: experiment II, MLE Let H = Heads = 0 and T = Tails = 1. Now we do not see the coin flips directly but "their sum" every 2 tosses. E.g., if the coin flips are H, H, T, H, H, T, T, T, the data is 0 (= 0+0), 1 (= 1+0), 1 (= 0+1), 2 (= 1+1). How to guess the probability p of the underlying coin landing Heads now? I.e., how to see the unseen? Each sum of a pair of coin flips has options and probabilities

Options:   0     1           2
Probabs.:  p^2   2p(1 − p)   (1 − p)^2

Easy guesses are given by "inverting table entries": e.g., √(#0s / #data). This cannot be optimal as it does not use all the information from the data. If n₀ = #0s, n₁ = #1s and n₂ = #2s, the likelihood of the data is L(p, n₀, n₁, n₂) = p^(2n₀) (2p(1 − p))^(n₁) (1 − p)^(2n₂). The MLE is p̂ = (1 + (n₀ − n₂)/n)/2. The MLE "inverts the table entries" optimally, enjoying the same properties as before (and more): as n → ∞, p̂ → p with Error = |p̂ − p| ≈ 1/√n.
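The closed-form MLE can be checked against a direct grid maximisation of the likelihood; the simulation below is only a sketch with assumed values p = 0.7 and 1000 observed pairs.

import numpy as np

rng = np.random.default_rng(2)
p_true, n_pairs = 0.7, 1000
flips = (rng.random((n_pairs, 2)) >= p_true).astype(int)   # H = 0 (prob p_true), T = 1
sums = flips.sum(axis=1)                                   # observed data: 0, 1 or 2
n0, n1, n2 = np.sum(sums == 0), np.sum(sums == 1), np.sum(sums == 2)

# closed-form MLE from the slides
p_hat = (1 + (n0 - n2) / n_pairs) / 2

# numerical maximisation of the likelihood on a grid
grid = np.linspace(1e-6, 1 - 1e-6, 100001)
logL = 2 * n0 * np.log(grid) + n1 * np.log(2 * grid * (1 - grid)) + 2 * n2 * np.log(1 - grid)
print(p_hat, grid[np.argmax(logL)])   # the two should agree up to the grid resolution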
Coin flips: experiment II, Bayes Now we appreciate further the superiority of the Bayesian method: with the same initial guess G(q) as before, we again take

B(q) = L(q, n₀, n₁, n₂) G(q) / ∫₀¹ L(r, n₀, n₁, n₂) G(r) dr,   q ∈ [0, 1].

B in action (p = 2/3 ≈ 67%). Again, B gives much more information than p̂! B is much easier than "inverting table entries" (maximising the likelihood)! In fact, we input the table entries and B "inverts" them automatically and optimally: a BvM Theorem holds!
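To stress that only the likelihood changes, here is a sketch of the same grid update with the experiment II likelihood, an assumed flat prior and assumed counts:

import numpy as np

q = np.linspace(0, 1, 2001)
dq = q[1] - q[0]
G = np.ones_like(q)                       # e.g. the flat ("blue") prior

n0, n1, n2 = 2, 5, 3                      # counts of observed sums 0, 1 and 2 (assumed data)
L = q**(2 * n0) * (2 * q * (1 - q))**n1 * (1 - q)**(2 * n2)
B = L * G / (np.sum(L * G) * dq)          # Bayes' rule, normalised on the grid
print("posterior mean:", np.sum(q * B) * dq)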
Coin flips: experiment III, MLE We do not see the coin flips directly, but flip a second coin with Heads-probability q_H ∈ [0, 1]: if it lands H (resp. T), we observe the "sum" of 2 (resp. 4) tosses of the first coin. E.g., if the second coin's flips are 0, 1, 0 (not observed!) and the first coin's flips are H, H, T, H, H, T, T, T, the data is 0 (= 0+0), 2 (= 1+0+0+1), 2 (= 1+1). If q_H is known, how to guess p now? I.e., how to see the unseen? Each (random) sum of coin flips has options 0, 1, ..., 4 with probabilities

p₀ = q_H p^2 + (1 − q_H) p^4,
p₁ = q_H 2p(1 − p) + (1 − q_H) 4p^3(1 − p),
...,
p₄ = (1 − q_H)(1 − p)^4.

A guess is obtained by inverting p₄, i.e. 1 − (#4s / ((1 − q_H) · #data))^(1/4). Not optimal! If n_j = #js, the likelihood of the data is L(p, n₀, ..., n₄) = ∏_{j=0}^{4} p_j(p)^(n_j). A unique maximiser of p ↦ L(p, n₀, ..., n₄) exists (the MLE) but not in closed form: "inverting table entries" explicitly is too hard! The implicit MLE still enjoys the same desirable properties.
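Since no closed form exists, the MLE can be found numerically; the sketch below maximises the log-likelihood ∑_j n_j log p_j(p) on a grid, with q_H and the counts n₀, ..., n₄ chosen purely for illustration.

import numpy as np

q_H = 0.5
counts = np.array([10, 25, 35, 20, 10])   # n_0, ..., n_4: how often each sum was observed

def cell_probs(p):
    """The table entries p_0(p), ..., p_4(p): mix the sums of 2 and of 4 tosses (Tails prob = 1 - p)."""
    t = 1 - p                                                        # the sums count Tails
    two = np.array([p**2, 2*p*t, t**2, 0.0, 0.0])                    # sum of 2 tosses on 0..4
    four = np.array([p**4, 4*p**3*t, 6*p**2*t**2, 4*p*t**3, t**4])   # sum of 4 tosses on 0..4
    return q_H * two + (1 - q_H) * four

grid = np.linspace(1e-4, 1 - 1e-4, 10001)
logL = np.array([np.sum(counts * np.log(cell_probs(p))) for p in grid])
print("numerical MLE:", grid[np.argmax(logL)])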
Coin flips: experiment III, Bayes The superiority of the Bayesian method in this experiment is clear: