Lecture 9. Bayesian Inference - updating priors

Igor Rychlik
Chalmers, Department of Mathematical Sciences
Probability, Statistics and Risk, MVE300 • Chalmers • May 2013

Bayesian statistics is a general methodology to analyse and draw conclusions from data.
Two problems of interest in risk analysis:

◮ The first one deals with the estimation of a probability $p_B = P(B)$, say, of some event $B$, for example the probability of failure of some system. In the figure, $B = B_1 \cup B_2$, $B_1 \cap B_2 = \emptyset$.

◮ The second one is the estimation of the probability that at least once an event $A$ occurs in a time period of length $t$. The problem reduces to the estimation of the intensity $\lambda_A$ of $A$.

Hence
$$P = P(\text{accidents happen in period } t) = 1 - e^{-\lambda_A P(B)\, t} \approx \lambda_A P(B)\, t,$$
if the probability $P$ is small. The parameters $p_B$ and $\lambda_A$ are unknown.

Figure: Events $A$ at times $S_i$ with related scenarios $B_i$.
Odds for parameters

Let $\theta$ denote the unknown value of $p_B$, $\lambda_A$ or any other quantity. Introduce odds $q_\theta$, which for any pair $\theta_1, \theta_2$ represent our belief about which of $\theta_1$ or $\theta_2$ is more likely to be the unknown value of $\theta$, i.e. $q_{\theta_1} : q_{\theta_2}$ are the odds for the alternative $A_1 =$ "$\theta = \theta_1$" against $A_2 =$ "$\theta = \theta_2$".

We require that $q_\theta$ integrates to one, and hence $f(\theta) = q_\theta$ is a probability density function representing our belief about the value of $\theta$. The random variable $\Theta$ having this pdf serves as a mathematical model for the uncertainty in the value of $\theta$.
Prior odds - posterior odds

Let $\theta$ be the unknown parameter ($\theta = p_B$ or $\theta = \lambda_A$), while $\Theta$ denotes any of the variables $P$ or $\Lambda$. Since $\theta$ is unknown, it is seen as a value taken by a random variable $\Theta$ with pdf $f(\theta)$.

If $f(\theta)$ is chosen on the basis of experience, without including observations of outcomes of an experiment, then the density $f(\theta)$ is called a prior density and denoted by $f_{\text{prior}}(\theta)$.

Our knowledge may change with time (especially if we observe some outcomes of the experiment), influencing our opinion about the value of the parameter $\theta$. This leads to new odds, i.e. a new density $f(\theta)$. The modified density will be called the posterior density and denoted by $f_{\text{post}}(\theta)$. The rule to update $f(\theta)$ is
$$f_{\text{post}}(\theta) = c\, L(\theta)\, f_{\text{prior}}(\theta).$$
How to find the likelihood function $L(\theta)$ will be discussed later on.
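The update rule can be illustrated numerically on a grid of $\theta$ values. The sketch below is not from the lecture; it assumes a uniform prior for a probability $\theta = p_B$ and binomial data ($k$ occurrences of $B$ in $n$ trials), and all variable names are illustrative.

```python
import numpy as np

# Minimal sketch of f_post(theta) = c * L(theta) * f_prior(theta) on a grid.
# Assumptions (not from the lecture): theta = p_B, a uniform prior, and data
# consisting of k = 3 occurrences of B in n = 5 independent trials.
theta = np.linspace(0.0, 1.0, 1001)     # grid of candidate parameter values
dtheta = theta[1] - theta[0]
f_prior = np.ones_like(theta)           # uniform prior odds q_theta

k, n = 3, 5
L = theta**k * (1.0 - theta)**(n - k)   # binomial likelihood L(theta), up to a constant

f_post = L * f_prior
f_post /= np.sum(f_post) * dtheta       # normalise so the posterior integrates to one

print(np.sum(f_post) * dtheta)          # ~1.0: f_post is a proper density
```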
Predictive probability

Suppose $f(p)$ has been selected and denote by $P$ a random variable having pdf $f(p)$. A plot of $f(p)$ is an illustrative measure of how likely the different values of $p_B$ are. If only one value of the probability is needed, the Bayesian methodology proposes to use the so-called predictive probability, which is simply the mean of $P$:
$$P^{\text{pred}}(B) = \mathsf{E}[P] = \int p\, f(p)\, \mathrm{d}p.$$
The predictive probability measures the likelihood that $B$ occurs in the future. It combines two sources of uncertainty: the unpredictability of whether $B$ will be true in a future accident and the uncertainty in the value of the probability $p_B$.

Example 6.1
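A minimal numerical sketch (not Example 6.1 from the book): assume, purely for illustration, that the uncertainty about $p_B$ is described by a Beta(2, 8) density, and compute the predictive probability as the mean of that density.

```python
import numpy as np
from scipy.stats import beta

# Assumed density f(p) for p_B (illustrative): Beta(2, 8), favouring small values.
p = np.linspace(0.0, 1.0, 100001)
dp = p[1] - p[0]
f = beta(2, 8).pdf(p)

P_pred = np.sum(p * f) * dp   # predictive probability = E[P] = integral of p*f(p) dp
print(round(P_pred, 3))       # ~0.2, the mean 2/(2+8) of the Beta(2, 8) density
```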
$$P(A \cap B) = P(\text{accidents in period } t) = 1 - e^{-\lambda_A P(B)\, t} \approx \lambda_A P(B)\, t,$$
if the probability $P(A \cap B)$ is small. The predictive probabilities are
$$P^{\text{pred}}(A) = \mathsf{E}[P(A)] = \int \bigl(1 - \exp(-\lambda t)\bigr) f_\Lambda(\lambda)\, \mathrm{d}\lambda \approx \int t\, \lambda\, f_\Lambda(\lambda)\, \mathrm{d}\lambda = t\, \mathsf{E}[\Lambda],^2$$
$$P^{\text{pred}}(A \cap B) = \iint \bigl(1 - \exp(-p\lambda t)\bigr) f_\Lambda(\lambda) f_P(p)\, \mathrm{d}\lambda\, \mathrm{d}p \approx \iint t\, p\, \lambda\, f_\Lambda(\lambda) f_P(p)\, \mathrm{d}\lambda\, \mathrm{d}p = t\, \mathsf{E}[\Lambda]\, \mathsf{E}[P].$$

Example 6.2

$^2$ For small $x$, $1 - \exp(-x) \approx x$.
Credibility intervals:

◮ In the Bayesian approach the lack of knowledge of the parameter value $\theta$ is described using the probability density $f(\theta)$ (odds). The random variable $\Theta$ having the pdf $f(\theta)$ models our knowledge about $\theta$.

◮ The initial knowledge is described using the density $f_{\text{prior}}(\theta)$, and as data are gathered it is updated:
$$f_{\text{post}}(\theta) = c\, L(\theta)\, f_{\text{prior}}(\theta).$$

◮ The pdf $f_{\text{post}}(\theta)$ summarizes our knowledge about $\theta$. However, if a single value for the parameter is needed, then
$$\theta^{\text{predictive}} = \mathsf{E}[\Theta] = \int \theta\, f_{\text{post}}(\theta)\, \mathrm{d}\theta.$$

◮ If one wishes to describe the variability of $\theta$ by means of an interval, then the so-called credibility interval can be computed: $[\theta^{\text{post}}_{1-\alpha/2},\, \theta^{\text{post}}_{\alpha/2}]$. See the sketch below.
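A minimal sketch of computing the predictive value and an equal-tailed credibility interval, assuming (purely for illustration) that the posterior belongs to the Gamma family introduced on the next slides, with parameters Gamma(1, 40) as in the accident example; the names and the 95% level are illustrative.

```python
from scipy.stats import gamma

# Assumed posterior (illustrative): Gamma(a=1, b=40), rate parametrisation.
a, b = 1, 40
alpha = 0.05                                      # 95% credibility level

theta_pred = a / b                                # predictive value E[Theta]
lower = gamma.ppf(alpha / 2, a, scale=1 / b)      # posterior alpha/2 quantile
upper = gamma.ppf(1 - alpha / 2, a, scale=1 / b)  # posterior 1 - alpha/2 quantile

print(theta_pred)                                 # 0.025
print((lower, upper))                             # equal-tailed 95% credibility interval
```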
Gamma priors:

Conjugate priors are families of pdfs for $\Theta$ which are particularly convenient for recursive updating procedures, i.e. when new observations arrive at different time instants. We will use three families of conjugate priors.

Gamma pdf: $\Theta \in \text{Gamma}(a, b)$, $a, b > 0$, if
$$f(\theta) = c\, \theta^{a-1} e^{-b\theta}, \quad \theta \geq 0, \qquad c = \frac{b^a}{\Gamma(a)}.$$
The expectation, variance and coefficient of variation for $\Theta \in \text{Gamma}(a, b)$ are given by
$$\mathsf{E}[\Theta] = \frac{a}{b}, \qquad \mathsf{V}[\Theta] = \frac{a}{b^2}, \qquad \mathsf{R}[\Theta] = \frac{1}{\sqrt{a}}.$$
Updating Gamma priors:

The Gamma priors are conjugate priors for the problem of estimating the intensity in a Poisson stream of events $A$. If one has observed that in time $\tilde{t}$ there were $k$ events reported, and if the prior density $f_{\text{prior}}(\theta) \in \text{Gamma}(a, b)$, then
$$f_{\text{post}}(\theta) \in \text{Gamma}(\tilde{a}, \tilde{b}), \qquad \tilde{a} = a + k, \quad \tilde{b} = b + \tilde{t}.$$
Further, the predictive probability of at least one event $A$ during a period of length $t$ is given by
$$P^{\text{pred}}(A) \approx t\, \mathsf{E}[\Theta] = t\, \frac{\tilde{a}}{\tilde{b}}.$$

In Example 6.2 the prior $f_{\text{prior}}(\theta)$ was exponential with mean $1/30$ [days$^{-1}$], which is the Gamma(1, 30) pdf. Suppose that in 10 days we have not observed any accidents; then the posterior density $f_{\text{post}}(\theta)$ is Gamma(1, 40). Hence $P^{\text{pred}}(A) \approx t/40$. The sketch below repeats this computation.
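A minimal sketch of the Gamma update with the numbers from the slide; the variable names are illustrative.

```python
# Gamma update from the slide: prior Gamma(1, 30), i.e. exponential with
# mean 1/30 days^-1 for the accident intensity.
a, b = 1, 30

# Observation: k = 0 accidents during t_obs = 10 days.
k, t_obs = 0, 10
a_post, b_post = a + k, b + t_obs        # posterior is Gamma(1, 40)

t = 1.0                                  # period of interest [days]
P_pred = t * a_post / b_post             # P_pred(A) ~ t * E[Theta] = t/40
print(a_post, b_post, P_pred)            # 1 40 0.025
```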
Conjugate Beta priors:

Beta probability-density function (pdf): $\Theta \in \text{Beta}(a, b)$, $a, b > 0$, if
$$f(\theta) = c\, \theta^{a-1} (1 - \theta)^{b-1}, \quad 0 \leq \theta \leq 1, \qquad c = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}.$$
The expectation and variance of $\Theta \in \text{Beta}(a, b)$ are given by
$$\mathsf{E}[\Theta] = p, \qquad \mathsf{V}[\Theta] = \frac{p(1-p)}{a+b+1},$$
where $p = a/(a+b)$. Furthermore, the coefficient of variation is
$$\mathsf{R}(\Theta) = \sqrt{\frac{1-p}{p}}\, \frac{1}{\sqrt{a+b+1}}.$$
Updating Beta priors:

The Beta priors are conjugate priors for the problem of estimating the probability $p_B = P(B)$. Let $\theta = p_B$. If one has observed that in $n$ trials (results of experiments) the statement $B$ was true $k$ times, and if the prior density $f_{\text{prior}}(\theta) \in \text{Beta}(a, b)$, then
$$f_{\text{post}}(\theta) \in \text{Beta}(\tilde{a}, \tilde{b}), \qquad \tilde{a} = a + k, \quad \tilde{b} = b + n - k.$$
$$P^{\text{pred}}(B) = \int_0^1 \theta\, f_{\text{post}}(\theta)\, \mathrm{d}\theta = \frac{\tilde{a}}{\tilde{a} + \tilde{b}}.$$

Consider the example of treatment of waste water. Let $p$ be the probability that water is sufficiently cleaned after a week of treatment. If we have no knowledge about $p$ we could use the uniform prior, which is easily seen to be the Beta(1, 1) pdf. Suppose that 3 times the water was well cleaned and 2 times not. This information gives the posterior density Beta(4, 3), and the predictive probability that the water is cleaned in one week is $4/7$ (see the sketch below).
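A minimal sketch of the Beta update with the waste-water numbers from the slide; the variable names are illustrative.

```python
# Beta update from the slide: uniform prior Beta(1, 1) for the probability
# that the water is sufficiently cleaned after a week of treatment.
a, b = 1, 1

# Observation: the water was well cleaned k = 3 times out of n = 5 weeks.
k, n = 3, 5
a_post, b_post = a + k, b + (n - k)          # posterior is Beta(4, 3)

P_pred = a_post / (a_post + b_post)          # predictive probability = 4/7
print(a_post, b_post, round(P_pred, 3))      # 4 3 0.571
```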
Conjugate Dirichlet priors:

Dirichlet's pdf: $\Theta = (\Theta_1, \Theta_2) \in \text{Dirichlet}(\mathbf{a})$, $\mathbf{a} = (a_1, a_2, a_3)$, $a_i > 0$, if
$$f(\theta_1, \theta_2) = c\, \theta_1^{a_1 - 1} \theta_2^{a_2 - 1} (1 - \theta_1 - \theta_2)^{a_3 - 1}, \quad \theta_i > 0, \ \theta_1 + \theta_2 < 1,$$
where $c = \dfrac{\Gamma(a_1 + a_2 + a_3)}{\Gamma(a_1)\Gamma(a_2)\Gamma(a_3)}$. Let $a_0 = a_1 + a_2 + a_3$; then
$$\mathsf{E}[\Theta_i] = \frac{a_i}{a_0}, \qquad \mathsf{V}[\Theta_i] = \frac{a_i (a_0 - a_i)}{a_0^2 (a_0 + 1)}, \qquad i = 1, 2.$$
Furthermore, the marginal distributions are Beta, viz.
$$\Theta_i \in \text{Beta}(a_i, a_0 - a_i), \qquad i = 1, 2.$$
Updating Dirichlet priors:

The Dirichlet priors are conjugate priors for the problem of estimating the probabilities $p_i = P(B_i)$, $i = 1, 2, 3$, where the $B_i$ are disjoint and $p_1 + p_2 + p_3 = 1$. Let $\theta_i = p_i$. If one has observed that the statement $B_i$ was true $k_i$ times in $n$ trials, and the prior density $f_{\text{prior}}(\theta_1, \theta_2) \in \text{Dirichlet}(\mathbf{a})$, then
$$f_{\text{post}}(\theta_1, \theta_2) \in \text{Dirichlet}(\tilde{\mathbf{a}}), \qquad \tilde{\mathbf{a}} = (a_1 + k_1,\, a_2 + k_2,\, a_3 + k_3),$$
where $k_3 = n - k_1 - k_2$. Further,
$$P^{\text{pred}}(B_i) = \mathsf{E}[\Theta_i] = \frac{\tilde{a}_i}{\tilde{a}_1 + \tilde{a}_2 + \tilde{a}_3}.$$

Let $B_1 =$ "player A wins" and $B_2 =$ "player B wins" (there is also the possibility of a draw). If we do not know the strength of the players we could use the uniform prior, which corresponds to the Dirichlet(1, 1, 1) pdf. Now suppose we observe that in two matches A won twice; then the posterior density is Dirichlet(3, 1, 1), and the predictive probability that A wins the next match is $3/5$ (see the sketch below).
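A minimal sketch of the Dirichlet update with the match numbers from the slide; the variable names are illustrative.

```python
import numpy as np

# Dirichlet update from the slide: uniform prior Dirichlet(1, 1, 1) over
# (A wins, B wins, draw); in two matches player A won twice.
a = np.array([1.0, 1.0, 1.0])
k = np.array([2.0, 0.0, 0.0])       # (k1, k2, k3), with k3 = n - k1 - k2

a_post = a + k                      # posterior is Dirichlet(3, 1, 1)
P_pred = a_post / a_post.sum()      # predictive probabilities E[Theta_i]
print(P_pred)                       # [0.6 0.2 0.2]; P(A wins next match) = 3/5
```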
Posterior pdf for a large number of observations.

If $f_{\text{prior}}(\theta_0) > 0$ then $\Theta \in \text{AsN}(\theta^*, (\sigma^*_E)^2)$ as $n \to \infty$, where $\theta^*$ is the ML estimate of $\theta_0$ and $\sigma^*_E = 1/\sqrt{-\ddot{l}(\theta^*)}$. It means that
$$f_{\text{post}}(\theta) \approx c \exp\Bigl(\tfrac{1}{2}\ddot{l}(\theta^*)(\theta - \theta^*)^2\Bigr) = c \exp\Bigl(-\tfrac{1}{2}(\theta - \theta^*)^2 / (\sigma^*_E)^2\Bigr).$$

Sketch of proof:
$$l(\theta) \approx l(\theta^*) + \dot{l}(\theta^*)(\theta - \theta^*) + \tfrac{1}{2}\ddot{l}(\theta^*)(\theta - \theta^*)^2.$$
Now the likelihood function is $L(\theta) = e^{l(\theta)}$ and $\dot{l}(\theta^*) = 0$, thus
$$L(\theta) \approx \exp\Bigl(l(\theta^*) + \dot{l}(\theta^*)(\theta - \theta^*) + \tfrac{1}{2}\ddot{l}(\theta^*)(\theta - \theta^*)^2\Bigr) = c \exp\Bigl(\tfrac{1}{2}\ddot{l}(\theta^*)(\theta - \theta^*)^2\Bigr).$$
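A minimal numerical check of the normal approximation, assuming (purely for illustration) binomial data with a uniform prior, so that the exact posterior is Beta($k+1$, $n-k+1$); the numbers and names are illustrative.

```python
import numpy as np
from scipy.stats import beta, norm

# Illustrative check: k = 30 occurrences of B in n = 50 trials, uniform prior,
# so the exact posterior is Beta(k + 1, n - k + 1).
k, n = 30, 50
theta_star = k / n                                          # ML estimate theta*
info = k / theta_star**2 + (n - k) / (1 - theta_star)**2    # -l''(theta*) for the binomial log-likelihood
sigma_star = 1 / np.sqrt(info)                              # sigma*_E

theta = np.linspace(0.45, 0.75, 7)
exact = beta(k + 1, n - k + 1).pdf(theta)                   # exact posterior density
approx = norm(theta_star, sigma_star).pdf(theta)            # asymptotic normal approximation
print(np.round(exact, 2))
print(np.round(approx, 2))                                  # close to the exact values for large n
```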