Bayesian networks II. Model building
Advanced Herd Management
Anders Ringgaard Kristensen

Outline
• Determining the graphical structure
  • Milk test
  • Mastitis diagnosis
  • Pregnancy
• Determining the conditional probabilities
• Modeling methods and tricks
Milk test
[Figure: node “Infected?” {“Yes”, “No”} with an edge to the test-result node {“Positive”, “Negative”}]
• Sensitivity/specificity determines the conditional probabilities
• Direction of edge!
  • “Causal” direction
  • Against the “reasoning” direction
Daily measurements
[Figure: network with nodes Inf_1 … Inf_7 and Test_1 … Test_7]
• Are the infection states of different days independent?
  • Probably not!
  • Markov property
  • Duration of disease

Dependence between test results
[Figure: nodes Inf_1 … Inf_7 and Test_1 … Test_7]
• Correctness of a test depends on whether it was correct yesterday.
• To determine whether it was correct yesterday, we need:
  • The true infection state yesterday
    • Not observed, so we need the conditional probabilities.
  • The test result yesterday
Dependence between test results
[Figure: nodes Inf_1 … Inf_7, Cor_1 … Cor_6 and Test_1 … Test_7]
• A simplifying intermediate variable Cor_i ∈ {“yes”, “no”} indicates whether test i was correct.

Mastitis diagnosis, AMS
[Figure: nodes Mastitis {No, Subclinical, Clinical}, Heat {No, Yes}, Conductivity, Temperature]
• Separate variables (don’t pool mastitis & heat)
• Check conditional independence
  • Are “Conductivity” and “Temperature” independent, given “Mastitis”?
If not conditionally independent
[Figure: nodes Mastitis, Heat, Conductivity, Temperature, with an edge from Conductivity to Temperature]
• If conductivity influences temperature.
If not conditionally independent
[Figure: nodes Mastitis, Heat, Conductivity, Temperature, with an edge from Temperature to Conductivity]
• If temperature influences conductivity.
• The causal direction may be difficult to determine.

If not conditionally independent
[Figure: nodes Mastitis, Heat, Conductivity, Temperature and a new node “Other disease”]
• If the direction of an edge cannot be determined, a variable is often missing!
Pregnancy (again, again …)
• A goat is mated, and six weeks later we want to test it for pregnancy.
• We have three tests available:
  • Blood test
  • Urine test
  • Scanning
• The variables of our problem are:
  • Pregnant {“yes”, “no”}
  • Blood {“positive”, “negative”}
  • Urine {“positive”, “negative”}
  • Scan {“positive”, “negative”}

Pregnancy test BN
[Figure: nodes Pregnant, Blood, Urine, Scan]
• Check for conditional independence
Pregnancy test BN, revised
[Figure: nodes Pregnant, Hormone, Blood, Urine, Scan]
• The blood test and the urine test both measure a hormone level.
• The scanning does something completely different.
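The effect of the intermediate Hormone node can be illustrated numerically. The sketch below assumes the structure Pregnant → Hormone → {Blood, Urine}, as the revised slide suggests, and compares P(Urine | Pregnant) with P(Urine | Pregnant, Blood). All probabilities and the Hormone states “high”/“low” are hypothetical and are not taken from the lecture.

```python
# Hypothetical numbers; only the idea that Blood and Urine both
# measure a common Hormone level comes from the slide.
P_HORMONE_HIGH = {"yes": 0.9, "no": 0.05}   # P(Hormone = high | Pregnant)
P_BLOOD_POS = {"high": 0.95, "low": 0.02}   # P(Blood = positive | Hormone)
P_URINE_POS = {"high": 0.90, "low": 0.05}   # P(Urine = positive | Hormone)


def p_urine_pos(pregnant, blood=None):
    """P(Urine = positive | Pregnant [, Blood]), summing out Hormone."""
    num = den = 0.0
    for hormone in ("high", "low"):
        p_high = P_HORMONE_HIGH[pregnant]
        p_h = p_high if hormone == "high" else 1 - p_high
        if blood is None:
            p_b = 1.0                       # Blood not observed
        elif blood == "positive":
            p_b = P_BLOOD_POS[hormone]
        else:
            p_b = 1 - P_BLOOD_POS[hormone]
        num += p_h * p_b * P_URINE_POS[hormone]
        den += p_h * p_b
    return num / den


without_blood = p_urine_pos("yes")
with_blood = p_urine_pos("yes", blood="positive")
print(f"P(Urine=+ | Pregnant=yes)          = {without_blood:.3f}")
print(f"P(Urine=+ | Pregnant=yes, Blood=+) = {with_blood:.3f}")
```

Because the two values differ, Blood and Urine are not conditionally independent given Pregnant alone, which is what a network without the Hormone node would assume.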
Determining the probabilities
• Statistical model with parameters estimated from data.
• Law of nature.
• Experts of the domain.

Milk test example
[Figure: milk test network with nodes Inf_1 … Inf_7, Cor_1 … Cor_6 and Test_1 … Test_7]
• The conditional probability P(Test_i | Inf_i) is supplied by the test retailer.
P(Test_i | Inf_i)

                 P(Test_i = “yes” | Inf_i)   P(Test_i = “no” | Inf_i)
Inf_i = “yes”    0.99                        0.01
Inf_i = “no”     0.01                        0.99

• Defined by the sensitivity and specificity of the test.

Milk test example
• The conditional probability P(Cor_i | Inf_i, Test_i) is trivial.
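The table P(Test_i | Inf_i) points in the causal direction, while reasoning runs the other way, from test result to infection state. A minimal sketch of that inversion with Bayes' rule follows; the sensitivity and specificity of 0.99 are read from the table, whereas the prior values in the loop are hypothetical.

```python
def posterior_infected(prior, sensitivity=0.99, specificity=0.99):
    """P(Inf = yes | Test = yes) by Bayes' rule."""
    true_pos = sensitivity * prior                    # infected and test says yes
    false_pos = (1.0 - specificity) * (1.0 - prior)   # not infected, test says yes
    return true_pos / (true_pos + false_pos)


# Hypothetical priors, chosen only to show how strongly the result depends on them.
for prior in (0.001, 0.01, 0.1):
    print(f"prior = {prior:<6} posterior = {posterior_infected(prior):.3f}")
```

With a rare infection, even a test that is right 99% of the time leaves a positive result far from conclusive.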
P(Cor_i | Inf_i, Test_i)

Inf_i    Test_i    P(Cor_i = y | Inf_i, Test_i)   P(Cor_i = n | Inf_i, Test_i)
“yes”    “yes”     1                              0
“yes”    “no”      0                              1
“no”     “yes”     0                              1
“no”     “no”      1                              0

• If Inf_i and Test_i agree, the test is correct!

Milk test example
• The conditional probabilities P(Test_i | Inf_i, Cor_{i-1}) need some assumptions.
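The table P(Cor_i | Inf_i, Test_i) is deterministic, so it can be generated from a rule instead of being typed in. A small sketch (the function and variable names are illustrative only):

```python
from itertools import product

STATES = ("yes", "no")


def p_cor(inf, test):
    """Return (P(Cor = yes), P(Cor = no)): the test is correct iff it agrees with Inf."""
    return (1.0, 0.0) if inf == test else (0.0, 1.0)


# One row per parent configuration (Inf_i, Test_i).
cor_cpt = {(inf, test): p_cor(inf, test) for inf, test in product(STATES, STATES)}
for parents, dist in cor_cpt.items():
    print(parents, dist)
```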
P(Test_i | Inf_i, Cor_{i-1})
• Assumptions:
  • A correct test has a 99.9% chance of being correct next time.
  • An incorrect test has a 30% chance of being incorrect next time.
    • Thus, it is still most likely to be correct.
• In agreement with the example file provided from the homepage.
• In disagreement with the textbook.

P(Test_i | Inf_i, Cor_{i-1})

Inf_i    Cor_{i-1}    P(Test_i = y | Inf_i, Cor_{i-1})   P(Test_i = n | Inf_i, Cor_{i-1})
“yes”    “yes”        0.999                              0.001
“yes”    “no”         0.7                                0.3
“no”     “yes”        0.001                              0.999
“no”     “no”         0.3                                0.7
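Each row of P(Test_i | Inf_i, Cor_{i-1}) can be derived from the two stated assumptions: first find the probability that today's test is correct, then note that a correct test reports the true infection state. A sketch under exactly those assumptions (names are illustrative):

```python
P_STAY_CORRECT = 0.999    # slide: a correct test is correct again next time
P_STAY_INCORRECT = 0.3    # slide: an incorrect test is incorrect again next time


def p_test(inf, cor_prev):
    """Return (P(Test = yes), P(Test = no)) given Inf_i and Cor_{i-1}."""
    p_correct = P_STAY_CORRECT if cor_prev == "yes" else 1.0 - P_STAY_INCORRECT
    # A correct test says "yes" for an infected cow and "no" otherwise.
    if inf == "yes":
        return (p_correct, 1.0 - p_correct)
    return (1.0 - p_correct, p_correct)


test_cpt = {(inf, cor): p_test(inf, cor)
            for inf in ("yes", "no") for cor in ("yes", "no")}
for parents, dist in test_cpt.items():
    print(parents, tuple(round(p, 3) for p in dist))
```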
Milk test example
• The probabilities P(Inf_1) must be modeled.

P(Inf_1)
• Assume that the milk test is made at single-cow level on the farm.
• We need the probability λ that the milk from a particular cow is infected on an arbitrary day (i.e. P(Inf_i = “yes”) = λ).
• The farmer has no knowledge about λ, but
• The dairy performs a very precise bulk tank test:
  • If the milk from just one cow is infected, the bulk tank test will be positive.
  • On average, the bulk tank test is positive once a month.
P(Inf_1)
• Further assumptions:
  • λ is the same for all cows.
  • Cows are infected independently.
• Under those assumptions, with 50 cows in the herd and a bulk tank test that is negative on 29 days out of 30:
  $(1 - \lambda)^{50} = 29/30 \iff \lambda = 1 - (29/30)^{0.02} \approx 0.0007$

Milk test example
• The conditional probabilities P(Inf_i | Inf_{i-1}) must be modeled.
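The calculation of λ can be checked in a few lines. The herd size of 50 is inferred from the exponent in the equation, and "once a month" is taken as 1 day in 30; both are readings of the slide rather than values it states separately.

```python
n_cows = 50                # herd size implied by the exponent in the equation
p_bulk_positive = 1 / 30   # bulk tank test positive about once a month

# The bulk tank test is negative only if all cows are clean:
# (1 - lam) ** n_cows = 1 - p_bulk_positive
lam = 1 - (1 - p_bulk_positive) ** (1 / n_cows)
print(f"lambda = {lam:.6f}")
```

Note that 1/50 = 0.02 is the exponent that appears in the slide's expression for λ.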
P(Inf_i | Inf_{i-1}, Inf_{i-2})
• Assume the following properties of the infection:
  • A not-infected cow has probability q of becoming infected.
  • An infection always lasts for at least 2 days.
  • After 2 days, the probability of recovery is π.
• Define a state space model: s_i ∈ {nn, ny, yn, yy}, where e.g. “ny” means not infected on day i-1 but infected on day i.

P(Inf_i | Inf_{i-1}, Inf_{i-2})
• Transition probabilities (rows: state on day i; columns: state on day i+1; P(yes): probability of being infected on day i+1):

Day i    nn       ny    yn    yy       P(yes)
nn       (1-q)    q     0     0        q
ny       0        0     0     1        1
yn       (1-q)    q     0     0        q
yy       0        0     π     (1-π)    (1-π)

• Only assumptions on min. duration, q and π
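The state space model can be written out directly from the state definition, where a state records (day i-1, day i) and the next state must start with today's value. The sketch below builds the transition matrix and recovers the P(yes) column; the values of q and π are hypothetical.

```python
STATES = ("nn", "ny", "yn", "yy")   # (infected day i-1?, infected day i?)


def transition_matrix(q, pi):
    """Rows: state on day i; columns: state on day i+1, both in STATES order."""
    return [
        [1 - q, q, 0.0, 0.0],     # nn: stays healthy or becomes infected
        [0.0, 0.0, 0.0, 1.0],     # ny: first infected day, lasts at least 2 days
        [1 - q, q, 0.0, 0.0],     # yn: recovered, can become infected again
        [0.0, 0.0, pi, 1 - pi],   # yy: recovers with probability pi
    ]


def p_infected_next(matrix):
    """P(infected on day i+1 | state on day i): mass on next states ending in 'y'."""
    return [sum(p for p, s in zip(row, STATES) if s.endswith("y")) for row in matrix]


P = transition_matrix(q=0.001, pi=0.2)   # hypothetical parameter values
for state, row, p_yes in zip(STATES, P, p_infected_next(P)):
    print(state, row, f"P(yes) = {p_yes:.3f}")
```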
A procedure
• There are basically 3 parameters:
  • Minimum duration of 2 days
  • Probability q of becoming infected
  • Daily probability π of recovery (after 2 days)
• The 3 parameters should be estimated from data.
• If data is not available, we may have to rely on experts.
• Experts’ guesses may be calibrated to the overall probability of infection on a given day.

Limiting state distribution
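One way to carry out such a calibration is sketched below. It assumes the four-state chain (nn, ny, yn, yy) defined earlier and uses a closed form for its limiting distribution that was derived for this sketch from the balance equations; the closed form is not given on the slides. The value of π is a hypothetical expert guess, and 0.0007 is the overall infection probability from the bulk tank argument.

```python
def stationary(q, pi):
    """Limiting distribution over (nn, ny, yn, yy).

    Balance equations: ny = q * (nn + yn), yn = pi * yy, yy = ny + (1 - pi) * yy.
    """
    ny = 1.0 / ((1 - q) / q + 2 + 1 / pi)
    return {"nn": (1 - q) / q * ny, "ny": ny, "yn": ny, "yy": ny / pi}


def calibrate_q(target, pi):
    """Choose q so that the long-run P(infected) = P(ny) + P(yy) equals target."""
    ny = target * pi / (1 + pi)
    return 1.0 / (1.0 / ny - 1 - 1 / pi)


pi = 0.2                      # hypothetical expert guess for daily recovery
q = calibrate_q(0.0007, pi)   # match the overall infection probability
dist = stationary(q, pi)
print(f"q = {q:.6f}")
print(f"long-run P(infected) = {dist['ny'] + dist['yy']:.6f}")
```

Fixing π and solving for q is only one possible choice; an expert's guess for q could equally be kept and π adjusted instead.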
Milk test example
• The conditional probabilities P(Inf_i | Inf_{i-1}) must be modeled.

P(Inf_i | Inf_{i-1})
• Some assumption compensating for the fact that we don’t know the infection state two days ago…
A stud farm: Genealogical tree
[Figure: genealogical tree with the horses Ann, Brian, Cecily; Fred, Dorothy, Eric, Gwenn; Henry, Irene; John]
• John is suffering from a serious hereditary disease caused by a recessive gene.
• State space for a horse: aa, aA or AA
• The genotype aa is diseased.
• The genotype aA is a carrier.
• We want to cull all carriers!

Conditional probabilities from genetics
Each entry is the offspring distribution over (aa, aA, AA):

Mother \ Father   aa               aA                   AA
aa                (1, 0, 0)        (0.5, 0.5, 0)        (0, 1, 0)
aA                (0.5, 0.5, 0)    (0.25, 0.5, 0.25)    (0, 0.5, 0.5)
AA                (0, 1, 0)        (0, 0.5, 0.5)        (0, 0, 1)
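The genetics table is an example of probabilities given "by nature": each parent passes on one of its two alleles with equal probability. A short sketch that regenerates the table from that rule (function and variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ("aa", "aA", "AA")


def offspring_distribution(mother, father):
    """P(child genotype) as a tuple ordered (aa, aA, AA)."""
    counts = {g: 0 for g in GENOTYPES}
    # Each of the 2 x 2 allele combinations is equally likely.
    for m_allele, f_allele in product(mother, father):
        # Write the heterozygote in the fixed order "aA".
        child = "".join(sorted((m_allele, f_allele), reverse=True))
        counts[child] += 1
    return tuple(Fraction(counts[g], 4) for g in GENOTYPES)


for mother, father in product(GENOTYPES, GENOTYPES):
    dist = offspring_distribution(mother, father)
    print(mother, father, tuple(float(p) for p in dist))
```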
Unknown parents
• Two unknown parents:
  • Assume that the distribution reflects the population probabilities of being healthy or carrier (if they had been diseased, they would not have survived until “breeding age”).
• One unknown parent:
  • Introduce a “dummy” parent reflecting the population distribution.

The diseased state “aa”
• Impossible for all horses other than John
• Two options:
  • Delete the state and adjust all probabilities accordingly.
  • Keep the state and enter the evidence that the horses are either healthy or carriers.
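The first option, deleting the state, can be read as conditioning each offspring distribution on the horse not being aa and renormalizing. That reading is an assumption of this sketch; the slide does not spell out the adjustment.

```python
def drop_diseased(dist):
    """Condition an (aa, aA, AA) distribution on 'not aa'; returns (aA, AA)."""
    _p_diseased, p_carrier, p_healthy = dist
    remaining = p_carrier + p_healthy
    if remaining == 0:
        raise ValueError("offspring would always be diseased")
    return (p_carrier / remaining, p_healthy / remaining)


# Two carrier parents: (0.25, 0.5, 0.25) over (aa, aA, AA) in the genetics table.
print(drop_diseased((0.25, 0.5, 0.25)))
```

For two carrier parents the surviving offspring is a carrier with probability 2/3 and healthy with probability 1/3.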
Obtaining the probabilities
• Sources:
  • Pure data estimation (frequency counts)
  • Model and parameter estimation from data
  • Provided “by nature”
  • Subjective expert assessments