BIOL 701 Likelihood Methods in Biology: Homework #7 Due Wednesday, May 11th Background You are interested in fecundity of an annual plant (as assessed by the total mass of seeds produced measured in g) as a function of fertilization regimes. You have collected data from several years in eight different fields. Within each field season, you examined plants four different treatments in each field. These treatments correspond to growing the plants with (coded as 1) and without (coded as 0) fertilization with a nitrogen source and a phosphorus source. Thus, the four treatments are: no fertilization, +N, +P, and +N,P. For each treatment × field × year combination, you recorded data for four plants. You are interested in inferring the effectiveness of different treatments in general (e.g. the expected effect of adding Nitrogen to some unspecified field), as well as learning about which fields have the highest yield. Consider a model in which each of the following effects contribute additively to the expected mass for an individual: 1. a year effect (centered around 0), 2. a field-specific expected mass without fertilization (this effect is the same for a field across all years), 3. a field-specific effect of nitrogen fertilization (this effect is the same for a field across all years), 4. a field-specific effect of phosphorus fertilization (this effect is the same for a field across all years) 5. a field-specific effect of adding both N and P (this effect is the same for a field across all years). Of course, you should also expect some variability around this expected value (not all individuals from the same treatment, field, and year will have exactly the same mass). Note that you would like to make some general conclusions about the effect of different fertilizer treatments on some hypothetical field. So, you should structure the model so that you can learn about expected effects across fields, while accounting for the fact that their may be variability in the response of a particular field to a particular treatment. 1
1. Write down the likelihood for your model. Begin Answer: i indexes the year. j indexes the field. k indexes the Nitrogen (0= no Nitrogen, 1 = Nitrogen). m indexes the Phosphorus (0= no Phosphorus, 1 = Phosphorus). q indexes the individual in that year × field × treatment. y ijkmq ∼ N ( α i + β j + kγ j + mδ j + kmρ j , σ 2 e ) a is the number of years b is the number of fields n ijkm is the number of individuals in the specified year × field × treatment − ( yijkmq − αi − βj − kγj − mδj − kmρj )2 1 2 σ 2 f ( y ijkmq | α i , β j , γ j , δ j , ρ j , σ e ) = e (1) e � 2 πσ 2 e n ijkm a b 1 1 � � � � � f ( Y | α , β , γ , δ , ρ , σ e ) = f ( y ijkmq | α i , β j , γ j , δ j , ρ j , σ e ) (2) m =0 q =1 i =1 j =1 k =0 Using the priors introduced below (next section), we can write the likelihood in terms of the “highest level” parameters: f ( Y | σ α , µ β , σ β , µ γ , σ γ , µ δ , σ δ , µ ρ , σ ρ , σ e ) = Z Z Z Z Z f ( Y | α , β , γ , δ , ρ , σ e ) P ( α | σ α ) P ( β | µ β , σ β ) P ( γ | µ γ , σ γ ) P ( δ | µ δ , σ δ ) P ( ρ | µ ρ , σ ρ ) d α d β d γ d δ d ρ (3) (if we take the integration symbol over a vector of latent variables to mean that we perform the integration with respect to each variable in the vector, and we calculate the definite integral from −∞ to ∞ ). Where a − α 2 1 i � 2 σ 2 P ( α | σ α ) = e α � 2 πσ 2 α i − ( βj − µβ )2 b 1 2 σ 2 � P ( β | µ β , σ β ) = e β � 2 πσ 2 j β − ( γj − µγ )2 b 1 � 2 σ 2 P ( γ | µ γ , σ γ ) = e γ � 2 πσ 2 j γ − ( δj − µδ )2 b 1 � 2 σ 2 P ( δ | µ δ , σ δ ) = e δ � 2 πσ 2 j δ − ( ρj − µρ )2 a 1 � 2 σ 2 P ( ρ | µ ρ , σ ρ ) = e ρ � 2 πσ 2 j ρ 2
2 List each parameter and the prior distribution that you have chosen to use for the parameter. (there are lots of possible ways to answer - priors are up to the person Begin Answer: analyzing the data!) α i ∼ Normal(0 , σ 2 a year effects (latent variables) α ) σ 2 The variance of the year effects α ∼ Exponential( λ = . 1) β j ∼ Normal( µ β , σ 2 b field effects (latent variables) β ) The expected value of the field effects (no fert) µ β ∼ Gamma(mean = 250 , variance = 50) σ 2 The variance of the field effects(no fert) β ∼ Exponential( λ = . 05) γ j ∼ Normal( µ γ , σ 2 b field × nitrogen effects (latent variables) γ ) The expected value of the nitrogen effects µ γ ∼ Normal(mean = 5 , variance = 10) σ 2 The variance of the nitrogen effects γ ∼ Exponential( λ = . 1) δ j ∼ Normal( µ δ , σ 2 b field × phosphorus effects (latent variables) δ ) The expected value of the phosphorus effects µ δ ∼ Normal(mean = 5 , variance = 10) σ 2 The variance of the phosphorus effects δ ∼ Exponential( λ = . 1) ρ j ∼ Normal( µ ρ , σ 2 b field × nitrogen × phosphorus interaction effects (la- ρ ) tent variables) The expected value of the nitrogen × phosphorus inter- µ ρ ∼ Normal(mean = 0 , variance = 10) action effect σ 2 The variance of the nitrogen × phosphorus interaction ρ ∼ Exponential( λ = . 1) effects σ 2 The environmental/error standard deviation e ∼ Exponential( λ = . 1) 2.5 Deriving the likelihoods - and it is most helpful to focus on the likelihood ratios for the updates: Begin Answer 3
Updating α i When updating the a latent variables ( α i ), we have to consider a likelihood (the probability of the data for year i ) and a prior (the probability of drawing α i from a Normal(0 , σ 2 α ) distribution): n ijkm b 1 1 � � � � P ( Y i | ln P ( Y i | α i , β , γ , δ , ρ , σ e ) α i , β , γ , δ , ρ , σ e ) = f ( y ijkmq | α i , β j , γ j , δ j , ρ j , σ e ) j =1 k =0 m =0 q =1 (4) Notice that this looks just like Eqn (2) except that instead of taking the product over all values of i , we are only doing the calculation for one value of i . Taking the log, and substituting in for f ( y ijkmq | α i , β j , γ j , δ j , ρ j , σ e ), we can express the likelihood and the log-likelihood ratio for an update: n ijkm b 1 1 − − ( y ijkmq − α i − β j − kγ j − mδ j − kmρ j ) 2 � − 1 � � � � � � 2 πσ 2 � ln P ( Y i | α i , β , γ , δ , ρ , σ e ) = 2 ln e 2 σ 2 e j =1 k =0 m =0 q =1 n ijkm b 1 1 − n i 1 � � � � � 2 πσ 2 � ( y ijkmq − α i − β j − kγ j − mδ j − kmρ j ) 2 = 2 ln − e 2 σ 2 e j =1 k =0 m =0 q =1 ln LR ( α i → α ⋆ ln P ( Y i | α ⋆ i ) = i , β , γ , δ , ρ , σ e ) − ln P ( Y i | α i , β , γ , δ , ρ , σ e ) n ijkm b 1 1 − 1 � � � � ( y ijkmq − α ⋆ i − β j − kγ j − mδ j − kmρ j ) 2 = 2 σ 2 e j =1 k =0 m =0 q =1 n ijkm b 1 1 1 � � � � ( y ijkmq − α i − β j − kγ j − mδ j − kmρ j ) 2 . . . + 2 σ 2 e j =1 k =0 m =0 q =1 n ijkm n ijkm b 1 1 b 1 1 − 1 1 i ) 2 + � � � � � � � � ( Z α ( i, j, k, m, q ) − α ⋆ ( Z α ( i, j, k, m, q ) − α i ) 2 = 2 σ 2 2 σ 2 e e j =1 k =0 m =0 q =1 j =1 k =0 m =0 q =1 Z α ( i, j, k, m, q ) = y ijkmq − β j − kγ j − mδ j − kmρ j n ijkm b 1 1 1 ( Z α ( i, j, k, m, q ) − α i ) 2 − ( Z α ( i, j, k, m, q ) − α ⋆ � � � � ln LR ( α i → α ⋆ i ) 2 � � i ) = 2 σ 2 e j =1 m =0 q =1 k =0 n ijkm b 1 1 1 � � � � Z α ( i, j, k, m, q )( α ⋆ i − α i ) + α 2 i − ( α ⋆ i ) 2 � � = 2 σ 2 e m =0 q =1 j =1 k =0 n ijkm b 1 1 i − ( α ⋆ + ( α ⋆ n i ( α 2 i ) 2 ) i − α i ) � � � � = Z α ( i, j, k, m, q ) 2 σ 2 2 σ 2 e e j =1 k =0 m =0 q =1 n i ( α 2 i − ( α ⋆ i ) 2 ) + ( α ⋆ i − α i ) = W α ( i ) (5) 2 σ 2 2 σ 2 e e n ijkm b 1 1 � � � � W α ( i ) = Z α ( i, j, k, m, q ) j =1 k =0 m =0 q =1 n ijkm b 1 1 � � � � n i = 1 j =1 k =0 m =0 q =1 4
Recommend
More recommend