Copula Models for Dependent Data Analysis


  1. Copula Models for Dependent Data Analysis. Yihao Deng, Department of Mathematical Sciences, Purdue University Fort Wayne. December 5, 2019.

  2. Dependent Data
     - Data collected from family members (twins)
     - Returns of stocks from the same sector
     - Health measures from the same person (height, weight, blood pressure, cholesterol levels, etc.)

     Interest lies in the relation among the variables. The most popular measure is the correlation coefficient, which assumes that the variables are normally distributed.

  3. [Figure: two panels, ρ = 0.4 and ρ = 0.7]

  4. What If? The same dependence measure as in the previous normal case (ρ = 0.7).

  5. Copula
     A copula $C$ is a joint cumulative distribution function (cdf) whose marginals are all uniform on $(0, 1)$. Suppose that $Y_i \sim F_i$ with $F_i$ continuous; then $F_i(Y_i) \sim U(0, 1)$. The joint cdf $H$ of $Y_1, \dots, Y_k$ can be written as

     $$H(y_1, \dots, y_k) = C(F_1(y_1), \dots, F_k(y_k)).$$

     Let $U_i = F_i(Y_i)$, so that $Y_i = F_i^{-1}(U_i)$. The copula is then given by

     $$C(u_1, \dots, u_k) = H(F_1^{-1}(u_1), \dots, F_k^{-1}(u_k); \theta). \tag{1}$$
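
The statement $F_i(Y_i) \sim U(0, 1)$ can be checked by simulation. The following is a minimal sketch, assuming an exponential marginal; the rate, the seed, and the function names are illustrative choices, not part of the slides.

```python
import math
import random

def exponential_cdf(y, rate):
    """CDF of an Exponential(rate) variable: F(y) = 1 - exp(-rate * y)."""
    return 1.0 - math.exp(-rate * y)

def pit_sample(n, rate=2.0, seed=0):
    """Draw Y ~ Exponential(rate) and return U = F(Y) for each draw."""
    rng = random.Random(seed)
    return [exponential_cdf(rng.expovariate(rate), rate) for _ in range(n)]

# If U is uniform on (0, 1), its mean is close to 1/2 and its variance to 1/12.
u = pit_sample(20000)
mean_u = sum(u) / len(u)
var_u = sum((x - mean_u) ** 2 for x in u) / len(u)
```

Any other continuous marginal would do equally well; only the cdf has to match the distribution that generated the data.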

  6. Copula Examples
     Independence copula:

     $$C(u_1, u_2, \dots, u_k) = u_1 \times u_2 \times \cdots \times u_k$$

     Gaussian copula:

     $$C(u_1, u_2, \dots, u_k) = \Phi_k(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \dots, \Phi^{-1}(u_k); R)$$

     where

     $$\Phi_k(z_1, \dots, z_k) = \int_{-\infty}^{z_k} \cdots \int_{-\infty}^{z_1} \frac{1}{(2\pi)^{k/2} |R|^{1/2}} \, e^{-\frac{1}{2} t' R^{-1} t} \, dt_1 \dots dt_k$$

     and

     $$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2} \, dz$$
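
As a rough illustration of the Gaussian copula formula, the sketch below evaluates it with SciPy's multivariate normal cdf. The helper names and the test points are assumptions made for this example; the inputs must lie strictly inside (0, 1).

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_copula_cdf(u, corr):
    """Evaluate C(u_1..u_k) = Phi_k(Phi^{-1}(u_1), ..., Phi^{-1}(u_k); R)."""
    z = norm.ppf(np.asarray(u, dtype=float))  # map uniforms to normal scores
    k = len(z)
    dist = multivariate_normal(mean=np.zeros(k), cov=np.asarray(corr, dtype=float))
    return float(dist.cdf(z))

def bivariate_corr(rho):
    """2 x 2 correlation matrix with off-diagonal entry rho."""
    return np.array([[1.0, rho], [rho, 1.0]])

# With rho = 0 the Gaussian copula reduces to the independence copula u1 * u2.
c_indep = gaussian_copula_cdf([0.3, 0.7], bivariate_corr(0.0))
# With rho = 0.7, the value at (0.5, 0.5) is 1/4 + arcsin(rho) / (2 * pi).
c_dep = gaussian_copula_cdf([0.5, 0.5], bivariate_corr(0.7))
```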

  7. Copula Examples (continued)
     Archimedean copula:

     $$C(u_1, u_2, \dots, u_k) = \psi(\psi^{-1}(u_1) + \psi^{-1}(u_2) + \cdots + \psi^{-1}(u_k); \theta)$$

     - Clayton family: $\psi(t) = (1 + t)^{-1/\theta}$
     - Gumbel family: $\psi(t) = e^{-t^{1/\theta}}$
     - Frank family: $\psi(t) = -\frac{1}{\theta} \ln(1 + e^{-t}(e^{-\theta} - 1))$
     - Joe family: $\psi(t) = 1 - (1 - e^{-t})^{1/\theta}$
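
The four generators can be turned into code directly. This is a sketch under the assumption that the inverse generators are the ones obtained by solving $\psi(t) = u$ for $t$; the dictionary and function names are made up for the example.

```python
import math

# Each entry holds (psi, psi_inverse) for one family, with parameter th.
GENERATORS = {
    "clayton": (lambda t, th: (1.0 + t) ** (-1.0 / th),
                lambda u, th: u ** (-th) - 1.0),
    "gumbel": (lambda t, th: math.exp(-(t ** (1.0 / th))),
               lambda u, th: (-math.log(u)) ** th),
    "frank": (lambda t, th: -math.log1p(math.exp(-t) * math.expm1(-th)) / th,
              lambda u, th: -math.log(math.expm1(-th * u) / math.expm1(-th))),
    "joe": (lambda t, th: 1.0 - (1.0 - math.exp(-t)) ** (1.0 / th),
            lambda u, th: -math.log(1.0 - (1.0 - u) ** th)),
}

def archimedean_cdf(u, theta, family):
    """C(u_1..u_k) = psi(psi^{-1}(u_1) + ... + psi^{-1}(u_k); theta)."""
    psi, psi_inv = GENERATORS[family]
    return psi(sum(psi_inv(ui, theta) for ui in u), theta)

# Bivariate Clayton has the closed form (u^-theta + v^-theta - 1)^(-1/theta).
clayton_value = archimedean_cdf([0.3, 0.6], 2.0, "clayton")
```

A quick sanity check for any family is that setting one argument to 1 returns the other argument, since the marginals of a copula are uniform.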

  8. Modeling of Dependence
     Gaussian copula:

     $$R = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} & \cdots & \rho_{1k} \\ \rho_{12} & 1 & \rho_{23} & \cdots & \rho_{2k} \\ \rho_{13} & \rho_{23} & 1 & \cdots & \rho_{3k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{1k} & \rho_{2k} & \rho_{3k} & \cdots & 1 \end{pmatrix}$$

     which should be positive definite.

     Archimedean copula: exchangeable dependence structure, that is, the dependence is assumed to be the same among all pairs of variables.
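
The positive definiteness requirement on $R$ can be checked numerically. The sketch below uses a Cholesky factorization as the test; the function names and the tolerance are assumptions for illustration.

```python
import numpy as np

def is_valid_correlation_matrix(corr, tol=1e-10):
    """Return True if corr is symmetric, has unit diagonal, and is positive definite."""
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 2 or corr.shape[0] != corr.shape[1]:
        return False
    if not np.allclose(corr, corr.T, atol=tol):
        return False
    if not np.allclose(np.diag(corr), 1.0, atol=tol):
        return False
    try:
        np.linalg.cholesky(corr)  # succeeds only for positive definite input
    except np.linalg.LinAlgError:
        return False
    return True

def exchangeable_corr(k, rho):
    """k x k matrix with 1 on the diagonal and rho everywhere else."""
    return (1.0 - rho) * np.eye(k) + rho * np.ones((k, k))

valid = is_valid_correlation_matrix(exchangeable_corr(4, 0.5))
invalid = is_valid_correlation_matrix(exchangeable_corr(4, -0.5))
```

The second call fails because an exchangeable $k \times k$ matrix is positive definite only when $-1/(k-1) < \rho < 1$.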

  9. Modeling of Marginal Distribution
     The random variable $Y$ is often related to some covariates ($X_1, X_2, \dots, X_p$, or in matrix notation $X$), where the mean $E(Y)$ is linked to the covariates via $E(Y) = g^{-1}(X\beta)$. Therefore, the effect of the covariates can be incorporated into copula models as

     $$U_i = F_i(Y_i; g^{-1}(X_i \beta))$$

     Examples:
     - Probit function: $u_i = \Phi\left(\dfrac{y_i - X_i \beta}{\hat{\sigma}}\right)$
     - Logistic function: $u_i = \left(1 + e^{-\frac{y_i - X_i \beta}{\hat{\sigma}}}\right)^{-1}$
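
The two example transforms can be sketched as follows. The covariate values and coefficients in the usage lines are hypothetical and serve only to show the call pattern.

```python
import math

def linear_predictor(x, beta):
    """Compute X_i * beta for one observation."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def probit_u(y, x, beta, sigma):
    """u = Phi((y - x*beta) / sigma), with Phi the standard normal cdf."""
    z = (y - linear_predictor(x, beta)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_u(y, x, beta, sigma):
    """u = 1 / (1 + exp(-(y - x*beta) / sigma))."""
    z = (y - linear_predictor(x, beta)) / sigma
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical observation: intercept plus one covariate.
x_i = [1.0, 0.5]
beta = [2.0, -1.0]
u_probit = probit_u(2.0, x_i, beta, sigma=1.5)
u_logistic = logistic_u(2.0, x_i, beta, sigma=1.5)
```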

  10. Maximum Likelihood Estimation
      Once the marginal distributions and the dependence structure are formulated, the log-likelihood function is simply

      $$\ell = \sum \ln(c(u_1, \dots, u_k; \beta, \theta))$$

      where $c(u_1, \dots, u_k)$ is the corresponding copula density function. The optimization needs to be done numerically. The R function `optim` and the Python function `minimize` (in `scipy.optimize`) are helpful here.
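
A small end-to-end sketch of numerical maximum likelihood with `scipy.optimize.minimize`. It assumes a bivariate Clayton copula with known uniform margins, simulated data, and a bounded L-BFGS-B search; none of these choices come from the slides.

```python
import numpy as np
from scipy.optimize import minimize

def clayton_log_density(u, v, theta):
    """Log of the bivariate Clayton copula density c(u, v; theta), theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (np.log1p(theta)
            - (theta + 1.0) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(s))

def simulate_clayton(n, theta, seed=0):
    """Sample (u, v) from a Clayton copula by inverting the conditional cdf."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

def fit_clayton(u, v):
    """Maximize the log-likelihood by minimizing its negative."""
    def neg_loglik(params):
        return -np.sum(clayton_log_density(u, v, params[0]))
    result = minimize(neg_loglik, x0=[1.0], method="L-BFGS-B", bounds=[(1e-3, 20.0)])
    return float(result.x[0]), float(-result.fun)

u, v = simulate_clayton(2000, theta=2.0)
theta_hat, loglik = fit_clayton(u, v)
```

In a full model the regression coefficients $\beta$ would enter through the $u_i$ and be optimized jointly with $\theta$; here the margins are fixed to keep the sketch short.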

  11. Hierarchical Archimedean Copula
      Recall that the dependence in Archimedean copulas is assumed to be the same everywhere. The hierarchical Archimedean copula (HAC) was proposed to account for more complicated dependence structures.

      [Figure: three example tree structures (a), (b), (c) joining U_1, U_2, U_3, U_4 through nested generators φ(·; θ_1), ϕ(·; θ_2), ψ(·; θ_3). Caption: Examples of HAC with four random variables.]
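
The nesting idea can be sketched for three variables, where U_1 and U_2 are joined by an inner copula and the result is joined with U_3 by an outer one. The sketch assumes the Clayton family at both levels and uses the sufficient nesting condition for that case (inner parameter at least as large as the outer one); the function names are illustrative.

```python
def clayton_psi(t, theta):
    """Clayton generator psi(t) = (1 + t)^(-1/theta)."""
    return (1.0 + t) ** (-1.0 / theta)

def clayton_psi_inv(u, theta):
    """Inverse generator psi^{-1}(u) = u^(-theta) - 1."""
    return u ** (-theta) - 1.0

def nested_clayton_cdf(u1, u2, u3, theta_inner, theta_outer):
    """C(u1, u2, u3) = C_outer(C_inner(u1, u2), u3) with Clayton generators."""
    if theta_inner < theta_outer:
        raise ValueError("need theta_inner >= theta_outer for a valid nesting")
    inner = clayton_psi(
        clayton_psi_inv(u1, theta_inner) + clayton_psi_inv(u2, theta_inner),
        theta_inner,
    )
    return clayton_psi(
        clayton_psi_inv(inner, theta_outer) + clayton_psi_inv(u3, theta_outer),
        theta_outer,
    )

hac_value = nested_clayton_cdf(0.3, 0.6, 0.5, theta_inner=3.0, theta_outer=1.5)
```

Setting u3 = 1 recovers the inner copula of (u1, u2), while setting u2 = 1 recovers the outer copula of (u1, u3), which is how the two levels carry different strengths of dependence.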

  12. Vine Copula
      A more flexible copula model is the vine copula, which builds the dependence hierarchy using “pair copulas”.

      [Figure: Tree 1, Tree 2, Tree 3, with labels 1, 2, 3, 12, 13, and 23|1. Caption: Example of vine construction with three random variables.]
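
For three variables with pairs 12, 13, and 23|1, a pair-copula construction multiplies the two first-tree densities by a conditional pair density evaluated at conditional cdfs (h-functions). The sketch below assumes Clayton pair copulas throughout, with `None` standing for the independence copula in the second tree; all names are illustrative.

```python
def clayton_density(u, v, theta):
    """Bivariate Clayton copula density, theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-2.0 - 1.0 / theta)

def clayton_h(v, u, theta):
    """Conditional cdf h(v | u) = dC(u, v)/du for the Clayton copula."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return u ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)

def vine_density(u1, u2, u3, theta12, theta13, theta23_1=None):
    """c(u1,u2,u3) = c12 * c13 * c23|1(h(u2|u1), h(u3|u1))."""
    density = clayton_density(u1, u2, theta12) * clayton_density(u1, u3, theta13)
    if theta23_1 is not None:  # None means independence given variable 1
        density *= clayton_density(
            clayton_h(u2, u1, theta12), clayton_h(u3, u1, theta13), theta23_1
        )
    return density

d_full = vine_density(0.4, 0.5, 0.7, theta12=2.0, theta13=1.0, theta23_1=0.5)
d_indep = vine_density(0.4, 0.5, 0.7, theta12=2.0, theta13=1.0)
```

Each pair could use a different family, which is where the extra flexibility of the vine comes from.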

  13. Family Data
      Blood samples from members of 22 families were collected, and erythrocyte adenosine triphosphate (ATP) levels were determined before and after storage at 4 °C in acid citrate dextrose solution for 21 days.

      | famID | Member   | Gender | Age | pre-ATP | post-ATP | y |
      |-------|----------|--------|-----|---------|----------|---|
      | 2     | Mother   | 0      | 62  | 4.43    | 2.49     | 1 |
      | 2     | Father   | 1      | 62  | 3.72    | 1.79     | 1 |
      | 2     | Son      | 1      | 24  | 4.18    | 1.49     | 1 |
      | 2     | Son      | 1      | 41  | 4.81    | 2.84     | 1 |
      | 2     | Daughter | 0      | 31  | 4.42    | 2.04     | 1 |
      | 2     | Daughter | 0      | 38  | 3.65    | 1.17     | 1 |
      | …     | …        | …      | …   | …       | …        | … |

      Source: Dern R. and Wiorkowski J. (1969).

  14. Modeling Discrete Binary Responses
      By introducing continuous uniform variables $U_i$, we categorize $Y_i$ as follows:

      $$Y_i = \begin{cases} 1 & \text{if } 0 \le U_i \le \eta_i \\ 0 & \text{if } \eta_i < U_i \le 1 \end{cases}$$

      where $\eta_i = g^{-1}(X\beta)$. We may now model the dependence among the continuous variables $U_i$ rather than the discrete variables $Y_i$. The log-likelihood function to be maximized is

      $$\ell = \sum \ln P(Y_i = \{0/1\})$$
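
For two binary responses, the latent-uniform construction gives the four joint probabilities as differences of the copula evaluated at the thresholds. The sketch below assumes a Clayton copula and a logit link purely for illustration; the linear predictor values are hypothetical.

```python
import math

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula cdf, theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def inv_logit(x):
    """Inverse logit link: eta = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def joint_binary_probs(eta1, eta2, theta):
    """P(Y1 = a, Y2 = b), where Y_i = 1 exactly when U_i <= eta_i."""
    c = clayton_cdf(eta1, eta2, theta)
    return {
        (1, 1): c,
        (1, 0): eta1 - c,
        (0, 1): eta2 - c,
        (0, 0): 1.0 - eta1 - eta2 + c,
    }

def pair_log_likelihood(y1, y2, eta1, eta2, theta):
    """Log-probability of one observed pair of 0/1 responses."""
    return math.log(joint_binary_probs(eta1, eta2, theta)[(y1, y2)])

eta1, eta2 = inv_logit(0.4), inv_logit(-0.8)  # hypothetical linear predictors
probs = joint_binary_probs(eta1, eta2, theta=2.0)
```

With more than two responses the same idea applies, but each probability becomes a sum over the corners of a higher-dimensional rectangle.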

  15. Gaussian Copula Modeling
      The dependence among family members is assumed to be

      $$R = \begin{array}{c|cccccc} & M & F & Ch_1 & Ch_2 & Ch_3 & \cdots \\ \hline M & 1 & \gamma & \rho_1 & \rho_1 & \rho_1 & \cdots \\ F & \gamma & 1 & \rho_2 & \rho_2 & \rho_2 & \cdots \\ Ch_1 & \rho_1 & \rho_2 & 1 & \alpha & \alpha & \cdots \\ Ch_2 & \rho_1 & \rho_2 & \alpha & 1 & \alpha & \cdots \\ Ch_3 & \rho_1 & \rho_2 & \alpha & \alpha & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{array}$$

      Evaluation of the log-likelihood function is computationally intensive since it involves multivariate integration over a hyper-rectangle.
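
The structured matrix can be assembled for a family with any number of children. This is a sketch; the function name and the ordering (mother, father, then children) are assumptions that follow the layout of the matrix above, and the parameter values in the usage line are arbitrary.

```python
import numpy as np

def family_corr(n_children, gamma, rho1, rho2, alpha):
    """Correlation matrix ordered as mother, father, child 1, ..., child m."""
    k = 2 + n_children
    corr = np.full((k, k), float(alpha))   # child-child entries
    corr[0, 1] = corr[1, 0] = gamma        # mother-father
    corr[0, 2:] = corr[2:, 0] = rho1       # mother-child
    corr[1, 2:] = corr[2:, 1] = rho2       # father-child
    np.fill_diagonal(corr, 1.0)
    return corr

# Arbitrary illustrative values for a family with three children.
R = family_corr(3, gamma=0.3, rho1=0.5, rho2=0.2, alpha=0.6)
min_eigenvalue = float(np.linalg.eigvalsh(R).min())
```

Checking the smallest eigenvalue is one way to confirm that a given parameter combination yields a positive definite matrix before it is used in the likelihood.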

  16. Analysis Result

      | Parameter | Estimate | S.E.  | p-value |
      |-----------|----------|-------|---------|
      | Intercept | 12.466   | 1.490 | < 0.001 |
      | Gender    | −0.638   | 0.556 | 0.251   |
      | Pre-ATP   | −2.517   | 0.292 | < 0.001 |
      | γ         | 0.281    | 0.398 | 0.480   |
      | ρ_1       | 0.518    | 0.274 | 0.059   |
      | ρ_2       | 0.208    | 0.376 | 0.580   |
      | α         | 0.568    | 0.289 | 0.050   |

      log-likelihood = −39.195 with logit link function

  17. HAC Modeling
      Selecting hierarchical dependence structures:

      [Figure: three candidate hierarchies (a), (b), (c) over Mo, Fa, Ch_1, Ch_2, …, built from generators φ(·; θ_1), ϕ(·; θ_2), ψ(·; θ_3).]

      Selecting Archimedean copula families at each level: for simplicity, I used the same family at all levels to avoid compatibility issues.

  18. Analysis Result
      Hierarchy (b) turns out to be the best model, and the Frank family is selected.

      | Parameter | Estimate | S.E.  | p-value |
      |-----------|----------|-------|---------|
      | Intercept | 12.666   | 3.257 | < 0.001 |
      | Gender    | −0.804   | 0.548 | 0.143   |
      | Pre-ATP   | −2.561   | 0.671 | < 0.001 |
      | θ_3       | 1.316    | 1.681 | 0.434   |
      | θ_2       | 2.190    | 2.610 | 0.402   |
      | θ_1       | 4.464    | 3.577 | 0.212   |

      log-likelihood = −39.588 with logit link function

  19. Vine Copula Modeling
      Pairing processes:

      [Figure: Tree 1, Tree 2, Tree 3 of the family vine, with labels Mo, Fa, Ch_1, …, Ch_m; M.F, M.Ch_1, …, M.Ch_m; F.Ch_1|M, …, F.Ch_m|M.]

      Selecting pair copulas: find the maximized log-likelihood among all possible combinations.

  20. Analysis Result
      The Joe family and the independence copula are selected for the pair copulas.

      | Parameter | Estimate | S.E.  | p-value |
      |-----------|----------|-------|---------|
      | Intercept | 14.348   | 3.663 | < 0.001 |
      | Gender    | −0.738   | 0.566 | 0.193   |
      | Pre-ATP   | −2.902   | 0.738 | < 0.001 |
      | θ_12      | 1.584    | 0.689 | 0.021   |
      | θ_13      | 1.837    | 0.885 | 0.038   |
      | θ_23\|1   | —        | —     | —       |
      | θ_3\|12   | 2.705    | 2.163 | 0.211   |

      log-likelihood = −38.138 with logit link function

  21. Thank you!

