From Yates Algorithm to Multidimensional Smoothing with GLAM



  1. From Yates Algorithm to Multidimensional Smoothing with GLAM

     Iain Currie, Heriot-Watt University
     with James Kirkby, Maria Durban & Paul Eilers
     University of Edinburgh, December 2006

     Plan of talk
     • Classical regression
     • Factorial designs and Yates algorithm
     • Yates algorithm as an array computation
     • Generalised linear models
     • Array regression with GLAM
     • Examples
     Themes: the structure of data and models, and algorithms.

     Classical regression
     Structure
     • Data: a data vector y, n x 1
     • Model: a model matrix X, n x c; a parameter vector θ, c x 1;
       µ = E(y) = Xθ
     • Error distribution: normal
     Algorithm
     • Normal equations: X′X θ̂ = X′y

     Universal recipe
       Fit <- lm(y ~ X - 1)
     A model algebra
       Fit <- lm(y ~ x + x2)
       Fit <- lm(y ~ A + B + A:B)
       Fit <- lm(y ~ x + x2 + A*B)
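     To make the recipe concrete, here is a small R check (simulated data,
     all names invented) that lm(y ~ X - 1) solves the normal equations
     X′X θ̂ = X′y:

       # Simulate a small regression, fit with the universal recipe, and
       # check the normal equations directly.
       set.seed(1)
       n <- 50
       x <- runif(n)
       X <- cbind(1, x, x^2)                     # model matrix, n x c
       y <- drop(X %*% c(1, 2, -1)) + rnorm(n, sd = 0.1)
       fit <- lm(y ~ X - 1)                      # the universal recipe
       theta.hat <- solve(crossprod(X), crossprod(X, y))  # X'X theta = X'y
       all.equal(unname(coef(fit)), drop(unname(theta.hat)))  # TRUE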

  2. Factorial designs

     Example: 2³ design, John & Quenouille, p. 64.

     Treatment   Yield   Effect   Estimate x 8
     1            94.1   1         867.6
     n           107.5   N          68.8
     p            97.0   P          24.8
     np          113.5   NP        -10.0
     k            96.9   K          43.4
     nk          122.9   NK          9.0
     pk          111.4   PK          7.0
     npk         124.3   NPK       -16.2

     Model:
         y_ijk = µ + α_i + β_j + γ_k + (αβ)_ij + (αγ)_ik + (βγ)_jk
                 + (αβγ)_ijk,   i, j, k = 1, 2.

     Analysis in R
       Fit <- lm(y ~ N * P * K)
       X <- model.matrix(y ~ N * P * K)
     The model matrix X has columns 1, N, P, NP, K, NK, PK, NPK:

             1    N    P   NP    K   NK   PK  NPK
     1       1   -1   -1    1   -1    1    1   -1
     n       1    1   -1   -1   -1   -1    1    1
     p       1   -1    1   -1   -1    1   -1    1
     np      1    1    1    1   -1   -1   -1   -1
     k       1   -1   -1    1    1   -1   -1    1
     nk      1    1   -1   -1    1    1   -1   -1
     pk      1   -1    1   -1    1   -1    1   -1
     npk     1    1    1    1    1    1    1    1

     Since X′X = 8 I₈, the normal equations give θ̂ = (1/8) X′y.

     Yates algorithm: successive sums and differences of pairs.

     Treatment   Yield    (1)     (2)     (3)    Effect
     1            94.1   201.6   412.1   867.6   1
     n           107.5   210.5   455.5    68.8   N
     p            97.0   219.8    29.9    24.8   P
     np          113.5   235.7    38.9   -10.0   NP
     k            96.9    13.4     8.9    43.4   K
     nk          122.9    16.5    15.9     9.0   NK
     pk          111.4    26.0     3.1     7.0   PK
     npk         124.3    12.9   -13.1   -16.2   NPK
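     The whole analysis is a few lines of R. This sketch uses the yields
     from the table above; note that model.matrix orders the columns
     1, N, P, K, NP, NK, PK, NPK, not quite the table's order:

       # Yields in standard order: 1, n, p, np, k, nk, pk, npk.
       yield <- c(94.1, 107.5, 97.0, 113.5, 96.9, 122.9, 111.4, 124.3)
       N <- rep(c(-1, 1), 4)                  # -1 = absent, 1 = present
       P <- rep(rep(c(-1, 1), each = 2), 2)
       K <- rep(c(-1, 1), each = 4)
       X <- model.matrix(~ N * P * K)
       crossprod(X)                           # X'X = 8 I_8
       drop(crossprod(X, yield))              # X'y: 867.6, 68.8, 24.8, 43.4, ...
       8 * coef(lm(yield ~ N * P * K))        # the same, via the universal recipe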

  3. Kronecker products

     The 2³ model matrix has a recursive structure. With

         Z = | 1  -1 |
             | 1   1 |

     the 8 x 8 matrix X splits into blocks:

         X = | X1  -X1 | = Z ⊗ X1,   X1 = Z ⊗ X2,   X2 = Z,
             | X1   X1 |

     so that X = Z ⊗ Z ⊗ Z.

     Restatement of problem
     Evaluate X′y where
     • Model structure: X = Z ⊗ Z ⊗ Z, a Kronecker product
     • Data structure: ??

     The data also have structure: the eight yields sit at the corners
     of a cube, i.e. Y is a 2 x 2 x 2 array indexed by the levels of N,
     P and K. [Figure: the yields 94.1, 107.5, 97.0, 113.5, 96.9, 122.9,
     111.4, 124.3 at the corners of a cube.]

     The 2-d version of the problem: evaluate X′y where
     • Data structure: Y, an n1 x n2 array
     • Model structure: X = X2 ⊗ X1, with Xi of size ni x ci.
     Then, with y = vec(Y),

         (X2 ⊗ X1)′y ≡ X1′ Y X2.
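     The 2-d identity is easy to check numerically in R (%x% is the
     Kronecker product; the dimensions here are arbitrary):

       # Check (X2 (x) X1)'y == vec(X1' Y X2), with y = vec(Y).
       set.seed(2)
       n1 <- 4; c1 <- 2; n2 <- 5; c2 <- 3
       X1 <- matrix(rnorm(n1 * c1), n1, c1)
       X2 <- matrix(rnorm(n2 * c2), n2, c2)
       Y  <- matrix(rnorm(n1 * n2), n1, n2)   # data as an n1 x n2 array
       y  <- as.vector(Y)                     # vec stacks the columns of Y
       lhs <- crossprod(X2 %x% X1, y)         # c1 c2 x 1
       rhs <- t(X1) %*% Y %*% X2              # c1 x c2
       all.equal(as.vector(lhs), as.vector(rhs))   # TRUE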

  4. Efficient computation

     In 2-d:  (X2 ⊗ X1)′y ≡ X1′ Y X2 = (X2′(X1′Y)′)′.
     In 3-d:  (X3 ⊗ X2 ⊗ X1)′y ≡ (X3′(X2′(X1′Y)′)′)′ ≡ ???
     (Y is now a 3-d array, so the matrix expression no longer makes
     sense as it stands.)

     • No need to compute X.
     • Example: X1 and X2 both 10³ x 10². Computing X′y directly (LHS)
       costs 10¹⁰ multiplications; the right-hand side costs about 10⁸.
     • The RHS needs reformatting between steps.

     The rotated H-transform
     Definition: for X, an n1 x c1 matrix, and A, a c1 x c2 x c3 array,
     ρ(X, A) is computed as

         A, c1 x c2 x c3  →  XA, n1 x c2c3  →  A, n1 x c2 x c3
                          →  A, c2 x c3 x n1,

     that is: flatten A to a matrix, premultiply by X, reshape, and
     rotate the dimensions. Then

         (X3 ⊗ X2 ⊗ X1)′y ≡ ρ(X3′, ρ(X2′, ρ(X1′, Y))).

     Example: the 2³ factorial design. Evaluate X′y, with X = Z ⊗ Z ⊗ Z,
     Z 2 x 2 and Y 2 x 2 x 2:

         ρ(Z′, ρ(Z′, ρ(Z′, Y))) = X′y = 8 Θ̂, a 2 x 2 x 2 array.

     This is exactly the Yates algorithm: the columns (1), (2), (3) of
     the Yates table are the three nested transforms, and the effect
     totals 867.6, 68.8, 24.8, -10.0, 43.4, 9.0, 7.0, -16.2 end up at
     the corners of a cube. [Figure: the Yates table and the cube of
     effect totals.]
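     Here is a sketch of ρ in R, written straight from the definition
     above (this is my reading of the slides, not the authors' code),
     together with the 2³ check that three nested ρ's reproduce the
     Yates table:

       # rho(X, A): X is n1 x c1, A is c1 x c2 x c3.
       rho <- function(X, A) {
         d <- dim(A)
         B <- X %*% matrix(A, nrow = d[1])       # X A: n1 x (c2 c3)
         B <- array(B, c(nrow(X), d[2], d[3]))   # reshape: n1 x c2 x c3
         aperm(B, c(2, 3, 1))                    # rotate:  c2 x c3 x n1
       }
       Z <- matrix(c(1, 1, -1, 1), 2, 2)         # Z = (1 -1; 1 1)
       yield <- c(94.1, 107.5, 97.0, 113.5, 96.9, 122.9, 111.4, 124.3)
       Y <- array(yield, c(2, 2, 2))             # indices: N (fastest), P, K
       rho(t(Z), rho(t(Z), rho(t(Z), Y)))        # X'y = 8 theta-hat, 2 x 2 x 2
       # [1,1,1] = 867.6 (total), [2,1,1] = 68.8 (N), [1,2,1] = 24.8 (P), ...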

  5. Generalised linear models

     Structure
     • Data: a data vector y
     • Model: a model matrix X; a parameter vector θ;
       a link function: µ = E(y), g(µ) = Xθ
     • Error distribution: normal, Poisson, binomial, ...
     Algorithm
     • Scoring algorithm:
           X′W̃δX θ̂ = X′W̃δz̃,   z̃ = Xθ̃ + W̃δ⁻¹(y − µ̃),
       where tildes denote evaluation at the current estimate.

     Computation of X′WδX in 1-d
     Let X = (x_1, ..., x_c), n x c, and Wδ = diag(w). Then

         (X′WδX)_jk = Σ_i w_i x_ij x_ik = (x_j ∗ x_k)′w,

     where ∗ denotes the element-wise product.

     Definition: the row tensor of X, n x c, is

         G(X) = (X ⊗ 1′) ∗ (1′ ⊗ X),   n x c²,

     where 1 is a vector of 1's of length c. Hence

         X′WδX, c x c  ≡  G(X)′w, c² x 1.

     Computation of X′WδX in 3-d
     Let Xi be ni x ci, i = 1, 2, 3. We need X′WδX, c1c2c3 x c1c2c3,
     where X = X3 ⊗ X2 ⊗ X1 is n1n2n3 x c1c2c3. Wδ is diagonal,
     n1n2n3 x n1n2n3, and W is the corresponding n1 x n2 x n3 array.
     Then

         X′WδX  ≡  ρ(G(X3)′, ρ(G(X2)′, ρ(G(X1)′, W))),   c1² x c2² x c3².

     Standard errors of Xθ̂
     We need diag X(X′WδX)⁻¹X′ = diag XSmX′, where Sm is
     c1c2c3 x c1c2c3. Let S, c1² x c2² x c3², be the array form of Sm.
     Then

         diag XSmX′, n1n2n3 x 1  ≡  ρ(G(X3), ρ(G(X2), ρ(G(X1), S))),
                                     n1 x n2 x n3.
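     The row tensor is one line of R, and the 1-d identity
     X′WδX ≡ G(X)′w is easy to verify (dimensions illustrative):

       # Row tensor G(X) = (X (x) 1') * (1' (x) X), an n x c^2 matrix.
       rowtensor <- function(X) {
         one <- matrix(1, 1, ncol(X))
         (X %x% one) * (one %x% X)
       }
       set.seed(3)
       n <- 6; c <- 3
       X <- matrix(rnorm(n * c), n, c)
       w <- runif(n)                              # weights, W = diag(w)
       lhs <- t(X) %*% (w * X)                    # X'WX, c x c
       rhs <- matrix(crossprod(rowtensor(X), w), c, c)  # G(X)'w, reshaped
       all.equal(lhs, rhs)                        # TRUE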

  6. Computation: a summary

     Linear functions
         X′y, c1c2c3 x 1   ≡  ρ(X3′, ρ(X2′, ρ(X1′, Y))),  c1 x c2 x c3
         Xθ, n1n2n3 x 1    ≡  ρ(X3, ρ(X2, ρ(X1, Θ))),     n1 x n2 x n3
     Inner products
         X′WδX, c1c2c3 x c1c2c3  ≡  ρ(G(X3)′, ρ(G(X2)′, ρ(G(X1)′, W))),
                                     c1² x c2² x c3²
     Diagonal function
         diag XSmX′, n1n2n3 x 1  ≡  ρ(G(X3), ρ(G(X2), ρ(G(X1), S))),
                                     n1 x n2 x n3

     Generalised linear array models
     Structure
     • Data: a data array Y, n1 x n2 x n3
     • Model: a model matrix X = X3 ⊗ X2 ⊗ X1, Xi of size ni x ci;
       a parameter array Θ, c1 x c2 x c3;
       a link function: M = E(Y), g(M) = ρ(X3, ρ(X2, ρ(X1, Θ)))
     • Error distribution: normal, Poisson, binomial, ...
     Algorithm
     • Efficient scoring algorithm: array computation of
           X′W̃δX θ̂ = X′W̃δz̃,   z̃ = Xθ̃ + W̃δ⁻¹(y − µ̃).

     LM, GLM & GLAM

                  LM              GLM            GLAM
     Data         y               y              Y, a d-dim array
     Model        µ = Xθ          g(µ) = Xθ      g(M) = ρ(Xd, ..., ρ(X1, Θ) ...)
     Error        normal          N, P, B, ...   N, P, B, ...
     Algorithm    normal equns    scoring        efficient scoring

     Examples
     • Factorial designs
     • Contingency tables
     • Smooth models with P-splines
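     These identities can be checked with the rho() and rowtensor()
     sketches above. The trickiest is the inner product, whose
     c1² x c2² x c3² result has to be re-indexed to the usual
     c1c2c3 x c1c2c3 matrix; the reshape below is my own bookkeeping,
     which the slides leave implicit:

       # Check the GLAM inner product against a direct computation.
       set.seed(4)
       X1 <- matrix(rnorm(3 * 2), 3, 2)           # n1 = 3, c1 = 2
       X2 <- matrix(rnorm(4 * 2), 4, 2)           # n2 = 4, c2 = 2
       X3 <- matrix(rnorm(5 * 3), 5, 3)           # n3 = 5, c3 = 3
       X <- X3 %x% X2 %x% X1                      # n1n2n3 x c1c2c3 = 60 x 12
       w <- runif(nrow(X))
       W <- array(w, c(3, 4, 5))                  # weights as n1 x n2 x n3 array
       A <- rho(t(rowtensor(X3)),
                rho(t(rowtensor(X2)),
                    rho(t(rowtensor(X1)), W)))    # c1^2 x c2^2 x c3^2
       A6 <- array(A, c(2, 2, 2, 2, 3, 3))        # indices (k1,j1,k2,j2,k3,j3)
       XWX <- matrix(aperm(A6, c(2, 4, 6, 1, 3, 5)), 12, 12)
       all.equal(XWX, t(X) %*% (w * X))           # TRUE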

  7. A single cubic B-spline; 1-d B-spline basis

     [Figures: a single cubic B-spline, and a 1-d basis of cubic
     B-splines, plotted over ages 20-80.]

     P-splines in 2-d
     Structure
     • Data: deaths matrix D, exposure matrix E
     • Model: a model matrix B = By ⊗ Ba, with Ba, 80 x 20, and
       By, 100 x 25; a parameter array Θ, 20 x 25;
       a link function: M = E(D), log(M) = log E + Ba Θ By′
     • Error distribution: Poisson
     Algorithm
     • Efficient penalised scoring algorithm: array computation of
           (B′W̃δB + P)θ̂ = B′W̃δz̃,   z̃ = Bθ̃ + W̃δ⁻¹(d − µ̃),
       where P = P(λa, λy) is the penalty matrix (one standard choice
       is sketched below).

     [Figure: a two-dimensional B-spline over age and year.]
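     The slides only name the penalty P = P(λa, λy); a standard
     P-spline choice, assumed here, is second-order difference
     penalties on the rows (age) and columns (year) of Θ:

       # P(lambda.a, lambda.y) for B = By (x) Ba, Theta 20 x 25, assuming
       # second-order difference penalties; the smoothing parameters
       # lambda.a and lambda.y are illustrative values.
       ca <- 20; cy <- 25
       Da <- diff(diag(ca), differences = 2)      # (ca - 2) x ca
       Dy <- diff(diag(cy), differences = 2)      # (cy - 2) x cy
       lambda.a <- 10; lambda.y <- 100
       P <- lambda.a * (diag(cy) %x% crossprod(Da)) +
            lambda.y * (crossprod(Dy) %x% diag(ca))   # 500 x 500

     The penalised scoring step then replaces B′WδB by B′WδB + P, with
     B′WδB computed by the array inner-product recipe of section 6.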

  8. Swedish females: ages 20 to 90

     [Figure: observed log(mortality) against year, 1900-2000, for
     Swedish females aged 20 to 90.]

     Smooth Lee-Carter models
     Structure
     • Data: deaths matrix D, exposure matrix E
     • Model: D_ij ~ Poisson(µ_ij), log µ_ij = log E_ij + α_i + β_i κ_j
     The model is bilinear, so fitting alternates between two GLAM steps
     (a sketch of the first follows at the end of this section).
     For given κ̂:
         M = E(D),  log(M) = log E + Ba Θ K̂′,
     where α = Ba a, β = Ba b, Θ = [a : b], K̂ = [1 : κ̂].
     For given α̂ and β̂:
         M = E(D),  log(M) = log E + 1′ ⊗ α̂ + β̂ k′Bk′,
     where κ = Bk k.
     Algorithm: efficient penalised scoring algorithm.

     [Figures: fitted α and β against age, fitted κ against year, and
     observed and fitted log(mortality) at age 65, 1950-2000.]

     Microarray data
     Data are arranged on a 72 x 128 grid in 8 blocks of 2 rows and
     4 columns, modelled as

         E(Y) = B3 Ă B3′ + B0 C̆ B0′.

     [Figure: the microarray image.]
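     As flagged above, here is a minimal sketch of the "given κ̂" step:
     the model is then an ordinary GLAM with row matrix Ba and column
     matrix K̂ = [1 : κ̂] (function and argument names are mine):

       # Linear predictor for the smooth Lee-Carter step with kappa fixed.
       # Ba: na x ca B-spline basis over age; Theta = cbind(a, b), ca x 2;
       # kappa.hat: length-ny vector; E: na x ny exposure matrix.
       lee.carter.eta <- function(Ba, Theta, kappa.hat, E) {
         K <- cbind(1, kappa.hat)            # K-hat = [1 : kappa-hat], ny x 2
         log(E) + Ba %*% Theta %*% t(K)      # log M = log E + Ba Theta K-hat'
       }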
