From Yates Algorithm to Multidimensional Smoothing with GLAM

Iain Currie, Heriot-Watt University
with James Kirkby, Maria Durban & Paul Eilers
University of Edinburgh, December 2006

Plan of talk
• Classical regression
• Factorial designs and the Yates algorithm
• The Yates algorithm as an array computation
• Generalised linear models
• Array regression with GLAM
• Examples

Themes
• Structure of data and models
• Algorithms

Classical regression

Structure
• Data: a data vector y, n × 1
• Model: a model matrix X, n × c
  a parameter vector θ, c × 1
  μ = E(y) = Xθ
• Error distribution: normal

Algorithm
• Normal equations: X′X θ̂ = X′y

Universal recipe
  Fit <- lm(y ~ X - 1)

A model algebra
  Fit <- lm(y ~ x + x2)
  Fit <- lm(y ~ A + B + A:B)
  Fit <- lm(y ~ x + x2 + A*B)
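The normal equations can be checked with a minimal numpy sketch (hypothetical data; in R this is exactly what lm(y ~ X - 1) computes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, c = 20, 3
X = rng.normal(size=(n, c))          # model matrix, n x c
theta = np.array([1.0, -2.0, 0.5])   # true parameter vector
y = X @ theta                        # noise-free response, for a clean check

# Normal equations: X'X theta_hat = X'y
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With noise-free data the solution recovers θ exactly (up to floating point).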
Factorial designs

Example: 2³ design, John & Quenouille, p. 64

  Treatment   Yield   Effect   Estimate × 8
  1            94.1   1         867.6
  n           107.5   N          68.8
  p            97.0   P          24.8
  np          113.5   NP        -10.0
  k            96.9   K          43.4
  nk          122.9   NK          9.0
  pk          111.4   PK          7.0
  npk         124.3   NPK       -16.2

Model
  y_ijk = μ + α_i + β_j + γ_k + (αβ)_ij + (αγ)_ik + (βγ)_jk + (αβγ)_ijk,  i, j, k = 1, 2

Analysis in R
  Fit <- lm(y ~ N * P * K)
  X <- model.matrix(y ~ N * P * K)

The design matrix X (columns 1, N, P, NP, K, NK, PK, NPK) has entries ±1:

         1   N   P  NP   K  NK  PK NPK
  1      1  -1  -1   1  -1   1   1  -1
  n      1   1  -1  -1  -1  -1   1   1
  p      1  -1   1  -1  -1   1  -1   1
  np     1   1   1   1  -1  -1  -1  -1
  k      1  -1  -1   1   1  -1  -1   1
  nk     1   1  -1  -1   1   1  -1  -1
  pk     1  -1   1  -1   1  -1   1  -1
  npk    1   1   1   1   1   1   1   1

Since X′X = 8 I₈, we have θ̂ = (1/8) X′y.

Yates algorithm

  Treatment   Yield    (1)     (2)     (3)    Effect
  1            94.1   201.6   412.1   867.6   1
  n           107.5   210.5   455.5    68.8   N
  p            97.0   219.8    29.9    24.8   P
  np          113.5   235.7    38.9   -10.0   NP
  k            96.9    13.4     8.9    43.4   K
  nk          122.9    16.5    15.9     9.0   NK
  pk          111.4    26.0     3.1     7.0   PK
  npk         124.3    12.9   -13.1   -16.2   NPK

Each column is obtained from the previous one by taking sums of adjacent pairs followed by differences of adjacent pairs; column (3) is X′y, i.e. 8 × the effect estimates.
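The pass structure of the Yates table can be sketched in numpy (an illustration, not code from the talk): each pass replaces the vector by adjacent sums followed by adjacent differences, one pass per factor.

```python
import numpy as np

def yates(y):
    """Yates algorithm for a 2^k design: one pass per factor, each pass
    producing the sums of adjacent pairs followed by their differences."""
    y = np.asarray(y, dtype=float)
    for _ in range(int(np.log2(y.size))):
        pairs = y.reshape(-1, 2)
        y = np.concatenate([pairs.sum(axis=1), pairs[:, 1] - pairs[:, 0]])
    return y

yields = [94.1, 107.5, 97.0, 113.5, 96.9, 122.9, 111.4, 124.3]
effects8 = yates(yields)   # 8 x the estimates, in the order 1, N, P, NP, K, NK, PK, NPK
```

Applied to the yields above, the final pass reproduces column (3) of the table: 867.6, 68.8, 24.8, -10.0, 43.4, 9.0, 7.0, -16.2.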
Kronecker products

The design matrix has Kronecker product structure:

  X = Z ⊗ Z ⊗ Z,   where Z = [ 1  -1 ]
                             [ 1   1 ]

or, in block form,

  X = Z ⊗ X₁ = [ X₁  -X₁ ]   with X₁ = Z ⊗ Z.
               [ X₁   X₁ ]

Restatement of problem

Evaluate X′y, where
• Model structure: Kronecker product
• Data structure: ??

The data sit naturally on a 2 × 2 × 2 cube:
[Figure: the eight yields 94.1, 107.5, 97.0, 113.5, 96.9, 122.9, 111.4, 124.3 at the corners of a cube]

Evaluate X′y, where
• Data structure: Y, a 3-d array
• Model structure: X = Z ⊗ Z ⊗ Z

In two dimensions: evaluate X′y, where
• Data structure: Y, n₁ × n₂
• Model structure: X = X₂ ⊗ X₁, with Xᵢ, nᵢ × cᵢ

Then
  (X₂ ⊗ X₁)′y ≡ X₁′ Y X₂.
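The 2-d identity (X₂ ⊗ X₁)′y ≡ X₁′ Y X₂ can be verified numerically in numpy (an illustration with arbitrary random matrices, using the column-major vec convention):

```python
import numpy as np

rng = np.random.default_rng(0)
n1, c1, n2, c2 = 5, 3, 4, 2
X1 = rng.normal(size=(n1, c1))
X2 = rng.normal(size=(n2, c2))
Y = rng.normal(size=(n1, n2))           # data as an n1 x n2 array
y = Y.ravel(order="F")                  # column-major vec(Y)

lhs = np.kron(X2, X1).T @ y             # forms the big Kronecker matrix
rhs = (X1.T @ Y @ X2).ravel(order="F")  # never forms it
```

Both sides agree; the right-hand side uses only the two small matrices.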
Efficient computation

  (X₂ ⊗ X₁)′y ≡ X₁′ Y X₂ = (X₂′ (X₁′ Y)′)′
  (X₃ ⊗ X₂ ⊗ X₁)′y ≡ (X₃′ (X₂′ (X₁′ Y)′)′)′ ≡ ???

Example: X₁, X₂ both 10³ × 10².
• No need to compute X
• LHS: 10¹⁰ multiplications; RHS: 10⁸ multiplications
• Reformat RHS

The rotated H-transform

Definition: X, an n₁ × c₁ matrix; A, a c₁ × c₂ × c₃ array.
  ρ(X, A): A (c₁ × c₂c₃) → XA (n₁ × c₂c₃) → reformat to n₁ × c₂ × c₃ → rotate to c₂ × c₃ × n₁

Then
  (X₂ ⊗ X₁)′y ≡ (X₂′(X₁′Y)′)′
  (X₃ ⊗ X₂ ⊗ X₁)′y ≡ ρ(X₃′, ρ(X₂′, ρ(X₁′, Y)))

Example: 2³ factorial design. Evaluate X′y with X = Z ⊗ Z ⊗ Z, Z 2 × 2, Y 2 × 2 × 2:
  ρ(Z′, ρ(Z′, ρ(Z′, Y))) gives X′y = 8 Θ̂ as a 2 × 2 × 2 array — exactly the Yates algorithm.

[Figure: the Yates table alongside the effect totals 867.6 (1), 68.8 (N), 24.8 (P), -10.0 (NP), 43.4 (K), 9.0 (NK), 7.0 (PK), -16.2 (NPK) arranged at the corners of a 2 × 2 × 2 cube]
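The rotated H-transform ρ is a few lines of numpy (a sketch, not code from the talk): contract along the first axis, then rotate that axis to the back. Nesting three such transforms reproduces (X₃ ⊗ X₂ ⊗ X₁)′y without ever forming the Kronecker product.

```python
import numpy as np

def rho(X, A):
    """Rotated H-transform: multiply A along its first axis by X, then
    rotate the new axis to the back: (n1 x c1), (c1 x c2 x c3) -> c2 x c3 x n1."""
    return np.moveaxis(np.tensordot(X, A, axes=(1, 0)), 0, -1)

rng = np.random.default_rng(0)
dims = [(4, 2), (5, 3), (3, 2)]                 # (n_i, c_i) for i = 1, 2, 3
X1, X2, X3 = (rng.normal(size=d) for d in dims)
Y = rng.normal(size=(4, 5, 3))                  # n1 x n2 x n3 data array
y = Y.ravel(order="F")                          # column-major vec

direct = np.kron(np.kron(X3, X2), X1).T @ y     # via the full model matrix
nested = rho(X3.T, rho(X2.T, rho(X1.T, Y)))     # via three small products
```

The nested result, flattened column-major, equals the direct computation.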
Generalised linear models

Structure
• Data: a data vector y
• Model: a model matrix X
  a parameter vector θ
  a link function: μ = E(y), g(μ) = Xθ
• Error distribution: normal, Poisson, binomial, ...

Algorithm
• Scoring algorithm:
  X′W̃δX θ̂ = X′W̃δ z̃,  where z̃ = X θ̃ + W̃δ⁻¹(y − μ̃)

Computation of X′WδX

In 1-d, let X = (x₁, ..., x_c), n × c. Then

  (X′WδX)_jk = Σᵢ wᵢ xᵢⱼ xᵢₖ = (x₁ⱼx₁ₖ, ..., x_nⱼx_nₖ) w = (xⱼ ∗ xₖ)′w

Definition. The row tensor of X, n × c, is

  G(X) = (X ⊗ 1′) ∗ (1′ ⊗ X),  n × c²

where 1 is a vector of 1's of length c and ∗ is the elementwise product. Then

  X′WδX, c × c ≡ G(X)′w, c² × 1.

For X = X₃ ⊗ X₂ ⊗ X₁, with Xᵢ, nᵢ × cᵢ:
• Wδ is diagonal, n₁n₂n₃ × n₁n₂n₃; W is the corresponding array, n₁ × n₂ × n₃
• X′WδX, c₁c₂c₃ × c₁c₂c₃ ≡ ρ(G(X₃)′, ρ(G(X₂)′, ρ(G(X₁)′, W))), c₁² × c₂² × c₃²

Standard errors of Xθ̂

We need

  diag X(X′WδX)⁻¹X′ = diag X Sₘ X′,

where Sₘ is c₁c₂c₃ × c₁c₂c₃. Let S, c₁² × c₂² × c₃², be the array form of Sₘ. Then

  diag X Sₘ X′, n₁n₂n₃ × 1 ≡ ρ(G(X₃), ρ(G(X₂), ρ(G(X₁), S))), n₁ × n₂ × n₃.
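The 1-d row-tensor identity X′WδX ≡ G(X)′w can be checked directly in numpy (an illustration with arbitrary data):

```python
import numpy as np

def row_tensor(X):
    """Row tensor G(X) = (X kron 1') * (1' kron X), elementwise; n x c -> n x c^2."""
    n, c = X.shape
    ones = np.ones((1, c))
    return np.kron(X, ones) * np.kron(ones, X)

rng = np.random.default_rng(0)
n, c = 7, 3
X = rng.normal(size=(n, c))
w = rng.uniform(1, 2, size=n)                # positive scoring weights

xwx = X.T @ (w[:, None] * X)                 # X' W_delta X, c x c
via_G = (row_tensor(X).T @ w).reshape(c, c)  # G(X)'w, c^2 x 1, reshaped to c x c
```

The c² × 1 vector G(X)′w, reshaped row-major, is exactly the c × c inner product — a weighted cross-product computed as a single matrix-vector multiply.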
Computation: a summary

Linear functions:
  X′y, c₁c₂c₃ × 1 ≡ ρ(X₃′, ρ(X₂′, ρ(X₁′, Y))), c₁ × c₂ × c₃
  Xθ, n₁n₂n₃ × 1 ≡ ρ(X₃, ρ(X₂, ρ(X₁, Θ))), n₁ × n₂ × n₃

Inner products:
  X′WδX, c₁c₂c₃ × c₁c₂c₃ ≡ ρ(G(X₃)′, ρ(G(X₂)′, ρ(G(X₁)′, W))), c₁² × c₂² × c₃²

Diagonal function:
  diag X Sₘ X′, n₁n₂n₃ × 1 ≡ ρ(G(X₃), ρ(G(X₂), ρ(G(X₁), S))), n₁ × n₂ × n₃

Generalised linear array models

Structure
• Data: a data array Y, n₁ × n₂ × n₃
• Model: a model matrix X = X₃ ⊗ X₂ ⊗ X₁, with Xᵢ, nᵢ × cᵢ
  a parameter array Θ, c₁ × c₂ × c₃
  a link function: M = E(Y), g(M) = ρ(X₃, ρ(X₂, ρ(X₁, Θ)))
• Error distribution: normal, Poisson, binomial, ...

Algorithm
• Efficient scoring algorithm: array computation of
  X′W̃δX θ̂ = X′W̃δ z̃,  where z̃ = X θ̃ + W̃δ⁻¹(y − μ̃)

LM, GLM & GLAM

              LM                 GLM              GLAM
  Data        y                  y                Y = Y_d
  Model       μ = Xθ             g(μ) = Xθ        g(M) = ρ(X_d, ..., ρ(X₁, Θ) ...)
  Error       N                  N, P, B, ...     N, P, B, ...
  Algorithm   Normal equations   Scoring          Efficient scoring

Examples
• Factorial designs
• Contingency tables
• Smooth models with P-splines
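The two ingredients of one scoring step can be sketched in 2-d numpy (a hypothetical Poisson example with a log link; the index rearrangement below plays the role of the ρ(G(·)) recursion in two dimensions):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, c1, n2, c2 = 6, 3, 5, 2
X1, X2 = rng.normal(size=(n1, c1)), rng.normal(size=(n2, c2))
Theta = 0.3 * rng.normal(size=(c1, c2))

def row_tensor(X):
    n, c = X.shape
    ones = np.ones((1, c))
    return np.kron(X, ones) * np.kron(ones, X)

# Working quantities of a Poisson scoring step (log link: weights = means)
Eta = X1 @ Theta @ X2.T              # linear predictor, n1 x n2 array
Mu = np.exp(Eta)                     # mean array; W_delta has vec(Mu) on its diagonal
Yobs = Mu + 0.1                      # any data array of matching shape
Z = Eta + (Yobs - Mu) / Mu           # working variable, as an array

# GLAM versions: small matrix products only
xwz_glam = (X1.T @ (Mu * Z) @ X2).ravel(order="F")   # X' W_delta z
A = row_tensor(X1).T @ Mu @ row_tensor(X2)           # c1^2 x c2^2
A4 = A.reshape(c1, c1, c2, c2)                       # axes (j1, k1, j2, k2)
xwx_glam = A4.transpose(2, 0, 3, 1).reshape(c1 * c2, c1 * c2)  # X' W_delta X

# Direct versions with the full Kronecker model matrix
X = np.kron(X2, X1)
w = Mu.ravel(order="F")
xwz = X.T @ (w * Z.ravel(order="F"))
xwx = X.T @ (w[:, None] * X)

theta_new = np.linalg.solve(xwx_glam, xwz_glam)      # one scoring update
```

Both routes give the same scoring equations; the GLAM route never builds X, W_delta, or any object of size n₁n₂.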
[Figure: a single cubic B-spline, and a 1-d B-spline basis of overlapping basis functions, plotted against age 20-80]

P-splines in 2-d

Structure
• Data: deaths matrix D, exposure matrix E
• Model: a model matrix B = B_y ⊗ B_a, with B_a, 80 × 20, and B_y, 100 × 25
  a parameter array Θ, 20 × 25
  a link function: M = E(D), log(M) = log E + B_a Θ B_y′
• Error distribution: Poisson

Algorithm
• Efficient penalised scoring algorithm: array computation of
  (B′W̃δB + P) θ̂ = B′W̃δ z̃,  where z̃ = B θ̃ + W̃δ⁻¹(d − μ̃)
  and P = P(λ_a, λ_y) is the penalty matrix.

[Figure: fitted smooth mortality surface by age and year]
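The Kronecker-structured difference penalty P(λ_a, λ_y) can be sketched in numpy (small illustrative basis sizes; the second-order difference penalty is the usual P-spline choice, one λ per dimension):

```python
import numpy as np

ca, cy = 6, 5                              # small basis sizes, for illustration
D_a = np.diff(np.eye(ca), n=2, axis=0)     # second-order difference matrix, (ca-2) x ca
D_y = np.diff(np.eye(cy), n=2, axis=0)
lam_a, lam_y = 10.0, 1.0                   # smoothing parameters (hypothetical values)

# Penalty on vec(Theta), Kronecker-structured like the basis B = B_y kron B_a:
P = (lam_a * np.kron(np.eye(cy), D_a.T @ D_a)
     + lam_y * np.kron(D_y.T @ D_y, np.eye(ca)))
```

P is symmetric positive semi-definite, of order c_a c_y, and is simply added to B′W̃δB in the scoring equations.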
[Figure: Swedish females, ages 20 to 90 — observed log(mortality) against year, 1900-2000]

Smooth Lee-Carter models

Structure
• Data: deaths matrix D, exposure matrix E
• Model: D_ij ∼ P(μ_ij), log μ_ij = log E_ij + α_i + β_i κ_j

For given κ̂:
  M = E(D), log(M) = log E + B_a Θ K̂′
  where α = B_a a, β = B_a b, Θ = [a : b], K̂ = [1 : κ̂].

For given α̂ and β̂:
  M = E(D), log(M) = log E + α̂ 1′ + β̂ k′B_k′
  where κ = B_k k.

Algorithm: efficient penalised scoring algorithm.

[Figure: fitted α̂ and β̂ against age; fitted κ̂ against year; observed and fitted log(mortality) at age 65, 1950-2000]

Microarray data

Data are arranged on a 72 × 128 grid in 8 blocks of 2 rows and 4 columns.

[Figure: image plot of the microarray data]