Estimation equations for multivariate linear models with Kronecker structured covariance matrices

Anna Szczepańska-Álvarez^a, Chengcheng Hao^b, Yuli Liang^c, Dietrich von Rosen^{d,e}

a Department of Mathematical and Statistical Methods, Poznań University of Life Sciences, Poland
b School of Business Information, Shanghai University of International Business and Economics, China
c Statistics Sweden, Sweden
d Department of Energy and Technology, Swedish University of Agricultural Sciences, Sweden
e Department of Mathematics, Linköping University, Sweden

2.12.2016
Model

Consider independent and identically matrix normally distributed observations
\[
X_i \sim N_{p,q}(\mu, \Psi, \Sigma), \quad i = 1, \dots, n, \qquad \mathrm{vec}\,X_i \sim N_{pq}(\mathrm{vec}\,\mu, \Sigma \otimes \Psi),
\]
where
- $E[X_i] = \mu$ - the expected value,
- $D[X_i] = \Sigma \otimes \Psi$ - the dispersion matrix,
- $\Psi$ - the $p \times q$... $p \times p$ matrix describing the unknown covariance structure between the rows of $X_i$,
- $\Sigma$ - the $q \times q$ matrix describing the unknown covariance structure between the columns of $X_i$.
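A model of this form is straightforward to simulate, which is useful for checking the estimation equations on the following slides. Below is a minimal NumPy sketch; the dimensions p, q, n and the particular choices of $\mu$, $\Psi$, $\Sigma$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n = 4, 3, 50                 # illustrative dimensions

mu = np.zeros((p, q))              # expected value of each X_i
Psi = 0.5 * np.eye(p) + 0.5        # p x p row covariance (ones on the diagonal)
Sigma = np.eye(q)                  # q x q column covariance

# vec X_i ~ N_pq(vec mu, Sigma kron Psi); vec stacks columns,
# so each draw is reshaped column-wise (order='F').
D = np.kron(Sigma, Psi)
X = np.stack([
    mu + rng.multivariate_normal(np.zeros(p * q), D).reshape((p, q), order='F')
    for _ in range(n)
])                                 # shape (n, p, q)
```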
Data Y

Let
\[
Y_{(i)} = X_i - \frac{1}{n}\sum_{i=1}^{n} X_i .
\]
Moreover,
\[
Y = (Y_{(1)}, Y_{(2)}, \dots, Y_{(n)}), \qquad Y_{(i)} = \begin{pmatrix} Y_{1i} \\ Y_{2i} \end{pmatrix},
\]
with $Y: p \times nq$, $Y_{(i)}: p \times q$, $Y_{1i}: r \times q$, $Y_{2i}: (p-r) \times q$, $i = 1, 2, \dots, n$, so
\[
Y = \begin{pmatrix} Y_{11} & Y_{12} & \dots & Y_{1n} \\ Y_{21} & Y_{22} & \dots & Y_{2n} \end{pmatrix} = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix},
\]
where $Y_1: r \times nq$, $Y_2: (p-r) \times nq$.
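Continuing the sketch above, the centered matrix $Y$ and the row blocks $Y_1$, $Y_2$ can be assembled directly; r is an illustrative choice of the split point.

```python
r = 2                               # illustrative row split, 1 < r < p

Xbar = X.mean(axis=0)               # (1/n) * sum_i X_i
Yc = X - Xbar                       # the centered Y_(i), shape (n, p, q)

# Y : p x nq, the blocks Y_(i) placed side by side
Y = np.concatenate(list(Yc), axis=1)
Y1, Y2 = Y[:r, :], Y[r:, :]         # Y1 : r x nq, Y2 : (p - r) x nq
```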
Data $\tilde{Y}'$

Let
\[
\tilde{Y}' = (\tilde{Y}'_{(1)}, \tilde{Y}'_{(2)}, \dots, \tilde{Y}'_{(n)}), \qquad \tilde{Y}'_{(i)} = (\tilde{Y}'_{1i}, \tilde{Y}'_{2i}),
\]
and collect the blocks as
\[
\tilde{Y}'_1 = (\tilde{Y}'_{11}, \tilde{Y}'_{12}, \dots, \tilde{Y}'_{1n}), \qquad
\tilde{Y}'_2 = (\tilde{Y}'_{21}, \tilde{Y}'_{22}, \dots, \tilde{Y}'_{2n}),
\]
where $\tilde{Y}': q \times np$, $\tilde{Y}'_{1i}: q \times r$, $\tilde{Y}'_{2i}: q \times (p-r)$, $i = 1, 2, \dots, n$.
Case 1

The matrix $\Sigma$ is unstructured and $\Psi$ is a partitioned matrix of the form
\[
\Psi = \begin{pmatrix} A(\theta) & B \\ B' & \Omega \end{pmatrix},
\]
where $A(\theta): r \times r$, $1 < r < p$, depends on an unknown parameter $\theta$, and $B: r \times (p-r)$, $\Omega: (p-r) \times (p-r)$ are unknown matrices.
Case 1 - Theorem 1

Given that the maximum likelihood estimator of $\theta$ in $A(\theta)$ can be obtained, the maximum likelihood estimators of $\mu$, $\Sigma$ and $\Psi$ satisfy the following equations:
\[
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i,
\]
\[
\frac{d\,A(\hat{\theta})^{-1}}{d\hat{\theta}}\,\mathrm{vec}\big(qn\,A(\hat{\theta}) - Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_1\big) = 0,
\]
\[
np\,\hat{\Sigma} = \tilde{Y}'_1\big(I_n \otimes A(\hat{\theta})^{-1}\big)\tilde{Y}_1
+ \big(\tilde{Y}'_2 - \tilde{Y}'_1(I_n \otimes \hat{\delta})\big)\big(I_n \otimes \hat{\Psi}_{2\bullet 1}^{-1}\big)\big(\tilde{Y}'_2 - \tilde{Y}'_1(I_n \otimes \hat{\delta})\big)',
\]
\[
\hat{\delta} = \big(Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_1\big)^{-1} Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_2,
\]
Case 1 - Theorem 1

where
\[
\hat{\Psi} = \begin{pmatrix} A(\hat{\theta}) & \hat{B} \\ \hat{B}' & \hat{\Omega} \end{pmatrix},
\]
and
\[
\hat{B} = A(\hat{\theta})\hat{\delta}, \qquad
\hat{\Omega} = \hat{\Psi}_{2\bullet 1} + \hat{\delta}' A(\hat{\theta})\hat{\delta},
\]
\[
\hat{\Psi}_{2\bullet 1} = \frac{1}{qn}\big(Y_2 - \hat{\delta}' Y_1\big)\big(I_n \otimes \hat{\Sigma}^{-1}\big)\big(Y_2 - \hat{\delta}' Y_1\big)'.
\]
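Given a current $\hat{\Sigma}$ and the structured block $A(\hat{\theta})$, the remaining blocks of $\hat{\Psi}$ follow by direct evaluation. A sketch continuing from the data setup above, where `Sigma_hat` and `A_hat` stand for those assumed inputs:

```python
# delta_hat, Psi_hat_{2.1}, B_hat, Omega_hat from Theorem 1,
# for a given Sigma_hat (q x q) and A_hat = A(theta_hat) (r x r)
W = np.kron(np.eye(n), np.linalg.inv(Sigma_hat))        # I_n kron Sigma_hat^{-1}

delta = np.linalg.solve(Y1 @ W @ Y1.T, Y1 @ W @ Y2.T)   # r x (p - r)
R = Y2 - delta.T @ Y1                                   # (p - r) x nq residuals
Psi_21 = R @ W @ R.T / (q * n)                          # Psi_hat_{2.1}

B_hat = A_hat @ delta
Omega_hat = Psi_21 + delta.T @ A_hat @ delta
Psi_hat = np.block([[A_hat, B_hat], [B_hat.T, Omega_hat]])
```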
Case 1 - Corollary 1

Under the assumptions of Theorem 1, if
\[
qn\,A(\hat{\theta}) - Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_1 = 0,
\]
then
\[
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\hat{\Psi} = \frac{1}{nq}\, Y(I_n \otimes \hat{\Sigma}^{-1})Y', \qquad
\hat{\Sigma} = \frac{1}{np}\, \tilde{Y}'(I_n \otimes \hat{\Psi}^{-1})\tilde{Y}.
\]
flip-flop algorithm

P. Dutilleul (1999)
- Since $\Sigma \otimes \Psi = (c\Sigma) \otimes (\frac{1}{c}\Psi)$ for any $c > 0$, the parameters of $\Sigma$ and $\Psi$ are not defined uniquely.
- The direct product $\Sigma \otimes \Psi$ is uniquely defined.
- The convergence of the MLE algorithm may be assessed by trying various initial solutions. If all of the initial solutions tried result in the same direct product $\hat{\Sigma} \otimes \hat{\Psi}$ and the corresponding final solutions $\hat{\Sigma}$, $\hat{\Psi}$ satisfy the criterion of the second derivatives, then any of the final solutions $\hat{\Sigma}$, $\hat{\Psi}$ should provide maximum likelihood estimates for $\Sigma$ and $\Psi$; otherwise, they correspond, at the least, to local extrema of the likelihood function.
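The coupled equations of Corollary 1 are exactly the flip-flop updates. A minimal sketch, assuming data X of shape (n, p, q) as above; the scale indeterminacy is fixed here by normalizing $\hat{\Psi}$ so that its (1,1) element is 1, which is one convenient convention rather than part of the algorithm.

```python
import numpy as np

def flip_flop(X, tol=1e-8, max_iter=500):
    """MLE of (mu, Psi, Sigma) when vec X_i ~ N(vec mu, Sigma kron Psi).

    X has shape (n, p, q); convergence is monitored on the
    uniquely defined product Sigma kron Psi."""
    n, p, q = X.shape
    mu = X.mean(axis=0)
    Yc = X - mu                                 # centered observations
    Sigma = np.eye(q)                           # one initial solution
    prod_old = None
    for _ in range(max_iter):
        Sigma_inv = np.linalg.inv(Sigma)
        # Psi = (1/nq) Y (I_n kron Sigma^{-1}) Y'
        Psi = sum(Yi @ Sigma_inv @ Yi.T for Yi in Yc) / (n * q)
        Psi_inv = np.linalg.inv(Psi)
        # Sigma = (1/np) Y~' (I_n kron Psi^{-1}) Y~
        Sigma = sum(Yi.T @ Psi_inv @ Yi for Yi in Yc) / (n * p)
        # fix the scale: (c Sigma) kron (Psi / c) is the same model
        c = Psi[0, 0]
        Psi, Sigma = Psi / c, Sigma * c
        prod = np.kron(Sigma, Psi)
        if prod_old is not None and np.max(np.abs(prod - prod_old)) < tol:
            break
        prod_old = prod
    return mu, Psi, Sigma
```

Restarting from several initial values of Sigma and comparing the resulting Kronecker products implements the convergence check described above.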
Case 1 - Corollary 2

Under the assumptions of Theorem 1, if $A(\hat{\theta}) = 1$, then
\[
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\hat{\Psi} = \frac{1}{nq}\, Y(I_n \otimes \hat{\Sigma}^{-1})Y', \qquad
\hat{\Sigma} = \frac{1}{np}\, \tilde{Y}'(I_n \otimes \hat{\Psi}^{-1})\tilde{Y}.
\]
Case 2

The matrix $\Sigma$ is unstructured and $\Psi$ is a block partitioned matrix,
\[
\Psi = \begin{pmatrix} A(\theta) & B \\ B' & \Omega \end{pmatrix},
\]
where $A(\theta)$ has a compound symmetric structure, i.e.,
\[
A(\theta) = (1-\theta)I_r + \theta 1_r 1'_r,
\]
where $1_r$ denotes the column vector of size $r$ with all elements equal to 1.
Case 2 - Theorem 2

The maximum likelihood estimators of $\mu$, $\Sigma$ and $\Psi$ satisfy the following equations:
\[
\hat{\theta} = \frac{1}{nqr(r-1)}\,\mathrm{tr}\big(1_r 1'_r\, Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_1\big) - \frac{1}{r-1},
\]
\[
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
A(\hat{\theta}) = (1-\hat{\theta})I_r + \hat{\theta} 1_r 1'_r,
\]
\[
np\,\hat{\Sigma} = \tilde{Y}'_1\big(I_n \otimes A(\hat{\theta})^{-1}\big)\tilde{Y}_1
+ \big(\tilde{Y}'_2 - \tilde{Y}'_1(I_n \otimes \hat{\delta})\big)\big(I_n \otimes \hat{\Psi}_{2\bullet 1}^{-1}\big)\big(\tilde{Y}'_2 - \tilde{Y}'_1(I_n \otimes \hat{\delta})\big)',
\]
\[
\hat{\delta} = \big(Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_1\big)^{-1} Y_1(I_n \otimes \hat{\Sigma}^{-1})Y'_2,
\]
Case 2 - Theorem 2

where
\[
\hat{\Psi} = \begin{pmatrix} A(\hat{\theta}) & \hat{B} \\ \hat{B}' & \hat{\Omega} \end{pmatrix},
\]
and
\[
\hat{B} = A(\hat{\theta})\hat{\delta}, \qquad
\hat{\Omega} = \hat{\Psi}_{2\bullet 1} + \hat{\delta}' A(\hat{\theta})\hat{\delta},
\]
\[
\hat{\Psi}_{2\bullet 1} = \frac{1}{qn}\big(Y_2 - \hat{\delta}' Y_1\big)\big(I_n \otimes \hat{\Sigma}^{-1}\big)\big(Y_2 - \hat{\delta}' Y_1\big)'.
\]
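Relative to Theorem 1, the only new computation in Case 2 is the explicit estimator of $\theta$; since $\mathrm{tr}(1_r 1'_r M) = 1'_r M 1_r$, it reduces to a single quadratic form. A sketch, reusing Y1, r, n, q from above and an assumed current Sigma_hat:

```python
# theta_hat and A(theta_hat) from Theorem 2
W = np.kron(np.eye(n), np.linalg.inv(Sigma_hat))   # I_n kron Sigma_hat^{-1}
ones_r = np.ones(r)

quad = ones_r @ (Y1 @ W @ Y1.T) @ ones_r           # tr(1_r 1_r' Y1 W Y1')
theta_hat = quad / (n * q * r * (r - 1)) - 1 / (r - 1)
A_hat = (1 - theta_hat) * np.eye(r) + theta_hat * np.ones((r, r))
```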
Case 3

Both matrices $\Sigma$ and $\Psi$ follow a compound symmetric covariance structure, i.e.,
\[
\Psi = (1-\rho)I_p + \rho 1_p 1'_p, \qquad
\Sigma = \sigma_1 I_q + \sigma_2(1_q 1'_q - I_q),
\]
where $\rho$, $\sigma_1$ and $\sigma_2$ are unknown parameters.
Case 3 - Theorem 3

The maximum likelihood estimators of $\mu$, $\Sigma$ and $\Psi$ satisfy
\[
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
\hat{\Psi} = (1-\hat{\rho})I_p + \hat{\rho} 1_p 1'_p, \qquad
\hat{\Sigma} = \hat{\sigma}_1 I_q + \hat{\sigma}_2(1_q 1'_q - I_q),
\]
where
\[
\hat{\sigma}_1 = \frac{1}{q}\big(\hat{\lambda}_1 + (q-1)\hat{\lambda}_2\big), \qquad
\hat{\sigma}_2 = \frac{1}{q}\big(\hat{\lambda}_1 - \hat{\lambda}_2\big), \qquad
\hat{\rho} = \frac{\hat{\lambda}_3/\hat{\lambda}_4 - 1}{\hat{\lambda}_3/\hat{\lambda}_4 + p - 1},
\]
$\hat{\lambda}_1$, $\hat{\lambda}_2$ - distinct eigenvalues of $\hat{\Sigma}$,
$\hat{\lambda}_3$, $\hat{\lambda}_4$ - distinct eigenvalues of $\hat{\Psi}$,
Case 3 - Theorem 3

and
\[
\hat{\lambda}_1 = \frac{1}{np}\big(\hat{\lambda}_3^{-1} t_1 + \hat{\lambda}_4^{-1} t_2\big), \qquad
\hat{\lambda}_2 = \frac{1}{np(q-1)}\big(\hat{\lambda}_3^{-1} t_3 + \hat{\lambda}_4^{-1} t_4\big),
\]
\[
nq\hat{\lambda}_3 - nq\hat{\lambda}_3^2\hat{\lambda}_4^{-1}
- \hat{\lambda}_1^{-1} t_1 - \hat{\lambda}_2^{-1} t_3
+ \frac{\hat{\lambda}_1^{-1}\hat{\lambda}_4^{-2}\hat{\lambda}_3^2\, t_2}{p-1}
+ \frac{\hat{\lambda}_2^{-1}\hat{\lambda}_4^{-2}\hat{\lambda}_3^2\, t_4}{p-1} = 0, \qquad
p = \hat{\lambda}_3 + \hat{\lambda}_4(p-1),
\]
with
\[
t_1 = \mathrm{tr}\{P_{1_p} Y(I_n \otimes P_{1_q})Y'\}, \qquad
t_2 = \mathrm{tr}\{Q_{1_p} Y(I_n \otimes P_{1_q})Y'\},
\]
\[
t_3 = \mathrm{tr}\{P_{1_p} Y(I_n \otimes Q_{1_q})Y'\}, \qquad
t_4 = \mathrm{tr}\{Q_{1_p} Y(I_n \otimes Q_{1_q})Y'\},
\]
where $P_{1_p} = \frac{1}{p} 1_p 1'_p$ and $Q_{1_p} = I_p - P_{1_p}$ (analogously for $P_{1_q}$, $Q_{1_q}$), and $Y$ is the centered observation matrix.
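Everything in Theorem 3 except $\hat{\lambda}_3$ is explicit: the traces $t_1$-$t_4$ are simple quadratic forms, $\hat{\lambda}_4$ follows from the constraint $p = \hat{\lambda}_3 + \hat{\lambda}_4(p-1)$, and $\hat{\sigma}_1$, $\hat{\sigma}_2$, $\hat{\rho}$ are direct functions of the eigenvalues. A sketch of those explicit parts, reusing the centered Y from before and leaving the one-dimensional equation for $\hat{\lambda}_3$ to a root finder:

```python
# Projectors P_{1_p} = (1/p) 1_p 1_p', Q_{1_p} = I_p - P_{1_p}
Pp, Pq = np.ones((p, p)) / p, np.ones((q, q)) / q
Qp, Qq = np.eye(p) - Pp, np.eye(q) - Pq
PqF = np.kron(np.eye(n), Pq)        # I_n kron P_{1_q}
QqF = np.kron(np.eye(n), Qq)        # I_n kron Q_{1_q}

t1 = np.trace(Pp @ Y @ PqF @ Y.T)
t2 = np.trace(Qp @ Y @ PqF @ Y.T)
t3 = np.trace(Pp @ Y @ QqF @ Y.T)
t4 = np.trace(Qp @ Y @ QqF @ Y.T)

def estimates(lam3):
    """Explicit parts of Theorem 3 for a candidate lam3."""
    lam4 = (p - lam3) / (p - 1)                  # from p = lam3 + lam4 (p - 1)
    lam1 = (t1 / lam3 + t2 / lam4) / (n * p)
    lam2 = (t3 / lam3 + t4 / lam4) / (n * p * (q - 1))
    sigma1 = (lam1 + (q - 1) * lam2) / q
    sigma2 = (lam1 - lam2) / q
    rho = (lam3 / lam4 - 1) / (lam3 / lam4 + p - 1)
    return sigma1, sigma2, rho
```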
Case 4

The matrix $\Sigma$ is unstructured and $\Psi$ is a matrix in which all diagonal elements equal 1,
\[
\Psi = T_d^{-\frac{1}{2}}\, T\, T_d^{-\frac{1}{2}},
\]
where $T: p \times p$ is a symmetric matrix and $T_d: p \times p$ is the diagonal matrix with the same diagonal elements as $T$.
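This parametrization forces unit diagonal elements automatically: the $(i,j)$ entry of $\Psi$ is $T_{ij}/\sqrt{T_{ii}T_{jj}}$. A small sketch with an illustrative positive definite $T$:

```python
# Psi = T_d^{-1/2} T T_d^{-1/2} has ones on the diagonal by construction
T = np.array([[2.0, 0.6, 0.2],
              [0.6, 1.5, 0.4],
              [0.2, 0.4, 3.0]])                  # illustrative symmetric T
Td_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(T)))
Psi = Td_inv_sqrt @ T @ Td_inv_sqrt
assert np.allclose(np.diag(Psi), 1.0)
```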
Case 4 - Theorem 4

Maximum likelihood equations are given by the following relations:
\[
np\,\Sigma = \tilde{Y}'\Big(I_n \otimes \big(T_d^{-\frac{1}{2}} T T_d^{-\frac{1}{2}}\big)^{-1}\Big)\tilde{Y},
\]
\[
T_d^{-\frac{1}{2}} A\, T_d^{-\frac{1}{2}}
+ \big(T^{-1} T_d^{\frac{1}{2}} A\, T_d^{\frac{1}{2}} T^{-1} T_d\big)_d
+ \big(T^{-1} T_d^{\frac{1}{2}} A\big)_d\, T_d^{-\frac{1}{2}} = 0,
\]
where
\[
A = 2nq\, T_d^{\frac{1}{2}} T^{-1} T_d^{\frac{1}{2}} - nq\, I_p - Y(I_n \otimes \Sigma^{-1})Y'
\]
and $\big(T^{-1} T_d^{\frac{1}{2}} A\, T_d^{\frac{1}{2}} T^{-1} T_d\big)_d$, $\big(T^{-1} T_d^{\frac{1}{2}} A\big)_d$ denote diagonal matrices.
Literature

Dutilleul, P. (1999). The MLE algorithm for the matrix normal distribution. Journal of Statistical Computation and Simulation, 64, 105-123.

Srivastava, M. S., von Rosen, T. and von Rosen, D. (2008). Models with a Kronecker product covariance structure: estimation and testing. Mathematical Methods of Statistics, 17(4), 357-370.

Szczepańska-Álvarez, A., Hao, C., Liang, Y. and von Rosen, D. (2016). Estimation equations for multivariate linear models with Kronecker structured covariance matrices. Communications in Statistics - Theory and Methods. DOI: 10.1080/03610926.2016.1165852