Dynamic Equicorrelation Bryan Kelly (Joint work with Rob Engle)
The Problem with Covariances... Since the early ‘80s, attempts have been made to estimate multivariate GARCH models. Specifications are so complex that traditional models are difficult to estimate for more than a few assets. In finance, we want to work with large cross sections: portfolio selection, derivatives (basket options, CDOs, etc.), risk management.
DCC: Problem Solved? Engle (2002) introduces Dynamic Conditional Correlation. Massive parameter reduction: an entire matrix evolution can be described by two parameters (sort of...). Computational burden even for a few parameters: must calculate the inverse and determinant of N x N matrices many thousands of times in likelihood maximization. A pain for moderate systems. Infeasible for very large systems? Other concerns: storing correlation matrices; digesting massive output (N(N-1)/2 series).
Dynamic Equicorrelation (DECO) Where to begin? Simplify the problem: all assets share the same correlation each period, but this “equicorrelation” varies through time. What does it buy? Analytic inverse and determinant, so the likelihood is simple to compute for a system of any dimension. Entire correlation evolution summarized by a single time series.
Outline Model and theoretical properties DECO amid extant covariance models Monte Carlo evaluation Correlations among the S&P 500 Equicorrelation in action
Introducing DECO
Defining Equicorrelation An equicorrelation matrix takes the form
$$R_t = (1-\rho_t) I_n + \rho_t J_n,$$
where $I_n$ is the identity matrix and $J_n$ is the n x n matrix of ones.
Lemma 1:
$$R_t^{-1} = \frac{1}{1-\rho_t} I_n + \frac{-\rho_t}{(1-\rho_t)(1+[n-1]\rho_t)} J_n,$$
$$\det(R_t) = (1-\rho_t)^{n-1}(1+[n-1]\rho_t).$$
Invertible and positive definite if and only if $\rho_t \in \left(\frac{-1}{n-1},\, 1\right)$.
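Lemma 1 can be checked numerically. The following is a minimal sketch in Python with NumPy; the function names are hypothetical and chosen only for illustration. It compares the closed-form inverse and log-determinant with generic linear algebra routines for one choice of n and ρ.

```python
import numpy as np

def equicorr(n, rho):
    """Equicorrelation matrix R = (1 - rho) I_n + rho J_n."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def equicorr_inverse(n, rho):
    """Closed-form inverse from Lemma 1 (no factorization required)."""
    a = 1.0 / (1.0 - rho)
    b = -rho / ((1.0 - rho) * (1.0 + (n - 1) * rho))
    return a * np.eye(n) + b * np.ones((n, n))

def equicorr_logdet(n, rho):
    """Closed-form log-determinant from Lemma 1."""
    return (n - 1) * np.log(1.0 - rho) + np.log(1.0 + (n - 1) * rho)

# Illustrative values; any rho in (-1/(n-1), 1) should work.
n, rho = 5, 0.3
R = equicorr(n, rho)
inv_ok = np.allclose(equicorr_inverse(n, rho), np.linalg.inv(R))
sign, logdet = np.linalg.slogdet(R)
det_ok = np.isclose(equicorr_logdet(n, rho), logdet)
```

The closed forms cost O(n) arithmetic per period, whereas a generic inverse costs O(n^3), which is the practical point of the lemma for large cross sections.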
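The Model $r_t$: n x 1 vector, unit variance, correlations $R_t$. DECO is born from the DCC process
$$Q_t = \bar{Q}(1-\alpha-\beta) + \alpha\, \tilde{Q}_{t-1}^{1/2}\, r_{t-1} r_{t-1}'\, \tilde{Q}_{t-1}^{1/2} + \beta Q_{t-1},$$
$$R_t^{DCC} = \tilde{Q}_t^{-1/2}\, Q_t\, \tilde{Q}_t^{-1/2},$$
where $\tilde{Q}_t$ keeps only the diagonal of $Q_t$. Average pairwise DCC correlations:
$$R_t^{DECO} = (1-\rho_t) I_n + \rho_t J_{n \times n},$$
$$\rho_t = \frac{1}{n(n-1)}\left(\iota' R_t^{DCC} \iota - n\right) = \frac{2}{n(n-1)} \sum_{i>j} \frac{q_{i,j,t}}{\sqrt{q_{i,i,t}\, q_{j,j,t}}},$$
where $\iota$ is an n x 1 vector of ones.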
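The recursion above can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, α and β are taken as given rather than estimated, and two choices are assumptions not stated on the slide, namely setting Q-bar to the sample correlation matrix and starting the recursion at Q-bar.

```python
import numpy as np

def deco_rho_path(r, alpha, beta):
    """Equicorrelation path rho_t for standardized returns r of shape (T, n)."""
    T, n = r.shape
    Q_bar = np.corrcoef(r, rowvar=False)   # assumption: sample correlation as Q-bar
    Q = Q_bar.copy()                       # assumption: start the recursion at Q-bar
    rho = np.empty(T)
    for t in range(T):
        d = np.sqrt(np.diag(Q))            # diagonal of Q-tilde^{1/2}
        R_dcc = Q / np.outer(d, d)         # Q-tilde^{-1/2} Q Q-tilde^{-1/2}
        # Average off-diagonal element: (iota' R iota - n) / (n (n - 1)).
        rho[t] = (R_dcc.sum() - n) / (n * (n - 1))
        z = d * r[t]                       # Q-tilde^{1/2} r_t
        Q = Q_bar * (1 - alpha - beta) + alpha * np.outer(z, z) + beta * Q
    return rho

rng = np.random.default_rng(0)
r = rng.standard_normal((200, 4))          # placeholder data, not real returns
rho = deco_rho_path(r, alpha=0.05, beta=0.90)
```

Note that the full n x n matrix Q is still updated each period; the saving comes from never having to invert or take the determinant of it.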
The Model Assumption 1: $\bar{Q}$ is p.d., $\alpha + \beta < 1$, $\alpha > 0$, $\beta > 0$. Theorem 1: Correlation matrices generated by every realization of a DECO process are p.d. and mean reverting.
Estimation Gaussian (Quasi-) Maximum Likelihood. Assume returns are conditionally normal:
$$r_{t|t-1} \sim N(0, H_t), \qquad H_t = D_t R_t D_t, \qquad \tilde{r}_t \equiv D_t^{-1} r_t.$$
Log likelihood can be decomposed into
$$L = -\frac{1}{T}\sum_t \left( \log|D_t|^2 + r_t' D_t^{-2} r_t - \tilde{r}_t' \tilde{r}_t \right) - \frac{1}{T}\sum_t \left( \log|R_t| + \tilde{r}_t' R_t^{-1} \tilde{r}_t \right).$$
Important theorem: two-stage estimator will be consistent!
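With an equicorrelation matrix, the second (correlation) term of the log likelihood needs no matrix operations at all. The sketch below substitutes the closed-form inverse and determinant of an equicorrelation matrix into that term; the function name is hypothetical, and the ρ path is assumed to be supplied by a separate recursion.

```python
import numpy as np

def deco_corr_loglik(r_std, rho):
    """Correlation term -(1/T) sum_t [log|R_t| + r_t' R_t^{-1} r_t].

    r_std: (T, n) standardized returns; rho: length-T equicorrelation path.
    """
    T, n = r_std.shape
    rho = np.asarray(rho, dtype=float)
    logdet = (n - 1) * np.log(1 - rho) + np.log(1 + (n - 1) * rho)
    ss = np.sum(r_std ** 2, axis=1)        # r_t' r_t
    s = np.sum(r_std, axis=1)              # iota' r_t
    # r' R^{-1} r = [r'r - rho (iota'r)^2 / (1 + (n-1) rho)] / (1 - rho)
    quad = (ss - rho * s ** 2 / (1 + (n - 1) * rho)) / (1 - rho)
    return -np.mean(logdet + quad)

rng = np.random.default_rng(1)
x = rng.standard_normal((50, 4))           # placeholder data
value = deco_corr_loglik(x, np.full(50, 0.25))
```

Each period only requires the sum and the sum of squares of the standardized returns, so the cost per likelihood evaluation is linear in n.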
Estimation Proceed in two easy steps 1. Stock-by-stock GARCH models to “de-volatize” returns 2. Estimate DECO on standardized returns
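Step 1 can be sketched as a per-asset variance filter. The slide does not specify the GARCH order, so a GARCH(1,1) is assumed here purely for illustration; the parameters omega, a, and b are taken as given (in practice they would be estimated by univariate QML for each stock), and the function name is hypothetical.

```python
import numpy as np

def garch11_devolatize(x, omega, a, b):
    """Filter one return series through a GARCH(1,1) with given parameters.

    Returns the standardized series x_t / sqrt(h_t) and the variance path h_t.
    """
    x = np.asarray(x, dtype=float)
    h = np.empty(len(x))
    h[0] = x.var()                          # assumption: start at the sample variance
    for t in range(1, len(x)):
        h[t] = omega + a * x[t - 1] ** 2 + b * h[t - 1]
    return x / np.sqrt(h), h

rng = np.random.default_rng(2)
x = rng.standard_normal(500)                # placeholder series, not real returns
z, h = garch11_devolatize(x, omega=0.05, a=0.05, b=0.90)
```

Stacking the standardized series for all stocks column by column gives the (T, n) input that the second-stage correlation model works with.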
Data Is Non-Equicorrelated? Have no fear, DECO will provide consistent estimates anyway. Theorem 2: As long as DCC (a very general, non-equicorrelated covariance model) is a consistent estimator of correlations, DECO will be too. How useful: an arbitrary-dimension DCC model can be estimated via DECO, which could be infeasible with DCC alone.
Block DECO More flexible structure with the tractability and robustness of DECO. Example: industry model, where each industry has a single DECO parameter and each industry pair has a single cross-equicorrelation parameter.
$$R_t = \begin{pmatrix} (1-\rho_{1,1,t}) I_{n_1} & & 0 \\ & \ddots & \\ 0 & & (1-\rho_{K,K,t}) I_{n_K} \end{pmatrix} + \begin{pmatrix} \rho_{1,1,t} J_{n_1} & \rho_{1,2,t} J_{n_1 \times n_2} & \cdots \\ \rho_{2,1,t} J_{n_2 \times n_1} & \ddots & \\ \vdots & & \rho_{K,K,t} J_{n_K} \end{pmatrix}$$
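A block equicorrelation matrix for a single period can be assembled from the block sizes and a K x K table of block-level correlations. This is a small sketch with a hypothetical function name and made-up parameter values.

```python
import numpy as np

def block_equicorr(sizes, rho):
    """Block equicorrelation matrix for one period.

    sizes: list of block sizes n_k; rho: symmetric (K, K) array whose (k, l)
    entry is the correlation between any stock in block k and any in block l.
    """
    rho = np.asarray(rho, dtype=float)
    labels = np.repeat(np.arange(len(sizes)), sizes)   # block label of each asset
    R = rho[np.ix_(labels, labels)]                    # the rho_{k,l} J blocks
    np.fill_diagonal(R, 1.0)                           # (1 - rho_kk) + rho_kk = 1
    return R

# Illustrative values: two industries with 2 and 3 stocks.
R = block_equicorr([2, 3], [[0.5, 0.2],
                            [0.2, 0.4]])
```

Only K(K+1)/2 correlation series need to be stored per period, compared with N(N-1)/2 for an unrestricted correlation matrix.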
Block DECO Theorem 3: Two-block DECO has easy analytic inverses and determinants, thus it is as computationally feasible as DECO.
$$\det(R) = (1-\rho_{1,1})^{n_1-1}(1-\rho_{2,2})^{n_2-1}\left[ (1+[n_1-1]\rho_{1,1})(1+[n_2-1]\rho_{2,2}) - \rho_{1,2}^2\, n_1 n_2 \right]$$
$$R^{-1} = \begin{pmatrix} b_1 I_{n_1} & 0 \\ 0 & b_2 I_{n_2} \end{pmatrix} + \begin{pmatrix} c_1 J_{n_1 \times n_1} & c_3 J_{n_1 \times n_2} \\ c_3 J_{n_2 \times n_1} & c_2 J_{n_2 \times n_2} \end{pmatrix}$$
with $b_i = \frac{1}{1-\rho_{i,i}}$ and
$$c_1 = \frac{\rho_{1,1}\left[\rho_{2,2}(n_2-1)+1\right] - \rho_{1,2}^2\, n_2}{(\rho_{1,1}-1)\left( [\rho_{1,1}(n_1-1)+1][\rho_{2,2}(n_2-1)+1] - n_1 n_2 \rho_{1,2}^2 \right)}$$
$$c_2 = \frac{\rho_{2,2}\left[\rho_{1,1}(n_1-1)+1\right] - \rho_{1,2}^2\, n_1}{(\rho_{2,2}-1)\left( [\rho_{1,1}(n_1-1)+1][\rho_{2,2}(n_2-1)+1] - n_1 n_2 \rho_{1,2}^2 \right)}$$
$$c_3 = \frac{\rho_{1,2}}{n_1 n_2 \rho_{1,2}^2 - [\rho_{1,1}(n_1-1)+1][\rho_{2,2}(n_2-1)+1]}$$
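The two-block formulas can be verified against generic routines. The sketch below uses hypothetical function names and illustrative parameter values; it takes b_i = 1/(1 - ρ_ii), which is what the requirement R R^{-1} = I implies for the identity part of the inverse.

```python
import numpy as np

def two_block_matrix(n1, n2, r11, r22, r12):
    """Two-block equicorrelation matrix built directly."""
    R = np.block([[np.full((n1, n1), r11), np.full((n1, n2), r12)],
                  [np.full((n2, n1), r12), np.full((n2, n2), r22)]])
    np.fill_diagonal(R, 1.0)
    return R

def two_block_inverse_det(n1, n2, r11, r22, r12):
    """Closed-form inverse and determinant of the two-block matrix."""
    A1 = 1 + (n1 - 1) * r11
    A2 = 1 + (n2 - 1) * r22
    delta = A1 * A2 - n1 * n2 * r12 ** 2
    b1, b2 = 1 / (1 - r11), 1 / (1 - r22)
    c1 = (r11 * A2 - n2 * r12 ** 2) / ((r11 - 1) * delta)
    c2 = (r22 * A1 - n1 * r12 ** 2) / ((r22 - 1) * delta)
    c3 = -r12 / delta
    R_inv = np.block([
        [b1 * np.eye(n1) + c1 * np.ones((n1, n1)), c3 * np.ones((n1, n2))],
        [c3 * np.ones((n2, n1)), b2 * np.eye(n2) + c2 * np.ones((n2, n2))],
    ])
    det = (1 - r11) ** (n1 - 1) * (1 - r22) ** (n2 - 1) * delta
    return R_inv, det

args = (3, 4, 0.5, 0.4, 0.2)               # illustrative values only
R = two_block_matrix(*args)
R_inv, det = two_block_inverse_det(*args)
inverse_matches = np.allclose(R_inv, np.linalg.inv(R))
det_matches = np.isclose(det, np.linalg.det(R))
```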
Block DECO For more blocks - difficult analytics, but cozily falls into composite likelihood framework More information in block composite likelihood than DCC version - potentially more efficient Theorem 4 : like DECO, block DECO is a QML estimator of non-block-equicorrelated systems
Digression: Using Composite Likelihood Composite likelihood splices together likelihood of subsets of assets. In DCC, a subset is a pair of stocks, i and j. In Block DECO, a subset is all the stocks in a pair of blocks, i and j.
[Figure: R_t partitioned into pairs of stocks and pairs of blocks]
DECO Amid Current Literature
Related Models Two types of approaches to estimating time-varying covariances in large systems 1. Factor GARCH (Engle, Ng, Rothschild 1992, Engle 2008) 2. Composite likelihood (Engle, Shephard, Sheppard, 2008)
Factor (Double) ARCH Impose factor structure on system:
$$r_t = B F_t + \epsilon_t, \qquad Var(r_t) = B\, Var(F_t)\, B' + Var(\epsilon_t)$$
Univariate GARCH dynamics in factors can generate time-varying correlations while keeping the residual covariance matrix constant through time:
$$F_t \sim GARCH, \qquad Var_t(r_t) = B\, Var_t(F_t)\, B' + Var(\epsilon_t)$$
Univariate GARCH dynamics in factors and residuals can generate time-varying correlations while keeping the residual correlation matrix constant through time:
$$F_t, \epsilon_t \sim GARCH, \qquad Var_t(r_t) = B\, Var_t(F_t)\, B' + Var_t(\epsilon_t)$$
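A two-asset, one-factor example shows how factor variance alone moves correlations. The numbers and function names below are made up for illustration: both assets have unit loadings and unit residual variance, and only the factor variance changes between the two states.

```python
import numpy as np

def factor_cov(B, var_f, var_eps):
    """Covariance implied by the factor structure: B Var(F) B' + Var(eps)."""
    return B @ var_f @ B.T + var_eps

def cov_to_corr(S):
    """Rescale a covariance matrix to a correlation matrix."""
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)

B = np.array([[1.0], [1.0]])               # illustrative loadings
var_eps = np.eye(2)                        # constant residual covariance

# Low factor variance: covariance [[2, 1], [1, 2]], correlation about 0.5.
calm = cov_to_corr(factor_cov(B, np.array([[1.0]]), var_eps))[0, 1]
# High factor variance: covariance [[5, 4], [4, 5]], correlation about 0.8.
stressed = cov_to_corr(factor_cov(B, np.array([[4.0]]), var_eps))[0, 1]
```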
Factor (Double) ARCH Benefits 1. Feasibility for large numbers of assets - only estimate n+K GARCH (regression) models 2. Full likelihood, potential for efficiency Limitations 1. Don’t know factors? Don’t have data? 2. Misspecification - dynamics in residual correlations?
Composite Likelihood DCC Estimate DCC for arbitrary cross sections:
$$Q_t = \bar{Q}(1-\alpha-\beta) + \alpha\, \tilde{Q}_{t-1}^{1/2}\, r_{t-1} r_{t-1}'\, \tilde{Q}_{t-1}^{1/2} + \beta Q_{t-1}$$
$$R_t^{DCC} = \tilde{Q}_t^{-1/2}\, Q_t\, \tilde{Q}_t^{-1/2}$$
Modeling any pair will give consistent estimates of α, β. Randomly select subset of all pairs - a partial likelihood technique.
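The pairwise idea can be sketched as an average of bivariate DCC log likelihoods over a random subset of pairs. This is an illustration under stated assumptions, not the cited authors' implementation: function names are hypothetical, Q-bar is set to each pair's sample correlation, the recursion starts at Q-bar, and constants in the Gaussian density are dropped.

```python
import numpy as np
from itertools import combinations

def pair_dcc_loglik(r2, alpha, beta):
    """Average bivariate Gaussian DCC log likelihood (constants dropped).

    r2: (T, 2) standardized returns for one pair of stocks.
    """
    T = r2.shape[0]
    Q_bar = np.corrcoef(r2, rowvar=False)  # assumption: sample correlation target
    Q = Q_bar.copy()
    ll = 0.0
    for t in range(T):
        d = np.sqrt(np.diag(Q))
        rho = Q[0, 1] / (d[0] * d[1])
        x, y = r2[t]
        # log|R| = log(1 - rho^2); r'R^{-1}r = (x^2 - 2 rho x y + y^2)/(1 - rho^2)
        ll += -0.5 * (np.log(1 - rho ** 2)
                      + (x * x - 2 * rho * x * y + y * y) / (1 - rho ** 2))
        z = d * r2[t]
        Q = Q_bar * (1 - alpha - beta) + alpha * np.outer(z, z) + beta * Q
    return ll / T

def composite_loglik(r, alpha, beta, n_pairs, seed=0):
    """Average the pairwise log likelihood over a random subset of pairs."""
    rng = np.random.default_rng(seed)
    pairs = list(combinations(range(r.shape[1]), 2))
    chosen = rng.choice(len(pairs), size=min(n_pairs, len(pairs)), replace=False)
    return np.mean([pair_dcc_loglik(r[:, list(pairs[k])], alpha, beta)
                    for k in chosen])

rng = np.random.default_rng(3)
r = rng.standard_normal((100, 5))          # placeholder data
cl = composite_loglik(r, alpha=0.05, beta=0.90, n_pairs=4)
```

An optimizer would maximize this objective over (α, β); only 2 x 2 matrices are ever handled, which is why the approach scales to arbitrary cross sections.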
Composite Likelihood Benefits 1. Very flexible - no structural assumption required Limitations 1. Partial likelihood - never efficient
Fundamental Tradeoff Factor ARCH - strict structural assumptions Composite Likelihood - abandons useful information
Where Does DECO Fit? Flexibly balances this tradeoff Structural models (like factor structures) can be estimated as part of the first stage, and DECO can clean up correlation dynamics in residuals With blocks or first-stage structure, can be as well-specified as composite likelihood, yet more efficient
Monte Carlos
Performance: DECO as DGP As a first check, we ask “How does DECO do when correctly specified?” Simulate DECO processes using various 1. Time series dimensions 2. Cross section sizes 3. Parameter values (α, β)
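A DECO data-generating process can be simulated by drawing each period's returns from the current equicorrelation matrix and then updating Q. The sketch below is one possible setup rather than the design used for the reported tables: the function name, the equicorrelated choice of Q-bar (controlled by a hypothetical parameter rho_bar), and the Gaussian shocks are all assumptions.

```python
import numpy as np

def simulate_deco(T, n, alpha, beta, rho_bar, seed=0):
    """Simulate standardized returns and the equicorrelation path."""
    rng = np.random.default_rng(seed)
    ones = np.ones((n, n))
    Q_bar = (1 - rho_bar) * np.eye(n) + rho_bar * ones   # assumed form of Q-bar
    Q = Q_bar.copy()
    r = np.empty((T, n))
    rho = np.empty(T)
    for t in range(T):
        d = np.sqrt(np.diag(Q))
        R_dcc = Q / np.outer(d, d)
        rho[t] = (R_dcc.sum() - n) / (n * (n - 1))
        R_deco = (1 - rho[t]) * np.eye(n) + rho[t] * ones
        # Draw r_t ~ N(0, R_deco) through a Cholesky factor.
        r[t] = np.linalg.cholesky(R_deco) @ rng.standard_normal(n)
        z = d * r[t]
        Q = Q_bar * (1 - alpha - beta) + alpha * np.outer(z, z) + beta * Q
    return r, rho

r_sim, rho_sim = simulate_deco(T=300, n=10, alpha=0.05, beta=0.90, rho_bar=0.3)
```

Varying T, n, and (α, β) in calls like the one above mirrors the three simulation dimensions listed on the slide.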
Table 1: DECO as Generating Process
Performance: DCC as DGP “How does DECO do when incorrectly specified?” Simulate DCC processes Standard deviation of pairwise correlations large, ~0.33
Table 2: DCC as Generating Process
Correlation Among the S&P 500
S&P 500, 1995-2008 Stocks included if traded over entire sample and a member of S&P 500 at some point in that time Final count: 466 stocks
Estimation Model menu: Choose one of each... First-Stage Model 1. Constant Factor 2. CAPM 3. Fama-French Three-Factor 4. 10 Industry Factors Second-Stage (Correlation) Model 1. DECO 2. 10-Block DECO 3. DCC
Table 3: Full Sample Results
Interpretation Intuitively, DECO will outperform DCC when there is a dominating component of pairwise correlations inducing all pairwise correlations to move together In this case, smoothing reduces noise without compromising structure
Out-of-Sample Forecasts