Sign Restrictions, Structural Vector Autoregressions, and Useful Prior Information*

James D. Hamilton, UCSD

Aarhus University CREATES Lecture
November 10, 2015

*Based on joint research with Christiane Baumeister, University of Notre Dame
Can we give structural interpretation to VARs using only sign restrictions?
• Parameters are only set identified: the data cannot distinguish different models within the set
• Frequentist methods
  • Awkward and computationally demanding [Moon, Schorfheide, and Granziera (2013)]
• Bayesian methods
  • Numerically simple [Rubio-Ramírez, Waggoner, and Zha (2010)]
  • For some questions, the estimate reflects only the prior [Poirier (1998); Moon and Schorfheide (2012)]
Today's lecture
• Calculate small-sample and asymptotic Bayesian posterior distributions for a partially identified structural VAR
• Characterize the regions of the parameter space about which the data are uninformative
• Explicate the prior that is implicit in traditional sign-restricted structural VAR algorithms
• Propose that researchers use informative priors and report the difference between prior and posterior distributions
• Illustrate with a simple model of the labor market
• Code available at http://econweb.ucsd.edu/~jhamilton/BHcode.zip
Outline
1. Bayesian inference for partially identified structural VARs
2. Implicit priors in the traditional approach
3. Empirical application: shocks to labor supply and demand
1. Bayesian inference for partially identified structural vector autoregressions

Structural model of interest:
$$A y_t = k + B_1 y_{t-1} + \cdots + B_m y_{t-m} + u_t$$
$A$ is $(n \times n)$, $y_t$ is $(n \times 1)$
$$u_t \sim \text{i.i.d. } N(0, D), \qquad D \text{ diagonal}$$
Example: demand and supply
$$q_t = k^d + \alpha^d p_t + b_{11}^d p_{t-1} + b_{12}^d q_{t-1} + b_{21}^d p_{t-2} + b_{22}^d q_{t-2} + \cdots + b_{m1}^d p_{t-m} + b_{m2}^d q_{t-m} + u_t^d$$
$$q_t = k^s + \alpha^s p_t + b_{11}^s p_{t-1} + b_{12}^s q_{t-1} + b_{21}^s p_{t-2} + b_{22}^s q_{t-2} + \cdots + b_{m1}^s p_{t-m} + b_{m2}^s q_{t-m} + u_t^s$$
$$A = \begin{bmatrix} 1 & -\alpha^d \\ 1 & -\alpha^s \end{bmatrix} \qquad \text{for } y_t = (q_t, p_t)', \quad k = (k^d, k^s)'$$
Reduced form (can easily estimate):
$$y_t = c + \Phi_1 y_{t-1} + \cdots + \Phi_m y_{t-m} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \Omega)$$
$$x_{t-1}' = (1, y_{t-1}', y_{t-2}', \ldots, y_{t-m}'), \qquad \Pi' = [\,c \;\; \Phi_1 \;\; \Phi_2 \;\cdots\; \Phi_m\,]$$
$$\hat{\Pi}_T' = \left( \sum_{t=1}^T y_t x_{t-1}' \right) \left( \sum_{t=1}^T x_{t-1} x_{t-1}' \right)^{-1}$$
$$\hat{\varepsilon}_t = y_t - \hat{\Pi}_T' x_{t-1}, \qquad \hat{\Omega}_T = T^{-1} \sum_{t=1}^T \hat{\varepsilon}_t \hat{\varepsilon}_t'$$
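As a concrete illustration, here is a minimal NumPy sketch of this OLS step. The array names and the layout of `y` as a $T \times n$ matrix are my assumptions, not taken from the authors' posted code:

```python
import numpy as np

def ols_reduced_form(y, m):
    """OLS estimates of the reduced-form VAR: Pi-hat, Omega-hat, residuals."""
    T_full, n = y.shape
    # x_{t-1} = (1, y_{t-1}', ..., y_{t-m}')' stacked in rows, t = m+1,...,T_full
    X = np.hstack([np.ones((T_full - m, 1))] +
                  [y[m - j:T_full - j] for j in range(1, m + 1)])
    Y = y[m:]                                   # y_t stacked in rows
    T = Y.shape[0]
    Pi_hat = np.linalg.solve(X.T @ X, X.T @ Y)  # k x n; Pi_hat' matches the slide
    eps = Y - X @ Pi_hat                        # reduced-form residuals eps-hat_t
    Omega_hat = eps.T @ eps / T                 # T^{-1} sum of eps_t eps_t'
    return Pi_hat, Omega_hat, eps
```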
Structural model:
$$Ay_t = k + B_1 y_{t-1} + \cdots + B_m y_{t-m} + u_t, \qquad u_t \sim \text{i.i.d. } N(0, D), \quad D \text{ diagonal}$$
Reduced form:
$$y_t = c + \Phi_1 y_{t-1} + \cdots + \Phi_m y_{t-m} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, \Omega)$$
$$\varepsilon_t = A^{-1} u_t \quad (\text{so } c = A^{-1}k, \; \Phi_j = A^{-1}B_j), \qquad A \Omega A' = D \text{ (diagonal)}$$
Problem: there are more unknown elements in $D$ and $A$ than in $\Omega$.
Supply and demand example: 4 structural parameters in $A, D$: $(\alpha^s, \alpha^d, d_{11}, d_{22})$
Only 3 parameters known from $\Omega$: $(\omega_{11}, \omega_{12}, \omega_{22})$
We can achieve partial identification from the sign restrictions $\alpha^s \geq 0$, $\alpha^d \leq 0$.
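To make the counting concrete, writing out $\Omega = A^{-1} D (A^{-1})'$ for this $A$ (my algebra, not on the original slide) gives three moment conditions in four unknowns:
$$\omega_{11} = \frac{(\alpha^s)^2 d_{11} + (\alpha^d)^2 d_{22}}{(\alpha^s - \alpha^d)^2}, \qquad
\omega_{12} = \frac{\alpha^s d_{11} + \alpha^d d_{22}}{(\alpha^s - \alpha^d)^2}, \qquad
\omega_{22} = \frac{d_{11} + d_{22}}{(\alpha^s - \alpha^d)^2}$$
With only these three equations, $(\alpha^s, \alpha^d, d_{11}, d_{22})$ cannot all be recovered from $\Omega$; the sign restrictions shrink, but do not eliminate, the set of observationally equivalent values.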
Structural model:
$$Ay_t = k + B_1 y_{t-1} + \cdots + B_m y_{t-m} + u_t, \qquad u_t \sim \text{i.i.d. } N(0, D), \quad D \text{ diagonal}$$
Intuition for results that follow: if we knew row $i$ of $A$ (denoted $a_i'$), then we could estimate the coefficients of the $i$th structural equation ($b_i$) by OLS regression of $a_i' y_t$ on $x_{t-1}$:
$$\hat{b}_{iT}' = a_i' \left( \sum_{t=1}^T y_t x_{t-1}' \right) \left( \sum_{t=1}^T x_{t-1} x_{t-1}' \right)^{-1} = a_i' \hat{\Pi}_T'$$
$$\hat{d}_{ii,T} = T^{-1} \sum_{t=1}^T \hat{u}_{it}^2 = a_i' \hat{\Omega}_T a_i \quad (\hat{u}_{it} = a_i' \hat{\varepsilon}_t), \qquad \hat{D}_T = \mathrm{diag}(A \hat{\Omega}_T A')$$
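A two-line sketch of this intuition, reusing the outputs of the `ols_reduced_form` sketch above (function and variable names are illustrative):

```python
import numpy as np

def structural_given_a(a_i, Pi_hat, Omega_hat):
    """If row a_i' of A were known, the i-th structural equation's coefficients
    and innovation variance follow directly from the reduced-form estimates."""
    b_i = Pi_hat @ a_i              # b_i-hat' = a_i' Pi_hat'
    d_ii = a_i @ Omega_hat @ a_i    # d_ii-hat = a_i' Omega_hat a_i
    return b_i, d_ii
```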
Consider a Bayesian approach where we begin with an arbitrary prior $p(A)$.
E.g., prior beliefs about supply and demand elasticities in the form of a joint density $p(\alpha^s, \alpha^d)$ for
$$A = \begin{bmatrix} 1 & -\alpha^d \\ 1 & -\alpha^s \end{bmatrix}$$
$p(A)$ could also impose sign restrictions, zeros, or assign small but nonzero probabilities to violations of these constraints.
Will use natural conjugate priors for the other parameters:
$$p(D|A) = \prod_{i=1}^n p(d_{ii}|A)$$
$$d_{ii}^{-1} | A \sim \Gamma(\kappa_i, \tau_i)$$
$$E(d_{ii}^{-1}|A) = \kappa_i / \tau_i, \qquad \mathrm{Var}(d_{ii}^{-1}|A) = \kappa_i / \tau_i^2$$
Uninformative priors: $\kappa_i, \tau_i \to 0$
$$B = [\,k \;\; B_1 \;\; B_2 \;\cdots\; B_m\,], \qquad b_i' = \text{row } i \text{ of } B$$
$$p(B|D,A) = \prod_{i=1}^n p(b_i|D,A)$$
$$b_i | A, D \sim N(m_i, d_{ii} M_i)$$
Uninformative priors: $M_i^{-1} \to 0$
Recommended default priors (Minnesota prior) [Doan, Litterman, and Sims (1984); Sims and Zha (1998)]:
• elements of $m_i$ corresponding to lag 1 given by $a_i$
• all other elements of $m_i$ are zero
• $M_i$ diagonal, with smaller values on longer lags
⇒ prior belief that each element of $y_t$ behaves like a random walk
• $\tau_i$ a function of $A$ (or of the prior mode of $p(A)$) and of the scale of the data
(A minimal construction is sketched below.)
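A sketch of this Minnesota-style prior for equation $i$. The tightness values `lam0`, `lam1` and the $1/\text{lag}^2$ decay are illustrative assumptions, not the lecture's settings:

```python
import numpy as np

def minnesota_prior(a_i, n, m, lam0=0.2, lam1=1.0):
    """Prior mean m_i and diagonal prior variance M_i for b_i:
    lag-1 block of m_i equals a_i (random-walk belief for a_i' y_t),
    all other means zero; prior variances shrink on longer lags."""
    k = 1 + n * m                        # constant plus m lags of n variables
    m_i = np.zeros(k)
    m_i[1:1 + n] = a_i                   # lag-1 coefficients centered at a_i
    v = [lam1**2]                        # loose prior on the constant
    for lag in range(1, m + 1):
        v += [lam0**2 / lag**2] * n      # tighter on higher lags
    return m_i, np.diag(v)
```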
Likelihood:
$$p(Y_T|A,D,B) = (2\pi)^{-Tn/2}\, |\det(A)|^T\, |D|^{-T/2} \exp\left[ -\tfrac{1}{2} \sum_{t=1}^T (Ay_t - Bx_{t-1})' D^{-1} (Ay_t - Bx_{t-1}) \right]$$
Prior:
$$p(A,D,B) = p(A)\, p(D|A)\, p(B|A,D)$$
Posterior:
$$p(A,D,B|Y_T) = \frac{p(Y_T|A,D,B)\, p(A,D,B)}{\iiint p(Y_T|A,D,B)\, p(A,D,B)\, dA\, dD\, dB} = p(A|Y_T)\, p(D|A,Y_T)\, p(B|A,D,Y_T)$$
Exact Bayesian posterior distribution (any $T$):
$$b_i | A, D, Y_T \sim N(m_i^*, d_{ii} M_i^*)$$
$$\tilde{y}_i' = (a_i' y_1, \ldots, a_i' y_T, m_i' P_i) \quad (1 \times (T+k))$$
$$\tilde{X}_i' = [\,x_0 \;\; x_1 \;\cdots\; x_{T-1} \;\; P_i\,] \quad (k \times (T+k))$$
where $P_i P_i' = M_i^{-1}$ and $k = 1 + nm$
$$m_i^* = (\tilde{X}_i' \tilde{X}_i)^{-1} \tilde{X}_i' \tilde{y}_i, \qquad M_i^* = (\tilde{X}_i' \tilde{X}_i)^{-1}$$
If uninformative prior ($M_i^{-1} \to 0$), then $m_i^* \to \hat{b}_{iT}$, the OLS estimate above.
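A sketch of these formulas, assuming `Y` ($T \times n$) and `X` ($T \times k$) are the data matrices from the OLS sketch earlier. The stacked regression appends the rows of $P_i'$ so that the prior acts like $k$ dummy observations:

```python
import numpy as np

def posterior_b(a_i, m_i, M_i, Y, X):
    """Posterior mean m_i* and scale M_i* for b_i | A, D, Y_T via the
    stacked-regression form on the slide (a sketch)."""
    P_i = np.linalg.cholesky(np.linalg.inv(M_i))      # P_i P_i' = M_i^{-1}
    y_tilde = np.concatenate([Y @ a_i, P_i.T @ m_i])  # (T+k,)
    X_tilde = np.vstack([X, P_i.T])                   # (T+k) x k
    XtX = X_tilde.T @ X_tilde
    m_star = np.linalg.solve(XtX, X_tilde.T @ y_tilde)
    M_star = np.linalg.inv(XtX)
    return m_star, M_star, y_tilde, X_tilde
```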
Frequentist interpretation of the Bayesian posterior distribution as $T \to \infty$:
If the prior on $B$ is not dogmatic (that is, if $M_i^{-1}$ is finite), then
$$m_i^* \stackrel{p}{\to} \left[ E(x_{t-1} x_{t-1}') \right]^{-1} E(x_{t-1} y_t')\, a_i, \qquad M_i^* \stackrel{p}{\to} 0$$
so $p(b_i | A, D, Y_T)$ collapses to a point mass: once $A$ is known, the data fully reveal $b_i$.
Posterior distribution for $D | A$:
$$d_{ii}^{-1} | A, Y_T \sim \Gamma(\kappa_i^*, \tau_i^*)$$
$$\kappa_i^* = \kappa_i + T/2, \qquad \tau_i^* = \tau_i + \left( \tilde{y}_i' \tilde{y}_i - \tilde{y}_i' \tilde{X}_i (\tilde{X}_i' \tilde{X}_i)^{-1} \tilde{X}_i' \tilde{y}_i \right)/2$$
If $M_i^{-1} \to 0$ and $\kappa_i, \tau_i \to 0$:
$$\tau_i^* \to (T/2)\, a_i' \hat{\Omega}_T a_i$$
$$\hat{\Omega}_T = T^{-1} \sum_{t=1}^T \hat{\varepsilon}_t \hat{\varepsilon}_t', \qquad \hat{\varepsilon}_t = y_t - \hat{\Pi}_T' x_{t-1} \quad (\hat{\varepsilon}_t \text{ are unrestricted OLS residuals})$$
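Continuing the `posterior_b` sketch above, the Gamma posterior parameters for $d_{ii}^{-1}$ follow from the same stacked arrays (again a sketch, not the authors' code):

```python
import numpy as np

def posterior_d(kappa_i, tau_i, y_tilde, X_tilde, T):
    """Posterior parameters (kappa_i*, tau_i*) for d_ii^{-1} | A, Y_T."""
    XtX = X_tilde.T @ X_tilde
    proj = X_tilde.T @ y_tilde
    ssr = y_tilde @ y_tilde - proj @ np.linalg.solve(XtX, proj)
    return kappa_i + T / 2, tau_i + ssr / 2
```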
If the priors on $B$ and $D$ are not dogmatic (that is, if $M_i^{-1}$, $\kappa_i$, $\tau_i$ are all finite), then
$$\tau_i^*/T \stackrel{p}{\to} a_i' \Omega_0 a_i / 2$$
$$\Omega_0 = E(y_t y_t') - E(y_t x_{t-1}') \left[ E(x_{t-1} x_{t-1}') \right]^{-1} E(x_{t-1} y_t')$$
so the posterior for $d_{ii} | A, Y_T$ collapses to a point mass at $a_i' \Omega_0 a_i$.
Posterior distribution for $A$:
$$p(A|Y_T) = k_T\, p(A)\, \left| \det(A \hat{\Omega}_T A') \right|^{T/2} \prod_{i=1}^n \frac{(2\tau_i/T)^{\kappa_i}}{(2\tau_i^*/T)^{\kappa_i + T/2}}$$
$k_T$ = constant that makes this integrate to 1; $p(A)$ = prior
If $M_i^{-1} \to 0$ and $\kappa_i = \tau_i = 0$:
$$p(A|Y_T) = \frac{k_T\, p(A)\, \left| \det(A \hat{\Omega}_T A') \right|^{T/2}}{\left[ \det \mathrm{diag}(A \hat{\Omega}_T A') \right]^{T/2}}$$
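A sketch of the log posterior kernel for $A$, up to the constant $\log k_T$, suitable for plugging into a numerical sampler such as random-walk Metropolis. The argument names are mine; `tau_star[i]` is $\tau_i^*(A)$ for row $a_i'$, e.g. from the `posterior_d` sketch:

```python
import numpy as np

def log_posterior_A(A, log_prior, Omega_hat, kappa, tau, tau_star, T):
    """Log of the slide's posterior kernel for A (unnormalized sketch)."""
    sign, logdet = np.linalg.slogdet(A @ Omega_hat @ A.T)
    val = log_prior(A) + (T / 2) * logdet
    for k_i, t_i, ts_i in zip(kappa, tau, tau_star):
        if k_i > 0:                      # prior term drops out when kappa_i = 0
            val += k_i * np.log(2 * t_i / T)
        val -= (k_i + T / 2) * np.log(2 * ts_i / T)
    return val
```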
$$p(A|Y_T) = \frac{k_T\, p(A)\, \left| \det(A \hat{\Omega}_T A') \right|^{T/2}}{\left[ \det \mathrm{diag}(A \hat{\Omega}_T A') \right]^{T/2}}$$
If evaluated at an $A$ for which $A \hat{\Omega}_T A' = \mathrm{diag}(A \hat{\Omega}_T A')$:
$$p(A|Y_T) = k_T\, p(A)$$
$$p(A|Y_T) = \frac{k_T\, p(A)\, \left| \det(A \hat{\Omega}_T A') \right|^{T/2}}{\left[ \det \mathrm{diag}(A \hat{\Omega}_T A') \right]^{T/2}}$$
Hadamard's inequality: if evaluated at an $A$ for which $A \hat{\Omega}_T A' \neq \mathrm{diag}(A \hat{\Omega}_T A')$, then
$$\det \mathrm{diag}(A \hat{\Omega}_T A') > \det(A \hat{\Omega}_T A')$$
so the ratio is below one and, raised to the power $T/2$, $p(A|Y_T) \to 0$ as $T \to \infty$.
As $T \to \infty$:
$$p(A|Y_T) \to \begin{cases} k\, p(A) & \text{if } A \in S(\Omega_0) \\ 0 & \text{otherwise} \end{cases}$$
$$S(\Omega_0) = \left\{ A : A \Omega_0 A' \text{ diagonal} \right\}$$
$$\Omega_0 = E(y_t y_t') - E(y_t x_{t-1}') \left[ E(x_{t-1} x_{t-1}') \right]^{-1} E(x_{t-1} y_t')$$
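In the two-variable supply-demand example, $S(\Omega_0)$ can be traced out explicitly: $A \Omega_0 A'$ is diagonal iff its off-diagonal element $(1, -\alpha^d)\, \Omega_0\, (1, -\alpha^s)' = 0$. A sketch based on my derivation from the slide's definitions:

```python
def alpha_s_on_identified_set(alpha_d, Omega):
    """Solve (1, -alpha_d) Omega (1, -alpha_s)' = 0 for alpha_s: each demand
    elasticity pairs with exactly one supply elasticity inside S(Omega)."""
    w11, w12, w22 = Omega[0, 0], Omega[0, 1], Omega[1, 1]
    return (w11 - alpha_d * w12) / (w12 - alpha_d * w22)
```

Within this one-dimensional curve the data cannot discriminate further; only the prior $p(A)$ does.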
Special case: if the model is point-identified (so that $S(\Omega_0)$ consists of a single point), then the posterior distribution converges to a point mass at the true $A$.
2. Prior beliefs that are implicit in the traditional approach
Alternatively, one could specify priors in terms of the impact matrix $H = A^{-1}$:
$$y_t = \Pi' x_{t-1} + H u_t$$
We found the solution for an arbitrary prior $p(A)$, and for a joint prior $p(A, D)$ in which $D | A$ is natural conjugate.
Traditional approach is best understood as an implicit prior $p(H | \Omega)$:
(1) Calculate the Cholesky factor $P$ of $\Omega = PP'$.
(2) Generate an $(n \times n)$ matrix $X = [x_{ij}]$ of i.i.d. $N(0,1)$ draws.
(3) Find $X = QR$ for $Q$ orthogonal and $R$ upper triangular.
(4) Generate candidate $H = PQ$ and keep it if it satisfies the sign restrictions.
(A runnable sketch of steps (1)-(4) follows.)
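A minimal sketch of steps (1)-(4). `Omega`, `sign_check`, and `max_tries` are illustrative names; the sign normalization on $R$'s diagonal is a common convention for making the orthogonal factor Haar-distributed:

```python
import numpy as np

def draw_H(Omega, sign_check, rng, max_tries=1000):
    """One accepted draw from the traditional sign-restriction algorithm."""
    P = np.linalg.cholesky(Omega)               # (1) Omega = P P'
    n = Omega.shape[0]
    for _ in range(max_tries):
        X = rng.standard_normal((n, n))         # (2) iid N(0,1) draws
        Q, R = np.linalg.qr(X)                  # (3) X = QR
        Q = Q @ np.diag(np.sign(np.diag(R)))    # normalize R's diagonal positive
        H = P @ Q                               # (4) candidate impact matrix
        if sign_check(H):
            return H
    raise RuntimeError("no draw satisfied the sign restrictions")
```

Here `sign_check` might be, e.g., `lambda H: (H[:, 0] > 0).all()`, a hypothetical restriction that the first shock raises every variable on impact.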
First column of $Q$ = first column of $X$ normalized to have unit length:
$$q_{11} = \frac{x_{11}}{\sqrt{x_{11}^2 + \cdots + x_{n1}^2}}, \quad \ldots, \quad q_{n1} = \frac{x_{n1}}{\sqrt{x_{11}^2 + \cdots + x_{n1}^2}}$$
E.g., if $n = 2$, then $q_{11} = \cos\theta$ for $\theta$ the angle between $(x_{11}, x_{21})'$ and $(1, 0)'$, while $q_{21} = \sin\theta$.
$$Q = \begin{cases} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} & \text{with prob } 1/2 \\[2ex] \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} & \text{with prob } 1/2 \end{cases}$$
$$\theta \sim U(-\pi, \pi)$$
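A quick Monte Carlo check of the claim that $\theta$ is uniform (a sketch; the bin counts should come out roughly equal):

```python
import numpy as np

rng = np.random.default_rng(0)
thetas = []
for _ in range(100_000):
    X = rng.standard_normal((2, 2))
    Q, R = np.linalg.qr(X)
    Q = Q @ np.diag(np.sign(np.diag(R)))          # same normalization as above
    thetas.append(np.arctan2(Q[1, 0], Q[0, 0]))   # angle of Q's first column
# a flat histogram over (-pi, pi) supports theta ~ U(-pi, pi)
print(np.histogram(thetas, bins=8, range=(-np.pi, np.pi))[0])
```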