Bayesian Estimation of Input-Output Tables for Russia
Oleg Lugovoy (EDF, RANE), Andrey Polbin (RANE), Vladimir Potashnikov (RANE)
WIOD Conference, April 24, 2012, Groningen
Outline
• Motivation
• Objectives
• Bayesian methods: a short intro
• Experiments:
– Bayesian vs. RAS & Entropy: MC experiment
– Updating IOT for Russia
• Some conclusions
• Further steps
Objectives
• Methodological:
– Incorporate uncertainties into IOT estimates
– Apply the Bayesian framework to IOT updating
• Practical:
– Full density profile estimates with covariates for Russian IOT (OKONH, OKVED 2001-2010; 15, 23 and 79 activities)
Motivation
• Unsatisfied demand for Russian IOT forces users of the data to estimate, update, and disaggregate their own best estimates. Each procedure involves assumptions, and the assumptions drive the results. There are a number of ways to do this, and it is not always straightforward to prefer one assumption over another based on the available information.
• The assumptions made at the (IOT) estimation stage may be crucial for any analysis that uses the estimated IOT as input data.
• Bayesian inference suggests a natural way to accommodate uncertainties in the estimation process. By assigning probability distributions to unknown parameters, it makes it possible to trace the link from assumptions to the results of the economic analysis.
Bayesian inference
• θ – unknown parameters
• y – data
• p(θ) – prior distribution of the parameters
• p(y | θ) – likelihood function
• p(θ | y) ∝ p(y | θ) · p(θ) – posterior distribution (combination of information from the prior and the data)
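A worked example (an illustration added here, not from the original slides): in the conjugate normal model the posterior is available in closed form, making the "prior + data" combination explicit.

\[ \theta \sim N(\mu_0, \tau^2), \qquad y_1, \dots, y_n \mid \theta \sim N(\theta, \sigma^2) \]
\[ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \quad\Longrightarrow\quad \theta \mid y \sim N\!\left( \frac{\mu_0/\tau^2 + n\bar{y}/\sigma^2}{1/\tau^2 + n/\sigma^2},\; \frac{1}{1/\tau^2 + n/\sigma^2} \right) \]

The posterior mean is a precision-weighted average of the prior mean μ0 and the sample mean ȳ: a tight prior (small τ²) pulls the estimate toward μ0, while abundant data pulls it toward ȳ.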
Bayesian Approach to Statistics
• Since we are uncertain about the true value of the parameters, we consider them to be random variables.
• The rules of probability are used directly to make inferences about the parameters.
• Known information about the parameters can be naturally incorporated into the estimates using priors.
• The result of the estimation is a posterior distribution of the uncertain parameters, which combines two sources: the prior distribution and the observed data.
How does it work?
• For simple (1-parameter) models a closed-form solution can be derived
• For complicated models, where it is difficult to derive the posterior distribution, sampling methods are applied
• The most efficient sampling algorithms at present are Markov chain Monte Carlo (MCMC) methods with the Gibbs or Metropolis-Hastings algorithm (a minimal sketch follows below)
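A minimal random-walk Metropolis-Hastings sampler in Python (an illustrative sketch added here, not the authors' code); `log_post` stands for any function returning the log of an unnormalized posterior density.

```python
# Minimal random-walk Metropolis-Hastings sampler (illustrative sketch).
import numpy as np

def metropolis_hastings(log_post, x0, n_draws=10000, step=0.1, seed=0):
    """Draw from the density proportional to exp(log_post)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    draws = np.empty((n_draws, x.size))
    for t in range(n_draws):
        prop = x + step * rng.standard_normal(x.size)  # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:       # MH accept/reject
            x, lp = prop, lp_prop
        draws[t] = x                                   # keep current state
    return draws

# Example: sampling a standard normal posterior.
samples = metropolis_hastings(lambda v: -0.5 * np.sum(v**2), x0=[0.0])
```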
Bayesian perspective on IOT estimation
• Problem: Y = AX, with ∑_i a_ij = â_j and a_ij ≥ 0
• Solution: application of MCMC to sample the elements of the A-matrix. Each sampled version of A should satisfy all the constraints.
Estimating IOT: Bayesian perspective
• Transform the problem from Y = AX (the A-matrix is unknown) to Bz = Y* (the z-vector is unknown; a standard problem in linear algebra), where z is the vectorized matrix A and Y* combines the Y and â vectors
• Therefore we have to sample z in the form z = z̃ + Fξ (1), where z̃ is a particular solution, F is the fundamental matrix (a basis of the null space of B), and ξ is the stochastic component
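A sketch of this reparametrization in Python (an added illustration; B and Y* below form a toy constraint system, not the paper's data): the particular solution comes from least squares and F from the null space of B, so every z = z̃ + Fξ satisfies Bz = Y* by construction.

```python
# Parametrizing all solutions of B z = y_star as z = z_tilde + F @ xi.
import numpy as np
from scipy.linalg import null_space

def constraint_parametrization(B, y_star):
    """Particular solution plus a basis of the null space of B."""
    z_tilde, *_ = np.linalg.lstsq(B, y_star, rcond=None)  # particular solution
    F = null_space(B)                                     # columns span ker(B)
    return z_tilde, F

# Toy example: one constraint on two unknowns, z1 + z2 = 1.
B = np.array([[1.0, 1.0]])
y_star = np.array([1.0])
z_tilde, F = constraint_parametrization(B, y_star)
z = z_tilde + F @ np.array([0.3])     # any xi gives a valid solution
assert np.allclose(B @ z, y_star)
```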
Experimental estimates
Estimates
• Performance: Bayesian vs. RAS vs. maximum entropy
– 1. Artificial data: MC experiment
– 2. Historical data: IOT-2003 (OKONH, 23×23)
• Updating with limited information:
– 3. Historical USE-2006 (OKVED, 15×15)
– 4. Forecasting USE 2007-2010 (OKVED, 15×15)
1. Monte-Carlo experiment: Bayesian vs. RAS vs. Cross-Entropy
• Generate arbitrary A-matrices A1, A2, …, A6 of size 4×4 for six years
• Assume we do not know the last matrix, A6
• Estimate A6 with RAS and minimal cross-entropy, assuming we know A5, Y6 and X6 (a minimal RAS sketch follows below)
• Estimate A6 with MCMC, assuming we know A1…A5, Y6 and X6 (estimating the standard deviation of the A elements based on A1…A4 and assigning this information to the priors)
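For reference, a minimal RAS (biproportional scaling) sketch (an added illustration, not the authors' implementation); it assumes the target row totals u and column totals v of the flow matrix are known and consistent (u.sum() == v.sum()).

```python
# Minimal RAS / biproportional scaling (illustrative sketch).
import numpy as np

def ras(Z0, u, v, tol=1e-10, max_iter=1000):
    """Scale Z0 alternately so that row sums -> u and column sums -> v."""
    Z = Z0.astype(float).copy()
    for _ in range(max_iter):
        Z *= (u / Z.sum(axis=1))[:, None]   # fit row totals
        Z *= (v / Z.sum(axis=0))[None, :]   # fit column totals
        if np.allclose(Z.sum(axis=1), u, atol=tol):
            break
    return Z
```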
MC experiment (cont.)
• Three cases, 10 000 experiments each (a simulation sketch follows below):
– i.i.d.: a_ij^t ~ N(m_ij, σ_ij²), 3σ_ij ≤ m_ij
– Stationary AR(1): a_ij^t = (1 − ρ_i) m_ij + ρ_i a_ij^{t−1} + ε_ij^t, ε_ij^t ~ N(0, σ_ij²), 3σ_ij ≤ m_ij
– Random walk: a_ij^t = a_ij^{t−1} + ε_ij^t, ε_ij^t ~ N(0, σ_ij²), 3σ_ij ≤ m_ij
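A generator for the three coefficient processes (a sketch with illustrative parameter values; the actual means, variances and ρ used in the experiments are not reproduced here):

```python
# Simulating one coefficient a_ij over T years under the three processes.
import numpy as np

def simulate_coeff(m, sigma, rho=0.0, T=6, kind="ar1", seed=0):
    rng = np.random.default_rng(seed)
    a = np.empty(T)
    a[0] = m
    for t in range(1, T):
        eps = rng.normal(0.0, sigma)
        if kind == "iid":                        # i.i.d. around the mean
            a[t] = rng.normal(m, sigma)
        elif kind == "ar1":                      # mean-reverting AR(1)
            a[t] = (1 - rho) * m + rho * a[t - 1] + eps
        else:                                    # random walk
            a[t] = a[t - 1] + eps
    return a

path = simulate_coeff(m=0.2, sigma=0.05, rho=0.7, kind="ar1")
```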
Monte-Carlo experiment: results
Share of experiments where the Bayesian methodology provided results closer to the true matrix:

        Independent process    AR(1) process         Random walk
        Entropy    RAS         Entropy    RAS        Entropy    RAS
RMSE    72.2%      73.2%       67.3%      67.8%      62.0%      63.0%
MAE     76.0%      77.8%       71.2%      73.1%      66.1%      67.7%
MAPE    76.2%      78.1%       72.4%      74.1%      67.1%      69.0%

• Distance criteria:
– RMSE (root mean squared error): RMSE = sqrt( (1/16) ∑_{i=1}^{4} ∑_{j=1}^{4} (a_ij − â_ij)² )
– MAE (mean absolute error): MAE = (1/16) ∑_{i=1}^{4} ∑_{j=1}^{4} |a_ij − â_ij|
– MAPE (mean absolute relative error): MAPE = (1/16) ∑_{i=1}^{4} ∑_{j=1}^{4} |a_ij − â_ij| / a_ij
Monte‐Carlo experiment: results 0.06 0.05 RMSE in Bayesian Method 0.04 0.03 0.02 0.01 0 0 0.01 0.02 0.03 0.04 0.05 0.06 RMSE in RAS Method
2. Experiment: Updating IOT for Russia
• Symmetric IOT 23×23 for 2003, OKONH (Soviet-type) definition of activities
• Assuming the IOTs for 1998-2002 are known and the 2003 table is unknown
• Prior mean: IOT-2002
• Prior distribution: truncated normal (a construction sketch follows below)
• Prior standard deviation: estimated over 1998-2002
• Comparison of the results with RAS and cross-entropy
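A sketch of the prior construction for a single coefficient (an added illustration; the history below is made up): the mean comes from the previous table, the standard deviation from the observed 1998-2002 variation, and the distribution is truncated to [0, 1].

```python
# Truncated-normal prior for one coefficient a_ij (illustrative values).
import numpy as np
from scipy.stats import truncnorm

a_hist = np.array([0.18, 0.21, 0.19, 0.22, 0.20])  # hypothetical a_ij, 1998-2002
mu, sd = a_hist[-1], a_hist.std(ddof=1)            # prior mean: last known year
lo, hi = (0.0 - mu) / sd, (1.0 - mu) / sd          # standardized bounds for [0, 1]
prior = truncnorm(lo, hi, loc=mu, scale=sd)
draws = prior.rvs(size=1000, random_state=0)       # prior draws for the sampler
```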
Experiment: Updating IOT for Russia (cont.)
• Comparison of the results (functions computing these criteria are sketched below):

          RMSE      MAE       MAPE      RMSPE
Bayes     0.0074    0.0029    0.1844    0.4502
RAS       0.0067    0.0026    0.1728    0.4604
Entropy   0.0065    0.0026    0.1797    0.4552

where RMSPE (root mean squared percentage error) = sqrt( (1/(m·n)) ∑_{i=1}^{m} ∑_{j=1}^{n} ((a_ij − â_ij)/a_ij)² )
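The four criteria as one-line functions (a sketch; `a` is the true matrix, `a_hat` the estimate, and the relative criteria assume all a_ij are nonzero):

```python
# Distance criteria between true (a) and estimated (a_hat) matrices.
import numpy as np

def rmse(a, a_hat):  return np.sqrt(np.mean((a - a_hat) ** 2))
def mae(a, a_hat):   return np.mean(np.abs(a - a_hat))
def mape(a, a_hat):  return np.mean(np.abs(a - a_hat) / a)      # needs a != 0
def rmspe(a, a_hat): return np.sqrt(np.mean(((a - a_hat) / a) ** 2))
```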
Experiment: Updating IOT for Russia (cont.)
3. Experimental estimates of Russian IOT for 2006
Estimation of the USE matrix in OKVED (NACE) definition of activities
Standard problem
Y = A · X
• A – unknown input-output matrix
• Y – known intermediate demand vector
• X – known output vector
What inference can we make about A when no other information is available?
What inference can we make about A when X and Y are known?
• Let's sample N variants of the matrix A satisfying the constraint Y = AX, given X and Y, using MCMC with non-informative priors: a_ij ~ Uniform(0, 1) (a sketch of such a constrained sampler follows below)
• Sampling the USE-2006 table; N = 45 000
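A compact sketch of such a sampler (an added illustration, not the authors' code): the exact constraint Y = AX is enforced through the null-space parametrization z = z̃ + Fξ from above, and the flat Uniform(0, 1) prior turns Metropolis-Hastings acceptance into a simple feasibility check. The sketch assumes the minimum-norm starting point is already feasible, as it is in the toy example; in general a feasible starting solution must be found first.

```python
# Sampling A-matrices with A @ X = Y exactly and a_ij ~ Uniform(0, 1).
import numpy as np
from scipy.linalg import null_space

def sample_A(X, Y, n_draws=1000, step=0.02, seed=0):
    n = len(Y)
    B = np.kron(np.eye(n), X)                   # B @ vec(A) == A @ X
    z, *_ = np.linalg.lstsq(B, Y, rcond=None)   # particular (min-norm) solution
    F = null_space(B)                           # directions preserving A @ X
    rng = np.random.default_rng(seed)
    xi = np.zeros(F.shape[1])
    draws = []
    for _ in range(n_draws):
        prop = xi + step * rng.standard_normal(xi.size)
        a = z + F @ prop
        if np.all((a >= 0.0) & (a <= 1.0)):     # flat prior: accept iff feasible
            xi = prop
        draws.append((z + F @ xi).reshape(n, n))
    return np.array(draws)

# Toy example: every draw satisfies A @ X == Y up to numerical error.
A_draws = sample_A(X=np.array([1.0, 1.0]), Y=np.array([0.5, 0.8]))
```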
Estimated A-matrix (blue) vs. true value (red)
Analysis
• Sparse matrix: a number of values are close to zero, for activities with relatively small output.
• Posterior distributions are asymmetric, skewed toward zero. This is a result of the constraint on the column sums (< 1): if one value in a column is relatively large, the others must tend toward zero.
Distribution of pair‐wise correlation coefficients
Constraints on the coefficients
• Substantial correlation between the sampled A-matrix elements means that constraining one or several of the estimated parameters will affect the others.
• Let's impose a constraint on one of the coefficients, A(D,D), by specifying a "tight" prior for it (see the sketch below).
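In the sampler above, this amounts to multiplying the flat target by a narrow normal density on the chosen element, so the MH acceptance step compares that density at the proposed and current values (a sketch; the index k and the prior parameters are illustrative).

```python
# Log target with a "tight" N(mu, sd^2) prior on element k (sketch).
import numpy as np

def log_target_tight(a_vec, k, mu=0.15, sd=0.005):
    if np.any((a_vec < 0.0) | (a_vec > 1.0)):    # flat prior support
        return -np.inf
    return -0.5 * ((a_vec[k] - mu) / sd) ** 2    # tight prior on a_vec[k]

# In the MH loop: accept with probability
#   min(1, exp(log_target_tight(a_prop, k) - log_target_tight(a_curr, k)))
```

Because of the correlations induced by the accounting constraints, tightening this single prior shifts the posterior distributions of the other coefficients as well.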
Sampling with “tight” prior for A(D,D)
Comparison of results with true values: no constraints
Comparison of results with true values: constrained A(D,D)
4. Experiment: Forecasting USE with unknown Y
• A(t−1) is known
• Y(t) is unknown: b < Y(t) < c
Forecasting USE‐2006 with unknown Y
Forecasting USE with unknown Y
[Figure panels: prior, year 2007, year 2008, year 2009, year 2010]
Some conclusions
• The Bayesian approach is a flexible and natural tool for incorporating data uncertainties into the estimation process
• The experimental estimates demonstrate a way of applying Bayesian inference to updating and estimating IOT
• The result of the estimation – a multidimensional distribution of the estimated parameters – can be used as input for sensitivity analysis at the stage where the estimated tables are applied
Further steps:
• Extending the information set for the estimates, involving data from National Accounts and other available sources
• Joint estimation of multiple accounts (in current and constant prices)
• Disaggregation of OKVED-15 tables to OKVED-79
Thank you for your attention! olugovoy@gmail.com apolbin@gmail.com potashnikov.vu@gmail.com