Forecasting skyrocketing unemployment with big data
María Rosalía Vicente (mrosalia@uniovi.es), Ana Jesús López (anaj@uniovi.es), Rigoberto Pérez (rigo@uniovi.es)
University of Oviedo (Spain)
New Techniques and Technologies for Statistics (NTTS 2015), 10-12 March 2015, Brussels
Monthly evolution of registered unemployment in Spain
[Figure: monthly registered unemployment in Spain, 2004-2013; vertical axis from 1.5 million to 5 million. Source: Spanish Ministry of Employment and Social Security (2014)]
Unemployment rates, year 2014: EU-28 = 10.2%, Spain = 24.5%. Source: Eurostat (2015)
BACKGROUND
Literature on nowcasting and forecasting unemployment with online search-related data:
• Two main references: Ettredge, Gerdes and Karuga (2005) and Choi and Varian (2009).
• Evidence has been provided for different countries: France (Fondeur and Karamé, 2013), Germany (Askitas and Zimmermann, 2009), Israel (Suhoy, 2009), Italy (D’Amuri, 2009), Norway (Anvik and Gjelstad, 2010), the UK (McLaren and Shanbhogue, 2011) and the US (D’Amuri and Marcucci, 2009).
DATA
Variable of interest: monthly registered unemployment in Spain.
• Source: Spanish Ministry of Employment and Social Security.
• Period of analysis: January 2004-December 2012.
• Forecasting horizon: January 2013-December 2013.
Explanatory variables:
• On the demand side: the Employment Confidence Indicator (ECI), which shows the balance between the positive and negative opinions of industrial firms on the current employment situation and their expectations three months ahead.
  • Source: Spanish Ministry of Industry, Energy and Tourism.
• On the supply side: the Google Trends index, which measures the volume of queries made by internet users through this search engine.
  • Note: this is a weekly index that takes the value 100 in the week with the highest number of searches for the words of interest.
  • Keywords: “oferta de trabajo” and “oferta de empleo” (= job offer).
  • Source: Google Trends service.
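The index rule in the note above (value 100 in the peak week) can be illustrated with a minimal sketch. The weekly counts are hypothetical, not real Google data, and averaging weeks within each calendar month is only an assumption: the slides do not say how the weekly index was aligned with the monthly unemployment series.

```python
from collections import defaultdict
from datetime import date


def rescale_to_peak(volumes):
    """Rescale volumes so that the largest value becomes 100."""
    peak = max(volumes)
    return [100.0 * v / peak for v in volumes]


def monthly_mean(weekly):
    """Average (week_start, value) pairs within each calendar month.

    Assumption: simple averaging by the month of the week's start date.
    """
    buckets = defaultdict(list)
    for week_start, value in weekly:
        buckets[(week_start.year, week_start.month)].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


# Hypothetical weekly search volumes (illustrative only).
weeks = [date(2012, 1, 2), date(2012, 1, 9), date(2012, 1, 16), date(2012, 1, 23),
         date(2012, 1, 30), date(2012, 2, 6), date(2012, 2, 13)]
raw = [40, 50, 45, 60, 80, 70, 65]

index = rescale_to_peak(raw)                     # peak week (80) maps to 100
monthly = monthly_mean(list(zip(weeks, index)))  # keys are (year, month)
print(monthly)
```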
METHODOLOGY
Two baseline models:
Baseline B1: ARIMA(0,1,2)(0,1,1)
$$(1-L)(1-L^{12})\,Y_t = (1-\theta_1 L-\theta_2 L^2)(1-\Theta_1 L^{12})\,u_t$$
Baseline B2: ARIMA(0,1,2)(0,1,1) with a level shift ($LS_t$) starting in March 2008 and a level shift with trend ($t\,LS_t$)
$$(1-L)(1-L^{12})\,Y_t = (1-\theta_1 L-\theta_2 L^2)(1-\Theta_1 L^{12})\,u_t + \gamma_1 LS_t + \gamma_2\, t\,LS_t$$
Three specifications with explanatory variables (the ECI and Google-related variables on job search):
Model M1:
$$(1-L)(1-L^{12})\,Y_t = (1-\theta_1 L-\theta_2 L^2)(1-\Theta_1 L^{12})\,u_t + \gamma_1 LS_t + \gamma_2\, t\,LS_t + \beta_1 X_t^{\text{ECI}}$$
Model M2:
$$(1-L)(1-L^{12})\,Y_t = (1-\theta_1 L-\theta_2 L^2)(1-\Theta_1 L^{12})\,u_t + \gamma_2\, t\,LS_t + \beta_1 X_t^{\text{ECI}} + \beta_2 X_t^{\text{Google-T}}$$
Model M3:
$$(1-L)(1-L^{12})\,Y_t = (1-\theta_1 L-\theta_2 L^2)(1-\Theta_1 L^{12})\,u_t + \gamma_2\, t\,LS_t + \beta_1 X_t^{\text{ECI}} + \beta_3 X_t^{\text{Google-E}}$$
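As a rough illustration of the left-hand side of these models, the sketch below applies the regular and seasonal differences (1-L)(1-L^12) to a monthly series and builds the two intervention regressors. The series is synthetic, the function names are hypothetical, and the trend regressor is taken literally as the time index t (counted from 1 at the start of the sample) multiplied by the level-shift dummy; the slides do not specify the origin of t.

```python
def difference(series, lag):
    """Apply (1 - L^lag): subtract the value observed `lag` periods earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, season=12):
    """Apply (1 - L)(1 - L^season) to a monthly series."""
    return difference(difference(series, season), 1)


def level_shift(n, start):
    """LS_t: 0 before position `start`, 1 from `start` onwards."""
    return [1 if t >= start else 0 for t in range(n)]


def level_shift_with_trend(n, start):
    """t * LS_t, with t counted from 1 at the start of the sample (assumption)."""
    return [(t + 1) * ls for t, ls in enumerate(level_shift(n, start))]


n = 108      # January 2004 .. December 2012
start = 50   # March 2008 as a zero-based month position (4 * 12 + 2)

# Synthetic series: linear trend plus a fixed seasonal pattern (not real data).
seasonal = [5, 3, 0, -2, -4, -6, -5, -1, 2, 4, 3, 1]
y = [1000 + 2 * t + seasonal[t % 12] for t in range(n)]

dy = seasonal_and_regular_difference(y)  # trend and seasonality are removed
ls = level_shift(n, start)
tls = level_shift_with_trend(n, start)
print(len(dy), set(dy), sum(ls), tls[start])
```

Differencing costs 13 observations (12 seasonal plus 1 regular), so 108 months leave 95 usable points for estimation.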
Estimation results for ARIMA and ARIMAX models on Spanish unemployment

| | Baseline B1 | Baseline B2 | Model M1 | Model M2 | Model M3 |
|---|---|---|---|---|---|
| θ1 | 0.7853 *** | 0.7603 *** | 0.7422 *** | 0.6858 *** | 0.6863 *** |
| θ2 | 0.4055 *** | 0.4006 *** | 0.3888 *** | 0.3763 *** | 0.3766 *** |
| Θ1 | -0.4618 *** | -0.5526 *** | -0.5339 *** | -0.6607 *** | -0.6555 *** |
| γ1 (Level shift) | | 58439.3 ** | | | |
| γ2 (Level shift with trend) | | -751.266 ** | -258.788 ** | -339.137 *** | -304.633 *** |
| β1 (Employment Confidence Indicator) | | | -1206.42 *** | -704.939 * | -785.996 * |
| β2 (Google index for “oferta de empleo”) | | | | 304.563 ** | |
| β3 (Google index for “oferta de trabajo”) | | | | | 308.017 * |
| S.D. of innovations | 33237.26 | 33043.72 | 32428.39 | 31212.74 | 31598.58 |
| Akaike Criterion | 2259.380 | 2258.660 | 2255.088 | 2249.829 | 2252.163 |
| Schwarz Criterion | 2269.595 | 2273.983 | 2270.412 | 2267.706 | 2270.040 |
| Normality test (Chi-square) | 2.57 (p=0.27) | 1.79 (p=0.40) | 1.34 (p=0.51) | 2.41 (p=0.30) | 2.49 (p=0.29) |
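One way to read the information criteria reported above is to rank the models, with lower values preferred. The short sketch below does this with the Akaike and Schwarz values from the table; the helper name `rank` is hypothetical.

```python
# Information criteria as reported in the estimation table.
aic = {"B1": 2259.380, "B2": 2258.660, "M1": 2255.088, "M2": 2249.829, "M3": 2252.163}
bic = {"B1": 2269.595, "B2": 2273.983, "M1": 2270.412, "M2": 2267.706, "M3": 2270.040}


def rank(criterion):
    """Return model names ordered from lowest (preferred) to highest value."""
    return sorted(criterion, key=criterion.get)


print("AIC ranking:", rank(aic))  # → ['M2', 'M3', 'M1', 'B2', 'B1']
print("BIC ranking:", rank(bic))  # → ['M2', 'B1', 'M3', 'M1', 'B2']
```

Model M2 comes first under both criteria, while the Schwarz criterion, which penalizes extra parameters more heavily, moves the parsimonious baseline B1 up to second place.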
Actual and forecasted unemployment in the horizon January-December 2013
[Figure: actual unemployment and the baseline, M1, M2 and M3 forecasts for 2013; vertical axis from 4.6 million to 5.05 million]

| | Baseline B1 | Baseline B2 | Model M1 | Model M2 | Model M3 |
|---|---|---|---|---|---|
| Root Mean Squared Error | 219440 | 64065 | 67653 | 61639 | 59056 |
| Mean Percentage Error | -3.3073 | 1.2408 | 0.1319 | 0.8527 | 0.6042 |
| Mean Absolute Percentage Error | 3.5837 | 1.2408 | 1.1794 | 1.1678 | 1.075 |
| Theil's U | 3.2791 | 0.9023 | 0.9707 | 0.8678 | 0.8289 |
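The accuracy measures in this table can be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: errors are taken as actual minus forecast, and Theil's U is assumed to be the U2 variant, which compares the forecast with a naive no-change forecast (values below 1 mean the model beats the naive forecast). The slides do not state which definitions were used, and the numbers in the example are made up.

```python
import math


def forecast_metrics(actual, forecast):
    """Return RMSE, MPE and MAPE, with errors defined as actual - forecast."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mpe = 100.0 / n * sum(e / a for e, a in zip(errors, actual))
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    return {"rmse": rmse, "mpe": mpe, "mape": mape}


def theil_u2(actual, forecast):
    """Theil's U2: forecast error relative to a naive no-change forecast."""
    steps = range(len(actual) - 1)
    num = sum(((forecast[t + 1] - actual[t + 1]) / actual[t]) ** 2 for t in steps)
    den = sum(((actual[t + 1] - actual[t]) / actual[t]) ** 2 for t in steps)
    return math.sqrt(num / den)


# Made-up numbers for illustration only.
actual = [100.0, 110.0, 121.0, 133.1]
naive = [100.0, 100.0, 110.0, 121.0]    # each forecast repeats the previous actual
better = [100.0, 108.0, 120.0, 132.0]   # a hypothetical model forecast

print(forecast_metrics(actual, better))
print(theil_u2(actual, naive))   # the naive forecast scores 1 by construction
print(theil_u2(actual, better))  # below 1: better than the naive forecast
```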
SUMMARY
• There is an emerging literature on the use of “Big Data” to improve the nowcasting and forecasting of macroeconomic variables.
• This paper has focused on data from individuals’ internet search behavior, specifically searches on “job offers”, in order to analyze the evolution of unemployment in Spain.
• Results confirm the potential of the proposed approach: it significantly improves the estimation and forecasting of unemployment figures in a context of important economic shocks.
More details in the paper: Vicente, M.R., López, A.J. and Pérez, R. (2015): Forecasting unemployment with internet search data: Does it help to improve predictions when job destruction is skyrocketing?, Technological Forecasting & Social Change, 92, 132-139.