
Speeding Up the ARDL Estimation Command: A Case Study in Efficient Programming in Stata and Mata

Sebastian Kripfganz (University of Exeter)
Daniel C. Schneider (Max Planck Institute for Demographic Research)

German Stata Users Group Meeting, June 23, 2017

Contents
- Efficient Coding
- Digression: A Tiny Bit of Asymptotic Notation
- The ARDL Model
- Optimal Lag Selection
- Incremental Code Improvements

Introduction
- Long code execution times are more than a nuisance: they negatively affect the quality of research.
- Strategies for speeding up execution:
  - lower-level language
  - parallelization
  - writing efficient code
- Efficient coding is often the best choice.
  - Moving to lower-level languages is tedious.
  - In many settings, speed improvements are higher than through parallelization.

Introduction: Speed of Stata and Mata
- C is the reference
  - compiled to machine instructions
- Post by Bill Gould (2014) at the Stata Forum:
  - Stata (interpreted) code is 50-200 times slower than C.
  - Mata compiled byte-code is 5-6 times slower than C. => Mata is 10-40 times faster than Stata.
  - In real-world applications, Mata is ~2 times slower than C.
    - Mata has built-in C routines based on very efficient code.

Introduction: Efficient Coding Strategies
- Using Common Sense
  - An if-condition is evaluated for every observation, so it requires at least N comparisons. Use in-conditions instead, if possible.
  - Multiplying two 100x100 matrices requires about 2*100^3 = 2,000,000 arithmetic operations.
- Using Knowledge of Your Software (Stata, of course!)
  - Examples:
    - Mata: passing of arguments to functions
    - Efficient operators and functions (e.g. Mata's colon operator and its c-conformability)
  - Read the Stata and Mata programming manuals
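The operation count quoted for the 100x100 product can be checked with a small tally. This is a sketch in Python rather than Mata, and the function name `count_matmul_ops` is hypothetical; it assumes the textbook triple-loop algorithm, where each output element is an inner product of length n.

```python
def count_matmul_ops(n):
    """Tally multiplications and additions of a naive n x n matrix product."""
    mults = 0
    adds = 0
    for _i in range(n):
        for _j in range(n):
            # One output element = inner product of a row and a column:
            # n multiplications and n - 1 additions.
            mults += n
            adds += n - 1
    return mults, adds


mults, adds = count_matmul_ops(100)
total = mults + adds
print(mults, adds, total)
```

The exact total is 2n^3 - n^2, which the slide rounds to "about 2n^3".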

Introduction: Efficient Coding Strategies
Using Knowledge of Matrix Algebra
- Translating mathematical formulas one-to-one into matrix language expressions is often (very!) inefficient.
- Examples:
  - Diagonal matrices ($D$):
    - Multiplication of a matrix by $D$: don't do it! Mata: use c-conformability of the colon operator (see [M-2] op_colon).
    - Inverse: take the reciprocals of the diagonal elements instead of calling a matrix solver / inverter function ($O(n)$ vs. $O(n^3)$).
  - Block diagonal matrices:
    - Multiplication: just multiply the diagonal blocks; this needs only $1/s^2$ of the operations of the full product, where $s$ is the number of (equally sized) diagonal blocks.
    - Inverse: invert the individual blocks.
  - Order of matrix multiplication / parenthesization:
    - $b = (X'X)^{-1}(X'y)$ is faster than $b = (X'X)^{-1}X'y$; e.g. for $k = 10$, $N = 10{,}000$, the matrix multiplications are 11 times faster!
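The "11 times" and "1/s^2" figures can be reproduced from simple operation counts. The sketch below is Python, not Mata, and all function names are hypothetical. It assumes that an (m x n) by (n x p) product costs about 2mnp operations, that the unparenthesized expression is evaluated left to right, and that the comparison covers only the products applied once $(X'X)^{-1}$ is available; the last point is a reading of the slide, not something it states.

```python
def matmul_flops(m, n, p):
    """Approximate operation count of an (m x n) times (n x p) product."""
    return 2 * m * n * p


def ols_product_flops(N, k):
    """Cost of the products applied after the k x k inverse is available."""
    # Left to right: ((X'X)^{-1} X') y builds a k x N intermediate matrix.
    left_to_right = matmul_flops(k, k, N) + matmul_flops(k, N, 1)
    # Parenthesized: (X'X)^{-1} (X'y) only ever multiplies into vectors.
    parenthesized = matmul_flops(k, N, 1) + matmul_flops(k, k, 1)
    return left_to_right, parenthesized


def block_diag_fraction(n, s):
    """Cost of multiplying s equal diagonal blocks relative to the full n x n product."""
    block = n // s  # assumes s divides n
    return s * matmul_flops(block, block, block) / matmul_flops(n, n, n)


slow, fast = ols_product_flops(N=10_000, k=10)
print(slow, fast, slow / fast)
print(block_diag_fraction(100, 4))
```

Under these assumptions the ratio comes out just under 11 for k = 10 and N = 10,000, and four diagonal blocks cost 1/16 of the full product.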

Asymptotic Notation
Definition: An algorithm with input size $n$ and running time $T(n)$ is said to be $\Theta(g(n))$ ("theta of g of n"), or to have an asymptotically tight bound $g(n)$, if there exist positive real numbers $c_1, c_2, n_0 > 0$ such that

$$c_1\, g(n) \le T(n) \le c_2\, g(n) \quad \forall\, n \ge n_0 .$$

[Figure: "$T(n) = 0.8n^3 - 1000n^2 + 1000n + 10e9$ is $O(n^3)$". Number of arithmetic operations (0 to 6e9) plotted against algorithm input size $n$ (0 to 2000); curves: $T(n)$, $0.2 \cdot n^3$, $0.801 \cdot n^3$.]
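The definition can be probed numerically for the polynomial in the figure. The following Python sketch uses the coefficients as they appear in the extracted figure title (the constant term may be garbled) together with the bounding constants 0.2 and 0.801 from the legend. The helper names are hypothetical, and a finite scan can only suggest a value for $n_0$; it is not a proof.

```python
def T(n):
    # Running time as printed in the figure title; constant term taken at face value.
    return 0.8 * n**3 - 1000 * n**2 + 1000 * n + 10e9


def g(n):
    return n**3


def bounds_hold(n, c1=0.2, c2=0.801):
    """Check c1*g(n) <= T(n) <= c2*g(n) at a single n."""
    return c1 * g(n) <= T(n) <= c2 * g(n)


def smallest_n0(n_max, c1=0.2, c2=0.801):
    """Smallest n0 such that the bounds hold for every n in [n0, n_max], else None."""
    n0 = None
    for n in range(n_max, 0, -1):  # scan downwards until the bounds first fail
        if not bounds_hold(n, c1, c2):
            break
        n0 = n
    return n0


n0 = smallest_n0(10_000)
print(n0, bounds_hold(n0), bounds_hold(n0 - 1))
```

For small n the constant term dominates and the upper bound fails; beyond $n_0$ the cubic term takes over, which is what the tight bound expresses.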

Asymptotic Notation
- $O(g(n))$ ("(big) oh of g of n"), as opposed to $\Theta(g(n))$, is used here to denote only an upper bound. Notation differs in the literature.
- Technically, $\Theta(g(n))$ and $O(g(n))$ are sets of functions, so we write e.g. $T(n) \in O(g(n))$.
- For matrix operations, $g(n)$ is frequently $n$ raised to some low integer power.
- $\Theta(n)$ is much better than $\Theta(n^2)$, which in turn is much better than $\Theta(n^3)$.
- (Square) matrix multiplication is $\Theta(n^3)$: each element of the new $n \times n$ matrix is a sum of $n$ terms. Costly!
- Many types of matrix inversion, e.g. the LU-decomposition, are also $\Theta(n^3)$. Costly!
- Inner vector products are $\Theta(n)$.
- When $T(n)$ is an $i$-th order polynomial, the leading term asymptotically dominates: $T(n) \in O(n^i)$.
- $\Theta(a^n)$ is worse than $\Theta(n^a)$; $\Theta(\lg n)$ is better than $\Theta(n)$.
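One way to get a feel for these orders is to ask how much the cost grows when the input size doubles. The Python sketch below uses idealized cost functions (pure $n$, $n^2$, $n^3$, $2^n$, $\lg n$ without constants or lower-order terms); `doubling_factor` is a hypothetical helper name.

```python
import math


def doubling_factor(cost, n):
    """Factor by which cost grows when the input size goes from n to 2n."""
    return cost(2 * n) / cost(n)


costs = {
    "lg n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n^2": lambda n: n**2,
    "n^3": lambda n: n**3,
    "2^n": lambda n: 2**n,
}

for name, cost in costs.items():
    # n = 16 keeps 2^n small enough to print comfortably
    print(name, doubling_factor(cost, 16))
```

Doubling the input doubles a linear cost, multiplies a cubic cost by 8, and squares an exponential cost, while a logarithmic cost barely moves.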

ARDL: Model Setup
- ARDL$(p, q_1, \ldots, q_k)$: autoregressive distributed lag model
- Popular, long-standing single-equation time-series model for continuous variables
- Linear model:

$$y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i\, y_{t-i} + \sum_{i=0}^{q} \beta_i'\, x_{t-i} + u_t, \qquad u_t \sim iid(0, \sigma^2)$$

- $(y_t, x_t')'$ can be purely $I(0)$, purely $I(1)$, or cointegrated: can be used to test for cointegration (bounds testing procedure; Pesaran, Shin, and Smith, 2001). => econometrics of ARDL can be complicated.
- net install ardl, from(http://www.kripfganz.de/stata)
- This talk: programming; for the statistics of ardl, see Kripfganz/Schneider (2016).

ARDL: Computational Considerations
- Despite its complex statistical properties, estimating an ARDL model is just based on OLS!
- The computationally costly parts are:
  - determination of optimal lag orders (e.g. via AIC or BIC)
    - treated at length in this talk
  - simulation of test distributions for cointegration testing (PSS 2001, Narayan 2005)
    - not covered by this talk

Optimal Lag Selection: The Problem
- For $k + 1$ variables (indepvars + depvar) and maxlags lags for each variable, run a regression and calculate an information criterion (IC) for each possible lag combination, and select the model with the best IC value.
- Example: 2 variables (v1 v2), maxlags = 2:

    regress v1 L(1/1).v1 L(0/0).v2
    regress v1 L(1/2).v1 L(0/0).v2
    regress v1 L(1/1).v1 L(0/1).v2
    regress v1 L(1/2).v1 L(0/1).v2
    regress v1 L(1/1).v1 L(0/2).v2
    regress v1 L(1/2).v1 L(0/2).v2

- The number of regressions to run is exponential in $k$: $\text{maxlags} \cdot (\text{maxlags} + 1)^k$

    k + 1   maxlags   # regressions
    3       4         100
    3       8         ≈ 650
    4       8         ≈ 5,800
    6       8         ≈ 470,000
    8       8         ≈ 38,000,000
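The counts in the table follow directly from the formula: the dependent variable has maxlags possible lag orders (1 to maxlags) and each of the k independent variables has maxlags + 1 (0 to maxlags). A short Python sketch, with the hypothetical helper name `n_regressions`, reproduces them:

```python
def n_regressions(n_vars, maxlags):
    """Number of candidate models for n_vars = k + 1 variables (depvar included)."""
    k = n_vars - 1  # number of independent variables
    return maxlags * (maxlags + 1) ** k


for n_vars, maxlags in [(2, 2), (3, 4), (3, 8), (4, 8), (6, 8), (8, 8)]:
    print(n_vars, maxlags, n_regressions(n_vars, maxlags))
```

The first pair corresponds to the two-variable example, which lists six regressions; the remaining pairs are the rows of the table before rounding.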

Lag Selection: Preliminaries
- Lag combination matrix for three variables ($k + 1 = 3$) and maxlags = 2:

$$\begin{bmatrix}
1 & 0 & 0 \\
1 & 0 & 1 \\
1 & 0 & 2 \\
1 & 1 & 0 \\
1 & 1 & 1 \\
1 & 1 & 2 \\
1 & 2 & 0 \\
1 & 2 & 1 \\
\vdots & \vdots & \vdots \\
2 & 2 & 2
\end{bmatrix}$$

- e.g. row 3, $\begin{bmatrix} 1 & 0 & 2 \end{bmatrix}$, corresponds to regressors L.v1 L(0/0).v2 L(0/2).v3 = $v1_{t-1},\ v2_t,\ v3_t,\ v3_{t-1},\ v3_{t-2}$
- called "lagcombs" in pseudo-code to follow
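A matrix of this shape can be generated as a Cartesian product of the admissible lag orders. The Python sketch below is an illustration, not the authors' Mata implementation; the function names are hypothetical, and the row order (last column varying fastest) is chosen to match the matrix on the slide.

```python
from itertools import product


def lag_combinations(n_vars, maxlags):
    """All lag-order rows: depvar lags 1..maxlags, indepvar lags 0..maxlags."""
    dep = range(1, maxlags + 1)
    indep = range(0, maxlags + 1)
    return list(product(dep, *([indep] * (n_vars - 1))))


def regress_command(row, names):
    """Stata command line for one row, using the L(a/b).var notation of the slides."""
    depvar = names[0]
    terms = [f"L(1/{row[0]}).{depvar}"]
    terms += [f"L(0/{q}).{name}" for q, name in zip(row[1:], names[1:])]
    return f"regress {depvar} " + " ".join(terms)


lagcombs = lag_combinations(n_vars=3, maxlags=2)
print(len(lagcombs))
print(lagcombs[2])
print(regress_command(lagcombs[2], ["v1", "v2", "v3"]))
```

Here `lagcombs[2]` is row 3 of the matrix; the generated command writes the first term as L(1/1).v1, which in Stata's time-series operator syntax denotes the same regressor as L.v1.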
