

SLIDE 1

Efficient Coding Digression: A Tiny Bit of Asymptotic Notation The ARDL Model Optimal Lag Selection Incremental Code Improvements

Speeding Up the ARDL Estimation Command:

A Case Study in Efficient Programming in Stata and Mata

Sebastian Kripfganz (University of Exeter)
Daniel C. Schneider (Max Planck Institute for Demographic Research)

German Stata Users Group Meeting, June 23, 2017

Kripfganz/Schneider Uni Exeter & MPIDR Speeding Up ARDL June 23, 2017 1 / 27

SLIDE 2

Contents

• Efficient Coding
• Digression: A Tiny Bit of Asymptotic Notation
• The ARDL Model
• Optimal Lag Selection
• Incremental Code Improvements

SLIDE 3

Introduction

• Long code execution times are more than a nuisance: they negatively affect the quality of research.
• Strategies for speeding up execution:
  • moving to a lower-level language
  • parallelization
  • writing efficient code
• Efficient coding is often the best choice.
  • Moving to lower-level languages is tedious.
  • In many settings, the speed improvements are larger than those achieved through parallelization.

SLIDE 4

Introduction: Speed of Stata and Mata

• C is the reference.
  • C is compiled to machine instructions.
• Post by Bill Gould (2014) on the Stata Forum:
  • Stata (interpreted) code is 50-200 times slower than C.
  • Mata compiled byte-code is 5-6 times slower than C.
  => Mata is 10-40 times faster than Stata.
• In real-world applications, Mata is ~2 times slower than C.
• Mata has built-in C routines based on very efficient code.

SLIDE 5

Introduction: Efficient Coding Strategies

• Using Common Sense
  • An if-condition requires at least N comparisons. Use in-conditions instead, if possible.
  • Multiplying two 100x100 matrices requires about 2*100^3 = 2,000,000 arithmetic operations.
• Using Knowledge of Your Software (Stata, of course!)
  • Examples:
    • Mata: passing of arguments to functions
    • Efficient operators and functions (e.g. Mata's colon operator and its c-conformability)
  • Read the Stata and Mata programming manuals
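The operation count for the matrix product can be worked out directly. The following is the standard count for the textbook algorithm, written for a general dimension n (the symbol n is introduced here only for illustration):

```latex
% Each of the n^2 elements of C = AB is an inner product of length n:
% n multiplications and n - 1 additions.
\[
  \underbrace{n^2}_{\text{elements}} \cdot \bigl( n + (n - 1) \bigr) = 2n^3 - n^2 ,
  \qquad
  n = 100: \quad 2 \cdot 100^3 - 100^2 = 1{,}990{,}000 \approx 2{,}000{,}000 .
\]
```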

SLIDE 6

Introduction: Efficient Coding Strategies

Using Knowledge of Matrix Algebra

• Translating mathematical formulas one-to-one into matrix language expressions is often (very!) inefficient.
• Examples:
  • Diagonal matrices ($D$):
    • Multiplication of a matrix by $D$: don't do it! In Mata, use the c-conformability of the colon operator (see [M-2] op_colon).
    • Inverse: take the reciprocals of the diagonal elements instead of calling a matrix solver / inverter function ($O(n)$ vs. $O(n^3)$).
  • Block diagonal matrices:
    • Multiplication: just multiply the diagonal blocks; for equally sized blocks, this reduces the cost to a fraction $1/s^2$ of the full multiplication, where $s$ is the number of diagonal blocks.
    • Inverse: invert the individual blocks.
  • Order of matrix multiplication / parenthesization:
    • $b = (X'X)^{-1}(X'y)$ is faster than $b = (X'X)^{-1}X'y$; e.g. for $k = 10$, $N = 10{,}000$, the matrix multiplications are 11 times faster!
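The diagonal-matrix and parenthesization points can be checked in a few lines of Mata. This is a minimal sketch with made-up data; all matrix names and values are assumptions chosen for illustration.

```mata
mata:
// Illustrative data (assumed, not from the talk)
d = (2 \ 4 \ 5)                      // diagonal of D, stored as a column vector
A = (1, 2, 3 \ 4, 5, 6 \ 7, 8, 10)

// D*A without forming D: the colon operator scales row i of A by d[i]
assert(diag(d) * A == d :* A)

// A*D: a row vector on the right scales column j of A by d[j]
assert(A * diag(d) == A :* d')

// inverse of D: reciprocals of the diagonal, no solver needed
assert(mreldif(luinv(diag(d)), diag(1 :/ d)) < 1e-12)

// parenthesization: same coefficients, fewer operations
X = (1, 1 \ 1, 2 \ 1, 3 \ 1, 5)
y = (1 \ 3 \ 2 \ 5)
b_slow = invsym(X'X) * X' * y        // forms the k x N product first
b_fast = invsym(X'X) * (X'y)         // forms the k x 1 vector X'y first
assert(mreldif(b_slow, b_fast) < 1e-12)
end
```

One way to arrive at a factor of about 11 (an assumption about what the slide counts) is to compare only the multiplications that follow the common term $(X'X)^{-1}$: left-to-right evaluation needs about $k^2 N + kN = 1{,}100{,}000$ of them, the parenthesized version about $kN + k^2 = 100{,}100$, and $1{,}100{,}000 / 100{,}100 \approx 11$.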

SLIDE 7

Asymptotic Notation

Definition: An algorithm with input size $n$ and running time $T(n)$ is said to be $\Theta(g(n))$ ("theta of g of n"), or to have an asymptotically tight bound $g(n)$, if there exist positive real numbers $c_1, c_2, n_0 > 0$ such that

$$c_1 g(n) \le T(n) \le c_2 g(n) \quad \forall n \ge n_0$$

[Figure: "T(n) = 0.8n^3 - 1000n^2 + 1000n + 10e9 is O(n^3)". T(n) is plotted together with the curves 0.2 * n^3 and 0.801 * n^3; x-axis: algorithm input size n (500 to 2000); y-axis: # of arithmetic operations (2e9 to 6e9).]

SLIDE 8

Asymptotic Notation

• $O(g(n))$ ("(big) oh of g of n"), as opposed to $\Theta(g(n))$, is used here to denote only an upper bound. Notation differs in the literature.
• Technically, $\Theta(g(n))$ and $O(g(n))$ are sets of functions, so we write e.g. $T(n) \in O(g(n))$.
• For matrix operations, $g(n)$ is frequently $n$ raised to some low integer power.
• $\Theta(n)$ is much better than $\Theta(n^2)$, which in turn is much better than $\Theta(n^3)$.
• (Square) matrix multiplication is $\Theta(n^3)$: each element of the new $n \times n$ matrix is a sum of $n$ terms. Costly!
• Many types of matrix inversion, e.g. the LU-decomposition, are also $\Theta(n^3)$. Costly!
• Inner vector products are $\Theta(n)$.
• When $T(n)$ is an $i$-th order polynomial, the leading term asymptotically dominates: $T(n) \in O(n^i)$.
• $\Theta(a^n)$ is worse than $\Theta(n^a)$; $\Theta(\lg n)$ is better than $\Theta(n)$.
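To see how the leading term dominates, the definition of a tight bound can be applied to a small polynomial. The polynomial below is an illustrative choice, not one taken from the talk:

```latex
% Illustrative polynomial (assumed): T(n) = 2n^3 + 10n^2.
% For n >= 10 we have 10n^2 <= n^3, hence
\[
  2n^3 \;\le\; T(n) = 2n^3 + 10n^2 \;\le\; 3n^3 \qquad \forall n \ge 10 ,
\]
% so T(n) \in \Theta(n^3) with c_1 = 2, c_2 = 3, n_0 = 10,
% and in particular T(n) \in O(n^3).
```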

SLIDE 9

ARDL: Model Setup

• ARDL$(p, q_1, \ldots, q_k)$: autoregressive distributed lag model
• Popular, long-standing single-equation time-series model for continuous variables
• Linear model:

$$y_t = c_0 + c_1 t + \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{i=0}^{q} \boldsymbol{\beta}_i' \mathbf{x}_{t-i} + u_t, \qquad u_t \sim iid(0, \sigma^2)$$

• $(y_t, \mathbf{x}_t')'$ can be purely $I(0)$, purely $I(1)$, or cointegrated: the model can be used to test for cointegration (bounds testing procedure; Pesaran, Shin, and Smith, 2001).
  => The econometrics of ARDL can be complicated.
• net install ardl, from(http://www.kripfganz.de/stata)
• This talk: programming; for the statistics of ardl, see Kripfganz/Schneider (2016).
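As a concrete instance of the linear model, an ARDL(2, 1) specification with a single regressor $x_t$ (an illustrative choice of lag orders) spells out as:

```latex
% ARDL(2, 1) with one regressor: p = 2 lags of y, lags 0 and 1 of x
\[
  y_t = c_0 + c_1 t + \phi_1 y_{t-1} + \phi_2 y_{t-2}
        + \beta_0 x_t + \beta_1 x_{t-1} + u_t .
\]
```

Apart from the deterministic terms, its regressors are the two lags of the dependent variable and the current and first-lagged value of $x_t$.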

SLIDE 10

ARDL: Computational Considerations

• Despite its complex statistical properties, estimating an ARDL model is just based on OLS!
• The computationally costly parts are:
  • determination of optimal lag orders (e.g. via AIC or BIC)
    • treated at length in this talk
  • simulation of test distributions for cointegration testing (PSS 2001, Narayan 2005)
    • not covered by this talk

SLIDE 11

Optimal Lag Selection: The Problem

• For $k + 1$ variables (indepvars + depvar) and maxlags lags for each variable, run a regression and calculate an information criterion (IC) for each possible lag combination, and select the model with the best IC value.
• Example: 2 variables (v1 v2), maxlags $= 2$

    regress v1 L(1/1).v1 L(0/0).v2
    regress v1 L(1/2).v1 L(0/0).v2
    regress v1 L(1/1).v1 L(0/1).v2
    regress v1 L(1/2).v1 L(0/1).v2
    regress v1 L(1/1).v1 L(0/2).v2
    regress v1 L(1/2).v1 L(0/2).v2

• The # of regressions to run is exponential in $k$: $\text{maxlags} \cdot (\text{maxlags} + 1)^k$

    k + 1   maxlags   # regressions
    3       4                   100
    3       8                   650
    4       8                 5,800
    6       8               470,000
    8       8            38,000,000
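The counts in the table follow directly from the formula. A short Mata check, assuming a hypothetical helper named nregs() that takes the number of regressors k (not k + 1) and maxlags:

```mata
mata:
// number of candidate regressions for k regressors (k + 1 variables in total);
// nregs() is a hypothetical helper introduced for this illustration
real scalar nregs(real scalar k, real scalar maxlags)
{
    // depvar lags run over 1..maxlags, each regressor over 0..maxlags
    return(maxlags * (maxlags + 1)^k)
}

nregs(1, 2)     // 6, the two-variable example
nregs(2, 4)     // 100
nregs(7, 8)     // 38263752, shown rounded as 38,000,000 in the table
end
```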

SLIDE 12

Lag Selection: Preliminaries

• Lag combination matrix for $k + 1 = 3$ variables and maxlags $= 2$:

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 0 \\ 1 & 2 & 1 \\ \vdots & \vdots & \vdots \\ 2 & 2 & 2 \end{bmatrix}$$

• e.g. row 3: $(1 \; 0 \; 2)$ corresponds to regressors L.v1 L(0/0).v2 L(0/2).v3 $= (v_{1,t-1} \;\; v_{2,t} \;\; v_{3,t} \;\; v_{3,t-1} \;\; v_{3,t-2})$
• called "lagcombs" in pseudo-code to follow
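A lag combination matrix with this row order can be generated by counting in a mixed-radix system. The function below is a sketch under assumptions: it reuses the name lagcombs from the slide, but its signature (number of variables, maxlags) and its implementation are illustrative and not taken from the ardl source.

```mata
mata:
// Column 1 (depvar) runs over 1..maxlags, the remaining nvars - 1 columns
// run over 0..maxlags; the last column varies fastest.
real matrix lagcombs(real scalar nvars, real scalar maxlags)
{
    real matrix L
    real scalar nrows, r, j, rem, base

    base  = maxlags + 1
    nrows = maxlags * base^(nvars - 1)
    L     = J(nrows, nvars, 0)
    for (r = 1; r <= nrows; r++) {
        rem = r - 1
        // decode rem digit by digit, starting with the last column
        for (j = nvars; j >= 2; j--) {
            L[r, j] = mod(rem, base)
            rem     = floor(rem / base)
        }
        L[r, 1] = rem + 1
    }
    return(L)
}

L = lagcombs(3, 2)
rows(L)         // 18
L[3, .]         // 1 0 2
L[4, .]         // 1 1 0
end
```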

SLIDE 13

Lag Selection: Naive Approach Using regress

Stata/Mata-like pseudocode:
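A minimal sketch of such a loop, written as runnable Mata that calls regress once per lag combination. It is an illustration and not the authors' code: the simulated data, the variable names v1 and v2, the two-variable setup with maxlags = 2, and the AIC computed from e(ll) and e(rank) are all assumptions.

```mata
mata:
// Simulated data (assumed): two white-noise series with a time index
stata("clear")
stata("set seed 12345")
stata("set obs 200")
stata("generate t = _n")
stata("tsset t")
stata("generate v1 = rnormal()")
stata("generate v2 = rnormal()")

maxlags = 2
best_ic = .
for (p = 1; p <= maxlags; p++) {
    for (q = 0; q <= maxlags; q++) {
        // same estimation sample for every model, so the ICs are comparable
        cmd = sprintf("quietly regress v1 L(1/%g).v1 L(0/%g).v2 if t > %g",
                      p, q, maxlags)
        stata(cmd)
        // AIC = -2 * lnL + 2 * (number of estimated coefficients)
        ic = -2 * st_numscalar("e(ll)") + 2 * st_numscalar("e(rank)")
        if (ic < best_ic) {
            best_ic   = ic
            best_lags = (p, q)
        }
    }
}
best_lags       // lag orders of the model with the smallest AIC
end
```

Every pass through the loop pays the full overhead of an interpreted Stata command, which is what the later versions try to avoid.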

SLIDE 14

Lag Selection: Timings

Timings in seconds (2.5GHz, single core) for N=1000:

    k + 1   maxlags   # regressions   regress
    3       4                   100   1.6
    3       8                   650   12.5
    4       8                 5,800   132
    6       8               470,000   14,000
    8       8            38,000,000   (13 days?)

SLIDE 15

Lag Selection: Mata I

Stata/Mata-like pseudocode:
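One plausible shape of a first Mata version is to hold the data in Mata and run each OLS regression with matrix functions instead of calling regress. The sketch below is an assumption about that step, not the authors' code; the simulated series, the names, and the Gaussian-likelihood AIC are illustrative.

```mata
mata:
// Simulated series (assumed)
rseed(12345)
N       = 200
maxlags = 2
v1 = rnormal(N, 1, 0, 1)
v2 = rnormal(N, 1, 0, 1)
y  = v1[|maxlags + 1 \ N|]                 // common estimation sample
n  = rows(y)

best_ic = .
for (p = 1; p <= maxlags; p++) {
    for (q = 0; q <= maxlags; q++) {
        // build the regressor matrix for this lag combination from scratch
        X = J(n, 1, 1)                      // constant
        for (i = 1; i <= p; i++) X = (X, v1[|maxlags + 1 - i \ N - i|])
        for (i = 0; i <= q; i++) X = (X, v2[|maxlags + 1 - i \ N - i|])
        b  = invsym(cross(X, X)) * cross(X, y)
        e  = y - X * b
        // Gaussian log likelihood evaluated at the OLS estimates
        ll = -0.5 * n * (ln(2 * pi()) + ln(cross(e, e) / n) + 1)
        ic = -2 * ll + 2 * cols(X)
        if (ic < best_ic) {
            best_ic   = ic
            best_lags = (p, q)
        }
    }
}
best_lags
end
```

Note that X, its cross products, and the residuals are recomputed for every combination, although most columns are shared between models.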

SLIDE 16

Lag Selection: Mata II (no redundant calculations)
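One way to remove redundant calculations, stated here as an assumption about what this step involves, is to compute the cross products of all candidate regressors once and to pick the relevant rows and columns inside the loop. Data, names, and the AIC formula in this sketch are illustrative.

```mata
mata:
// Simulated series (assumed)
rseed(12345)
N       = 200
maxlags = 2
v1 = rnormal(N, 1, 0, 1)
v2 = rnormal(N, 1, 0, 1)
y  = v1[|maxlags + 1 \ N|]
n  = rows(y)

// all candidate regressors, built once: const, L1.v1, L2.v1, v2, L1.v2, L2.v2
Z = J(n, 1, 1)
for (i = 1; i <= maxlags; i++) Z = (Z, v1[|maxlags + 1 - i \ N - i|])
for (i = 0; i <= maxlags; i++) Z = (Z, v2[|maxlags + 1 - i \ N - i|])

// cross products computed once, outside the loop
ZZ = cross(Z, Z)
Zy = cross(Z, y)
yy = cross(y, y)

best_ic = .
for (p = 1; p <= maxlags; p++) {
    for (q = 0; q <= maxlags; q++) {
        // columns of Z that belong to this lag combination
        idx = (1, 1 :+ (1..p), (1 + maxlags) :+ (1..(q + 1)))
        b   = invsym(ZZ[idx, idx]) * Zy[idx, 1]
        rss = yy - b' * Zy[idx, 1]          // e'e = y'y - b'X'y, no residual vector
        ll  = -0.5 * n * (ln(2 * pi()) + ln(rss / n) + 1)
        ic  = -2 * ll + 2 * cols(idx)
        if (ic < best_ic) {
            best_ic   = ic
            best_lags = (p, q)
        }
    }
}
best_lags
end
```

Inside the loop, nothing depends on the number of observations any more; the remaining cost per model is dominated by the matrix inversion.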

SLIDE 17

Lag Selection: Mata III

• A sticking point is the large number of matrix inversions, each of which is $\Theta(n^3)$.
• We will further improve matters by using results from linear algebra.
• We will introduce and use pointer variables in the process.
• The following will put forth a somewhat complicated algorithm that affects many parts of the loop.
• In this talk, we could instead have focused our attention on many smaller changes for code optimization, but covering both is not possible within the time window for this presentation.

SLIDE 18

Updating $(X'X)^{-1}$ Using Partitioned Matrices

• For $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, with $A$, $A_{11}$ and $A_{22}$ square and invertible:

$$A^{-1} = \begin{pmatrix} D & -D A_{12} A_{22}^{-1} \\ -A_{22}^{-1} A_{21} D & A_{22}^{-1} + A_{22}^{-1} A_{21} D A_{12} A_{22}^{-1} \end{pmatrix}$$

with

$$D = A_{11}^{-1} + A_{11}^{-1} A_{12} \left( A_{22} - A_{21} A_{11}^{-1} A_{12} \right)^{-1} A_{21} A_{11}^{-1}$$

• Here: Let $X_v = \begin{pmatrix} X & v \end{pmatrix}$. The cross-product matrix becomes

$$X_v' X_v = \begin{pmatrix} A_{11} = X'X & A_{12} = X'v \\ A_{21} = A_{12}' & A_{22} = v'v \end{pmatrix}$$

• Task: calculate $(X_v' X_v)^{-1}$ based on the known terms $X'X$, $(X'X)^{-1}$, $X'v$, $v'v$.
• Slight complication: a column is inserted into $X$, not just appended.
  • This can be solved by permutation vectors (see [M-1] permutation).
• Let's call this procedure PMAC (partitioned matrices / append column) to ease exposition.
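The append-column case can be sketched in Mata as follows. The function name pmac() and its argument names are hypothetical; the body applies the partitioned-inverse formula with $A_{11}^{-1} = (X'X)^{-1}$, $A_{12} = X'v$ and the scalar $A_{22} = v'v$, and the result is compared against a direct inversion on made-up data.

```mata
mata:
// PMAC sketch: inverse of [X v]'[X v] from (X'X)^-1, X'v and v'v
real matrix pmac(real matrix XXinv, real colvector Xv, real scalar vv)
{
    real colvector u, a12
    real scalar    s
    real matrix    D

    u   = XXinv * Xv                 // A11^-1 A12
    s   = vv - Xv' * u               // A22 - A21 A11^-1 A12 (a scalar here)
    D   = XXinv + (u * u') / s       // upper-left block
    a12 = -D * Xv / vv               // upper-right block: -D A12 A22^-1
    // lower-right block: A22^-1 + A22^-1 A21 D A12 A22^-1
    return((D, a12 \ a12', 1 / vv + (Xv' * D * Xv) / vv^2))
}

// Illustrative data (assumed)
X = (1, 1 \ 1, 2 \ 1, 3 \ 1, 5 \ 1, 8)
v = (2 \ 1 \ 4 \ 3 \ 7)

direct  = invsym(cross((X, v), (X, v)))
updated = pmac(invsym(cross(X, X)), cross(X, v), cross(v, v))
mreldif(updated, direct)             // close to zero
end
```

With k existing columns, the update only needs matrix-vector products and one outer product, i.e. on the order of $k^2$ operations instead of the $k^3$ of a fresh inversion.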

SLIDE 19

Updating $(X'X)^{-1}$ Using Partitioned Matrices

• Problem: columns are sometimes deleted, not just added.
• Lag combination matrix (maxlags $= 2$ for all variables):

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 0 \\ 1 & 2 & 1 \\ \vdots & \vdots & \vdots \\ 2 & 2 & 2 \end{bmatrix}$$

• e.g. moving from row 3, $(1 \; 0 \; 2)$, to row 4, $(1 \; 1 \; 0)$, deletes two lags of the last regressor
• Solution: store matrices that the algorithm can jump back to, using pointers.

SLIDE 20

Pointer Variables

• General and "advanced" programming concept, but the basics are easy to understand and apply.
• Each variable has a name and a type.
  • The name really is just a device to refer to a specific location in memory; every location in memory has a unique address.
  • Since the type of the variable is known to Mata, it knows how large a memory range a variable name refers to, and how to interpret the value (the bits stored there).
• Think in these terms: each variable has an address and a value.
• Pointer variables hold memory addresses of other variables. Pointer variables can point to anything: scalars, matrices, pointers, objects, functions ...

SLIDE 21

Pointer Variables

• Pointers are often assigned to using "&"; they are dereferenced using "*".
• Read:
  • & : "the address of"
  • * : "the thing pointed to by"
• Example:

    mata:
    s = J(2,2,1)
    p = &s
    p                 // outputs something like 0xcb3cb60
    *p                // outputs the 2x2 matrix of ones
    *p = J(2,2,-7)
    s                 // now contains the matrix of -7s
    end

• See [M-2] pointers for many more details.
• What we need for our algorithm is an unknown number ($k + 1$) of matrices. We solve this by creating a vector of pointers to ...

SLIDE 22

Using Pointers for Updating $(X'X)^{-1}$

• Lag combination matrix:

$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 0 \\ \vdots & \vdots & \vdots \end{bmatrix}$$

• 3-element vector of pointers $vec = (p_1 \; p_2 \; p_3)$; each element points to a matrix. Then calculate $(X'X)^{-1}$ for...
  • ... lags (1 0 0) by ordinary matrix inversion; store using $p_1$
  • ... lags (1 0 1) by PMAC using $*p_1$; store using $p_3$
  • ... lags (1 0 2) by PMAC using $*p_3$
  • ... lags (1 1 0) by PMAC using $*p_1$; store using $p_2$
  • ... lags (1 1 1) by PMAC using $*p_2$; store using $p_3$
  • ... lags (1 1 2) by PMAC using $*p_3$
  • ... lags (1 2 0) by PMAC using $*p_2$; store using $p_2$
  ... and so forth.
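The store-and-jump-back pattern can be tried out with a small vector of pointers. In this sketch the stored matrices are simple placeholders (an assumption made to keep the example short) that stand in for the inverses an update step would produce; only the pointer mechanics are the point.

```mata
mata:
// three NULL pointers, one slot per column of the lag combination matrix
vec = J(1, 3, NULL)

// &(expression) stores the result in a new unnamed matrix and returns its address
vec[1] = &(J(2, 2, 1))              // stands in for the result of lags (1 0 0)
vec[3] = &(*(vec[1]) :+ 1)          // derived from *vec[1], stored via slot 3
vec[2] = &(*(vec[1]) :* 5)          // jump back to *vec[1] for the next branch
vec[3] = &(*(vec[2]) :+ 1)          // slot 3 is reused; *vec[1] is untouched

*(vec[1])                           // still the 2x2 matrix of ones
*(vec[3])                           // the 2x2 matrix of sixes
end
```

Writing *(vec[i]) with explicit parentheses makes clear that the element is selected first and dereferenced second.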

SLIDE 23

Lag Selection: Mata III (update inverses)

SLIDE 24

Lag Selection: Timings

Timings in seconds (2.5GHz, single core) for N=1000:

    Mata 2: no redundancies
    Mata 3: no redundancies + inverse updating

    k + 1   maxlags   # regressions   regress      Mata 1    Mata 2   Mata 3
    3       4                   100   1.6          0.36      0.11     0.14
    3       8                   650   12.5         1.33      0.09     0.13
    4       8                 5,800   132          11.8      0.31     0.27
    6       8               470,000   14,000       1,400     53       37
    8       8            38,000,000   (13 days?)   146,000   6,500    3,200

SLIDE 25

Recap

In this talk, we have discussed

• Potential strategies for improving code performance
• Basic asymptotic notation for the computing time of algorithms
• Quick look at the ARDL model
• Optimal lag selection
• Moving Stata code to Mata and optimizing the Mata code
• An advanced way of using linear algebra results to improve code performance
• Pointer variables

We have tried to illustrate that mindful code creation can be superior to the “brute force” methods of low-level programming languages and parallelization.

SLIDE 26

Thank you!

Questions? Comments? S.Kripfganz@exeter.ac.uk schneider@demogr.mpg.de

SLIDE 27

References

• Gould, William (2014, April 17): Using Mata Operators efficiently [Msg 8]. Message posted to https://www.statalist.org/forums/forum/general-stata-discussion/mata/993-using-mata-operators-efficiently?p=1826#post1826
• Kripfganz / Schneider (2016): ardl: Stata Module to Estimate Autoregressive Distributed Lag Models. Presentation held at the Stata Conference 2016, Chicago.
• Narayan, P.K. (2005): The Saving and Investment Nexus for China: Evidence from Cointegration Tests. Applied Economics, 37 (17), 1979-1990.
• Pesaran, M.H., Shin, Y. and R.J. Smith (2001): Bounds Testing Approaches to the Analysis of Level Relationships. Journal of Applied Econometrics, 16 (3), 289-326.
