Modeling Financial Durations Using Estimating Functions
Yaohua Zhang (1), Jian Zou (2), Nalini Ravishanker (1), Aerambamoorthy Thavaneswaran (3)
(1) Department of Statistics, University of Connecticut
(2) Department of Statistics, Worcester Polytechnic Institute
(3) Department of Statistics, University of Manitoba
QPRC, June 15, 2017
Outline
◮ Introduction
◮ Estimating Functions Approach for LogACD Models
◮ Simulation Study
◮ Application on Real Stock Prices
◮ Summary
Background
◮ Investigators are interested in studying the behavior of the exchange rate process
◮ High-frequency price quote data inherently arrive at irregularly spaced times, so the duration between consecutive data points is not uniform
◮ Traditional discrete-time models, which bin the data into equally spaced time intervals, are inadequate (bins that are too narrow are mostly empty; bins that are too wide smooth away the detail)
Why Do We Care?
◮ Information is important! (How long will it be until prices change?)
◮ The Rothschild family
◮ Knowing the time interval matters, as it could influence the speed with which a trader places an order
◮ In an active market, a price may last much less than a minute, or even a second
◮ If an automated trading system is used, opportunities may be eliminated
Literature Review
◮ Engle & Russell (1998) proposed a nonlinear model for irregularly spaced inter-event durations, called the Autoregressive Conditional Duration (ACD) model
◮ The authors treat the arrival times of the data as a point process with an intensity defined conditional on past activity
◮ Several generalizations have been discussed in the literature (Thavaneswaran et al., 2014)
◮ Developing fast and accurate methods for fitting models to long time series of durations under the least restrictive assumptions is an interesting ongoing research problem
A Review of Duration Models
Let $x_i = t_i - t_{i-1}$, where $i = 1, 2, \ldots$, denote a time series of durations, and let $\mathcal{F}^x_{i-1}$ denote the information associated with previous durations. The ACD($p$, $q$) model (Engle & Russell, 1998) is defined as:
$$x_i = \psi_i \varepsilon_i / \mu_\varepsilon, \quad \text{where} \quad \psi_i = \omega + \sum_{j=1}^{p} \alpha_j x_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j}.$$
The conditions $\omega > 0$, $\alpha_j \ge 0$ for $j = 1, \ldots, p$, $\beta_j \ge 0$ for $j = 1, \ldots, q$, and $\sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j < 1$ ensure that the duration process is non-negative and weakly stationary.
A Review of Duration Models Cont’d
The Log ACD($p$, $q$) model (Bauwens 2000, Pacurar 2008) relaxes the parameter restrictions that ensure non-negativity of the durations, and thus provides greater flexibility than the ACD($p$, $q$) model:
$$x_i = \exp(\psi_i)\, \varepsilon_i / \mu_\varepsilon, \quad \text{where} \quad \psi_i = \omega + \sum_{j=1}^{p} \alpha_j \log x_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j},$$
where the condition $\sum_{j=1}^{\max(p,q)} (\alpha_j + \beta_j) < 1$ ensures weak stationarity.
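The Log ACD recursion can be illustrated with a minimal simulation sketch. It assumes a Log ACD(1, 1) model with Exp(1) errors (so that $\mu_\varepsilon = 1$); the function name `simulate_log_acd11`, the burn-in length, and the starting level for $\psi$ are illustrative choices, not part of the original slides.

```python
import math
import random


def simulate_log_acd11(n, omega, alpha, beta, seed=0, burn_in=500):
    """Simulate n durations from a Log ACD(1, 1) model.

    Assumptions (illustrative): errors are Exp(1), hence mu_eps = 1,
    and the recursion is started at a rough level and then burned in.
    """
    rng = random.Random(seed)
    mu_eps = 1.0                          # mean of an Exp(1) error
    psi = omega / (1.0 - alpha - beta)    # rough starting level (assumption)
    log_x = psi
    durations = []
    for i in range(n + burn_in):
        # psi_i = omega + alpha * log x_{i-1} + beta * psi_{i-1}
        psi = omega + alpha * log_x + beta * psi
        # guard against an exact zero draw so that log(x) stays finite
        eps = max(rng.expovariate(1.0), 1e-300)
        x = math.exp(psi) * eps / mu_eps  # x_i = exp(psi_i) * eps_i / mu_eps
        log_x = math.log(x)
        if i >= burn_in:
            durations.append(x)
    return durations


# Parameter values taken from the Exp(1) row of the simulation table.
sample = simulate_log_acd11(7500, omega=0.04, alpha=0.05, beta=0.75)
```

Because the durations are generated through `exp(psi)`, they are positive for any real parameter values, which is the flexibility the Log ACD form is meant to provide.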
The Problem
Suppose durations data $\{x_i\}_{i=1}^{n}$ that follow the Log ACD($p$, $q$) model are available. Let $g = \max(p, q)$. Then the maximum likelihood estimates (MLEs) $\hat{\theta}$ may be obtained by maximizing the conditional likelihood function (Tsay 2009):
$$L(\theta \mid x_n) = \prod_{i=g+1}^{n} f(x_i \mid x_{i-1}, \theta).$$
◮ In practice, the true $f_\varepsilon(\cdot)$ is usually unknown
◮ In some cases, the ML or QML approach may not be feasible (Thavaneswaran, Ravishanker & Liang, 2014)
◮ The model orders ($p$, $q$) are unknown
General Framework
◮ We propose a semi-parametric estimation approach based on combined martingale estimating functions
◮ It requires only the specification of the first four conditional moments of the duration process
◮ Our method can be easily extended by adding a penalty term
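General Framework Cont’d
Suppose $x_i$ is a realization of a duration process and let $\mathcal{F}^x_{i-1}$ denote the information associated with $\{x_1, \ldots, x_{i-1}\}$. Suppose the first four conditional moments of $\{x_i\}$ given $\mathcal{F}^x_{i-1}$ (the conditional mean, variance, and third and fourth central moments) are $\mu_i(\theta)$, $\sigma_i^2(\theta)$, $\gamma_i(\theta)$, and $\kappa_i(\theta)$. Define the linear and quadratic martingale differences by $m_i(\theta) = x_i - \mu_i(\theta)$ and $M_i(\theta) = m_i^2(\theta) - \sigma_i^2(\theta)$. Their quadratic variations and covariation are
$$\begin{aligned}
\langle m \rangle_i &= E[m_i^2(\theta) \mid \mathcal{F}^x_{i-1}] = \sigma_i^2(\theta), \\
\langle M \rangle_i &= E[m_i^4(\theta) \mid \mathcal{F}^x_{i-1}] - \bigl( E[m_i^2(\theta) \mid \mathcal{F}^x_{i-1}] \bigr)^2 = \kappa_i(\theta) - \sigma_i^4(\theta), \\
\langle m, M \rangle_i &= E[m_i^3(\theta) \mid \mathcal{F}^x_{i-1}] = \gamma_i(\theta).
\end{aligned}$$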
General Framework Cont’d
Consider the class $\mathcal{M}$ of zero-mean, square integrable $p$-dimensional martingale estimating functions,
$$\mathcal{M} = \Bigl\{ g_n(\theta) : g_n(\theta) = \sum_{i=1}^{n} \bigl( a_{i-1}(\theta)\, m_i(\theta) + b_{i-1}(\theta)\, M_i(\theta) \bigr) \Bigr\},$$
where $a_{i-1}(\theta)$ and $b_{i-1}(\theta)$ are $p \times q$ matrices that are functions of $\theta$ and $x_1, \ldots, x_{i-1}$, $1 \le i \le n$.
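To make the structure of a member of this class concrete, the sketch below evaluates $g_n$ for a scalar parameter from sequences that the caller supplies. The function name `combined_estimating_function` and the list-based interface are hypothetical, and the weights are plain inputs here rather than the optimal weights, which the slides do not spell out.

```python
def combined_estimating_function(x, mu, sigma2, a, b):
    """Evaluate g_n = sum_i (a_{i-1} * m_i + b_{i-1} * M_i), scalar case.

    x      : observed durations x_i
    mu     : conditional means mu_i(theta)
    sigma2 : conditional variances sigma_i^2(theta)
    a, b   : weights a_{i-1}(theta), b_{i-1}(theta), supplied by the caller
    """
    total = 0.0
    for x_i, mu_i, s2_i, a_i, b_i in zip(x, mu, sigma2, a, b):
        m_i = x_i - mu_i        # linear martingale difference
        M_i = m_i ** 2 - s2_i   # quadratic martingale difference
        total += a_i * m_i + b_i * M_i
    return total
```

Setting every `b` weight to zero reduces this to a purely linear estimating function, which shows what the quadratic term adds to the combination.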
Three Approaches
◮ Nonlinear Equation Solver Estimation (NESE): solve the system of nonlinear equations $g^*_C(\theta) = 0$ for $\theta$
◮ Approximate Vector Recursive Estimation (AVRE): estimate $\theta$ via recursive formulas
◮ Approximate Iterated Scalar Recursive Estimation (AISRE): estimate $\theta$ through a sequence of scalar recursions, one for each component, and iterate these to convergence
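Starting Values for the Recursion
Suppose $\{x_i\}$ follows the Log ACD($p$, $q$) model, and let $y_i = \log x_i$ denote the natural logarithm of $x_i$. Then,
$$\begin{aligned}
y_i &= \psi_i + \log \varepsilon_i - \log \mu_\varepsilon \\
&= \omega + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j \psi_{i-j} + \log \varepsilon_i - \log \mu_\varepsilon \\
&= \omega + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j \bigl( y_{i-j} - \log \varepsilon_{i-j} + \log \mu_\varepsilon \bigr) + \log \varepsilon_i - \log \mu_\varepsilon \\
&= \omega^\star + \sum_{j=1}^{p} \alpha_j y_{i-j} + \sum_{j=1}^{q} \beta_j y_{i-j} - \sum_{j=1}^{q} \beta_j \nu_{i-j} + \nu_i,
\end{aligned}$$
where $\nu_i$ denotes $\log \varepsilon_i - \log \mu_\varepsilon$ centered at its mean, with the resulting constant absorbed into $\omega^\star$. It follows that $y_i = \log x_i$ follows an ARMA($\max(p, q)$, $q$) model with non-normal errors, i.e.,
$$\Bigl( 1 - \sum_{j=1}^{\max(p,q)} (\alpha_j + \beta_j) B^j \Bigr) y_i = \omega^\star + \Bigl( 1 - \sum_{j=1}^{q} \beta_j B^j \Bigr) \nu_i,$$
with $B$ the backshift operator, $\alpha_j = 0$ for $j > p$, and $\beta_j = 0$ for $j > q$.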
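The ARMA representation suggests one way to obtain starting values: fit an ARMA model to $y_i = \log x_i$ with any standard routine, then map its coefficients back to $(\alpha_j, \beta_j)$. The sketch below shows only that mapping. It assumes the common convention $y_i = c + \sum_j \phi_j y_{i-j} + \nu_i + \sum_j \theta_j \nu_{i-j}$, under which $\phi_j = \alpha_j + \beta_j$ and $\theta_j = -\beta_j$; the function name `arma_to_log_acd` is hypothetical.

```python
def arma_to_log_acd(phi, theta, p, q):
    """Map ARMA(max(p, q), q) coefficients of y_i = log x_i to Log ACD(p, q).

    Assumed convention: y_i = c + sum_j phi_j y_{i-j}
                              + nu_i + sum_j theta_j nu_{i-j},
    so that phi_j = alpha_j + beta_j and theta_j = -beta_j.
    """
    beta = [-t for t in theta[:q]]
    alpha = []
    for j in range(p):
        beta_j = beta[j] if j < q else 0.0   # beta_j = 0 for j > q
        alpha.append(phi[j] - beta_j)
    return alpha, beta


# Example: AR coefficient 0.80 and MA coefficient -0.75 map back to
# approximately alpha = 0.05 and beta = 0.75.
alpha0, beta0 = arma_to_log_acd([0.80], [-0.75], p=1, q=1)
```

Software that writes the MA polynomial with the opposite sign would need `theta` negated before calling this function.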
Simulation Study
Table: Percentiles of parameter estimates for the Log ACD(p, q) models for L = 250 simulated duration series, each of length n = 7500.

                               NESE                AVRE                AISRE
f_ε(.)            Param  True   5th   50th  95th    5th   50th  95th    5th   50th  95th
Gamma(0.6, 0.7)   ω      0.25   0.23  0.25  0.26    0.23  0.25  0.26    0.24  0.25  0.27
                  α      0.06   0.04  0.06  0.08    0.04  0.06  0.08    0.05  0.06  0.08
Exp(1)            ω      0.04   0.02  0.04  0.07    0.02  0.04  0.18    0.03  0.04  0.06
                  α      0.05   0.03  0.05  0.07    0.02  0.05  0.24    0.04  0.05  0.07
                  β      0.75   0.42  0.74  0.89    0.48  0.73  0.83    0.62  0.73  0.83
Weibull(0.4, 0.5) ω      1.00   0.37  1.12  3.65    0.63  1.06  1.83    0.63  1.08  1.90
                  α      0.05   0.01  0.05  0.08   -0.03  0.05  0.26    0.04  0.05  0.07
                  β      0.60  -0.45  0.55  0.85    0.32  0.58  0.75    0.29  0.57  0.75
Weibull(0.9, 0.9) ω      0.50   0.37  0.51  0.68    0.42  0.51  0.62    0.42  0.51  0.62
                  α1     0.05   0.03  0.05  0.07    0.03  0.05  0.07    0.03  0.05  0.07
                  α2     0.10   0.07  0.10  0.13    0.07  0.10  0.13    0.08  0.10  0.12
                  β      0.60   0.47  0.59  0.69    0.52  0.59  0.65    0.52  0.59  0.65
Gamma(0.5, 0.8)   ω      0.15   0.07  0.18  0.63   -0.15  0.20  0.55    0.09  0.19  1.61
                  α1     0.10   0.08  0.10  0.11   -0.02  0.10  0.22    0.08  0.10  0.12
                  α2    -0.05  -0.07 -0.04  0.01   -0.55 -0.04  0.19   -0.07 -0.04 -0.01
                  β1     0.05  -0.54  0.02  0.15   -0.34 -0.01  0.37   -0.26  0.01  0.14
                  β2     0.70   0.28  0.68  0.78    0.11  0.66  0.78    0.45  0.67  0.76
Penalized Estimating Equations
◮ Penalized methods are usually used in regression settings
◮ However, the literature on variable selection in estimating equations is sparse
◮ Wang et al. (2012) and the references therein discuss penalized generalized estimating equations in a longitudinal setting
Penalized Estimating Equations Cont’d
◮ Recap: $g_n(\theta) = \sum_{i=1}^{n} \bigl( a_{i-1}(\theta)\, m_i(\theta) + b_{i-1}(\theta)\, M_i(\theta) \bigr)$
◮ Now: $g^*_C(\theta) - n\, p'_\lambda(|\theta|)$, where $p'_\lambda(|\theta|)$ is the first derivative of the Smoothly Clipped Absolute Deviation (SCAD) penalty (Fan et al. 2001) and is defined as
$$p'_\lambda(|\theta|) = \lambda \Bigl\{ I(|\theta| \le \lambda) + \frac{(a\lambda - |\theta|)_+}{(a - 1)\lambda}\, I(|\theta| > \lambda) \Bigr\}$$
◮ Remark: SCAD can achieve unbiasedness (which LASSO lacks for large coefficients), sparsity, and continuity.
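The SCAD derivative is simple to compute directly, as the following minimal sketch shows. The function name `scad_derivative` is hypothetical, and the default `a = 3.7` is an assumption here: it is the value commonly recommended for SCAD, but the slides do not state which value was used.

```python
def scad_derivative(theta, lam, a=3.7):
    """First derivative p'_lambda(|theta|) of the SCAD penalty.

    lam must be positive and a must exceed 2. The default a = 3.7 is an
    assumption (a commonly used value), not taken from the slides.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        return lam   # same constant rate as an L1 (LASSO) penalty
    # lam * (a*lam - t)_+ / ((a - 1) * lam) simplifies to the line below
    return max(a * lam - t, 0.0) / (a - 1.0)
```

For $|\theta| \le \lambda$ the derivative equals $\lambda$, it then decays linearly, and it is exactly zero once $|\theta| \ge a\lambda$, so large coefficients are left unpenalized.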
Illustrative Simulation Study
Table: Percentiles of parameter estimates for the Log ACD(p, 0) models for L = 500 simulated duration series, each of length n = 7500.

                                 EF with penalty
f_ε(.)            Param  True   5th   50th  95th
Gamma(0.5, 0.6)   ω      0.25   0.22  0.25  0.27
                  α1     0.20   0.19  0.20  0.20
                  α2     0.10   0.09  0.10  0.10
Gamma(0.5, 0.6)   ω      0.10   0.06  0.09  0.14
                  α1     0.10   0.09  0.10  0.10
                  α2     0.05   0.04  0.05  0.05
                  α3     0.05   0.04  0.05  0.05
                  α4     0.10   0.09  0.10  0.10
Illustrative Simulation Study Cont’d
[Plot: parameter estimates θ (vertical axis, 0.0 to 0.4) against λ (horizontal axis, decreasing from 0.70 to 0.05), with one curve each for ω and α1, ..., α20.]
Figure: Solution path for the simulated LogACD(2, 0) model. The vertical bar represents the optimal λ.