Examples: Linear processes
Let (ω_t)_{t∈Z} be a weakly stationary sequence and let (a_j) be such that Σ_{j∈Z} |a_j| < ∞. Then the sequence
X_i = Σ_{j∈Z} a_j ω_{i−j}
is weakly stationary.
Remark: The fact that (ω_j)_{j∈Z} is weakly stationary implies that E(ω_1²) < ∞. Consequently E(X_1²) < ∞.
Examples: Linear processes
Let (ω_t)_{t∈Z} be a strictly stationary sequence such that E(ω_1²) < ∞ and let (a_j) be such that Σ_{j∈Z} |a_j| < ∞. Then the sequence
X_i = Σ_{j∈Z} a_j ω_{i−j}
is strictly stationary and satisfies E(X_1²) < ∞.
Remark: In the context of strictly stationary sequences (ω_j)_{j∈Z}, an additional assumption is required, namely E(ω_1²) < ∞.
Remark: Moreover, if (ω_t)_{t∈Z} is ergodic, so is (X_t)_{t∈Z}.
Examples: Linear processes
Let (ω_t)_{t∈Z} be a white noise sequence with E(ω_1²) < ∞ and let (a_j) be such that Σ_{j∈Z} |a_j|² < ∞. Then the sequence
X_i = Σ_{j∈Z} a_j ω_{i−j}
is weakly stationary and satisfies E(X_1²) < ∞.
Examples: Linear processes
Let (ω_t)_{t∈Z} be an independent and identically distributed sequence with E(ω_1²) < ∞ and let (a_j) be such that Σ_{j∈Z} |a_j|² < ∞. Then the sequence
X_i = Σ_{j∈Z} a_j ω_{i−j}
is weakly and strictly stationary and satisfies E(X_1²) < ∞.
Remark: If (ω_t)_{t∈Z} is ergodic then (X_t)_{t∈Z} is an ergodic sequence.
Examples: i.i.d. white noise and i.i.d. Gaussian white noise
◮ (ω_t) ∼ i.i.d.(0, σ²) when (ω_t) ∼ WN(0, σ²) and the (ω_t) are i.i.d.
◮ (ω_t) ∼ i.i.d. N(0, σ²) if, in addition, ω_t ∼ N(0, σ²) for all t ∈ Z.
Fig.: Simulated Gaussian white noise (1000 observations).
Stationarities
Exercise: Check the stationarity of the following processes:
◮ the white noise, defined in (1)
◮ the random walk, defined in (4)
Stationarities
Properties: If (X_t)_{t∈Z} is stationary and if (a_i)_{i∈Z} is a sequence of real numbers satisfying Σ_{i∈Z} |a_i| < ∞, then
Y_t = Σ_{i∈Z} a_i X_{t−i}
is stationary. This also holds if we consider a finite sequence of reals (a_i)_{|i|≤M}.
Models with serial correlation I
(2) Moving averages
Consider a white noise (ω_t)_{t∈Z} and define the series (X_t)_{t∈Z} as
X_t = (1/3)(ω_{t−1} + ω_t + ω_{t+1})  ∀ t ∈ Z.
Notice the similarity with the SOI and some fMRI series.
Fig.: Simulated moving average (500 observations).
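A minimal simulation sketch of this three-point moving average, assuming Gaussian white noise; variable names and the random seed are mine, not part of the course material:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
w = rng.normal(0.0, 1.0, size=n + 2)        # white noise, 2 extra values for the edges

# X_t = (w_{t-1} + w_t + w_{t+1}) / 3, computed as a length-3 convolution
x = np.convolve(w, np.ones(3) / 3, mode="valid")   # length n

print(x[:5], x.var())   # the smoother reduces the variance from 1 to about 1/3
```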
Models with serial correlation II
(3) Autoregression example
Consider a white noise (ω_t)_{t∈Z} and define the series (X_t)_{t∈Z} as
X_t = X_{t−1} − 0.9 X_{t−2} + ω_t  ∀ t ∈ Z.
Notice
◮ the almost periodic behavior and the similarity with the speech series
◮ the above definition misses initial conditions; we'll come back to that later.
Fig.: Simulated autoregression (500 observations).
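A sketch of how such a path might be simulated with numpy, assuming Gaussian noise; since the definition above omits initial conditions, the usual trick is to start the recursion at zero and discard a burn-in segment (the burn-in length here is my choice):

```python
import numpy as np

rng = np.random.default_rng(1)
n, burn = 500, 100
w = rng.normal(size=n + burn)

x = np.zeros(n + burn)
for t in range(2, n + burn):
    # X_t = X_{t-1} - 0.9 X_{t-2} + w_t
    x[t] = x[t - 1] - 0.9 * x[t - 2] + w[t]

x = x[burn:]   # drop the burn-in so the arbitrary start X_0 = X_1 = 0 is forgotten
print(x[:5])
```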
Models with serial correlation III
(4) Random walk with drift
Consider a white noise (ω_t)_{t∈Z} and define the series (X_t)_{t∈Z} as
X_t = δ + X_{t−1} + ω_t,  X_0 = 0,  ∀ t ∈ Z,
where δ is the drift, X_{t−1} the previous position and ω_t the step.
Fig.: Random walk with drift δ = 0.2 (upper jagged line), with δ = 0 (lower jagged line) and line with slope 0.2 (dashed line).
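A sketch reproducing the figure's setup (δ = 0.2 versus δ = 0) with numpy; the cumulative sum implements X_t = δ + X_{t−1} + ω_t with X_0 = 0 (parameter values taken from the figure caption, seed mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, delta = 200, 0.2
w = rng.normal(size=n)

x_drift = np.cumsum(delta + w)          # random walk with drift delta
x_plain = np.cumsum(w)                  # random walk without drift (delta = 0)
trend = delta * np.arange(1, n + 1)     # the deterministic line with slope delta

print(x_drift[-1], x_plain[-1], trend[-1])
```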
Models with serial correlation IV
(5) Signal plus noise
Consider a white noise (ω_t)_{t∈Z} and define the series (X_t)_{t∈Z} as
X_t = 2 cos(2π(t + 15)/50) + ω_t  ∀ t ∈ Z,
where the cosine term is the signal and ω_t is the white noise.
Notice the similarity with fMRI signals.
Fig.: 2 cos(2πt/50 + 0.6π) + N(0, 1) (500 observations).
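A short sketch of this signal-plus-noise model in numpy, assuming standard Gaussian noise as in the figure caption (seed and sample size are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(1, 501)

signal = 2.0 * np.cos(2 * np.pi * (t + 15) / 50)   # the deterministic signal
x = signal + rng.normal(0.0, 1.0, size=t.size)     # add N(0, 1) white noise

print(x[:5])
```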
Measures of dependence
We now introduce various measures that describe the general behavior of a process as it evolves over time.
Mean function
Define, for a time series (X_t)_{t∈Z}, the mean function
μ_X(t) = E(X_t)  ∀ t ∈ Z,
when it exists.
Exercise: Compute the mean functions of
◮ the moving average defined in (2)
◮ the random walk plus drift defined in (4)
◮ the signal plus noise model in (5)
Autocovariance
We now assume that for all t ∈ Z, X_t ∈ L².
The autocovariance function of a time series (X_t)_{t∈Z} is defined as
γ_X(s, t) = Cov(X_s, X_t) = E[(X_s − E(X_s))(X_t − E(X_t))]  ∀ s, t ∈ Z.
Properties
◮ It is a symmetric function: γ_X(s, t) = γ_X(t, s).
◮ It measures the linear dependence between two values of the same series observed at different times.
◮ If (X_t)_{t∈Z} is stationary, γ_X(t, t + h) = γ_X(t + h, t) = γ_X(0, h). In this context we write γ_X(h) as short for γ_X(0, h).
Autocovariance of stationary time series
Theorem: The autocovariance function γ_X of a stationary time series X verifies
1. γ_X(0) ≥ 0
2. |γ_X(h)| ≤ γ_X(0)
3. γ_X(h) = γ_X(−h)
4. γ_X is positive-definite.
Furthermore, any function γ that satisfies (3) and (4) is the autocovariance of some stationary time series.
Reminder:
◮ A function f : Z → R is positive-definite if for all n, the matrix F_n, with entries (F_n)_{i,j} = f(i − j), is positive-definite.
◮ A matrix F_n ∈ R^{n×n} is positive-definite if, for all vectors a ∈ R^n, aᵀ F_n a ≥ 0.
Autocovariance
Exercise: Compute the autocovariance functions of
◮ the white noise defined in (1)
◮ the moving average defined in (2)
◮ a moving average X_t = ω_t + θ ω_{t−1}, where ω_t is a weak white noise.
Autocorrelation function (ACF)
Associated to the autocovariance function, we define the autocorrelation function.
The ACF of a time series (X_t)_{t∈Z} is defined as
ρ_X(s, t) = γ_X(s, t) / √(γ_X(s, s) γ_X(t, t))  ∀ s, t ∈ Z.
◮ It is a symmetric function: ρ_X(s, t) = ρ_X(t, s).
◮ It measures the correlation between two values of the same series observed at different times.
◮ In the context of stationarity, ρ_X(t, t + h) = ρ_X(t + h, t) = ρ_X(0, h). In this context we write ρ_X(h) as short for ρ_X(0, h).
Moving average MA(1) model
(6) Moving average model MA(1)
Consider a white noise (ω_t)_{t∈Z} ∼ WN(0, σ²) and construct the MA(1) as
X_t = ω_t + θ ω_{t−1}  ∀ t ∈ Z.
⚠ Warning: not to be confused with the moving average smoother.
Exercise:
◮ Study its stationarity.
◮ Compute its autocovariance function and its autocorrelation function.
◮ Write the MA(1) as a linear process.
Autoregressive AR(1) model
(7) Autoregressive AR(1)
Consider a white noise (ω_t)_{t∈Z} ∼ WN(0, σ²) and construct the AR(1) as
X_t = φ X_{t−1} + ω_t  ∀ t ∈ Z,  X_0 = 0.
Exercise: Assume |φ| < 1. Show that under this condition X_t is stationary, and compute
◮ its mean function
◮ its ACF.
AR(1) as a linear process
Let (X_t)_{t∈Z} be the stationary solution to X_t − φ X_{t−1} = W_t, where W_t is a white noise WN(0, σ²). If |φ| < 1,
X_t = Σ_{j=0}^∞ φ^j W_{t−j}
is a solution. This infinite sum converges in mean square, since |φ| < 1 implies Σ_{j≥0} |φ^j| < ∞.
Furthermore, X_t is the unique stationary solution, since we can check that any other stationary solution Y_t is the mean square limit:
lim_{n→∞} E[(Y_t − Σ_{i=0}^{n−1} φ^i W_{t−i})²] = lim_{n→∞} E[(φ^n Y_{t−n})²] = 0.
As a conclusion, if |φ| < 1 then X_t can be written as X_t = Σ_{j=0}^∞ φ^j W_{t−j}.
Now, what if |φ| = 1? Or if |φ| > 1?
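To see the mean-square convergence concretely, one can compare a recursively simulated AR(1) with the truncated sum Σ_{j=0}^{J−1} φ^j W_{t−j}. A small numpy sketch, where the value of φ, the truncation level J and all variable names are my choices:

```python
import numpy as np

rng = np.random.default_rng(4)
phi, n, J = 0.8, 300, 60
w = rng.normal(size=n + J)

# recursive simulation, started J steps in the past so the zero start is forgotten
x_rec = np.zeros(n + J)
for t in range(1, n + J):
    x_rec[t] = phi * x_rec[t - 1] + w[t]
x_rec = x_rec[J:]

# truncated linear-process representation X_t ~ sum_{j=0}^{J-1} phi^j W_{t-j}
weights = phi ** np.arange(J)
x_lin = np.array([weights @ w[t + J:t:-1] for t in range(n)])

print(np.max(np.abs(x_rec - x_lin)))   # tiny: the two constructions nearly coincide
```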
AR(1): another formulation
The equation X_t − φ X_{t−1} = W_t, where W_t is a white noise WN(0, σ²), is equivalent to
Φ(B) X_t = W_t,
where B is the back-shift operator, B X_t = X_{t−1}, and where Φ(z) = 1 − φ z.
Also, we can write
X_t = Σ_{j=0}^∞ φ^j W_{t−j} = Σ_{j=0}^∞ φ^j B^j W_t = Π(B) W_t.
AR(1): another formulation
With these notations,
Π(B) = Σ_{j=0}^∞ φ^j B^j  and  Φ(B) = 1 − φ B,
we can check that Π(B) = Φ^{−1}(B): thus
Φ(B) X_t = W_t  ⇔  X_t = Π(B) W_t.
AR(1): another formulation
Notice that manipulating operators like Φ(B) or Π(B) is like manipulating polynomials, with
1/(1 − φ z) = 1 + φ z + φ² z² + · · · ,
provided that |φ| < 1 and |z| ≤ 1.
If |φ| > 1, Π(B) W_t does not converge. But we can rearrange and write
X_{t−1} = (1/φ) X_t − (1/φ) W_t,
so that we can check that the unique stationary solution is
X_t = − Σ_{j=1}^∞ φ^{−j} W_{t+j}.
Notice that here X_t depends on the future of W_t. → notions of causality and invertibility.
Linear processes
(8) Linear process
Consider a white noise (ω_t)_{t∈Z} ∼ WN(0, σ²) and define the linear process X as follows
X_t = μ + Σ_{j∈Z} ψ_j ω_{t−j}  ∀ t ∈ Z,
where μ ∈ R and (ψ_j) satisfies Σ_{j∈Z} |ψ_j| < ∞.
Theorem: The series in Equation (8) converges in L² and the linear process X defined above is stationary (see Proposition 3.1.2 in [BD13]).
Exercise: Compute the mean and autocovariance functions of the linear process (X_t)_{t∈Z}.
Examples of linear processes
Exercises
◮ Show that the following processes are particular linear processes:
  ◮ the white noise process
  ◮ the MA(1) process.
◮ Consider a linear process as defined in (8); put μ = 0,
  ψ_j = φ^j if j ≥ 0,  ψ_j = 0 if j < 0,
  and suppose |φ| < 1. Show that X is in fact an AR(1) process.
Linear prediction
Linear predictor
The best least squares estimate of Y given X is E(Y | X). Indeed
E[(Y − E(Y | X))² | X] = min_f E[(Y − f(X))² | X]  and  E[(Y − E(Y | X))²] = min_f E[(Y − f(X))²].
Similarly, the best least squares estimate of X_{n+h} given X_n is f(X_n) = E(X_{n+h} | X_n).
For a Gaussian and stationary (X_t)_{t∈Z}, the best estimate of X_{n+h} given X_n = x_n is
f(x_n) = μ + ρ(h)(x_n − μ),  with  E(X_{n+h} − f(X_n))² = σ²(1 − ρ(h)²).
Prediction accuracy improves as |ρ(h)| → 1.
The predictor is linear since f(x_n) = μ(1 − ρ(h)) + ρ(h) x_n.
ACF and prediction
Linear predictor and ACF
Let X be a stationary time series with ACF ρ. The linear predictor X̂_{n+h}^{(n)} of X_{n+h} given X_n is defined as
X̂_{n+h}^{(n)} = argmin_{a,b} E[(X_{n+h} − (a X_n + b))²] = ρ(h)(X_n − μ) + μ.
Exercise: Prove the result.
Notice that
◮ linear prediction needs only second-order statistics; we'll see later that it is a crucial property for forecasting.
◮ the result extends to longer histories (X_n, X_{n−1}, . . .).
Chapter 2 : Estimation of the mean and of the ACF
Estimation of the mean
Theorem: Suppose that X is a stationary time series and recall that for all t, h ∈ Z,
μ_X(t) = μ,  γ_X(h) = Cov(X_t, X_{t+h})  and  ρ_X(h) = γ_X(h)/γ_X(0).
Let (X_i)_{i∈Z} be weakly stationary with autocovariance function γ_X. Then
X̄_n →_P E(X_0) as n → ∞  if and only if  (1/n) Σ_{i=1}^n γ_X(i) → 0 as n → ∞.
Estimation of the ACF
Estimation
Consider observations X_1, . . . , X_n (from the strictly stationary time series (X_t)_{t∈Z}), with E(X_0²) < ∞ and satisfying (1/n) Σ_{i=1}^n γ_X(i) → 0 as n → ∞. We can compute
◮ the sample mean X̄ = (1/n) Σ_{t=1}^n X_t
◮ (1/(n−p)) Σ_{i=1}^{n−p} X_i X_{i+p} →_P E(X_0 X_p) as n → ∞
◮ the sample autocovariance function
  γ̂_X(h) = (1/n) Σ_{t=1}^{n−|h|} (X_{t+|h|} − X̄)(X_t − X̄)  ∀ −n < h < n
◮ the sample autocorrelation function ρ̂_X(h) = γ̂_X(h)/γ̂_X(0).
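A sketch of these estimators in numpy, following the slide's formulas exactly (note the 1/n factor rather than 1/(n−|h|)); the white-noise example and the helper name are mine:

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocovariance and autocorrelation, using the 1/n convention."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()                      # sample mean
    d = x - xbar
    gamma = np.array([np.sum(d[h:] * d[:n - h]) / n for h in range(max_lag + 1)])
    rho = gamma / gamma[0]               # sample autocorrelation
    return gamma, rho

rng = np.random.default_rng(5)
w = rng.normal(size=1000)                # white noise: true ACF is 0 for h >= 1
gamma_hat, rho_hat = sample_acf(w, 10)
print(np.round(rho_hat, 3))
```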
⚠ Warning: γ_X(h) = Cov(X_t, X_{t+h}), but the sample autocovariance function is not the corresponding empirical covariance! Indeed
(1/n) Σ_{t=1}^{n−|h|} (X_{t+|h|} − X̄)(X_t − X̄)
≠ (1/(n−|h|)) Σ_{t=1}^{n−|h|} (X_{t+|h|} − (1/(n−|h|)) Σ_{s=1}^{n−|h|} X_{s+|h|})(X_t − (1/(n−|h|)) Σ_{s=1}^{n−|h|} X_s).
Examples of sample ACF
Exercise: Can you find the generating time series models (white noise, MA(1), AR(1), random walk with drift) associated with the sample ACFs?
Fig.: Sample ACFs 1–4 (ACF against lag, lags 0 to 20).
Examples of sample ACF
Fig.: The ACF of the speech data example on slide 9.
Notice:
◮ the regular repetition of short peaks with decreasing amplitude.
Sample ACF behavior

Time Series  | Sample ACF ρ̂_X(h)
-------------|-------------------------------
White noise  | Zero
Trend        | Slow decay
Periodic     | Periodic
MA(q)        | Zero for |h| > q
AR(p)        | Decays to zero exponentially
Properties of empirical sums: soft version of the Law of Large Numbers
Let X_k = f(· · · , ε_{k−1}, ε_k, ε_{k+1}, · · ·), where (ε_i)_{i∈Z} is a sequence of independent and identically distributed random variables and where f : R^Z → R. Let g be such that E(|g(X_0)|) < ∞. Then
(1/n) Σ_{k=1}^n g(X_k) →_P E(g(X_0)) as n → ∞.
Comments:
• If X_k = f(· · · , ε_{k−1}, ε_k, ε_{k+1}, · · ·), where (ε_i)_{i∈Z} is a sequence of independent and identically distributed random variables, then (X_k)_{k∈Z} is a strictly stationary time series.
• More generally, for g such that E(|g(· · · , X_{−1}, X_0, X_{+1}, · · ·)|) < ∞,
(1/n) Σ_{k=1}^n g(· · · , X_{k−1}, X_k, X_{k+1}, · · ·) →_P E[g(· · · , X_{−1}, X_0, X_{+1}, · · ·)] as n → ∞.
• Most time series (X_k)_{k∈Z} satisfy that there exists (ε_i)_{i∈Z}, a sequence of independent and identically distributed random variables, such that X_k = f(· · · , ε_{k−1}, ε_k, ε_{k+1}, · · ·).
Properties of X̄_n: another version of the Law of Large Numbers
If (X_k)_{k∈Z} is a stationary time series, then the sample mean verifies
E(X̄_n) = μ  and  Var(X̄_n) = (1/n) Σ_{k=−(n−1)}^{n−1} (1 − |k|/n) γ_X(k).
Moreover
X̄_n →_{L²} μ as n → ∞  if and only if  (1/n) Σ_{k=−n}^{n} γ_X(k) → 0 as n → ∞.
In that case
n Var(X̄_n) → Σ_{k=−∞}^{∞} γ_X(k) = σ² Σ_{k=−∞}^{∞} ρ_X(k) as n → ∞.
See Appendix A [SS10].
Comments:
• It concerns time series which are not necessarily functions of i.i.d. variables, as in the "soft version".
• It is less general than the "soft version" since it holds only for X̄_n and not for functions of X_k.
Large sample property: asymptotic normality
Theorem: Under general conditions, if X is a white noise, then for n large, the sample ACF ρ̂_X(h), for h = 1, 2, . . . , H, where H is fixed but arbitrary, is approximately normally distributed with zero mean and standard deviation given by
σ_{ρ̂_X(h)} = 1/√n.
See Appendix A [SS10].
Consequence: only the peaks outside of ±2/√n may be considered to be significant.
Fig.: Sample ACF of a white noise with ±2/√n bands (lags 0 to 20).
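A quick numerical illustration of the ±2/√n rule, assuming Gaussian white noise and recomputing the sample ACF as on the previous slides (sample size, number of lags and seed are mine):

```python
import numpy as np

rng = np.random.default_rng(6)
n, H = 1000, 20
w = rng.normal(size=n)

d = w - w.mean()
gamma = np.array([np.sum(d[h:] * d[:n - h]) / n for h in range(H + 1)])
rho = gamma / gamma[0]

band = 2 / np.sqrt(n)
outside = np.sum(np.abs(rho[1:]) > band)
# For a white noise, roughly 5% of the H lags should exceed the band by chance.
print(f"lags outside +-2/sqrt(n): {outside} out of {H}")
```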
Chapter 3 : ARMA models
Introduction
We now consider that we have estimated the trend and seasonal components of
Y_t = T_t + S_t + X_t
and focus on X_t.
Aim of the chapter: to model the time series X via ARMA models. They allow
◮ to describe this time series
◮ to forecast.
Key facts:
◮ For every stationary process with autocovariance function γ verifying lim_{h→∞} γ_X(h) = 0, it is possible to find an ARMA process with the same autocovariance function, see [BD13].
◮ The Wold decomposition (see [SS10] Appendix B) also plays an important role. It says that every stationary process is the sum of an MA(∞) process and a deterministic process.
AR(1)
Exercise: Consider a time series X following the AR(1) model X_t = φ X_{t−1} + ω_t ∀ t ∈ Z.
1. Show that for all k > 0, X_t = φ^k X_{t−k} + Σ_{j=0}^{k−1} φ^j ω_{t−j}.
2. Assume that |φ| < 1 and prove X_t =_{L²} Σ_{j=0}^∞ φ^j ω_{t−j}.
3. Assume now that |φ| > 1 and prove that
   3.1 Σ_{j=0}^{k−1} φ^j ω_{t−j} does not converge in L²
   3.2 one can write X_t = −Σ_{j=1}^∞ φ^{−j} ω_{t+j}
   3.3 Discuss why the case |φ| > 1 is useless.
The case where |φ| = 1 is a random walk (slide 4) and we already proved that this is not a stationary time series.
Fig.: Simulated AR(1) paths with φ = +0.9 (top) and φ = −0.9 (bottom), 100 observations each.
Note on polynomials in R
Notice that manipulating operators like Φ(B) is like manipulating polynomials with complex variables. For instance
1/(1 − φ z) = 1 + φ z + φ² z² + · · · ,
provided that |φ| < 1 and |z| ≤ 1.
More generally, if the polynomial P has no roots in the unit circle, then for |z| ≤ 1 we have
1/P(z) = Σ_{i=0}^∞ ψ_i z^i,  with Σ_{i=0}^∞ |ψ_i| < ∞.
In particular:
1/(1 − ρ z) = Σ_{i=0}^∞ ρ^i z^i,  for |ρ| < 1 and |z| ≤ 1.
Polynomial of the back-shift operator
Consider now P(B), where B X_t = X_{t−1}. Can we write
P̃(B) = Σ_{i=0}^∞ ψ̃_i B^i,  with  P̃(B) ∘ P(B) = P(B) ∘ P̃(B) = Id ?
• If
A(B) = Σ_{i∈Z} a_i B^i  and  C(B) = Σ_{i∈Z} c_i B^i,
with Σ_i |a_i| < ∞ and Σ_i |c_i| < ∞, then
A(B) ∘ C(B)(X_t) = Σ_{k=−∞}^{+∞} d_k X_{t−k} = C(B) ∘ A(B)(X_t),
with
d_k = Σ_{i=−∞}^{+∞} a_i c_{k−i} = Σ_{j=−∞}^{+∞} a_{k−j} c_j = (a ⋆ c)_k,  and  Σ_j |d_j| < ∞.
• If P has no roots in the unit circle, then P̃(B) ∘ P(B) = P(B) ∘ P̃(B) = Id. Consequently P̃(B) = Σ_{i=0}^∞ ψ_i B^i exists and is the inverse of P(B).
Causality
Causal linear process
A linear process X is said to be causal (a causal function of W_t) when there is
◮ a power series Ψ: Ψ(B) = ψ_0 + ψ_1 B + ψ_2 B² + · · · ,
◮ with Σ_{j=0}^∞ |ψ_j| < ∞,
◮ and X_t = Ψ(B) ω_t, where ω is a white noise WN(0, σ²).
In this case X_t is σ{ω_t, ω_{t−1}, . . .}-measurable.
• We will exclude non-causal AR models from consideration. In fact this is not a restriction because we can find a causal counterpart to such a process.
• Causality is a property of (X_t)_t and (W_t)_t.
AR(1) and causality
Consider the AR(1) process defined by Φ(B) X_t = (1 − φ B) X_t = W_t. This process is causal and stationary
• iff |φ| < 1
• iff the root z_1 of the polynomial Φ(z) = 1 − φ z satisfies |z_1| > 1.
• If |φ| > 1 we can define an equivalent causal model X_t − φ^{−1} X_{t−1} = W̃_t, where W̃_t is a new white noise sequence.
• If |φ| = 1 the AR(1) process is not stationary.
• If X_t is an MA(1), it is always causal.
Exercise: Consider the non-causal AR(1) model X_t = φ X_{t−1} + ω_t with |φ| > 1 and suppose that ω ∼ i.i.d. N(0, σ²).
1. Which distribution has X_t?
2. Define the time series Y_t = φ^{−1} Y_{t−1} + η_t with η ∼ i.i.d. N(0, σ²/φ²). Prove that X_t and Y_t have the same distribution.
Autoregressive model AR(p)
An autoregressive model of order p is of the form
X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + · · · + φ_p X_{t−p} + ω_t  ∀ t ∈ Z,
where X is assumed to be stationary and ω is a white noise WN(0, σ²). We will write more concisely
Φ(B) X_t = ω_t  ∀ t ∈ Z,
where Φ is the polynomial of degree p, Φ(z) = 1 − φ_1 z − φ_2 z² − · · · − φ_p z^p.
Without loss of generality, we assume that each X_t is centered.
Condition of existence and causality of AR(p)
A stationary solution to Φ(B) X_t = ω_t ∀ t ∈ Z exists if and only if
Φ(z) = 0 ⟹ |z| ≠ 1.
In this case, this defines an AR(p) process, which is causal iff in addition
Φ(z) = 0 ⟹ |z| > 1.
Fig.: Simulated AR(2) with φ_1 = 1.5, φ_2 = −0.75 (150 observations); causal region of an AR(2) in the (φ_1, φ_2) plane, split into real-root and complex-root subregions.
Recall: Causality
Causal linear process
A linear process X is said to be causal (a causal function of W_t) when there is
◮ a power series Ψ: Ψ(B) = ψ_0 + ψ_1 B + ψ_2 B² + · · · ,
◮ with Σ_{j=0}^∞ |ψ_j| < ∞,
◮ and X_t = Ψ(B) ω_t, where ω is a white noise WN(0, σ²).
In this case X_t is σ{ω_t, ω_{t−1}, . . .}-measurable.
• Causality is a property of (X_t)_t and (W_t)_t.
• How do we calculate Ψ for an AR(p)?
AR(p) and causality
Consider an AR(p) process Φ(B) X_t = W_t ⇔ X_t = Ψ(B) W_t, where W is a white noise WN(0, σ²). We get that
1 = Ψ(B) Φ(B)
⇔ 1 = (ψ_0 + ψ_1 B + · · ·)(1 − φ_1 B − · · · − φ_p B^p)
⇔ 1 = ψ_0,  0 = ψ_1 − φ_1 ψ_0,  0 = ψ_2 − φ_1 ψ_1 − φ_2 ψ_0,  · · ·
⇔ 1 = ψ_0,  0 = ψ_j (j < 0),  0 = Φ(B) ψ_j.
We can solve these linear difference equations in several ways:
◮ numerically
◮ by guessing the form of a solution and using an inductive proof
◮ by using the theory of linear difference equations.
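A sketch of the "numerically" option: the relations above give the recursion ψ_0 = 1 and ψ_j = Σ_{k=1}^{min(j,p)} φ_k ψ_{j−k} for j ≥ 1. The coefficients below are the AR(2) example plotted earlier (φ_1 = 1.5, φ_2 = −0.75), used only as an illustration:

```python
import numpy as np

def psi_weights(phi, n_weights):
    """MA(infinity) weights of a causal AR(p): psi_0 = 1, psi_j = sum_k phi_k psi_{j-k}."""
    p = len(phi)
    psi = np.zeros(n_weights)
    psi[0] = 1.0
    for j in range(1, n_weights):
        for k in range(1, min(j, p) + 1):
            psi[j] += phi[k - 1] * psi[j - k]
    return psi

# AR(2) with phi_1 = 1.5, phi_2 = -0.75: damped oscillating weights
print(np.round(psi_weights([1.5, -0.75], 10), 4))
```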
MA(1) and invertibility
Define X_t = W_t + θ W_{t−1} = (1 + θ B) W_t.
• If |θ| < 1 we can write
(1 + θ B)^{−1} X_t = W_t ⇔ (1 − θ B + θ² B² − θ³ B³ + · · ·) X_t = W_t ⇔ Σ_{j=0}^∞ (−θ)^j X_{t−j} = W_t.
That is, we can write W_t as a causal function of X_t. We say that this MA(1) is invertible.
• If |θ| > 1, the sum Σ_{j=0}^∞ (−θ)^j X_{t−j} diverges, but we can write W_{t−1} = −θ^{−1} W_t + θ^{−1} X_t. Just like the non-causal AR(1), we can show that
W_t = −Σ_{j=1}^∞ (−θ)^{−j} X_{t+j}.
Thus, W_t is written as a linear function of X_t, but it is non-causal. We say that this MA(1) is non-invertible.
Moving average model MA(q)
A moving average model of order q is of the form
X_t = ω_t + θ_1 ω_{t−1} + θ_2 ω_{t−2} + · · · + θ_q ω_{t−q}  ∀ t ∈ Z,
where ω is a white noise WN(0, σ²). We will write more concisely
X_t = Θ(B) ω_t  ∀ t ∈ Z,
where Θ is the polynomial of degree q, Θ(z) = 1 + θ_1 z + θ_2 z² + · · · + θ_q z^q.
Unlike the AR model, the MA model is stationary for any values of the θ's.
Fig.: Simulated MA(1) path with θ = +0.9 (100 observations).
Invertibility I
Invertibility
A linear process X is invertible when there is
◮ a power series Π: Π(x) = π_0 + π_1 x + π_2 x² + · · · ,
◮ with Σ_{j=0}^∞ |π_j| < ∞,
◮ and ω_t = Π(B) X_t, where ω is a white noise WN(0, σ²).
Invertibility of an MA(1) process
Consider the MA(1) process
X_t = ω_t + θ ω_{t−1} = (1 + θ B) ω_t  ∀ t ∈ Z,
where ω is a white noise WN(0, σ²). Show that
◮ if |θ| < 1, ω_t = Σ_{j=0}^∞ (−θ)^j X_{t−j}
◮ if |θ| > 1, ω_t = −Σ_{j=1}^∞ (−θ)^{−j} X_{t+j}.
In the first case, X is invertible.
Invertibility II
Exercise: Consider the non-invertible MA(1) model X_t = ω_t + θ ω_{t−1} with |θ| > 1 and suppose that ω ∼ i.i.d. N(0, σ²).
1. Which distribution has X_t?
2. Can we define an invertible time series Y, defined through a new Gaussian white noise η, such that X_t and Y_t have the same distribution (∀ t)?
Invertibility I
Invertibility of an MA(1) process
Consider the MA(1) process
X_t = ω_t + θ ω_{t−1} = (1 + θ B) ω_t  ∀ t ∈ Z,
where ω is a white noise WN(0, σ²). (X_t)_t is invertible
• iff |θ| < 1
• iff the root z_1 of the polynomial Θ(z) = 1 + θ z satisfies |z_1| > 1.
• If |θ| > 1 we can define an equivalent invertible model in terms of a new white noise sequence.
• Is an AR(1) invertible?
Autoregressive moving average model
Autoregressive moving average model ARMA(p, q)
An ARMA(p, q) process (X_t)_{t∈Z} is a stationary process that is defined through
Φ(B) X_t = Θ(B) ω_t,
where ω ∼ WN(0, σ²), Φ is a polynomial of order p, Θ is a polynomial of order q, and Φ and Θ have no common factors. This implies it is not a lower-order ARMA model.
• Usually we insist that φ_p ≠ 0, θ_q ≠ 0 and that the polynomials Φ and Θ have no common factors.
• ARMA models can accurately approximate many stationary processes: for any stationary process with autocovariance γ, and any k > 0, there is an ARMA process (X_t)_t for which γ_X(h) = γ(h), h = 0, 1, · · · , k.
Autoregressive moving average model
Examples ARMA(p, q)
• AR(p) = ARMA(p, 0): Θ(B) ≡ 1.
• MA(q) = ARMA(0, q): Φ(B) ≡ 1.
• WN = ARMA(0, 0): Θ(B) = Φ(B) ≡ 1.
Exercise: Consider the process X defined by X_t − 0.5 X_{t−1} = ω_t − 0.5 ω_{t−1}. Is it truly an ARMA(1,1) process?
Recall: Causality and invertibility
Causality
A linear process (X_t)_t is said to be causal (a causal function of W_t) when there is
◮ a power series Ψ: Ψ(B) = ψ_0 + ψ_1 B + ψ_2 B² + · · · ,
◮ with Σ_{j=0}^∞ |ψ_j| < ∞,
◮ and X_t = Ψ(B) ω_t, where ω is a white noise WN(0, σ²).
Invertibility
A linear process (X_t)_t is invertible when there is
◮ a power series Π: Π(x) = π_0 + π_1 x + π_2 x² + · · · ,
◮ with Σ_{j=0}^∞ |π_j| < ∞,
◮ and ω_t = Π(B) X_t, where ω is a white noise WN(0, σ²).
Stationarity, causality and invertibility
Theorem: Consider the equation Φ(B) X_t = Θ(B) ω_t, where Φ and Θ have no common factors.
◮ There exists a stationary solution iff Φ(z) = 0 ⟹ |z| ≠ 1.
◮ This ARMA(p, q) process is causal iff Φ(z) = 0 ⟹ |z| > 1.
◮ It is invertible iff the roots of Θ(z) are outside the unit circle.
Exercise: Discuss the stationarity, causality and invertibility of
(1 − 1.5 B) X_t = (1 + 0.2 B) ω_t.
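A sketch for checking these root conditions numerically with numpy (numpy.roots expects coefficients from the highest degree down, hence the reversal). The helper name, the convention Θ(z) = 1 + θ_1 z + · · ·, and the parameter values in the usage line are mine and purely illustrative, not the exercise's answer:

```python
import numpy as np

def arma_check(phi, theta):
    """Roots of Phi(z) = 1 - phi_1 z - ... and Theta(z) = 1 + theta_1 z + ...

    Causal iff all roots of Phi lie outside the unit circle;
    invertible iff all roots of Theta lie outside the unit circle.
    """
    phi_poly = np.r_[1.0, -np.asarray(phi, dtype=float)]    # increasing degree
    theta_poly = np.r_[1.0, np.asarray(theta, dtype=float)]  # increasing degree
    ar_roots = np.roots(phi_poly[::-1]) if len(phi) else np.array([])
    ma_roots = np.roots(theta_poly[::-1]) if len(theta) else np.array([])
    return {
        "causal": bool(np.all(np.abs(ar_roots) > 1)),
        "invertible": bool(np.all(np.abs(ma_roots) > 1)),
        "AR roots": ar_roots,
        "MA roots": ma_roots,
    }

# e.g. an ARMA(1,1) with phi = 0.5 and theta = 0.4 (illustrative values)
print(arma_check([0.5], [0.4]))
```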
We can now consider only causal and invertible ARMA processes
Theorem: Let X be an ARMA process defined by Φ(B) X_t = Θ(B) ω_t. If
Θ(z) ≠ 0  ∀ |z| = 1,
then there are polynomials Φ̃ and Θ̃ and a white noise sequence ω̃ such that X satisfies
◮ Φ̃(B) X_t = Θ̃(B) ω̃_t,
◮ and is a causal,
◮ invertible ARMA process.
The linear process representation of an ARMA
Causal and invertible representations
Consider a causal, invertible ARMA process defined by Φ(B) X_t = Θ(B) ω_t. It can be rewritten
◮ as an MA(∞):
X_t = (Θ(B)/Φ(B)) ω_t = ψ(B) ω_t = Σ_{k≥0} ψ_k ω_{t−k}
◮ or as an AR(∞):
ω_t = (Φ(B)/Θ(B)) X_t = π(B) X_t = Σ_{k≥0} π_k X_{t−k}.
Notice that both π_0 and ψ_0 equal 1 and that (ψ_k) and (π_k) are entirely determined by (φ_k) and (θ_k).
Autocovariance function of an ARMA
Autocovariance of an ARMA
The autocovariance function of an ARMA(p, q) follows from its MA(∞) representation and equals
γ_X(h) = σ² Σ_{k≥0} ψ_k ψ_{k+h}  ∀ h ≥ 0.
Exercise:
◮ Compute the ACF of a causal ARMA(1,1).
◮ Show that the ACF of this ARMA verifies a linear difference equation of order 1. Solve this equation.
◮ Compute φ and θ from the ACF.
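A numerical sketch of γ_X(h) = σ² Σ_k ψ_k ψ_{k+h}, computing the ψ_k from the relation Ψ(z)Φ(z) = Θ(z) of the previous slide and truncating the infinite sum; the truncation level and the ARMA(1,1) parameter values are my choices:

```python
import numpy as np

def arma_psi(phi, theta, n_weights):
    """MA(infinity) weights from Psi(z)Phi(z) = Theta(z):
    psi_j = theta_j + sum_{k=1}^{min(j,p)} phi_k psi_{j-k}, with theta_0 = 1."""
    p, q = len(phi), len(theta)
    psi = np.zeros(n_weights)
    for j in range(n_weights):
        psi[j] = 1.0 if j == 0 else (theta[j - 1] if j <= q else 0.0)
        for k in range(1, min(j, p) + 1):
            psi[j] += phi[k - 1] * psi[j - k]
    return psi

def arma_acvf(phi, theta, sigma2=1.0, max_lag=10, trunc=500):
    """gamma(h) = sigma^2 * sum_k psi_k psi_{k+h}, with the MA(infinity) sum truncated."""
    psi = arma_psi(phi, theta, trunc + max_lag)
    return np.array([sigma2 * np.sum(psi[:trunc] * psi[h:trunc + h])
                     for h in range(max_lag + 1)])

gamma = arma_acvf(phi=[0.5], theta=[0.4])   # illustrative causal ARMA(1,1) parameters
print(np.round(gamma / gamma[0], 4))        # the ACF rho(h); geometric decay for h >= 1
```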
Definition of the spectral measure
Let (X_t)_{t∈Z} be a weakly stationary process (centered in order to simplify notations). Its autocovariance function satisfies ∀(s, t), γ_X(s − t) = γ_X(t − s), and for all k ∈ N*, (t_1, · · · , t_k) ∈ Z^k and (c_1, · · · , c_k) ∈ C^k,
E[| Σ_{i=1}^k c_i X_{t_i} |²] = Σ_{1≤i,j≤k} c_i c̄_j γ_X(t_i − t_j) ≥ 0.
Theorem: The autocovariance of (X_t)_{t∈Z} satisfies that for all k ∈ Z,
γ_X(k) = ∫_{−π}^{π} e^{ikx} dμ(x) = ∫ cos(kx) dμ(x),
where μ is a non-negative measure, symmetric and bounded on [−π, π]. The measure μ is unique, with total mass γ_X(0) = Var(X_0). This measure is called the spectral measure of (X_t)_{t∈Z}.
Conversely, if μ is a non-negative measure, symmetric and bounded on [−π, π], then
γ(k) = ∫_{−π}^{π} e^{ikx} dμ(x) = ∫ cos(kx) dμ(x)
is the autocovariance function of a weakly stationary process (X_t)_{t∈Z}.
Note that two weakly stationary processes having the same autocovariance function have the same spectral measure.
Definition of the spectral density
Let (X_t)_{t∈Z} be a weakly stationary process (centered in order to simplify notations) with autocovariance function γ_X(k) and spectral measure μ. If μ admits a density f with respect to the Lebesgue measure on [−π, π], then f is called the spectral density of the process (X_t)_{t∈Z}. It satisfies
γ_X(k) = ∫_{−π}^{π} e^{ikx} f(x) dx,
and γ_X(k) is the k-th Fourier coefficient of f.
Two sufficient conditions for the existence of f:
C1) If Σ_{k=0}^∞ |γ_X(k)| < ∞, then the spectral density exists and equals
f(x) = (1/2π) Σ_{k∈Z} γ(k) e^{−ixk}.
Moreover this density is continuous and bounded.
C2) f exists and belongs to L²([−π, π]) if and only if Σ_{k=0}^∞ γ_X(k)² < ∞. In that case, (1/2π) Σ_{k∈Z} γ(k) e^{−ixk} converges in L²([−π, π]) to f.
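A small sketch of condition C1 applied to an MA(1), whose autocovariances are γ(0) = σ²(1 + θ²), γ(±1) = σ²θ and 0 otherwise; the parameter value, the evaluation grid and the helper name are mine:

```python
import numpy as np

def spectral_density(gamma, x):
    """f(x) = (1/2pi) sum_k gamma(k) e^{-ixk} for an absolutely summable autocovariance.

    gamma[h] is the autocovariance at lag h >= 0; the symmetric negative lags are added.
    """
    k = np.arange(1, len(gamma))
    return (gamma[0] + 2 * np.cos(np.outer(x, k)) @ gamma[1:]) / (2 * np.pi)

# MA(1) with theta = 0.6, sigma^2 = 1: gamma(0) = 1 + theta^2, gamma(1) = theta, 0 beyond
theta = 0.6
gamma = np.array([1 + theta**2, theta])
x = np.linspace(-np.pi, np.pi, 5)
print(spectral_density(gamma, x))
# Closed form for comparison: f(x) = (1 + theta^2 + 2 theta cos x) / (2 pi)
print((1 + theta**2 + 2 * theta * np.cos(x)) / (2 * np.pi))
```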
Linear process
Let (Y_t)_{t∈Z} be a weakly stationary process. A linear process (X_t)_{t∈Z} in L² is a sequence of random variables such that
X_t = Σ_{i∈Z} a_i Y_{t−i},
as soon as the sum is well defined in L².
Theorem:
1. If (Y_t)_{t∈Z} is centered and has a bounded spectral density, and if Σ_{j∈Z} a_j² < ∞, then (X_t)_{t∈Z} is well defined, centered and weakly stationary. Moreover (X_t)_{t∈Z} admits a spectral density which satisfies
f_X = g · f_Y,  where  g(x) = | Σ_{k∈Z} a_k e^{ikx} |² ∈ L¹([−π, π]).
2. If Σ_{j∈Z} |a_j| < ∞, then the process (X_t)_{t∈Z} has the spectral measure μ_X which has density g with respect to μ_Y. We will then write dμ_X = g dμ_Y.
Regular representation of ARMA processes (1)
Theorem: Let (Φ_1, · · · , Φ_p) ∈ R^p and (θ_1, · · · , θ_q) ∈ R^q and consider (X_t)_{t∈Z} satisfying
(9)  Φ(B)(X_t) = Θ(B)(ε_t),
with Φ(z) = 1 − Φ_1 z − Φ_2 z² − · · · − Φ_p z^p and Θ(z) = 1 − θ_1 z − θ_2 z² − · · · − θ_q z^q.
1. If Φ(z) ≠ 0 for all |z| ≤ 1, then there exists a unique stationary solution to (9). (X_t)_{t∈Z} is a causal process with ε_t ⊥ span(X_i, i ≤ t − 1).
2. Moreover, if Θ(z) ≠ 0 for all |z| ≤ 1, then ε_t ∈ span(X_i, i ≤ t) and (X_t)_{t∈Z} is a causal and invertible process.
In those cases, (ε_t)_{t∈Z} are the innovations of the process (X_t)_{t∈Z}.
Regular representation of ARMA processes (2)
Theorem: Let (Φ_1, · · · , Φ_p) ∈ R^p and (θ_1, · · · , θ_q) ∈ R^q and consider (X_t)_{t∈Z} satisfying
(10)  Φ(B)(X_t) = Θ(B)(ε_t),
with Φ(z) = 1 − Φ_1 z − Φ_2 z² − · · · − Φ_p z^p and Θ(z) = 1 − θ_1 z − θ_2 z² − · · · − θ_q z^q.
If Φ(z)Θ(z) ≠ 0 for |z| = 1, then there exist two polynomials Φ̄ and Θ̄, of degrees p and q and having no roots in the closed unit disk, such that
Φ̄(B)(X_t) = Θ̄(B)(ε*_t),
where ε*_t is a weak white noise defined as the innovations of (X_t)_{t∈Z}. Moreover
Φ̄(z) = Φ(z) Π_{r<j≤p} (1 − a_j z)/(1 − z/a_j)  and  Θ̄(z) = Θ(z) Π_{s<j≤q} (1 − b_j z)/(1 − z/b_j),
where a_{r+1}, · · · , a_p and b_{s+1}, · · · , b_q are the roots of Φ and Θ lying inside the unit circle and a_1, · · · , a_r and b_1, · · · , b_s are the roots of Φ and Θ lying outside the unit circle.
If r = p and s = q, then (X_t)_{t∈Z} is causal and invertible and ε*_t = ε_t.
Chapter 4 : Linear prediction and partial autocorrelation function
Recall: linear predictor
Linear predictor and ACF
Let X be a stationary time series with ACF ρ. The linear predictor X̂_{n+h}^{(n)} of X_{n+h} given X_n is defined as
X̂_{n+h}^{(n)} = argmin_{a,b} E[(X_{n+h} − (a X_n + b))²] = ρ(h)(X_n − μ) + μ.
Linear prediction
Given X_1, · · · , X_n, the best linear predictor of X_{n+m} is
X^n_{n+m} = α_0 + Σ_{i=1}^n α_i X_i,
which satisfies the prediction equations
E(X_{n+m} − X^n_{n+m}) = 0  and  E[(X_{n+m} − X^n_{n+m}) X_i] = 0, for i = 1, · · · , n.
• Orthogonal projection on the linear span generated by the past 1, X_1, · · · , X_n.
• The prediction errors (X_{n+m} − X^n_{n+m}) are uncorrelated with the prediction variables (1, X_1, · · · , X_n).
Introduction
We aim at building predictions and prediction intervals. Consider here that (X_t)_{t∈Z} is an ARMA(p,q) process, which is causal, invertible and stationary, and that the coefficients Φ_1, · · · , Φ_p ∈ R and θ_1, · · · , θ_q ∈ R are known.
Recall
◮ The linear space L² of r.v. with finite variance, equipped with the inner product ⟨X, Y⟩ = E(XY), is a Hilbert space.
◮ Now consider a time series X with X_t ∈ L² for all t:
  ◮ the subspace H_n = span(X_1, . . . , X_n) is a closed subspace of L², hence
  ◮ for all Y ∈ L² there exists a unique projection onto H_n, denoted by Π_{H_n}(Y), which satisfies
    ∀ w ∈ H_n,  ‖Π_{H_n}(Y) − Y‖ ≤ ‖w − Y‖  and  ⟨Π_{H_n}(Y) − Y, w⟩ = 0.
Wold decomposition (1)
Let (X_t)_{t∈Z} be a weakly stationary and centered process.
Notation:
◮ M_n = span(X_i, i ≤ n), the Hilbert subspace defined as the closed subspace of L² consisting of finite linear combinations of (X_i)_{i≤n}.
◮ M_∞ = span(X_i, i ∈ Z)
◮ M_{−∞} = ∩_{n=−∞}^{+∞} M_n
◮ Π_{M_n}, the orthogonal projection on M_n.
The best linear prediction of X_{n+1} given (X_i, i ≤ n) is Π_{M_n}(X_{n+1}). And the prediction error is
σ² = ‖X_{n+1} − Π_{M_n}(X_{n+1})‖² = E[X_{n+1} − Π_{M_n}(X_{n+1})]².
We can easily check that this error does not depend on n, thanks to the weak stationarity of the process (X_t)_{t∈Z}.
If σ² > 0 the process is said to be regular; if σ² = 0 the process is said to be deterministic. For (X_t)_{t∈Z} regular we set ε_t = X_t − Π_{M_{t−1}}(X_t).
Wold decomposition (2)
Check that
◮ E(Π_{M_{t−1}}(X_t)) = 0
◮ E(ε_t) = 0, E(ε_t²) = σ²
◮ ε_t ∈ M_t and ε_t ⊥ M_{t−1}, i.e. ∀ U ∈ M_{t−1}, E(ε_t U) = 0
◮ E(ε_i ε_j) = 0 if i ≠ j.
Definition: The process (ε_t)_{t∈Z} is a weak white noise which is called the innovation of (X_t)_{t∈Z}.
Wold decomposition (3)
Theorem: Let (X_t)_{t∈Z} be a weakly stationary and centered process. Then we can write
(11)  X_t = Σ_{j=0}^∞ a_j ε_{t−j} + Y_t,  t ∈ Z,
1. with a_0 = 1 and Σ_{j≥0} a_j² < ∞,
2. (ε_t)_{t∈Z} is a weak white noise such that E(ε_t) = 0, E(ε_t²) = σ², ε_t ∈ M_t and ε_t ⊥ M_{t−1},
3. Y_t ∈ M_{−∞}, ∀ t ∈ Z,
4. Y_t is deterministic.
Moreover, the sequences (a_j), (ε_j) and (Y_j) are uniquely determined by (11) and the conditions (1)-(4).
The process Z_t = X_t − Y_t has the following Wold decomposition: Z_t = Σ_{j=0}^∞ a_j ε_{t−j}. Hence ε_t = X_t − Π_{M_{t−1}}(X_t) = Z_t − Π_{M_{t−1}}(Z_t). The process (Z_t)_{t∈Z} is purely non-deterministic.
Best linear predictor (1)
Let (X_t)_{t∈Z} be a weakly stationary and centered process. We already said that the best linear predictor of X_{n+1} given (X_i, i ≤ n) is the orthogonal projection denoted by Π_{M_n}(X_{n+1}). It is given by the Wold decomposition
Π_{M_n}(X_{n+1}) = Σ_{j=1}^∞ a_j ε_{n+1−j} + Y_{n+1},  with  ε_{n+1} = X_{n+1} − Π_{M_n}(X_{n+1}).
The prediction error is then σ² = E(ε_0²).
In practice, the important thing is to predict X_{n+1} given X_1, · · · , X_n.
Notations:
◮ H_n = span(X_i, 1 ≤ i ≤ n), the Hilbert subspace consisting of finite linear combinations of (X_i)_{1≤i≤n}.
◮ Π_{H_n}, the orthogonal projection on H_n.
◮ Π_{H_n}(X_{n+1}) = Σ_{i=1}^n Φ_{n,i} X_{n+1−i}.
Best linear predictor (2)
It remains to find Π_{H_n}(X_{n+1}), that is, to find the Φ_{n,i}. According to Hilbert space properties, for all 1 ≤ i ≤ n,
E[(X_{n+1} − Π_{H_n}(X_{n+1})) X_{n+1−i}] = 0,
that is, for all 1 ≤ i ≤ n,
E[X_{n+1} X_{n+1−i}] = E[Π_{H_n}(X_{n+1}) X_{n+1−i}] = Σ_{j=1}^n Φ_{n,j} E[X_{n+1−j} X_{n+1−i}].
This is a linear system that can be written as
Γ_n Φ_n = γ_n,  with  v_n = E[(X_{n+1} − Π_{H_n}(X_{n+1}))²],
where
Φ_n = (Φ_{n,1}, · · · , Φ_{n,n})ᵀ,  γ_n = (γ(1), · · · , γ(n))ᵀ,  and  Γ_n is the matrix with entries (Γ_n)_{i,j} = γ(i − j), i.e. with i-th row (γ(i − 1), · · · , γ(0), · · · , γ(n − i)).
Best linear predictor (3)
The best linear predictor of X_{n+1} given X_1, · · · , X_n is
Π_{H_n}(X_{n+1}) = φ_{n,1} X_n + φ_{n,2} X_{n−1} + · · · + φ_{n,n} X_1,
with φ_n = (φ_{n,1}, φ_{n,2}, · · · , φ_{n,n})ᵀ and γ_n = (γ(1), γ(2), · · · , γ(n))ᵀ satisfying the linear system
Γ_n φ_n = γ_n,
where Γ_n is the Toeplitz matrix with first row (γ(0), γ(1), · · · , γ(n−1)) and last row (γ(n−1), γ(n−2), · · · , γ(0)).
Mean squared error for one-step-ahead linear prediction:
v_{n+1} = E(X_{n+1} − Π_{H_n}(X_{n+1}))² = γ(0) − γ_nᵀ Γ_n^{−1} γ_n.
Variance is reduced!
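A sketch of the one-step-ahead predictor obtained by solving Γ_n φ_n = γ_n directly with numpy. The example uses an AR(1) with parameter φ, whose autocovariance γ(h) = σ² φ^{|h|}/(1 − φ²) is supplied as a known input; the parameter values and function names are mine, and a Durbin–Levinson recursion would be the more efficient route, but the direct solve mirrors the slide:

```python
import numpy as np

def best_linear_predictor(gamma, n):
    """Coefficients phi_n and MSE v_{n+1} from Gamma_n phi_n = gamma_n.

    gamma is a function giving the autocovariance gamma(h) for h = 0, ..., n.
    """
    g = np.array([gamma(h) for h in range(n + 1)])
    Gamma_n = g[np.abs(np.subtract.outer(np.arange(n), np.arange(n)))]  # Toeplitz matrix
    gamma_n = g[1:n + 1]
    phi_n = np.linalg.solve(Gamma_n, gamma_n)
    v = g[0] - gamma_n @ phi_n          # = gamma(0) - gamma_n^T Gamma_n^{-1} gamma_n
    return phi_n, v

# AR(1) with phi = 0.7, sigma^2 = 1: gamma(h) = phi^|h| / (1 - phi^2)
phi = 0.7
phi_n, v = best_linear_predictor(lambda h: phi**h / (1 - phi**2), n=5)
print(np.round(phi_n, 4))   # approx (0.7, 0, 0, 0, 0): only the most recent value matters
print(round(v, 4))          # approx 1.0 = sigma^2, the innovation variance
```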