Thresholds for methods of automatic extraction of time series trend and periodic components with the help of the “Caterpillar”-SSA approach
Th. Alexandrov, N. Golyandina
theo@pdmi.ras.ru, nina@ng1174.spb.edu
St. Petersburg State University
– p. 1/12
Signal approximation
$F_N = (f_0, \ldots, f_{N-1})$: $f_n = s_n + \varepsilon_n$, where $S_N = (s_0, \ldots, s_{N-1})$ is a deterministic signal and $(\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{N-1})$ is the residual (noise).
The signal is approximated in the mean-square sense.
We want to approximate signals that are:
- non-stationary,
- given without information about their parametric model,
- and, moreover, given without knowledge of their structure.
– p. 2/12
“Caterpillar”-SSA approach
The method handles the following tasks:
- finding trends of different resolution,
- smoothing,
- seasonality extraction,
- extraction of periodicities with changing amplitudes,
- forecasting,
- change-point detection.
History:
- USA, UK: SSA (Singular Spectrum Analysis),
- Russia: “Caterpillar”-SSA.
Advantages:
- does not require knowledge of a parametric model of the time series,
- handles a wide range of real-life time series,
- is suitable for non-stationary time series,
- works with natural components such as modulated harmonics.
– p. 3/12
“Caterpillar”-SSA: base algorithm
- Decomposition into a sum of components: $F_N = F_N^{(1)} + \ldots + F_N^{(m)}$.
- Gives information about each component.
Algorithm:
1. Trajectory matrix construction: $F_N \to X \in \mathbb{R}^{L \times K}$, $K = N - L + 1$ ($L$ is the window length, a parameter):
$$X = \begin{pmatrix} f_0 & f_1 & \ldots & f_{N-L} \\ f_1 & f_2 & \ldots & f_{N-L+1} \\ \vdots & \vdots & \ddots & \vdots \\ f_{L-1} & f_L & \ldots & f_{N-1} \end{pmatrix}$$
2. Singular Value Decomposition (SVD): $X = \sum_j X_j$, $X_j = \sqrt{\lambda_j}\, U_j V_j^T$, where $\lambda_j$ are the eigenvalues of $S = XX^T$, $U_j$ are the eigenvectors of $S$, and $V_j = X^T U_j / \sqrt{\lambda_j}$ are the eigenvectors of $X^T X$.
3. Grouping of SVD components: $\{1, \ldots, d\} = \bigsqcup_k I_k$, $X^{(k)} = \sum_{j \in I_k} X_j$.
4. Reconstruction by diagonal averaging: $X^{(k)} \to \widetilde{F}_N^{(k)}$.
– p. 4/12
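The four steps of the base algorithm can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the function names `ssa_decompose`, `diagonal_average` and `ssa_reconstruct` are hypothetical, and the grouping (which indices go into each $I_k$) is left to the caller.

```python
import numpy as np

def ssa_decompose(f, L):
    """Steps 1-2: build the L x K trajectory matrix and compute its SVD."""
    f = np.asarray(f, dtype=float)
    K = f.size - L + 1
    # Trajectory matrix: X[i, j] = f[i + j].
    X = np.column_stack([f[j:j + L] for j in range(K)])
    # s holds the singular values sqrt(lambda_j) in decreasing order;
    # columns of U are U_j, rows of Vt are V_j^T.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U, s, Vt

def diagonal_average(Xk):
    """Step 4: average a matrix over its anti-diagonals i + j = n."""
    L, K = Xk.shape
    total = np.zeros(L + K - 1)
    count = np.zeros(L + K - 1)
    for i in range(L):
        total[i:i + K] += Xk[i]
        count[i:i + K] += 1
    return total / count

def ssa_reconstruct(U, s, Vt, group):
    """Steps 3-4: sum the elementary matrices X_j, j in `group`, then average."""
    idx = list(group)
    Xk = (U[:, idx] * s[idx]) @ Vt[idx, :]
    return diagonal_average(Xk)

# Usage: an exponential trend plus a harmonic with period 12 (illustrative data).
n = np.arange(60)
f = np.exp(0.02 * n) + np.cos(2 * np.pi * n / 12)
U, s, Vt = ssa_decompose(f, L=24)
full = ssa_reconstruct(U, s, Vt, range(len(s)))  # all components together
```

Reconstructing from all components at once returns the original series, which is a convenient sanity check before trying any particular grouping.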
Grouping
General case: $F_N = F_N^{(1)} + F_N^{(2)}$, $I_1$: $X^{(1)} \leftrightarrow \widetilde{F}_N^{(1)}$.
Grouping is possible if:
1. $F_N^{(1)}$ has a finite number of components,
2. $F_N^{(1)}$ is separable from the residual.
Approximation case: $F_N = F_N^{(1)} + F_N^{(2)}$ (signal + noise), $I_1$: $X^{(1)} \leftrightarrow \widetilde{F}_N^{(1)}$, where $\widetilde{F}_N^{(1)}$ is the approximation of the signal.
1. Every linear combination of products of exponentials, harmonics and polynomials has a finite number of components (this includes exponentially-modulated harmonics).
2. Examples of asymptotic separability:
- A deterministic signal is asymptotically separable from white noise.
- A periodicity is asymptotically separable from a trend.
– p. 5/12
Identification
Identification is the choice of components during grouping.
Exponential trend: $f_n = A e^{\alpha n}$.
- it generates one SVD component,
- eigenvector: $U = (u_1, \ldots, u_L)^T$: $u_k = C e^{\alpha k}$ (“exponential” form with the same $\alpha$).
Exponentially-modulated harmonic: $f_n = A e^{\alpha n} \cos(2\pi\omega n)$.
- it generates two SVD components,
- eigenvectors:
  $U_1 = (u_1^{(1)}, \ldots, u_L^{(1)})^T$: $u_k^{(1)} = C_1 e^{\alpha k} \cos(2\pi\omega k)$,
  $U_2 = (u_1^{(2)}, \ldots, u_L^{(2)})^T$: $u_k^{(2)} = C_2 e^{\alpha k} \sin(2\pi\omega k)$
  (“exponentially-modulated” form with the same $\alpha$ and $\omega$).
– p. 6/12
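The component counts stated on this slide can be checked numerically by counting the non-negligible singular values of the trajectory matrix. The sketch below is only an illustration: the helper name `trajectory_rank`, the relative tolerance `tol`, and the test series parameters are all assumptions chosen for the example.

```python
import numpy as np

def trajectory_rank(f, L, tol=1e-8):
    """Number of singular values of the trajectory matrix above tol * largest."""
    f = np.asarray(f, dtype=float)
    K = f.size - L + 1
    X = np.column_stack([f[j:j + L] for j in range(K)])
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

n = np.arange(100)
trend = 2.0 * np.exp(0.01 * n)                                       # A e^{alpha n}
em_harmonic = 2.0 * np.exp(0.01 * n) * np.cos(2 * np.pi * 0.1 * n)   # modulated harmonic

print(trajectory_rank(trend, 40))
print(trajectory_rank(em_harmonic, 40))
```

Under these assumptions the two calls are expected to report 1 and 2 components respectively, in line with the statements above.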
Trend: low frequencies method
Investigate every eigenvector $U_j$. Let us take $U = (u_1, \ldots, u_L)^T$.
LOW FREQUENCIES METHOD
- Fourier expansion:
$$u_n = c_0 + \sum_{1 \le k \le \frac{L-1}{2}} \bigl( c_k \cos(2\pi n k/L) + s_k \sin(2\pi n k/L) \bigr) + (-1)^n c_{L/2},$$
- Periodogram:
$$\Pi_U^L(k/L) = \frac{L}{4} \begin{cases} 2 c_0^2, & k = 0, \\ c_k^2 + s_k^2, & 1 \le k \le \frac{L-1}{2}, \\ 2 c_{L/2}^2, & L \text{ even and } k = L/2. \end{cases}$$
$\Pi_U^L(\omega)$, $\omega \in \{k/L\}$, reflects the contribution of the harmonic with frequency $\omega$ to the form of $U$.
- Parameter: $\omega_0$, the upper boundary of the “low frequencies” interval.
- Contribution of low frequencies:
$$C(U) = \frac{\sum_{0 \le k \le L\omega_0} \Pi_U^L(k/L)}{\sum_{0 \le k \le L/2} \Pi_U^L(k/L)}.$$
$C(U) \ge C_0 \Rightarrow$ the eigenvector $U$ corresponds to a trend ($C_0 \in (0, 1)$ is the threshold).
– p. 7/12
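A possible way to compute the low-frequency contribution $C(U)$ uses the real FFT. In this sketch the function names are hypothetical, the periodogram is scaled so that its values sum to $\|U\|^2$ (an assumption; any constant factor cancels in the ratio), and the values of `omega0` and `c0` in the usage lines are placeholders rather than the optimal thresholds, which are not given in these slides.

```python
import numpy as np

def periodogram(u):
    """Periodogram at frequencies k/L, 0 <= k <= L/2 (scaling is an assumption)."""
    u = np.asarray(u, dtype=float)
    L = u.size
    p = np.abs(np.fft.rfft(u)) ** 2
    # Frequencies strictly between 0 and 1/2 occur twice in the full spectrum.
    p[1:(L + 1) // 2] *= 2
    return p / L

def low_freq_contribution(u, omega0):
    """C(U): share of the periodogram carried by frequencies k/L <= omega0."""
    p = periodogram(u)
    k_max = int(np.floor(len(u) * omega0 + 1e-12))
    return p[:k_max + 1].sum() / p.sum()

def is_trend(u, omega0, c0):
    """Decision rule C(U) >= C_0."""
    return low_freq_contribution(u, omega0) >= c0

# Usage with placeholder parameters omega0 = 1/24 and c0 = 0.9.
k = np.arange(48)
slow = np.exp(0.01 * k)                 # slowly varying, trend-like vector
fast = np.cos(2 * np.pi * 0.25 * k)     # harmonic with frequency 0.25
print(is_trend(slow, 1 / 24, 0.9), is_trend(fast, 1 / 24, 0.9))
```

Because $C(U)$ is a ratio, the decision does not depend on how the periodogram is normalized, only on `omega0` and `c0`.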
LF method: optimal threshold values
This slide has not been translated and is omitted as obsolete.
– p. 8/12
Periodicity: Fourier method
Let us investigate the sequences of elements of the eigenvectors $U_j$, $U_{j+1}$ for all pairs of neighboring components.
FOURIER METHOD
- Stage 1. Check the “maximal” frequencies: $\theta_j = k_j / M$, where $k_j = \arg\max_k \Pi_{U_j}^M(k/M)$.
  $M\,|\theta_j - \theta_{j+1}| \le s_0 \Rightarrow$ the pair $(j, j+1)$ is a “harmonic” pair.
- Stage 2. Check the form of the periodogram:
$$\rho_{(j,j+1)} = \frac{1}{2} \max_k \bigl( \Pi_{U_j}^M(k/M) + \Pi_{U_{j+1}}^M(k/M) \bigr);$$
  for a harmonic pair $\rho_{(j,j+1)} = 1$.
  $\rho_{(j,j+1)} \ge \rho_0 \Rightarrow$ the pair $(j, j+1)$ corresponds to a harmonic ($\rho_0 \in (0, 1)$ is the threshold).
– p. 9/12
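The two stages can be sketched as follows. Several points here are assumptions made for the example: $M$ is taken to be the length of the eigenvectors, each periodogram is normalized to sum to 1 (so that a pure harmonic pair gives $\rho = 1$), and the function names and the threshold values `s0` and `rho0` are hypothetical.

```python
import numpy as np

def normalized_periodogram(u):
    """Periodogram at frequencies k/M, 0 <= k <= M/2, scaled to sum to 1."""
    u = np.asarray(u, dtype=float)
    M = u.size
    p = np.abs(np.fft.rfft(u)) ** 2
    # Frequencies strictly between 0 and 1/2 occur twice in the full spectrum.
    p[1:(M + 1) // 2] *= 2
    return p / p.sum()

def pair_rho(u1, u2):
    """Stage 2 statistic: half the maximum of the summed periodograms."""
    return 0.5 * np.max(normalized_periodogram(u1) + normalized_periodogram(u2))

def is_harmonic_pair(u1, u2, s0, rho0):
    """Apply Stage 1 (peak positions) and then Stage 2 (periodogram form)."""
    k1 = int(np.argmax(normalized_periodogram(u1)))
    k2 = int(np.argmax(normalized_periodogram(u2)))
    # With theta = k / M, the condition M * |theta_1 - theta_2| <= s0
    # becomes |k1 - k2| <= s0.
    if abs(k1 - k2) > s0:
        return False
    return pair_rho(u1, u2) >= rho0

# Usage with placeholder thresholds s0 = 1 and rho0 = 0.8.
k = np.arange(48)
cos12 = np.cos(2 * np.pi * k / 12)
sin12 = np.sin(2 * np.pi * k / 12)
cos4 = np.cos(2 * np.pi * k / 4)
print(is_harmonic_pair(cos12, sin12, 1, 0.8))  # same frequency, clean peak
print(is_harmonic_pair(cos12, cos4, 1, 0.8))   # peaks far apart
```

Stage 2 matters when both vectors share a dominant peak but also carry substantial power elsewhere: the peaks then agree, yet $\rho$ falls well below 1.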
Fourier method: optimal threshold values
This slide has not been translated and is omitted as obsolete.
– p. 10/12
Real-life situation
This slide has not been translated and is omitted as obsolete.
– p. 11/12
Conclusion
Monthly data: traffic fatalities, 1960-1974, Ontario.
Trend component numbers: 1, 4, 5.
Seasonality component numbers: 2, 3, 6-8, 11-14.
– p. 12/12