

  1. Spectral Analysis of Stationary Stochastic Process. Hanxiao Liu (hanxiaol@cs.cmu.edu), February 20, 2016.

  2. Outline
     - Stationarity
     - The time-frequency dual
     - Spectral representation
     - Marginal/conditional dependencies
     - Inference

  3. Stationary Stochastic Process
     Strong stationarity: for all t_1, ..., t_k and h,
         (X(t_1), ..., X(t_k)) =_D (X(t_1 + h), ..., X(t_k + h))        (1)
     Weak / 2nd-order stationarity:
         E[X(t) X(t)^⊤] < ∞                for all t                    (2)
         E[X(t)] = µ                       for all t                    (3)
         Cov(X(t), X(t + h)) = Γ(h)        for all t, h                 (4)
     The r.h.s. of (4) does not depend on t.
     Γ(h): autocovariance function (marginal dependencies); Γ(0): variance (power) of X.
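To make the weak-stationarity conditions concrete, here is a minimal numpy sketch (not from the slides; the AR(1) coefficient, sample sizes, and seed are arbitrary) contrasting an AR(1) process, whose variance settles near 1/(1 - φ²) regardless of t, with a random walk, whose variance grows with t and therefore violates (2)-(4).

```python
import numpy as np

rng = np.random.default_rng(0)
reps, T, phi = 10000, 200, 0.6

eps = rng.standard_normal((reps, T))
ar1 = np.zeros((reps, T))
walk = np.zeros((reps, T))
for t in range(1, T):
    ar1[:, t] = phi * ar1[:, t - 1] + eps[:, t]   # AR(1): weakly stationary once transients die out
    walk[:, t] = walk[:, t - 1] + eps[:, t]       # random walk: variance grows linearly in t

print(ar1[:, 100].var(), ar1[:, 199].var())       # both near 1/(1 - phi^2) ≈ 1.56
print(walk[:, 100].var(), walk[:, 199].var())     # ≈ 100 vs ≈ 199, so (2)-(4) fail
```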

  4. Spectral Representation Theorem
         X(t) = ∫_{-π}^{π} e^{iωt} dZ(ω)        (5)
     - E[dZ(ω) dZ*(ω')] = 0 if ω ≠ ω'.
     - * denotes the Hermitian (conjugate) transpose.
     Compared to X(t) itself, we are more interested in Γ(h).
     [Illustrative animations A and B shown here.]
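In place of the animations, here is a toy discretization of (5) (my own illustration, not from the slides; the frequencies, powers, and ensemble size are arbitrary): X(t) is a superposition of a few sinusoids with uncorrelated random complex amplitudes, and the lag-h covariance depends only on the power carried at each frequency, not on t.

```python
import numpy as np

rng = np.random.default_rng(3)
freqs = np.array([0.3, 1.1, 2.4])             # frequencies in (0, pi), chosen arbitrarily
power = np.array([4.0, 1.0, 0.25])            # E|dZ(omega_k)|^2 at each frequency

def sample_path(T=64):
    # Discrete stand-in for (5): a few sinusoids with uncorrelated random complex amplitudes
    Z = (rng.standard_normal(3) + 1j * rng.standard_normal(3)) * np.sqrt(power / 2)
    t = np.arange(T)
    return np.real(np.exp(1j * np.outer(t, freqs)) @ Z)

# Ensemble check that Cov(X(t), X(t+h)) depends only on the lag h, not on t
paths = np.stack([sample_path() for _ in range(20000)])
cov = paths.T @ paths / len(paths)            # entry (t, s) estimates E[X(t) X(s)]
print(cov[0, 5], cov[10, 15], cov[37, 42])    # three lag-5 covariances: all ≈ Gamma(5)
```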

  5. Spectral Representation Theorem
         Γ(h) = E[X(0) X(h)^⊤]                                         (6)
              = E[ ∫_ω e^{iω·0} dZ(ω) ∫_{ω'} e^{iω'h} dZ*(ω') ]        (7)
              = ∫_ω ∫_{ω'} e^{iω'h} E[dZ(ω) dZ*(ω')]                   (8)
              = ∫_ω e^{iωh} E[dZ(ω) dZ*(ω)]                            (9)
              = ∫_ω e^{iωh} s(ω) dω                                    (10)
     Γ(h): covariance at lag h (time domain); s(ω): covariance at frequency ω (frequency domain).

  6. Spectral Density Function
     The Fourier transform pair:
         Γ(h) = ∫_ω e^{iωh} s(ω) dω                          (11)
         s(ω) = (1/2π) Σ_{h=-∞}^{∞} Γ(h) e^{-iωh}            (12)
     We call s the spectral density function, since
         Γ(0) = ∫_ω s(ω) dω                                  (13)
     i.e. Γ(0) = Cov(X(t), X(t)) is the cumulative effect of s(ω) over all frequencies.
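As a sanity check on the pair (11)-(13), a scalar AR(1) process with coefficient φ has both Γ(h) and s(ω) in closed form, and integrating s over [-π, π] recovers Γ(0). A small numpy sketch (φ and σ² are arbitrary toy values, not from the slides):

```python
import numpy as np

phi, sigma2 = 0.6, 1.0                        # AR(1) coefficient and noise variance (toy values)
omega = np.linspace(-np.pi, np.pi, 100001)

# Closed forms for AR(1): Gamma(h) = sigma2 * phi^|h| / (1 - phi^2), and,
# summing the series in (12), s(omega) = sigma2 / (2*pi*(1 - 2*phi*cos(omega) + phi^2))
s = sigma2 / (2 * np.pi * (1 - 2 * phi * np.cos(omega) + phi**2))
gamma0 = sigma2 / (1 - phi**2)

# Check (13): Gamma(0) equals the integral of s over [-pi, pi]
integral = np.sum(s) * (omega[1] - omega[0])
print(integral, gamma0)                       # both ≈ 1.5625
```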

  7. Marginal Dependencies
     Γ(h) ← sample autocovariance function:
         Γ̂(h) = (1/N) Σ_{t=0}^{N-h-1} (X(t) - X̄)(X(t+h) - X̄)^⊤        (14)
     Asymptotic normality holds under mild assumptions.
     s(ω) ← periodogram. Let ω_k = 2πk/N; then
         I(ω_k) = d(k) d(k)^*  →  ŝ(ω_k)                              (15)
     where d(k) := N^{-1/2} Σ_{t=0}^{N-1} X(t) e^{-iω_k t} is obtained via the DFT.
     - A bad estimator in general.
     - A good estimator with appropriate smoothing.
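A minimal numpy sketch of (15) and of the smoothing remark, using a simulated scalar AR(1) as the test signal so the estimate can be compared with the closed-form spectrum from the previous slide. The sample size, AR coefficient, smoothing window, and the 1/2π scaling convention are my own choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, phi = 8192, 0.6
x = np.zeros(N)
for t in range(1, N):                         # scalar AR(1) test signal
    x[t] = phi * x[t - 1] + rng.standard_normal()

# Raw periodogram at the Fourier frequencies omega_k = 2*pi*k/N, cf. (15)
d = np.fft.fft(x - x.mean()) / np.sqrt(N)
I = np.abs(d) ** 2 / (2 * np.pi)              # scaled to target s(omega) in the convention of (12)

# The raw periodogram does not converge pointwise; smooth over neighbouring frequencies
K = 51                                        # width of the (circular) moving-average window
s_hat = np.convolve(np.concatenate([I[-K:], I, I[:K]]), np.ones(K) / K, mode="same")[K:-K]

omega = 2 * np.pi * np.arange(N) / N
s_true = 1 / (2 * np.pi * (1 - 2 * phi * np.cos(omega) + phi**2))
print(np.abs(I - s_true).mean(), np.abs(s_hat - s_true).mean())   # smoothing reduces the error
```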

  8. Conditional Dependence
     For time series i and j,
         X_i ⊥ X_j | X_{V\{i,j}}                                                  (16)
         ⇔ Cov(X_i(t), X_j(t+h) | X_{V\{i,j}}) = 0     for all h                  (17)
         ⇔ (Γ(h)^{-1})_{ij} = 0                        for all h                  (18)
         ⇔ (s(ω)^{-1})_{ij} = 0                        for all ω ∈ [0, 2π]        (19)
     Inferring conditional dependencies:
     - = inferring Γ(h)^{-1}
     - = inferring s(ω)^{-1}
     Applicable to any stationary X.

  9. Autoregressive Gaussian Process
     The autoregressive (AR) process:
         X(t) = -Σ_{h=1}^{p} A_h X(t-h) + ε(t)        (20)
     where ε(t) is Gaussian white noise ~ N(0, Σ).
     We would like to parametrize s(ω)^{-1} with A:
     - Inferring conditional dependencies for AR can then be cast as an optimization problem w.r.t. A.
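A minimal numpy sketch of simulating a vector process from (20); the dimension, order, coefficients, and noise covariance are arbitrary toy choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
m, p, N = 3, 2, 2000                          # dimension, AR order, sample size (toy values)

A = [0.4 * np.eye(m), -0.2 * np.eye(m)]       # A_1, A_2: a stable toy choice
Sigma = np.diag([1.0, 0.5, 2.0])              # noise covariance
L = np.linalg.cholesky(Sigma)

X = np.zeros((N, m))
for t in range(p, N):
    eps = L @ rng.standard_normal(m)
    # X(t) = - sum_{h=1}^{p} A_h X(t-h) + eps(t), matching (20)
    X[t] = -sum(A[h - 1] @ X[t - h] for h in range(1, p + 1)) + eps
```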

  10. Filter Theorem
     For any stationary X and {a_t} with Σ_{t=-∞}^{∞} |a_t| < ∞, the process Y(t) = Σ_{h=-∞}^{∞} a_h X(t-h) is stationary with
         s_Y(ω) = |A(e^{iω})|² s_X(ω)                   (21)
     where A(z) = Σ_{h=-∞}^{∞} a_h z^{-h}.
     In the 1-d AR case, ε(t) = x(t) + Σ_{h=1}^{p} a_h x(t-h) gives
         s(ω)^{-1} = |A(e^{iω})|² / σ².
     Multi-dimensional analogue:
         s(ω)^{-1} = A(e^{iω})^* Σ^{-1} A(e^{iω})       (22)
     where A(z) = Σ_{h=0}^{p} A_h z^{-h}, A_0 := I.
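A quick numerical illustration of (21) (my own check, not from the slides; the filter coefficients and sample size are arbitrary): filter white noise with a short FIR filter and compare the band-averaged periodogram of Y with |A(e^{iω})|² times the flat spectrum of X.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 1 << 16
a = np.array([1.0, -0.5, 0.25])               # arbitrary filter coefficients a_0, a_1, a_2

x = rng.standard_normal(N)                    # white noise: s_X(omega) = 1 / (2*pi)
y = np.convolve(x, a, mode="full")[:N]        # Y(t) = sum_h a_h X(t-h)

omega = 2 * np.pi * np.arange(N) / N
A_w = np.exp(-1j * np.outer(omega, np.arange(len(a)))) @ a   # A(e^{i omega})
s_y_theory = np.abs(A_w) ** 2 / (2 * np.pi)                  # (21) with s_X = 1/(2*pi)

I_y = np.abs(np.fft.fft(y)) ** 2 / (2 * np.pi * N)           # periodogram of Y
print(I_y[10:1000].mean(), s_y_theory[10:1000].mean())       # roughly equal
```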

  11. Parametrized Spectral Density
     Parametrize s(ω)^{-1} by the AR parameters:
         s(ω)^{-1} = (Σ_{h=0}^{p} A_h e^{-ihω})^* Σ^{-1} (Σ_{h=0}^{p} A_h e^{-ihω})        (23)
                   = Y_0 + (1/2) Σ_{h=1}^{p} (e^{-ihω} Y_h + e^{ihω} Y_h^⊤)                (24)
     where Y_0 = Σ_{h=0}^{p} A_h^⊤ Σ^{-1} A_h and Y_h = 2 Σ_{i=0}^{p-h} A_i^⊤ Σ^{-1} A_{i+h}.
     With B_h := Σ^{-1/2} A_h this becomes Y_0 = Σ_{h=0}^{p} B_h^⊤ B_h and Y_h = 2 Σ_{i=0}^{p-h} B_i^⊤ B_{i+h}.
     (s(ω)^{-1})_{ij} = 0 ⇔ (Y_h)_{ij} = (Y_h)_{ji} = 0 for all h = 0, ..., p,
     i.e. linear constraints over Y ⇔ quadratic constraints over B.
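A small numpy sketch verifying that (23) and (24) agree, for random toy AR coefficients (the dimension, order, coefficients, and test frequency are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
m, p = 3, 2
A = [np.eye(m)] + [0.1 * rng.standard_normal((m, m)) for _ in range(p)]   # A_0 = I
Sigma_inv = np.linalg.inv(np.diag([1.0, 0.5, 2.0]))

# Y_0 and Y_h built from the A_h as below (24)
Y0 = sum(A[h].T @ Sigma_inv @ A[h] for h in range(p + 1))
Y = [2 * sum(A[i].T @ Sigma_inv @ A[i + h] for i in range(p - h + 1)) for h in range(1, p + 1)]

omega = 0.7                                   # any frequency
Aw = sum(A[h] * np.exp(-1j * h * omega) for h in range(p + 1))
lhs = Aw.conj().T @ Sigma_inv @ Aw            # (23)
rhs = Y0 + 0.5 * sum(np.exp(-1j * h * omega) * Y[h - 1]
                     + np.exp(1j * h * omega) * Y[h - 1].T for h in range(1, p + 1))
print(np.allclose(lhs, rhs))                  # True
```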

  12. Conditional MLE
     Simplification: condition on (fix) x(1), ..., x(p). Then
         ε(t) = Σ_{h=0}^{p} A_h x(t-h)                              (25)
              = [A_0, ..., A_p] x̃(t) =: A x̃(t) ~ N(0, Σ)            (26)
     where x̃(t) stacks x(t), x(t-1), ..., x(t-p). This yields a least-squares estimate. The likelihood is
         exp(-(1/2) Σ_{t=p+1}^{N} x̃(t)^⊤ A^⊤ Σ^{-1} A x̃(t)) / ((2π)^{m(N-p)/2} (det Σ)^{(N-p)/2})
         = exp(-(1/2) Σ_{t=p+1}^{N} x̃(t)^⊤ B^⊤ B x̃(t)) · (det B_0)^{N-p} / (2π)^{m(N-p)/2},   B := Σ^{-1/2} A        (27)
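A minimal numpy sketch of the conditional-ML / least-squares fit implied by (25)-(27): regress x(t) on its lagged values and take the residual covariance as the estimate of Σ. The function name and the sign convention matching (20) are my own choices; feeding in the series simulated after slide 9 recovers A_1, A_2 and Σ up to sampling error.

```python
import numpy as np

def fit_ar_ls(X, p):
    """Conditional ML / least-squares AR(p) fit: regress x(t) on x(t-1), ..., x(t-p)."""
    N, m = X.shape
    # Stack regressors [x(t-1), ..., x(t-p)] for t = p, ..., N-1
    Z = np.hstack([X[p - h:N - h] for h in range(1, p + 1)])    # shape (N-p, m*p)
    Y = X[p:]                                                   # shape (N-p, m)
    # x(t) ≈ -sum_h A_h x(t-h), so Y ≈ Z @ coef with coef blocks equal to -A_h^T
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    A = [-coef[(h - 1) * m:h * m].T for h in range(1, p + 1)]   # recover A_1, ..., A_p
    resid = Y - Z @ coef
    Sigma_hat = resid.T @ resid / (N - p)                       # ML estimate of Sigma
    return A, Sigma_hat
```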

  13. Regularized ML
     Maximizing the log-likelihood is equivalent to
         min_B  -2 log det B_0 + tr(C B^⊤ B)                          (28)
     whose solution is given by the Yule-Walker equations.
     Enforcing sparsity over s(ω)^{-1}:
         min_B  -2 log det B_0 + tr(C B^⊤ B) + γ ||D(B^⊤ B)||_1       (29)
     Convex relaxation:
         min_{Z ⪰ 0}  -log det Z_{00} + tr(C Z) + γ ||D(Z)||_1        (30)
     - Exact if rank(Z*) ≤ m.
     - Bregman divergence + ℓ_1 regularization; well studied.
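A hedged cvxpy sketch of how a problem with the shape of (30) can be set up. This is not the exact formulation from the slides: the sample matrix C is a placeholder, and D(·) is simplified to "all off-diagonal entries of Z" rather than the operator used by Songsiri and Vandenberghe (2010); the point is only the log-det + trace + ℓ1 structure over a PSD variable.

```python
import cvxpy as cp
import numpy as np

m, p = 3, 1
n = m * (p + 1)
C = np.eye(n)                                 # placeholder; in practice the sample covariance of the stacked lags
gamma = 0.1

Z = cp.Variable((n, n), PSD=True)
Z00 = Z[:m, :m]                               # top-left block, playing the role of Z_00 in (30)
off_diag = cp.multiply(1 - np.eye(n), Z)      # simplified stand-in for D(Z)

objective = cp.Minimize(-cp.log_det(Z00) + cp.trace(C @ Z) + gamma * cp.sum(cp.abs(off_diag)))
problem = cp.Problem(objective)
problem.solve()
print(problem.status, np.round(Z.value, 3))
```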

  14. Non-stationary Extensions
     With stationarity,
         s(ω) = (1/2π) Σ_{h=-∞}^{∞} Γ(h) e^{-iωh}                        (31)
     Without stationarity? The Wigner-Ville spectrum:
         s(t, ω) = (1/2π) Σ_{h=-∞}^{∞} Γ(t + h/2, t - h/2) e^{-iωh}       (32)
     Other types of power spectra:
     - Rihaczek spectrum
     - (Generalized) evolutionary spectrum

  15. Reference I
     Bach, F. R. and Jordan, M. I. (2004). Learning graphical models for stationary time series. IEEE Transactions on Signal Processing, 52(8):2189-2199.
     Basu, S., Michailidis, G., et al. (2015). Regularized estimation in sparse high-dimensional time series models. The Annals of Statistics, 43(4):1535-1567.
     Matz, G. and Hlawatsch, F. (2003). Time-varying power spectra of nonstationary random processes.
     Pereira, J., Ibrahimi, M., and Montanari, A. (2010). Learning networks of stochastic differential equations. In Advances in Neural Information Processing Systems, pages 172-180.
     Songsiri, J., Dahl, J., and Vandenberghe, L. (2010). Graphical models of autoregressive processes. In Convex Optimization in Signal Processing and Communications, pages 89-116.

  16. Reference II
     Songsiri, J. and Vandenberghe, L. (2010). Topology selection in graphical models of autoregressive processes. The Journal of Machine Learning Research, 11:2671-2705.
     Tank, A., Foti, N. J., and Fox, E. B. (2015). Bayesian structure learning for stationary time series. In Uncertainty in Artificial Intelligence (UAI 2015), Amsterdam, The Netherlands, pages 872-881.
