From Fourier to Koopman: Spectral Methods for Long-term Time Series Prediction
arXiv:2004.00574
Henning Lange, Steven L. Brunton, J. Nathan Kutz

Objective > Given data snapshots $x_t$ from $t = 1$ to $t = T$ > Predict temporal snapshots $x_{T+h}$ > $h$ on the order of 10,000 > Assumption: $x_t$ is produced by a quasi-periodic system

Spatio-Temporal Systems

Outline > Fourier Forecast > Similar to Fourier Transform > No implicit periodicity assumption   > Koopman Forecast > Based on Koopman theory > Fourier Transform in non-linear basis

Outline > Fourier Forecast > Non-convex objective   > Koopman Forecast > Non-linear and non-convex objective > FFT allows for obtaining global optima

Solution strategy > Both learning objectives contain easy-to-optimize and hard-to-optimize parameters > For both algorithms, a strategy is introduced for obtaining the global optimum with respect to a single one of the hard-to-optimize parameters > Apply coordinate descent > Alternately optimize the hard and easy quantities

Fourier Forecast

Objective > Goal: Fit linear dynamical system to data > [Diagram: $y_t$, $x_t$] > minimize $E(A, B) = \sum_{t=1}^{T} (x_t - A y_t)^2$ subject to $y_t = B y_{t-1}$, $\mathrm{Re}[\mathrm{eig}(B)] = 0$

Objective > Goal: Fit linear dynamical system to data > [Diagram: $y_t$, $x_t$] > $E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A \begin{bmatrix} \sin(\omega_1 t) \\ \vdots \\ \sin(\omega_N t) \\ \cos(\omega_1 t) \\ \vdots \\ \cos(\omega_N t) \end{bmatrix} \right)^2$

Objective > Goal: Fit linear dynamical system to data > [Diagram: $y_t$, $x_t$] > $E(A, \omega) = \sum_{t=1}^{T} (x_t - A\,\Omega(\omega t))^2$
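A minimal sketch of this objective, assuming NumPy and toy data: it builds $\Omega(\omega t)$, evaluates $E(A, \omega)$, and solves for $A$ at fixed $\omega$ by ordinary least squares, which is possible because the model is linear in $A$. The function names (`omega_matrix`, `fourier_error`, `solve_A`) and the numbers in `omega_true` and `A_true` are hypothetical and not taken from the authors' code.

```python
import numpy as np

def omega_matrix(omega, t):
    """Omega(omega * t) for every t: rows are [sin(w_1 t)..sin(w_N t), cos(w_1 t)..cos(w_N t)]."""
    phase = np.outer(t, omega)                       # shape (T, N)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=1)

def fourier_error(A, omega, x, t):
    """E(A, omega) = sum_t (x_t - A Omega(omega t))^2, summed over output dimensions."""
    residual = x - omega_matrix(omega, t) @ A.T
    return float(np.sum(residual ** 2))

def solve_A(omega, x, t):
    """Least-squares A for fixed omega (the easy-to-optimize parameters)."""
    coef, *_ = np.linalg.lstsq(omega_matrix(omega, t), x, rcond=None)
    return coef.T

# Toy data with hypothetical frequencies and mixing matrix.
t = np.arange(1, 201)
omega_true = np.array([0.31, 1.07])
A_true = np.array([[1.0, -0.5, 0.2, 0.8]])
x = omega_matrix(omega_true, t) @ A_true.T           # shape (T, 1)

A_hat = solve_A(omega_true, x, t)
```

With the frequencies fixed at their true values, `A_hat` recovers `A_true` up to floating-point error; the frequencies themselves remain the hard part of the problem.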

Objective > Goal: Fit linear dynamical system to data > [Diagram: $y_t$, $x_t$] > Because of the linearity of $A$ and $\Omega$ > Analytic solution for $\omega_i$ > Symmetry relationship to Fourier Transform > $E(A, \omega) = \sum_{t=1}^{T} (x_t - A\,\Omega(\omega t))^2$

Symmetry > $E(A, \omega) = \sum_{t=1}^{T} (x_t - A\,\Omega(\omega t))^2$
Jaynes, E. T. "Bayesian spectrum and chirp analysis." Maximum-Entropy and Bayesian Spectral Analysis and Estimation Problems. Springer, Dordrecht, 1987. 1-37.

Spectral leakage > For quasi-periodic systems, FT/error surface is superposition of sinc-functions
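A small illustration of spectral leakage, assuming NumPy and two toy sinusoids: one whose frequency lies exactly on the FFT grid and one halfway between two grid frequencies. The helper `peak_energy_fraction` and the chosen bin numbers are hypothetical.

```python
import numpy as np

T = 128
t = np.arange(T)

on_grid = np.sin(2 * np.pi * 10.0 / T * t)    # exactly 10 cycles in the window
off_grid = np.sin(2 * np.pi * 10.5 / T * t)   # 10.5 cycles: between two FFT bins

def peak_energy_fraction(x):
    """Share of the one-sided spectral power that lands in the strongest bin."""
    power = np.abs(np.fft.rfft(x)) ** 2
    return float(power.max() / power.sum())

frac_on = peak_energy_fraction(on_grid)
frac_off = peak_energy_fraction(off_grid)
```

For the on-grid signal nearly all power sits in one bin. For the off-grid signal the power spreads over neighbouring bins, following the sinc-shaped lobes mentioned above.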

Combining FFT and GD > Fast Fourier Transform > evaluates the Fourier Transform at $T$ frequencies and implicitly assumes the signal is periodic with period $T$ > harmful for forecasting > Gradient Descent > because of non-convexity, will get stuck in a bad local minimum

Combining FFT and GD > Use Fast Fourier Transform > to locate the global valley of the error surface > Use Gradient Descent > to refine the FFT's initial guess and break the implicit periodicity assumption
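The two-stage idea can be sketched as follows for a single sinusoid, assuming NumPy. The FFT peak supplies a coarse frequency on the grid $2\pi k / T$, and plain gradient descent on $\omega$ then refines it, with the sine and cosine amplitudes re-solved by least squares at every step. The function names, the learning rate, the step count and the test signal are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_amplitudes(omega, x, t):
    """Closed-form least squares for the sin/cos amplitudes at fixed omega."""
    basis = np.stack([np.sin(omega * t), np.cos(omega * t)], axis=1)
    coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return coef, basis @ coef

def fft_then_gd(x, lr, steps=300):
    T = len(x)
    t = np.arange(T)
    # Stage 1: the FFT peak (DC excluded) locates the valley on the grid 2*pi*k/T.
    spectrum = np.abs(np.fft.rfft(x))
    k = 1 + int(np.argmax(spectrum[1:]))
    omega_fft = 2 * np.pi * k / T
    # Stage 2: gradient descent on omega, starting from the FFT guess.
    omega = omega_fft
    for _ in range(steps):
        (a, b), pred = fit_amplitudes(omega, x, t)
        d_pred = t * (a * np.cos(omega * t) - b * np.sin(omega * t))  # d pred / d omega
        grad = -2.0 * np.sum((x - pred) * d_pred)
        omega -= lr * grad
    return omega_fft, omega

T = 200
t = np.arange(T)
omega_true = 2 * np.pi * 17.3 / T          # hypothetical off-grid frequency
x = np.sin(omega_true * t + 0.4)
omega_fft, omega_gd = fft_then_gd(x, lr=3e-7)
```

The learning rate is set by hand for this signal length; the curvature of the error surface in $\omega$ grows roughly with $T^3$, so longer series would need a smaller step.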

Combining FFT and GD

Koopman Forecast

Koopman Theory > Koopman showed in 1931: any non-linear dynamical system can be lifted by a non-linear but time-invariant function into a space where time evolution is linear > Analogous to Cover’s theorem (1965) > Theoretical underpinning of kernel methods and deep learning
Koopman, Bernard O. "Hamiltonian systems and transformation in Hilbert space." Proceedings of the National Academy of Sciences of the United States of America 17.5 (1931): 315.
Cover, T. M. "Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition." IEEE Transactions on Electronic Computers EC-14.3 (1965): 326–334.

Koopman Theory > [Figure: lifting function $f$; panels labeled Koopman and Cover]

Objective: Koopman > Recap: Stable Linear Dynamical System > $\Omega(\omega t) = \begin{bmatrix} \sin(\omega_1 t) \\ \vdots \\ \sin(\omega_N t) \\ \cos(\omega_1 t) \\ \vdots \\ \cos(\omega_N t) \end{bmatrix}$

Objectives > Koopman: $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2$ > Fourier: $E(A, \omega) = \sum_{t=1}^{T} (x_t - A\,\Omega(\omega t))^2$

Objectives > Koopman: $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2$

Objective: Koopman > $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2$ > $f_\Theta$: neural network parameterized by $\Theta$

Objective: Koopman > $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2$ > Because of non-linearity, no analytical solution for $\omega_i$

Objective: Koopman > $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2$ > However, in spite of non-linearity and non-convexity, computing global optima in the direction of $\omega_i$ is possible!

Objective: Koopman > $E(\Theta, \omega) = \sum_{t=1}^{T} (x_t - f_\Theta(\Omega(\omega t)))^2 = \sum_{t=1}^{T} L(\Theta, \omega, t)$ > $L(\Theta, \omega, t) = (x_t - f_\Theta(\Omega(\omega t)))^2$

Periodicity in loss > $L(\Theta, \omega + \frac{2\pi}{t}, t) = \left(x_t - f_\Theta\left(\Omega\left((\omega + \frac{2\pi}{t})\, t\right)\right)\right)^2 = (x_t - f_\Theta(\Omega(\omega t)))^2 = L(\Theta, \omega, t)$

Periodicity in loss > $L(\Theta, \omega, t) = L(\Theta, \omega + \frac{2\pi}{t}, t)$ > $\sin((\omega + \frac{2\pi}{t})\, t) = \sin(\omega t + 2\pi) = \sin(\omega t)$

Periodicity in loss > $L(\Theta, \omega, t) = L(\Theta, \omega + \frac{2\pi}{t}, t)$
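The periodicity can be checked numerically. The sketch below assumes NumPy and a single frequency, and uses an arbitrary fixed non-linear map as a hypothetical stand-in for the trained network $f_\Theta$; the property depends only on $\Omega$, not on the particular choice of $f_\Theta$.

```python
import numpy as np

def omega_vec(omega, t):
    """Omega(omega * t) for a single frequency: [sin(omega t), cos(omega t)]."""
    return np.array([np.sin(omega * t), np.cos(omega * t)])

def f_theta(z):
    """Hypothetical stand-in for the network f_Theta: any fixed non-linear map."""
    return np.tanh(2.0 * z[0] + z[1] ** 3)

def local_loss(omega, x_t, t):
    """L(Theta, omega, t) = (x_t - f_Theta(Omega(omega t)))^2 for one time step."""
    return float((x_t - f_theta(omega_vec(omega, t))) ** 2)

omega, x_t = 0.37, 0.5
# Shifting omega by 2*pi/t leaves the loss at time step t unchanged.
gaps = [abs(local_loss(omega, x_t, t) - local_loss(omega + 2 * np.pi / t, x_t, t))
        for t in range(1, 50)]
```

Every entry of `gaps` is zero up to floating-point rounding, so each temporally local loss only needs to be evaluated on one period of length $2\pi/t$.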

Computing the loss > For all $t$, compute the loss within one period of length $2\pi/t$

Computing the loss > For all $t$, repeat the computed loss $t$ times

Computing the loss > For all $t$, resample the loss

Computing the loss > Sum all ‘temporally local’ losses

Computing the loss > [Figure: the temporally local losses added together to give the total loss]

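Computing the loss > Easy and efficient to implement in the frequency domain!
    import numpy as np
    # L[t - 1]: loss at time step t, sampled at K (even) values of ω within one period of length 2π/t
    E_ft = np.zeros(K * T // 2 + 1, dtype=complex)
    for t in range(1, T + 1):
        E_ft[np.arange(K // 2 + 1) * t] += np.fft.rfft(L[t - 1])  # harmonic k of step t lands at index k*t
    E = np.fft.irfft(E_ft, n=K * T) * T  # summed loss on K*T values of ω in [0, 2π)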
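A self-contained check of the frequency-domain summation against direct evaluation, assuming NumPy and a toy loss in which a plain sine plays the role of $f_\Theta$. All sizes, the frequency `omega_true`, and the use of `rfft`/`irfft` with a rescaling by `T` are assumptions of this sketch rather than details taken from the slides.

```python
import numpy as np

T, K = 64, 8                        # time steps, samples per loss period (toy values, K even)
t_idx = np.arange(1, T + 1)
omega_true = 2 * np.pi * 5 / T      # hypothetical ground-truth frequency
x = np.sin(omega_true * t_idx)

# Temporally local losses, each sampled at K values of omega inside [0, 2*pi/t).
# On that grid omega * t = 2*pi*n/K, independent of t.
phase = 2 * np.pi * np.arange(K) / K
L = (x[:, None] - np.sin(phase)[None, :]) ** 2          # shape (T, K)

# Harmonic k of the loss at time step t has frequency k*t on the global grid.
M = K * T
E_ft = np.zeros(M // 2 + 1, dtype=complex)
k = np.arange(K // 2 + 1)
for t in t_idx:
    E_ft[k * t] += np.fft.rfft(L[t - 1])
E = np.fft.irfft(E_ft, n=M) * T     # rescale from K-point to M-point normalisation

# Reference: sum the losses directly on the same M-point grid over [0, 2*pi).
omega_grid = 2 * np.pi * np.arange(M) / M
E_direct = ((x[None, :] - np.sin(omega_grid[:, None] * t_idx[None, :])) ** 2).sum(axis=1)

best = int(np.argmin(E))
```

Here the toy loss contains only harmonics 0, 1 and 2 of $\omega t$, so `K = 8` samples per period represent it exactly and `E` matches `E_direct`. A loss with higher harmonics would need a larger `K` to avoid aliasing.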

Results

Results: Theoretical > Fourier algorithm has universal approximation properties on finite datasets > Sines and cosines form an orthogonal basis > which is periodic in $T$ > Analogous to Cover’s theorem, requires an $N$-dimensional space

Results: Theoretical > For infinite data, Koopman algorithm is more expressive than Fourier counterpart

Results: Theoretical > Close relationship to Bayesian spectral analysis > Error grows linearly in time and with noise variance > But shrinks superlinearly with the amount of data > $|\hat{x}_t(\omega) - \hat{x}_t(\omega^*)| \in \mathcal{P}\left(\frac{\sigma^2 t}{T^3} \sum_i A_i\right)$
Bretthorst, G. Larry. Bayesian spectrum analysis and parameter estimation. Vol. 48. Springer Science & Business Media, 2013.
Jaynes, E. T. "Bayesian spectrum and chirp analysis." Maximum-Entropy and Bayesian Spectral Analysis and Estimation Problems. Springer, Dordrecht, 1987. 1-37.

Results: Practical > $x_t = \sin\left(\frac{2\pi}{24} t\right)^{17} + \epsilon_t$

Results: Practical

Summary > Fit linear and non-linear oscillators to data > non-convex and non-linear objective > Many real world phenomena are quasi-periodic > gait, (space) weather, fluid flows, epidemiological data, power systems, sales, room occupancy, …   > Code is available: > https://github.com/helange23/from_fourier_to_koopman
