How to choose the covariance for Gaussian process regression independently of the basis
Workshop "Gaussian Processes in Practice"
Matthias O. Franz and Peter V. Gehler
June 12th, 2006
Motivation: Nonlinear system identification using Volterra series

Characterisation of a nonlinear system $y(t) = T[x(t)]$ by a series expansion $y(t) = \sum_n H_n[x(t)]$ (Volterra, 1887):

$$y(t) = h^{(0)} + \int_{\mathbb{R}} h^{(1)}(\tau_1)\, x(t-\tau_1)\, d\tau_1 + \int_{\mathbb{R}^2} h^{(2)}(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, d\tau_1\, d\tau_2 + \int_{\mathbb{R}^3} h^{(3)}(\tau_1, \tau_2, \tau_3)\, x(t-\tau_1)\, x(t-\tau_2)\, x(t-\tau_3)\, d\tau_1\, d\tau_2\, d\tau_3 + \cdots$$

Discretised form for $x = (x_1, \ldots, x_m)^\top \in \mathbb{R}^m$:

$$H_n[x] = \sum_{i_1=1}^{m} \cdots \sum_{i_n=1}^{m} h^{(n)}_{i_1 \ldots i_n}\, x_{i_1} \cdots x_{i_n}.$$
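As a concrete illustration (not from the slides), the discretised expansion can be evaluated directly once the Volterra kernels $h^{(n)}$ are given. A minimal sketch up to second order, with all kernel values chosen arbitrarily:

```python
# Minimal sketch (illustrative, not from the slides): evaluating a
# discretised Volterra expansion up to second order for one input
# window x in R^m.
import numpy as np

def volterra_output(x, h0, h1, h2):
    """y = h0 + sum_i h1[i] x[i] + sum_{i,j} h2[i,j] x[i] x[j]."""
    return h0 + h1 @ x + x @ h2 @ x

m = 5
rng = np.random.default_rng(0)
x = rng.standard_normal(m)          # discretised input window
h0 = 0.1                            # zeroth-order kernel
h1 = rng.standard_normal(m)         # first-order Volterra kernel
h2 = rng.standard_normal((m, m))    # second-order Volterra kernel
print(volterra_output(x, h0, h1, h2))
```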
Polynomial regression and Volterra systems

Volterra expansions can be estimated efficiently by regression with the inhomogeneous polynomial kernel (Franz & Schölkopf, 2006)

$$k_{\text{ihp}}(x, x') = (1 + x^\top x')^p$$

⇒ The GP framework is applicable to the estimation of Volterra systems.

Problems: The polynomial covariance implies strong correlations between distant inputs. In real-world problems, the reverse situation is more common. Typically, polynomial regression performs worse than localized covariance functions.

⇒ Choose covariance and basis independently.
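A hedged sketch of such a regression: standard GP posterior-mean prediction with $k_{\text{ihp}}$, where the degree $p$, the noise level, and the synthetic data are illustrative assumptions, not values from the talk:

```python
# GP regression with the inhomogeneous polynomial kernel
# k_ihp(x, x') = (1 + x^T x')^p. All parameter values are illustrative.
import numpy as np

def k_ihp(X1, X2, p=2):
    """Inhomogeneous polynomial kernel between the rows of X1 and X2."""
    return (1.0 + X1 @ X2.T) ** p

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))                      # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)   # noisy targets
noise = 1e-2                                          # assumed noise variance

K = k_ihp(X, X)
alpha = np.linalg.solve(K + noise * np.eye(len(X)), y)

X_test = rng.standard_normal((5, 3))
mean = k_ihp(X_test, X) @ alpha                       # GP posterior mean
print(mean)
```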
Decoupling of basis and covariance

Basic idea: approximate a desired covariance function $k_{\text{GP}}(x_i, x_j)$ on a finite set $S = \{x_1, \ldots, x_p\}$ of input points.

Weight-space view of a GP: $k(x_i, x_j) = \phi(x_i)^\top \Sigma_w\, \phi(x_j)$.

⇒ Choose the basis $\phi(x)$ and the prior $\Sigma_w$ such that $k_{\text{GP}}(x_i, x_j) = \phi(x_i)^\top \Sigma_w\, \phi(x_j)$ for all $x_i, x_j \in S$.

Basis: kernel PCA map $\phi(x) = K^{-\frac{1}{2}} \left( k(x, x_1), \ldots, k(x, x_p) \right)^\top$; then solve a system of linear equations for $\Sigma_w$.

⇒ Arbitrary covariances can be approximated, and the performance of polynomial regression can be significantly improved.
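A minimal sketch of this construction (assumed from the formulas above, not the authors' code). For points in $S$, the stacked kernel PCA features equal $K^{\frac{1}{2}}$, so the matching condition $K^{\frac{1}{2}} \Sigma_w K^{\frac{1}{2}} = K_{\text{GP}}$ has the closed-form solution $\Sigma_w = K^{-\frac{1}{2}} K_{\text{GP}} K^{-\frac{1}{2}}$; the squared-exponential target covariance, point set, and jitter value below are illustrative assumptions:

```python
# Sketch: given the polynomial kernel matrix K and a desired covariance
# matrix K_gp on the point set S, choose Sigma_w so that
# phi(x_i)^T Sigma_w phi(x_j) = k_GP(x_i, x_j) for all x_i, x_j in S.
import numpy as np

def inv_sqrtm(K, jitter=1e-10):
    """Inverse matrix square root via eigendecomposition (K symmetric PSD)."""
    w, V = np.linalg.eigh(K)
    return V @ np.diag(1.0 / np.sqrt(np.maximum(w, jitter))) @ V.T

rng = np.random.default_rng(2)
S = rng.standard_normal((20, 10))            # input points x_1, ..., x_p

K = (1.0 + S @ S.T) ** 2                     # polynomial kernel matrix on S
sq = np.sum(S**2, axis=1)
D2 = sq[:, None] + sq[None, :] - 2.0 * S @ S.T
K_gp = np.exp(-0.5 * D2)                     # desired squared-exponential covariance

K_inv_sqrt = inv_sqrtm(K)
Sigma_w = K_inv_sqrt @ K_gp @ K_inv_sqrt     # weight prior: K^{-1/2} K_gp K^{-1/2}

# Check: the induced covariance K^{1/2} Sigma_w K^{1/2} reproduces K_gp on S
# (up to numerical error).
K_sqrt = np.linalg.inv(K_inv_sqrt)
print(np.max(np.abs(K_sqrt @ Sigma_w @ K_sqrt - K_gp)))   # close to zero
```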