Prototypes and Matrix Relevance Learning in Complex Fourier Space
M. Straat, M. Kaden, M. Gay, T. Villmann, A. Lampe, U. Seiffert, M. Biehl, and F. Melchert
June 26, 2017
Overview
- A study of the classification of time series.
- In Fourier space: vectors in $\mathbb{C}^n$.
- Generalized Matrix Learning Vector Quantization (GMLVQ) on complex-valued data.
- Evaluation and interpretation of the Fourier-space classifiers.
Figure: Plane examples (feature value vs. feature index).
Learning Vector Quantization (LVQ)
- Dataset of vectors $x^m \in \mathbb{R}^N$, each carrying a class label $\sigma^m \in \{1, 2, \ldots, C\}$.
- Training: for each class $\sigma$, identify prototype(s) $w^i \in \mathbb{R}^N$ in feature space that are typical representatives of that class.
- Aim: classify novel vectors $x^\mu$, assigning them to the class of the nearest prototype.
Figure: LVQ with 5 prototypes per class, initialized with k-means on each class. Black line: piece-wise linear decision boundary.
$d(x, w) = (x - w)^T (x - w)$, the squared Euclidean distance.

procedure LVQ
  for each training epoch do
    for each labeled vector $\{x, \sigma\}$ do
      $\{w^*, S^*\} \leftarrow \operatorname{argmin}_i \{ d(x, w^i) \}$
      $w^* \leftarrow w^* + \eta\, \Psi(S^*, \sigma)\,(x - w^*)$

where $\Psi(S, \sigma) = +1$ if $S = \sigma$, and $-1$ otherwise.

Classification of a novel data point $x^\mu$: find the closest prototype $\{w^*, S^*\} \leftarrow \operatorname{argmin}_i \{ d(x^\mu, w^i) \}$ and assign $x^\mu$ to class $S^*$: $\{x^\mu, \sigma^\mu = S^*\}$. A runnable sketch of this procedure follows below.
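The LVQ1 scheme above is a few lines of NumPy. This is a minimal sketch, not the authors' implementation; the learning rate, epoch count, and function names are illustrative assumptions.

```python
import numpy as np

def lvq1_train(X, y, W, proto_labels, eta=0.01, epochs=50):
    """LVQ1: attract the winning prototype if its label matches, repel otherwise."""
    W = W.copy()
    for _ in range(epochs):
        for x, sigma in zip(X, y):
            d = np.sum((W - x) ** 2, axis=1)      # squared Euclidean distances
            i = np.argmin(d)                      # winner {w*, S*}
            psi = 1.0 if proto_labels[i] == sigma else -1.0
            W[i] += eta * psi * (x - W[i])        # Psi-signed update
    return W

def lvq1_classify(X, W, proto_labels):
    """Assign each vector the class of its nearest prototype."""
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return proto_labels[np.argmin(d, axis=1)]
```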
GMLVQ
Learn feature relevances and adapt $d$ accordingly.
- Adaptive quadratic distance measure: $d^\Omega(x, w) = (x - w)^T \Omega^T \Omega (x - w)$.
- Update two prototypes upon presentation of $\{x, \sigma\}$:
  - $w^+$: closest prototype of the same class as $x$.
  - $w^-$: closest prototype of a different class than $x$.
Cost for one example $x^m$:
$e^m = \dfrac{d^\Omega[w^+] - d^\Omega[w^-]}{d^\Omega[w^+] + d^\Omega[w^-]} \in [-1, 1]$.
Learning is minimization of the cost by gradient descent:
$w^\pm \leftarrow w^\pm - \eta_w \nabla_{w^\pm} e^m$, $\quad \Omega \leftarrow \Omega - \eta_\Omega \nabla_\Omega e^m$.
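As a concrete reference for the cost above, here is a sketch of the real-valued GMLVQ distance and single-example cost in NumPy; prototypes are assumed to be stored row-wise in W, and the function names are illustrative.

```python
import numpy as np

def gmlvq_distance(x, w, Omega):
    """d_Omega(x, w) = (x - w)^T Omega^T Omega (x - w)."""
    diff = Omega @ (x - w)
    return diff @ diff

def gmlvq_cost(x, sigma, W, proto_labels, Omega):
    """e = (d[w+] - d[w-]) / (d[w+] + d[w-]) for one example (x, sigma)."""
    d = np.array([gmlvq_distance(x, w, Omega) for w in W])
    same = (proto_labels == sigma)
    d_plus = d[same].min()          # closest prototype of the correct class
    d_minus = d[~same].min()        # closest prototype of a wrong class
    return (d_plus - d_minus) / (d_plus + d_minus)
```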
Time series
- Sampling: $f(t) \to f(i \Delta T)$, $i = 0, 1, \ldots, N-1$.
- Vectors $x \in \mathbb{R}^N$.
- Temporal order of the dimensions.
Figure: example time series (magnitude vs. feature index / sample index).
Training in coefficient space
Approximate $f(t) = \sum_{i=1}^{n} c_i g_i(t)$:
- Using the Chebyshev basis.
- Using the Fourier basis: $x \in \mathbb{R}^N \to x_f \in \mathbb{C}^n$.
- Prototypes $w^i \in \mathbb{C}^n$ and relevance matrix $\Lambda$ Hermitian.
Figure: 5 Chebyshev basis functions. Figure: Fourier complex sinusoid.
F. Melchert, U. Seiffert, and M. Biehl, "Polynomial Approximation of Spectral Data in LVQ and Relevance Learning," in Workshop on New Challenges in Neural Computation, 2015.
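For the Chebyshev variant, NumPy's polynomial module gives the coefficient representation directly; the signal, dimensions, and noise level below are made-up illustrations, not the paper's data.

```python
import numpy as np

N, n = 144, 21                              # samples, coefficients (illustrative)
t = np.linspace(-1, 1, N)                   # Chebyshev polynomials live on [-1, 1]
rng = np.random.default_rng(0)
x = np.cos(4 * np.pi * t) + 0.1 * rng.standard_normal(N)   # toy sampled series

c = np.polynomial.chebyshev.chebfit(t, x, deg=n - 1)   # n real coefficients
x_hat = np.polynomial.chebyshev.chebval(t, c)          # smooth reconstruction
```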
Fourier: Time ⇆ Frequency
Matrix $F \in \mathbb{C}^{n \times N}$ with entries $F_{k\ell} = e^{-j 2\pi k \ell / N}$, $k = 0, 1, \ldots, n-1$, $\ell = 0, 1, \ldots, N-1$.
- Forward (DFT): $x_f = F x \in \mathbb{C}^n$.
- Backward (iDFT): $x = \frac{1}{N} F^H x_f \in \mathbb{R}^N$ (exact for $n = N$).
Figure: example time series (magnitude vs. sample index) and its frequency magnitudes (magnitude vs. frequency).
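A numerical check of the transform pair, assuming the truncated DFT matrix defined above; for $n = N$ the backward map recovers $x$ exactly, while $n < N$ keeps only low-frequency content.

```python
import numpy as np

def dft_matrix(n, N):
    """First n rows of the N-point DFT matrix: F[k, l] = exp(-2j*pi*k*l/N)."""
    k = np.arange(n)[:, None]
    l = np.arange(N)[None, :]
    return np.exp(-2j * np.pi * k * l / N)

N = 1000
x = np.sin(2 * np.pi * 3 * np.arange(N) / N)    # toy periodic signal

F = dft_matrix(N, N)
x_f = F @ x                                      # forward DFT
x_back = np.real(F.conj().T @ x_f) / N           # backward iDFT
assert np.allclose(x, x_back)                    # exact when n = N

F21 = dft_matrix(21, N)                          # keep only 21 coefficients
x_lowpass = np.real(F21.conj().T @ (F21 @ x)) / N
# note: for real signals the dropped conjugate-symmetric bins halve the
# amplitude of this crude reconstruction; it is only a low-frequency sketch
```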
GMLVQ for complex-valued data
Quadratic distance measure:
$d^\Lambda[x_f, w_f] = (x_f - w_f)^H \Omega^H \Omega (x_f - w_f) \in \mathbb{R}_{\geq 0}$.
Cost for one example $x_f^m$:
$e^m = \dfrac{d^\Lambda[w_f^+] - d^\Lambda[w_f^-]}{d^\Lambda[w_f^+] + d^\Lambda[w_f^-]} \in [-1, 1]$.
Compute gradients w.r.t. $w_f^+$, $w_f^-$ and $\Omega$ for learning, via the chain rule:
$\nabla_{w_f^+} e^\mu = \dfrac{\partial e^\mu}{\partial d^\Lambda_+} \, \dfrac{\partial d^\Lambda_+}{\partial w_f^+}$.
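The Hermitian form makes the distance real and non-negative; in NumPy (a sketch under the slide's definitions, with an illustrative function name):

```python
import numpy as np

def complex_distance(x_f, w_f, Omega):
    """d_Lambda[x_f, w_f] = (x_f - w_f)^H Omega^H Omega (x_f - w_f) >= 0."""
    diff = Omega @ (x_f - w_f)
    return np.real(np.vdot(diff, diff))   # vdot conjugates its first argument
```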
Wirtinger derivatives
For $f(z): \mathbb{C} \to \mathbb{R}$, define the operators
$\dfrac{\partial}{\partial z} = \dfrac{1}{2}\left(\dfrac{\partial}{\partial x} - i \dfrac{\partial}{\partial y}\right)$ and $\dfrac{\partial}{\partial z^*} = \dfrac{1}{2}\left(\dfrac{\partial}{\partial x} + i \dfrac{\partial}{\partial y}\right)$.
For $f(z) = z \cdot z^*$: $\dfrac{\partial f}{\partial z} = z^*$ and $\dfrac{\partial f}{\partial z^*} = z$.
Wirtinger gradients:
$\dfrac{\partial}{\partial z} = \left(\dfrac{\partial}{\partial z_1}, \ldots, \dfrac{\partial}{\partial z_N}\right)^T$ and $\dfrac{\partial}{\partial z^*} = \left(\dfrac{\partial}{\partial z_1^*}, \ldots, \dfrac{\partial}{\partial z_N^*}\right)^T$.
Using the Wirtinger gradient: $\dfrac{\partial}{\partial z^*}(z^H A z) = A z$.
M. Gay, M. Kaden, M. Biehl, A. Lampe, and T. Villmann, "Complex variants of GLVQ based on Wirtinger's calculus."
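The closed form $\partial (z^H A z)/\partial z^* = A z$ can be verified numerically by applying the operator definitions coordinate-wise with finite differences; the random Hermitian $A$, step size, and tolerance below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, h = 4, 1e-6
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = B + B.conj().T                          # Hermitian, so z^H A z is real
z = rng.standard_normal(N) + 1j * rng.standard_normal(N)

f = lambda z: np.real(z.conj() @ A @ z)     # f(z) = z^H A z : C^N -> R

grad = np.empty(N, dtype=complex)
for k in range(N):
    e = np.zeros(N); e[k] = 1.0
    df_dx = (f(z + h * e) - f(z - h * e)) / (2 * h)            # d/dx_k
    df_dy = (f(z + 1j * h * e) - f(z - 1j * h * e)) / (2 * h)  # d/dy_k
    grad[k] = 0.5 * (df_dx + 1j * df_dy)    # Wirtinger derivative w.r.t. z_k^*

assert np.allclose(grad, A @ z, atol=1e-4)  # matches the closed form A z
```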
Learning rules
Complex-valued GMLVQ (Wirtinger):
$\nabla_{w_f^*} d^\Lambda[x_f, w_f] = -\Omega^H \Omega (x_f - w_f)$,
$\nabla_{\Omega^*} d^\Lambda[x_f, w_f] = \Omega (x_f - w_f)(x_f - w_f)^H$.
The relevance matrix $\Lambda = \Omega^H \Omega$ is Hermitian.
Real-valued GMLVQ:
$\nabla_{w} d^\Lambda[x, w] = -2 \Omega^T \Omega (x - w)$,
$\nabla_{\Omega} d^\Lambda[x, w] = \Omega (x - w)(x - w)^T$.
The relevance matrix $\Lambda = \Omega^T \Omega$ is symmetric (also Hermitian).
After each epoch, normalize $\Lambda$ such that $\mathrm{tr}(\Lambda) = 1$.
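Combining these gradients with the chain-rule factors of the cost $e$ gives one update step. The sketch below uses per-step trace normalization for brevity (the slides normalize once per epoch) and illustrative learning rates; it is not the authors' implementation.

```python
import numpy as np

def cgmlvq_step(x_f, w_plus, w_minus, Omega, eta_w=0.01, eta_om=1e-3):
    """One complex-GMLVQ gradient step on a single example x_f."""
    dv_p, dv_m = x_f - w_plus, x_f - w_minus
    Lam = Omega.conj().T @ Omega                   # Hermitian relevance matrix
    dp = np.real(dv_p.conj() @ Lam @ dv_p)         # d_Lambda[w+]
    dm = np.real(dv_m.conj() @ Lam @ dv_m)         # d_Lambda[w-]
    # chain-rule factors of e = (d+ - d-)/(d+ + d-)
    mu_p = 2.0 * dm / (dp + dm) ** 2
    mu_m = -2.0 * dp / (dp + dm) ** 2
    # descent on e via the Wirtinger gradients of d
    w_plus = w_plus + eta_w * mu_p * (Lam @ dv_p)
    w_minus = w_minus + eta_w * mu_m * (Lam @ dv_m)
    Omega = Omega - eta_om * (mu_p * Omega @ np.outer(dv_p, dv_p.conj())
                              + mu_m * Omega @ np.outer(dv_m, dv_m.conj()))
    Omega /= np.sqrt(np.real(np.trace(Omega.conj().T @ Omega)))  # tr(Lambda) = 1
    return w_plus, w_minus, Omega
```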
The testing scenarios
1. GMLVQ in the original time domain on vectors $x \in \mathbb{R}^N$.
2. GMLVQ (Wirtinger) in complex Fourier space on vectors $x_f \in \mathbb{C}^n$ with $n \in \{6, 11, \ldots, 51\}$.
3. GMLVQ in Fourier space on vectors $x_f \in \mathbb{R}^{2n}$, real and imaginary parts concatenated.
4. GMLVQ on smoothed time-domain vectors $\hat{x} \in \mathbb{R}^N$.
Before training (a setup sketch follows below):
- All dimensions z-score transformed.
- One prototype per class.
- Prototype of class $i$ initialized near the class mean: $w^i \approx \operatorname{mean}(\{x \mid y = i\})$.
- $\Lambda = cI$.
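A sketch of the scenario-2 preprocessing and initialization; np.fft.fft stands in for the matrix form of the DFT, and the z-scoring of complex dimensions (complex mean shift, real magnitude-based scale) is one plausible reading of the slide, not a confirmed detail.

```python
import numpy as np

def prepare_fourier(X, y, n, c=1.0):
    """Truncated DFT, z-scoring, class-mean prototypes, Lambda = c*I."""
    X_f = np.fft.fft(X, axis=1)[:, :n]            # first n Fourier coefficients
    # z-score each dimension; np.std of complex data is real and non-negative
    X_f = (X_f - X_f.mean(axis=0)) / X_f.std(axis=0)
    classes = np.unique(y)
    W = np.array([X_f[y == k].mean(axis=0) for k in classes])  # one per class
    Omega = np.sqrt(c) * np.eye(n, dtype=complex)              # Lambda = c*I
    return X_f, W, Omega, classes
```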
Plane dataset
- 210 labeled vectors $(x, y) \in \mathbb{R}^{144} \times \{1, 2, \ldots, 7\}$.
- 105/105 train/validation vectors.
Figure: Plane examples (feature value vs. feature index).
Plane - Classification performance
Figure: accuracies of the four testing scenarios on the validation set.
Interpreting the classifier
- Prototypes $w_f^i \in \mathbb{C}^n$.
- Matrix $\Lambda_f$ is Hermitian: $\Lambda_f = \Lambda_f^H$.
Figure: Plane, 21-coefficient Fourier space; magnitudes of the 2 prototypes.
- Map prototypes to the time domain with the iDFT: $w^i = \frac{1}{N} F^H w_f^i$.
- Relevance matrix to the time domain: $d[x_f, w_f] = (x - w)^H F^H \Lambda_f F (x - w)$.
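Both backtransforms in matrix form, assuming prototypes stored row-wise; the real part is taken for the prototypes since the original signals are real. A sketch, not the authors' code:

```python
import numpy as np

def backtransform(W_f, Lam_f, N):
    """Map Fourier-space prototypes and relevances to the time domain."""
    n = W_f.shape[1]
    k = np.arange(n)[:, None]
    l = np.arange(N)[None, :]
    F = np.exp(-2j * np.pi * k * l / N)       # truncated DFT matrix (n x N)
    W_time = np.real(W_f @ F.conj()) / N      # rows: w^i = (1/N) F^H w_f^i
    Lam_time = F.conj().T @ Lam_f @ F         # effective time-domain relevances
    return W_time, Lam_time
```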
Plane - Prototypes and feature relevance
Time-domain training vs. 21-coefficient Fourier space.
Figure: prototypes (Plane) and backtransformed prototypes (value vs. feature).
Figure: relevances (Plane) and backtransformed relevances (relevance vs. feature).
Symbols dataset
- 1020 feature vectors $(x, y) \in \mathbb{R}^{398} \times \{1, 2, \ldots, 6\}$.
- 25/995 train/validation vectors.
Figure: Symbols examples (feature value vs. feature index).
Symbols - Classification performance
Figure: accuracies of the four testing scenarios on the validation set.
Mallat dataset
- 2400 feature vectors $(x, y) \in \mathbb{R}^{1024} \times \{1, 2, \ldots, 8\}$.
- 55/2345 train/validation vectors.
Figure: Mallat examples (feature value vs. feature index).
Mallat - Classification performance
Figure: accuracies of the four testing scenarios on the validation set.
Mallat - Classification error curves
Error development on the training and validation sets.
Figure: train error and test error vs. epoch for GMLVQ in the original space, GMLVQ in complex Fourier space, and GMLVQ in concatenated Fourier space.
Discussion
Learning in complex Fourier-coefficient space...
- can be an effective method for the classification of periodic functional data.
- can provide an efficient low-dimensional representation.
- has the potential to improve classification accuracy.
For future research: how to obtain close-to-optimal accuracy with the smallest number of adaptive parameters.