Scientific Computing
Maastricht Science Program, Week 5
Frans Oliehoek <frans.oliehoek@maastrichtuniversity.nl>
Announcements
● I will be more strict! The requirements have been updated... YOU are responsible for making sure that your submission satisfies the requirements!
● I will not email you before the rest have received their marks.
Recap: Last Two Weeks
Supervised learning: find f that maps (x_1^(j), ..., x_D^(j)) → y^(j)
● Interpolation: f goes through the data points
● Linear regression: lossy fit, minimizes the 'vertical' SSE
Unsupervised learning: we just have data points (x_1^(j), ..., x_D^(j))
● PCA: minimizes the orthogonal projection error
[Figure: data in the x_1-x_2 plane with a direction u = (u_1, u_2).]
Recap: Clustering
Clustering (or cluster analysis) has many applications:
● Understanding: astronomy, biology, etc.
● Data (pre)processing: summarization of a data set, compression
Are there questions about k-means clustering?
This Lecture
Last week: unlabeled data (also 'unsupervised learning'), i.e., the data is just x
● Clustering
● Principal Component Analysis (PCA) – what?
This week:
● Principal Component Analysis (PCA) – how?
● Numerical differentiation and integration
Part 1: Principal Component Analysis
● Recap
● How to do it?
PCA – Intuition
How would you summarize this data using one dimension? (Which variable contains the most information?)
Very important idea: the most information is contained in the variable with the largest spread, i.e., the highest variance (information theory).
So if we have to choose between x_1 and x_2, we remember x_2. The transform of the k-th point is (x_1^(k), x_2^(k)) → (z_1^(k)), where z_1^(k) = x_2^(k).
More generally, z_1 is the orthogonal scalar projection onto a unit vector u^(1):
z_1^(k) = u_1^(1) x_1^(k) + u_2^(1) x_2^(k) = (u^(1), x^(k))
[Figure: data points in the x_1-x_2 plane with the direction u drawn through the point cloud.]
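A minimal sketch of this scalar projection in MATLAB/Octave; the direction and the data point below are made-up example values, not taken from the lecture:

% Orthogonal scalar projection z_1^(k) = (u^(1), x^(k)) of one point onto a
% unit direction (example values only).
u1 = [1; 2] / norm([1; 2]);      % unit vector giving the first direction
x  = [3; 1];                     % one data point x^(k)
z1 = u1' * x;                    % scalar projection z_1^(k)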
More Principal Components
u^(2) is the direction with the most 'remaining' variance, orthogonal to u^(1)!
In general:
● If the data is D-dimensional, we can find D directions u^(1), ..., u^(D)
● Each direction is itself a D-vector: u^(i) = (u_1^(i), ..., u_D^(i))
● Each direction is orthogonal to the others: (u^(i), u^(j)) = 0 (see the sketch below)
● The first direction has the most variance; the least variance is in direction u^(D)
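A small numerical illustration of the orthogonality (and unit length) of the directions; the matrix below is made up only to obtain some orthonormal set, it is not a PCA result:

% If the directions u^(1), ..., u^(D) are the columns of a matrix U, then
% orthogonality plus unit length mean that U' * U is the D x D identity.
A = randn(3, 3);        % made-up matrix, only used to construct an orthonormal basis
U = orth(A);            % columns of U are orthonormal vectors
check = U' * U;         % numerically close to eye(3)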
PCA – Goals
All directions of high variance might be useful in themselves, for example for analysis of the data: in the lab you will analyze the ECG signal of a patient with a heart disease.
PCA – Goals
All directions of high variance might be useful in themselves, but not for dimension reduction...
Given X (N data points of D variables), convert to Z (N data points of d variables):
(x_1^(0), x_2^(0), ..., x_D^(0)) → (z_1^(0), z_2^(0), ..., z_d^(0))
(x_1^(1), x_2^(1), ..., x_D^(1)) → (z_1^(1), z_2^(1), ..., z_d^(1))
...
(x_1^(n), x_2^(n), ..., x_D^(n)) → (z_1^(n), z_2^(n), ..., z_d^(n))
The vector (z_i^(0), z_i^(1), ..., z_i^(n)) is called the i-th principal component (of the data set).
PCA – Dimension Reduction Approach
Step 1: find all directions (and principal components):
(x_1^(0), x_2^(0), ..., x_D^(0)) → (z_1^(0), z_2^(0), ..., z_D^(0))
(x_1^(1), x_2^(1), ..., x_D^(1)) → (z_1^(1), z_2^(1), ..., z_D^(1))
...
(x_1^(n), x_2^(n), ..., x_D^(n)) → (z_1^(n), z_2^(n), ..., z_D^(n))
Step 2: keep only the directions with the most information, i.e., the principal components with much information; the first d < D PCs contain the high variance (a sketch follows below):
(x_1^(0), x_2^(0), ..., x_D^(0)) → (z_1^(0), z_2^(0), ..., z_d^(0))
(x_1^(1), x_2^(1), ..., x_D^(1)) → (z_1^(1), z_2^(1), ..., z_d^(1))
...
(x_1^(n), x_2^(n), ..., x_D^(n)) → (z_1^(n), z_2^(n), ..., z_d^(n))
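A hedged sketch of this two-step reduction, assuming the matrix of directions U is already available with one direction per column; the data, sizes and names below are illustrative only:

% Step 1: project every point onto all D directions; Step 2: keep the first d.
% X is D x N (one data point per column), U is D x D (one direction per column).
X = randn(4, 100);          % made-up data: D = 4 variables, N = 100 points
U = orth(randn(4, 4));      % made-up orthonormal directions (normally computed by PCA)
Z_full = U' * X;            % step 1: all D principal components, D x N
d = 2;
Z = Z_full(1:d, :);         % step 2: keep only the first d components, d x N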
PCA – More Concrete
● PCA (using the eigendecomposition of the covariance matrix): finding all the directions and the principal components – still to be shown.
● Data compression using PCA:
  ● computing the compressed representation – easy! For the k-th point: z_j^(k) = (u^(j), x^(k)), and just keep (z_1^(k), ..., z_d^(k)).
  ● computing the reconstruction – still to be shown (we show that the data is a linear combination of the PCs); an illustrative sketch follows below.
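The reconstruction is derived later, but as an illustrative sketch of the idea that a point is (approximately) a linear combination of the directions, with all names and values made up:

% Compress one (centered) point to d components and reconstruct it as a
% linear combination of the kept directions (illustration only).
U = orth(randn(3, 3));           % made-up orthonormal directions, one per column
x = randn(3, 1);                 % one (centered) data point
d = 2;
z = U(:, 1:d)' * x;              % compressed representation (z_1, ..., z_d)
x_approx = U(:, 1:d) * z;        % reconstruction: sum_j z_j * u^(j)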
Computing the directions U
Note: X is now D x N (before it was N x D).
Algorithm (X is the D x N data matrix):
1) Preprocessing:
● scale the features
● make X zero mean
2) Compute the data covariance matrix
3) Perform the eigendecomposition:
● the directions u_i are the eigenvectors of C
● the variance along u_i is the corresponding eigenvalue
Computing the directions U
Step 1) Preprocessing, scaling the features: divide each feature by its range,
x_i^(k) = x_i^(k) / (max_l x_i^(l) - min_l x_i^(l))
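A minimal sketch of this scaling step, assuming X holds one variable per row; the data is made up:

% Scale each feature (row of X) by its range so that no variable dominates the
% covariance just because of its units.
X = randn(3, 50);                             % made-up D x N data
ranges = max(X, [], 2) - min(X, [], 2);       % per-feature range, D x 1
X = X ./ ranges;                              % divide each row by its range
                                              % (implicit expansion; use bsxfun on old MATLAB)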
Computing the directions U
Step 1) Preprocessing, making X zero mean:
● compute the mean data point: μ_i = (1/N) Σ_k x_i^(k)
● subtract the mean from each point: x^(k) = x^(k) - μ
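A small sketch of the zero-mean step, again with made-up data and X stored as D x N:

% Subtract the mean data point from every column so the data is centered at the
% origin before the covariance matrix is computed.
X = randn(3, 50);            % made-up D x N data
mu = mean(X, 2);             % mean data point, D x 1
X = X - mu;                  % subtract mu from each column (implicit expansion)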
Computing the directions U
Step 2) Compute the data covariance matrix:
C = (1/N) X X^T
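A sketch of this step, under the assumption that X has already been made zero mean; names and sizes are illustrative:

% Data covariance matrix of the centered D x N data matrix X.
X = randn(3, 50);
X = X - mean(X, 2);          % the formula assumes zero-mean data
N = size(X, 2);
C = (1/N) * (X * X');        % D x D covariance matrix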
Computing the directions U
Step 3) Perform the eigendecomposition.
A square matrix has eigenvectors: vectors that it maps to a multiple of themselves,
C x = λ x
where x is an eigenvector and λ is the (scalar) eigenvalue.
The directions u_i are the eigenvectors of C, and the variance along u_i is the corresponding eigenvalue.
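A quick numerical check of this defining property, using a made-up symmetric matrix (a covariance matrix is always symmetric):

% Verify C*x = lambda*x for one eigenpair of a small symmetric matrix.
C = [2 1; 1 3];                        % made-up symmetric matrix
[V, Lambda] = eig(C);                  % columns of V are eigenvectors
x = V(:, 1);
lambda = Lambda(1, 1);
difference = C * x - lambda * x;       % numerically (close to) zero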
Computing the directions U
In MATLAB/Octave:
[eigenvectors, eigenvals] = eig(C)  % 'eig' returns the eigenvectors in ascending
                                    % order of eigenvalue, i.e., the 'wrong' order
U = fliplr(eigenvectors)            % so we flip the columns of the matrix;
                                    % U(:, i) now is the i-th direction
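Putting the pieces together, a hedged end-to-end sketch of the whole procedure (feature scaling omitted for brevity; the data and the number of kept components d are made up, and this is not the official lab solution):

% End-to-end sketch: center the data, compute the covariance matrix, take the
% eigendecomposition, order the directions by decreasing variance, and project.
X = randn(4, 200);                    % made-up D x N data
X = X - mean(X, 2);                   % make X zero mean
N = size(X, 2);
C = (1/N) * (X * X');                 % data covariance matrix
[eigenvectors, eigenvals] = eig(C);
U = fliplr(eigenvectors);             % largest-variance direction first
variances = flipud(diag(eigenvals));  % eigenvalues in the same (descending) order
d = 2;
Z = U(:, 1:d)' * X;                   % first d principal components, d x N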