

  1. Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images, Parts I–II
     Aapo Hyvärinen
     Gatsby Unit, University College London
     27 Feb 2017
     Contents: Definition of ICA · Measures of non-Gaussianity · Natural images, sparsity, ICA · Independent subspaces and topography · Image sequences

  2. Outline
     ◮ Part I: Theory of ICA
        ◮ Definition and difference to PCA
        ◮ Importance of non-Gaussianity
     ◮ Part II: Natural images and ICA
        ◮ Application of ICA and sparse coding on natural images
        ◮ Extensions of ICA with dependent components
     ◮ Part III: Estimation of unnormalized models
        ◮ Motivation by extensions of ICA
        ◮ Score matching
        ◮ Noise-contrastive estimation
     ◮ Part IV: Recent extensions of ICA and natural image statistics
        ◮ A three-layer model, towards deep learning

  3. Part I: Theory of ICA
     ◮ Definition of ICA as a non-Gaussian generative model
     ◮ Importance of non-Gaussianity
     ◮ Fundamental difference to PCA
     ◮ Estimation by maximization of non-Gaussianity
     ◮ Measures of non-Gaussianity

  4. Problem of blind source separation
     There are a number of "source signals". Due to some external circumstances, only linear mixtures of the source signals are observed. The goal is to estimate (separate) the original signals.

  5. A solution is possible
     PCA does not recover the original signals.

  6. A solution is possible
     PCA does not recover the original signals. Use information on statistical independence to recover them.

  7. Independent Component Analysis (Hérault and Jutten, 1984–1991)
     ◮ Observed random variables x_i are modelled as linear sums of hidden variables:
           x_i = \sum_{j=1}^{m} a_{ij} s_j,   i = 1, ..., n        (1)
     ◮ Mathematical formulation of the blind source separation problem
     ◮ Not unlike factor analysis
     ◮ The matrix of the a_{ij} is the parameter matrix, called the "mixing matrix"
     ◮ The s_j are hidden random variables called "independent components" or "source signals"
     ◮ Problem: estimate both the a_{ij} and the s_j, observing only the x_i
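The generative model in equation (1) can be sketched numerically. This is a minimal illustration, not from the slides: the sources, the mixing matrix, and the sample size are arbitrary choices (uniform sources are a standard non-Gaussian example).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup with n = m = 2: two independent non-Gaussian
# sources, uniform on [-sqrt(3), sqrt(3)] (zero mean, unit variance).
T = 10000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))  # rows = sources s_j

# An arbitrary (illustrative) mixing matrix with entries a_ij.
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])

# The observed signals: x_i = sum_j a_ij s_j, i.e. X = A S.
X = A @ S

# The mixtures are correlated even though the sources are independent.
print(np.round(np.cov(X), 2))
```

With unit-variance sources, the covariance of the mixtures is A Aᵀ, so the printed matrix is far from diagonal even though cov(S) is (approximately) the identity.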

  8. When can the ICA model be estimated?
     ◮ Must assume:
        ◮ The s_i are mutually statistically independent
        ◮ The s_i are non-Gaussian (non-normal)
        ◮ (Optional:) The number of independent components equals the number of observed variables
     ◮ Then the mixing matrix and the components can be identified (Comon, 1994)
     A very surprising result!

  9. Reminder: Principal component analysis
     ◮ Basic idea: find directions \sum_i w_i x_i of maximum variance
     ◮ We must constrain the norm of w, \sum_i w_i^2 = 1, otherwise the solution is that the w_i are infinite
     ◮ For more than one component, find the direction of maximum variance orthogonal to the components previously found
     ◮ Classic factor analysis has essentially the same idea as PCA: explain maximal variance with a limited number of components
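The constrained maximization above is a standard eigenvalue problem: under the constraint ∑ᵢ wᵢ² = 1, the variance of wᵀx is maximized by the leading eigenvector of the covariance matrix. A small sketch (the data distribution is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D Gaussian data (illustrative covariance).
C_true = np.array([[3.0, 1.0],
                   [1.0, 1.0]])
X = rng.multivariate_normal(np.zeros(2), C_true, size=20000).T

# Maximizing var(w^T x) subject to sum_i w_i^2 = 1 is solved by the
# leading eigenvector of cov(x); the next component is the leading
# direction orthogonal to it, i.e. the next eigenvector.
C = np.cov(X)
eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
w1 = eigvecs[:, -1]                    # first principal direction
w2 = eigvecs[:, -2]                    # second, orthogonal to the first

# Variance along w1 equals the top eigenvalue, and w1 is orthogonal to w2.
print(np.var(w1 @ X), eigvals[-1])
print(abs(w1 @ w2))
```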

  10. Comparison of ICA, PCA, and factor analysis
     ◮ In contrast to PCA and factor analysis, ICA components really give the original source signals or underlying hidden variables
     ◮ In PCA and factor analysis, only a subspace is properly determined (although an arbitrary basis is given as output)
     ◮ Catch: ICA only works when the components are non-Gaussian
     ◮ Many psychological or social-science hidden variables (e.g. "intelligence") may be (practically) Gaussian, because a sum of many independent variables tends to be Gaussian (central limit theorem)
     ◮ But signals measured by sensors are usually quite non-Gaussian

  11. Some examples of non-Gaussianity
     [Figure: three example signals (top row) with their marginal densities (bottom row), illustrating different non-Gaussian distributions.]
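A classical quantitative measure of non-Gaussianity is the (excess) kurtosis, E{y⁴} − 3(E{y²})², which is zero for a Gaussian, negative for "flat" (sub-Gaussian) densities such as the uniform, and positive for "peaked", sparse (super-Gaussian) densities such as the Laplacian. The specific distributions below are illustrative choices, not taken from the figure:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200000

def excess_kurtosis(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2; zero for a Gaussian variable."""
    y = y - y.mean()
    return np.mean(y**4) - 3 * np.mean(y**2)**2

# Three unit-variance examples:
uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), T)  # sub-Gaussian
gauss   = rng.normal(0, 1, T)                      # Gaussian
laplace = rng.laplace(0, 1/np.sqrt(2), T)          # super-Gaussian (sparse)

print(round(excess_kurtosis(uniform), 2))  # ≈ -1.2
print(round(excess_kurtosis(gauss), 2))    # ≈  0.0
print(round(excess_kurtosis(laplace), 2))  # ≈  3.0
```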

  12. Why classic methods cannot find the original components or sources
     ◮ In PCA and FA: find components y_i which are uncorrelated,
           cov(y_i, y_j) = E{y_i y_j} − E{y_i} E{y_j} = 0        (2)
       and maximize the explained variance (or the variance of the components)
     ◮ Such methods need only the covariances cov(x_i, x_j)
     ◮ However, there are many different component sets that are uncorrelated, because:
        ◮ The number of covariances is ≈ n²/2 due to symmetry
        ◮ So we cannot solve for the n² mixing coefficients: not enough information! ("More variables than equations")
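The indeterminacy can be seen directly: rotating white (uncorrelated, unit-variance) data by any orthogonal matrix leaves its covariance unchanged, so covariances alone cannot single out one component set. A minimal sketch, with an arbitrary rotation angle:

```python
import numpy as np

rng = np.random.default_rng(0)

# White data: uncorrelated, unit-variance (illustrative 2-D sample).
Z = rng.standard_normal((2, 50000))

# Any rotation of white data is still white: covariances cannot
# distinguish Z from R Z, so they cannot pin down the mixing.
theta = 0.7  # an arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Z_rot = R @ Z

print(np.round(np.cov(Z), 2))      # ≈ identity
print(np.round(np.cov(Z_rot), 2))  # also ≈ identity
```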

  13. Non-Gaussianity, with independence, gives more information
     ◮ For independent variables we have, for any functions h_1 and h_2,
           E{h_1(y_1) h_2(y_2)} − E{h_1(y_1)} E{h_2(y_2)} = 0        (3)
     ◮ For non-Gaussian variables, these nonlinear covariances give more information than the ordinary covariances alone
     ◮ This is not true for the multivariate Gaussian distribution:
        ◮ The distribution is completely determined by the covariances
        ◮ Uncorrelated Gaussian variables are independent
     ⇒ The ICA model cannot be estimated for Gaussian data.
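A sketch of equation (3) in action, under illustrative choices (uniform sources, h₁(y) = h₂(y) = y², a 45° rotation): both the independent pair and the rotated pair are uncorrelated, but the nonlinear covariance vanishes only for the truly independent pair. For Gaussian sources it would vanish in both cases, which is why Gaussian data carries no extra information.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100000

def nonlinear_cov(y1, y2):
    """Nonlinear covariance E{h(y1) h(y2)} - E{h(y1)} E{h(y2)}, h(y) = y^2."""
    return np.mean(y1**2 * y2**2) - np.mean(y1**2) * np.mean(y2**2)

# Independent uniform (non-Gaussian) sources, and a 45-degree rotation:
# the rotated pair is still uncorrelated, but NOT independent.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))
c = np.sqrt(0.5)
Y = np.array([c * (S[0] - S[1]),
              c * (S[0] + S[1])])

# Ordinary covariance is ~0 for both pairs and cannot tell them apart...
print(round(float(np.cov(S)[0, 1]), 2), round(float(np.cov(Y)[0, 1]), 2))
# ...but the nonlinear covariance is nonzero only for the rotated,
# dependent pair (for these uniform sources it is -0.6 in the limit).
print(round(nonlinear_cov(S[0], S[1]), 2))
print(round(nonlinear_cov(Y[0], Y[1]), 2))
```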

  14. Whitening as preprocessing for ICA
     ◮ Whitening is usually done before ICA
     ◮ Whitening means decorrelation and standardization: E{xx^T} = I
     ◮ After whitening, A can be considered orthogonal:
           E{xx^T} = I = A E{ss^T} A^T = AA^T        (4)
     ◮ Half of the parameters are thus estimated! (and other technical benefits)
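One standard way to whiten (an illustrative sketch, with a hypothetical mixing matrix) is via the eigendecomposition cov(x) = EDEᵀ: the transform V = D^{-1/2}Eᵀ gives z = Vx with cov(z) = I, and the effective mixing VA then comes out (close to) orthogonal, as equation (4) states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixtures of two unit-variance uniform sources (illustrative).
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 50000))
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])
X = A @ S

# Whitening transform V = D^{-1/2} E^T, where cov(x) = E D E^T.
d, E = np.linalg.eigh(np.cov(X))
V = np.diag(d**-0.5) @ E.T
Z = V @ X

print(np.round(np.cov(Z), 2))  # ≈ identity: data is now white

# After whitening, the effective mixing matrix V A is ≈ orthogonal.
M = V @ A
print(np.round(M @ M.T, 2))    # ≈ identity
```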

  15. Illustration
     Two components with uniform distributions: original components, observed mixtures, PCA, ICA.
     PCA does not find the original coordinates; ICA does!
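The illustration can be reproduced end-to-end. As a sketch of "estimation by maximization of non-Gaussianity" (slide 3), the code below uses the classical kurtosis-based fixed-point update w ← E{z (wᵀz)³} − 3w on whitened data; this particular one-unit algorithm is an assumption of this sketch, not something detailed in these slides, and the sources, mixing matrix, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent uniform sources, mixed, then whitened.
T = 50000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, T))
A = np.array([[1.0, 0.5],
              [0.7, 1.2]])
X = A @ S

d, E = np.linalg.eigh(np.cov(X))
Z = np.diag(d**-0.5) @ E.T @ X          # whitened data

# One-unit fixed-point iteration seeking an extremum of the kurtosis
# of w^T z on the unit sphere: w <- E{z (w^T z)^3} - 3 w, then normalize.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w @ Z
    w_new = (Z * y**3).mean(axis=1) - 3 * w
    w_new /= np.linalg.norm(w_new)
    if abs(abs(w_new @ w) - 1) < 1e-10:  # converged (up to sign flip)
        w = w_new
        break
    w = w_new

# w^T z should recover one source up to sign and scaling: it correlates
# almost perfectly with one source and hardly at all with the other.
y = w @ Z
corrs = [abs(np.corrcoef(y, S[i])[0, 1]) for i in range(2)]
print(np.round(corrs, 2))
```

In contrast, projecting onto a principal component of X mixes both sources, which is the point of the slide's comparison.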
