
Sparse PCA / LDA

Baback Moghaddam, Machine Learning Group (baback@jpl.nasa.gov). NASA Sounder Science Team Meeting, November 3rd-5th, 2010.


  1. Baback Moghaddam • Machine Learning Group • baback@jpl.nasa.gov • NASA Sounder Science Team Meeting • November 3rd-5th, 2010

  2. Sparse PCA / LDA • Problem Formulation: $\max_x \frac{x^T A x}{x^T B x}$ subject to $\mathrm{card}(x) \le k$, i.e. $\|x\|_0 \le k$ • General $A$, $B$: symmetric matrices (given)
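The formulation on this slide can be illustrated with a small sketch. The slide does not say which solver was used, so the greedy forward selection below is only an assumed stand-in: it grows the support one index at a time, each time keeping the index that gives the largest generalized eigenvalue of the corresponding submatrices. The function names `top_gen_eig` and `greedy_sparse_gev` are hypothetical, and the sketch assumes $B$ is positive definite on every candidate support.

```python
import numpy as np

def top_gen_eig(A, B):
    """Largest generalized eigenvalue/vector of A x = lam B x (B positive definite)."""
    L = np.linalg.cholesky(B)            # B = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ A @ Linv.T                # equivalent symmetric standard problem
    M = (M + M.T) / 2                    # remove round-off asymmetry
    w, V = np.linalg.eigh(M)             # eigenvalues in ascending order
    return w[-1], Linv.T @ V[:, -1]      # map eigenvector back: x = L^{-T} y

def greedy_sparse_gev(A, B, k):
    """Greedy heuristic (an assumption, not the talk's solver) for
    max x^T A x / x^T B x subject to card(x) <= k."""
    n = A.shape[0]
    support = []
    for _ in range(k):
        best_j, best_val = None, -np.inf
        for j in range(n):
            if j in support:
                continue
            idx = support + [j]
            val, _ = top_gen_eig(A[np.ix_(idx, idx)], B[np.ix_(idx, idx)])
            if val > best_val:
                best_j, best_val = j, val
        support.append(best_j)
    idx = sorted(support)
    val, v = top_gen_eig(A[np.ix_(idx, idx)], B[np.ix_(idx, idx)])
    x = np.zeros(n)
    x[idx] = v                           # zero outside the chosen support
    return x, val
```

With $B = I$ this reduces to a sparse PCA heuristic on the matrix $A$. A greedy search is not guaranteed to find the optimal support; it only gives a lower bound on the true constrained maximum.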

  3. DNA Microarray • $n \sim 10^3$-$10^4$ genes • $m \sim 10^2$ samples • 2 classes: cancer vs. healthy • subject to: $\mathrm{card}(x) = k$
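For a two-class problem like this one, the standard Fisher LDA criterion takes $A$ to be the between-class scatter and $B$ the within-class scatter. The slide does not spell this out, so the sketch below is an assumption about the setup. Because there are far more genes than samples here, the within-class scatter is singular; the small ridge term (`ridge`, a hypothetical parameter) is one common way to keep $B$ positive definite.

```python
import numpy as np

def lda_scatter(X, y, ridge=1e-3):
    """Two-class scatter matrices for Fisher LDA.

    X: (m samples, n features); y: array of 0/1 class labels.
    Returns (A, B) = (between-class scatter, regularized within-class scatter).
    """
    n = X.shape[1]
    X0, X1 = X[y == 0], X[y == 1]
    d = (X1.mean(axis=0) - X0.mean(axis=0))[:, None]
    A = d @ d.T                          # rank-one between-class scatter
    B = np.zeros((n, n))
    for Xc in (X0, X1):
        Z = Xc - Xc.mean(axis=0)         # center each class on its own mean
        B += Z.T @ Z
    B += ridge * np.eye(n)               # assumed regularizer, needed when n > m
    return A, B
```

Maximizing $x^T A x / x^T B x$ with these matrices under $\mathrm{card}(x) = k$ then amounts to choosing the $k$ genes whose combination best separates the two classes.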

  4. [Figure: plot with axes $x_1$ and $x_2$]

  5. AQUA/AIRS radiance data (June 4th, 2007) • 98% of spectral variance is in 500 frequencies (out of 2500 total), yielding 5:1 compression

  6. Hyperion Sulfur Detection • Sparse Classifier (best 12 of 242 channels) • 1024-by-256 imagery of sulfur-rich Borup Fiord glacier, also measured by 242-band Hyperion sensor during 2006-07 • Detection Performance (ROC): for FPR > 1% the 12-band detection rate is as good as using all 242 bands, yielding a 20:1 compression ratio
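The comparison on this slide is between detection rates at a fixed false-positive rate. A minimal sketch of that measurement follows, assuming classifier scores where larger means "more likely sulfur" and 0/1 ground-truth labels; the helper name `tpr_at_fpr` is hypothetical and not taken from the talk.

```python
import numpy as np

def tpr_at_fpr(scores, labels, fpr_target):
    """Detection rate (TPR) at the threshold that lets through roughly
    fpr_target of the negatives."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    neg = scores[labels == 0]
    pos = scores[labels == 1]
    # Threshold at the (1 - fpr_target) quantile of the negative scores
    thr = np.quantile(neg, 1.0 - fpr_target)
    return float(np.mean(pos > thr))
```

Evaluating this for the 12-band and the 242-band classifier at the same `fpr_target` gives one point of each ROC curve for a like-for-like comparison.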

  7. AIRS Dataset • July 2005 • ~20,000 spectra • Section of Pacific with stratocumulus, cumulus and deep convective clouds

  8. AIRS Spectrum

  9. Cloudy/Clear Classifier: 1-freq (H₂O)

  10. Cloudy/Clear Classifier: 2-freqs (CO₂, H₂O)

  11. Cloudy/Clear Classifier: 5-freqs (CO₂, O₃, H₂O)

  12. Cloudy/Clear Classifier: 50-freqs (CO₂, O₃, H₂O)

  13. Current Work
      • Algorithmic Enhancements
          • Formulated Sparse-LDA as a sparse regression problem
          • This speeds up optimization, reducing CPU time by a factor of ~10^3
      • Dataset Preparation
          • Selected suitable hyperspectral datasets from AIRS archive
          • IR spectra for a whole month (huge data matrix = 20,000 × 1843)
          • Visual data in four frequency bands (from AIRS VIS instrument)
      • Demo of AIRS Cloudy/Clear sparse classifier
          • Separation of cloudy from clear data based on Level 1 data
          • Worked with Prof. Yung (Caltech) using their method of cloud separation
          • Tested 2 methods of cloud separation by AIRS Project Scientist G. Aumann
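The regression view of Sparse-LDA can be sketched as follows. For two classes, the LDA direction is known to be proportional to the least-squares coefficients obtained by regressing a class-coded target on the centered features, so a sparse regression yields a sparse discriminant. The slides do not say which sparse regression method was used; the forward stepwise selection below, and the name `forward_select_regression`, are assumptions chosen for illustration.

```python
import numpy as np

def forward_select_regression(X, y, k):
    """Greedy forward selection of k features for least-squares
    regression of a +/-1 class target (two-class LDA surrogate)."""
    Xc = X - X.mean(axis=0)                  # center features
    t = np.where(y == 1, 1.0, -1.0)
    t = t - t.mean()                         # center the coded target
    n = X.shape[1]
    support = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(n):
            if j in support:
                continue
            cols = support + [j]
            coef, *_ = np.linalg.lstsq(Xc[:, cols], t, rcond=None)
            rss = np.sum((t - Xc[:, cols] @ coef) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        support.append(best_j)
    coef, *_ = np.linalg.lstsq(Xc[:, support], t, rcond=None)
    w = np.zeros(n)
    w[support] = coef                        # sparse discriminant direction
    return w, support
```

Each step solves only small least-squares problems rather than generalized eigenproblems. That is one plausible source of a speed-up of this kind, though the factor quoted on the slide refers to the author's own implementation.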

  14. Future Work
      • Methodology
          • Test current algorithm with a 3rd cloud separation criterion (based on CO₂ retrievals) as suggested/used by Bill Irion (JPL)
          • Select more varied AIRS datasets (ocean and land regions separated) and perform cross-dataset validation
      • Missions
          • Meet with AIRS Science Team to discuss and propose new product (L1 D) based on our preliminary Sparse-LDA algorithm results

  15. Publications
