  1. Learning and Applications (Regularization Methods for High Dimensional Learning). Francesca Odone and Lorenzo Rosasco, odone@disi.unige.it - lrosasco@mit.edu

  2. Plan
     - Learning and engineering applications: why?
     - Examples of in-house applications: face and object detection, medical image analysis, microarray data analysis

  3. Let's go back to the beginning
     - The goal is not to memorize but to generalize (that is, to predict).
     - Given a set of data (x_1, y_1), ..., (x_n, y_n), find a function f which is a good predictor of y for a future input x: f(x) = y.
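
A concrete, toy illustration of this setup: the sketch below fits a predictor to noisy samples of an unknown process and evaluates it on a fresh input. The sine target, the polynomial hypothesis space, and all names are our own choices, not from the slides.

```python
import numpy as np

# Toy setup (ours, not from the slides): n noisy samples of an unknown process.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 3.0, size=30)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(30)

# Fit f by least squares over a small hypothesis space (cubic polynomials).
f = np.poly1d(np.polyfit(x_train, y_train, deg=3))

# What matters is prediction on a *future* input, not recall of training points.
x_new = 1.5
print(f"f({x_new}) = {f(x_new):.3f}, true value = {np.sin(x_new):.3f}")
```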

  4. What is it useful for?
     The learning paradigm is useful whenever the underlying process is partially unknown, too complex, or too noisy to be modeled as a sequence of instructions.

  5. The applications we deal with
     - Computer vision: face detection and recognition, object detection, image annotation, dynamic events and actions analysis
     - Medical image analysis: automatic MR annotation, dictionary learning
     - Computational biology: gene selection

  6. Plan
     - Learning and engineering applications: why?
     - Examples of in-house applications: face and object detection, medical image analysis, microarray data analysis

  7. Learning from images
     - Object detection, image categorization and, more generally, image understanding are difficult problems.
     - Learning from examples has been accepted as a viable way to deal with such problems, addressing noise and intra-class variability by collecting appropriate data and finding suitable descriptions.
     - Images are relatively easy to gather.

  8. Image descriptions with overcomplete feature sets
     - Overcomplete, general-purpose sets of features are effective for modeling visual information: many object classes have peculiar intrinsic structures that are better appreciated by looking for symmetries or local geometries.
     - Examples of features: wavelets, ranklets, chirplets, rectangle features, ...
     - Examples of problems: face detection [Heisele et al.; Viola & Jones; Destrero et al.], pedestrian detection [Oren et al.], car detection [Papageorgiou & Poggio].
     - The approach is inspired by biological systems; see, for instance, B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: a strategy employed by V1?", 1997.

  9. Face detection (Destrero et al., 2009)
     The classification problem. This is a (binary) classification problem: each image region either is a face or is not. We start from a training set of face and non-face images {(x_1, y_1), ..., (x_n, y_n)}, where x_i is a raw vector encoding the gray levels of image I_i, and y_i ∈ {-1, 1} according to whether the image is a face or not.
     Image representation. We represent images as rectangle feature vectors: x_i → (φ_1(x_i), ..., φ_p(x_i)).
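
The slides do not detail how the rectangle features φ_j are evaluated. A standard construction, consistent with the Viola-Jones line of work cited earlier, computes differences of box sums in constant time via an integral image; the following is a hedged sketch of that idea (function names and the sample feature are ours).

```python
import numpy as np

def integral_image(img):
    """Cumulative sum table: ii[r, c] = sum of img[:r, :c]."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def rect_sum(ii, r, c, h, w):
    """Sum of pixels in the h x w rectangle with top-left corner (r, c), in O(1)."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def two_rect_feature(ii, r, c, h, w):
    """Haar-like rectangle feature: left half minus right half of a 2w-wide window."""
    return rect_sum(ii, r, c, h, w) - rect_sum(ii, r, c + w, h, w)

# Example on a 19x19 grayscale patch, as in the detector's base window.
patch = np.random.default_rng(0).random((19, 19))
ii = integral_image(patch)
phi_1 = two_rect_feature(ii, 2, 3, 6, 4)  # one coordinate of the feature vector
```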

  10. Face detection
      Assumption. We assume Φβ = Y, where Φ = {Φ_ij} is the data matrix, β = (β_1, ..., β_p)^T is the vector of unknown weights to be estimated, and Y = (y_1, ..., y_n)^T are the output labels. Usually p is big: existence of a solution is ensured, but uniqueness is not, and the overcomplete set contains many correlated features. Thus the problem is ill-posed, and we resort to regularization.
      Selecting face features. L1 regularization allows us to select a sparse subset of meaningful features for the problem, with the aim of discarding correlated ones:
      $$\min_{\beta \in \mathbb{R}^p} \|Y - \Phi\beta\|^2 + \lambda \|\beta\|_1.$$
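
This minimization can be carried out by the thresholded Landweber iteration named on the next slide (also known as ISTA). Below is a minimal sketch, assuming the usual 1/2 factor on the quadratic term (which only rescales λ); it is illustrative, not the authors' implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Component-wise soft thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def thresholded_landweber(Phi, Y, lam, n_iter=500):
    """Minimize (1/2) * ||Y - Phi @ beta||^2 + lam * ||beta||_1 by iterating
    a gradient step on the quadratic term followed by soft thresholding."""
    step = 1.0 / (np.linalg.norm(Phi, 2) ** 2)   # 1 / largest singular value^2
    beta = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ beta - Y)          # gradient of the quadratic term
        beta = soft_threshold(beta - step * grad, lam * step)
    return beta

# Toy usage: p >> n with a sparse ground truth; most coefficients come back zero.
rng = np.random.default_rng(0)
n, p = 50, 200
Phi = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
Y = Phi @ beta_true + 0.01 * rng.standard_normal(n)
beta_hat = thresholded_landweber(Phi, Y, lam=0.5)
print((np.abs(beta_hat) > 1e-8).sum(), "features selected")
```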

  11. A sampled version of the algorithm
      - Applying the algorithm to the entire set of features is not computationally feasible (Φ: 4000 x 64000, about 1 GB).
      - We create many subsets (Subset 1, Subset 2, ..., Subset 200) by randomly sampling 10% of the features of S_0, with repetition.
      - We run the algorithm (thresholded Landweber) separately on each subset, obtaining Selected features 1, 2, ..., 200.
      - We keep only the features selected in every run in which they were present; this gives the set S_1.
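
A hedged sketch of this resampling scheme, reusing the `thresholded_landweber` solver sketched above; the subset size, the zero threshold, and the function names are our assumptions.

```python
import numpy as np

def stable_feature_selection(Phi, Y, lam, n_subsets=200, frac=0.10, seed=0):
    """Run the L1 solver on many random ~10% feature subsets, then keep only
    the features selected in every run in which they were drawn (the set S_1)."""
    rng = np.random.default_rng(seed)
    p = Phi.shape[1]
    times_drawn = np.zeros(p, dtype=int)
    times_selected = np.zeros(p, dtype=int)
    k = max(1, int(frac * p))
    for _ in range(n_subsets):
        # sampling with repetition, as on the slide
        idx = np.unique(rng.choice(p, size=k, replace=True))
        beta = thresholded_landweber(Phi[:, idx], Y, lam)  # solver sketched earlier
        times_drawn[idx] += 1
        times_selected[idx[np.abs(beta) > 1e-8]] += 1
    keep = (times_drawn > 0) & (times_selected == times_drawn)
    return np.flatnonzero(keep)
```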

  12. The final set of face features
      [Figure: positive and negative samples from the training set, with the selected rectangle features overlaid.]
      Notice how vertical symmetries are not captured by the selected features.

  13. The solution depends on the training data
      In the MIT+CMU training set, all images are registered and well cropped; there, the vertical symmetries are captured by the selected features.

  14. Face detection: face classification
      - Elastic net regularization embeds both feature selection and prediction functionalities.
      - As suggested in (Candes & Tao, 2007), one could use L2 regularization on the reduced data representation to improve classification performance.
      - Since a main requirement of our application is real-time performance, we adopt a linear SVM for classification: L1 + SVM gives sparsity both in the representation and in the dataset, and thus fewer computations (see the sketch below).
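
A minimal sketch of the L1 + linear SVM idea using scikit-learn; the data, the `selected` index set, and all parameters are toy stand-ins, not the authors' pipeline.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy stand-in: p features, of which only the columns in `selected`
# would survive the L1 selection stage sketched earlier.
rng = np.random.default_rng(0)
n, p = 400, 1000
X = rng.standard_normal((n, p))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)
selected = np.array([0, 1, 7])            # pretend output of L1 selection

# Train the linear SVM on the sparse representation only ...
clf = LinearSVC(C=1.0).fit(X[:, selected], y)

# ... so classifying a window at test time is one small dot product.
w, b = clf.coef_.ravel(), clf.intercept_[0]
print(X[:3, selected] @ w + b)            # same as clf.decision_function
```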

  15. Face classification results
      [Figure: performance curves comparing 2-stage feature selection, 2-stage feature selection + correlation, Viola-Jones feature selection using our same data, and the Viola-Jones cascade; vertical axis 0.7-1.0, horizontal axis 0-0.02.]
      - Our strategy for feature selection outperforms the one by Viola and Jones on the same dataset.
      - AdaBoost seems to need a large number of examples to be trained effectively (we used just 4000 examples).

  16. From face classification to face detection
      Why is it difficult? Finding a face at any given location in a real image is very unlikely, so the raw classifier produces a high number of false positives. For an image of 384 x 222 pixels, a multi-scale search with a base window of 19 x 19 pixels requires about 6.5 * 10^5 tests, against only 11 faces!
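
To see where a count of that order comes from, here is a rough window-counting sketch; the scale step and stride are our assumptions, and different choices move the total around the quoted order of magnitude.

```python
def count_windows(img_w, img_h, base=19, scale=1.25, stride=1):
    """Rough count of sub-windows tested by a multi-scale sliding-window
    search: at each scale, slide a square window over every position."""
    total, size = 0, base
    while size <= min(img_w, img_h):
        nx = (img_w - size) // stride + 1   # horizontal positions at this scale
        ny = (img_h - size) // stride + 1   # vertical positions at this scale
        total += nx * ny
        size = int(size * scale)            # grow the window and repeat
    return total

print(count_windows(384, 222))  # a few hundred thousand tests for one image
```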

  17. Face detection: a cascade of classifiers
      - For each image we have many tests to do: few positive examples and many negative examples.
      - We build a coarse-to-fine classification architecture: simpler classifiers reject the majority of sub-windows, while more complex classifiers allow us to achieve low false positive rates (see the sketch below).
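
A minimal sketch of the cascade's control flow; the stages below are made-up stand-ins for the real classifiers.

```python
import numpy as np

def cascade_classify(window, stages):
    """Coarse-to-fine cascade: each stage is a (score_fn, threshold) pair.
    Cheap early stages reject the vast majority of sub-windows; only the
    survivors pay for the later, more expensive classifiers."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False          # rejected early: no further computation
    return True                   # passed every stage: report a detection

# Toy stages (stand-ins for the real classifiers):
stages = [
    (lambda w: w.mean(), 0.1),    # very cheap first test
    (lambda w: w.std(), 0.2),     # slightly costlier second filter
]
window = np.random.default_rng(0).random((19, 19))
print(cascade_classify(window, stages))
```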
