Support vector machines and kernels
Thurs Nov 19
Kristen Grauman
UT Austin

Last time
• Sliding window object detection pros and cons
• Attentional cascade
• Object proposals for detection
• Nearest neighbor classification
• Scene recognition example with global descriptors
Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
• Visualizing a feature: Hoggles
• Evaluating an object detector

Window-based models: Three case studies
• Boosting + face detection (e.g., Viola & Jones)
• SVM + person detection (e.g., Dalal & Triggs)
• NN + scene Gist classification (e.g., Hays & Efros)
Slide credit: Kristen Grauman
Recall: Nearest Neighbor classification
• Assign label of nearest training data point to each test data point
[Figure: Voronoi partitioning of feature space for 2-category 2-D data, from Duda et al. Black = negative, red = positive. A novel test example is closest to a positive example from the training set, so it is classified as positive.]

6+ million geotagged photos by 109,788 photographers
Annotated by Flickr users
Slide credit: James Hays
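To make the nearest-neighbor rule concrete, here is a minimal 1-NN sketch in NumPy; the toy points and labels are invented for illustration, not data from the slides:

```python
import numpy as np

def nn_classify(train_X, train_y, test_x):
    """Assign the label of the nearest training point to the test point."""
    dists = np.linalg.norm(train_X - test_x, axis=1)  # Euclidean distance to each training point
    return train_y[np.argmin(dists)]                  # label of the closest one

# Toy 2-D data: -1 = negative (black), +1 = positive (red)
train_X = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
train_y = np.array([-1, -1, +1, +1])
print(nn_classify(train_X, train_y, np.array([3.2, 2.9])))  # -> 1: nearest neighbor is positive
```

Im2gps applies the same rule at scale: a query photo's location estimate comes from its nearest scene matches among the millions of geotagged training photos.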
Im2gps: Scene Matches
[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
Slide credit: James Hays
The Importance of Data
[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]
Slide credit: James Hays

HMM example: Photo Geo-location
Where was this picture taken?
Slide credit: Kristen Grauman
Example: Photo Geo-location
Where was this picture taken?
Slide credit: Kristen Grauman

Example: Photo Geo-location
Where was this picture taken?
Slide credit: Kristen Grauman
Example: Photo Geo-location
Where was each picture in this sequence taken?
Slide credit: Kristen Grauman

Idea: Exploit the beaten path
• Learn dynamics model from “training” tourist photos
• Exploit timestamps and sequences for novel “test” photos
[Chen & Grauman CVPR 2011]
Slide credit: Kristen Grauman
Idea: Exploit the beaten path
[Chen & Grauman CVPR 2011]
Slide credit: Kristen Grauman

Hidden Markov Model
[Figure: three hidden states (State 1, State 2, State 3) with transition probabilities P(S_j | S_i) between every pair of states plus self-transitions P(S_i | S_i); a prior P(State) and an emission model P(Observation | State) generate an observation from each state.]
Slide credit: Kristen Grauman
Discovering a city’s locations
Define states with a data-driven approach: mean shift clustering on the GPS coordinates of the training images (e.g., New York)

Observation model
P(Observation | State), e.g., P(observed image | Liberty Island)
[Figure: three location states (Location 1, Location 2, Location 3) with transition probabilities P(L_j | L_i) between every pair of locations plus self-transitions P(L_i | L_i).]
Slide credit: Kristen Grauman
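Given learned transition and observation models, the most likely location sequence for a novel photo stream can be decoded with the standard Viterbi algorithm. Below is a minimal sketch; the prior, transition matrix, and per-photo emission log-likelihoods are hypothetical placeholders, not the model learned in the paper:

```python
import numpy as np

def viterbi(prior, trans, emit_ll):
    """Most likely state sequence for an HMM.
    prior:   (S,)   initial state probabilities P(state)
    trans:   (S, S) transition probabilities, trans[i, j] = P(L_j | L_i)
    emit_ll: (T, S) log-likelihoods log P(observation_t | state)
    """
    T, S = emit_ll.shape
    score = np.log(prior) + emit_ll[0]          # best log-prob of a path ending in each state
    back = np.zeros((T, S), dtype=int)          # backpointers for recovering the path
    for t in range(1, T):
        cand = score[:, None] + np.log(trans)   # cand[i, j]: best path into i, then i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emit_ll[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):               # trace backpointers from the final state
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# 3 locations, 4 photos; all numbers invented for illustration
prior = np.array([0.5, 0.3, 0.2])
trans = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]])
emit_ll = np.log(np.array([[0.7, 0.2, 0.1],
                           [0.6, 0.3, 0.1],
                           [0.1, 0.8, 0.1],
                           [0.1, 0.7, 0.2]]))
print(viterbi(prior, trans, emit_ll))  # -> [0, 0, 1, 1]
```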
Observation model
[Figure]
Slide credit: Kristen Grauman

Location estimation accuracy
[Figure]
Slide credit: Kristen Grauman
Qualitative Result – New York
Slide credit: Kristen Grauman

Discovering travel guides’ beaten paths
Routes from travel guide book for New York vs. random walks in the learned HMM
Slide credit: Kristen Grauman
Video textures
• Schodl, Szeliski, Salesin, Essa; Siggraph 2000.
• http://www.cc.gatech.edu/cpl/projects/videotexture/

Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
• Visualizing a feature: Hoggles
• Evaluating an object detector
Window-based models: Three case studies
• Boosting + face detection (e.g., Viola & Jones)
• SVM + person detection (e.g., Dalal & Triggs)
• NN + scene Gist classification (e.g., Hays & Efros)
Slide credit: Kristen Grauman

Linear classifiers
Linear classifiers
• Find a linear function to separate positive and negative examples:
  x_i positive: w · x_i + b ≥ 0
  x_i negative: w · x_i + b < 0
• Which line is best?

Support Vector Machines (SVMs)
• Discriminative classifier based on optimal separating line (for the 2-d case)
• Maximize the margin between the positive and negative training examples
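As a sketch, the decision rule above is just one dot product plus a sign; the w and b below are arbitrary values for illustration, not a fitted model:

```python
import numpy as np

def linear_classify(w, b, x):
    """Return +1 if w . x + b >= 0 (positive side of the line), else -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1

w, b = np.array([1.0, -1.0]), 0.5                   # arbitrary line for illustration
print(linear_classify(w, b, np.array([2.0, 1.0])))  # 1:  2 - 1 + 0.5 = 1.5 >= 0
print(linear_classify(w, b, np.array([0.0, 2.0])))  # -1: 0 - 2 + 0.5 = -1.5 < 0
```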
Support vector machines
• Want the line that maximizes the margin.
  x_i positive (y_i = 1): w · x_i + b ≥ 1
  x_i negative (y_i = −1): w · x_i + b ≤ −1
  For support vectors: w · x_i + b = ±1
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Support vector machines
• Want the line that maximizes the margin.
  x_i positive (y_i = 1): w · x_i + b ≥ 1
  x_i negative (y_i = −1): w · x_i + b ≤ −1
  For support vectors: w · x_i + b = ±1
  Distance between point x_i and the line: |w · x_i + b| / ||w||
  For support vectors: (wᵀx + b) / ||w|| = ±1 / ||w||, so the margin is
  M = 1/||w|| − (−1/||w||) = 2/||w||
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
Support vector machines
• Want the line that maximizes the margin.
  x_i positive (y_i = 1): w · x_i + b ≥ 1
  x_i negative (y_i = −1): w · x_i + b ≤ −1
  For support vectors: w · x_i + b = ±1
  Distance between point x_i and the line: |w · x_i + b| / ||w||
  Therefore, the margin is M = 2 / ||w||.
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line
1. Maximize margin 2/||w||
2. Correctly classify all training data points:
   x_i positive (y_i = 1): w · x_i + b ≥ 1
   x_i negative (y_i = −1): w · x_i + b ≤ −1
Quadratic optimization problem:
  Minimize (1/2) wᵀw subject to y_i (w · x_i + b) ≥ 1
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
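In practice this quadratic program is handed to an off-the-shelf solver. A minimal scikit-learn sketch, assuming toy separable data invented for illustration and a very large C to approximate the hard-margin problem:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data (invented for illustration)
X = np.array([[0, 0], [1, 1], [3, 3], [4, 4]], dtype=float)
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # very large C approximates the hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("margin 2/||w|| =", 2 / np.linalg.norm(w))  # width of the margin
print("support vectors:", clf.support_vectors_)   # points with y_i (w . x_i + b) = 1
```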
Finding the maximum margin line
• Solution: w = Σ_i α_i y_i x_i
  (learned weights α_i; the support vectors are the x_i with α_i > 0)
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

Finding the maximum margin line
• Solution: w = Σ_i α_i y_i x_i
  b = y_i − w · x_i (for any support vector, since w · x_i + b = y_i)
• Classification function:
  f(x) = sign(w · x + b) = sign(Σ_i α_i y_i x_i · x + b)
  If f(x) < 0, classify as negative; if f(x) > 0, classify as positive.
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
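A sketch of this dual form of the classifier, i.e., f(x) computed from the support vectors alone; the names are illustrative (in scikit-learn, `clf.dual_coef_` stores the products α_i y_i for the support vectors):

```python
import numpy as np

def svm_decision(alpha, y_sv, X_sv, b, x):
    """f(x) = sign( sum_i alpha_i y_i (x_i . x) + b ), summed over support vectors only."""
    return np.sign(np.dot(alpha * y_sv, X_sv @ x) + b)

# Equivalently, fold the sum into a single weight vector once:
#   w = (alpha * y_sv) @ X_sv;  f(x) = sign(w . x + b)
```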
Questions
• What if the data is not linearly separable?

What if the data is not linearly separable?
• Separable: min_{w,b} (1/2)||w||² subject to y_i (w · x_i + b) ≥ 1
• Non-separable: min_{w,b} (1/2)||w||² + C Σ_{i=1..n} ξ_i
  subject to y_i (w · x_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0
• C: tradeoff constant; ξ_i: slack variable (positive)
• Whenever the margin is ≥ 1, ξ_i = 0
• Whenever the margin is < 1, ξ_i = 1 − y_i (w · x_i + b)
Slide credit: Lana Lazebnik
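At the optimum the slack variables take exactly the closed form above, which is the hinge loss; a one-function sketch with illustrative inputs:

```python
import numpy as np

def slacks(w, b, X, y):
    """xi_i = max(0, 1 - y_i (w . x_i + b)); zero whenever the margin constraint holds."""
    return np.maximum(0.0, 1.0 - y * (X @ w + b))
```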
Today
• HMM examples
• Support vector machines (SVM)
  – Basic algorithm
  – Kernels
    • Structured input spaces: Pyramid match kernels
  – Multi-class
  – HOG + SVM for person detection
• Visualizing a feature: Hoggles
• Evaluating an object detector

Non-linear SVMs
• Datasets that are linearly separable with some noise work out great:
[Figure: 1-D data on the x axis, separable by a single threshold at 0.]
• But what are we going to do if the dataset is just too hard?
[Figure: 1-D data on the x axis with positives and negatives interleaved; no threshold separates them.]
• How about… mapping data to a higher-dimensional space:
[Figure: the same data lifted to (x, x²); a line now separates the two classes.]
Non-linear SVMs: feature spaces
• General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable:
  Φ: x → φ(x)
Slide from Andrew Moore’s tutorial: http://www.autonlab.org/tutorials/svm.html

Nonlinear SVMs
• The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that
  K(x_i, x_j) = φ(x_i) · φ(x_j)
• This gives a nonlinear decision boundary in the original feature space:
  f(x) = sign(Σ_i α_i y_i K(x_i, x) + b)
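A quick numerical check of the kernel trick for a degree-2 polynomial kernel: K(x, z) = (x · z)² equals an explicit dot product in the lifted space φ(x) = (x₁², √2·x₁x₂, x₂²). The test vectors are arbitrary values chosen for illustration:

```python
import numpy as np

def phi(x):
    """Explicit lift of a 2-D point into 3-D feature space."""
    return np.array([x[0]**2, np.sqrt(2.0) * x[0] * x[1], x[1]**2])

def K(x, z):
    """Degree-2 polynomial kernel: the same dot product, never forming phi."""
    return np.dot(x, z) ** 2

x, z = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(K(x, z), np.dot(phi(x), phi(z)))  # both print 16.0
```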