Training Linear SVMs – by Thorsten Joachims, Prasad Seemakurthi
Agenda • What is an SVM • Kernels • Hard Margins • Soft Margins • Linear Algorithm • A Few Examples • Conclusion
SVM – Curtain Raiser • A linear classification algorithm • SVMs have a clever way to prevent over-fitting • SVMs have a very clever way to use a huge number of features without requiring nearly as much computation as seems to be necessary
Linear Classifiers (Intuition) [Figure: a linear classifier producing an estimated label y_est]
Linear Classifiers [Figure: points labelled +1 and -1 with several candidate separating lines] Any of these separators would be fine … but which is best?
Linear Classifier
Maximum Margin 1. Maximizing the margin is good according to intuition and PAC theory 2. It implies that only the support vectors are important 3. It works well empirically. The classifier with the maximum margin is the simplest kind of SVM, called a Linear SVM.
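To make the support-vector idea concrete, here is a minimal scikit-learn sketch (not part of the original slides): a linear SVM fitted to a tiny toy dataset, printing the learned weight vector and the support vectors that define the margin. The toy data and the large C value (approximating a hard margin) are illustrative assumptions.

```python
# Minimal sketch (not from the slides): fit a linear SVM on a toy 2-D dataset
# and inspect the support vectors that define the maximum-margin separator.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)   # very large C approximates a hard margin
clf.fit(X, y)

print("weight vector w:", clf.coef_[0])
print("bias b:", clf.intercept_[0])
print("support vectors:\n", clf.support_vectors_)  # only these points matter
```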
Maximizing the margin
Why maximize the margin? Points near the decision surface lead to uncertain classification decisions (50% either way). A classifier with a large margin makes no low-certainty classification decisions, which gives a classification safety margin with respect to slight errors in measurement.
Why maximize the margin? • SVM classifier: large margin around the decision boundary • Compared to a bare decision hyperplane: place a fat separator between the classes • Fewer choices of where it can be put • Decreased memory capacity • Increased ability to generalize correctly to test data
Linear SVM Mathematically
Linear (Hard-Margin) SVM – Formulation
Solving the Optimization Problem • Find $w$ and $b$ such that $\Phi(w) = \frac{1}{2}\, w^{\top} w$ is minimized, subject to, for all $\{(x_i, y_i)\}$: $y_i\,(w^{\top} x_i + b) \ge 1$. The solution involves constructing a dual problem in which a Lagrange multiplier $\alpha_i$ is associated with every constraint in the primal problem:
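Written out cleanly (a standard reconstruction, since the slide's own equations are only partially legible in this extract), the hard-margin primal and the corresponding dual are:

```latex
% Hard-margin primal
\min_{w,\,b}\ \tfrac{1}{2}\,w^{\top}w
\quad\text{s.t.}\quad y_i\bigl(w^{\top}x_i + b\bigr) \ge 1,\qquad i = 1,\dots,N

% Lagrangian dual (one multiplier \alpha_i per constraint)
\max_{\alpha}\ \sum_{i=1}^{N}\alpha_i
  - \tfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i\alpha_j\,y_i y_j\,x_i^{\top}x_j
\quad\text{s.t.}\quad \alpha_i \ge 0,\quad \sum_{i=1}^{N}\alpha_i y_i = 0
```

The solution satisfies $w = \sum_i \alpha_i y_i x_i$, and the points with $\alpha_i > 0$ are exactly the support vectors.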
Dataset with noise – what is the problem?
Soft Margin Classification • Slack variables $\xi_i$ can be added to allow misclassification of difficult or noisy examples. What should our quadratic optimization criterion be? Minimize $\frac{1}{2}\, w^{\top} w + C \sum_{i=1}^{N} \xi_i$
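A small illustration (an assumed example, not from the slides) of the soft-margin trade-off parameter C: small C tolerates more slack (wider margin, more training errors), while large C penalizes slack heavily and fits the training set more tightly.

```python
# Soft-margin trade-off sketch: vary C on an overlapping ("noisy") dataset.
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Two overlapping clusters, i.e. a dataset that is not perfectly separable.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = LinearSVC(C=C, max_iter=10000)
    clf.fit(X, y)
    print(f"C={C:>6}: training accuracy = {clf.score(X, y):.3f}")
```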
Hard vs. Soft Margin SVM • Hard margin does not require guessing the cost parameter (it requires no parameters at all) • Soft margin always has a solution • Soft margin is more robust to outliers and gives smoother surfaces (in the non-linear case)
Algorithm
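The algorithm slide's content did not survive extraction. As a stand-in illustration only (not necessarily the algorithm from the original slides, which in Joachims' work on training linear SVMs is a cutting-plane method), here is a Pegasos-style stochastic subgradient sketch for the soft-margin objective above; the bias term is omitted and the data and parameters are assumptions.

```python
# Illustrative stand-in only: Pegasos-style SGD for the soft-margin objective
#   min_w  (lam/2) * ||w||^2 + (1/N) * sum_i max(0, 1 - y_i * (w . x_i))
# (bias omitted for simplicity)
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """X: (N, d) array, y: labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(N):
            t += 1
            eta = 1.0 / (lam * t)             # decaying step size
            if y[i] * (w @ X[i]) < 1:         # hinge loss is active
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:                             # only the regularizer contributes
                w = (1 - eta * lam) * w
    return w

# Tiny usage example on toy data separable through the origin.
X = np.array([[-2.0, -1.0], [-3.0, -2.0], [2.0, 1.0], [3.0, 2.0]])
y = np.array([-1, -1, 1, 1])
w = train_linear_svm(X, y)
print("learned w:", w, " predictions:", np.sign(X @ w))
```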
SVM Applications • SVMs have been used successfully in many real-world applications • Text (and hypertext) categorization • Image classification • Bioinformatics (protein classification, cancer classification) • Handwritten character classification
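As a hedged sketch of the text-categorization use case (an assumed example, not from the slides): a TF-IDF bag-of-words representation fed into a linear SVM is a common pipeline for this task.

```python
# Illustrative sketch: spam/ham text categorization with TF-IDF + linear SVM.
# The documents and labels below are made-up toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["cheap meds buy now", "meeting moved to friday",
        "win money now click here", "agenda for the project meeting"]
labels = ["spam", "ham", "spam", "ham"]

model = make_pipeline(TfidfVectorizer(), LinearSVC(C=1.0))
model.fit(docs, labels)
print(model.predict(["free money, click now", "see you at the meeting"]))
```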