
Learning From Data, Lecture 18: Radial Basis Functions



  1. Learning From Data, Lecture 18: Radial Basis Functions. Topics: Non-Parametric RBF; Parametric RBF; k-RBF-Network. M. Magdon-Ismail, CSCI 4100/6100.

  2. Recap: Data Condensation and Nearest Neighbor Search. Data condensation: a training-set-consistent condensed set. Branch and bound for finding nearest neighbors (figure: test point $x$ and clusters $S_1$, $S_2$). Lloyd's algorithm for finding a good clustering. Creator: Malik Magdon-Ismail.

  3. Radial Basis Functions (RBF). k-Nearest Neighbor: only considers the k nearest neighbors; each neighbor has equal weight. What about using all the data to compute $g(x)$? RBF: use all the data; data further away from $x$ have less weight.

  4–8. Weighting the Data Points: $\alpha_n$. For a test point $x$, $\alpha_n$ is the weight of $x_n$ in $g(x)$:

     $$\alpha_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right)$$

     The weight is a decreasing function of $\|x - x_n\|$. The weighting depends on the distance $\|x - x_n\|$ relative to a scale parameter $r$, and the kernel $\phi$ determines how the weighting decreases with distance. Example kernels:

     Gaussian (most popular): $\phi(z) = e^{-\frac{1}{2} z^2}$

     Window (mimics k-NN): $\phi(z) = 1$ for $z \le 1$, and $\phi(z) = 0$ for $z > 1$
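The two example kernels and the weights $\alpha_n(x)$ can be sketched in a few lines of NumPy. This is only an illustration of the formulas on the slide; the function names and the three data points are hypothetical and not part of the lecture.

```python
import numpy as np

def gaussian_kernel(z):
    """Gaussian kernel: phi(z) = exp(-z^2 / 2)."""
    return np.exp(-0.5 * np.asarray(z, dtype=float) ** 2)

def window_kernel(z):
    """Window kernel: phi(z) = 1 if z <= 1, else 0."""
    return (np.asarray(z, dtype=float) <= 1.0).astype(float)

def rbf_weights(x, X, r, kernel=gaussian_kernel):
    """Return alpha_n(x) = phi(||x - x_n|| / r) for every row x_n of X."""
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    dists = np.linalg.norm(X - x, axis=1)  # ||x - x_n|| for each n
    return kernel(dists / r)

# Hypothetical 1-D data points, chosen only for illustration.
X = np.array([[0.0], [1.0], [3.0]])
x = np.array([0.0])

alpha_gauss = rbf_weights(x, X, r=1.0)                         # smooth decay with distance
alpha_window = rbf_weights(x, X, r=1.0, kernel=window_kernel)  # hard cutoff at distance r
```

With the window kernel only points within distance $r$ of $x$ get nonzero weight, while the Gaussian kernel gives every point some weight that shrinks with distance.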

  9. Nonparametric RBF – Regression. With $\alpha_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right)$ for each data point $(x_n, y_n)$, the final hypothesis is a weighted average of the target values:

     $$g(x) = \sum_{n=1}^{N} \frac{\alpha_n(x)}{\sum_{m=1}^{N} \alpha_m(x)} \cdot y_n$$

  10. Nonparametric RBF – Classification. With $\alpha_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right)$ for each data point $(x_n, y_n)$, take the sign of the weighted average of the target values:

      $$g(x) = \operatorname{sign}\!\left(\sum_{n=1}^{N} \frac{\alpha_n(x)}{\sum_{m=1}^{N} \alpha_m(x)} \cdot y_n\right)$$

  11. Nonparametric RBF – Logistic Regression. With $\alpha_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right)$ for each data point $(x_n, y_n)$, take the weighted average of the indicators of the $+1$ targets:

      $$g(x) = \sum_{n=1}^{N} \frac{\alpha_n(x)}{\sum_{m=1}^{N} \alpha_m(x)} \cdot [\![\, y_n = +1 \,]\!]$$

      where $[\![\cdot]\!]$ is 1 when its argument is true and 0 otherwise.
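The three nonparametric RBF hypotheses (regression, classification, logistic regression) share the same normalized weights, so one function can cover all of them. The following is a minimal sketch assuming a Gaussian kernel and targets $y_n \in \{-1, +1\}$ for the classification and probability modes; the function name, the `mode` argument and the toy data are hypothetical.

```python
import numpy as np

def nonparametric_rbf(x, X, y, r, mode="regression"):
    """Nonparametric RBF prediction at a single test point x (Gaussian kernel assumed)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    dists = np.linalg.norm(X - np.asarray(x, dtype=float), axis=1)
    alpha = np.exp(-0.5 * (dists / r) ** 2)  # alpha_n(x)
    total = alpha.sum()
    if total == 0.0:
        # Every weight underflowed: x is too far from all data relative to r.
        raise ValueError("all weights are zero; try a larger r")
    weights = alpha / total  # alpha_n(x) / sum_m alpha_m(x)

    if mode == "regression":      # weighted average of y_n
        return float(weights @ y)
    if mode == "classification":  # sign of the weighted average
        return float(np.sign(weights @ y))
    if mode == "probability":     # weighted fraction of y_n = +1
        return float(weights @ (y == 1).astype(float))
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical two-point data set, for illustration only.
X = np.array([[0.0], [2.0]])
y = np.array([+1.0, -1.0])
x = np.array([0.5])

reg = nonparametric_rbf(x, X, y, r=1.0, mode="regression")
cls = nonparametric_rbf(x, X, y, r=1.0, mode="classification")
prob = nonparametric_rbf(x, X, y, r=1.0, mode="probability")
```

Every prediction touches all $N$ data points, which is where the computational cost of the nonparametric approach comes from.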

  12. Choice of Scale r.
      Nearest neighbor, choosing k (figure panels: k = 1, k = 3, k = 11): k = 3; $k = \sqrt{N}$; cross-validation (CV).
      Nonparametric RBF, choosing r (figure panels: r = 0.01, r = 0.05, r = 0.5): $r \sim \frac{1}{\sqrt[2d]{N}}$; CV.
      Small r overfits; large r underfits.
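Both ways of choosing $r$ mentioned on the slide can be sketched briefly: the rule of thumb $r \sim N^{-1/(2d)}$ and selection by cross-validation. The sketch below assumes Gaussian-kernel regression and leave-one-out CV with squared error; the helper names, the fallback to the mean target when all weights underflow, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def rule_of_thumb_r(N, d):
    """Rule of thumb from the slide: r ~ 1 / N^(1/(2d))."""
    return N ** (-1.0 / (2 * d))

def loo_cv_error(X, y, r):
    """Leave-one-out squared error of nonparametric Gaussian RBF regression."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    alpha = np.exp(-0.5 * (dists / r) ** 2)
    np.fill_diagonal(alpha, 0.0)  # leave each point out of its own prediction
    totals = alpha.sum(axis=1)
    safe_totals = np.where(totals > 0, totals, 1.0)
    # Assumption: fall back to the mean target if every weight underflows to zero.
    preds = np.where(totals > 0, (alpha @ y) / safe_totals, y.mean())
    return float(np.mean((preds - y) ** 2))

# Synthetic 1-D data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(50)

candidates = [0.01, 0.05, 0.5, rule_of_thumb_r(len(X), X.shape[1])]
errors = {r: loo_cv_error(X, y, r) for r in candidates}
best_r = min(errors, key=errors.get)  # candidate with the lowest LOO error
```

The rule of thumb only makes sense when the inputs are scaled so that distances are comparable to 1, which is why cross-validation over a small grid of candidates is the safer default.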

  13. Highlights of Nonparametric RBF
      1. Simple ('smooth' version of the k-NN rule).
      2. No training.
      3. Near optimal $E_{\text{out}}$.
      4. Easy to justify classification to customer.
      5. Can do classification, multi-class, regression, logistic regression.
      (Items 1 to 5: a good method!)
      6. Computationally demanding.

  14–16. Scaled Bumps on Each Data Point. Start from the weighted average of the $y_n$:

     $$g(x) = \sum_{n=1}^{N} \frac{\alpha_n(x)}{\sum_{m=1}^{N} \alpha_m(x)} \cdot y_n$$

     Substituting $\alpha_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right)$ in the numerator and regrouping:

     $$g(x) = \sum_{n=1}^{N} \frac{y_n}{\sum_{m=1}^{N} \alpha_m(x)} \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right) = \sum_{n=1}^{N} w_n(x) \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right), \qquad w_n(x) = \frac{y_n}{\sum_{m=1}^{N} \alpha_m(x)}$$

     This is a sum of bumps at the $x_n$, each scaled by $w_n(x)$.

  17–19. Parametric RBF – A Linear Model.

     Nonparametric RBF (figure: three-point example with r = 0.1 and r = 0.3):

     $$g(x) = \sum_{n=1}^{N} w_n(x) \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right)$$

     Only need to specify $r$.

     Parametric RBF (figure: the same three-point example with r = 0.1 and r = 0.3):

     $$h(x) = \sum_{n=1}^{N} w_n \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right)$$

     Fix $r$; need to determine the parameters $w_n$ so as to fit the data. Does this overfit the data?

  20–21. RBF-Nonlinear Transform Depends on Data.

     $$h(x) = \sum_{n=1}^{N} w_n \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right) = w^t z$$

     $$z = \Phi(x) = \begin{bmatrix} \phi_1(x) \\ \phi_2(x) \\ \vdots \\ \phi_N(x) \end{bmatrix}, \qquad \phi_n(x) = \phi\!\left(\frac{\|x - x_n\|}{r}\right), \qquad Z = \begin{bmatrix} \Phi(x_1)^t \\ \Phi(x_2)^t \\ \vdots \\ \Phi(x_N)^t \end{bmatrix} = \begin{bmatrix} z_1^t \\ z_2^t \\ \vdots \\ z_N^t \end{bmatrix}$$

     Fit the data ($h(x_n) = y_n$):

     $$w = Z^{\dagger} y = (Z^t Z)^{-1} Z^t y$$

     (Figure: the resulting fits for r = 0.1 and r = 0.3.)
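The parametric fit $w = Z^{\dagger} y$ can be sketched with NumPy's pseudo-inverse. This is a minimal illustration assuming a Gaussian kernel; the function names and the three data points are hypothetical and are not the three-point example from the slides.

```python
import numpy as np

def rbf_features(X_eval, centers, r):
    """Matrix whose (i, n) entry is phi(||x_i - center_n|| / r), Gaussian phi assumed."""
    X_eval = np.asarray(X_eval, dtype=float)
    centers = np.asarray(centers, dtype=float)
    dists = np.linalg.norm(X_eval[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-0.5 * (dists / r) ** 2)

def fit_parametric_rbf(X, y, r):
    """Solve for w with one bump per data point: w = pinv(Z) y."""
    Z = rbf_features(X, X, r)  # N x N; row n is Phi(x_n)^t
    return np.linalg.pinv(Z) @ np.asarray(y, dtype=float)

# Hypothetical three-point data set, for illustration only.
X = np.array([[0.0], [0.5], [1.0]])
y = np.array([1.0, -1.0, 2.0])
r = 0.3

w = fit_parametric_rbf(X, y, r)
h_train = rbf_features(X, X, r) @ w  # h(x_n) on the training points
```

With $N$ bumps and $N$ distinct data points the Gaussian $Z$ is invertible, so the fit passes through every training target exactly, which is the overfitting risk raised on the earlier slide.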

  22. Reducing the Number of Bumps.

      Nonparametric:

      $$g(x) = \sum_{n=1}^{N} w_n(x) \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right)$$

      Parametric, N centers:

      $$h(x) = \sum_{n=1}^{N} w_n \cdot \phi\!\left(\frac{\|x - x_n\|}{r}\right)$$

      Parametric, k centers $\mu_j$ (the hypothesis is nonlinear in the $\mu_j$):

      $$h(x) = w_0 + \sum_{j=1}^{k} w_j \cdot \phi\!\left(\frac{\|x - \mu_j\|}{r}\right) = w^t \Phi(x)$$

      where $\Phi(x)^t = [1, \Phi_1(x), \ldots, \Phi_k(x)]$ and $\Phi_j(x) = \phi\!\left(\frac{\|x - \mu_j\|}{r}\right)$.
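Once the $k$ centers $\mu_j$ are fixed, the model is linear in $w$ and can be fit the same way as before. The sketch below picks the centers with a basic version of Lloyd's algorithm (mentioned in the recap) and then solves for $w$ with a pseudo-inverse. Choosing the centers this way is an assumption made for illustration, as are the function names, the Gaussian kernel and the synthetic data.

```python
import numpy as np

def lloyd_centers(X, k, n_iter=50, seed=0):
    """Basic Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def design_matrix(X, centers, r):
    """Rows are Phi(x)^t = [1, Phi_1(x), ..., Phi_k(x)] with Gaussian bumps."""
    X = np.asarray(X, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    bumps = np.exp(-0.5 * (dists / r) ** 2)
    return np.hstack([np.ones((len(X), 1)), bumps])

def fit_k_rbf(X, y, k, r):
    """Fix the centers first, then solve the linear problem for w = [w_0, ..., w_k]."""
    centers = lloyd_centers(X, k)
    w = np.linalg.pinv(design_matrix(X, centers, r)) @ np.asarray(y, dtype=float)
    return w, centers

def predict_k_rbf(X_new, w, centers, r):
    return design_matrix(X_new, centers, r) @ w

# Synthetic 1-D data, for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(40, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(40)

w, centers = fit_k_rbf(X, y, k=4, r=0.3)
train_mse = float(np.mean((predict_k_rbf(X, w, centers, r=0.3) - y) ** 2))
```

Fixing the centers before fitting sidesteps the nonlinearity in $\mu_j$: only the $k + 1$ weights are learned from the targets, so the model has far fewer parameters than the $N$-bump version.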
