Safe Grid Search with Optimal Complexity
E. Ndiaye (Riken AIP)
Joint work with: T. Le, O. Fercoq, J. Salmon, I. Takeuchi

Hyperparameter Tuning

Learning task: $\hat\beta^{(\lambda)} \in \arg\min_{\beta \in \mathbb{R}^p} f(X_{\mathrm{train}}\beta) + \lambda\,\Omega(\beta)$

Evaluation: $E_v(\hat\beta^{(\lambda)}) = \mathcal{L}(y_{\mathrm{test}}, X_{\mathrm{test}}\hat\beta^{(\lambda)})$

[Figure: two panels, each showing a validation curve at machine precision. x-axis: regularization hyperparameter $\lambda$, from $\lambda_{\min}$ to $\lambda_{\max}$; y-axis: $\|y_{\mathrm{test}} - X_{\mathrm{test}}\beta^{(\lambda)}\|_2$.]

How to approximate the best hyperparameter?

Hyperparameter Tuning

The optimal hyperparameter is given by

$\arg\min_{\lambda \in [\lambda_{\min}, \lambda_{\max}]} E_v(\hat\beta^{(\lambda)}) = \mathcal{L}(y_{\mathrm{test}}, X_{\mathrm{test}}\hat\beta^{(\lambda)})$

s.t. $\hat\beta^{(\lambda)} \in \arg\min_{\beta \in \mathbb{R}^p} f(X_{\mathrm{train}}\beta) + \lambda\,\Omega(\beta)$

Issues:
- The objective $\lambda \mapsto E_v(\hat\beta^{(\lambda)})$ is non-smooth and non-convex.
- Often, it is impractical to evaluate $E_v(\hat\beta^{(\lambda)})$.
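As a rough illustration of the bilevel problem above, the sketch below runs a plain grid search over $\lambda$ on a one-feature Lasso, where the inner problem has a closed-form soft-thresholding solution. The toy data, the geometric grid, and all function names are assumptions made for this example; they do not come from the slides or from the authors' code.

```python
import math

def soft_threshold(z, lam):
    """Shrink z toward zero by lam (proximal operator of lam * |.|)."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def lasso_1d(x, y, lam):
    """Closed-form minimizer of 0.5 * sum((y_i - x_i * b)^2) + lam * |b|."""
    xty = sum(xi * yi for xi, yi in zip(x, y))
    xtx = sum(xi * xi for xi in x)
    return soft_threshold(xty, lam) / xtx

def validation_error(x_test, y_test, b):
    """E_v(b) = || y_test - x_test * b ||_2."""
    return math.sqrt(sum((yi - xi * b) ** 2 for xi, yi in zip(x_test, y_test)))

def geometric_grid(lam_max, lam_min, n):
    """n values decreasing geometrically from lam_max to lam_min."""
    ratio = (lam_min / lam_max) ** (1.0 / (n - 1))
    return [lam_max * ratio ** t for t in range(n)]

# Toy data (made up for illustration).
x_train = [1.0, 2.0, -1.0, 0.5]
y_train = [1.2, 1.9, -0.7, 0.9]
x_test = [1.5, -0.5, 1.0]
y_test = [1.0, -0.4, 0.9]

# For lam >= |x^T y| the Lasso solution is exactly zero.
lam_max = abs(sum(xi * yi for xi, yi in zip(x_train, y_train)))
grid = geometric_grid(lam_max, 1e-3 * lam_max, 20)

# Outer problem: evaluate the validation error at each grid point.
errors = [validation_error(x_test, y_test, lasso_1d(x_train, y_train, lam))
          for lam in grid]
best_lam = grid[errors.index(min(errors))]
```

A fixed grid like this gives no guarantee on how far `min(errors)` is from the minimum over the whole interval, which is the question the rest of the talk addresses.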

Tracking the curve of solutions

$\hat\beta^{(\lambda)} \in \arg\min_{\beta \in \mathbb{R}^p} f(X\beta) + \lambda\,\Omega(\beta)$

Exact path: for $(f, \Omega)$ = (piecewise quadratic, piecewise linear), the function $\lambda \mapsto \hat\beta^{(\lambda)}$ is piecewise linear (Lars algorithm; Efron et al., 2004).

Drawbacks:
- Exponential complexity for the Lasso: $O((3^p + 1)/2)$ (Mairal and Yu, 2012)
- Numerical instabilities
- Hard to generalize to other losses and regularizations
- Cannot benefit from early stopping rules (Bousquet and Bottou, 2008)
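The piecewise-linear structure of the exact path is easiest to see in a special case. The sketch below assumes an orthonormal design ($X^\top X = I$), for which the Lasso decouples coordinate by coordinate into soft-thresholding; this is a simplification chosen for illustration, not the Lars algorithm itself, and the vector `c` of correlations is hypothetical.

```python
import math

def soft_threshold(z, lam):
    """Shrink z toward zero by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def lasso_path_orthonormal(c, lam):
    """Lasso solution when X^T X = I, with c = X^T y (coordinates decouple)."""
    return [soft_threshold(cj, lam) for cj in c]

def kinks(c):
    """Values of lam at which a coefficient enters the support, largest first."""
    return sorted({abs(cj) for cj in c}, reverse=True)

c = [3.0, -1.5, 0.5]      # hypothetical correlations X^T y
breakpoints = kinks(c)    # the path changes slope only at these values

# Between two consecutive kinks the path is linear in lam, so the solution
# at the midpoint equals the average of the two endpoint solutions.
lo, hi = breakpoints[1], breakpoints[0]
mid = lasso_path_orthonormal(c, 0.5 * (lo + hi))
avg = [0.5 * (a + b) for a, b in zip(lasso_path_orthonormal(c, lo),
                                     lasso_path_orthonormal(c, hi))]
```

In this decoupled case there are only `p` kinks; the exponential worst case cited above arises for general designs, where variables can enter and leave the support many times.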

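Approximation of the solution path (Giesen et al., 2012)

Training task: $\hat\beta^{(\lambda)} \in \arg\min_{\beta \in \mathbb{R}^p} f(X\beta) + \lambda\,\Omega(\beta) =: P_\lambda(\beta)$

Suboptimal gap: $P_\lambda(\beta^{(\lambda_t)}) - P_\lambda(\hat\beta^{(\lambda)}) \le Q_{t,\mathcal{V}_{f^*}}\!\left(1 - \frac{\lambda}{\lambda_t}\right)$.

[Figure: upper bound of the duality gap along the grid, with levels $\epsilon$ and $\epsilon_c$. x-axis ticks: $\lambda_{\min}, \lambda_5, \lambda_4, \lambda_3, \lambda_2, \lambda_1, \lambda_{\max}$.]

$Q_{t,\mathcal{V}_{f^*}}(\rho)$ := optimization error at $\lambda_t$ + approximation error$(\lambda, \lambda_t)$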
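One way to read the gap bound is as a recipe for an adaptive grid: starting from $\lambda_t$, move to the smallest $\lambda$ for which the bound evaluated at $\rho = 1 - \lambda/\lambda_t$ still stays below the target $\epsilon$. The sketch below does this by bisection for a user-supplied bound. The quadratic form of `q`, the constants, and the use of a single bound for every $t$ are assumptions made for illustration; the actual $Q_{t,\mathcal{V}_{f^*}}$ is not spelled out on the slide.

```python
def next_lambda(lam_t, gap_bound, eps, tol=1e-10):
    """Smallest lam below lam_t with gap_bound(1 - lam / lam_t) <= eps.

    Assumes gap_bound is continuous and nondecreasing on [0, 1].
    """
    if gap_bound(0.0) >= eps:
        raise ValueError("need gap_bound(0) < eps to make progress")
    if gap_bound(1.0) <= eps:
        return 0.0  # the bound holds all the way down to lam = 0
    lo, hi = 0.0, 1.0  # invariant: gap_bound(lo) <= eps < gap_bound(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap_bound(mid) <= eps:
            lo = mid
        else:
            hi = mid
    return lam_t * (1.0 - lo)

def safe_grid(lam_max, lam_min, gap_bound, eps):
    """Decreasing grid from lam_max to lam_min with gap bound <= eps in between."""
    grid = [lam_max]
    while grid[-1] > lam_min:
        grid.append(max(next_lambda(grid[-1], gap_bound, eps), lam_min))
    return grid

# Hypothetical bound: optimization error eps_c plus a quadratic approximation error.
eps, eps_c, curvature = 1e-2, 1e-3, 1.0

def q(rho):
    return eps_c + curvature * rho ** 2

grid = safe_grid(1.0, 1e-2, q, eps)
```

With this stand-in bound the largest admissible step is $\rho^* = \sqrt{(\epsilon - \epsilon_c)/\text{curvature}}$, so the grid is geometric and its size grows like $1/\sqrt{\epsilon}$ for a fixed range of $\lambda$.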

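Bound the validation Gap

$\left| E_v(\hat\beta^{(\lambda)}) - E_v(\beta^{(\lambda_t)}) \right| \le \max_{\beta \in \mathcal{B}_\lambda} \mathcal{L}(X'\beta, X'\beta^{(\lambda_t)})$,

where $\mathcal{B}_\lambda = \mathrm{Ball}\left(\beta^{(\lambda_t)}, \text{suboptimal gap on the training}\right) \ni \hat\beta^{(\lambda)}$.

→ Approximate the validation path!

[Figure: validation curve at machine precision, together with a high-precision approximation ($\delta_v / 10$) and a low-precision approximation ($\delta_v \times 10$), with tolerance $\epsilon_v$. x-axis: $\lambda$ from $\lambda_{\min}$ to $\lambda_{\max}$; y-axis: $\|y' - X'\beta^{(\lambda)}\|_2$.]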
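For the Euclidean validation error $\|y' - X'\beta\|_2$, a bound of this kind follows from the reverse triangle inequality: if $\beta$ lies within distance $r$ of a center, then the validation errors differ by at most $\|X'\|_{\mathrm{op}}\, r \le \|X'\|_F\, r$. The sketch below checks this numerically. The radius is simply given here, whereas in the talk it comes from the training gap; treating the ball as a Euclidean ball in coefficient space, the toy data, and the function names are all assumptions for this example.

```python
import math

def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

def frobenius(A):
    """Frobenius norm, an upper bound on the operator norm."""
    return math.sqrt(sum(a * a for row in A for a in row))

def validation_error(X_val, y_val, beta):
    """E_v(beta) = || y_val - X_val beta ||_2."""
    return norm2([yi - pi for yi, pi in zip(y_val, matvec(X_val, beta))])

def validation_gap_bound(X_val, radius):
    """Bound on |E_v(b) - E_v(center)| for any b with ||b - center||_2 <= radius."""
    return frobenius(X_val) * radius

# Toy validation data and a ball around an approximate solution (all made up).
X_val = [[1.0, 2.0], [0.0, 1.0], [-1.0, 0.5]]
y_val = [1.0, 0.5, -0.2]
center = [0.4, 0.3]
radius = 0.1
inside = [0.46, 0.22]  # a point at distance 0.1 from center

gap = abs(validation_error(X_val, y_val, inside)
          - validation_error(X_val, y_val, center))
bound = validation_gap_bound(X_val, radius)
```

Shrinking the radius, that is, solving the training problem more accurately or using a finer grid, tightens the band around the validation curve.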

$\min_{\lambda_t \in \Lambda_{\mathrm{val}}(\epsilon_v)} E_v(\beta^{(\lambda_t)}) - \min_{\lambda \in [\lambda_{\min}, \lambda_{\max}]} E_v(\hat\beta^{(\lambda)}) \le \epsilon_v$.

Code: https://github.com/EugeneNdiaye/safe_grid_search

Let's talk during the poster session ;-)
