
Adaptive Regret of Convex and Smooth Functions - PowerPoint PPT Presentation


  1. Adaptive Regret of Convex and Smooth Functions
Lijun Zhang¹, Tie-Yan Liu², Zhi-Hua Zhou¹
¹ National Key Laboratory for Novel Software Technology, Nanjing University
² Microsoft Research Asia
The 36th International Conference on Machine Learning (ICML 2019)

  2. Online Learning
Online Convex Optimization [Zinkevich, 2003]:
1: for t = 1, 2, ..., T do
2:   Learner picks a decision $w_t \in \mathcal{W}$
3:   Adversary chooses a function $f_t(\cdot): \mathcal{W} \mapsto \mathbb{R}$; Learner suffers loss $f_t(w_t)$ and updates $w_t$
4: end for
Example: the Learner picks a classifier $w_t \in \mathcal{W}$, the Adversary reveals an example $(x_t, y_t) \in \mathcal{X} \times \{\pm 1\}$, and the loss is the hinge loss $f_t(w) = \max(1 - y_t w^{\top} x_t, 0)$.
Cumulative Loss $= \sum_{t=1}^{T} f_t(w_t)$
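To make the protocol concrete, here is a minimal sketch of one run of the interaction loop with the classification example above; the random data stream, the zero-initialised learner, and the abstract update step are illustrative assumptions, not details from the talk.

```python
import numpy as np

def online_learning_protocol(T=100, d=5, seed=0):
    """One run of the OCO protocol with the slide's classification example."""
    rng = np.random.default_rng(seed)
    w_t = np.zeros(d)                  # Learner's current decision w_t in W
    cumulative_loss = 0.0
    for t in range(1, T + 1):
        # Adversary chooses f_t by revealing an example (x_t, y_t) in X x {+-1}.
        x_t, y_t = rng.normal(size=d), rng.choice([-1.0, 1.0])
        # Learner suffers the hinge loss f_t(w_t) = max(1 - y_t <w_t, x_t>, 0).
        cumulative_loss += max(1.0 - y_t * float(w_t @ x_t), 0.0)
        # ... Learner updates w_t (e.g., by OGD; see the later slides) ...
    return cumulative_loss
```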

  3. Performance Measure
Regret:
$\text{Regret} = \underbrace{\sum_{t=1}^{T} f_t(w_t)}_{\text{Cumulative Loss of Online Learner}} - \underbrace{\min_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w)}_{\text{Minimal Loss of Offline Learner}}$

  4. Performance Measure
Regret:
$\text{Regret} = \underbrace{\sum_{t=1}^{T} f_t(w_t)}_{\text{Cumulative Loss of Online Learner}} - \underbrace{\min_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w)}_{\text{Minimal Loss of Offline Learner}}$
Convex Functions [Zinkevich, 2003]: Online Gradient Descent (OGD) achieves $\text{Regret} = O(\sqrt{T})$.
Convex and Smooth Functions [Srebro et al., 2010]: OGD with prior knowledge achieves $\text{Regret} = O(\sqrt{1 + F_*})$, where $F_* = \min_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w)$.
Exp-concave Functions [Hazan et al., 2007]
Strongly Convex Functions [Hazan et al., 2007]
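A minimal sketch of the OGD baseline mentioned above, assuming a Euclidean-ball domain $\mathcal{W}$, hinge losses, and the usual $\eta_t \propto 1/\sqrt{t}$ step size behind the $O(\sqrt{T})$ bound; the constants and the projection radius are illustrative choices, not values from the talk.

```python
import numpy as np

def ogd_hinge(examples, radius=1.0, eta0=0.1):
    """Projected online gradient descent on hinge losses.
    examples: list of (x_t, y_t) pairs with y_t in {-1, +1}.
    Step size eta_t = eta0 / sqrt(t) gives the classical O(sqrt(T)) regret
    for convex Lipschitz losses [Zinkevich, 2003]."""
    d = len(examples[0][0])
    w = np.zeros(d)
    losses = []
    for t, (x, y) in enumerate(examples, start=1):
        margin = y * float(w @ x)
        losses.append(max(1.0 - margin, 0.0))            # f_t(w_t)
        grad = -y * x if margin < 1.0 else np.zeros(d)    # subgradient of the hinge loss
        w = w - (eta0 / np.sqrt(t)) * grad                # gradient step
        norm = np.linalg.norm(w)
        if norm > radius:                                 # project back onto W = {||w|| <= radius}
            w = w * (radius / norm)
    return np.array(losses)
```

For the small-loss $O(\sqrt{1 + F_*})$ rate of Srebro et al. (2010), the step size has to be tuned with prior knowledge (e.g., of the smoothness constant), which this sketch does not attempt.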

  5. Learning in Changing Environments
Regret → Static Regret:
$\text{Regret} = \sum_{t=1}^{T} f_t(w_t) - \min_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w) = \sum_{t=1}^{T} f_t(w_t) - \sum_{t=1}^{T} f_t(w^*)$
where $w^* \in \operatorname{argmin}_{w \in \mathcal{W}} \sum_{t=1}^{T} f_t(w)$, i.e., a single $w^*$ is assumed to be reasonably good during all $T$ rounds.
Changing Environments: different decisions will be good in different periods, e.g., recommendation, stock market.

  6. Adaptive Regret
The Basic Idea: minimize the regret over every interval $[r, s]$:
$\text{Regret}[r, s] = \sum_{t=r}^{s} f_t(w_t) - \min_{w \in \mathcal{W}} \sum_{t=r}^{s} f_t(w)$
Weakly Adaptive Regret [Hazan and Seshadhri, 2007], the maximal regret over all intervals:
$\text{WA-Regret}(T) = \max_{[r, s] \subseteq [T]} \text{Regret}[r, s]$
Strongly Adaptive Regret [Daniely et al., 2015], the maximal regret over all intervals of length $\tau$:
$\text{SA-Regret}(T, \tau) = \max_{[s, s+\tau-1] \subseteq [T]} \text{Regret}[s, s+\tau-1]$
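To make these definitions concrete, here is a brute-force sketch that evaluates both quantities from recorded losses; the finite grid of candidate comparators (`loss_table`) is an assumption used to approximate the minimum over $\mathcal{W}$, and the exhaustive search over intervals is only for illustration, not how adaptive algorithms operate.

```python
import numpy as np

def adaptive_regrets(learner_losses, loss_table, tau):
    """learner_losses: length-T array with f_t(w_t).
    loss_table: T x K array, loss_table[t, k] = f_t(w_k) for K candidate comparators
    (a finite surrogate for the min over W).
    Returns (WA-Regret(T), SA-Regret(T, tau)) by exhaustive search over intervals."""
    T = len(learner_losses)
    wa, sa = -np.inf, -np.inf
    for r in range(T):
        for s in range(r, T):
            regret_rs = (learner_losses[r:s + 1].sum()
                         - loss_table[r:s + 1].sum(axis=0).min())
            wa = max(wa, regret_rs)             # best over all intervals
            if s - r + 1 == tau:
                sa = max(sa, regret_rs)         # best over intervals of length tau
    return wa, sa
```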

  7. State-of-the-Art
Convex Functions [Jun et al., 2017]:
$\text{Regret}[r, s] = O\left(\sqrt{(s - r)\log s}\right) \;\Rightarrow\; \text{SA-Regret}(T, \tau) = O\left(\sqrt{\tau \log T}\right)$
Exp-concave Functions [Hazan and Seshadhri, 2007]
Strongly Convex Functions [Zhang et al., 2018]

  8. State-of-the-Art
Convex Functions [Jun et al., 2017]:
$\text{Regret}[r, s] = O\left(\sqrt{(s - r)\log s}\right) \;\Rightarrow\; \text{SA-Regret}(T, \tau) = O\left(\sqrt{\tau \log T}\right)$
Exp-concave Functions [Hazan and Seshadhri, 2007]
Strongly Convex Functions [Zhang et al., 2018]
Question: Can smoothness be exploited to boost the adaptive regret?

  9. Our Results
Convex and Smooth Functions:
$\text{Regret}[r, s] = O\left(\sqrt{\sum_{t=r}^{s} f_t(w) \cdot \log s \cdot \log(s - r)}\right)$
Becomes tighter when $\sum_{t=r}^{s} f_t(w)$ is small.
Convex Functions [Jun et al., 2017]:
$\text{Regret}[r, s] = O\left(\sqrt{(s - r)\log s}\right)$

  10. Our Results
Convex and Smooth Functions:
$\text{Regret}[r, s] = O\left(\sqrt{\sum_{t=r}^{s} f_t(w) \cdot \log s \cdot \log(s - r)}\right)$
Becomes tighter when $\sum_{t=r}^{s} f_t(w)$ is small.
Convex Functions [Jun et al., 2017]:
$\text{Regret}[r, s] = O\left(\sqrt{(s - r)\log s}\right)$
Convex and Smooth Functions, fully problem-dependent:
$\text{Regret}[r, s] = O\left(\sqrt{\sum_{t=r}^{s} f_t(w) \cdot \log\left(\sum_{t=1}^{s} f_t(w)\right) \cdot \log\left(\sum_{t=r}^{s} f_t(w)\right)}\right)$
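As a rough worked comparison (under an illustrative assumption, not a claim from the slides): if the comparator's cumulative loss over $[r, s]$ happens to be $O(1)$, the small-loss bound above shrinks to a polylogarithmic quantity, while the convex-only bound still grows with the interval length.

```latex
% Illustrative assumption: \sum_{t=r}^{s} f_t(w) = O(1) on the interval [r, s].
\underbrace{O\Big(\sqrt{\textstyle\sum_{t=r}^{s} f_t(w)\,\log s\,\log(s-r)}\Big)}_{\text{smooth, small-loss bound}}
  = O\Big(\sqrt{\log s \cdot \log(s-r)}\Big)
\quad \text{vs.} \quad
\underbrace{O\Big(\sqrt{(s-r)\log s}\Big)}_{\text{convex-only bound [Jun et al., 2017]}}
```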

  11. The Algorithm
An Expert-algorithm: scale-free online gradient descent [Orabona and Pál, 2018], which can exploit smoothness automatically.
A Set of Intervals: compact geometric covering intervals [Daniely et al., 2015]; a sketch of this machinery follows below.
[Figure: the covers $C_0, C_1, C_2, C_3, C_4$ over rounds $t = 1, 2, \ldots, 18$; each level $C_k$ consists of consecutive intervals of length $2^k$, so higher levels contain exponentially longer intervals.]
A Meta-algorithm: AdaNormalHedge [Luo and Schapire, 2015], which attains a small-loss regret and supports sleeping experts.
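A minimal sketch of the interval machinery, assuming the standard geometric covering of Daniely et al. (2015) (level-$k$ intervals $[i \cdot 2^k, (i+1) \cdot 2^k - 1]$); the uniform averaging over awake experts is only a placeholder for the AdaNormalHedge weights, and the expert stub stands in for the scale-free OGD instances used in the paper.

```python
import numpy as np

def covering_intervals_starting_at(t):
    """Geometric covering intervals that begin at round t (t >= 1).
    Level k holds the intervals [i * 2^k, (i + 1) * 2^k - 1] for i = 1, 2, ...,
    so an interval of length 2^k starts at t whenever t is a multiple of 2^k."""
    intervals, k = [], 0
    while t % (2 ** k) == 0:
        intervals.append((t, t + 2 ** k - 1))  # [start, end], length 2^k
        k += 1
    return intervals

class IntervalExpert:
    """One expert per interval; the paper runs scale-free OGD here,
    this stub just keeps a fixed decision."""
    def __init__(self, start, end, d):
        self.start, self.end, self.w = start, end, np.zeros(d)

def meta_decision(experts, t, d=5):
    """Wake up new experts, retire expired ones, and combine the rest."""
    for start, end in covering_intervals_starting_at(t):
        experts.append(IntervalExpert(start, end, d))
    experts[:] = [e for e in experts if e.end >= t]   # drop finished intervals
    # Placeholder aggregation: uniform average instead of AdaNormalHedge weights.
    return sum(e.w for e in experts) / len(experts)
```

With this construction only $O(\log t)$ experts are awake at any round $t$, which is what keeps the meta-algorithm efficient.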

  12. Reference I
Thanks! Welcome to Our Poster @ Pacific Ballroom #161.
Daniely, A., Gonen, A., and Shalev-Shwartz, S. (2015). Strongly adaptive online learning. In Proceedings of the 32nd International Conference on Machine Learning, pages 1405–1411.
Hazan, E., Agarwal, A., and Kale, S. (2007). Logarithmic regret algorithms for online convex optimization. Machine Learning, 69(2-3):169–192.
Hazan, E. and Seshadhri, C. (2007). Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity, 88.
Jun, K.-S., Orabona, F., Wright, S., and Willett, R. (2017). Improved strongly adaptive online learning using coin betting. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, pages 943–951.
Luo, H. and Schapire, R. E. (2015). Achieving all with no parameters: AdaNormalHedge. In Proceedings of the 28th Conference on Learning Theory, pages 1286–1304.

  13. Reference II
Orabona, F. and Pál, D. (2018). Scale-free online learning. Theoretical Computer Science, 716:50–69.
Srebro, N., Sridharan, K., and Tewari, A. (2010). Smoothness, low-noise and fast rates. In Advances in Neural Information Processing Systems 23, pages 2199–2207.
Zhang, L., Yang, T., Jin, R., and Zhou, Z.-H. (2018). Dynamic regret of strongly adaptive methods. In Proceedings of the 35th International Conference on Machine Learning.
Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning, pages 928–936.
