AI-Augmented Algorithms: How I Learned to Stop Worrying and Love Choice
Lars Kotthoff, University of Wyoming
larsko@uwyo.edu
Boulder, 16 January 2019
Outline
▷ Big Picture
▷ Motivation
▷ Choosing Algorithms
▷ Tuning Algorithms
▷ (NCAR-relevant) Applications
▷ Outlook and Resources
Big Picture
▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things more intelligently – automatically
▷ invent new things through combinations of existing things
https://xkcd.com/720/
Motivation – What Difference Does It Make?
Prominent Application
Fréchette, Alexandre, Neil Newman, and Kevin Leyton-Brown. “Solving the Station Packing Problem.” In Association for the Advancement of Artificial Intelligence (AAAI), 2016.
Performance Differences
Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.
[Scatter plot: per-instance runtimes of the Virtual Best SAT solver vs. the Virtual Best CSP solver, log scale, 0.1–1000 s.]
Leveraging the Differences
Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.
Performance Improvements
Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.
[Scatter plot: runtimes of SPEAR optimized for SWV vs. SPEAR with original defaults, log scale, 10⁻² to 10⁴ s.]
Common Theme
Performance models of black-box processes
▷ also called surrogate models
▷ substitute expensive underlying process with cheap approximate model based on results of evaluations of the underlying process
▷ build approximate model using machine learning techniques
▷ no knowledge of what the underlying process is required (but can be helpful)
▷ may facilitate better understanding of the underlying process through interrogation of the model
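A minimal sketch of the surrogate-model idea in Python with scikit-learn. The toy black-box function, the sample sizes, and the choice of a random forest are illustrative assumptions, not something the slides prescribe:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical expensive black-box process (e.g. an algorithm run);
# in practice each call could take minutes or hours.
def expensive_process(x):
    return np.sin(3 * x) + 0.1 * x ** 2

# Evaluate the underlying process at a few points only.
X_train = np.random.uniform(-2, 2, size=(20, 1))
y_train = np.array([expensive_process(x[0]) for x in X_train])

# Build the cheap approximate (surrogate) model with machine learning.
surrogate = RandomForestRegressor(n_estimators=100).fit(X_train, y_train)

# Interrogate the cheap surrogate instead of the expensive process.
X_query = np.linspace(-2, 2, 200).reshape(-1, 1)
y_pred = surrogate.predict(X_query)
print("predicted optimum near x =", X_query[np.argmin(y_pred)][0])
```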
Choosing Algorithms
Algorithm Selection
Given a problem, choose the best algorithm to solve it.
Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.
Algorithm Selection
[Diagram: training instances 1–3 and a portfolio (Algorithms 1–3) feed feature extraction and a performance model; for new instances 4–6, feature extraction and algorithm selection yield Instance 4: Algorithm 2, Instance 5: Algorithm 3, Instance 6: Algorithm 3.]
Algorithm Portfolios
▷ instead of a single algorithm, use several complementary algorithms
▷ idea from Economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of algorithm performing poorly
▷ in practice often constructed from competition winners or other algorithms known to have good performance
Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.
Algorithms
“algorithm” used in a very loose sense
▷ algorithms
▷ heuristics
▷ machine learning models
▷ software systems
▷ machines
▷ …
Parallel Portfolios
Why not simply run all algorithms in parallel?
▷ not enough resources may be available/waste of resources
▷ algorithms may be parallelized themselves
▷ memory/cache contention
Building an Algorithm Selection System
▷ requires algorithms with complementary performance
▷ most approaches rely on machine learning
▷ train with representative data, i.e. performance of all algorithms in portfolio on a number of instances
▷ evaluate performance on separate set of instances
▷ potentially large amount of prep work
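A sketch of this training pipeline, assuming scikit-learn; the feature matrix, runtime matrix, and the random-forest learner are all hypothetical stand-ins (the slides do not prescribe a specific model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical training data: instance features and measured runtimes
# of every portfolio algorithm on every training instance.
features = rng.random((200, 5))        # 200 instances, 5 features each
runtimes = rng.random((200, 3)) * 100  # runtimes of 3 portfolio algorithms

# Label each instance with its best (fastest) algorithm.
best_algorithm = runtimes.argmin(axis=1)

# Hold out a separate set of instances for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    features, best_algorithm, test_size=0.25, random_state=0)

selector = RandomForestClassifier().fit(X_train, y_train)
print("selection accuracy:", selector.score(X_test, y_test))
```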
Key Components of an Algorithm Selection System
▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler
optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)
Types of Performance Models
[Diagram: four model types – per-algorithm regression models predicting performance (e.g. A1: 1.2, A2: 4.5, A3: 3.9); a single classification model predicting the best algorithm per instance; pairwise classification models aggregated by voting (e.g. A1: 1 vote, A2: 0 votes, A3: 2 votes); pairwise regression models predicting performance differences (A1 − A2, A1 − A3, …). Each yields a selection such as Instance 1: Algorithm 2, Instance 2: Algorithm 1, Instance 3: Algorithm 3.]
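For contrast with the single classification model sketched above, here is the per-algorithm regression branch of the diagram: predict each algorithm's performance and pick the one predicted best. The data layout and learner are again illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
features = rng.random((200, 5))
runtimes = rng.random((200, 3)) * 100  # hypothetical runtimes of 3 algorithms

# One regression model per portfolio algorithm.
models = [RandomForestRegressor().fit(features, runtimes[:, a])
          for a in range(3)]

def select(instance_features):
    # Predict each algorithm's runtime; choose the one predicted fastest.
    preds = [m.predict(instance_features.reshape(1, -1))[0] for m in models]
    return int(np.argmin(preds))

print("chosen algorithm:", select(rng.random(5)))
```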
Tuning Algorithms
Algorithm Configuration
Given a (set of) problem(s), find the best parameter configuration.
Parameters?
▷ anything you can change that makes sense to change
▷ e.g. search heuristic, optimization level, computational resolution
▷ not random seed, whether to enable debugging, etc.
▷ some will affect performance, others will have no effect at all
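To make the notion concrete, a hypothetical configuration space for a solver (all parameter names are made up for illustration):

```python
# Hypothetical parameter space: categorical, integer, and continuous
# parameters all make sense to tune ...
param_space = {
    "search_heuristic": ["dfs", "bfs", "restarting"],  # categorical choice
    "optimization_level": (0, 3),                      # integer range
    "restart_factor": (1.1, 4.0),                      # continuous range
}
# ... whereas e.g. the random seed or a debugging flag would not belong here.
```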
Automated Algorithm Configuration
▷ no background knowledge on parameters or algorithm – black-box process
▷ as little manual intervention as possible
▷ failures are handled appropriately
▷ resources are not wasted
▷ can run unattended on large-scale compute infrastructure
Algorithm Configuration
Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016
General Approach
▷ evaluate algorithm as black-box function
▷ observe effect of parameters without knowing the inner workings, build surrogate model based on this data
▷ decide where to evaluate next, based on surrogate model
▷ repeat
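A compact sketch of this loop, assuming a toy target function and a random-forest surrogate (both illustrative; real configurators also balance exploration against exploitation, e.g. via the expected improvement shown on the model-based search slides below):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def algorithm_runtime(config):  # hypothetical black-box target to minimise
    return (config - 0.3) ** 2 + 0.1 * np.sin(20 * config)

# Initial design: a few random configurations and their observed performance.
X = list(np.random.uniform(0, 1, 5))
y = [algorithm_runtime(x) for x in X]

for _ in range(20):
    # Build surrogate model of the parameter-performance surface.
    model = RandomForestRegressor().fit(np.array(X).reshape(-1, 1), y)
    # Decide where to evaluate next: candidate with best predicted value.
    candidates = np.random.uniform(0, 1, 100)
    nxt = candidates[model.predict(candidates.reshape(-1, 1)).argmin()]
    # Evaluate the expensive black box there and repeat.
    X.append(nxt)
    y.append(algorithm_runtime(nxt))

print("best configuration found:", X[int(np.argmin(y))])
```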
When are we done?
▷ most approaches incomplete, i.e. do not exhaustively explore parameter space
▷ cannot prove optimality, not guaranteed to find optimal solution (with finite time)
▷ performance highly dependent on configuration space
⇒ How do we know when to stop?
Time Budget
How much time/how many function evaluations?
▷ too much → wasted resources
▷ too little → suboptimal result
▷ use statistical tests
▷ evaluate on parts of the instance set
▷ for runtime: adaptive capping
▷ in general: whatever resources you can reasonably invest
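Adaptive capping made concrete, as a sketch: never run a candidate configuration longer than the best time seen so far. The run_config interface is a hypothetical placeholder, not a real tool's API:

```python
# Sketch of adaptive capping for runtime minimisation.
# run_config(config, timeout) is a hypothetical interface that returns
# the measured runtime, or None if the timeout was hit.
def evaluate_with_capping(configs, run_config, initial_cap=3600.0):
    best_time, best_config = initial_cap, None
    for config in configs:
        t = run_config(config, timeout=best_time)
        if t is not None and t < best_time:  # new incumbent
            best_time, best_config = t, config
        # runs hitting the cap are cut off early, saving resources
    return best_config, best_time
```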
Grid and Random Search
▷ evaluate certain points in parameter space
Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
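Random search itself is only a few lines; the objective and parameter names here are toy assumptions:

```python
import random

def performance(config):  # hypothetical black-box objective to minimise
    return (config["lr"] - 0.1) ** 2 + config["depth"] / 100

# Sample random points in the parameter space and keep the best.
best = min(
    ({"lr": random.uniform(0.001, 1.0), "depth": random.randint(1, 20)}
     for _ in range(100)),
    key=performance,
)
print("best configuration:", best)
```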
Model-Based Search
▷ evaluate small number of configurations
▷ build model of parameter-performance surface based on the results
▷ use model to predict where to evaluate next
▷ repeat
▷ allows targeted exploration of new configurations
▷ can take instance features into account like algorithm selection
Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.
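A sketch of one proposal step using expected improvement, the acquisition criterion (ei) shown in the example plots on the following slides. The Gaussian-process surrogate and the observed values are assumptions for illustration, not the exact method of any particular tool:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(model, candidates, y_best):
    # EI trades off exploring uncertain regions (large sigma) against
    # exploiting good predictions (small mu); minimisation convention.
    mu, sigma = model.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# Configurations evaluated so far and their performance (illustrative).
X = np.array([[-0.9], [0.1], [0.8]])
y = np.array([0.7, 0.2, 0.5])

gp = GaussianProcessRegressor().fit(X, y)
candidates = np.linspace(-1, 1, 200).reshape(-1, 1)
ei = expected_improvement(gp, candidates, y.min())
print("next configuration to evaluate:", candidates[ei.argmax()][0])
```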
Model-Based Search Example
[Plot sequence: nine iterations of model-based search on a one-dimensional function. Each panel shows the true function y, the surrogate prediction yhat, the expected improvement ei, and the evaluated points (init/prop/seq); the gap value converges from 1.9909e−01 at iteration 1 to 2.0000e−01 by iteration 7.]