Latent Class Models for Algorithm Portfolio Methods


  1. Latent Class Models for Algorithm Portfolio Methods. Bryan Silverthorn and Risto Miikkulainen, Department of Computer Science, The University of Texas at Austin. 13 July 2010

  2. Our setting: hard computational problems. Many important computational questions, such as satisfiability (SAT), are intractable in the worst case. Definition (SAT): does a truth assignment exist that makes a given Boolean expression true? But heuristics often work: enormous problem instances can be solved. “SAT solvers are now in routine use for applications such as hardware verification... with up to a million variables.” (Kautz and Selman, 2007)
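To make the SAT definition concrete, here is a minimal brute-force satisfiability check in Python. It is illustrative only (real solvers use far more sophisticated search), and the clause encoding, signed integers in DIMACS style, is an assumption of this sketch:

```python
from itertools import product

def is_satisfiable(num_vars, clauses):
    """Brute-force SAT check. Each clause is a list of nonzero ints:
    +i means variable i, -i means its negation (DIMACS style)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True  # this assignment satisfies every clause
    return False

# (x1 or x2) and (not x1 or x2) is satisfied by x2 = True
print(is_satisfiable(2, [[1, 2], [-1, 2]]))  # True
```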

  3. Which solver do you choose? (2009 SAT competition) [Bar chart: for each of roughly 35 competition solvers (march_hi, gnovelty+2, MiniSat 2.1, clasp, the SATzilla07/09 variants, precosat, TNM, glucose, picosat, ManySAT 1.1, and others), the number of instances on which that solver was best; counts range from 0 to about 120, and no single solver dominates.]

  4. Algorithm portfolios: what and why. Definition: an algorithm portfolio is a pool of algorithms (“solvers”) and a method for scheduling their execution. Portfolios can reduce effort by choosing solvers automatically, and improve performance by allocating resources more effectively. Existing portfolios, such as SATzilla (Xu et al., 2008), often use classifiers trained on feature information to predict solver performance.

  5. How should we predict solver performance? Research questions: What predictions can we make with minimal information? What assumptions are needed to make useful predictions? Do they hold sufficiently well in practice? To explore these questions, we will build unifying generative models of solver behavior and evaluate them in the SAT domain.

  6. Assumptions that make modeling easier: outcomes of runs are discrete, few, and fixed; utilities of outcomes are known; durations of runs are discrete, few, and fixed; learning is offline, but action is online; tasks are drawn IID from some distribution; and information is obtained from outcomes alone.

  7.–11. Architecture of a model-based portfolio [Block diagram, revealed over five slides: solvers are run on training tasks to produce training outcomes; a model is fit to those outcomes; given a test task, the model yields predicted outcomes, which drive action selection; the selected runs on the test tasks produce the test outcomes.]
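The pipeline on these slides can be read as the following minimal Python sketch. All names here (ModelBasedPortfolio, fit, predict, select_action, run, outcome.solved) are illustrative assumptions, not the authors' implementation:

```python
class ModelBasedPortfolio:
    def __init__(self, solvers, model, select_action):
        self.solvers = solvers              # pool of subsolvers
        self.model = model                  # latent class model of outcomes
        self.select_action = select_action  # policy over (solver, duration)

    def train(self, training_outcomes):
        # Fit the generative model to outcomes of solver runs on training tasks.
        self.model.fit(training_outcomes)

    def solve(self, task, budget):
        history = []
        while budget > 0:
            # Predict outcome distributions given the runs observed so far.
            predictions = self.model.predict(history)
            solver, duration = self.select_action(predictions, budget)
            outcome = self.solvers[solver].run(task, duration)
            history.append((solver, duration, outcome))
            if outcome.solved:
                return outcome
            budget -= duration
        return None
```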

  12. Basic structure in solver behavior. Inter-algorithm correlations: solvers can be (dis)similar. Example: “If solver X yielded outcome A on this task, solver Y likely will as well.” Inter-task correlations: tasks can be (dis)similar. Example: “If solver X yielded outcome A on task 1, it likely will on task 2 as well.” Inter-duration correlations: runs can have (dis)similar outcomes. Example: “If solver X did not quickly yield outcome A on this task, it never will.”

  13. Conditional independence in solver behavior. The outcome of a solver run is a function of only three inputs: the task on which it is executed, the duration of the run, and the seed of any internal pseudorandom sequence. This strong local independence suggests a possible model: take actions to be solver-duration pairs, and assume that tasks cluster into classes. Classes then capture the three basic aspects of solver behavior.
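As a concrete reading of "actions are solver-duration pairs," the action space is just the cross product of the solver pool and a grid of run durations; the solver subset and duration grid below are illustrative assumptions:

```python
from itertools import product

solvers = ["march_hi", "precosat", "TNM"]  # illustrative subset
durations = [10, 60, 600]                  # assumed cutoffs, in seconds

# Each action is a (solver, duration) pair; with discrete outcomes, the
# model predicts one outcome distribution per action and latent class.
actions = list(product(solvers, durations))
print(len(actions))  # 9
```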

  14.–19. Multinomial latent class model of search [Plate diagram, built up over six slides.] Model:
  β ∼ Dir(ξ)
  k_t ∼ Mult(β), t ∈ 1…T
  θ_{s,k} ∼ Dir(α_s), s ∈ 1…S, k ∈ 1…K
  o_{t,s,r} ∼ Mult(θ_{s,k_t}), t ∈ 1…T, s ∈ 1…S, r ∈ 1…R_{s,t}
  Key: ξ is the class prior and β the class distribution; k_t is the latent class of task t; α_s is the outcome prior for action s and θ_{s,k} its outcome distribution under class k; o_{t,s,r} is the outcome of run r of action s on task t. There are T tasks, S actions, K classes, and R_{s,t} runs per action-task pair.
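As a sanity check on the notation, the generative process can be simulated directly. A minimal numpy sketch, with all sizes and hyperparameter values chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): T tasks, S actions, K classes,
# D discrete outcomes, and a fixed R runs per (task, action) pair.
T, S, K, D, R = 100, 6, 4, 2, 3

xi = np.ones(K)          # class prior (symmetric Dirichlet)
alpha = np.ones((S, D))  # one outcome prior per action

beta = rng.dirichlet(xi)                   # beta ~ Dir(xi)
k = rng.choice(K, size=T, p=beta)          # k_t ~ Mult(beta)
theta = np.array([[rng.dirichlet(alpha[s]) for _ in range(K)]
                  for s in range(S)])      # theta_{s,k} ~ Dir(alpha_s)

# o_{t,s,r} ~ Mult(theta_{s,k_t}): given its class, a task's outcomes
# are independent across runs.
outcomes = np.stack([
    np.stack([rng.choice(D, size=R, p=theta[s, k[t]]) for s in range(S)])
    for t in range(T)
])  # shape (T, S, R)
```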

  20. Burstiness: another important aspect of solver behavior. Definition: burstiness is the tendency of some random events to recur. Solver outcomes recur, for some solvers more than others. Example: “If solver X yields outcome A on this task, it will again; not so for Y.” Deterministic solvers are entirely bursty; randomized solvers are less so. Burstiness also appears in text data, where the Dirichlet compound multinomial (DCM) distribution has modeled it well. (Madsen et al., 2005)
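One way to see why the DCM captures burstiness: a DCM sample is equivalent to a Pólya urn process, in which each observed outcome adds a pseudo-count and so becomes more likely to recur. A small sketch, with illustrative α values:

```python
import numpy as np

def dcm_draws(alpha, n, rng):
    """Draw n outcomes from a DCM via its Polya-urn representation:
    each observed outcome adds a pseudo-count, so it tends to recur."""
    counts = np.array(alpha, dtype=float)
    draws = []
    for _ in range(n):
        o = rng.choice(len(counts), p=counts / counts.sum())
        counts[o] += 1.0
        draws.append(int(o))
    return draws

rng = np.random.default_rng(0)
# Small alpha -> bursty, almost deterministic; large alpha -> near-multinomial.
print(dcm_draws([0.1, 0.1], 10, rng))
print(dcm_draws([10.0, 10.0], 10, rng))
```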

  21. Multinomial latent class model of search (recap) [Repeats the model of slides 14–19 for side-by-side comparison with the DCM variant on the next slide.]

  22. DCM (bursty) latent class model of search. Model:
  β ∼ Dir(ξ)
  k_t ∼ Mult(β), t ∈ 1…T
  θ_{t,s} ∼ Dir(α_{s,k_t}), t ∈ 1…T, s ∈ 1…S
  o_{t,s,r} ∼ Mult(θ_{t,s}), t ∈ 1…T, s ∈ 1…S, r ∈ 1…R_{s,t}
  Key: ξ is the class prior and β the class distribution; k_t is the latent class of task t; α_{s,k} is now a per-class outcome root, and each task-action pair draws its own outcome distribution θ_{t,s} from it, so outcomes of repeated runs on the same task tend to recur. There are T tasks, S actions, K classes, and R_{s,t} runs per action-task pair.
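The only structural change from the multinomial sketch above is where θ is drawn: per task-action pair, from the root of that task's class. Again a minimal sketch with illustrative sizes and hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
T, S, K, D, R = 100, 6, 4, 2, 3           # illustrative sizes, as before

xi = np.ones(K)
alpha = rng.gamma(1.0, size=(S, K, D))    # per-action, per-class outcome roots

beta = rng.dirichlet(xi)
k = rng.choice(K, size=T, p=beta)

# The DCM change: theta_{t,s} is drawn per (task, action) pair from the
# root of that task's class, so repeated runs on one task share a theta
# and their outcomes recur.
theta = np.array([[rng.dirichlet(alpha[s, k[t]]) for s in range(S)]
                  for t in range(T)])     # shape (T, S, D)
outcomes = np.stack([
    np.stack([rng.choice(D, size=R, p=theta[t, s]) for s in range(S)])
    for t in range(T)
])  # shape (T, S, R)
```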

  23. Greedy, discounted selection. One efficient approach is to choose the next action according to its immediate expected utility, without regard to later actions. This gives us a hard policy, which chooses the expected-best action, and a soft policy, which draws actions with probability proportional to expected utility. Actions are solver-duration pairs, so they have wildly different costs. An obvious response is to discount an action's expected utility by its cost, multiplying by γ^c for a c-second run and discount factor γ.
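A minimal sketch of both policies, assuming the model supplies a predicted success probability per (solver, duration) action and that utility is 1 for solving and 0 otherwise (that utility scale is an assumption of the sketch):

```python
import numpy as np

def discounted_utilities(actions, success_prob, gamma=0.999):
    # Discount each action's expected utility by gamma**cost for a
    # cost-second run, as on the slide; success_prob maps each
    # (solver, cost) action to its predicted probability of solving.
    return np.array([(gamma ** cost) * success_prob[(solver, cost)]
                     for solver, cost in actions])

def hard_policy(actions, utils):
    return actions[int(np.argmax(utils))]  # the expected-best action

def soft_policy(actions, utils, rng):
    p = utils / utils.sum()                # proportional to expected utility
    return actions[rng.choice(len(actions), p=p)]
```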

  24. Experimental procedure. In our experiments, we use every individual solver from the latest SAT competition, and every problem instance from its three benchmark collections. In repeated trials, we run the solvers on a randomly drawn training set, fit a model to that training data, and then run a portfolio using that model on the remaining test set. Empirical questions: for each combination of model and action selection policy, how does its performance compare to its subsolvers? How does its performance compare to that of other portfolios?
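One trial of this procedure, sketched by reusing the hypothetical ModelBasedPortfolio from the architecture slides; collect_outcomes, the even split, and the budget are all assumptions of this sketch:

```python
import numpy as np

def run_trial(instances, solvers, make_model, policy, budget=5000):
    """One trial of slide 24's procedure. collect_outcomes, the 50/50
    split, and the budget value are assumptions, not the authors' setup."""
    rng = np.random.default_rng()
    rng.shuffle(instances)
    split = len(instances) // 2
    train, test = instances[:split], instances[split:]

    model = make_model()
    model.fit(collect_outcomes(solvers, train))  # run solvers on training set

    portfolio = ModelBasedPortfolio(solvers, model, policy)
    return [portfolio.solve(task, budget) for task in test]
```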

  25. Portfolio performance (on the random collection) [Line plot: SAT instances solved (y axis, roughly 340 to 460) versus number of latent classes K (x axis, 0 to 65), comparing the DCM and multinomial portfolios against the best single solver and SATzilla as baselines.]

  26. Recapitulation. These results suggest that models can capture useful patterns given little information, and that these latent class models can be applied in a practical portfolio. Research in progress aims to extend the models to capture dynamic information, and to improve action planning to better exploit their predictions.

  27. Thanks! Questions?

  28. References I
  Matteo Gagliolo and Jürgen Schmidhuber. Learning Restart Strategies. IJCAI 2007.
  Carla Gomes and Bart Selman. Algorithm Portfolio Design: Theory vs. Practice. UAI 1997.
  Eric Horvitz, Yongshao Ruan, Carla Gomes, Henry Kautz, Bart Selman, and David Chickering. A Bayesian Approach to Tackling Hard Computational Problems. UAI 2001.
  Bernardo Huberman, Rajan Lukose, and Tad Hogg. An Economics Approach to Hard Computational Problems. Science, 275(5296), 1997.

  29. References II
  Frank Hutter, Domagoj Babić, Holger H. Hoos, and Alan J. Hu. Boosting Verification by Automatic Tuning of Decision Procedures. FMCAD 2007.
  Frank Hutter, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. ParamILS: An Automatic Algorithm Configuration Framework. Journal of Artificial Intelligence Research, 2009.
  Henry Kautz and Bart Selman. The state of SAT. Discrete Applied Mathematics, 155(12), 2007.
  Ashiqur KhudaBukhsh, Lin Xu, Holger H. Hoos, and Kevin Leyton-Brown. SATenstein: Automatically Building Local Search SAT Solvers From Components. IJCAI 2009.
