  1. Making Decisions via Simulation [Law, Ch. 10], [Handbook of Sim. Opt.], [Haas, Sec. 6.3.6]
     Peter J. Haas, CS 590M: Simulation, Spring Semester 2020

  2. Making Decisions via Simulation (outline)
     - Overview
     - Factor Screening
     - Continuous Stochastic Optimization: Robbins-Monro Algorithm; Derivative Estimation; Other Continuous Optimization Methods
     - Ranking and Selection: Selection of the Best; Subset Selection
     - Discrete Optimization
     - Commercial Solvers

  3. Overview
     Goal: Select the best system design or parameter setting
     - Performance under each alternative is estimated via simulation:
         min_{θ ∈ Θ} f(θ), where Θ = feasible set
     - f is often of the form f(θ) = E_θ[c(X, θ)]
     - X is estimated from the simulation
     - E_θ indicates that the distribution of X depends on θ

  4. Overview, Continued
     Three cases:
     1. Θ is uncountably infinite (continuous optimization)
        - Robbins-Monro algorithm
        - Metamodel-based optimization
        - Sample average approximation
     2. Θ is small and finite (ranking and selection of the best system)
        - E.g., Dudewicz and Dalal (HW #7)
     3. Θ is a large discrete set (discrete optimization)
     Not covered here: Markov decision processes
     - Choose the best policy, i.e., choose the best function π, where π(s) = action to take when the new state equals s [Chang et al., 2007]

  5. Making Decisions via Simulation (section outline; next: Factor Screening)

  6. Factor Screening
     Goal: Identify the most important drivers of the model response
     - Needed for understanding
     - Needed to focus modeling resources (e.g., input distributions)
     - Needed to select decision variables for optimization

  7. Factor Screening, Continued
     Based on a simulation metamodel, for example:
         Y(x) = β_0 + β_1 x_1 + · · · + β_k x_k + ε
     - Y = simulation model output
     - Parameters x = (x_1, ..., x_k)
     - ε = noise term (often Gaussian)
     - Estimate the β_i's using "low" and "high" x_i values
     - Test whether each |β_i| is significantly different from 0
     - Will talk more about metamodels later on...
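
As an illustration of the procedure on this slide, here is a minimal Python sketch (not from the slides): it runs a hypothetical stand-in simulation model at "low" (−1) and "high" (+1) settings of k = 3 factors, fits the linear metamodel by least squares, and t-tests each coefficient. The simulate() function, its coefficients, and the noise level are assumptions for illustration only.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Hypothetical stand-in for a simulation model with k = 3 factors;
    # only factors 1 and 3 actually affect the response.
    def simulate(x):
        return 2.0 + 1.5 * x[0] + 0.0 * x[1] - 0.8 * x[2] + rng.normal(scale=0.5)

    k = 3
    # Full factorial design: every combination of "low" (-1) and "high" (+1)
    levels = np.array([[(i >> j) & 1 for j in range(k)] for i in range(2 ** k)]) * 2 - 1
    X = np.column_stack([np.ones(2 ** k), levels])   # design matrix with intercept column
    y = np.array([simulate(row) for row in levels])  # one simulation run per design point

    # Least-squares estimates of beta_0, ..., beta_k and t-tests of H0: beta_i = 0
    beta_hat, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    dof = len(y) - X.shape[1]
    sigma2 = rss[0] / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    t_stat = beta_hat / se
    p_val = 2 * stats.t.sf(np.abs(t_stat), dof)
    for i in range(k + 1):
        print(f"beta_{i}: {beta_hat[i]:+.3f}  t = {t_stat[i]:+.2f}  p = {p_val[i]:.3f}")

With a single replication per design point the test has few degrees of freedom; in practice each design point would be replicated.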

  8. Factor Screening, Continued
     β_i coefficients indicate parameter importance
     [Figure: 3-D surface plot of the response Y(x) over factors x1 and x2]

  9. Factor Screening, Continued
     Challenge: many factors
     - Example with k = 3:
         β̂_1 = [Y(h,l,l) + Y(h,l,h) + Y(h,h,l) + Y(h,h,h)] / 4
               − [Y(l,l,l) + Y(l,l,h) + Y(l,h,l) + Y(l,h,h)] / 4
     - In general, need 2^k simulations ("full factorial" design)
     - Can be smarter, e.g., "fractional factorial" designs (will talk about this soon)
     - In general: interplay between metamodel complexity (e.g., β_ij interaction terms) and computational cost

  10. Factor Screening, Continued
      Sequential bifurcation
      - For a huge number of factors
      - Assumes Gaussian noise and nonnegative β's
      - Test groups (sums of β_i's), then split the groups that appear important
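
The group-testing idea can be sketched as follows. This is a simplified, hypothetical illustration of the bifurcation principle, not the exact sequential-bifurcation contrasts and tests from the literature; the 16-factor stand-in model, effect sizes, replication count, and threshold are all assumptions.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical stand-in simulation: 16 factors with nonnegative effects,
    # only three of which matter.
    K = 16
    true_beta = np.zeros(K)
    true_beta[[2, 9, 10]] = [3.0, 2.0, 4.0]
    noise_sd = 0.3

    def run(high):
        """Simulate with the factors in `high` at their high level, the rest low."""
        x = np.zeros(K)
        x[list(high)] = 1.0
        return true_beta @ x + rng.normal(scale=noise_sd)

    def group_effect(group, reps=5):
        """Estimate the summed effect of a group by contrasting group-high runs
        against all-low runs."""
        return np.mean([run(group) - run([]) for _ in range(reps)])

    def bifurcate(group, threshold=1.0):
        """Recursively split groups whose estimated summed effect is large."""
        if group_effect(group) < threshold:
            return []                      # whole group judged unimportant
        if len(group) == 1:
            return list(group)             # single important factor isolated
        mid = len(group) // 2
        return bifurcate(group[:mid], threshold) + bifurcate(group[mid:], threshold)

    print("important factors:", bifurcate(list(range(K))))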

  11. Making Decisions via Simulation (section outline; next: Continuous Stochastic Optimization)

  12. Continuous Stochastic Optimization
      Robbins-Monro Algorithm
      - Goal: min_{θ ∈ [θ_min, θ_max]} f(θ)
      - Estimate f'(θ) and use stochastic approximation (also called stochastic gradient descent):
            θ_{n+1} = Π( θ_n − (a/n) Z_n )
        where
        - a > 0 (the gain)
        - E[Z_n] = f'(θ_n)
        - Π(θ) = θ_min if θ < θ_min; θ if θ_min ≤ θ ≤ θ_max; θ_max if θ > θ_max (projection onto the feasible interval)
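
A minimal runnable sketch of the projected Robbins-Monro iteration above, on an assumed toy problem: f(θ) = E[(θ − W)²] with W ~ N(3, 1), so the true minimizer is θ* = 3 and Z_n = 2(θ_n − W_n) is an unbiased estimate of f'(θ_n). The gain, bounds, and step count are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(2)

    def noisy_gradient(theta):
        # Unbiased estimate of f'(theta) for f(theta) = E[(theta - W)^2], W ~ N(3, 1)
        w = rng.normal(loc=3.0, scale=1.0)
        return 2.0 * (theta - w)

    def robbins_monro(theta0, a, theta_min, theta_max, n_steps):
        theta = theta0
        for n in range(1, n_steps + 1):
            z = noisy_gradient(theta)
            theta = theta - (a / n) * z                        # gain sequence a/n
            theta = min(max(theta, theta_min), theta_max)      # projection onto [theta_min, theta_max]
        return theta

    print(robbins_monro(theta0=0.0, a=1.0, theta_min=0.0, theta_max=10.0, n_steps=5000))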

  13. Continuous Stochastic Optimization, Continued
      Convergence
      - Suppose that θ* is the true minimizer and the only local minimizer
      - Under mild conditions, lim_{n→∞} θ_n = θ* a.s.
      - Q: If θ* is not the only local minimizer, what can go wrong?
      - For large n, θ_n has approximately a normal distribution
      Estimation algorithm for a 100(1 − δ)% confidence interval
      1. Fix n ≥ 1 and m ∈ [5, 10]
      2. Run the Robbins-Monro iteration for n steps to obtain θ_n
      3. Repeat Step 2 a total of m times to obtain θ_{n,1}, ..., θ_{n,m}
      4. Compute the point estimator θ̄_m = (1/m) Σ_{j=1}^{m} θ_{n,j}
      5. Compute the 100(1 − δ)% CI as [θ̄_m − t_{m−1,δ} s_m/√m, θ̄_m + t_{m−1,δ} s_m/√m],
         where s_m² = (1/(m−1)) Σ_{j=1}^{m} (θ_{n,j} − θ̄_m)² and t_{m−1,δ} is the Student-t quantile
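
A sketch of Steps 1-5 applied to the same toy problem as in the previous block; m, δ, and the tuning constants are assumptions, and the Student-t quantile is taken at level 1 − δ/2 for a two-sided interval.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    # Toy problem as before: f(theta) = E[(theta - W)^2], W ~ N(3, 1)
    def robbins_monro(theta0=0.0, a=1.0, lo=0.0, hi=10.0, n_steps=2000):
        theta = theta0
        for n in range(1, n_steps + 1):
            z = 2.0 * (theta - rng.normal(3.0, 1.0))       # unbiased estimate of f'(theta)
            theta = min(max(theta - (a / n) * z, lo), hi)  # gain a/n and projection
        return theta

    m, delta = 10, 0.05
    reps = np.array([robbins_monro() for _ in range(m)])   # theta_{n,1}, ..., theta_{n,m}
    theta_bar = reps.mean()                                # point estimator
    s_m = reps.std(ddof=1)                                 # sample standard deviation
    t_quant = stats.t.ppf(1 - delta / 2, df=m - 1)         # Student-t quantile
    half_width = t_quant * s_m / np.sqrt(m)
    print(f"{100 * (1 - delta):.0f}% CI: [{theta_bar - half_width:.3f}, {theta_bar + half_width:.3f}]")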

  14. Continuous Stochastic Optimization, Continued
      Remarks
      - Variants are available for multi-parameter problems
      - Drawbacks of the basic algorithm are slow convergence and high sensitivity to the gain a; current research focuses on more sophisticated methods
      - Simple improvement: return the best value seen so far
      Kiefer-Wolfowitz algorithm
      - Replaces the derivative f'(θ_n) by the finite difference [f(θ_n + Δ) − f(θ_n − Δ)] / (2Δ)
      - Spall's simultaneous perturbation stochastic approximation (SPSA) method handles high dimensions
      - At the k-th iteration of a d-dimensional problem, run simulations at θ_k ± c Δ_k, where c > 0 and Δ_k is a vector of i.i.d. random variables I_1, ..., I_d with P(I_j = 1) = P(I_j = −1) = 0.5
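
A minimal sketch of the SPSA gradient estimate on an assumed 3-dimensional quadratic test function: each iteration perturbs all coordinates simultaneously with i.i.d. ±1 signs and uses just two simulation runs to estimate the whole gradient. The test function, gain a/k, and fixed perturbation size c are illustrative simplifications (Spall's method also shrinks c over iterations).

    import numpy as np

    rng = np.random.default_rng(4)

    # Assumed d = 3 dimensional test problem: noisy observations of
    # f(theta) = ||theta - target||^2, minimized at `target`.
    target = np.array([1.0, -2.0, 0.5])

    def f_hat(theta):
        return np.sum((theta - target) ** 2) + rng.normal(scale=0.1)

    def spsa(theta0, a=0.5, c=0.1, n_steps=2000):
        theta = np.asarray(theta0, dtype=float)
        d = theta.size
        for k in range(1, n_steps + 1):
            delta = rng.choice([-1.0, 1.0], size=d)        # i.i.d. +/-1 perturbation directions
            diff = f_hat(theta + c * delta) - f_hat(theta - c * delta)
            grad_est = diff / (2.0 * c * delta)            # two runs estimate all d gradient components
            theta = theta - (a / k) * grad_est             # Robbins-Monro-style update
        return theta

    print(spsa(np.zeros(3)))   # should end up near `target`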

  15. Making Decisions via Simulation (section outline; next: Derivative Estimation)

  16. Estimating the Derivative f'(θ_n)
      Suppose that f(θ) = E_θ[c(X, θ)]
      - Ex: M/M/1 queue with interarrival rate λ and service rate θ
      - X = average waiting time for the first 100 customers
      - c(x, θ) = aθ + bx (trades off operating costs and delay costs)
      Use likelihood ratios
      - We have f(θ + h) = E_{θ+h}[c(X, θ + h)] = E_θ[c(X, θ + h) L(h)] for an appropriate likelihood ratio L(h)
      - Then, with c' = ∂c/∂θ, L' = ∂L/∂h, and L(0) = 1:
            f'(θ) = lim_{h→0} [f(θ + h) − f(θ)] / h
                  = lim_{h→0} E_θ[ (c(X, θ + h) L(h) − c(X, θ) L(0)) / h ]
                  = E_θ[ lim_{h→0} (c(X, θ + h) L(h) − c(X, θ) L(0)) / h ]   (under regularity conditions)
                  = E_θ[ d/dh ( c(X, θ + h) L(h) ) |_{h=0} ]
                  = E_θ[ c'(X, θ) + c(X, θ) L'(0) ]                           (product and chain rules)

  17. Derivative Estimation, Continued
      To estimate g(θ) := f'(θ) = E_θ[ c'(X, θ) + c(X, θ) L'(0) ]
      - Simulate the system to generate i.i.d. replicates X_1, ..., X_m
      - At the same time, compute L'_1(0), ..., L'_m(0)
      - Compute the estimate g_m(θ) = (1/m) Σ_{i=1}^{m} [ c'(X_i, θ) + c(X_i, θ) L'_i(0) ]
      - Robbins and Monro showed that taking m = 1 is optimal (many approximate steps vs. few precise steps)
      n-th step of the R-M algorithm
      1. Generate a single sample X of the performance measure and compute L'(0)
      2. Set Z_n = g_1(θ_n) = c'(X, θ_n) + c(X, θ_n) L'(0)
      3. Set θ_{n+1} = Π( θ_n − (a/n) Z_n )

  18. Derivative Estimation, Continued
      Ex: M/M/1 queue
      - Let V_1, ..., V_100 be the 100 generated service times
      - Let X = average of the 100 waiting times (the performance measure)
      - Then
            L(h) = Π_{i=1}^{100} [ (θ + h) e^{−(θ+h) V_i} ] / [ θ e^{−θ V_i} ] = Π_{i=1}^{100} [ (θ + h)/θ ] e^{−h V_i}
            ⇒ L'(0) = Σ_{i=1}^{100} (1/θ − V_i)   (can be computed incrementally)
      - c(x, θ) = aθ + bx ⇒ c'(x, θ) = a
      - Hence Z_n = c'(X_n, θ_n) + c(X_n, θ_n) L'_n(0) = a + (aθ_n + bX_n) Σ_{i=1}^{100} (1/θ_n − V_{n,i})
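
Putting slides 12, 17, and 18 together, here is a sketch of the full Robbins-Monro loop for the M/M/1 example: each step simulates 100 customers via the Lindley recursion, forms L'(0) = Σ(1/θ − V_i), computes Z_n as above, and takes a projected step. The interarrival rate, cost coefficients, gain, feasible interval, and iteration count are assumptions; note the slides use "a" for both the gain and the cost coefficient, so here they are named gain and cost_a.

    import numpy as np

    rng = np.random.default_rng(5)

    lam, cost_a, cost_b = 1.0, 1.0, 5.0       # c(x, theta) = cost_a*theta + cost_b*x
    n_cust = 100

    def simulate_mm1(theta):
        """Return (X, Lprime0): average waiting time of the first n_cust customers
        and L'(0) = sum_i (1/theta - V_i) for the service times used."""
        V = rng.exponential(1.0 / theta, size=n_cust)   # service times
        A = rng.exponential(1.0 / lam, size=n_cust)     # interarrival times
        waits = np.empty(n_cust)
        waits[0] = 0.0
        for i in range(1, n_cust):                      # Lindley recursion
            waits[i] = max(0.0, waits[i - 1] + V[i - 1] - A[i])
        return waits.mean(), np.sum(1.0 / theta - V)

    def rm_step(theta, n, gain=0.05, lo=1.1, hi=5.0):
        X, Lprime0 = simulate_mm1(theta)
        Z = cost_a + (cost_a * theta + cost_b * X) * Lprime0    # Z_n from the slide
        return min(max(theta - (gain / n) * Z, lo), hi)         # projected update

    theta = 2.0
    for n in range(1, 5001):
        theta = rm_step(theta, n)
    print("estimated cost-minimizing service rate:", theta)

As noted on slide 14, convergence of this basic scheme can be slow, so the gain and iteration count may need tuning.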

  19. Derivative Estimation, Continued
      A trick for computing L'(0)
      - The likelihood ratio often has the form L(h) = r_1(h) r_2(h) · · · r_k(h)
      - E.g., for a GSMP, r_i(h) = f_{θ+h}(X; s', e', s, e*) / f_θ(X; s', e', s, e*) or r_i(h) = P_{θ+h}(S_{j+1}; S_j, e*_j) / P_θ(S_{j+1}; S_j, e*_j)
      - Using the product rule and the fact that r_i(0) = 1 for all i:
            d/dh L(h)|_{h=0} = d/dh [ r_1(h) ( r_2(h) · · · r_k(h) ) ]|_{h=0}
                             = [ r_1(h) d/dh ( r_2(h) · · · r_k(h) ) ]|_{h=0} + [ r'_1(h) r_2(h) · · · r_k(h) ]|_{h=0}
                             = d/dh [ r_2(h) · · · r_k(h) ]|_{h=0} + r'_1(0)
      - By induction: L'(0) = r'_1(0) + · · · + r'_k(0) (compute incrementally)
      - For the GSMP example (with f'_θ = ∂f_θ/∂θ):
            r'_i(0) = [ d/dh f_{θ+h}(X; s', e', s, e*)|_{h=0} ] / f_θ(X; s', e', s, e*) = f'_θ(X; s', e', s, e*) / f_θ(X; s', e', s, e*)

  20. Derivative Estimation, Continued
      Trick continued: M/M/1 queue
          L(h) = Π_{i=1}^{100} r_i(h) = Π_{i=1}^{100} f_{θ+h}(V_i) / f_θ(V_i)
          f_θ(v) = θ e^{−θ v}  and  f'_θ(v) = (1 − θ v) e^{−θ v}
          L'(0) = Σ_{i=1}^{100} f'_θ(V_i) / f_θ(V_i) = Σ_{i=1}^{100} (1 − θ V_i) e^{−θ V_i} / (θ e^{−θ V_i}) = Σ_{i=1}^{100} (1/θ − V_i)
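
A quick numerical sanity check (an illustration, not from the slides) that the product-rule result L'(0) = Σ(1/θ − V_i) agrees with a central finite difference of L(h) at h = 0 for the exponential service-time likelihood ratio; θ, the sample size, and the step h are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(6)

    theta, n = 2.0, 100
    V = rng.exponential(1.0 / theta, size=n)           # service times drawn under rate theta

    def L(h):
        # L(h) = prod_i f_{theta+h}(V_i) / f_theta(V_i) for exponential densities
        return np.prod(((theta + h) / theta) * np.exp(-h * V))

    analytic = np.sum(1.0 / theta - V)                 # product-rule formula for L'(0)
    h = 1e-6
    finite_diff = (L(h) - L(-h)) / (2 * h)             # central difference of L at 0
    print(analytic, finite_diff)                       # the two values should agree closely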
