Simulation Output Analysis Bruno Tuffin INRIA Rennes - Bretagne Atlantique PEV: Performance EValuation M2RI - Networks and Systems Track Rennes Bruno Tuffin (INRIA) Output Analysis PEV - 2010 1 / 47
Outline

1. Introduction
2. Very basic tool: Central Limit Theorem
3. Accuracy and Efficiency of an estimator
4. Performance measures
5. Transient Simulations
6. Steady-state Simulations
7. Acceleration techniques
8. Some open problems
Introduction: what should we get from simulation results

Analytical results provide an exact value for a performance measure.
Numerical analysis techniques provide an approximate result and, hopefully, an idea of the error.
Simulation makes use of random numbers:
  ◮ two different simulations give two different results;
  ◮ we need tools that give an idea of the error, and efficient ones exist.
A simulation that just gives a number as its result is disappointing: there is no output analysis.
  ◮ Example: opinion polls in the media.
What we should get: a confidence interval

Assume that we want to estimate a performance measure $\mu$.
Standard numerical analysis provides a deterministic approximation of $\mu$ and (potentially) a strict error bound...
... whereas simulation provides a confidence interval $(A, B)$ with confidence level $1-\alpha$:
  ◮ It means that we can only say that $\mu$ is in $(A, B)$ with probability $1-\alpha$.
  ◮ There is no guarantee that this holds: $\mu$ lies outside the interval with probability $\alpha$ (a $100\alpha\%$ chance).
  ◮ We can increase the confidence level $1-\alpha$, but at the cost of a larger interval width $B-A$.
  ◮ Usually, the more we simulate, the smaller the interval width.
The goal of this course is to explain how to build such intervals.
Basic tool in statistics: Central Limit Theorem (CLT)

Most (if not all) performance measures $\mu$ can be expressed as $\mu = \mathbb{E}[X]$ for some random variable $X$ (e.g. mean throughput, mean delay or mean number of packets at a router...).

Central Limit Theorem: let $X_1, \dots, X_n$ be $n$ i.i.d. copies of $X$ with finite expectation $\mu$ and finite variance $\sigma^2$. The CLT gives the behaviour of the arithmetic average
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$
when $n$ is large. More precisely, the normalized random variable
$$\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma}$$
(normalized so that its mean is 0 and its variance is 1) converges in distribution to $\mathcal{N}(0,1)$ (the Gaussian law with mean 0 and variance 1) as $n \to \infty$.
Confidence interval for $\mu$

From tables provided in textbooks (or on the web), we can get values $z_{1-\alpha/2}$ such that
$$\mathbb{P}\left[\mathcal{N}(0,1) \le z_{1-\alpha/2}\right] = 1 - \alpha/2.$$
Then
$$
\begin{aligned}
1-\alpha &= \mathbb{P}\left[-z_{1-\alpha/2} \le \mathcal{N}(0,1) \le z_{1-\alpha/2}\right] \\
&\approx \mathbb{P}\left[-z_{1-\alpha/2} \le \sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \le z_{1-\alpha/2}\right] \\
&= \mathbb{P}\left[\bar{X}_n - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}_n + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right].
\end{aligned}
$$
This gives for $\mu$ a confidence interval at confidence level $1-\alpha$:
$$\left(\bar{X}_n - z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X}_n + z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right).$$
Standard values:
  ◮ $1-\alpha = 90\%$ gives $z_{1-\alpha/2} = 1.64$,
  ◮ $1-\alpha = 95\%$ gives $z_{1-\alpha/2} = 1.96$,
  ◮ $1-\alpha = 99\%$ gives $z_{1-\alpha/2} = 2.58$.
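Instead of reading the quantiles from a table, they can be computed numerically. The following is a minimal Python sketch, assuming $\sigma$ is known; the function names `normal_quantile` and `ci_known_sigma` are illustrative choices, not part of the course material. It relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist


def normal_quantile(confidence_level):
    """Return z_{1-alpha/2} for the given confidence level 1 - alpha."""
    alpha = 1.0 - confidence_level
    # inv_cdf(p) returns the value z such that P[N(0,1) <= z] = p
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def ci_known_sigma(sample_mean, sigma, n, confidence_level=0.95):
    """CLT-based confidence interval for mu when sigma is known."""
    z = normal_quantile(confidence_level)
    half_width = z * sigma / n ** 0.5
    return sample_mean - half_width, sample_mean + half_width


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(level, round(normal_quantile(level), 3))
```

To three decimals the computed quantiles are 1.645, 1.960 and 2.576, which agree with the rounded table values on the slide.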
Properties and accuracy

The confidence interval width $2\,z_{1-\alpha/2}\,\sigma/\sqrt{n}$ decreases at rate $O(n^{-1/2})$.
This is much better than numerical analysis in high dimension, for which the error is often of order $n^{-k/d}$ (for some $k$) for $d$-dimensional problems: the simulation rate is the faster one as soon as $d > 2k$.
Increasing the sample size $n$ improves the accuracy, but only at this rate $O(n^{-1/2})$.
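A short worked illustration of what this rate means in practice, using only the width formula above (the notation $w(n)$ for the interval width is introduced here for convenience):
$$
w(n) = 2\,z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},
\qquad
\frac{w(4n)}{w(n)} = \frac{1}{\sqrt{4}} = \frac{1}{2},
\qquad
\frac{w(100n)}{w(n)} = \frac{1}{\sqrt{100}} = \frac{1}{10}.
$$
Halving the interval width therefore requires about four times as many replications, and gaining one more decimal digit of accuracy requires about a hundred times as many.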
But $\sigma^2$ is not known in general: how to proceed?

Major issue: the variance $\sigma^2$ is generally unknown.
$\sigma^2$ is estimated by the (unbiased) sample variance
$$
S^2(n) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}_n\right)^2
= \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\left(\bar{X}_n\right)^2\right).
$$
This is easy to implement: during the simulation, we only need two counters, one for $\sum_{i=1}^{n} X_i$ and the other for $\sum_{i=1}^{n} X_i^2$.
This gives for $\mu$ a confidence interval at confidence level $1-\alpha$:
$$\left(\bar{X}_n - z_{1-\alpha/2}\frac{S(n)}{\sqrt{n}},\ \bar{X}_n + z_{1-\alpha/2}\frac{S(n)}{\sqrt{n}}\right).$$
If $n$ is moderate, $z_{1-\alpha/2}$ can be replaced by the corresponding quantile of the Student distribution with $n-1$ degrees of freedom (which converges to the Gaussian quantile quite quickly as $n$ grows). Textbooks provide such values as well.
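The two-counter implementation mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `confidence_interval` is hypothetical, the Gaussian quantile is used (so $n$ is assumed large enough), and a small guard is added against negative values caused by floating-point rounding in the second form of $S^2(n)$.

```python
import math
from statistics import NormalDist


def confidence_interval(sum_x, sum_x2, n, alpha=0.05):
    """Confidence interval for mu from the two counters.

    sum_x  : running sum of the X_i
    sum_x2 : running sum of the X_i squared
    n      : number of replications (n >= 2)
    """
    mean = sum_x / n
    # Sample variance in its "two counters" form
    variance = (sum_x2 - n * mean * mean) / (n - 1)
    variance = max(variance, 0.0)  # guard against rounding errors
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    # Toy data: the counters are updated one observation at a time
    sum_x = sum_x2 = 0.0
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    for x in data:
        sum_x += x
        sum_x2 += x * x
    print(confidence_interval(sum_x, sum_x2, len(data)))
```

One caveat about this design choice: when the mean is large compared with the standard deviation, the difference of two large sums can lose precision, which is why the guard is there.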
Example: simulating the mean number of customers at a fixed time $T$ for an M/M/1 queue

Consider the simulation of an M/M/1 queue up to time $T$, starting from an empty queue at $t = 0$.
The simulation is carried out by discrete-event simulation (see previous course): arrivals and departures are scheduled, and we keep track of the state, given by the number of customers in the queue.
A run is a single simulation up to $T$, and gives $X$: the number of customers in the queue at time $T$.
We generate $n$ i.i.d. copies $X_1, \dots, X_n$ of $X$, and use the above framework.
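The example above can be sketched in Python as follows. This is an illustrative sketch, not the course's reference implementation: the names `mm1_customers_at` and `estimate_mean_customers`, the arrival rate `lam` and the service rate `mu` are assumptions introduced here, and the state is taken to be the number of customers in the system, including the one in service.

```python
import math
import random
from statistics import NormalDist


def mm1_customers_at(T, lam, mu, rng):
    """One run: number of customers at time T, starting from an empty queue."""
    customers = 0
    next_arrival = rng.expovariate(lam)  # first arrival is scheduled
    next_departure = math.inf            # no departure while the queue is empty
    while True:
        t_next = min(next_arrival, next_departure)
        if t_next > T:
            return customers
        if next_arrival <= next_departure:
            # Arrival event
            customers += 1
            if customers == 1:
                next_departure = t_next + rng.expovariate(mu)
            next_arrival = t_next + rng.expovariate(lam)
        else:
            # Departure event
            customers -= 1
            if customers > 0:
                next_departure = t_next + rng.expovariate(mu)
            else:
                next_departure = math.inf


def estimate_mean_customers(n, T, lam, mu, alpha=0.05, seed=12345):
    """Run n independent replications and return (mean, (low, high))."""
    rng = random.Random(seed)
    sum_x = sum_x2 = 0.0
    for _ in range(n):
        x = mm1_customers_at(T, lam, mu, rng)
        sum_x += x
        sum_x2 += x * x
    mean = sum_x / n
    variance = max((sum_x2 - n * mean * mean) / (n - 1), 0.0)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(variance / n)
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    print(estimate_mean_customers(n=2000, T=20.0, lam=0.5, mu=1.0))
```

All replications draw from the same generator, so they use disjoint parts of the random stream and can be treated as independent copies of $X$.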
Total error

The statistical quality of an estimator $\bar{X}_n$ of $\mu$ is measured by the mean squared error
$$\mathrm{MSE}(\bar{X}_n, \mu) = \mathbb{E}\left[(\bar{X}_n - \mu)^2\right] = \mathrm{Bias}^2(\bar{X}_n, \mu) + \sigma^2(\bar{X}_n).$$
Indeed, it may happen that $\mathbb{E}[\bar{X}_n] \neq \mu$, because
1. some good estimators are inherently biased,
2. it is much "cheaper" to sample from a close but not exact distribution,
3. modelling errors...
In many cases though, $\mathrm{Bias}(\bar{X}_n, \mu) = 0$.
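For completeness, the decomposition of the mean squared error can be derived in a few lines. This is the standard argument, written with $\mathrm{Bias}(\bar{X}_n, \mu) = \mathbb{E}[\bar{X}_n] - \mu$:
$$
\begin{aligned}
\mathbb{E}\left[(\bar{X}_n - \mu)^2\right]
&= \mathbb{E}\left[\left(\bar{X}_n - \mathbb{E}[\bar{X}_n] + \mathbb{E}[\bar{X}_n] - \mu\right)^2\right] \\
&= \mathbb{E}\left[\left(\bar{X}_n - \mathbb{E}[\bar{X}_n]\right)^2\right]
 + 2\left(\mathbb{E}[\bar{X}_n] - \mu\right)\mathbb{E}\left[\bar{X}_n - \mathbb{E}[\bar{X}_n]\right]
 + \left(\mathbb{E}[\bar{X}_n] - \mu\right)^2 \\
&= \sigma^2(\bar{X}_n) + \mathrm{Bias}^2(\bar{X}_n, \mu),
\end{aligned}
$$
since the cross term vanishes: $\mathbb{E}\left[\bar{X}_n - \mathbb{E}[\bar{X}_n]\right] = 0$.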
Efficiency

It may appear that variance is THE measure of efficiency of an estimator: the best estimator is the one with the smallest variance (for the same sample size).
However, we also need to consider CPU times.
The question is rather: which of the two estimators $\bar{X}$ and $\bar{X}'$ has the smallest variance for a given computational budget $c$?
If $T$ and $T'$ are the times required to generate one independent replication of $X$ and $X'$ when computing $\bar{X}$ and $\bar{X}'$, the numbers of replications will be respectively $n = c/T$ and $n' = c/T'$, so that the variances of the two estimators are $\sigma^2(X)/n = \sigma^2(X)\,T/c$ and $\sigma^2(X')/n' = \sigma^2(X')\,T'/c$.
Thus the best estimator is $\bar{X}$ if $\sigma^2(X)\,T < \sigma^2(X')\,T'$.
Note that $\sigma^2(X)\,T$ can be interpreted as the variance per unit of time.
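The comparison rule above is simple enough to express in a few lines of Python. The function names and the numerical values in the usage example are hypothetical, chosen only to show that the estimator with the larger variance per replication can still be the more efficient one.

```python
def variance_under_budget(variance, time_per_replication, budget):
    """Variance of the sample mean when the whole budget c is spent.

    The number of replications is n = c / T (treated as a real number here),
    so the variance of the mean is sigma^2 / n = sigma^2 * T / c.
    """
    n = budget / time_per_replication
    return variance / n


def more_efficient(var_a, time_a, var_b, time_b):
    """Return "A" if estimator A has the smaller variance per unit of time.

    Ties are resolved in favour of "B".
    """
    return "A" if var_a * time_a < var_b * time_b else "B"


if __name__ == "__main__":
    # Hypothetical figures: A is noisier per replication but five times faster
    print(more_efficient(var_a=4.0, time_a=1.0, var_b=1.0, time_b=5.0))
    print(variance_under_budget(4.0, 1.0, budget=100.0))
    print(variance_under_budget(1.0, 5.0, budget=100.0))
```

With these figures, estimator A reaches a variance of about 0.04 for a budget of 100 time units, against about 0.05 for B, although a single replication of A has four times the variance.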
Sequential procedure for absolute error criterion

Assume that we want an estimator $\bar{X}_n$ of $\mu$ such that $|\bar{X}_n - \mu| < \varepsilon$ with probability $1-\alpha$.
From the CLT, the half-width of the confidence interval must satisfy
$$\varepsilon \approx z_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}$$
so that we need
$$n \ge \left\lceil \frac{(z_{1-\alpha/2})^2\,\sigma^2}{\varepsilon^2} \right\rceil.$$
Since $\sigma^2$ is unknown and has to be estimated, we proceed as follows:
  ◮ Use a sample of size $n_0$ (typically $n_0 \ge 50$), estimate $\sigma^2$ by $S(n_0)^2$, then compute
    $$n = \left\lceil \frac{(z_{1-\alpha/2})^2\,S(n_0)^2}{\varepsilon^2} \right\rceil.$$
  ◮ Then generate a sample of size $n$, for which the absolute error should be approximately $\varepsilon$.
This can also be performed in a sequential way.
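The two-stage procedure described above can be sketched in Python as follows. This is a minimal illustration: the names `required_sample_size` and `two_stage_estimate` are hypothetical, `sampler` stands for any function returning one independent replication of $X$, and the second-stage sample is generated afresh rather than reusing the pilot sample.

```python
import math
import random
from statistics import NormalDist


def required_sample_size(s2, eps, alpha=0.05):
    """n = ceil(z^2 * S^2 / eps^2) for a target absolute error eps."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(z * z * s2 / (eps * eps))


def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance of a list of values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance


def two_stage_estimate(sampler, eps, alpha=0.05, n0=50):
    """Return (mean, half_width, n) for the two-stage procedure."""
    # Stage 1: pilot sample of size n0, used only to estimate sigma^2
    pilot = [sampler() for _ in range(n0)]
    _, s2 = sample_mean_and_variance(pilot)
    n = max(required_sample_size(s2, eps, alpha), 2)
    # Stage 2: fresh sample of size n
    sample = [sampler() for _ in range(n)]
    mean, variance = sample_mean_and_variance(sample)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return mean, z * math.sqrt(variance / n), n


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy example: X uniform on (0, 1), so mu = 0.5
    print(two_stage_estimate(rng.random, eps=0.02))
```

Because $S(n_0)^2$ is itself random, the half-width obtained in the second stage is only approximately $\varepsilon$, which is one motivation for the fully sequential variant mentioned on the slide.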