Proceedings of the 2004 Winter Simulation Conference
R. G. Ingalls, M. D. Rossetti, J. S. Smith, and B. A. Peters, eds.

CONFIDENCE INTERVAL ESTIMATION IN HEAVY-TAILED QUEUES USING CONTROL VARIATES AND BOOTSTRAP

Pablo Jesús Argibay-Losada
Andrés Suárez-González
Cándido López-García
Raúl Fernando Rodríguez-Rubio
José Carlos López-Ardao

Departamento de Enxeñería Telemática
ETSE de Telecomunicación
Universidade de Vigo
36200 Vigo, SPAIN

ABSTRACT

The heavy-tailed condition of a random variable can cause difficulties in estimating parameters and their confidence intervals from simulations, especially if the variance of the random variable under study is infinite. Standard methods for obtaining confidence intervals typically give inaccurate results under such circumstances. To address this problem, and to contribute to the search for accurate confidence interval estimation methods in such cases, in this paper we propose the use of a control variate method combined with a bootstrap-based confidence interval computation. The control variate approach is doubly interesting for addressing the problem of infinite variance. We tested this approach in an M/P/1 queue system with infinite variance in the queue waiting time and obtained quite accurate results.

1 INTRODUCTION

Heavy-tailed distributions and distributions with infinite variance play an important role in the modeling of several variables in communication networks. The literature offers good references relating these special characteristics [2] to several magnitudes, such as the size of the files downloaded from HTTP or FTP servers [3] [4], the duration of sessions [5], or even certain characteristics exhibited by human-computer interactions [6] [7]. In fact, in [8] Paxson shows that the presence of heavy-tailed distributions is an invariant in the Internet.

Moreover, network engineering is not the only important field where heavy-tailed distributions have considerable practical relevance: many financial tasks also use them in models of financial and insurance risks [9].

It should therefore be clear how important it is to consider this kind of random variable in simulation, as simulation is one of the most powerful tools for performance studies in such engineering and economic areas. But the use of random variables with those characteristics leads to important problems when trying to analyze the results of simulations.

In an M/P/1 queue system, Gross et al. [10] describe problems regarding the estimation of the mean queue waiting time. Fischer et al. [11], Chen [12] and Sees and Shortle [13] study the estimation of quantiles in the presence of the heavy-tail condition.

Argibay et al. [14] study the use of a control variate (CV) to help in the estimation of the mean queue waiting time of the M/P/1, improving both the estimated mean and its confidence intervals (CIs) when the coefficient of the CV method is calculated beforehand from classical queueing theory.

Our objective is to find an accurate method to estimate the confidence intervals for the mean queue waiting time when it is affected by the heavy-tailed behavior of the service time, while keeping in mind its usefulness in a more generic scenario (G/P/1). In this paper we extend the work in [14], now calculating the coefficient of the CV method from the simulation data itself and combining it with bootstrap-based confidence interval estimation techniques. Nevertheless, we use the M/P/1 queue system in order to validate our results and show that our method achieves accurate confidence intervals for the mean queue waiting time.

In Section 2 we describe mathematically the queueing system under study, the M/P/1. In Section 3 we describe the problems related to the accuracy of estimators of CIs for the mean queue waiting time in that queue. In Section 4 we present the approach we propose to obtain better results by means of a control variate that helps us with the problem of infinite variance of the estimator. In Section 5 we describe the method we use for the construction of confidence intervals, based on bootstrap percentiles, and in Section 6 we describe the results of applying that method to the M/P/1 queue, achieving confidence intervals with good coverage. Finally, in Section 7 we present some conclusions and further work.

2 THE M/P/1 QUEUE

The M/P/1 is the queue system we are going to work with. Customers arrive according to a Poisson process and demand independent and identically distributed (iid) service times which follow a Pareto distribution. The queue discipline is "first come first served" and the queue capacity is infinite. We will work with the stochastic process of the consecutive customers' waiting times, $W = \{W_j;\ j = 1, 2, \ldots\}$.

Since the M/P/1 is a special case of the M/G/1, we can use the Pollaczek-Khinchin formula, which gives us the mean queue waiting time:

\[ \overline{W} = \frac{\lambda \cdot \overline{S^2}}{2 \cdot (1 - \rho)} \qquad (1) \]

where $S$ is the demanded service time random variable (with $\overline{S^2}$ its second moment), $\lambda$ is the mean arrival rate of the Poissonian arrival process, and $\rho$ is the utilization factor of the system [18].

We are interested in those systems where the service time is a Pareto RV. The cumulative distribution function (cdf) of the Pareto is given by:

\[ F(x) = 1 - \left(\frac{m}{x}\right)^{a} \qquad \forall x \ge m > 0 \]

In [10] a Pareto distribution (with $m = 1$ and shifted to 0) is used in an M/P/1 to illustrate the problems of simulating such a system when $a$ is near 2. Specifically, when $a$ is in $(2, 3)$ the variance of $W$ is infinite. In this paper we also fix $m$ to 1 to show the benefits of our proposed method in a similar scenario. We will also fix $\rho$ to 0.5 to minimize the effects of the transient state in the simulations.

In distributions that are not heavy-tailed, like the exponential or the normal ones, the probability that the random variable takes a very large value is so negligible that, if we do not consider those values when calculating some moments of the distribution, we will still get a pretty good estimate of them. This can be the case for the mean, the second moment and, as a consequence, the variance.

By contrast, in heavy-tailed distributions, of which the Pareto distribution is a particular example, the probability of such large values, although still small, is enough to give them a great influence on some important parameters of the distribution. If those parameters are estimated through simulation, a problem arises: this non-negligible probability is at the same time too small for such large values to be likely to appear even in a long simulation, so the lack of those unlikely samples can drastically affect the results.

This effect, and its implications for the simulation of queueing systems, will be discussed in the next section.

3 PROBLEMS WHEN SIMULATING THE M/P/1

We want to estimate a confidence interval for the mean queue waiting time of the M/P/1.

The classical theory of construction of CIs assumes independent and identically distributed samples from a distribution with finite mean and variance. But if in the M/P/1 the shape parameter of the Pareto, $a$, is smaller than 3, the variance of $W$ will be infinite. This implies that we cannot use the central limit theorem to give a CI for $\overline{W}$. Instead, the infinite variance implies that the sample mean, appropriately normalized, will tend to a stable distribution. To show it, we note that the two conditions that $W$ must satisfy to be in the domain of attraction of a stable law are [23]:

1. \[ \frac{1 - F_W(x)}{1 - F_W(x) + F_W(-x)} \to p, \qquad \frac{F_W(-x)}{1 - F_W(x) + F_W(-x)} \to q \]
   where $F_W(x)$ denotes the cdf of $W$. In our case, $W$ is nonnegative, so we meet this condition with $p = 1$, $q = 0$.

2. \[ 1 - F_W(x) + F_W(-x) \sim \frac{2 - \alpha}{\alpha}\, x^{-\alpha} L(x) \]
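As an illustration of the setting described in Sections 2 and 3, the following minimal Python sketch evaluates the Pollaczek-Khinchin formula (1) for Pareto service times and estimates the mean queue waiting time from a simulated M/P/1 queue. It is only a sketch under stated assumptions, not the authors' simulator: the function names are hypothetical, the Pareto is used unshifted exactly as in the cdf given in Section 2, service times are drawn by inverting that cdf, and waiting times are generated with the standard Lindley recursion $W_{j+1} = \max(0, W_j + S_j - T_{j+1})$, where $T_{j+1}$ is an exponential interarrival time.

```python
import random


def pareto_moment(a, m, k):
    """k-th raw moment of a Pareto with shape a and scale m (infinite if k >= a)."""
    return float("inf") if k >= a else a * m**k / (a - k)


def pk_mean_wait(lam, a, m=1.0):
    """Mean queue waiting time from the Pollaczek-Khinchin formula (1)."""
    rho = lam * pareto_moment(a, m, 1)  # utilization = arrival rate * mean service time
    if rho >= 1.0:
        raise ValueError("unstable queue: rho >= 1")
    return lam * pareto_moment(a, m, 2) / (2.0 * (1.0 - rho))


def sample_pareto(rng, a, m=1.0):
    """Inverse-transform sample from F(x) = 1 - (m/x)^a, x >= m."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so the power below is well defined
    return m * u ** (-1.0 / a)


def simulate_mean_wait(lam, a, n, m=1.0, seed=0):
    """Sample mean of n consecutive waiting times (assumed Lindley recursion)."""
    rng = random.Random(seed)
    w = 0.0      # the first customer finds an empty system
    total = 0.0
    for _ in range(n):
        total += w
        s = sample_pareto(rng, a, m)   # service time of the current customer
        t = rng.expovariate(lam)       # interarrival time to the next customer
        w = max(0.0, w + s - t)        # waiting time of the next customer
    return total / n


if __name__ == "__main__":
    a, lam = 2.5, 0.3                  # with m = 1 this gives rho = 0.5
    print("formula (1):", pk_mean_wait(lam, a))
    print("simulated  :", simulate_mean_wait(lam, a, n=100_000))
```

With $a = 2.5$, $m = 1$ and $\lambda = 0.3$, the mean service time is $5/3$, so $\rho = 0.5$, the second moment of $S$ is 5, and formula (1) gives a mean waiting time of 1.5. Since $a$ lies in $(2, 3)$, the variance of $W$ is infinite, so the simulated sample mean may differ noticeably from this value and may change substantially between seeds even for long runs, which is the difficulty discussed in Section 3.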