

Input Distributions
CS 590M: Simulation, Spring Semester 2020
Peter J. Haas
Reading: Chapter 6 in Law

Outline:
- Overview
- Probability Theory for Choosing Distributions
- Data-Driven Approaches
- Other Approaches to Input Distributions


Overview

To specify a simulation model, we need to define clock-setting "input" sequences. Examples:
- Interarrival sequences
- Processing-time sequences for a production system
- Asset-value sequence for a financial model

Even if we assume i.i.d. sequences for simplicity:
- What type of distribution should we use (gamma, Weibull, normal, exponential, ...)?
- Given a type of probability distribution (i.e., a "distribution family"), what parameter values should we use?

Two approaches: probability theory and historical data.

Theoretical Justification for Normal Random Variables

Suppose that X can be expressed as a sum of random variables: X = Y_1 + Y_2 + ... + Y_n.
- In great generality, versions of the CLT imply that X ∼ N(µ, σ²) (approximately) for large n, where µ = E[X] and σ² = Var[X]
- The Y_i's need not be i.i.d., just not "too dependent" and not "too non-identical"
- Q: Examples where the CLT breaks down?

Moral: If X is the sum of a large number of other random quantities, then X can be approximately modeled as a normal random variable.

Theoretical Justification for Lognormal Random Variables

Suppose that X can be expressed as a product of random variables: X = Z_1 Z_2 ... Z_n.
- Example: Value of a financial asset
- Then log(X) = Y_1 + Y_2 + ... + Y_n, where Y_i = log(Z_i)
- By the prior discussion, log(X) ∼ N(µ, σ²) (approximately) for large n, i.e., X ≈ exp(µ + σ N(0,1)), so that X is approximately lognormally distributed

Moral: If X is the product of a large number of other random quantities, then X can be approximately modeled as a lognormal random variable.

Theoretical Justification for Poisson Arrival Process

Suppose that the arrival process is a superposition of arrivals from a variety of statistically independent sources. (Figure: independent sources 1, 2, ..., n all feeding a single queue.)
- The Palm-Khintchine Theorem says that the superposition of n i.i.d. sources looks approximately Poisson as n becomes large
- The i.i.d. assumption can be relaxed

Moral: A Poisson process is often a reasonable model of arrival processes. (But beware of long-range dependence.)

Theoretical Justification for Weibull Distribution

Suppose that X can be expressed as a minimum of nonnegative random variables: X = min_{1 ≤ i ≤ n} Y_i.
- Example: Lifetime of a complex system
- Extreme-value theory (Gnedenko's Theorem) says that if the Y_i's are i.i.d., then X has approximately a Weibull distribution when n is large: P(X ≤ x) = 1 − e^{−(λx)^α}
- α is the shape parameter and λ is the scale parameter

Moral: If X is the lifetime of a complicated component, then X can be approximately modeled as a Weibull random variable.
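Two of these limit results are easy to check by simulation: the Palm-Khintchine superposition effect and the Weibull limit for minima. The sketch below is not from the slides; it uses NumPy and SciPy, and the uniform interarrival distribution, gamma component lifetimes, and the source/sample counts are arbitrary illustration choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Palm-Khintchine: superpose n independent renewal streams with decidedly
# non-exponential (uniform) interarrival times. The merged stream's interarrival
# times should look roughly exponential, i.e., squared coefficient of variation
# (SCV) close to 1.
n_sources, events_per_source = 50, 200
arrival_times = np.concatenate([
    np.cumsum(rng.uniform(0.5, 1.5, size=events_per_source))  # one renewal stream
    for _ in range(n_sources)
])
merged_gaps = np.diff(np.sort(arrival_times))
scv = merged_gaps.var() / merged_gaps.mean() ** 2
print(f"SCV of superposed interarrival times: {scv:.2f} (exponential gives 1.0)")

# Gnedenko / extreme-value idea: the minimum of many i.i.d. nonnegative lifetimes
# is approximately Weibull. Simulate system lifetimes X = min_i Y_i with
# gamma-distributed component lifetimes Y_i and fit a Weibull to the results.
component_lifetimes = rng.gamma(shape=2.0, scale=1.0, size=(100_000, 25))
system_lifetimes = component_lifetimes.min(axis=1)
alpha_hat, _, scale_hat = stats.weibull_min.fit(system_lifetimes, floc=0)
print(f"fitted Weibull shape {alpha_hat:.2f}, scale {scale_hat:.3f}")
```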

Controlling the Variability

Squared coefficient of variation: ρ²(X) = Var(X) / (E[X])²

Case 1: X ∼ exp(λ): ρ²(X) = 1

Case 2: X ∼ gamma(λ, α), with density f_X(x) = λ e^{−λx} (λx)^{α−1} / Γ(α):
ρ²(X) = (α/λ²) / (α/λ)² = 1/α

Three scenarios to play with:
- α > 1: less variable than exponential
- α = 1: exponential
- α < 1: more variable than exponential

Goodness-of-Fit Software

GoF software applies a set of goodness-of-fit tests to select the distribution family with the "best fit" to the data:
- chi-square, Kolmogorov-Smirnov, Epps-Singleton, Anderson-Darling, ...

GoF software must be used with caution:
- Low power: poor discrimination between different distributions
- Sequential testing: test properties are mathematically ill-defined
- Discourages sensitivity analysis: unwary users stop with the "best" fit
- Can obscure non-i.i.d. features: e.g., trends, autocorrelation
- Fails on big data: all tests fail on real-world datasets
- Over-reliance on summary statistics: should also plot the data
  - Ex: Q-Q plots [Law, pp. 339-344] better indicate departures from a candidate distribution (lower/upper, heavy/light tails)

Feature Matching to Data

Pragmatic approach: match key features in the data.
- Ex: Use a gamma distribution and match the first two empirical moments
  - Hence match the empirical coefficient of variation (see below)
  - Note: nothing about "process physics" implies gamma; we use it for its convenience and flexibility in modeling a range of variability
- Can use even more flexible distribution families
  - Ex: Johnson translation system (4 parameters)
  - Ex: Generalized lambda distribution (4 parameters)
  - Useful when extreme values are not important to the simulation [Nelson (2013), pp. 113-116]
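As a concrete illustration of the feature-matching idea, the sketch below (not from the slides) picks gamma parameters that match the first two sample moments, which also matches the empirical squared coefficient of variation, and then draws a Q-Q plot against the fitted distribution. The service_times array is made-up illustration data, and the Q-Q plot helper and gamma parameterization come from SciPy.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

service_times = np.array([3.0, 1.0, 4.0, 3.0, 8.0, 2.5, 5.5, 1.5, 6.0, 2.0])

xbar = service_times.mean()
s2 = service_times.var(ddof=1)
scv = s2 / xbar**2                       # squared coefficient of variation

# Match E[X] = alpha/lambda and Var[X] = alpha/lambda^2 (same algebra as the
# method-of-moments slide): alpha = xbar^2/s2 = 1/SCV and lambda = xbar/s2.
alpha_hat = xbar**2 / s2
lambda_hat = xbar / s2
print(f"SCV = {scv:.2f} -> gamma(alpha = {alpha_hat:.2f}, lambda = {lambda_hat:.2f})")

# Q-Q plot against the fitted gamma (SciPy uses shape, loc, scale = 1/lambda).
stats.probplot(service_times, dist=stats.gamma,
               sparams=(alpha_hat, 0, 1 / lambda_hat), plot=plt)
plt.title("Q-Q plot vs. moment-matched gamma")
plt.show()
```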

Estimating Parameters: Maximum Likelihood Method

Ex: Poisson arrival process to a queue
- Exponential interarrivals: 3.0, 1.0, 4.0, 3.0, 8.0
- Goal: estimate λ
- For a continuous dist'n, the likelihood of an observation is its pdf value, so observation x_i has likelihood λ e^{−λ x_i}
- Joint likelihood: L(λ) = λ^5 e^{−19λ} (the observations sum to 19)
- Joint log-likelihood: L̃(λ) = 5 log λ − 19λ
- Estimate: λ̂ = 5/19 ≈ 0.26
- Q: Why is this a reasonable estimator?

Maximum Likelihood, Continued

Ex: Geometric distribution
- Number of failures before the first success in Bernoulli trials, success probability = p
- P{X = k} = (1 − p)^k p for k ≥ 0 [Law, Problem 6.26]
- How to estimate p given four observations of X? X = 3, 5, 2, 8
- Given a value of p, the likelihood of an observation k is (1 − p)^k p
- Joint likelihood for all four observations: L(p) = p^4 (1 − p)^{18}
- Choose the estimator p̂ to maximize L(p) ("best explains the data")
- Equivalently, maximize L̃(p) = log L(p) = 4 log p + 18 log(1 − p)

Maximum Likelihood, Continued

Maximizing the likelihood function:
- Simple case: solve ∂L̃_n/∂α_i (α̂_1, ..., α̂_k) = 0 for i = 1, 2, ..., k
- Harder cases: maximum occurs on a boundary, constraints (Kuhn-Tucker conditions)
- Hardest cases (typical in practice): solve numerically

Why bother?
- MLEs maximize asymptotic statistical efficiency
- For large n, MLEs "squeeze maximal info from the sample" (i.e., smallest variance ⇒ narrowest confidence intervals)

Maximum Likelihood, Continued

General setup (continuous i.i.d. case):
- Given X_1, ..., X_n i.i.d. samples from pdf f(·; α_1, ..., α_k)
- The MLEs α̂_1, ..., α̂_k maximize the likelihood function
  L_n(α_1, ..., α_k) = ∏_{i=1}^n f(X_i; α_1, ..., α_k)
  or, equivalently, the log-likelihood function
  L̃_n(α_1, ..., α_k) = ∑_{i=1}^n log f(X_i; α_1, ..., α_k)
- For the discrete case, use the pmf instead of the pdf
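A quick numerical check of the two worked examples, not part of the slides: maximize each log-likelihood with SciPy and compare against the closed-form answers n/Σx_i (exponential) and n/(n + Σk_i) (geometric). The data vectors are the ones given on the slides; the search bounds are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Exponential interarrival times from the slide.
x = np.array([3.0, 1.0, 4.0, 3.0, 8.0])

def neg_loglik_exp(lam):
    # L~(lambda) = n*log(lambda) - lambda*sum(x); minimize the negative.
    return -(len(x) * np.log(lam) - lam * x.sum())

res = minimize_scalar(neg_loglik_exp, bounds=(1e-6, 10.0), method="bounded")
print(f"numerical lambda_hat = {res.x:.4f}, closed form n/sum(x) = {len(x) / x.sum():.4f}")

# Geometric observations (failures before first success) from the slide.
k = np.array([3, 5, 2, 8])

def neg_loglik_geom(p):
    # L~(p) = n*log(p) + sum(k)*log(1 - p); minimize the negative.
    return -(len(k) * np.log(p) + k.sum() * np.log(1.0 - p))

res = minimize_scalar(neg_loglik_geom, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(f"numerical p_hat = {res.x:.4f}, closed form n/(n+sum(k)) = {len(k) / (len(k) + k.sum()):.4f}")
```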

Estimating Parameters: Method of Moments

Idea: equate k sample moments to k true moments and solve.

Example: Exponential distribution
- k = 1 and E[X] = 1/λ
- So equate first moments: X̄_n = 1/λ̂ ⇒ λ̂ = 1/X̄_n

Example: Gamma distribution
- k = 2, E[X] = α/λ, and Var[X] = α/λ²
- Equate moments: X̄_n = α̂/λ̂ and s²_n = α̂/λ̂² ⇒ α̂ = X̄²_n / s²_n and λ̂ = X̄_n / s²_n

Can match other statistics, e.g., quantiles [Nelson 2013, p. 115].

Bayesian Parameter Estimation

View the unknown parameter θ as a random variable with prior distribution f(θ).
- Encapsulates prior knowledge about θ before seeing data
- Ex: If we know that θ ∈ [5, 10], set f = Uniform(5, 10)

After seeing data Y, use Bayes' Rule to compute the posterior
  f(θ | Y) = f(Y | θ) f(θ) / c,
where f(Y | θ) is the likelihood of Y under θ and c is a normalizing constant.

Estimate θ by the mean (or mode) of the posterior.
- f(θ | Y) is usually very complex
- Use Markov chain Monte Carlo (MCMC) to estimate the mean (see HW 2)

Bayesian Parameter Estimation, Continued

Simple example (see refresher handout):
- Goal: estimate the success probability θ for a Bernoulli distribution
- Assume a Beta(α, β) prior on θ
- Observe n Bernoulli trials with Y successes
- Y | θ has a Binomial(n, θ) likelihood distribution
- Posterior, given Y = y, is Beta(α + y, β + n − y) (Beta is the "conjugate prior" to the binomial)
- θ̂ = mean of posterior = (α + y) / (α + β + n)

Note: Maximum likelihood is a special case of Bayesian estimation (with a uniform prior and MAP estimation).
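For the conjugate Beta-Binomial example, here is a small sketch (not from the handout or HW 2) that builds the posterior and compares its mean and mode with the MLE y/n. The prior hyperparameters and the data (n, y) are arbitrary illustration values.

```python
import numpy as np
from scipy import stats

alpha0, beta0 = 2.0, 2.0        # Beta(alpha, beta) prior on theta (assumed values)
n, y = 20, 6                    # n Bernoulli trials with y successes (made-up data)

# Conjugacy: posterior for theta given Y = y is Beta(alpha + y, beta + n - y).
posterior = stats.beta(alpha0 + y, beta0 + n - y)

post_mean = (alpha0 + y) / (alpha0 + beta0 + n)
post_mode = (alpha0 + y - 1) / (alpha0 + beta0 + n - 2)

print(f"posterior mean = {post_mean:.3f}")
print(f"posterior mode = {post_mode:.3f}")
print(f"MLE y/n        = {y / n:.3f}")
print("95% credible interval:", posterior.interval(0.95))
```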
