CS626 Data Analysis and Simulation
Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
Office hours: Monday, Wednesday 2-4 pm
Today: Stochastic Input Modeling
References: Law/Kelton, Simulation Modeling and Analysis, Ch. 6; NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/
What is input modeling?
Input modeling: deriving a representation of the uncertainty or randomness in a stochastic simulation.
Common representations:
• Measurement data
• Distributions derived from measurement data <-- the focus of "input modeling"; this usually requires that the samples are i.i.d. (independent and identically distributed) and that the corresponding random variables in the simulation model are i.i.d.
  - theoretical distributions
  - empirical distributions
• Time-dependent stochastic processes
• Other stochastic processes
Examples: time to failure for a machining process; demand per unit time for inventory of a product; number of defective items in a shipment of goods; times between arrivals of calls to a call center.
Overview of fitting with data
1. Check whether key assumptions hold (i.i.d.).
2. Select one or more candidate distributions, based on physical characteristics of the process and graphical examination of the data.
3. Fit the distribution to the data: determine values for its unknown parameters.
4. Check the fit to the data via statistical tests and via graphical analysis.
5. If the distribution does not fit, select another candidate and repeat the process, or use an empirical distribution (see the sketch below).
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
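A minimal sketch of steps 3-5 in Python with SciPy; the file name service_times.txt and the gamma candidate are assumptions for illustration, not part of the slides:

```python
import numpy as np
from scipy import stats

# "service_times.txt" is a hypothetical file of i.i.d. measurements.
data = np.loadtxt("service_times.txt")

# Step 3: fit a candidate distribution (here: gamma) by maximum likelihood.
a, loc, scale = stats.gamma.fit(data, floc=0)

# Step 4: check the fit, e.g. with a K-S test.  Note: estimating the
# parameters from the same data makes the standard K-S p-value optimistic.
stat, pvalue = stats.kstest(data, "gamma", args=(a, loc, scale))
print(f"D = {stat:.4f}, p-value = {pvalue:.4f}")

# Step 5: if the fit is rejected, try another candidate or fall back
# to an empirical distribution.
```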
Check the fit to the data
Graphical analysis: plot the fitted distribution and the data so that differences can be recognized; beyond obvious cases there is a grey area of subjective acceptance/rejection.
Challenges:
• How much difference is significant enough to discard a fitted distribution?
• Which graphical representation is easy to judge?
Options: histogram-based plots; probability plots: P-P plot, Q-Q plot (see the Q-Q plot sketch below).
Statistical tests: define a measure X for the difference between the fitted distribution and the data. X is a random variable, so if we can argue what distribution X has, we obtain a statistical test of whether, in a concrete case, a value of X is significant.
Goodness-of-fit tests: chi-square test (χ²), Kolmogorov-Smirnov test (K-S), Anderson-Darling test (A-D).
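A possible Q-Q plot implementation, reusing the hypothetical data file and gamma candidate from the sketch above:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

data = np.sort(np.loadtxt("service_times.txt"))   # hypothetical data file
n = len(data)

# Fit the candidate and evaluate its quantiles at standard plotting positions.
a, loc, scale = stats.gamma.fit(data, floc=0)
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical = stats.gamma.ppf(probs, a, loc=loc, scale=scale)

# Points close to the 45-degree line indicate a good fit.
plt.plot(theoretical, data, "o", markersize=3)
plt.plot(theoretical, theoretical, "r--")
plt.xlabel("fitted gamma quantiles")
plt.ylabel("sample quantiles")
plt.show()
```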
Check the fit to the data: statistical tests
Define a measure X for the difference between the fitted distribution and the data.
The test statistic X is a random variable: a small X means a small difference, a large X a huge difference. If we can argue what distribution X has, we get a statistical test of whether, in a concrete case, a value of X is significant.
Say P(X ≤ x) = 1 − α, and suppose this holds for x = 10 and α = 0.05. Then we know that if data is sampled from the given distribution, and this is done n times (n → ∞), the measure X will be below 10 in 95% of those cases. If in our case the sample data yields x = 10.7, we can argue that it is too unlikely that the sample data is from the fitted distribution.
Concepts, terminology:
• Hypothesis H0, alternative H1
• Power of a test: 1 − β, the probability of rejecting H0 when it is false
• α / Type I error: rejecting a true hypothesis
• β / Type II error: not rejecting a false hypothesis
• P-value: the probability of observing a result at least as extreme as the test statistic, assuming H0 is true
These quantities are easy to compute; see the sketch below.
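In the sketch below, the chi-square distribution for X and df = 4 are assumptions, chosen only so the critical value lands near the slide's illustrative x ≈ 10:

```python
from scipy import stats

df, alpha = 4, 0.05                      # assumed df and significance level

x_crit = stats.chi2.ppf(1 - alpha, df)   # x with P(X <= x) = 1 - alpha (~9.49)
x_obs = 10.7                             # observed statistic from the slide
p_value = stats.chi2.sf(x_obs, df)       # 1 - F(x_obs), here ~0.03

# Decision rule: reject H0 iff the observed statistic exceeds the critical
# value, equivalently iff the p-value falls below alpha.
print(x_obs > x_crit, p_value < alpha)   # True, True -> reject H0
```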
Sample test characteristic for the chi-square test (all parameters known)
[Figure: density of the test statistic for a one-sided test. Right side: critical region (region of rejection). Left side: region of acceptance, where we fail to reject the hypothesis. The p-value of an observed x is 1 − F(x).]
Tests and p-values
In the typical test:
• H0: the chosen distribution fits
• H1: the chosen distribution does not fit
The p-value of a test is:
• the probability of observing a result at least as extreme as the test statistic, assuming H0 is true (hence 1 − F(x) on the previous slide)
• the Type I error level (significance) at which we would just reject H0 for the given data.
Implications:
• If the α level (common values: 0.01, 0.05, 0.1) is less than the p-value, we do not reject H0; otherwise, we reject H0.
• If the p-value is large (> 0.10), then values more extreme than the current one are still reasonably likely, so we fail to reject H0; in this sense it supports H0 that the distribution fits (but not more than that!).
Chi-Square Test
Histogram-based test: sums the squared differences between observed and expected frequencies,
χ² = Σi (Oi − Ei)² / Ei,
with observed frequency Oi and expected frequency Ei = n·pi, where pi is the theoretical probability of the i-th interval.
Chi-Square Test
Arrange the n observations into k cells; the test statistic
χ0² = Σ(i=1..k) (Oi − Ei)² / Ei
approximately follows the chi-square distribution with k − s − 1 degrees of freedom, where s = number of parameters of the hypothesized distribution estimated from the sample statistics.
Valid only for a large sample size: each cell should have at least 5 observations for both Oi and Ei.
The result of the test depends on the grouping of the data.
Example: number of vehicles arriving at an intersection between 7:00 and 7:05 am on 100 random workdays:
Arrivals per period | Frequency
0  | 12
1  | 10
2  | 19
3  | 17
4  | 10
5  | 8
6  | 7
7  | 5
8  | 5
9  | 3
10 | 3
11 | 1
Chi-Square Test
Example continued: sample mean 3.64.
H0: the data are Poisson distributed with mean 3.64.
H1: the data are not Poisson distributed with mean 3.64.
Under H0, p(x) = e^(−3.64) · 3.64^x / x!, and the expected frequencies are Ei = 100 · p(i).
Cells with expected frequency below 5 are combined (here: cells 0 and 1, and all cells ≥ 7), leaving k = 7 cells over the n = 100 observations.
The test statistic is χ0² = 27.68 with k − s − 1 = 7 − 1 − 1 = 5 degrees of freedom, and so the p-value is 0.00004. What is your conclusion?
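A sketch reproducing this example with SciPy; it uses exact Poisson probabilities, so the statistic comes out near the slide's hand-rounded 27.68:

```python
import numpy as np
from scipy import stats

observed = np.array([12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1])  # arrivals 0..11
n = observed.sum()                            # 100 workdays
mean = (np.arange(12) * observed).sum() / n   # 3.64

# Expected counts under Poisson(3.64); lump the tail into the last cell.
p = stats.poisson.pmf(np.arange(12), mean)
p[-1] = 1 - p[:-1].sum()                      # cell 11 stands for ">= 11"
expected = n * p

# Combine cells so every expected count is at least 5:
# {0,1}, 2, 3, 4, 5, 6, {>=7}  ->  k = 7 cells.
O = np.array([observed[:2].sum(), *observed[2:7], observed[7:].sum()])
E = np.array([expected[:2].sum(), *expected[2:7], expected[7:].sum()])

chi2 = ((O - E) ** 2 / E).sum()               # ~27.7 (up to rounding)
df = len(O) - 1 - 1                           # k - s - 1, s = 1 (estimated mean)
print(f"chi2 = {chi2:.2f}, p-value = {stats.chi2.sf(chi2, df):.6f}")
# p-value ~ 4e-5 -> reject H0: the data are not Poisson with mean 3.64.
```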
Chi-Square Test
What if m parameters are estimated by MLEs? The chi-square distribution loses m degrees of freedom (df).
Goodness-of-fit tests
Chi-square test — features:
• A formal comparison of a histogram or line graph with the fitted density or mass function.
• Sensitive to how we group the data.
K-S and A-D tests — features:
• Comparison of an empirical distribution function with the distribution function of the hypothesized distribution.
• Do not depend on the grouping of the data.
• A-D detects discrepancies in the tails and has higher power than the K-S test.
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
Kolmogorov-Smirnov Test
[Figure: CDF of the empirical distribution constructed from the data, overlaid with the CDF of the hypothesized distribution; the K-S test detects the maximum difference between the two.]
The K-S test looks at the maximum difference between the empirical and the hypothesized CDF, and is useful even when the sample size is small.
If we have n observations X1, X2, ..., Xn, then Fn(x) = (number of X1, ..., Xn that are ≤ x) / n.
A sketch of the computation follows below.
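In this sketch, the data file samples.txt and the exponential hypothesis are assumptions for illustration:

```python
import numpy as np
from scipy import stats

x = np.sort(np.loadtxt("samples.txt"))    # hypothetical data file
n = len(x)

# Hypothesized CDF F(x) -- an exponential with scale 10 is assumed here.
F = stats.expon(scale=10.0).cdf

# D = max(D+, D-), evaluated at the jumps of the empirical CDF Fn.
d_plus = np.max(np.arange(1, n + 1) / n - F(x))
d_minus = np.max(F(x) - np.arange(n) / n)
D = max(d_plus, d_minus)

# SciPy computes the same statistic and a p-value in one call.
D_scipy, p = stats.kstest(x, F)
print(D, D_scipy, p)
```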
K-S Test
The geometric meaning of the test statistic is sometimes a bit tricky; for the details, see Law/Kelton, Chap. 6.
Anderson-Darling test (A-D test)
The test statistic is a weighted average of the squared differences, with weights chosen to be largest where F(x) is close to 0 and 1 (i.e., in the tails).
Use modified critical values for the adjusted A-D test statistic; reject H0 if An² exceeds the critical value (see the SciPy sketch below).
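SciPy implements the A-D test for a few distribution families and reports the statistic together with critical values per significance level, rather than a p-value; the normal data below is a stand-in:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=200)    # stand-in data (assumption)

# stats.anderson supports "norm", "expon", and a few other families.
result = stats.anderson(x, dist="norm")
print("A2 =", result.statistic)
for crit, sig in zip(result.critical_values, result.significance_level):
    verdict = "reject" if result.statistic > crit else "fail to reject"
    print(f"alpha = {sig}%: critical value {crit} -> {verdict} H0")
```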
Goodness-of-fit tests (revisited)
Chi-square test — features:
• A formal comparison of a histogram or line graph with the fitted density or mass function.
• Sensitive to how we group the data.
K-S and A-D tests — features:
• Comparison of an empirical distribution function with the distribution function of the hypothesized distribution.
• Do not depend on the grouping of the data.
• A-D detects discrepancies in the tails and has higher power than the K-S test.
• Beware of goodness-of-fit tests: they are unlikely to reject any distribution when you have little data, and are likely to reject every distribution when you have lots of data (illustrated below).
from WSC 2010 Tutorial by Biller and Gunes, CMU, slides used with permission
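A quick illustration of this caveat, under the assumption of gamma-distributed data tested against a slightly wrong exponential hypothesis: with small n the misfit typically goes undetected, while a very large n rejects it decisively.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
for n in (25, 100_000):
    data = rng.gamma(shape=1.3, scale=1.0, size=n)   # true model: gamma
    # Hypothesis: exponential with the sample mean as scale (slightly wrong;
    # estimating the scale from the data also makes the p-value approximate).
    stat, p = stats.kstest(data, "expon", args=(0, data.mean()))
    print(f"n = {n}: K-S p-value = {p:.4f}")
```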