CS626 Data Analysis and Simulation
Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu
Today: Recap before midterm
Big Picture: Model-based Analysis of Systems
[Figure: the modeling cycle. A portion/facet of the real world is perceived and described as a real-world problem; the description is transformed into a formal model (probability model, stochastic process, rewards); qualitative and quantitative properties are analyzed (formal / computer-aided); the model solution is presented and transferred back as a decision, i.e., a solution to the real-world problem.]
Reminder
This is not a pipe! ... and this is not a serpentine accumulator in a production line!
(An image, like a model, is not the thing itself.)
System - Model - Study
Model vs System
- Model: a largely simplified formal/mathematical/stochastic model, implemented in software, in a fully controlled environment
- System: a set of physical devices interacting in space-time in a largely uncontrolled, not fully understood environment
Model
- includes some of the rules of how the system operates, excludes others
- includes some aspects of the real world as random variables, ignores others or assumes them constant
- is parameterized with respect to certain design variables
Study
- has an objective, a clear question
- delivers values that are probabilities, like R(0,t) (interpretation?)
- evaluates effects of different design choices
CS 626 Topics
From Data to Stochastic Input Models
- Input Modeling: Probability, Distributions
- Exploratory Data Analysis, Statistical Tests
- Stochastic Processes, Markov Processes: DTMC, CTMC
- Phase-type Distributions, MAPs, MAP Fitting
- Tools: R for data analysis, the KPC toolbox for MAP fitting
Simulation Modeling
- Simulation Output Data Analysis
- Verification, Validation, Trace-driven Simulation
- Debugging of Simulation Models
- Tools for simulation: Mobius (+ Traviando)
Applications
- Reliability Analysis, Dependability Modeling of a LEO Satellite
- Modeling Traffic in Computer Networks
- Emulation: Testing, Debugging, Training in Automated Material Handling Systems
From Data to Stochastic Input Models
Probability
- Axiomatic Definition
- Frequentist Definition
Frequency Definition of Probability
If our experiment is repeated over and over again, then the proportion of time that event E occurs will be P(E).
Frequency Definition of Probability:
P(E) = lim_{m→∞} m(E)/m
where m(E) is the number of times event E occurs in m trials.
Note:
- A random experiment can be repeated under identical conditions.
- If repeated indefinitely, the relative frequency of occurrence of an event converges to a constant.
- The law of large numbers states that this limit exists.
- For small m, m(E)/m can show strong fluctuations.
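A minimal sketch in R (the course's data-analysis tool) illustrating this: it estimates P(E) for the event "a fair die shows a 6" by relative frequency. The convergence toward 1/6 and the strong fluctuation for small m are both visible. The event and trial count are my own illustrative choices.

```r
# Estimate P(E) for E = "a fair die shows a 6" by relative frequency.
set.seed(42)                      # reproducibility
m <- 10000                        # number of trials
rolls <- sample(1:6, m, replace = TRUE)
hits  <- cumsum(rolls == 6)       # m(E) after 1..m trials
rel_freq <- hits / seq_len(m)     # m(E)/m

rel_freq[c(10, 100, 1000, 10000)] # fluctuates for small m, approaches 1/6
plot(rel_freq, type = "l", log = "x",
     xlab = "m (trials)", ylab = "m(E)/m")
abline(h = 1/6, lty = 2)          # the limit P(E) = 1/6
```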
Axiomatic Definition of Probability
Definition: For each event E of the sample space S, we assume that a number P(E) is defined that satisfies Kolmogorov's axioms:
1) 0 ≤ P(E) ≤ 1
2) P(S) = 1
3) For any sequence of mutually exclusive events E_1, E_2, ... (i.e., E_i ∩ E_j = ∅ for i ≠ j):
   P(E_1 ∪ E_2 ∪ ...) = P(E_1) + P(E_2) + ...
Outline on Problem Solving (Goodman & Hedetniemi '77)
1) Identify the sample space S
   - All elements must be mutually exclusive and collectively exhaustive.
   - All possible outcomes of the experiment should be listed separately.
   - (Root of "tricky" problems: often ambiguity or inexact formulation of the model of a physical situation)
2) Assign probabilities
   - To all elements of S, consistent with Kolmogorov's axioms.
   - (In practice: estimates based on experience, analysis, or common assumptions)
3) Identify events of interest
   - Recast statements as subsets of S.
   - Use laws (algebra of events) for simplification.
   - Use visualizations for clarification.
4) Compute desired probabilities
   - Use axioms and laws; often helpful: express the event of interest as a union of mutually exclusive events and sum up probabilities.
More relations
What is the probability of a union of two events?
P(E ∪ F) = P(E) + P(F) - P(EF)
What is the probability of a union of a set of events? Inclusion-exclusion:
P(E_1 ∪ ... ∪ E_n) = Σ_i P(E_i) - Σ_{i<j} P(E_i E_j) + Σ_{i<j<k} P(E_i E_j E_k) - ... + (-1)^{n+1} P(E_1 E_2 ... E_n)
Is there a better way to calculate this? Sum of disjoint products (SDP) formula: rewrite the union as a union of mutually exclusive events, e.g., E ∪ F = E ∪ (E^c F), so the probabilities simply add.
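To make this concrete, here is a small R sketch (my own example, not from the slides) over a single die roll. It computes the probability of a union directly, via inclusion-exclusion, and via a disjoint decomposition; all three agree.

```r
# Sample space: one roll of a fair die; each outcome has probability 1/6.
S <- 1:6
P <- function(A) length(A) / length(S)   # probability of an event (subset of S)

E <- c(1, 2, 3)         # "roll <= 3"
F <- c(2, 4, 6)         # "roll is even"

P(union(E, F))                           # direct: 4/6
P(E) + P(F) - P(intersect(E, F))         # inclusion-exclusion: same value
P(E) + P(setdiff(F, E))                  # SDP idea: E and (E^c ∩ F) are disjoint
```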
Conditional Probabilities
[Venn diagram: once F has happened, only the part EF of E remains possible.]
Definition: The conditional probability of E given F is
P(E|F) = P(EF) / P(F)
if P(F) > 0, and it is undefined otherwise.
Interpretation: Given F has happened, only events in EF are still possible for E, so the original probability P(EF) is scaled by 1/P(F).
Multiplication rule:
P(E_1 E_2 ... E_n) = P(E_1) P(E_2 | E_1) P(E_3 | E_1 E_2) ... P(E_n | E_1 ... E_{n-1})
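A small Monte Carlo sketch in R (my own example): for two dice, it estimates P(E|F) with E = "sum is 8" and F = "first die is even" both by the definition P(EF)/P(F) and by restricting attention to trials where F occurred.

```r
set.seed(1)
n  <- 100000
d1 <- sample(1:6, n, replace = TRUE)
d2 <- sample(1:6, n, replace = TRUE)

E <- (d1 + d2 == 8)      # event E: sum is 8
F <- (d1 %% 2 == 0)      # event F: first die shows an even number

mean(E & F) / mean(F)    # P(EF)/P(F), the definition
mean(E[F])               # relative frequency of E among trials where F occurred
# Exact value: given F, sums of 8 arise from (2,6),(4,4),(6,2) -> 3/18 = 1/6
```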
Independent events
Definition: Two events E and F are independent if
P(EF) = P(E) P(F).
This also means P(E|F) = P(E) (when P(F) > 0).
In English, E and F are independent if knowledge that F has occurred does not affect the probability that E occurs.
Notes:
- If E, F are independent, then so are E, F^c and E^c, F and E^c, F^c.
- Generalizes from 2 to n events: e.g., for n = 3, every subset must factor, i.e., P(EF) = P(E)P(F), P(EG) = P(E)P(G), P(FG) = P(F)P(G), and P(EFG) = P(E)P(F)P(G).
- Mutually exclusive vs independent: these are different notions; mutually exclusive events with positive probabilities are never independent.
About independent events
Venn diagrams: for events A, B that are neither ∅ nor S,
1) if A ⊂ B, then A and B cannot be independent
2) if A ∩ B = ∅, then A and B cannot be independent
Tree diagrams of sequential sample spaces:
- Throw a coin twice; the joint sample space is the cross product of the individual sample spaces: {(H,H), (H,T), (T,H), (T,T)}.
- First and second throw are independent.
Joint and pairwise independence
A ball is drawn from an urn containing four balls numbered 1, 2, 3, 4. Consider the events E = {1,2}, F = {1,3}, G = {1,4}. Then we have:
P(EF) = P(EG) = P(FG) = 1/4 = P(E)P(F) = P(E)P(G) = P(F)P(G),
but P(EFG) = P({1}) = 1/4 ≠ 1/8 = P(E)P(F)P(G).
They are pairwise independent, but not jointly independent.
A sequence of experiments results in either a success or a failure, where E_i, i ≥ 1, denotes a success in trial i. If for all i_1, i_2, ..., i_n:
P(E_{i_1} E_{i_2} ... E_{i_n}) = P(E_{i_1}) P(E_{i_2}) ... P(E_{i_n}),
we say the sequence of experiments consists of independent trials.
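The urn example can be verified by direct enumeration in R; this sketch checks that all three pairwise products factor while the triple product does not.

```r
# Urn with balls 1..4, each drawn with probability 1/4.
S <- 1:4
P <- function(A) length(A) / length(S)

E <- c(1, 2); F <- c(1, 3); G <- c(1, 4)

# Pairwise: each intersection is {1}, so P = 1/4 = (1/2)*(1/2).
P(intersect(E, F)) == P(E) * P(F)                            # TRUE
P(intersect(E, G)) == P(E) * P(G)                            # TRUE
P(intersect(F, G)) == P(F) * P(G)                            # TRUE

# Jointly: P(EFG) = P({1}) = 1/4, but P(E)P(F)P(G) = 1/8.
P(Reduce(intersect, list(E, F, G))) == P(E) * P(F) * P(G)    # FALSE
```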
Independence is a very important property
Independence simplifies calculations significantly => a very popular assumption for theoretical results:
- input modeling, workload modeling
- statistical tests
- output analysis of simulation models: confidence intervals for the estimate of the mean
- ...
Independence need not be present in real data:
- data traffic in networks: often correlated
- output data of a (simulated) system, i.e., the response of a system to some workload
Ways to investigate independence (see the sketch below):
- graphics: correlation plot
- tests: chi-square test for vectors, rank von Neumann test, runs test
- see Law/Kelton Chap. 6.3 and Chap. 7.4.1
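A minimal sketch in base R: acf() draws the correlation plot mentioned above; for a test, the rank von Neumann and runs tests need add-on packages, so as a base-R stand-in this uses the Ljung-Box test (Box.test) for autocorrelation. The iid and AR(1) samples are my own illustrative data.

```r
set.seed(7)
iid <- rnorm(500)                          # independent data
ar  <- arima.sim(list(ar = 0.8), n = 500)  # positively correlated data

op <- par(mfrow = c(1, 2))
acf(iid, main = "iid sample")              # spikes stay within the bounds
acf(ar,  main = "AR(1), rho = 0.8")        # slowly decaying autocorrelation
par(op)

Box.test(iid, lag = 10, type = "Ljung-Box")  # large p-value: no evidence
Box.test(ar,  lag = 10, type = "Ljung-Box")  # tiny p-value: dependence detected
```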
Bayes' Formula
Let F_1, F_2, ..., F_n be events of S, all mutually exclusive and collectively exhaustive.
Theorem of total probability (also Rule of Elimination):
P(E) = Σ_{i=1}^n P(E | F_i) P(F_i)
Bayes' Formula helps us to determine which F_j happened, given we observed E:
P(F_j | E) = P(E | F_j) P(F_j) / Σ_{i=1}^n P(E | F_i) P(F_i)
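Both formulas in a few lines of R, on a hypothetical diagnostic-alarm example (all numbers are made up for illustration): the denominator is the total probability, the ratio is the posterior.

```r
# Hypothetical example: F1 = "unit is faulty", F2 = "unit is ok"; E = "alarm".
prior <- c(faulty = 0.01, ok = 0.99)   # P(F_j), mutually exclusive, exhaustive
lik   <- c(faulty = 0.95, ok = 0.02)   # P(E | F_j)

pE <- sum(lik * prior)                 # theorem of total probability
posterior <- lik * prior / pE          # Bayes' formula: P(F_j | E)
pE                                     # P(alarm) ~ 0.0293
posterior                              # P(faulty | alarm) ~ 0.324
```

Note how a rare condition keeps the posterior modest even with a sensitive test: most alarms still come from the large "ok" population.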
Random Variable (RV)
Definition: A random variable X on a probability space (S, F, P) is a function X: S -> R that assigns a real number X(s) to each sample point s ∈ S, such that for every real number x, the set of sample points {s | X(s) ≤ x} is an event, that is, a member of F.
RVs can be discrete or continuous.
More concepts:
- cumulative distribution function, density
- moments E[X^i], centralized moments, variance, skewness, kurtosis
Particular examples:
- Normal distribution
- Poisson distribution
- Exponential distribution
- Pareto distribution
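A short R sketch estimating these quantities from a sample: base R has no skewness/kurtosis functions, so they are computed from the centralized moments directly. The exponential(1) distribution is used because its theoretical values (mean 1, variance 1, skewness 2, kurtosis 9) are easy to check against.

```r
set.seed(3)
x <- rexp(100000, rate = 1)   # exponential(1) sample

m  <- mean(x)                 # estimate of E[X]
v  <- mean((x - m)^2)         # 2nd centralized moment (variance)
sk <- mean((x - m)^3) / v^1.5 # skewness, ~2 for the exponential
ku <- mean((x - m)^4) / v^2   # kurtosis, ~9 for the exponential
c(mean = m, var = v, skew = sk, kurt = ku)
```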
Parameterization of distributions
Parameters are of 3 basic types:
Location
- specifies an x-axis location point of a distribution's range of values, usually the midpoint (e.g., the mean for the normal distribution) or the lower endpoint of the distribution's range
- sometimes called a shift parameter, since changing its value shifts the distribution to the left or right, e.g., for Y = X + γ
Scale
- determines the scale (unit) of measurement of the values in the range of the distribution (e.g., the standard deviation σ for the normal distribution)
- changing its value compresses/expands the distribution but does not alter its basic form, e.g., for Y = βX
Shape
- determines the basic form/shape of a distribution
- changing its value alters a distribution's properties (e.g., skewness) more fundamentally than a change in location or scale
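A quick numerical check of the shift and scale behaviour in R, using an exponential sample as the base distribution (the values of γ and β are arbitrary): a location shift moves the mean but leaves the spread untouched, while a scale change multiplies both.

```r
set.seed(5)
x <- rexp(100000, rate = 1)          # base sample: mean ~1, sd ~1

gamma_ <- 3                          # location shift: Y = X + gamma
beta_  <- 2                          # scale change:   Y = beta * X

c(mean(x + gamma_), sd(x + gamma_))  # mean shifts by gamma, sd unchanged
c(mean(beta_ * x),  sd(beta_ * x))   # mean and sd both scale by beta
```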
Properties of Mean, Variance and Covariance
For a random variable X:
- distribution function: F_X(x) = P(X ≤ x) = ∫_{-∞}^{x} f_X(y) dy
- density function: f_X(x)
- expected value: E(X) = ∫_{-∞}^{∞} y f_X(y) dy
For any random variables X, Y, Z and constant c:
- E(cX) = c E(X)
- E(X + Y) = E(X) + E(Y)
- independent: P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y), and then E(XY) = E(X) E(Y)
- var(X) = E((X - E(X))^2)
- var(aX + b) = a^2 var(X)
- var(X + Y) = var(X) + var(Y) + 2 cov(X, Y)
- covariance: cov(X, Y) = E((X - E(X))(Y - E(Y)))
- correlation: ρ(X, Y) = cov(X, Y) / (σ_X σ_Y)
- independent: cov(X, Y) = 0
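A Monte Carlo check of the variance-of-a-sum rule in R (my own construction): X and Y are made positively correlated by giving them a shared component, so the 2 cov(X,Y) term is clearly nonzero.

```r
set.seed(11)
n <- 200000
z <- rnorm(n)
x <- z + rnorm(n)          # X and Y share the component z ...
y <- z + rnorm(n)          # ... so they are positively correlated

var(x + y)                         # direct estimate
var(x) + var(y) + 2 * cov(x, y)    # var(X) + var(Y) + 2 cov(X,Y): matches
cov(x, y) / (sd(x) * sd(y))        # correlation, ~0.5 here
cor(x, y)                          # same via the built-in
```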
Proposition 2.4
X_1, ..., X_N are independently and identically distributed with expected value µ and variance σ^2. Then the sample mean X̄ = (1/N) Σ_{i=1}^N X_i satisfies E(X̄) = µ and var(X̄) = σ^2/N.
Confidence intervals for the estimate of the mean
The (1 - α) confidence interval about X̄ can be expressed as:
X̄ - t_{1-α/2}(N-1) s/√N ≤ µ ≤ X̄ + t_{1-α/2}(N-1) s/√N
where
- t_{1-α/2}(N-1) is the 100(1 - α/2)th percentile of Student's t distribution with N - 1 degrees of freedom (values of this distribution can be found in tables),
- s = √(s^2) is the sample standard deviation,
- N is the number of observations.
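The interval in R, once by hand with qt() and once via t.test(), which computes the same thing; the exponential sample with true mean 2 is my own illustrative choice.

```r
set.seed(9)
x <- rexp(30, rate = 0.5)          # N = 30 observations, true mean mu = 2
N <- length(x)
alpha <- 0.05

half <- qt(1 - alpha/2, df = N - 1) * sd(x) / sqrt(N)  # t-percentile * s/sqrt(N)
c(lower = mean(x) - half, upper = mean(x) + half)      # 95% CI for mu

t.test(x, conf.level = 0.95)$conf.int                  # same interval via t.test
```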