Review Probability Basic definitions: Randomization experiment - PowerPoint PPT Presentation


  1. Review
     • Probability
     • Basic definitions:
       - Randomization experiment
       - Sample spaces
       - Elementary outcomes
       - Event
     • Basic operations: conditional probability
     • Bayes' theorem

  2. Objectives
     • Random variable
       - Discrete random variable
       - Continuous random variable
     • Two probability distributions
       - Binomial distribution
       - Normal distribution

  3. Random variables
     • A random variable is a function that assigns a numeric value to each outcome in a sample space. We usually denote a random variable by a capital letter such as X, Y, or Z.
     • Note two features: (1) randomness; (2) numeric values.
     • Example 1: Randomly select a student from a class. X = the student's number of siblings. X could be 0, 1, 2, …
     • Example 2: Randomly select a student from a class. X = the student's height. X could be any value greater than 0.

  4. Two types of random variables
     1. Discrete random variable: its possible values form a set of discrete (isolated) values. E.g. X = number of siblings.
     2. Continuous random variable: its possible values cannot be enumerated; there are infinitely many, and every individual value has probability zero, i.e. Pr(X = x) = 0 for every x. E.g. X = the student's height.

  5. EG1. Tossing two coins
     Let X = number of heads.
     Outcome    TT    HT, TH    HH
     x           0       1       2
     Notation: X is the random variable; x is an observed value.

  6. Probability distribution function
     • A probability distribution function (pdf) is a mathematical relationship, or rule, that assigns to any possible value x of a discrete random variable X the probability Pr(X = x).

  7. Probability distribution of the random variable
     X = number of heads.
     Outcome    TT    HT, TH    HH
     x           0       1       2
     P(X = x)   1/4     1/2     1/4
     [Figure: probability histogram of X]

  8. EG2. Tossing two dice
     Y = the sum of the dots on the two dice. What are the possible values of Y?

  9. Probability distribution of the random variable
     Y = the sum of the dots on the two dice.
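
The distribution of Y can be worked out by enumerating outcomes. The sketch below assumes two fair six-sided dice, so that all 36 ordered outcomes are equally likely; the slides do not state this explicitly. The names `counts` and `dist` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: two fair six-sided dice, so each of the 36 ordered
# outcomes (a, b) has probability 1/36.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Pr(Y = y) = (number of outcomes whose sum is y) / 36
dist = {y: Fraction(c, 36) for y, c in sorted(counts.items())}

for y, p in dist.items():
    print(f"Pr(Y = {y:2d}) = {p}")
```

Using exact fractions rather than floats keeps the probabilities in the same form a hand-built table would have, and makes it easy to confirm that they sum to 1.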

  10. Relative frequency
      • In practice, a probability can be estimated by the relative frequency of an event in the long run:
        $\text{Probability} = \dfrac{\text{frequency of occurrences}}{\text{frequency of all possible occurrences}}$
      • $0 \le \text{Probability} \le 1$
      • If the experiment is repeated many times, the relative frequency histogram should look very much like the probability histogram.
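
A small simulation can illustrate this for the two-coin example. This is a sketch only: the number of repetitions, the fixed seed, and the assumption of a fair coin are choices made here, not taken from the slides.

```python
import random
from collections import Counter

random.seed(0)        # arbitrary fixed seed so the run is repeatable
n_trials = 100_000    # assumed length of the "long run"

# X = number of heads in two tosses of a (assumed fair) coin
counts = Counter(
    sum(random.random() < 0.5 for _ in range(2)) for _ in range(n_trials)
)

rel_freq = {x: counts[x] / n_trials for x in (0, 1, 2)}
theory = {0: 0.25, 1: 0.50, 2: 0.25}

for x in (0, 1, 2):
    print(f"x = {x}: relative frequency {rel_freq[x]:.4f}, "
          f"probability {theory[x]:.2f}")
```

With this many repetitions the relative frequencies should land close to 1/4, 1/2, and 1/4, though the exact digits depend on the seed.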

  11. Data set vs. probability distributions
      • Sample properties, based on a data set:
        - Sample mean: $\bar{x} = \sum_{i=1}^{n} x_i / n$
        - Sample variance: $s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
      • Model or population properties, based on a probability distribution:
        - Population mean: $\mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$
        - Population variance: $\sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$

  12. Mean of a random variable
      • The mean or expected value of X, denoted E(X) or µ, is defined as
        $E(X) = \mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$
      • It is the sum of the possible values, each weighted by its probability.
      • The expectation represents the "average" value of the random variable.

  13. Mean of X
      X = number of heads.
      Outcome    TT    HT, TH    HH
      x           0       1       2
      P(X = x)   1/4     1/2     1/4
      x P(x)      0      1/2     1/2
      $E(X) = \mu = \sum_{i=1}^{3} x_i \Pr(X = x_i) = 1$

  14. Variance of a random variable
      • The variance of X is the expected squared distance from the population mean:
        $\mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$
      • The standard deviation σ is the square root of the variance:
        $\mathrm{sd}(X) = \sigma = \sqrt{\mathrm{Var}(X)}$

  15. Variance of X
      X = number of heads.
      x       P(x)    (x − µ)² P(x)
      0       0.25    (0 − 1)² × 0.25 = 0.25
      1       0.50    (1 − 1)² × 0.50 = 0
      2       0.25    (2 − 1)² × 0.25 = 0.25
      Total           0.50
      Thus σ² = 0.5.
      Summary: µ and σ are computed from the probability distribution. They are population properties.
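
The mean and variance of the two-coin example can be checked directly from the definitions. This is a minimal sketch; the dictionary name `pmf` is an illustrative choice, not notation from the slides.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in two coin tosses
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# E(X): each possible value weighted by its probability
mu = sum(x * p for x, p in pmf.items())

# Var(X): expected squared distance from the mean
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Standard deviation is the square root of the variance
sd = float(var) ** 0.5

print("mean =", mu, " variance =", var, " sd =", round(sd, 4))
```

The exact fractions reproduce µ = 1 and σ² = 1/2 from the tables.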

  16. Two types of random variables
      1. Discrete random variable: its possible values form a set of discrete (isolated) values.
      2. Continuous random variable: its possible values cannot be enumerated; there are infinitely many, and every individual value has probability zero, i.e. Pr(X = x) = 0 for every x.

  17. Continuous random variables
      • A balanced spinning pointer can stop anywhere on the circle.
      • X = the proportion of the total circumference at which it lands.
      • X can be any value between 0 and 1, so it has infinitely many possible values.
      • Pr(0.25 ≤ X ≤ 0.75) = 0.5
      • Pr(X = 0.5) = 0, because X can take on an infinite number of values.

  18. Probability density function (pdf) of X
      • The curve $y = f(x)$ is the probability density function (pdf) of the random variable X.
      • $\Pr(a \le X \le b)$ is the area under the curve between the x values a and b:
        $\Pr(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
      • The total area under the density curve over the entire range of possible values of the random variable is 1:
        $\Pr(-\infty \le X \le \infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1$

  19. Probability density function (pdf) of X
      • The pdf $y = f(x)$ has large values in regions of high probability and small values in regions of low probability.
      • Pr(X = x) = 0 for any specific value x.
      • Generally, no distinction is made between probabilities such as Pr(X < x) and Pr(X ≤ x), or Pr(a ≤ X ≤ b) and Pr(a < X < b), when X is a continuous random variable.

  20. Expectation and variance of a continuous random variable
      • Mean: $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$, the center of the probability density.
      • Variance: $\sigma^2 = \mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$, the spread of the probability density.
      • The standard deviation, σ, is the square root of the variance: $\sigma = \sqrt{\mathrm{Var}(X)}$.
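
These integrals can be approximated numerically. The sketch below uses the spinning-pointer example and assumes its density is uniform, f(x) = 1 on [0, 1], which matches Pr(0.25 ≤ X ≤ 0.75) = 0.5 but is not written out in the slides. The helper `integrate` is a hypothetical midpoint-rule routine written for this illustration.

```python
def f(x):
    """Assumed density of the balanced spinner: uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


# Total area under the density (should be close to 1)
total = integrate(f, 0.0, 1.0)

# Pr(0.25 <= X <= 0.75): area under the curve between 0.25 and 0.75
prob = integrate(f, 0.25, 0.75)

# Mean and variance from their integral definitions
mu = integrate(lambda x: x * f(x), 0.0, 1.0)
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)

print(total, prob, mu, var)
```

For this density the exact values are 1, 0.5, 0.5, and 1/12; the numerical results should agree to several decimal places.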

  21. Two distributions
      • Binomial: discrete
      • Normal: continuous

  22. Bernoulli trial
      Examples:
      • A heads-or-tails coin toss
      • A win-or-lose football game
      • A pass-or-fail automotive smog inspection
      Properties:
      • Two outcomes: success or failure
      • The success probability p is the same in each trial
      • Trials are independent

  23. Binomial random variable
      X is the number of successes in n repeated Bernoulli trials, each with success probability p.
      • The success probability p is the same in each trial
      • Trials are independent

  24. Binomial random variable
      Probability distribution: the probability of obtaining k successes in n trials, with success probability p, is
      $\Pr(X = k) = \binom{n}{k} p^{k} (1 - p)^{n-k}$
      • $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$ counts all possible ways of getting k successes and n − k failures, where $n! = n \times (n-1) \times \cdots \times 1$.
      • $p^{k} (1 - p)^{n-k}$ is the probability of getting k successes and n − k failures in one particular order.

  25. Mean and variance of the binomial distribution
      $\mu = np$
      $\sigma^2 = np(1 - p)$
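
The binomial formulas can be checked numerically against the general definitions of mean and variance for a discrete random variable. This is a sketch with illustrative values n = 10 and p = 0.3, which are not from the slides; `binom_pmf` is a helper name introduced here.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr(X = k) for a binomial random variable with n trials, success prob p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3  # illustrative values only

# Full probability distribution over k = 0, 1, ..., n
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance from the general discrete definitions
mu = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mu) ** 2 * pk for k, pk in enumerate(pmf))

print("sum of probabilities:", sum(pmf))
print("mean:", mu, " np:", n * p)
print("variance:", var, " np(1-p):", n * p * (1 - p))
```

Up to floating-point rounding, the computed mean and variance should match np and np(1 − p).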

  26. Exercise
      Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened:
      1. What is the exact binomial probability of 5 HIV-positive test results?

  27. Exercise
      Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened:
      1. What is the exact binomial probability of 5 HIV-positive test results?
      Answer: $\Pr(X = 5) = \binom{500}{5}\, 0.01^{5}\, (1 - 0.01)^{495} = 0.176$
      EXCEL: BINOMDIST(5,500,0.01,FALSE)

  28. Exercise
      Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened:
      2. What is the exact binomial probability of at least 5 HIV-positive test results?

  29. Exercise
      Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. If 500 newborns are screened:
      2. What is the exact binomial probability of at least 5 HIV-positive test results?
      Answer: $\Pr(X \ge 5) = 1 - \Pr(X \le 4) = 1 - F(4) = 1 - 0.44 = 0.56$
      EXCEL: F(4) = BINOMDIST(4,500,0.01,TRUE)
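
Both exercise answers can be reproduced without a spreadsheet. The following is a minimal sketch using only the Python standard library; `binom_pmf` is a helper defined here for illustration, playing the role of BINOMDIST with the last argument set to FALSE.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr(X = k) for a binomial random variable with n trials, success prob p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 500, 0.01  # values given in the exercise

# Part 1: probability of exactly 5 positive results
p_exactly_5 = binom_pmf(5, n, p)

# Part 2: Pr(X >= 5) = 1 - Pr(X <= 4) = 1 - F(4)
cdf_4 = sum(binom_pmf(k, n, p) for k in range(5))
p_at_least_5 = 1 - cdf_4

print(round(p_exactly_5, 3))   # 0.176
print(round(cdf_4, 2))         # 0.44
print(round(p_at_least_5, 2))  # 0.56
```

Summing the probabilities for k = 0 through 4 corresponds to the cumulative value F(4) that BINOMDIST returns when its last argument is TRUE.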

  30. Normal distribution
      • The normal distribution is also called the Gaussian distribution, after the well-known mathematician Carl Friedrich Gauss (1777-1855, "the Prince of Mathematicians").

  31. Normal distribution
      • The normal distribution is very useful.
      • Many quantities closely follow a normal distribution:
        - Heights of people
        - Errors in measurement
        - Blood pressure
        - Scores on a test
      • Many other distributions, such as the binomial, can be made approximately normal by transformation.
      • Most statistical methods considered in this text are based on the normal distribution.

  32. The pdf of the normal distribution
      • The normal distribution is defined by its pdf, which for some parameters µ and σ is given as
        $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\dfrac{(x - \mu)^2}{2\sigma^2} \right)$
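
As a sanity check, the density can be coded directly and integrated numerically to confirm that the total area is 1. This is a sketch only: the parameter values µ = 120 and σ = 15 are hypothetical, and `integrate` is a simple midpoint-rule helper written for this illustration, with the infinite range truncated to µ ± 10σ.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and sd sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


mu, sigma = 120.0, 15.0  # hypothetical parameter values

# Area under the curve; the tails beyond 10 sd are negligible
area = integrate(lambda x: normal_pdf(x, mu, sigma),
                 mu - 10 * sigma, mu + 10 * sigma)

# The curve peaks at x = mu with height 1 / (sigma * sqrt(2 * pi))
peak = normal_pdf(mu, mu, sigma)

print(area, peak)
```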

  33. Other properties of the normal pdf
      • Mean = median = mode
      • Symmetric about the center
      • 50% of values are less than the mean
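
The symmetry and 50% properties can be illustrated with the normal cumulative distribution function. The slides do not give a formula for it; the sketch below uses the standard identity Pr(X ≤ x) = ½[1 + erf((x − µ)/(σ√2))], and the values of `mu`, `sigma`, and `d` are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Pr(X <= x) for a normal random variable, via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


mu, sigma = 120.0, 15.0  # hypothetical parameter values
d = 10.0                 # hypothetical distance from the mean

# Half of the probability lies below the mean
below_mean = normal_cdf(mu, mu, sigma)

# Symmetry: the left tail below mu - d equals the right tail above mu + d
left_tail = normal_cdf(mu - d, mu, sigma)
right_tail = 1.0 - normal_cdf(mu + d, mu, sigma)

print(below_mean, left_tail, right_tail)
```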

  34. Location is measured by µ
      • In the graph, $\mu_2 > \mu_1$

  35. Spread is measured by σ²
      • In the graph, $\sigma_2 > \sigma_1$
