ACMS 20340 Statistics for Life Sciences Chapter 9: Introducing Probability
Why Consider Probability? We’re doing statistics here. Why should we bother with probability? As we will see, probability plays an important role in statistics.
An Example I In a (very) recent Gallup study on the role of religion on one’s views on violence, we find the following statement: Results are based on face-to-face interviews with approximately 1,000 adults in each country, aged 15 and older, from 2008 through 2010. For results based on the total sample of adults, one can say with 95% confidence that the maximum margin of sampling error ranges from ±1.66 to ±5.8 percentage points. Source: gallup.com
An Example II What is meant by the claim that “one can say with 95% confidence that the maximum margin of sampling error ranges from ±1.66 to ±5.8 percentage points”? This means that the probability that the estimate from a sample falls within the given margin of error of the true population value is 0.95.
Another Example Recall: A simple random sample of size 10 taken from this class means that every possible group of size 10 has an equal chance of being selected. What do we mean when we say a group “has an equal chance of being selected”? A class of size 50 has 10,272,278,170 possible samples of size 10.
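The count of possible samples quoted above can be checked with a short sketch. It uses Python's standard `math.comb`; the variable names are illustrative and not part of the course material.

```python
import math

class_size = 50    # students in the class
sample_size = 10   # students in each simple random sample

# Number of distinct groups of 10 that can be chosen from 50 students.
n_samples = math.comb(class_size, sample_size)
print(f"{n_samples:,}")  # 10,272,278,170

# Under simple random sampling every group has the same chance of selection.
chance_per_group = 1 / n_samples
print(chance_per_group)
```

"An equal chance of being selected" then means each of these groups is chosen with probability 1 divided by that count.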
What is Probability? This is a difficult philosophical question. Following the textbook, we will define probability in terms of the long run behavior of random phenomena. Why “the long run behavior of random phenomena”? “Chance behavior is unpredictable in the short run but has a regular and predictable pattern in the long run.”
Short Run vs. Long Run Suppose I toss a fair (unbiased) coin. What will the outcome be? Heads or tails? One can’t know for certain: the outcome is unpredictable in the short run.
Short Run vs. Long Run However, if I toss the coin a sufficiently large number of times, the outcomes start to settle down. Two trials of 5000 tosses each:
Further Confirmation
              Buffon   Kerrich   Pearson
Total Tosses   4,040    10,000    24,000
Heads          2,048     5,067    12,012
Proportion    0.5069    0.5067    0.5005
(These guys had too much time on their hands.)
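The proportions in this table are simply heads divided by total tosses. A small sketch that recomputes them from the counts given on the slide (the dictionary layout is an illustrative choice):

```python
# (total tosses, heads) for each historical experiment, as listed on the slide.
experiments = {
    "Buffon": (4040, 2048),
    "Kerrich": (10000, 5067),
    "Pearson": (24000, 12012),
}

# Proportion of heads = heads / total tosses.
proportions = {name: heads / tosses for name, (tosses, heads) in experiments.items()}

for name, p in proportions.items():
    print(f"{name}: {p:.4f}")
```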
Randomness and Probability A phenomenon is random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions. Always keep the example of the tosses of a coin in mind! The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions. As we toss the fair coin more and more, the proportion of the occurrence of heads gets closer and closer to 1/2.
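The long-run behavior described here can be explored with a simulation. The following is a minimal sketch, assuming a fair coin modeled with Python's pseudo-random generator; the function name and the seeds are arbitrary choices for illustration, not part of the course material.

```python
import random

def running_proportion_of_heads(n_tosses, seed):
    """Simulate n_tosses fair coin tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)  # a fixed seed makes the run reproducible
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        if rng.random() < 0.5:  # treat this as "heads", which happens with probability 1/2
            heads += 1
        proportions.append(heads / toss)
    return proportions

# Two independent trials of 5000 tosses each.
for seed in (1, 2):
    props = running_proportion_of_heads(5000, seed)
    checkpoints = {n: round(props[n - 1], 3) for n in (10, 100, 1000, 5000)}
    print(f"trial with seed {seed}: {checkpoints}")
```

Early checkpoints can wander noticeably away from 1/2, while the later ones typically sit close to it, which is the pattern the definition of probability relies on.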
Examples of randomness? ◮ The outcome of a coin toss. ◮ The time between emissions of particles by a radioactive source. ◮ The sexes of the next litter of lab rats. ◮ The outcome of a random sample or randomized experiment.
Probability Models 1 Let us study a certain random phenomenon, the birth of a child.
Probability Models 2 What will the outcome be? That is, will the child be male or female? We can’t know (too far) in advance. Here’s what we do know: 1. The outcome will be either male or female. 2. The probability of each outcome is (roughly) 1/2.
Probability Models 3 Thus we’ve described: 1. A list of possible outcomes 2. A probability for each outcome. These correspond to the two components of a probability model. Before defining a probability model, we need a bit more terminology.
Probability Models 4 The sample space S of a random phenomenon is the set of all possible outcomes. An event is an outcome or set of outcomes of a random phenomenon. Thus, an event is a subset of the sample space. For example, if S = {1, 2, 3, 4, 5, 6, 7, 8, 9} is a set of outcomes, then E = {2, 4, 6, 8} is an event. Careful: An event need not be an individual outcome!
Finally. . . A probability model is the description of a random phenomenon consisting of 1. a sample space S , and 2. a way of assigning probabilities to events in S .
Examples of Sample Spaces ◮ S = { M,F } ◮ S = { Republican, Democrat, Independent } ◮ S = { weights of 1,000 individuals in a sample }
A Baby-friendly Example Suppose a couple plans to have three children. Let S be the set of possible numbers of girls they can have. That is, S = {0, 1, 2, 3}. What is the probability of each outcome in S (assuming that the probability of a girl is 1/2)? Incorrect answer: Each outcome is equally likely, so each has probability 1/4.
The Possible Outcomes Possible outcomes with one child: {B, G} Possible outcomes with two children: {BB, BG, GB, GG} Possible outcomes with three children: {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG} Now, these outcomes are equally likely.
Calculating the Probabilities
Probability of no girls? 1/8 (BBB)
Probability of exactly one girl? 3/8 (BBG, BGB, GBB)
Probability of exactly two girls? 3/8 (BGG, GBG, GGB)
Probability of three girls? 1/8 (GGG)
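One way to check these probabilities is to enumerate the eight birth orders directly and count the girls in each. The sketch below assumes, as the slides do, that each child is a girl with probability 1/2, so every sequence is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All birth orders for three children: BBB, BBG, ..., GGG.
outcomes = ["".join(children) for children in product("BG", repeat=3)]

# Count how many equally likely sequences contain 0, 1, 2, or 3 girls.
counts = Counter(outcome.count("G") for outcome in outcomes)

# Each sequence has probability 1/8, so P(k girls) = (sequences with k girls) / 8.
probabilities = {
    girls: Fraction(count, len(outcomes)) for girls, count in sorted(counts.items())
}

for girls, p in probabilities.items():
    print(f"P({girls} girls) = {p}")
```

The four probabilities add up to 1, but they are not equal, which is why assigning 1/4 to each number of girls is incorrect.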
Another Example: Blood Types Let S = {O+, O−, A+, A−, B+, B−, AB+, AB−}. If we choose an American at random, what is the probability that this person has, say, blood type O+? Where do we get these probabilities? These are the relative frequencies of occurrence of each blood type: we get them by taking lots and lots of samples.
A Donation Problem Suppose that we need blood for someone with blood type AB−. What is the probability that a randomly selected American has the right blood to donate? Individuals with blood type O−, A−, B− and AB− can donate to this person. So what we’re looking for is the probability of the event E = {O−, A−, B−, AB−}. How do we find this probability?
General Rules of Probability: Rules 1 and 2 Rule 1: Every probability is a number between 0 and 1. That is, if A is any event in S, then 0 ≤ P(A) ≤ 1. What if P(A) = 0? ◮ When S is finite, then this means that A is impossible. What if P(A) = 1? ◮ When S is finite, then this means that A must occur. Rule 2: The event consisting of all outcomes in the sample space has probability 1.
General Rules of Probability: Rule 3 Rule 3: If two events have no outcomes in common, then the probability that one or the other occurs is the sum of their individual probabilities. When two events have no outcomes in common, we say that they are disjoint.
General Rules of Probability: Rule 3 (continued) Rule 3: If A and B are disjoint, then P(A or B) = P(A) + P(B). (This is sometimes called the “addition rule”.) In general, if A and B are any two events in S, then P(A or B) = P(A) + P(B) − P(A and B).
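Both forms of the addition rule can be verified on a small example. The sketch below reuses the sample space S = {1, ..., 9} and the event {2, 4, 6, 8} from earlier, and assumes for illustration that all nine outcomes are equally likely; the events B and C are made up for this example.

```python
from fractions import Fraction

S = set(range(1, 10))  # sample space {1, ..., 9}, assumed equally likely here

def prob(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6, 8}
B = {1, 2, 3}  # shares the outcome 2 with A
C = {1, 3, 5}  # has no outcomes in common with A

# Disjoint events: P(A or C) = P(A) + P(C).
assert A & C == set()
print(prob(A | C), prob(A) + prob(C))

# General case: subtract P(A and B) so the shared outcome is not counted twice.
print(prob(A | B), prob(A) + prob(B) - prob(A & B))
```

In Python, `|` is set union ("or") and `&` is set intersection ("and"), which matches the way the rules are stated.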
General Rules of Probability: Rule 4 Rule 4: The probability that an event does not occur is 1 minus the probability that the event does occur. P(A does not occur) = 1 − P(A)
Back to the Donation Problem The addition rule holds for more than just two disjoint events: P(O− or A− or B− or AB−) = P(O−) + P(A−) + P(B−) + P(AB−) = 0.07 + 0.06 + 0.02 + 0.01 = 0.16 We also used the addition rule in the example with the couple having three children. Can you see where?
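The same calculation can be written as a short sketch. It uses only the four Rh-negative probabilities stated above; exact fractions are used to avoid floating-point rounding, and the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities of the Rh-negative blood types, as given on the slide.
rh_negative = {
    "O-": Fraction("0.07"),
    "A-": Fraction("0.06"),
    "B-": Fraction("0.02"),
    "AB-": Fraction("0.01"),
}

# Addition rule for disjoint events: a person has exactly one blood type.
p_can_donate = sum(rh_negative.values())
print(float(p_can_donate))  # 0.16

# Rule 4 (complement): probability that a random person cannot donate to an AB- patient.
p_cannot_donate = 1 - p_can_donate
print(float(p_cannot_donate))  # 0.84
```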
Discrete Probability Models So far, the probability models we’ve considered are discrete probability models. A probability model is discrete if the sample space is made up of a list of individual outcomes (the first outcome, the second outcome, the third outcome, ...). To assign probabilities in a discrete model, we merely list the probabilities of all the individual outcomes.
Continuous Probability Models What kind of probability model should we use for continuous quantitative variables? These can take any value in a range of possible values. First try: Histograms! Heights (inches) of women age 40–49 in the U.S. (Ignore the curve...just look at the histogram)
Calculating Probabilities 1 On this graph the bins are intervals of 1 inch. What if we want to know the probability someone is within half an inch of 60 inches? P(59.5 ≤ X ≤ 60.5) = ?
Calculating Probabilities 2 We could keep asking for probabilities of smaller and smaller intervals. But then there are an infinite number of possible events! This is a problem. Is there an easier way?
Continuous Distributions Solution: Use a curve to indicate the different outcomes, and let the probability of any given interval of values be the area under the curve. These curves are called density curves.
Density Curves All density curves have the following properties. ◮ The curve is always on or above the x-axis. ◮ The total area under the curve is equal to 1.
Continuous Probability Models: The Official Definition A continuous probability model gives a density curve and assigns the probability of every interval as the area under the curve for that interval.
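To make "probability as area under the curve" concrete, here is a minimal sketch that approximates areas with thin rectangles. It assumes, purely for illustration, that heights follow a normal density with mean 64 inches and standard deviation 2.7 inches; these parameters are hypothetical and are not taken from the slides. `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical density curve for heights in inches (parameters are assumptions).
heights = NormalDist(mu=64.0, sigma=2.7)

def area_under_curve(density, lo, hi, n_strips=10_000):
    """Approximate the area under `density` between lo and hi with midpoint rectangles."""
    width = (hi - lo) / n_strips
    return sum(density(lo + (i + 0.5) * width) for i in range(n_strips)) * width

# P(59.5 <= X <= 60.5): the area under the curve over that interval.
p_interval = area_under_curve(heights.pdf, 59.5, 60.5)
print(round(p_interval, 4))

# Total area over a range wide enough to hold essentially all of the curve.
total_area = area_under_curve(heights.pdf, 64.0 - 10 * 2.7, 64.0 + 10 * 2.7)
print(round(total_area, 4))
```

The interval probability is a number between 0 and 1, and the total area comes out very close to 1, in line with the two properties of density curves.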