
Probability, Statistics, and Statistical Methods


  1. Probability, Statistics, and Statistical Methods. 1 Introduction to Statistics and Data Analysis

  2. 1.1 Overview: Statistical Inference, Samples, Populations, and the Role of Probability

  3. 1.2 Sampling Procedures; Collection of Data

  4. 1.3 Random Sampling

  5. 1.4 Measures of Location: Sample Mean, Sample Mode, and Sample Median

  6. The sample mode, denoted by x_mode, is the observation with the highest frequency. The sample median is often a better measure of the central tendency of a sample than the sample mean, because it is not affected by extreme values in the sample.
      1.5 Measures of Variability: Sample Range, Sample Variance, and Sample Standard Deviation

  7. The sample range, denoted by r, is given by r = x_max − x_min. Data collected on a pH meter from a sample of 10 observations are: 7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08.
      The sample mean is x̄ = (7.07 + 7.00 + … + 7.08)/10 = 7.025.
      The sample modes are x_mode = 7.00 and 7.01 (each occurs twice).
      The sample median is x̃ = (7.01 + 7.01)/2 = 7.01.
      The sample range is r = 7.10 − 6.97 = 0.13.
      The sample variance is s² = [(7.07 − 7.025)² + (7.00 − 7.025)² + … + (7.08 − 7.025)²]/9 = 0.00194.
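The pH summary statistics above can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative and not part of the slides.

```python
import statistics

# pH readings from the example above
ph = [7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08]

mean = statistics.mean(ph)
modes = statistics.multimode(ph)     # every value tied for the highest frequency
median = statistics.median(ph)       # average of the two middle values for even n
sample_range = max(ph) - min(ph)
variance = statistics.variance(ph)   # sample variance, divides by n - 1
std_dev = statistics.stdev(ph)       # square root of the sample variance

print(round(mean, 3), sorted(modes), round(median, 2))
print(round(sample_range, 2), round(variance, 5), round(std_dev, 3))
```

Note that `statistics.variance` uses the n − 1 divisor, which matches the division by 9 in the hand calculation.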

  8. 1.6 Discrete and Continuous Data
      Discrete data: countable, finite or infinite, with no additional data point between two consecutive data points. Examples: number of defects in an automobile, number of trees in a forest.
      Continuous data: measurable, infinite, with additional data points to be found between any two data points. Examples: time, weight, density.

  9. 1.7 Statistical Modeling, Scientific Inspection, and Graphical Diagnostics

  10. 19 20

  11. [Figure: Histogram of Battery Life, before editing and after editing. Horizontal axis: Battery Life; vertical axis: Frequency.]

  12. [Figure: Boxplot of Battery Life. Axis: Battery Life, from 1.5 to 5.0.]

  13. 2 Probability
      Consider the experiment of tossing a die. If we are interested in the number facing up, the sample space is S = {1, 2, 3, 4, 5, 6}.

  14. 27 28

  15. 29 A  B = A  B  C = C’ = (A  B )  (B  C)  (A  C) = 30

  16. 2.3 Counting Sample Points

  17. In how many different ways can a buyer order one of these homes? By the multiplication rule, n = 4 · 3 = 12.
      Sam is going to assemble a computer. He has two choices of chip, four choices of hard drive, three choices of memory, and five choices of case. In how many different ways could Sam assemble the computer? n = 2 · 4 · 3 · 5 = 120.
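As a rough check of the multiplication rule in the computer example, the sketch below multiplies the option counts and also enumerates every configuration. The list `option_counts` and the use of index numbers to stand for parts are illustrative assumptions.

```python
from itertools import product
from math import prod

# Number of choices for chip, hard drive, memory, and case
option_counts = [2, 4, 3, 5]

# Multiplication rule: multiply the number of choices at each step
n_by_rule = prod(option_counts)

# Brute-force check: list every (chip, drive, memory, case) combination
configurations = list(product(*(range(c) for c in option_counts)))

print(n_by_rule, len(configurations))
```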

  18. 35 36

  19. In how many different ways could three individuals, A, B, and C, be arranged in a row from left to right? ABC, ACB, BAC, BCA, CAB, CBA: 3! = 6.
      Example: If three medals (gold, silver, bronze) are given to three students in a class of 25 and each student can receive at most one medal, how many possible selections could be made? 25P3 = 25!/(25 − 3)! = 13,800.

  20. Example: If three identical medals are given to three students in a class of 25 and each student can receive at most one medal, in how many possible ways could the three medals be given? $\binom{25}{3} = \frac{25!}{3!\,(25-3)!} = 2{,}300$.
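The permutation and combination counts in the medal examples can be reproduced with `math.perm` and `math.comb` (available in Python 3.8 and later). This is a small sketch; the variable names are illustrative.

```python
from itertools import permutations
from math import comb, factorial, perm

# Arrangements of A, B, C in a row: 3! orderings
arrangements = ["".join(p) for p in permutations("ABC")]

# Distinct medals (order matters): 25P3 = 25! / (25 - 3)!
distinct_medals = perm(25, 3)

# Identical medals (order does not matter): 25C3 = 25! / (3! (25 - 3)!)
identical_medals = comb(25, 3)

print(arrangements, factorial(3))
print(distinct_medals, identical_medals)
```

Dividing the ordered count by 3! gives the unordered count, since each set of three students can be ordered in 3! ways.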

  21. 2.4 Probability of an Event

  22. 2.5 Additive Rules

  23. 45 46

  24. 47 48

  25. 49 50

  26. 2.6 Conditional Probability, Independence, and the Product Rule

  27. Example: If an adult is chosen at random, what is the probability that a male is chosen, given that the person chosen is employed?
      P(E ∩ M) = n(E ∩ M)/n(S) = 460/900
      P(E) = n(E)/n(S) = 600/900
      P(M|E) = P(E ∩ M)/P(E) = 460/600 = 23/30
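The conditional probability above can be verified with exact fractions. The sketch uses only the counts stated in the example (900 adults, 600 employed, 460 employed males); the variable names are illustrative.

```python
from fractions import Fraction

n_S = 900    # adults in the sample space
n_E = 600    # employed adults
n_EM = 460   # employed males

p_E = Fraction(n_E, n_S)
p_EM = Fraction(n_EM, n_S)

# Definition of conditional probability: P(M|E) = P(E and M) / P(E)
p_M_given_E = p_EM / p_E

print(p_M_given_E)
```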

  28. 55 56

  29. 3 Random Variables and Probability Distributions

  30. 3.2 Discrete Probability Distributions

  31. 61 62

  32. 63 64

  33. Suppose that the number of crashes observed at an intersection over the Memorial Day weekend has the following probability distribution:

        x      0     1     2     3     4
        f(x)   0.2   0.1   0.3   0.3   0.1

      Find the probability of having 3 crashes. Find the probability of having 3 or more crashes.
      P(X = 3) = 0.3
      P(X ≥ 3) = 0.3 + 0.1 = 0.4
      3.3 Continuous Probability Distributions
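A minimal sketch of the crash example, storing the probability mass function as a dictionary. The name `pmf` is illustrative.

```python
import math

# Probability mass function from the table: x -> f(x)
pmf = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}

# A valid pmf must sum to 1
total = sum(pmf.values())

p_exactly_3 = pmf[3]
p_at_least_3 = sum(prob for x, prob in pmf.items() if x >= 3)

print(math.isclose(total, 1.0), p_exactly_3, round(p_at_least_3, 2))
```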

  34. 67 68

  35. For a continuous random variable X: P(X = a) = 0, and P(a < X < b) = P(X < b) − P(X < a).

  36. The weekly demand for Pepsi, in 1,000 liters, from a local store is a continuous random variable with the probability density function f(x) = 2(x − 1) for 1 < x < 2, and f(x) = 0 elsewhere. Find the probability that x = 1.5. Find the probability that x ≤ 1.5.
      [Figure: plot of f(x) against x.]
      P(X = 1.5) = 0
      $P(X \le 1.5) = \int_{1}^{1.5} 2(y - 1)\,dy = \left[ y^2 - 2y \right]_{1}^{1.5} = 0.25$
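The integral above can be approximated numerically as a sanity check. This sketch uses a simple midpoint rule rather than any particular library routine; `pdf` and `midpoint_integral` are illustrative helper names.

```python
def pdf(x):
    """Density of the weekly demand: 2(x - 1) on (1, 2), 0 elsewhere."""
    return 2.0 * (x - 1.0) if 1.0 < x < 2.0 else 0.0


def midpoint_integral(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


total_area = midpoint_integral(pdf, 1.0, 2.0)   # should be close to 1
p_le_1_5 = midpoint_integral(pdf, 1.0, 1.5)     # P(X <= 1.5)

print(round(total_area, 6), round(p_le_1_5, 6))
```

Because the density is linear on (1, 2), the midpoint rule is exact here up to floating-point rounding.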

  37. 4 Mathematical Expectation

  38. 75 76

  39. 77 78

  40. Suppose that the number of crashes observed at an intersection over the Memorial Day weekend has the following probability distribution:

        x      0     1     2     3     4
        f(x)   0.2   0.1   0.3   0.3   0.1

      Find the mean μ and the variance σ² of X.
      μ = E(X) = 0·0.2 + 1·0.1 + 2·0.3 + 3·0.3 + 4·0.1 = 2.0
      σ² = E[(X − μ)²] = (0 − 2)²·0.2 + (1 − 2)²·0.1 + (2 − 2)²·0.3 + (3 − 2)²·0.3 + (4 − 2)²·0.1 = 1.6
      The weekly demand for Pepsi, in 1,000 liters, from a local store is a continuous random variable with the probability density function f(x) = 2(x − 1) for 1 < x < 2, and f(x) = 0 elsewhere. Find the mean μ and the variance σ² of X.
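Both expectation exercises can be worked numerically. The sketch below sums over the crash table for the discrete case and, for the Pepsi density, approximates E(X) and E(X²) with a midpoint rule. The slides do not state the continuous answer; evaluating the integrals by hand gives μ = 5/3 and σ² = 1/18, which the approximation should reproduce. All helper names are illustrative.

```python
# Discrete case: crash counts
pmf = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}
mean_discrete = sum(x * p for x, p in pmf.items())
var_discrete = sum((x - mean_discrete) ** 2 * p for x, p in pmf.items())


# Continuous case: f(x) = 2(x - 1) on (1, 2)
def pdf(x):
    return 2.0 * (x - 1.0) if 1.0 < x < 2.0 else 0.0


def midpoint_integral(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


mean_continuous = midpoint_integral(lambda x: x * pdf(x), 1.0, 2.0)
second_moment = midpoint_integral(lambda x: x * x * pdf(x), 1.0, 2.0)
var_continuous = second_moment - mean_continuous ** 2   # Var(X) = E(X^2) - mu^2

print(round(mean_discrete, 4), round(var_discrete, 4))
print(round(mean_continuous, 4), round(var_continuous, 4))
```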

  41. 5 Some Discrete Probability Distributions
      The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent success/failure experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. If the random variable X follows the binomial distribution with parameters n and p, the probability of getting exactly x successes in n trials is given by the probability mass function $b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$ for x = 0, 1, ..., n.

  42. The probability that a certain kind of component will pass a shock test is 0.75. Find the probability that exactly 2 of the next 4 components tested pass. Find the probability that 2 or more of the next 4 components tested pass.
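The slides pose the shock-test question without showing the result. A minimal sketch using the standard binomial probability mass function, with n = 4 and p = 0.75 taken from the example; `binomial_pmf` is an illustrative helper name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 4, 0.75

p_exactly_2 = binomial_pmf(2, n, p)
p_at_least_2 = sum(binomial_pmf(x, n, p) for x in range(2, n + 1))

print(round(p_exactly_2, 4), round(p_at_least_2, 4))
```

Under these assumptions the values come to 27/128 (about 0.2109) and 243/256 (about 0.9492).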

  43. The hypergeometric distribution is a discrete probability distribution that describes the probability of x successes in n draws, without replacement, from a finite population of size N containing k successes. A random variable X follows the hypergeometric distribution if its probability mass function is given by $h(x; N, n, k) = \frac{\binom{k}{x}\binom{N - k}{n - x}}{\binom{N}{n}}$.
      Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling a lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot? Here n = 5, N = 40, k = 3, and x = 1.
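The lot-sampling question can be evaluated directly from the standard hypergeometric probability mass function. This is a sketch with the parameter values stated in the example; `hypergeometric_pmf` is an illustrative helper name, and the numeric answer is computed here rather than taken from the slides.

```python
from math import comb


def hypergeometric_pmf(x, N, n, k):
    """Probability of x successes in n draws without replacement
    from a population of N items containing k successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


p_one_defective = hypergeometric_pmf(1, N=40, n=5, k=3)

print(round(p_one_defective, 4))
```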

  44. The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space, if these events occur with a known average rate and independently of the time since the last event. A discrete random variable X is said to have a Poisson distribution with average rate λ > 0 over an interval of length t > 0 if, for x = 0, 1, 2, ..., the probability mass function of X is given by $p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}$.

  45. During a laboratory experiment, the average number of radioactive particles passing through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in a given millisecond? Here x = 6, λ = 4, and t = 1.
      6 Some Continuous Probability Distributions
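The radioactive-particle question can be evaluated with the standard Poisson probability mass function. This sketch uses x = 6, λ = 4, and t = 1 from the example; `poisson_pmf` is an illustrative helper name, and the numeric answer is computed here rather than taken from the slides.

```python
from math import exp, factorial


def poisson_pmf(x, rate, t=1.0):
    """Probability of exactly x events in an interval of length t,
    given an average of `rate` events per unit interval."""
    mean = rate * t
    return exp(-mean) * mean**x / factorial(x)


p_six_particles = poisson_pmf(6, rate=4, t=1)

print(round(p_six_particles, 4))
```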

  46. The continuous uniform distribution is a family of probability distributions such that, for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by the two parameters A and B, which are its minimum and maximum values. The probability density function of the continuous uniform random variable X is $f(x; A, B) = \frac{1}{B - A}$ for A ≤ x ≤ B, and 0 elsewhere.

  47. In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution that has a bell-shaped probability density function, known as the Gaussian function or informally the bell curve: $n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$. The parameter μ is the mean or expectation (the location of the peak), σ² is the variance, and σ is the standard deviation. The distribution with μ = 0 and σ² = 1 is called the standard normal distribution.

  48. 95 96

  49. 6.3 Areas under the Normal Curve

  50. 99 100

  51. An arbitrary normal random variable X can be transformed into a standard normal variable Z by means of the transformation Z = (X − μ)/σ.
      (a) P(Z > 1.84) = 1 − P(Z < 1.84) = 1 − 0.9671 = 0.0329
      (b) P(−1.97 < Z < 0.86) = P(Z < 0.86) − P(Z < −1.97) = 0.8051 − 0.0244 = 0.7807
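The table lookups in (a) and (b) can be reproduced with `statistics.NormalDist` from the Python standard library (3.8 and later). The `standardize` helper and its sample arguments are illustrative and not taken from the slides.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # standard normal distribution


def standardize(x, mu, sigma):
    """Transform a value of X ~ N(mu, sigma^2) into a z-score."""
    return (x - mu) / sigma


# (a) upper-tail area beyond z = 1.84
p_a = 1.0 - Z.cdf(1.84)

# (b) area between z = -1.97 and z = 0.86
p_b = Z.cdf(0.86) - Z.cdf(-1.97)

print(round(p_a, 4), round(p_b, 4))
print(standardize(13.0, mu=10.0, sigma=2.0))   # hypothetical values for X
```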
