Samples and Statistics

ST 435/535 Statistical Methods for Quality and Productivity Improvement / Statistical Process Control
Inferences About Process Quality: Statistics and Sampling Distributions

Samples and Statistics

“The objective of statistical inference is to draw conclusions or make decisions about a population, based on a sample selected from the population.”

Inference is simplest when the sample is a random sample from the population: the sample values $X_1, X_2, \ldots, X_n$ are statistically independent and all have the same distribution.

That is not possible when sampling without replacement from a finite population; in that case, a random sample is one that is drawn in such a way that all $\binom{N}{n}$ possible samples of size $n$ from the $N$ population members have the same probability of being chosen.
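As a small illustration of the finite-population case, the sketch below enumerates every sample of size $n$ from a tiny population and draws one of them at random. The population values, the sample size, and the seed are made up for the example.

```python
import itertools
import math
import random

population = ["a", "b", "c", "d", "e"]  # hypothetical finite population, N = 5
n = 2                                    # sample size

# Enumerate every possible sample of size n drawn without replacement.
all_samples = list(itertools.combinations(population, n))
num_samples = math.comb(len(population), n)  # binomial coefficient C(N, n)

# Under random sampling, each of these samples is equally likely.
prob_each = 1 / num_samples

# random.sample draws without replacement, giving one such sample.
rng = random.Random(0)  # fixed seed only so the run is repeatable
one_sample = rng.sample(population, n)

print(num_samples, prob_each, one_sample)
```

The values within such a sample are not independent: once one member has been drawn, it cannot be drawn again.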

It is not always possible or desirable to use a random sample. For example, the successive values plotted in a control chart are rarely independent, because they are influenced by slow-changing properties of the system.

When we know, or suspect, that the sample was not a random sample, we should use methods that are appropriate for the way the data were actually collected.

Statistic

A statistic is a quantity that can be calculated from only the values in a sample.

Examples of statistics:

Sample mean:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i ; \]

Sample standard deviation:
\[ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2} ; \]

A quantity like $\bar{x} - \mu$ is not a statistic, because to calculate it we must know the value of the population parameter $\mu$.
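A minimal sketch of the two statistics, computed directly from the formulas and cross-checked against Python's standard `statistics` module. The data values are invented for illustration.

```python
import math
import statistics

x = [9.8, 10.1, 10.4, 9.9, 10.3]  # made-up sample values
n = len(x)

# Sample mean: (1/n) * sum of x_i
x_bar = sum(x) / n

# Sample standard deviation: square root of the sum of squared
# deviations from the mean, divided by n - 1
s = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))

# Cross-check against the standard library implementations.
print(x_bar, statistics.mean(x))
print(s, statistics.stdev(x))
```

Both quantities use only the sample values, which is what makes them statistics.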

Sampling distribution

A statistic computed from a random sample is itself a random variable, and has its own probability distribution.

The distribution of a statistic of a random sample is called its sampling distribution, to emphasize that we are dealing with a statistic and not a single observation.

Sampling from a normal distribution

Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal population with mean $\mu$ and variance $\sigma^2$. That is, $X_1, X_2, \ldots, X_n$ are independent, and each is distributed as $N(\mu, \sigma^2)$.

Then the sampling distribution of the sample mean $\bar{X}$ is $N(\mu, \sigma^2/n)$, or equivalently
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1). \]
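The result for $\bar{X}$ can be checked informally by simulation. The sketch below draws many normal samples and compares the mean and standard deviation of the resulting sample means with $\mu$ and $\sigma/\sqrt{n}$. The parameter values, number of replications, and seed are arbitrary choices for the example.

```python
import math
import random
import statistics

mu, sigma, n = 50.0, 4.0, 16  # hypothetical population parameters and sample size
reps = 20000                  # number of simulated samples
rng = random.Random(1)        # fixed seed only so the run is repeatable

# One sample mean per simulated sample of size n.
means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
theory_sd = sigma / math.sqrt(n)  # standard deviation of N(mu, sigma^2 / n)

print(mean_of_means, mu)
print(sd_of_means, theory_sd)
```

A simulation like this only shows agreement up to Monte Carlo error; it does not replace the exact result.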

The sampling distribution of the sample variance is a scaled chi-square distribution:
\[ \chi^2 = \frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}. \]

The $\chi^2$ distribution with $\nu$ degrees of freedom, here $n - 1$, is the Gamma distribution with shape parameter $r = \nu/2$ and rate parameter $\lambda = 1/2$.
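One way to see the Gamma connection is through moments: a Gamma distribution with shape $r$ and rate $\lambda$ has mean $r/\lambda$ and variance $r/\lambda^2$, so the $\chi^2_\nu$ distribution has mean $\nu$ and variance $2\nu$. The sketch below compares these with simulated values of $(n-1)S^2/\sigma^2$. The parameters, replication count, and seed are assumptions made for the example.

```python
import random
import statistics

mu, sigma, n = 0.0, 2.0, 10  # hypothetical normal population and sample size
reps = 20000
rng = random.Random(2)       # fixed seed only so the run is repeatable


def scaled_variance():
    """Return (n - 1) * S^2 / sigma^2 for one simulated normal sample."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    return (n - 1) * s2 / sigma**2


values = [scaled_variance() for _ in range(reps)]

nu = n - 1
shape, rate = nu / 2, 1 / 2
gamma_mean = shape / rate    # equals nu
gamma_var = shape / rate**2  # equals 2 * nu

sim_mean = statistics.fmean(values)
sim_var = statistics.variance(values)
print(sim_mean, gamma_mean)
print(sim_var, gamma_var)
```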

These sampling distributions are used to derive confidence intervals for $\mu$ and $\sigma^2$, respectively. However, the confidence interval for $\mu$ requires that we know the value of $\sigma$; this is rarely the case.

When $\sigma$ is unknown, we use a third sampling result: the sampling distribution of
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
is Student’s $t$-distribution with $n - 1$ degrees of freedom.

Sampling from a Bernoulli distribution

Recall the notion of a sequence of independent trials, each resulting in success or failure, used to introduce the binomial distribution.

Let $X_i$ be the indicator of success at the $i$th trial:
\[ X_i = \begin{cases} 1 & \text{if the $i$th trial is a success;} \\ 0 & \text{if the $i$th trial is a failure.} \end{cases} \]

Each $X_i$ follows the Bernoulli distribution with parameter $p = P(X_i = 1)$.

The number of successes in $n$ trials is $X = X_1 + X_2 + \cdots + X_n$, which follows the binomial distribution with parameters $n$ and $p$.

The sample mean $\bar{X} = X/n = \hat{p}$ also has a discrete distribution, most easily described in terms of the distribution of $X$; in particular $\mathrm{E}(\bar{X}) = p$ and $\mathrm{Var}(\bar{X}) = p(1-p)/n$.

By the Central Limit Theorem, $\bar{X}$ is approximately normal, $N(p, p(1-p)/n)$.
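The moments of $\hat{p}$ can be verified exactly from the binomial probabilities, and the normal approximation can be compared with the exact distribution at one point. This is a sketch with arbitrarily chosen $n$, $p$, and cut-off; no continuity correction is applied.

```python
import math
import statistics

n, p = 20, 0.3  # hypothetical number of trials and success probability


def binom_pmf(k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# p_hat = X / n takes the value k / n with probability P(X = k).
mean_p_hat = sum((k / n) * binom_pmf(k) for k in range(n + 1))
var_p_hat = sum((k / n - mean_p_hat) ** 2 * binom_pmf(k) for k in range(n + 1))
print(mean_p_hat, p)
print(var_p_hat, p * (1 - p) / n)

# Compare P(p_hat <= 0.4) exactly and under the approximating normal.
k_max = 8  # p_hat <= 0.4 is the same event as X <= 8 when n = 20
exact_cdf = sum(binom_pmf(k) for k in range(k_max + 1))
approx_cdf = statistics.NormalDist(p, math.sqrt(p * (1 - p) / n)).cdf(k_max / n)
print(exact_cdf, approx_cdf)
```

With $n$ this small the two probabilities differ noticeably; the approximation improves as $n$ grows.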

Sampling from a Poisson distribution

If $X_1, X_2, \ldots, X_n$ are independent and each has the Poisson distribution with parameter $\lambda$, then $X = X_1 + X_2 + \cdots + X_n$ follows the Poisson distribution with parameter $n\lambda$.

The sample mean $\bar{X} = X/n = \hat{\lambda}$ also has a discrete distribution, most easily described in terms of the distribution of $X$; in particular $\mathrm{E}(\bar{X}) = \lambda$ and $\mathrm{Var}(\bar{X}) = \lambda/n$.

By the Central Limit Theorem, $\bar{X}$ is approximately normal, $N(\lambda, \lambda/n)$.

More generally, if $X_1, X_2, \ldots, X_n$ are independent and $X_i$ has the Poisson distribution with parameter $\lambda_i$, then $X = X_1 + X_2 + \cdots + X_n$ follows the Poisson distribution with parameter $\sum_{i=1}^{n} \lambda_i$.
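For two variables, the additivity property can be checked numerically: the distribution of $X_1 + X_2$ is obtained by convolving the two Poisson probability functions, and it should match the Poisson distribution with parameter $\lambda_1 + \lambda_2$. The parameter values below are arbitrary.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam1, lam2 = 1.5, 2.5  # hypothetical Poisson parameters


def sum_pmf(k):
    """P(X1 + X2 = k) for independent X1, X2, computed by convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))


# Largest discrepancy from Poisson(lam1 + lam2) over k = 0, ..., 14.
max_diff = max(abs(sum_pmf(k) - poisson_pmf(k, lam1 + lam2)) for k in range(15))
print(max_diff)
```

The discrepancy is at the level of floating-point rounding. Repeating the argument one variable at a time gives the result for $n$ variables.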

Point Estimation

In any of these sampling contexts, we need to make inferences about the parameter(s) of the corresponding model.

A point estimator of a parameter is a sample statistic that approximates the parameter. As a statistic, it has a sampling distribution, with a mean and a variance.

The standard deviation of its sampling distribution is called its standard error.

If an estimator $\hat{\theta}$ of some parameter $\theta$ satisfies $\mathrm{E}(\hat{\theta}) = \theta$, it is called unbiased. In some situations, but not all, unbiased estimators are best.

The mean squared error of an estimator $\hat{\theta}$ of some parameter $\theta$ is
\[ \mathrm{E}[(\hat{\theta} - \theta)^2] = \mathrm{bias}(\hat{\theta})^2 + \mathrm{Var}(\hat{\theta}), \]
which for an unbiased $\hat{\theta}$ is just $\mathrm{Var}(\hat{\theta})$.

In a random sample, the sample mean $\bar{X}$ and variance $s^2$ are always unbiased estimators of the population mean $\mu$ and variance $\sigma^2$, respectively, but $s$ is biased for $\sigma$.
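The sketch below illustrates both points by simulation, using $s$ as an estimator of $\sigma$ for normal samples: the average of $s$ falls below $\sigma$, and the simulated mean squared error splits into squared bias plus variance. The sample size, replication count, and seed are assumptions made for the example.

```python
import math
import random
import statistics

sigma, n, reps = 1.0, 5, 20000  # hypothetical true sigma, sample size, replications
rng = random.Random(3)          # fixed seed only so the run is repeatable


def sample_sd():
    """Sample standard deviation s of one simulated N(0, sigma^2) sample."""
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))


estimates = [sample_sd() for _ in range(reps)]

bias = statistics.fmean(estimates) - sigma   # simulated bias of s
var = statistics.pvariance(estimates)        # variance of the simulated estimates
mse = statistics.fmean((e - sigma) ** 2 for e in estimates)

print(bias)                  # negative: s tends to underestimate sigma
print(mse, bias**2 + var)    # the two sides of the decomposition
```

Using the population form of the variance (`pvariance`) makes the decomposition hold exactly for the simulated values, up to rounding.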

In some situations, the sample range, $x_{(n)} - x_{(1)}$, has been used to construct an estimator of the population standard deviation $\sigma$ because it requires little computation.

This construction is critically dependent on the assumption that the data are normally distributed; for any other distribution, the relationship between the range and the standard deviation is different.
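As an illustration of that dependence, the sketch below uses one common form of the range-based estimator, $R/d_2$, where $d_2$ is the expected range of a standard normal sample of size $n$. The slides do not name this constant; the value $d_2 = 2.326$ for $n = 5$ is taken from standard control-chart tables and is an assumption of this example, as are the two populations compared. Both populations have standard deviation 1.

```python
import random
import statistics

n, reps = 5, 20000
d2 = 2.326              # tabulated constant for n = 5 under normality (assumed here)
rng = random.Random(4)  # fixed seed only so the run is repeatable


def mean_range_estimate(draw):
    """Average of R / d2 over many simulated samples from the given generator."""
    estimates = []
    for _ in range(reps):
        xs = [draw() for _ in range(n)]
        estimates.append((max(xs) - min(xs)) / d2)
    return statistics.fmean(estimates)


# Normal population with sigma = 1: the estimator is close to 1 on average.
normal_est = mean_range_estimate(lambda: rng.gauss(0.0, 1.0))

# Exponential population with rate 1 (also sigma = 1): the same constant
# no longer fits, and the estimator is too small on average.
expo_est = mean_range_estimate(lambda: rng.expovariate(1.0))

print(normal_est, expo_est)
```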

Inference for a Single Sample

Inferences about some parameter may be made using: a point estimator; an interval estimator; a hypothesis test.

Mean of a normal population

Point estimator

The usual point estimator of $\mu$ is the unbiased $\bar{X}$.

The sampling distribution of $\bar{X}$ is $N(\mu, \sigma^2/n)$, so its standard error is $\sigma/\sqrt{n}$.

When $\sigma$ is unknown, we replace it by $s$ to get the estimated standard error $s/\sqrt{n}$.

Interval estimator

The usual interval estimator is a confidence interval, derived from the distribution of $Z$ (when $\sigma$ is known) or $T$ (when $\sigma$ is unknown).

Known $\sigma$:
\[ \bar{X} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Unknown $\sigma$:
\[ \bar{X} \pm t_{\alpha/2,\, n-1} \times \frac{s}{\sqrt{n}} \]

In each case, the interval contains $\mu$ with probability $1 - \alpha$, and is called a $100(1 - \alpha)\%$ confidence interval.

The confidence level $100(1 - \alpha)\%$ is often 95%, but sometimes 99% is preferred.
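A minimal sketch of both intervals at the 95% level. The data and the "known" value of $\sigma$ are invented. The normal quantile comes from the standard library; the $t$ quantile $t_{0.025,\,9} = 2.262$ is hard-coded from a standard $t$ table because the standard library has no $t$-distribution, so it is only valid for this $n$ and $\alpha$.

```python
import math
import statistics

x = [10.2, 9.9, 10.4, 10.1, 9.8, 10.0, 10.3, 9.7, 10.2, 10.1]  # made-up data
n = len(x)
alpha = 0.05
x_bar = statistics.fmean(x)

# Known sigma: x_bar +/- z_{alpha/2} * sigma / sqrt(n)
sigma = 0.25  # assumed known for the sake of the example
z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
half_z = z * sigma / math.sqrt(n)
ci_known = (x_bar - half_z, x_bar + half_z)

# Unknown sigma: x_bar +/- t_{alpha/2, n-1} * s / sqrt(n)
s = statistics.stdev(x)
t_crit = 2.262  # t_{0.025, 9} from a standard t table (n = 10, alpha = 0.05 only)
half_t = t_crit * s / math.sqrt(n)
ci_unknown = (x_bar - half_t, x_bar + half_t)

print(ci_known)
print(ci_unknown)
```

For other sample sizes or confidence levels, the hard-coded $t$ quantile would need to be replaced by the corresponding table value.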
