Chapter 11 Categorical Data Analysis Categorical Data and the - PowerPoint PPT Presentation


  1. Chapter 11 Categorical Data Analysis

  2. Categorical Data and the Multinomial Distribution
     Properties of the Multinomial Experiment
     1. The experiment has n identical trials.
     2. There are k possible outcomes to each trial, called classes, categories, or cells.
     3. The probabilities of the k outcomes remain constant from trial to trial.
     4. The trials are independent.
     5. The variables of interest are the cell counts $n_1, n_2, \dots, n_k$, the number of observations that fall into each of the k classes.

  3. Testing Category Probabilities: One-Way Table
     In a multinomial experiment with categorical data from a single qualitative variable, we summarize the data in a one-way table.
     Schema for a one-way table for an experiment with k outcomes:

       Category | 1     | 2     | … | k
       Count    | $n_1$ | $n_2$ | … | $n_k$

  4. Testing Category Probabilities: One-Way Table
     Hypothesis Testing for a One-Way Table
     • The test is based on the $\chi^2$ statistic, which compares the observed distribution of counts with an expected distribution of counts across the k classes.
     • Expected distribution: $E(n_i) = n p_i$, where n is the total number of trials and $p_i$ is the hypothesized probability of being in class i according to $H_0$.
     • The test statistic is calculated as $\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$, and the rejection region is determined by the $\chi^2$ distribution using $k - 1$ df and the desired $\alpha$.
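
The statistic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `chi_square_one_way` and the sample counts are assumptions chosen for the example.

```python
def chi_square_one_way(observed, probs):
    """Return (statistic, expected counts, df) for a one-way table.

    observed: cell counts n_1..n_k
    probs:    hypothesized probabilities p_1..p_k under H0
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    n = sum(observed)                   # total number of trials
    expected = [n * p for p in probs]   # E(n_i) = n * p_i
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected, len(observed) - 1   # df = k - 1


# Hypothetical counts for k = 3 classes, tested against equal probabilities.
stat, expected, df = chi_square_one_way([10, 20, 30], [1 / 3, 1 / 3, 1 / 3])
print(stat, expected, df)
```

The statistic is then compared with the upper-tail critical value of the $\chi^2$ distribution with $k - 1$ df, which would normally come from a table or a statistics package.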

  5. Testing Category Probabilities: One-Way Table
     Hypothesis Testing for a One-Way Table
     • The null hypothesis is often formulated as no difference, $H_0: p_1 = p_2 = p_3 = \dots = p_k = 1/k$, but it can also be formulated with unequal probabilities.
     • The alternative hypothesis is $H_a$: at least one of the multinomial probabilities does not equal its hypothesized value.

  7. Testing Category Probabilities: One-Way Table
     One-Way Tables: an example
     $H_0: p_{\text{None}} = .10,\ p_{\text{Standard}} = .65,\ p_{\text{Merit}} = .25$
     $H_a$: At least two of the proportions differ from the proposed plan.
     Expected counts = Total × p.
     Rejection region: with $\alpha = .01$ and df = $k - 1 = 2$, reject $H_0$ if $\chi^2 > 9.21034$.
     Since the test statistic falls in the rejection region, we reject $H_0$.
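
The arithmetic of this example can be sketched as follows. The hypothesized proportions and $\alpha$ come from the slide; the observed counts are hypothetical, since the slide's data table did not survive in the transcript. The closed form used for the critical value holds only for df = 2, where the $\chi^2$ upper-tail probability is $e^{-x/2}$.

```python
import math

# Hypothesized proportions from H0 on the slide.
probs = {"None": 0.10, "Standard": 0.65, "Merit": 0.25}

# HYPOTHETICAL observed counts (not from the source), total n = 200.
observed = {"None": 40, "Standard": 115, "Merit": 45}

n = sum(observed.values())
expected = {c: n * p for c, p in probs.items()}   # Expected = Total x p
statistic = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in probs
)

# For df = 2 only: P(chi-square > x) = exp(-x / 2),
# so the critical value is -2 * ln(alpha).
alpha = 0.01
critical = -2.0 * math.log(alpha)

reject = statistic > critical
print(round(critical, 5), round(statistic, 3), reject)
```

With these made-up counts the statistic exceeds the critical value, so $H_0$ would be rejected; different counts could lead to the opposite decision.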

  8. Testing Category Probabilities: One-Way Table
     Conditions Required for a Valid $\chi^2$ Test
     • A multinomial experiment has been conducted.
     • The sample size is large, with $E(n_i)$ at least 5 for every cell.
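
The large-sample condition is easy to check before running the test. The helper below is a hypothetical sketch (the name `expected_counts_ok` and the sample sizes are assumptions); it reuses the proportions from the one-way example.

```python
def expected_counts_ok(n, probs, minimum=5):
    """Return True if every expected count n * p_i is at least `minimum`."""
    return all(n * p >= minimum for p in probs)


# Hypothesized proportions from the one-way example; sample sizes are hypothetical.
probs = [0.10, 0.65, 0.25]
large_sample_ok = expected_counts_ok(200, probs)   # smallest expected count is 20
small_sample_ok = expected_counts_ok(40, probs)    # smallest expected count is 4
print(large_sample_ok, small_sample_ok)
```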

  9. Testing Category Probabilities: Two-Way (Contingency) Table
     Used when classifying with two qualitative variables.
     General r × c Contingency Table

       Row \ Column  | 1        | 2        | … | c        | Row Totals
       1             | $n_{11}$ | $n_{12}$ | … | $n_{1c}$ | $R_1$
       2             | $n_{21}$ | $n_{22}$ | … | $n_{2c}$ | $R_2$
       …             | …        | …        | … | …        | …
       r             | $n_{r1}$ | $n_{r2}$ | … | $n_{rc}$ | $R_r$
       Column Totals | $C_1$    | $C_2$    | … | $C_c$    | n

     $H_0$: The two classifications are independent.
     $H_a$: The two classifications are dependent.
     Test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{[n_{ij} - E_{ij}]^2}{E_{ij}}$, where $E_{ij} = \frac{R_i C_j}{n}$.
     Rejection region: $\chi^2 > \chi^2_\alpha$, where $\chi^2_\alpha$ has $(r-1)(c-1)$ df.
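
The contingency-table computation can be sketched in plain Python. This is an illustrative sketch only: the function name `chi_square_independence` and the 2 × 2 table of counts are hypothetical, not taken from the slides.

```python
def chi_square_independence(table):
    """Return (statistic, expected counts, df) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]         # R_i
    col_totals = [sum(col) for col in zip(*table)]   # C_j
    n = sum(row_totals)
    # Expected count under independence: E_ij = R_i * C_j / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, expected, df


# Hypothetical 2 x 2 table of observed counts.
table = [[20, 30],
         [30, 20]]
stat, expected, df = chi_square_independence(table)
print(stat, expected, df)
```

Every expected count here is 25, so the large-sample condition holds; the statistic would then be compared with $\chi^2_\alpha$ on $(r-1)(c-1) = 1$ df.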

  10. Testing Category Probabilities: Two-Way (Contingency) Table
      Conditions Required for a Valid $\chi^2$ Test
      • The n observed counts are a random sample from the population of interest.
      • The sample size is large, with the expected count $E_{ij}$ at least 5 for every cell.

  11. Testing Category Probabilities: Two-Way (Contingency) Table Sample Statistical package output

  12. A Word of Caution about Chi-Square Tests
      • When an expected cell count is less than 5, the $\chi^2$ probability distribution should not be used.
      • If $H_0$ is not rejected, do not accept $H_0$ (that the classifications are independent), because of the risk of a Type II error.
      • Do not infer causality when $H_0$ is rejected. Contingency table analysis determines statistical dependence only.
