Screening the Data for Detecting Methodologically Induced Variation
Jörg Blasius, University of Bonn, Germany
Victor Thiessen, Dalhousie University, Halifax, Canada
6th CARME Conference, Rennes, France, February 9-11, 2011
Substantive and Non-Substantive Variation

Non-substantive variation, produced by
• Response styles, such as acquiescence, disacquiescence, extreme response styles, midpoint responding, wide range responding, …
• Hidden non-responses (using the midpoint, random responses, …)
• Misunderstanding of questions
• Translation and coding errors (in cross-national surveys)
• Different field work standards (in cross-national surveys)
• Missing data (item non-response)
• Social desirability
• Primacy and recency effects
• Fatigue effects
• Biased samples (unit non-response)
• Faked interviews
… which is often summarized as “Measurement Error”
Substantive variation, produced by individual attributes and depending on cognitive competencies (which have an effect on the dimensionality of the solution; Thiessen/Blasius, 2008)

Quality of Data: The higher the share of substantive variation, or the lower the share of non-substantive variation, the higher the quality of the data.

But how can the quality of data be assessed?
Canadian Nationwide Election Study 1984: “Political Trust and Efficacy Data” (N = 3,377; row percentages)

Item    SA    AS    NN    DS    SD    NO   Wording
a)    26.6  44.5   3.5  16.1   4.8   4.5   Generally, those elected to Parliament soon lose touch with the people.
b)    26.9  32.9   3.8  24.2   9.0   3.2   I don't think the (Federal) Government cares much about what people like me think.
c)    30.8  33.1   2.5  19.1  12.6   1.9   Sometimes, (Federal) Politics and Government seem so complicated that a person like me can't really understand what's going on.
d)    33.4  28.3   2.2  20.0  14.0   2.1   People like me don't have any say about what the Government in (Ottawa) does.
e)     7.8   9.9   1.8  16.0  62.8   1.7   So many other people vote in (Federal) elections that it does not matter very much whether I vote or not.
f)    10.5  25.1  10.1  24.6  18.2  11.5   Many people in the (Federal) Government are dishonest.
g)    46.3  33.2   3.9   9.0   3.6   4.1   People in the (Federal) Government waste a lot of the money we pay in taxes.
h)    10.4  46.0   6.2  23.5   9.7   4.2   Most of the time we can trust people in the (Federal) Government to do what is right.
i)    15.9  45.5   5.9  21.0   8.2   3.6   Most of the people running the (Federal) Government are smart people who usually know what they are doing.
Subset Multiple Correspondence Analysis (SMCA)

SMCA concentrates on just some of the response categories while excluding others from the solution (Greenacre and Pardo 2006, Greenacre 2007). For example, with SMCA the structure of the subset of NOs can be analyzed separately, or these responses can be excluded from the solution while concentrating only on the substantive responses.

Suppose we have five variables with four categories each, ranging from SA to SD. Every row of the indicator matrix sums to 5, so the row profile values are either 0.2 or zero. SMCA keeps these profile values, and with them the equal weighting of all respondents, when only a subset of categories is analyzed. If we concentrate on SA, respondents with five answers on SA have five profile values of 0.2 (row sum 1.0), respondents with four answers on SA have four profile values of 0.2 (row sum 0.8), and respondents with two answers on SA have two profile values of 0.2 (row sum 0.4). If the other categories were simply omitted and the profiles recomputed, the latter respondents would instead have four profile values of 0.25 (or two values of 0.5) and a row sum of one.
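The profile arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper names and the three respondents (five, four, and two SA answers) are illustrative assumptions, not data from the study.

```python
# Five items with four categories each: every respondent picks exactly one
# category per item, so each row of the indicator matrix sums to 5.
N_ITEMS = 5

def smca_sa_profile(n_sa):
    """Profile values on the five SA columns when the original row sum (5) is kept.

    n_sa: number of items a (hypothetical) respondent answered with SA.
    """
    return [1 / N_ITEMS] * n_sa + [0.0] * (N_ITEMS - n_sa)

def omitted_profile(n_sa):
    """Profile values if the other categories are dropped and rows re-normalised."""
    return [1 / n_sa] * n_sa + [0.0] * (N_ITEMS - n_sa)

for n_sa in (5, 4, 2):
    kept = smca_sa_profile(n_sa)
    dropped = omitted_profile(n_sa)
    print(
        f"{n_sa} SA answers: SMCA row sum = {sum(kept):.1f}, "
        f"re-normalised value = {max(dropped):.2f}, "
        f"re-normalised row sum = {sum(dropped):.1f}"
    )
```

Keeping the original row sums means that a respondent with only two SA answers carries less weight in the SA subset than one with five, whereas re-normalising would treat both as equally "complete".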
SMCA, Burt Table

Set 1: a1 a2 a4 a5 b1 b2 ... i1 i2 i4 i5
Set 2: a3 a9 b3 b9 c3 c9 ... i3 i9

              Set 1 columns                Set 2 columns
Set 1 rows    Subset MCA, Set 1            Interaction, Set 1 × Set 2
Set 2 rows    Interaction, Set 2 × Set 1   Subset MCA, Set 2
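The block structure of the Burt table can be illustrated with a small sketch. It assumes that the inertia of each block is the sum of squared standardised residuals computed with the margins of the full Burt table; the data are synthetic (three hypothetical items, random answers), and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic responses (an assumption, not the study data): 200 respondents,
# 3 items, answer codes 1, 2, 3, 4, 5 and 9.
codes = np.array([1, 2, 3, 4, 5, 9])
answers = rng.choice(codes, size=(200, 3))

# Indicator matrix: one 0/1 column per item-category combination.
Z = np.hstack([(answers[:, [q]] == codes).astype(float) for q in range(3)])
labels = [(q, c) for q in range(3) for c in codes]

# Burt table and its standardised residuals, using the full-table margins.
B = Z.T @ Z
P = B / B.sum()
mass = P.sum(axis=0)
S = (P - np.outer(mass, mass)) / np.sqrt(np.outer(mass, mass))

# Set 1 holds categories 1, 2, 4, 5; Set 2 holds categories 3 and 9.
set1 = [k for k, (_, c) in enumerate(labels) if c in (1, 2, 4, 5)]
set2 = [k for k, (_, c) in enumerate(labels) if c in (3, 9)]

def block_inertia(rows, cols):
    """Sum of squared standardised residuals in one block of the Burt table."""
    return float((S[np.ix_(rows, cols)] ** 2).sum())

total = float((S ** 2).sum())
inertia_set1 = block_inertia(set1, set1)
inertia_set2 = block_inertia(set2, set2)
interaction = block_inertia(set1, set2)

print(f"total            = {total:.4f}")
print(f"set 1 + set 2    = {inertia_set1 + inertia_set2:.4f}")
print(f"2 x interaction  = {2 * interaction:.4f}")
```

Because the Burt table is symmetric, the two interaction blocks carry the same inertia, so the total splits into the two subset analyses plus twice the interaction.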
Constructing a Two-Dimensional Map by Means of (Subset) Multiple Correspondence Analysis
• Best method to see different kinds of methodologically induced variation, for example, response sets, as well as to distinguish between methodologically induced and substantive variation.
• In MCA and SMCA, similarities between variable categories (or between respondents) are reflected by short (Euclidean) distances, dissimilarities by large distances.
• If the quality of data is high, the first dimension in MCA/SMCA should capture mainly substantive variation due to political efficacy and trust, with the second dimension reflecting the horseshoe.
• The items associated with the first dimension should retain their ordinality in this dimension.
• If people did not pay attention to the direction of the questions, the responses to the negatively formulated items will not conform to an ordinal scale.
• The horseshoe might also appear on the first dimension (large amount of non-substantive variation) or between dimensions 1 and 2 (two-dimensional solution; the data might be of high quality).
• If there is a high intercorrelation within the non-substantive responses, the first or second dimension in MCA will just reflect the difference between substantive and non-substantive responses. In SMCA the non-substantive responses can be excluded without the loss of information that listwise deletion entails.
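The ordinality check described in the points above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it runs plain MCA on an indicator matrix via the singular value decomposition, uses synthetic answers driven by a single latent trait, and the function names `mca_category_coordinates` and `retains_ordinality` are hypothetical.

```python
import numpy as np

def mca_category_coordinates(Z, n_dims=2):
    """Principal coordinates of the categories from an indicator matrix Z."""
    P = Z / Z.sum()
    r = P.sum(axis=1)                      # respondent masses
    c = P.sum(axis=0)                      # category masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    _, sv, Vt = np.linalg.svd(S, full_matrices=False)
    coords = (Vt.T / np.sqrt(c)[:, None]) * sv
    return coords[:, :n_dims], sv[:n_dims] ** 2

def retains_ordinality(values):
    """True if the category coordinates are strictly increasing or decreasing."""
    steps = np.diff(values)
    return bool(np.all(steps > 0) or np.all(steps < 0))

# Synthetic data (an assumption): 500 respondents, 5 items, 4 ordered categories,
# all driven by one latent trait plus noise.
rng = np.random.default_rng(1)
n, n_items, n_cats = 500, 5, 4
trait = rng.normal(size=(n, 1))
latent = trait + 0.5 * rng.normal(size=(n, n_items))
answers = np.digitize(latent, [-1.0, 0.0, 1.0])          # codes 0..3
Z = np.hstack([np.eye(n_cats)[answers[:, q]] for q in range(n_items)])

coords, inertias = mca_category_coordinates(Z)
for q in range(n_items):
    dim1 = coords[q * n_cats:(q + 1) * n_cats, 0]
    print(f"item {q}: ordinal on dimension 1 = {retains_ordinality(dim1)}")
```

With real survey data, an item whose categories fail this check on the first dimension would be a candidate for closer inspection, for example for respondents ignoring the direction of a negatively formulated question.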
[Figure: MCA map, Fedgov, N = 3,377, all respondents; two-dimensional map of all categories a1 to i9]
[Figure: SMCA (1,2,3,4,5), Fedgov, N = 3,377; two-dimensional map of categories 1 to 5 of items a to i]
[Figure: SMCA (3,9), Fedgov, N = 3,377; two-dimensional map of categories 3 and 9 of items a to i]
[Figure: SMCA (1), Fedgov, N = 3,377; two-dimensional map of categories a1 to g1, h5, i5]
[Figure: SMCA (2), Fedgov, N = 3,377; two-dimensional map of categories a2 to g2, h4, i4]
[Figure: SMCA (3), Fedgov, N = 3,377; two-dimensional map of categories a3 to i3]
[Figure: SMCA (4), Fedgov, N = 3,377; two-dimensional map of categories a4 to g4, h2, i2]
[Figure: SMCA (5), Fedgov, N = 3,377; two-dimensional map of categories a5 to g5, h1, i1]
[Figure: SMCA (9), Fedgov, N = 3,377; two-dimensional map of categories a9 to i9]
Decomposition of inertia, SMCA, Federal Government

                          Dimension 1      Dimension 2      Total
Model               K     Abs.    In %     Abs.    In %     Abs.    In %
All categories      45    0.1118  15.8     0.1036  14.6     0.7083  100.0

Subset(1,2,3,4,5)   45    0.1107  20.3     0.0625  11.5     0.5441   76.8
Subset(9)            9    0.0929  62.7     0.0117   7.9     0.1481   20.9
Interaction          9    0.0046  57.8     0.0010  12.7     0.0080    1.1

Subset(1,2,4,5)     36    0.1095  25.9     0.0599  14.2     0.4225   59.6
Subset(3,9)         18    0.0934  36.2     0.0327  12.6     0.2583   36.5
Interaction         18    0.0066  47.9     0.0017  12.2     0.0137    1.9

Subset(1)            9    0.0543  56.7     0.0128  13.3     0.0959   13.5
Subset(2)            9    0.0150  24.1     0.0114  18.3     0.0622    8.8
Subset(3)            9    0.0326  30.0     0.0140  12.9     0.1086   15.3
Subset(4)            9    0.0229  32.4     0.0093  13.2     0.0705   10.0
Subset(5)            9    0.0442  45.5     0.0138  14.2     0.0972   13.7
Subset(9)            9    0.0929  62.7     0.0117   7.9     0.1481   20.9

The percentages for dimensions 1 and 2 refer to the total inertia of the respective model; the percentages in the Total column refer to the total inertia of all categories.
Subset(1): first category of items “a” to “g” and last category of items “h” and “i”, and so on.
Example: 76.8 + 20.9 + 2 × 1.1 ≈ 100.0 (up to rounding)
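The additivity stated in the example line can be re-checked directly from the printed figures. The sketch below only re-adds numbers taken from the table; the variable names are illustrative, and small deviations from 100.0 and 0.7083 are due to rounding in the printed values.

```python
# Total inertia of the full analysis, as printed in the table.
TOTAL_ABS = 0.7083

# (absolute inertia, share in % of the total) for the two splits in the table.
split_a = {"subset_12345": (0.5441, 76.8), "subset_9": (0.1481, 20.9),
           "interaction": (0.0080, 1.1)}
split_b = {"subset_1245": (0.4225, 59.6), "subset_39": (0.2583, 36.5),
           "interaction": (0.0137, 1.9)}

def recombine(split):
    """Add the two subsets plus twice the interaction (it appears in two blocks)."""
    abs_sum = sum(a for name, (a, _) in split.items() if name != "interaction")
    pct_sum = sum(p for name, (_, p) in split.items() if name != "interaction")
    inter_abs, inter_pct = split["interaction"]
    return abs_sum + 2 * inter_abs, pct_sum + 2 * inter_pct

abs_a, pct_a = recombine(split_a)
abs_b, pct_b = recombine(split_b)
print(f"split (1-5 | 9):     {abs_a:.4f} absolute, {pct_a:.1f} %")
print(f"split (1,2,4,5 | 3,9): {abs_b:.4f} absolute, {pct_b:.1f} %")
```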
Understanding of questions, subdivision by political interest (PI): first row, low PI (N = 1,935); second row, high PI (N = 1,441)

Item  PI       SA    AS    NN    DS    SD    NO      χ²   Wording
a)    low    28.1  44.4   3.9  13.5   3.5   6.6    85.2   Generally, those elected to Parliament soon lose touch with the people.
      high   24.6  44.8   3.0  19.5   6.5   1.7
b)    low    30.0  33.0   4.1  22.2   6.5   4.2    69.5   I don't think the (Federal) Government cares much about what people like me think.
      high   22.7  32.8   3.4  26.9  12.3   1.9
c)    low    38.4  34.5   2.8  15.1   6.8   2.3   249.7   Sometimes, (Federal) Politics and Government seem so complicated that a person like me can't really understand what's going on.
      high   20.7  31.4   1.9  24.4  20.3   1.3
d)    low    38.0  28.8   2.8  16.8  10.7   2.9   111.1   People like me don't have any say about what the Government in (Ottawa) does.
      high   27.1  27.8   1.5  24.3  18.5   0.9
e)    low    10.4  12.5   2.2  19.2  53.4   2.4   179.2   So many other people vote in (Federal) elections that it does not matter very much whether I vote or not.
      high    4.3   6.5   1.3  11.8  75.4   0.7
f)    low    11.4  26.0  11.0  23.3  13.4  15.0   118.2   Many people in the (Federal) Government are dishonest.
      high    9.2  23.9   9.0  26.4  24.7   6.8
g)    low    46.4  33.3   4.5   7.7   2.5   5.5    54.2   People in the (Federal) Government waste a lot of the money we pay in taxes.
      high   46.1  33.0   3.0  10.7   5.1   2.0
h)    low     8.8  47.1   7.4  22.4   8.9   5.4    43.9   Most of the time we can trust people in the (Federal) Government to do what is right.
      high   12.6  44.6   4.6  24.8  10.8   2.6
i)    low    14.5  46.8   7.1  19.1   7.6   4.9    51.5   Most of the people running the (Federal) Government are smart people who usually know what they are doing.
      high   17.8  43.7   4.2  23.5   9.0   1.7

One case is missing because one respondent did not answer the political interest items.
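The χ² column can be approximated from the printed percentages. The sketch below assumes that the reported values are Pearson chi-square statistics for the 2 × 6 table of political interest by response category (5 degrees of freedom); the slides do not state this explicitly. Counts are rebuilt from rounded percentages, so the result for item a) should land near the reported 85.2 without matching it exactly.

```python
def chi_square(rows):
    """Pearson chi-square statistic for a table of counts given as a list of rows."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Item a): percentages for SA, AS, NN, DS, SD, NO as printed in the table.
low_pct = [28.1, 44.4, 3.9, 13.5, 3.5, 6.6]    # low political interest, N = 1,935
high_pct = [24.6, 44.8, 3.0, 19.5, 6.5, 1.7]   # high political interest, N = 1,441

# Approximate counts reconstructed from the rounded percentages.
low_counts = [p / 100 * 1935 for p in low_pct]
high_counts = [p / 100 * 1441 for p in high_pct]

chi2_a = chi_square([low_counts, high_counts])
print(f"approximate chi-square for item a): {chi2_a:.1f}")
```

The same function can be applied to the other items by swapping in their two rows of percentages.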