scale construction


  1. Scale construction Michelle Mazurek (some material from Bilge Mutlu)

  2. About scales • Bridging from qualitative to quantitative • Using (typically) ordinal questions – Sometimes nominal (categorical) • Using them in a repeatable way • That is validated! • For construct validity

  3. Thinking about construct validity • How to measure something complicated / hard to define – Risk taking – Privacy concern – Sociability – Etc. • In a validated way!

  4. What do we want to validate? • Items -> latent factors • Reliability: internal consistency, test-retest • Reflects something in the real world

  5. Overall procedure • Generate items and review for wording, match to intended construct, etc. – Expert review; cognitive interview • Refine items – Check for range effects – Do exploratory factor analysis – Get rid of ones that don’t work – Set up subscales – Repeat

  6. Overall procedure (2) • Validate – Against other scales, real-world behavior – That subscales still intra-correlate and load – Test-retest – Different populations? Modes (internet)?

  7. EXPLORATORY FACTOR ANALYSIS

  8. Often, multiple components • Risk perception: different kinds of risk • Privacy -- ideas about collection vs. unauthorized sharing, etc. • … • Subscales!

  9. Observable vs. latent • Observable: answers to items, test scores, other measurements • Latent: underlying factor that correlates with (governs?) multiple measurable components • Factor analysis: reduce large number of observables to smaller number of latent factors – Resultant factors hopefully (mostly) independent

  10. Factor analysis model • $X_1, \dots, X_n$: measured variables • $F_1, \dots, F_m$: latent factors • $b_{11}, \dots, b_{nm}$: factor loadings • $X_1 = b_{11} F_1 + b_{12} F_2 + \dots + b_{1m} F_m + e_1$; $X_2 = b_{21} F_1 + b_{22} F_2 + \dots + b_{2m} F_m + e_2$; …; $X_n = b_{n1} F_1 + b_{n2} F_2 + \dots + b_{nm} F_m + e_n$
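
The model on this slide can be simulated directly, which may help make the notation concrete. The sketch below is only an illustration: the loading values, the sample size, and the choice of independent standard-normal factors and errors are assumptions, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loadings b_ij for n = 4 measured variables on m = 2 latent factors.
B = np.array([
    [0.8, 0.0],
    [0.7, 0.1],
    [0.1, 0.9],
    [0.0, 0.6],
])
n_obs = 50_000

# Latent factors F_1..F_m: assumed independent, mean 0, variance 1.
F = rng.standard_normal((n_obs, B.shape[1]))

# Error terms e_i: independent, mean 0, scaled so that each X_i has variance 1.
uniqueness = 1.0 - (B ** 2).sum(axis=1)
E = rng.standard_normal((n_obs, B.shape[0])) * np.sqrt(uniqueness)

# X_i = b_i1 F_1 + ... + b_im F_m + e_i, computed for every observation at once.
X = F @ B.T + E

# Under these assumptions the correlation matrix is roughly B B^T + diag(uniqueness).
implied = B @ B.T + np.diag(uniqueness)
observed = np.corrcoef(X, rowvar=False)
max_gap = np.abs(observed - implied).max()
print("largest gap between observed and implied correlation:", round(max_gap, 3))
```

Factor analysis runs this logic in reverse: it starts from the observed correlation matrix and tries to recover loadings like `B`.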

  11. Factor analysis model • Loadings: -1 to 1, where 0 = no loading • Like to end up w/ mostly 1s and 0s • All based on correlation / covariance matrices among the measured variables

  12. Assumptions: • Measurement error has constant variance, avg=0 • No association between errors • No association between factor + measurement error • Local/conditional independence: – Measured variables are independent (given the factor) • In practice: everything is standardized – Subtract the mean (center at 0) and divide by the std. dev. (var=1) – Total variance = # of measured variables
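
The standardization step, and the resulting claim that total variance equals the number of measured variables, can be checked in a few lines. This is a minimal sketch on made-up data; the 120 x 9 shape and the 1-10 response range are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up responses: 120 observations of 9 items, each scored 1-10.
raw = rng.integers(1, 11, size=(120, 9)).astype(float)

# Subtract each item's mean and divide by its standard deviation.
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

# Every standardized item now has variance 1, so the variances sum to the item count.
total_variance = Z.var(axis=0, ddof=1).sum()
print(total_variance)
```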

  13. Requires large samples • Rule of thumb: 10 observations per variable in the list (so if 30 item scale, n=300)

  14. Running example • Teaching reviews (from “Real Statistics with Excel” website) • 120 obs. of 9 questions – All on 1-10 Likert scales – E.g. is entertaining, communicates well, has expertise in the subject, passion for teaching, etc.

  15. Overall procedure • Collect + explore data • Extract initial factors; choose how many to retain • Choose and use estimation method • Rotate • Interpret, adjust, repeat

  16. Explore data • Check for range effects • Check for applicability of factor analysis – KMO sampling adequacy (> 0.6) – Bartlett’s sphericity • Null: correlation matrix is the identity matrix (everything is uncorrelated). You want to reject it (p < 0.05). In practice, though, it is almost always rejected.
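
Both checks can be computed from the correlation matrix. The sketch below uses the usual textbook formulas for the overall KMO measure and Bartlett's chi-square statistic; the function names and the simulated data are assumptions for illustration, and the p-value step is left out because it needs a chi-square distribution from a statistics library.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)    # mask selecting off-diagonal entries
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

def bartlett_statistic(R, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's sphericity test."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    stat = -(n_obs - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return stat, df

# Made-up data: 120 respondents, 9 items driven by 2 hypothetical factors.
rng = np.random.default_rng(2)
factors = rng.standard_normal((120, 2))
weights = rng.uniform(0.4, 0.9, size=(2, 9))
responses = factors @ weights + rng.standard_normal((120, 9))
R = np.corrcoef(responses, rowvar=False)

print("KMO:", round(kmo(R), 3))
stat, df = bartlett_statistic(R, n_obs=120)
print("Bartlett chi-square:", round(stat, 1), "df:", df)
```

A large statistic relative to the chi-square distribution with `df` degrees of freedom rejects the identity-matrix null.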

  17. Overall procedure • Collect + explore data • Extract initial factors; choose how many to retain • Choose and use estimation method • Rotate • Interpret, adjust, repeat

  18. How many factors? • Theoretical / predicted answer • Guess and check • Use PCA to find out – Start with factors = # of variables – Decide how many to retain based on results • Too many: some may have zero loadings; not parsimonious • Too few: may have incorrect loadings (worse!)

  19. Using PCA to retain factors • Each factor has an associated eigenvalue; retain based on eigenvalues • All with eigenvalue > 1 (Kaiser) – Factor contributes more than a single measured variable to the total variance (each measured variable has var=1) – This is obviously arbitrary; can retain too many • Scree plot (Cattell): plot, keep left of inflection – Subjective • Min. factors where sum > 70% (80%) of total variance • Others
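
Two of these retention rules are easy to compute from the eigenvalues of the correlation matrix. The sketch below is an illustration only: the function name `retention_summary` and the small block-structured correlation matrix are made up.

```python
import numpy as np

def retention_summary(R, variance_target=0.70):
    """Return eigenvalues (descending), the Kaiser count, and the cumulative-variance count."""
    eigenvalues = np.linalg.eigvalsh(R)[::-1]
    kaiser = int((eigenvalues > 1.0).sum())
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Smallest number of factors whose cumulative share reaches the target.
    by_variance = int(np.searchsorted(cumulative, variance_target) + 1)
    return eigenvalues, kaiser, by_variance

# Hypothetical correlation matrix: items 1-2 correlate at 0.8, items 3-4 at 0.6.
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

eigenvalues, kaiser, by_variance = retention_summary(R)
print("eigenvalues:", np.round(eigenvalues, 2))
print("retain by Kaiser rule:", kaiser)
print("retain to reach 70% of variance:", by_variance)
```

Plotting `eigenvalues` against their rank gives the scree plot; picking the inflection point remains a judgment call.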

  20. Overall procedure • Collect + explore data • Extract initial factors; choose how many to retain • Choose and use estimation method • Rotate • Interpret, adjust, repeat • Confirm: collect new data and fit to model – Evaluate adequacy; compare to other models

  21. Main estimation methods • Maximum likelihood – Max. likelihood of seeing this corr. matrix (more CFA) • Principal axis – Put as many vars as possible on first factor, etc. • Principal components (ish) – Account for max. variance with first factor, etc.
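
Of the three, the principal-components approach is the simplest to sketch: loadings are the leading eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues. The function name `pc_loadings` and the example matrix are assumptions for illustration; maximum likelihood and principal axis factoring need iterative fitting and are not shown.

```python
import numpy as np

def pc_loadings(R, n_factors):
    """Unrotated loadings from the top eigenpairs of a correlation matrix R."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1][:n_factors]   # largest eigenvalues first
    return eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Hypothetical correlation matrix with two clear item pairs.
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

loadings = pc_loadings(R, n_factors=2)
print(np.round(loadings, 3))
# Share of each item's variance that the two retained factors account for.
print(np.round((loadings ** 2).sum(axis=1), 3))
```

The signs of the columns are arbitrary, since an eigenvector and its negation are equally valid.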

  22. Overall procedure • Collect + explore data • Extract initial factors; choose how many to retain • Choose and use estimation method • Rotate • Interpret, adjust items, repeat

  23. Rotating factor loadings • There are infinitely many equally good solutions to the factor loadings (matrix math) • Think of these as rotations – Factors are axes/vectors, variables “load” onto nearby axes, can “rotate” them infinitely • Goal: loadings that are close to either 1 or 0 – Distribute items among factors – Clearly distinguish “on” or “off” – Does not improve fit!

  24. Rotation methods • Orthogonal: factors independent – Varimax: max. squared loading variance across variables • Most common – Quartimax: max. it across factors • Oblique: not independent – Oblimin, promax
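
As a rough illustration of an orthogonal rotation, here is a minimal sketch of the commonly used SVD-based varimax iteration. It is not taken from the slides: the function name, the stopping rule, and the example loadings are assumptions. The final check shows why rotation "does not improve fit": each item's sum of squared loadings is unchanged.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a (variables x factors) loading matrix; returns rotated loadings and the rotation."""
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ rotation
        # Gradient-like target for the varimax criterion.
        target = L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(A.T @ target)
        rotation = u @ vt                     # nearest orthogonal matrix
        current = s.sum()
        if previous != 0.0 and current / previous < 1 + tol:
            break
        previous = current
    return A @ rotation, rotation

# Hypothetical unrotated loadings for 4 items on 2 factors.
unrotated = np.array([
    [0.70, 0.40],
    [0.65, 0.45],
    [0.60, -0.50],
    [0.55, -0.55],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
# Row sums of squared loadings are the same before and after rotation.
print(np.round((unrotated ** 2).sum(axis=1), 3))
print(np.round((rotated ** 2).sum(axis=1), 3))
```

Oblique methods such as oblimin and promax drop the orthogonality constraint and are not covered by this sketch.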

  25. Choosing rotation • Maybe not super important • Orthogonal: simple to interpret – Is independence reasonable for your construct? • Oblique: maybe simpler structure, but interactions are confusing – Loading not interpretable as correlation between variable and factor

  26. Overall procedure • Collect + explore data • Extract initial factors; choose how many to retain • Choose and use estimation method • Rotate • Interpret, adjust items, repeat

  27. Detour: FA vs. clustering • Clustering: Group observations – Find and profile subgroups • FA: Group variables – Data reduction – Latent factors

  28. Detour: FA vs. PCA • Meta-analysis study • CFA: underlying construct – Best for correlations of variables, structure of data • PCA: increased factor loadings – Best for summarizing, reducing variables • (Kim 2008)

  29. Detour: Communality vs. uniqueness • Communality: variance in the measured variable explained by the factors • Uniqueness: variance explained by the e term
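
In the notation of the factor analysis model, and assuming standardized variables and uncorrelated unit-variance factors (an assumption that holds for orthogonal solutions but not oblique ones), the two quantities for measured variable $X_i$ can be written as:

$$
h_i^2 = \sum_{j=1}^{m} b_{ij}^2, \qquad u_i = \operatorname{Var}(e_i) = 1 - h_i^2
$$

Here $h_i^2$ is the communality and $u_i$ the uniqueness; the symbols $h_i^2$ and $u_i$ are conventional labels rather than notation from the slides. For example, an item loading 0.8 on one factor and 0.1 on another has $h_i^2 = 0.64 + 0.01 = 0.65$ and $u_i = 0.35$.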

  30. Choosing items • Drop anything with uniqueness > 0.5 – Not well mapped to factors • Keep things that load > 0.3 (or 0.5) • Avoid cross-loading items – Anything that doesn’t load at least 2x on “main” factor (“Saucier”)
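
These three screening rules can be applied mechanically to a loading matrix. The sketch below assumes an orthogonal solution with at least two factors (so uniqueness is one minus the sum of squared loadings); the function name `screen_items`, its parameter names, and the example loadings are made up.

```python
import numpy as np

def screen_items(loadings, max_uniqueness=0.5, min_loading=0.3, cross_ratio=2.0):
    """Return a keep/drop flag per item plus each item's uniqueness."""
    L = np.abs(np.asarray(loadings, dtype=float))
    uniqueness = 1.0 - (L ** 2).sum(axis=1)
    ordered = np.sort(L, axis=1)[:, ::-1]            # largest loading first in each row
    primary, secondary = ordered[:, 0], ordered[:, 1]
    keep = (
        (uniqueness <= max_uniqueness)               # well mapped to the factors
        & (primary >= min_loading)                   # loads enough on its main factor
        & (primary >= cross_ratio * secondary)       # main loading at least 2x the next one
    )
    return keep, uniqueness

# Hypothetical rotated loadings for 4 items on 2 factors.
loadings = np.array([
    [0.85, 0.10],   # clean item on factor 1
    [0.60, 0.45],   # cross-loads on both factors
    [0.20, 0.25],   # mostly unique variance
    [0.05, 0.80],   # clean item on factor 2
])

keep, uniqueness = screen_items(loadings)
print(keep)
print(np.round(uniqueness, 4))
```

The thresholds are parameters because the slide itself gives alternatives (0.3 or 0.5 for the minimum loading).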

  31. Interpreting a subscale • Is there a coherent explanation for why these particular questions fit together? • Do the subscale items have high reliability? – Cronbach's alpha > 0.6 for each, 0.7 for majority of the subscales (McKinley) – Item-total correlation (Pearson between item and subscale average) > 0.2 (Everitt)
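
Both reliability checks are short computations on a respondents-by-items matrix for one subscale. The sketch below uses the standard formula for Cronbach's alpha and, following the slide, correlates each item with the average of all subscale items; the function names and the six made-up respondents are assumptions.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array."""
    X = np.asarray(items, dtype=float)
    k = X.shape[1]
    item_variances = X.var(axis=0, ddof=1).sum()
    total_variance = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

def item_total_correlations(items):
    """Pearson correlation of each item with the subscale average."""
    X = np.asarray(items, dtype=float)
    average = X.mean(axis=1)
    return np.array([np.corrcoef(X[:, j], average)[0, 1] for j in range(X.shape[1])])

# Made-up responses: 6 respondents answering a 3-item subscale.
responses = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 2, 2],
    [1, 2, 1],
    [4, 4, 5],
])

alpha = cronbach_alpha(responses)
correlations = item_total_correlations(responses)
print("alpha:", round(alpha, 3))
print("item-total correlations:", np.round(correlations, 3))
```

A stricter variant, often called the corrected item-total correlation, leaves the item itself out of the average before correlating.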

  32. Validating the scale • Get a new sample, check validity • Does PCA produce same # of factors? • Do items load as predicted? • Test-retest: same participants, over time • Validate against real-world data: – SEBIS vs. measured security behavior – DOSPERT vs. risk behaviors
