Applying Signal Detection Theory to Multi-Level Modeling: When “Accuracy” Isn't Always Accurate
December 5th, 2011
Scott Fraundorf & Jason Finley
Outline
• Why do we need SDT?
• Sensitivity vs. Response Bias
• Example SDT Analyses
• Terminology & Theory
• Logit & probit
• Extensions
Categorical Decisions
• Lots of paradigms involve asking participants to make a categorical decision
– Could be explicit or implicit
– Here, we're focusing on cases where there are just two categories … but this can generalize to >2
Some Categorical Decisions
• Choosing to include optional words like “that”: “The coach knew (that) you missed practice.”
• Comprehension questions: “Did Anna dress the baby?” (D) Yes (K) No
• Assigning a meaning to a novel word like “bouba”
• Interpreting an ambiguous sentence: “The cop saw the spy with the binoculars.”
• Baby looking at one screen or another
• Choosing referent in perspective-taking task
Some Categorical Decisions
• Recognition memory (item memory): VIKING – did you see this word or face in the study list? (1) Seen (4) New
• Source memory: VIKING – was this word said by a male or female talker? (1) Male talker (4) Female talker
• Detecting whether or not a faint signal is present
• Did something change between the two displays?
What is Signal Detection Theory?
• For experiments with categorical judgments
– Part method for analyzing judgments
– Part theory about how people make judgments
• Originally developed for psychophysics
– Operators trying to detect radar signals amidst noise
• Purpose:
– Better metric properties than ANOVA on proportions (logistic regression has already taken care of this for us)
– Distinguish sensitivity from response bias
Early recognition memory experiments: Study a list of words, then see the same list again. Circle the ones you remember.
Study: POTATO, SLEEP, RACCOON, WITCH, NAPKIN, BINDER
Test: SLEEP, POTATO, BINDER, WITCH, RACCOON, NAPKIN
Problem: People might realize that all of these are words they studied! They could circle them all even if they don't really remember them.
Later experiments: Add foils or lures that aren't words you studied.
Study: POTATO, SLEEP, RACCOON, WITCH, NAPKIN, BINDER
Test: POTATO, HEDGE, WITCH, BINDER, SHELL, SLEEP, MONKEY, OATH
Here we see someone circled half of the studied words … but they circled half of the lures, too. No real ability to tell apart new & old items. They're just circling 50% of everything.
What we want is a measure that examines correct endorsements relative to incorrect endorsements … and is not influenced by an overall bias to circle things
Study: POTATO, SLEEP, RACCOON, WITCH, NAPKIN, BINDER
Test: POTATO, HEDGE, WITCH, BINDER, SHELL, SLEEP, MONKEY, OATH
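As a minimal sketch of such a measure, the Python below compares the rate of circling studied words with the rate of circling lures. The study and test lists come from the slide. Which words were circled is a hypothetical choice consistent with "half of the studied words, half of the lures", and the name `endorsement_rates` is illustrative. The simple difference between the two rates is used only for illustration and is not necessarily the statistic used later in the talk.

```python
# Study and test lists are taken from the slide.
studied = {"POTATO", "SLEEP", "RACCOON", "WITCH", "NAPKIN", "BINDER"}
test_list = ["POTATO", "HEDGE", "WITCH", "BINDER",
             "SHELL", "SLEEP", "MONKEY", "OATH"]

# ASSUMPTION: the slide does not say which words were circled; this set is a
# hypothetical pattern with half of the studied words and half of the lures.
circled = {"POTATO", "WITCH", "HEDGE", "OATH"}


def endorsement_rates(test_items, studied_items, endorsed):
    """Return (rate of circling studied words, rate of circling lures)."""
    old_items = [w for w in test_items if w in studied_items]
    lures = [w for w in test_items if w not in studied_items]
    hit_rate = sum(w in endorsed for w in old_items) / len(old_items)
    false_alarm_rate = sum(w in endorsed for w in lures) / len(lures)
    return hit_rate, false_alarm_rate


hit_rate, false_alarm_rate = endorsement_rates(test_list, studied, circled)

# Overall tendency to circle anything, regardless of item type.
overall_rate = len(circled) / len(test_list)

# Correct endorsements relative to incorrect endorsements.
difference = hit_rate - false_alarm_rate

print("circled studied words:", hit_rate)
print("circled lures:", false_alarm_rate)
print("circled overall:", overall_rate)
print("studied minus lures:", difference)
```

With this pattern the two rates are equal, so the difference is zero even though half of the studied words were "correctly" circled.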
Sensitivity vs. Response Bias
• Response bias: “C is the most common answer in multiple choice exams”
• Sensitivity (or discrimination): Knowing which answers are C and which aren't
Sensitivity vs. Response Bias
Imagine asking L2 learners of English to judge grammaticality...
Group A                    SAID “GRAMMATICAL”    ACCURACY
Grammatical condition      70%                   70%
Ungrammatical condition    70%                   30%
It appears that our participants are more accurate at accepting grammatical items than rejecting ungrammatical ones...
Sensitivity vs. Response Bias
Group A                    SAID “GRAMMATICAL”    ACCURACY
Grammatical condition      70%                   70%
Ungrammatical condition    70%                   30%
But, really, they are just judging all sentences as “grammatical” 70% of the time: a response bias
No evidence they're showing sensitivity to grammaticality here
“Accuracy” can confound these 2 influences
Sensitivity vs. Response Bias
Now imagine we have speakers of two different first languages...
Group A                    SAID “GRAMMATICAL”    ACCURACY
Grammatical condition      70%                   70%
Ungrammatical condition    70%                   30%
Group B                                          ACCURACY
Grammatical condition                            60%
Ungrammatical condition                          40%
Sensitivity vs. Response Bias
It looks like Group B is better at rejecting ungrammatical sentences...
But the groups just have different biases
Group A                    SAID “GRAMMATICAL”    ACCURACY
Grammatical condition      70%                   70%
Ungrammatical condition    70%                   30%
Group B                    SAID “GRAMMATICAL”    ACCURACY
Grammatical condition      60%                   60%
Ungrammatical condition    60%                   40%
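A rough sketch of why these two groups differ only in bias, using the percentages from the tables above. It assumes a log-odds (logit) scale: the difference in log-odds of saying "grammatical" between conditions stands in for sensitivity, and the average log-odds stands in for bias. The function names and this particular parameterization are illustrative choices, not something the slides specify at this point.

```python
import math


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))


# Proportion of "grammatical" responses in each condition (from the slides).
said_grammatical = {
    "Group A": {"grammatical": 0.70, "ungrammatical": 0.70},
    "Group B": {"grammatical": 0.60, "ungrammatical": 0.60},
}


def summarize(rates):
    """Return per-condition accuracy plus illustrative sensitivity and bias."""
    g = rates["grammatical"]
    u = rates["ungrammatical"]
    return {
        # Saying "grammatical" is correct only for grammatical items.
        "accuracy_grammatical": g,
        "accuracy_ungrammatical": 1 - u,
        # ASSUMED index: does item type change the response at all?
        "sensitivity": logit(g) - logit(u),
        # ASSUMED index: overall tendency to say "grammatical".
        "bias": (logit(g) + logit(u)) / 2,
    }


summaries = {group: summarize(r) for group, r in said_grammatical.items()}

for group, s in summaries.items():
    print(group, {name: round(value, 3) for name, value in s.items()})
```

Both groups come out with zero sensitivity. Only the bias term differs, which is what produces the apparent accuracy difference on ungrammatical items.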
Sensitivity vs. Response Bias
This would be particularly misleading if we only looked at ungrammatical items
No way to distinguish response bias vs. sensitivity in that case!
Ungrammatical condition    SAID “GRAMMATICAL”    ACCURACY
Group A                    70%                   30%
Group B                    60%                   40%
Sensitivity vs. Response Bias
• We see participants can give the “right” answer without really knowing it
• Comparisons to “chance” attempt to deal with this
• But “chance” = 50% assumes both responses are equally likely
– Probably not true for, e.g., attachment ambiguities
– People have an overall bias to answer questions with “yes”
Sensitivity vs. Response Bias
• Common to balance frequency of intended responses
– e.g. 50% true statements, 50% false
• But bias may still exist for other reasons
– Prior frequency: e.g. low attachments are more common in English than high attachments … might create a bias even if they're equally common in your experiment
– Motivational factors (e.g., one error “less bad” than another): better to suggest a healthy patient undergo additional screening for a disease than to miss someone with the disease
Fraundorf, Watson, & Benjamin (2010)
Hear recorded discourse:
“Both the British and the French biologists had been searching Malaysia and Indonesia for the endangered monkeys. Finally, the British spotted one of the monkeys in Malaysia and planted a radio tag on it.”
– Presentational or contrastive pitch accent?
Then, later, get true/false memory test
The British scientists spotted the endangered monkey and tagged it.
(D) TRUE (K) FALSE
The French scientists spotted the endangered monkey and tagged it.
(D) TRUE (K) FALSE
N.B. Actual experiment had multiple types of false probes … an important part of the actual experiment, but not needed for this demonstration
SDT & Multi-Level Models
Traditional logistic regression model:
Accuracy of Response = Probe Type x Pitch Accent
– Accuracy of Response: CORRECT MEMORY or INCORRECT MEMORY
Accuracy confounds sensitivity and response bias
– A manipulation might just make you say “true” to everything more often
SDT & Multi-Level Models
Traditional logistic regression model:
Accuracy of Response = Probe Type x Pitch Accent
– Accuracy of Response: CORRECT MEMORY or INCORRECT MEMORY
SDT model:
Response Made = Probe Type x Pitch Accent
– Response Made: JUDGED TRUE vs. JUDGED FALSE, or JUDGED GRAMMATICAL vs. JUDGED UNGRAMMATICAL
The SDT model involves changing the way your DV is parameterized.
Respond correctly or respond incorrectly?
vs.
True statement or false statement?
They are deciding whether to say “this is true” or “this is false” … not whether to respond accurately or respond inaccurately
This better reflects the actual judgment we are asking participants to make.
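In practice, re-parameterizing the DV can be a one-line recoding. The sketch below assumes trial data stored as (probe is true, response was correct) pairs. The trial values and the name `response_made` are hypothetical, chosen only to show the mapping from accuracy to the response actually made.

```python
def response_made(probe_is_true, was_correct):
    """Return True if the participant said TRUE, False if they said FALSE.

    A correct answer to a true probe, or an incorrect answer to a false
    probe, both mean the participant said TRUE.
    """
    return probe_is_true == was_correct


# HYPOTHETICAL trials: (probe_is_true, was_correct)
trials = [
    (True, True),    # true probe, answered correctly    -> said TRUE
    (True, False),   # true probe, answered incorrectly  -> said FALSE
    (False, True),   # false probe, answered correctly   -> said FALSE
    (False, False),  # false probe, answered incorrectly -> said TRUE
]

said_true = [response_made(is_true, correct) for is_true, correct in trials]
print(said_true)
```

The recoded column then replaces accuracy as the dependent variable, with probe type kept as a predictor.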
SDT & Multi-Level Models
SDT model, w/ centered predictors...
Said “TRUE” =
  Intercept: Baseline rate of responding TRUE. (Response bias)
  + Actually is TRUE: Does item being true make you more likely to say TRUE? (Sensitivity)
At this point, we haven't looked at any differences between conditions (e.g. contrastive vs. presentational accent, or L1 vs. L2). We are just analyzing overall performance.
SDT & Multi-Level Models
SDT model, w/ centered predictors...
Said “TRUE” =
  Intercept: Baseline rate of responding TRUE. (Response bias)
  + Actually is TRUE: Does item being true make you more likely to say TRUE? (Sensitivity)
  + Contrastive Accent: Does contrastive accent change overall rate of saying TRUE? (Effect on bias)
  + Accent x TRUE: Does accent especially increase TRUE responses to true items? (Effect on sensitivity)
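To make the four terms concrete, here is a minimal sketch that recovers them from cell proportions. It assumes predictors coded +0.5 / -0.5 (one way of centering in a balanced 2 x 2 design) and a saturated logistic model, in which each coefficient is a simple contrast of the cell log-odds. The proportions are hypothetical, not results from Fraundorf, Watson, & Benjamin (2010), and the sketch ignores the random effects for subjects and items that a multi-level model would include.

```python
import math


def logit(p):
    """Log-odds of a proportion."""
    return math.log(p / (1 - p))


# HYPOTHETICAL proportions of "TRUE" responses in each cell of the design.
p_said_true = {
    ("true_probe", "contrastive"): 0.80,
    ("false_probe", "contrastive"): 0.30,
    ("true_probe", "presentational"): 0.70,
    ("false_probe", "presentational"): 0.40,
}

tc = logit(p_said_true[("true_probe", "contrastive")])
fc = logit(p_said_true[("false_probe", "contrastive")])
tp = logit(p_said_true[("true_probe", "presentational")])
fp = logit(p_said_true[("false_probe", "presentational")])

# Intercept: baseline log-odds of saying TRUE (response bias).
intercept = (tc + fc + tp + fp) / 4
# Actually is TRUE: true probes vs. false probes (sensitivity).
b_true = (tc + tp) / 2 - (fc + fp) / 2
# Contrastive accent: overall shift in saying TRUE (effect on bias).
b_accent = (tc + fc) / 2 - (tp + fp) / 2
# Accent x TRUE: change in the true-vs-false gap (effect on sensitivity).
b_interaction = (tc - fc) - (tp - fp)


def predicted_logit(is_true, accent):
    """Model log-odds of saying TRUE; both predictors coded +0.5 or -0.5."""
    return (intercept + b_true * is_true + b_accent * accent
            + b_interaction * is_true * accent)


print("response bias (intercept):", round(intercept, 3))
print("sensitivity:", round(b_true, 3))
print("accent effect on bias:", round(b_accent, 3))
print("accent effect on sensitivity:", round(b_interaction, 3))
```

With these made-up numbers the accent term is near zero while the interaction is clearly positive: a change in sensitivity with little change in bias.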
SDT & Multi-Level Models
SDT model:
Said “TRUE” =
  Intercept: Baseline rate of responding TRUE. (Response bias)
  + Actually is TRUE *: Does item being true make you more likely to say TRUE? (Sensitivity)
  + Contrastive Accent: Does contrastive accent change overall rate of saying TRUE? (Effect on bias)
  + Accent x TRUE *: Does accent especially increase TRUE responses to true items? (Effect on sensitivity)
Contrastive accent improves actual sensitivity. No effect on response bias.
SDT & Multi-Level Models
SDT model:
Said “TRUE” =
  Intercept (Response bias)
  + Actually is TRUE * (Sensitivity)
  + Contrastive Accent (Effect on bias)
  + Accent x TRUE * (Effect on sensitivity)
General heuristic:
– Effects that don't interact with item type = effects on bias
– Effects that do involve item type = effects on sensitivity