The difR package
A toolbox for the identification of dichotomous differential item functioning
David Magis
University of Liège and K.U. Leuven, Belgium
david.magis@ulg.ac.be

Outline
• DIF (in 5 minutes)
• DIF methods (in 2 minutes)
• The difR package (in 5 minutes)
• Application (until Florian’s signal or lunch time)

DIF
• Framework:
  – One test with dichotomous items
  – Two (or more) groups
  – One reference group, one (or more) focal group(s)
  – Question of interest: are the items functioning similarly in all groups?
DIF (2)
• An item is said to function differentially (to be DIF) if examinees from different groups, but with the same ability level, have different probabilities of answering the item correctly
• Goals of DIF research:
  – To develop methods to detect DIF
  – To identify and remove DIF items

DIF (3)
• Four main aspects:
  – IRT vs non-IRT
  – Uniform DIF vs nonuniform DIF
  – Two vs more than two groups
  – Item purification

DIF (4)
• IRT vs non-IRT:
  – The earliest methods are purely statistical (Mantel-Haenszel, logistic regression, SIBTEST…) and do not require fitting IRT models
  – Other methods fit IRT models and compare model fits (LRT) or item parameters (Lord, Raju)

DIF (5)
• Uniform vs nonuniform:
  – DIF effect is uniform if the item-group interaction is independent of the ability level, and nonuniform otherwise
  – Non-IRT methods: uniform DIF means that the conditional association between item response and group membership does not depend on the matching variable (i.e. the sum score)
  – IRT methods:
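A minimal sketch of the uniform vs nonuniform distinction. It assumes a 2PL item response model, which is a choice made here for illustration only, and all parameter values are hypothetical. The group difference is computed on the logit scale at several ability levels: a constant gap corresponds to uniform DIF, a gap that varies with ability to nonuniform DIF.

```r
# Probability of a correct response under a 2PL model
# (a = discrimination, b = difficulty); illustrative assumption
p_correct <- function(theta, a, b) plogis(a * (theta - b))

theta <- seq(-3, 3, by = 0.5)   # grid of ability levels

# Hypothetical item parameters for the same item in each group
p_ref        <- p_correct(theta, a = 1.0, b = 0.0)  # reference group
p_uniform    <- p_correct(theta, a = 1.0, b = 0.5)  # focal: harder, same slope
p_nonuniform <- p_correct(theta, a = 0.5, b = 0.0)  # focal: different slope

# Group difference on the logit scale at each ability level
gap_uniform    <- qlogis(p_ref) - qlogis(p_uniform)
gap_nonuniform <- qlogis(p_ref) - qlogis(p_nonuniform)

round(gap_uniform, 3)     # same value at every ability level
round(gap_nonuniform, 3)  # varies with the ability level
```

In this toy setting the nonuniform gap even changes sign around theta = 0, so the item favours one group at low ability and the other group at high ability.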
DIF (6)
DIF (7)
• Two vs more than two groups:
  – Most methods deal with two groups (reference and focal)
  – Some are specifically designed for simultaneous comparison of more than two groups

DIF (8)
• Item purification:
  – DIF items can affect the validity of the measures of DIF
  – Some known effects:
      • Type I error inflation: non-DIF items are incorrectly flagged as DIF
      • Masking effect: items with a large DIF effect can mask the presence of other DIF items with smaller DIF effects

DIF (9)
  – Proposed solution: item purification
  – Iterative process that successively removes the items flagged as DIF from
      • the computation of sum scores (non-IRT)
      • the rescaling of item parameters (IRT)
  – Process stops when
      • no DIF item is detected, or
      • two successive steps of the process yield the same classification of items
DIF (10)
  – Item purification
      • controls for Type I error inflation
      • usually yields increased power
  – but
      • can be time consuming
      • no guarantee that the iterative process stops

DIF methods

Method    DIF effect    Two groups                       More than two groups
NON-IRT   Uniform       TID, MH, Std, logReg, SIBTEST    GMH, genLogReg, genTID
NON-IRT   Nonuniform    MH*, BD, logReg, SIBTEST*        genLogReg
IRT       Uniform       Lord, Raju, LRT                  genLord
IRT       Nonuniform    Lord, Raju, LRT                  genLord
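The purification loop described on the DIF (9) and DIF (10) slides can be sketched in base R as follows. This is an illustrative sketch, not the difR implementation: the data are simulated under a Rasch model with hypothetical parameters, the helper names `mh_chisq` and `purify_mh` are invented for this example, and the Mantel-Haenszel chi-square is the textbook formula with continuity correction, matching on the sum score.

```r
set.seed(1)

# Simulate hypothetical responses (Rasch model); item 1 has uniform DIF
n_items <- 10; n_per_group <- 500
group <- rep(c(0, 1), each = n_per_group)        # 0 = reference, 1 = focal
theta <- rnorm(2 * n_per_group)                  # abilities
b     <- seq(-1, 1, length.out = n_items)        # item difficulties
shift <- c(1, rep(0, n_items - 1))               # extra difficulty, focal group only
eta   <- outer(theta, b, "-") - outer(group, shift)
X     <- matrix(rbinom(length(eta), 1, plogis(eta)), nrow = nrow(eta))

# Mantel-Haenszel chi-square for one item, stratified by matching score
mh_chisq <- function(y, group, score) {
  num <- 0; v <- 0
  for (s in unique(score)) {
    k  <- score == s
    Tk <- sum(k)
    if (Tk < 2) next                             # stratum too small to contribute
    nR <- sum(k & group == 0); nF <- Tk - nR     # group sizes in the stratum
    m1 <- sum(y[k]);           m0 <- Tk - m1     # correct / incorrect counts
    A  <- sum(y[k & group == 0])                 # correct responses, reference group
    num <- num + (A - nR * m1 / Tk)              # observed minus expected
    v   <- v + nR * nF * m1 * m0 / (Tk^2 * (Tk - 1))
  }
  if (v == 0) return(0)
  (abs(num) - 0.5)^2 / v
}

# Iterative purification: flagged items are dropped from the matching score
purify_mh <- function(X, group, alpha = 0.05, max_iter = 10) {
  thr       <- qchisq(1 - alpha, df = 1)
  flagged   <- rep(FALSE, ncol(X))
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    stat <- sapply(seq_len(ncol(X)), function(j) {
      anchor    <- !flagged
      anchor[j] <- TRUE                          # tested item stays in its own score
      mh_chisq(X[, j], group, rowSums(X[, anchor, drop = FALSE]))
    })
    new_flagged <- stat > thr
    # Stops when nothing is flagged or the classification repeats
    if (identical(new_flagged, flagged)) { converged <- TRUE; break }
    flagged <- new_flagged
  }
  list(flagged = flagged, stat = stat, iterations = iter, converged = converged)
}

res <- purify_mh(X, group)
which(res$flagged)   # indices of items flagged as DIF
res$converged        # FALSE would mean max_iter was reached first
```

The `max_iter` cap reflects the point made above that the iterative process is not guaranteed to stop on its own.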
The difR package
• Jointly developed by
  – Sébastien Béland (UQAM, Canada)
  – Francis Tuerlinckx (K.U. Leuven, Belgium)
  – Paul De Boeck (University of Amsterdam, The Netherlands and K.U. Leuven, Belgium)

The difR package (2)
The difR package (3)
• Three levels of R functions:
  – Low level: working functions that do the computational job
  – Middle level: DIF functions, of the form “dif…”, to call a specific method (e.g. difMH for Mantel-Haenszel)
  – High level: dichoDif function, which calls several middle level functions and merges their output
The difR package (4)
• Generic input parameters:
  – Data : the data matrix
  – group : the vector of group membership
  – focal.name(s) : the name(s) of focal group(s)
  – purify : should item purification be performed? (default is FALSE)
  – save.output : should the output be saved into a text file? (default is FALSE)
  – output : specifies the name and the place to save the output

The difR package (5)
• Specific input parameters:
  – Depend on the method
  – Can specify:
      • the DIF statistic (e.g. Mantel-Haenszel)
      • the type of logistic model (e.g. logistic regression) or IRT model (e.g. Lord, Raju)
      • the DIF classification thresholds (e.g. standardization)
      • the matrix of item parameters (e.g. Lord, Raju)
      • etc.

The difR package (6)
• dichoDif function:
  – Calls one or several DIF methods
  – Either for two groups, or for more than two groups
  – All specific options can be passed to dichoDif
  – Returns a summary of all requested methods
  – For direct comparison of method output

The difR package (7)
• Output:
  – List with all useful information (input and output)
  – Displayed in a visually attractive way through print(.)
  – Can be saved into a text file
  – Can be plotted for visual representation of DIF statistics, through plot(.)
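As a hedged illustration of the generic input parameters and of the list-type output, the sketch below calls the middle level function difMH on the verbal data set that ships with difR. It assumes difR is installed and is skipped otherwise; the object names `items` and `res_mh` are chosen for this example only.

```r
# Runs only if difR is available; otherwise prints a short message
if (requireNamespace("difR", quietly = TRUE)) {
  library(difR)
  data(verbal)

  # Keep the 24 items and the Gender column, drop the unused Anger column
  items <- verbal[, colnames(verbal) != "Anger"]

  res_mh <- difMH(Data = items,
                  group = "Gender",     # column holding group membership
                  focal.name = 1,       # value identifying the focal group
                  purify = TRUE,        # switch on item purification
                  save.output = FALSE)  # do not write a text file

  print(res_mh)      # formatted display of the results
  res_mh$DIFitems    # items flagged as DIF
  res_mh$MH          # one Mantel-Haenszel statistic per item
} else {
  message("Package 'difR' is not installed; skipping the example.")
}
```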
Application
• Data set: verbal aggression example
• 316 students (243 females, 73 males), first year psychology (K.U. Leuven)
• 24 items built by mixing
  – 4 frustrating situations
  – 3 possible aggressive responses
  – 2 possible actions related to aggressive responses

Application (2)
• Frustrating situations:
  – S1: “A bus fails to stop for me”
  – S2: “I miss a train because a clerk gave me faulty information”
  – S3: “The grocery store closes just as I am about to enter”
  – S4: “The operator disconnects me when I had used up my last 10 cents for a call”

Application (3)
• Possible actions:
  – I want to…
  – I do…
• Possible aggressive responses:
  – To shout
  – To curse
  – To scold
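The 4 × 2 × 3 design behind the 24 items can be enumerated with a few lines of base R. The labels below follow a situation + action + response pattern, which is an assumption made for illustration; the exact spelling and ordering of the column names in the data set may differ.

```r
situations <- paste0("S", 1:4)               # 4 frustrating situations
actions    <- c("Want", "Do")                # 2 possible actions
responses  <- c("Curse", "Scold", "Shout")   # 3 possible aggressive responses

# All combinations of situation, action and response
design <- expand.grid(response  = responses,
                      situation = situations,
                      action    = actions,
                      stringsAsFactors = FALSE)

# Illustrative item labels, e.g. situation S1 + "Do" + "Shout"
item_names <- with(design, paste0(situation, action, response))

length(item_names)   # 4 * 2 * 3 = 24 items
head(item_names)
```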
Application (4)
• Examples:
  – S1DoShout: “A bus fails to stop for me. I shout.”
  – S3WantCurse: “The grocery store closes just as I am about to enter. I want to curse.”
  – Etc.
• Data collected by Vansteelandt (2000)
• Available in difR

Application (5)
• “Correct response” if the student responds in an aggressive way, that is, if he/she answers “yes”.
• Research question: do the items “function” similarly for males and females?

Application (6)
• Reference group: female students
• Focal group: male students
• Columns 1-24: items
• Column 25: Anger (not used here)
• Column 26: Gender (group membership)

Application (7)
• Three DIF analyses:
  – Using Mantel-Haenszel
  – Using Lord’s test (and 1PL model)
  – Using dichoDif function and several DIF methods
Application (8)
• And now…

Application (9)
• Reading and preparing the data:

    library(difR)   # load the package
    data(verbal)    # verbal aggression data set shipped with difR
    # Drop the Anger column, which is not used here
    verbal <- verbal[, colnames(verbal) != "Anger"]

Application (10)
• Mantel-Haenszel analysis:
  – Focal group: 1 (males)
  – MH chi-square statistic (default)
  – Significance level: 5% (default)
  – No item purification (default)

    difMH(verbal, group = "Gender", focal.name = 1)

Application (11)
• Output:
Application (12)
Application (13)
Application (14)
Application (15)
• Plotting the output:

    plot(difMH(verbal, group = "Gender", focal.name = 1))
Application (16)
Application (17)
• Other possible options:
  – Significance level: alpha = …
  – No continuity correction: correct = FALSE
  – Log OR DIF statistic: MHstat = "logOR"
  – Item purification: purify = TRUE
  – Number of iterations: nrIter = …
  – …

Application (18)
• Structure of the output (using str(r)):

Application (19)
• Lord’s test:
  – Focal group: 1 (males)
  – 1PL model estimated with the ‘ltm’ package
  – Significance level: 5% (default)
  – No item purification (default)

    r <- difLord(verbal, group = "Gender", focal.name = 1,
                 model = "1PL", engine = "ltm")
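For intuition about what Lord’s test compares under the 1PL model, here is a hand computation for a single item. It is a sketch under stated assumptions: the difficulty estimates and standard errors are hypothetical numbers, not results from the verbal data, and they are assumed to be already on a common metric across the two groups.

```r
# Hypothetical 1PL difficulty estimates for one item (common metric assumed)
b_ref <- 0.20; se_ref <- 0.15   # reference group
b_foc <- 0.80; se_foc <- 0.25   # focal group

# Lord's chi-square with one item parameter: squared difference in
# difficulties divided by the sum of the squared standard errors
lord_stat <- (b_ref - b_foc)^2 / (se_ref^2 + se_foc^2)

alpha     <- 0.05
threshold <- qchisq(1 - alpha, df = 1)                     # 1 df under the 1PL
p_value   <- pchisq(lord_stat, df = 1, lower.tail = FALSE)

lord_stat               # test statistic
lord_stat > threshold   # TRUE means the item would be flagged as DIF
```

With more item parameters (2PL, 3PL) the statistic becomes a quadratic form in the vector of parameter differences, with correspondingly more degrees of freedom.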
Application (20)
Application (21)
• Visualizing the results:

    plot(r)

Application (22)
Application (23)
• Visualizing one item in particular:

    plot(r, plot = "itemCurve", item = 6)

  or

    plot(r, plot = "itemCurve", item = "S2WantShout")
Application (24)
• dichoDif use:
  – Focal group: 1 (males)
  – Methods: Mantel-Haenszel, Standardization, logistic regression, Lord’s test (1PL), Raju’s method (1PL)
  – Significance level: 5% (default)
  – No item purification (default)

    dichoDif(verbal, group = "Gender", focal.name = 1,
             method = c("MH", "Std", "Logistic", "Lord", "Raju"),
             model = "1PL")

Application (25)
• Other possible options:
  – Significance level: alpha = …
  – Item purification: purify = TRUE
  – Number of iterations: nrIter = …
  – Provide the item parameters by yourself: irtParam = …
  – …

Application (26)
Application (27)
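A variation on the dichoDif call, given as a sketch: it restricts the comparison to the three non-IRT methods and switches on item purification, two choices made here for illustration rather than taken from the slides. It assumes difR is installed and is skipped otherwise; `items` and `res_all` are names chosen for this example.

```r
# Runs only if difR is available; otherwise prints a short message
if (requireNamespace("difR", quietly = TRUE)) {
  library(difR)
  data(verbal)
  items <- verbal[, colnames(verbal) != "Anger"]   # drop the unused Anger column

  res_all <- dichoDif(items, group = "Gender", focal.name = 1,
                      method = c("MH", "Std", "Logistic"),
                      purify = TRUE,   # item purification for every method
                      nrIter = 10)     # maximum number of purification iterations

  print(res_all)                # side-by-side summary of the requested methods
  str(res_all, max.level = 1)   # top-level structure of the returned list
} else {
  message("Package 'difR' is not installed; skipping the example.")
}
```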