IRT and beyond

What to do when you want to customise a model but a package doesn’t let you do that?



  1. IRT and beyond: What to do when you want to customise a model but a package doesn’t let you do that?
     Krzysztof Jędrzejewski, Advanced Computing & Data Science Lab
     eRum, 15 May 2018
     (Image by: Lauren Rowling)

  2. Code examples and slides: github.com/kjedrzejewski/eRum2018

  3. IRT (Item Response Theory)
     ● Used in psychometrics to estimate the difficulty of a test question (and a learner’s skill level)
     ● Can also be used in other areas, e.g. to assess ad clickability

  4. 1PL IRT model
     ● 1-parameter logistic (1PL) is the most basic IRT model
     ● Assumption: the probability of answering a test question correctly depends only on the difference between a student’s skill and that question’s difficulty
     ● Observed data:
        ○ which question was answered?
        ○ by which student?
        ○ was the answer correct or incorrect?
     (A sketch of the model and of simulated data in this format follows below.)
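
     For concreteness, a minimal base-R sketch of the standard 1PL (Rasch) formulation,
     P(correct) = 1 / (1 + exp(-(skill - difficulty))), simulating data in the long format the
     slide describes. The object names (d, skill, difficulty, n_students, n_questions) are
     illustrative, not taken from the talk's repository, and are reused by the sketches on the
     following slides; the sizes match the small benchmark sample later in the deck.

     # 1PL / Rasch model:
     #   P(correct | student j, question i) = 1 / (1 + exp(-(skill_j - difficulty_i)))
     set.seed(1)
     n_students  <- 1000
     n_questions <- 100

     skill      <- rnorm(n_students)    # latent student abilities
     difficulty <- rnorm(n_questions)   # latent question difficulties

     # every student answers every question => long format: one row per answer
     d <- expand.grid(student  = seq_len(n_students),
                      question = seq_len(n_questions))
     p <- plogis(skill[d$student] - difficulty[d$question])
     d$correct <- rbinom(nrow(d), size = 1, prob = p)

     head(d)   # columns: student, question, correct (0/1)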

  5. Many ways to estimate parameter values with R
     ● Using a dedicated IRT package, e.g. TAM
     ● As random effects in a logistic regression model, e.g. with lme4
     ● Using gradient descent, e.g. with TensorFlow
     ● Using probabilistic programming, e.g. with stan or greta

  6. Using a dedicated IRT package
     + We just need to convert the data to the expected format and call a function
     + Usually the fastest way to estimate model parameters
     - Such packages almost always support only the most popular models
     - Doesn’t let us estimate the parameters of a custom model
     Example packages: TAM, eRm, mirt
     Example code: github.com/kjedrzejewski/eRum2018/blob/master/1pl_irt.R
     See also: http://www.edmeasurementsurveys.com/TAM/Tutorials/
     (A sketch of this workflow follows below.)
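
     A minimal sketch of this route with TAM, assuming the simulated data frame d from the 1PL
     sketch above; TAM expects a wide 0/1 response matrix with one row per student and one column
     per question. This illustrates the general workflow rather than reproducing 1pl_irt.R; the
     accessors used here (mod$xsi, tam.wle) follow TAM's documented interface.

     library(TAM)

     # reshape the long data into a wide student x question matrix of 0/1 responses
     resp <- matrix(NA_integer_, nrow = n_students, ncol = n_questions)
     resp[cbind(d$student, d$question)] <- d$correct

     # fit a Rasch (1PL) model by marginal maximum likelihood
     mod <- TAM::tam.mml(resp)

     mod$xsi            # estimated question difficulties
     TAM::tam.wle(mod)  # estimated student abilities (weighted likelihood estimates)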

  7. Using logistic regression with random effects
     Question difficulties and skill levels are random effects related to questions and students
     + Allows us to add additional variables and parameters to the model
     - The model needs to remain a linear combination of observed variables
     Example packages: lme4
     Example code: github.com/kjedrzejewski/eRum2018/blob/master/1pl_me.R
     See also: https://github.com/lme4/lme4
     (A sketch of this formulation follows below.)
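
     A minimal sketch of the random-effects formulation with lme4, again assuming the simulated
     data frame d: crossed random intercepts for students and questions play the roles of skill
     and (negative) difficulty. This sketches the idea rather than reproducing 1pl_me.R.

     library(lme4)

     # grouping variables as factors for the crossed random effects
     d$student_f  <- factor(d$student)
     d$question_f <- factor(d$question)

     # 1PL as a logistic mixed model with crossed random intercepts
     fit <- glmer(correct ~ (1 | student_f) + (1 | question_f),
                  data = d, family = binomial)

     head(ranef(fit)$student_f)    # estimated skills
     head(-ranef(fit)$question_f)  # estimated difficulties (sign flipped)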

  8. Using gradient descent (e.g. with TensorFlow)
     Maximum likelihood estimation of model parameters using cross-entropy and gradient-descent-based optimisers
     + Allows us to have non-linear components in the model
     + Can use GPU to speed up computations
     - We need to write a lot of code to describe the dependencies between data and model parameters, and to set up the optimisation process
     - We need to create our own stop condition
     Example packages: tensorflow
     Example code: github.com/kjedrzejewski/eRum2018/blob/master/1pl_tf.R
     See also: https://tensorflow.rstudio.com
     (A sketch of the underlying idea follows below.)
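
     To keep the sketch self-contained (and to avoid guessing the 2018-era tensorflow R API), here
     is the same idea, minimising the cross-entropy of the 1PL model by plain gradient descent,
     written in base R with hand-derived gradients. The repository's 1pl_tf.R expresses this
     computation with TensorFlow operations instead, which is what allows it to run on a GPU. The
     learning rate and iteration count are illustrative.

     # cross-entropy (negative log-likelihood) of the 1PL model,
     # minimised by plain gradient descent
     skill_hat <- rep(0, n_students)
     diff_hat  <- rep(0, n_questions)
     lr <- 0.001   # learning rate (illustrative)

     for (step in 1:2000) {
       p     <- plogis(skill_hat[d$student] - diff_hat[d$question])
       resid <- p - d$correct                      # d(cross-entropy)/d(logit)

       grad_skill <-  tapply(resid, d$student,  sum)
       grad_diff  <- -tapply(resid, d$question, sum)

       skill_hat <- skill_hat - lr * grad_skill
       diff_hat  <- diff_hat  - lr * grad_diff
     }

     head(diff_hat)   # estimated question difficulties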

  9. Using probabilistic programming (with stan)
     + Provides credible intervals for the estimated model parameters, which gives us information about the precision of our estimates
     - The model needs to be expressed in the Stan language
     - Sampling is time-consuming, especially for big datasets
     Example packages: rstan
     Example code: github.com/kjedrzejewski/eRum2018/blob/master/1pl_stan.R
     See also: http://mc-stan.org/users/interfaces/rstan
     (A sketch of the model in Stan follows below.)
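
     A minimal sketch of how the 1PL model can be written in the Stan language and fitted from R
     with rstan, assuming the simulated data d from above (and the array syntax current around
     2018). This is an illustration, not the exact model from 1pl_stan.R; with 100,000 observations
     the sampling is slow, as the benchmarks below show, and the chain/iteration settings here are
     illustrative.

     library(rstan)

     stan_code <- "
     data {
       int<lower=1> N;
       int<lower=1> S;
       int<lower=1> Q;
       int<lower=1,upper=S> student[N];
       int<lower=1,upper=Q> question[N];
       int<lower=0,upper=1> correct[N];
     }
     parameters {
       vector[S] skill;
       vector[Q] difficulty;
     }
     model {
       skill      ~ normal(0, 1);
       difficulty ~ normal(0, 1);
       correct    ~ bernoulli_logit(skill[student] - difficulty[question]);
     }
     "

     fit <- stan(model_code = stan_code,
                 data = list(N = nrow(d), S = n_students, Q = n_questions,
                             student = d$student, question = d$question,
                             correct = d$correct),
                 chains = 2, iter = 1000)

     print(fit, pars = "difficulty")   # posterior means and credible intervals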

  10. Using probabilistic programming (with greta)
     + Also gives us information on the precision of our estimates (like stan)
     + We define the model using native R syntax (unlike stan)
     + It’s built on top of TensorFlow, so it can leverage GPU for computation
     - Sampling is still time-consuming
     Example packages: greta
     Example code: github.com/kjedrzejewski/eRum2018/blob/master/1pl_greta.R
     See also: https://greta-dev.github.io/greta/
     (A sketch of the model in greta follows below.)
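
     A minimal sketch of the same model written in greta's native R syntax, again assuming the
     simulated data d. The calls (as_data, normal, ilogit, bernoulli, distribution, model, mcmc)
     follow greta's documented interface, but this is not the exact code from 1pl_greta.R and the
     sampler settings are illustrative.

     library(greta)

     correct <- as_data(d$correct)

     # priors on the latent parameters
     skill      <- normal(0, 1, dim = n_students)
     difficulty <- normal(0, 1, dim = n_questions)

     # likelihood: logistic of the skill-difficulty gap
     p <- ilogit(skill[d$student] - difficulty[d$question])
     distribution(correct) <- bernoulli(p)

     m     <- model(skill, difficulty)
     draws <- mcmc(m, n_samples = 1000, warmup = 1000)

     summary(draws)   # posterior means and credible intervals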

  11. Benchmark, 1PL, small sample
     100 questions, 1000 people => 100 000 observations
     MacBook Pro (CPU-only calculations):
        TAM          0.9 s
        lme4         24.3 s
        tensorflow   4.5 min.
        greta        18.9 min.
        stan         32.2 min.
     Test code: github.com/kjedrzejewski/eRum2018/blob/master/tests_1pl.R

  12. Benchmark, 1PL, small sample
     100 questions, 1000 people => 100 000 observations
                     MacBook Pro (CPU-only)   AWS p3.2xlarge (nVidia Tesla V100)   GPU speed-up
        TAM          0.9 s
        lme4         24.3 s
        tensorflow   4.5 min.                 1.7 min.                             ~2.65x
        greta        18.9 min.                11.9 min.                            ~1.59x
        stan         32.2 min.
     Test code: github.com/kjedrzejewski/eRum2018/blob/master/tests_1pl.R

  13. Benchmark, 1PL, large sample
     500 questions, 5000 people => 2 500 000 observations
                     MacBook Pro (CPU-only)   AWS p3.2xlarge (nVidia Tesla V100)   GPU speed-up
        TAM          47.3 s
        lme4         30.2 min.
        tensorflow   42.4 min.                3.8 min.                             ~11.16x
        greta        5.8 h                    39.4 min.                            ~8.83x
        stan         too long :(
     Test code: github.com/kjedrzejewski/eRum2018/blob/master/tests_1pl.R

  14. Takeaways
     ● TensorFlow may be used for tasks other than deep learning
     ● A GPU may be used to speed up parameter estimation for a large group of models
     ● For large samples, it may be faster to estimate the parameters of a linear model using TensorFlow with a GPU than using specialized regression libraries
     ● The speed-up offered by a GPU increases with data size

  15. ioki.pl/category/data-science/


