Fast Item Response Theory (IRT) Analysis Using GPUs
Lei Chen (lei.chen@liulishuo.com)
Liulishuo Silicon Valley AI Lab
Outline
• A brief introduction to Item Response Theory (IRT)
• Edward, a new probabilistic programming (PP) toolkit
• An experiment using Edward for IRT model estimation on both CPU and GPU computing platforms
• Summary
A concise introduction to adaptive learning
• What's up with adaptive learning
Adaptive learning is hot in the edTech market
• Increasing demand
  • Districts' spending on adaptive learning products grew threefold between 2013 and 2016, according to a new analysis. (EdWeek Market Brief, 7/14/2017)
• An increasing number of suppliers
Precisely knowing students' ability levels is important
• Adaptive learning needs correct inputs about students' ability levels, which are latent
• Assessments are developed for inferring these latent abilities
• For a yes/no question, the probability that a student provides a correct answer, p(X=1), depends on
  • his/her latent ability (theta)
  • other related factors, e.g., the item's difficulty, making a lucky guess, carelessness, …
Item Response Theory (IRT)
• IRT provides a principled statistical method to quantify these factors and has been widely used to build the modern assessment industry
• A widely used model is the two-parameter logistic (2-PL) model
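The slide's formula does not survive extraction; the standard 2-PL item response function is:

$$P(X_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}}$$

where $\theta_i$ is examinee $i$'s latent ability, $b_j$ is item $j$'s difficulty, and $a_j$ is its discrimination.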
IRT with fewer or more parameters
• 1-PL: only b; all items are assumed to share the same a
• 3-PL: adds c for random guessing
• 4-PL: adds d for inattention
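For reference (reconstructed, not from the slide), the 4-PL response function nests all of these as special cases:

$$P(X = 1 \mid \theta) = c + \frac{d - c}{1 + e^{-a(\theta - b)}}$$

with the 3-PL obtained at $d = 1$, the 2-PL at $c = 0$, $d = 1$, and the 1-PL by additionally fixing $a$ across items.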
IRT's wide usages
• More precise description of item performance
• More precise scoring
• More powerful test assembly
• Supporting advanced linking & equating, which make standardized tests possible
• Supporting adaptive testing by placing examinees and items on the same scale
Concrete examples
• "Item response theory and computerized adaptive testing", a presentation for a hands-on workshop by Rust, Cek, Sun, and Kosinski from the University of Cambridge Psychometrics Center
• Very nice animations explaining IRT, how to use IRT to score, and CAT
Item Response Function
• Binary items: the probability of getting the item right as a function of the measured concept (theta)
• Parameters:
  • Difficulty
  • Discrimination (slope)
  • Guessing
  • Inattention
• Models:
  • 1-Parameter (difficulty)
  • 2-Parameter (+ discrimination)
  • 3-Parameter (+ guessing)
  • 4-Parameter (+ inattention)
  • Unfolding
[Figure: item characteristic curve with difficulty, discrimination (slope), guessing, and inattention annotated]
Scoring
A test, step by step:
1. Start from a normal distribution over theta
2. q1 – Correct
3. q2 – Correct
4. q3 – Incorrect
[Figure: probability curves over theta from -3.0 to 3.0; after each response the curve is updated and its peak marks the most likely score]
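To make the update above concrete, here is a minimal sketch of grid-based Bayesian scoring with 2-PL items; the item parameters and responses are made up for illustration and are not from the talk.

```python
import numpy as np

# Grid over the latent ability (theta), matching the slides' x-axis.
thetas = np.linspace(-3.0, 3.0, 121)

def p_correct(theta, a, b):
    """2-PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Step 1: start from a standard-normal prior over theta.
posterior = np.exp(-0.5 * thetas**2)
posterior /= posterior.sum()

# Hypothetical responses: (a, b, correct?) for q1, q2, q3.
responses = [(1.2, -0.5, True), (1.0, 0.3, True), (1.5, 0.8, False)]

# Steps 2-4: multiply in each item's likelihood and renormalize.
for a, b, correct in responses:
    p = p_correct(thetas, a, b)
    posterior *= p if correct else (1.0 - p)
    posterior /= posterior.sum()

print("Most likely score:", thetas[np.argmax(posterior)])
```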
Computer Adaptive Testing (CAT)
• Standard tests
  • Contain a fixed number of questions
  • Some questions are too simple and some too difficult for a specific test-taker
• CAT
  • Items can be tailored to the test-taker
  • Saves time/money
  • Measures the test-taker's ability more accurately
Example of CAT
Start the test:
1. Ask the first question, e.g. of medium difficulty
2. Correct!
3. Score it (normal prior, update, take the most likely score)
4. Select the next item with a difficulty around the most likely score (or with the maximum information; see the sketch below)
5. And so on, until the stopping rule is reached
[Figure: probability curve over theta; correct/incorrect responses shift and narrow the curve, and the next item's difficulty is placed near the most likely score]
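A minimal sketch of the maximum-information selection rule from step 4, assuming 2-PL items, for which the Fisher information is $a^2 P(\theta)(1 - P(\theta))$; the item pool and current score here are hypothetical.

```python
import numpy as np

def fisher_information(theta, a, b):
    """Fisher information of a 2-PL item at ability theta:
    I(theta) = a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

# Hypothetical item pool: (a, b) pairs.
pool = [(0.8, -1.0), (1.2, 0.0), (1.5, 0.6), (1.0, 1.5)]

theta_hat = 0.4  # current most likely score from the scoring step
info = [fisher_information(theta_hat, a, b) for a, b in pool]
next_item = int(np.argmax(info))
print("Next item:", next_item, "with (a, b) =", pool[next_item])
```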
IRT model estimation
• Most commonly used: Marginal Maximum Likelihood Estimation (MMLE)
  • Find the marginal distribution of the item parameters by integrating over theta
  • Estimate the item parameters by MLE
  • Obtain theta by MLE based on the estimated item parameters
  • For a more efficient estimation, use EM (see the sketch below)
• Other ways
  • Joint Maximum Likelihood (JML)
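As a sketch of what MMLE computes, the following evaluates the marginal log-likelihood of 2-PL item parameters by integrating theta out with Gauss-Hermite quadrature; an EM or gradient routine would then maximize this quantity. The data here are random placeholders, not from the talk.

```python
import numpy as np

# Gauss-Hermite quadrature nodes/weights for the probabilists' weight
# exp(-x^2/2); normalizing the weights turns them into a discrete
# standard-normal prior over theta.
nodes, weights = np.polynomial.hermite_e.hermegauss(21)
weights = weights / weights.sum()

def marginal_loglik(a, b, X):
    """Marginal log-likelihood of 2-PL item parameters a, b (length-J arrays)
    given a binary response matrix X (N examinees x J items), with theta
    integrated out numerically."""
    # P[k, j]: probability of a correct answer to item j at quadrature node k.
    P = 1.0 / (1.0 + np.exp(-a[None, :] * (nodes[:, None] - b[None, :])))
    # loglik[i, k]: log-likelihood of examinee i's response pattern at node k.
    loglik = X @ np.log(P).T + (1.0 - X) @ np.log(1.0 - P).T
    # Quadrature over theta, then sum of logs over examinees.  (A production
    # implementation would use log-sum-exp here to avoid underflow.)
    return np.sum(np.log(np.exp(loglik) @ weights))

# Random placeholder data, just to show the call.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(100, 5)).astype(float)
print(marginal_loglik(np.ones(5), np.zeros(5), X))
```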
Bayesian solution
• Issues with MLE
  • Depends on the distribution of the data
  • Estimation is not accurate when samples are small
  • Hard to handle cases where the ability distribution is not normal
• Bayesian solutions consider priors on theta
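Concretely, instead of a point estimate, the Bayesian approach targets the posterior over abilities:

$$p(\theta \mid X) \propto p(X \mid \theta)\, p(\theta)$$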
MCMC
• Markov chain Monte Carlo (MCMC) is used for Bayesian estimation
• The ultimate goal is to approximate p(parameters|data) by drawing many samples from the posterior distribution
• Hamiltonian Monte Carlo (HMC) is good at dealing with high-dimensional parameter spaces: HMC exploits the geometry of the important regions of the posterior to make better proposals
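A minimal sketch of HMC-based 2-PL estimation in Edward (the toolkit introduced later in this talk), assuming Edward 1.x on TensorFlow 1.x; the priors, sizes, and placeholder data are illustrative choices, not the talk's actual setup.

```python
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Bernoulli, Empirical, Normal

N, J, T = 100, 5, 5000  # examinees, items, number of HMC samples

# 2-PL model with normal priors (illustrative; a lognormal prior on the
# discrimination a would be more typical in practice).
theta = Normal(loc=tf.zeros(N), scale=tf.ones(N))
a = Normal(loc=tf.ones(J), scale=tf.ones(J))
b = Normal(loc=tf.zeros(J), scale=tf.ones(J))
# Broadcasting yields an N x J matrix of logits a_j * (theta_i - b_j).
X = Bernoulli(logits=a * (tf.expand_dims(theta, 1) - b))

# Empirical random variables hold the HMC sample chains.
qtheta = Empirical(params=tf.Variable(tf.zeros([T, N])))
qa = Empirical(params=tf.Variable(tf.ones([T, J])))
qb = Empirical(params=tf.Variable(tf.zeros([T, J])))

X_data = np.random.binomial(1, 0.5, size=(N, J))  # placeholder responses
inference = ed.HMC({theta: qtheta, a: qa, b: qb}, data={X: X_data})
inference.run(step_size=0.01, n_steps=10)
```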
Variational Inference
• Approximate an intractable distribution by choosing a family of distributions and finding the member of that family that minimizes the divergence to the true posterior
• Approximating the posterior with a simpler function leads to faster estimation
• The Kullback–Leibler (KL) divergence is frequently used to measure the closeness of two distributions
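With $\mathrm{KL}(q \,\|\, p) = \mathbb{E}_q[\log q(z) - \log p(z \mid x)]$ as the closeness measure, the same 2-PL model can be fit variationally in Edward via ed.KLqp, which minimizes this divergence by maximizing the ELBO; again a sketch under the same assumptions as the HMC example, not the talk's actual code.

```python
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Bernoulli, Normal

N, J = 100, 5  # examinees, items

# Same 2-PL model as in the HMC sketch.
theta = Normal(loc=tf.zeros(N), scale=tf.ones(N))
a = Normal(loc=tf.ones(J), scale=tf.ones(J))
b = Normal(loc=tf.zeros(J), scale=tf.ones(J))
X = Bernoulli(logits=a * (tf.expand_dims(theta, 1) - b))

def gaussian_q(shape):
    """Fully factorized Gaussian variational family; softplus keeps the
    scale parameters positive."""
    return Normal(loc=tf.Variable(tf.zeros(shape)),
                  scale=tf.nn.softplus(tf.Variable(tf.zeros(shape))))

qtheta, qa, qb = gaussian_q(N), gaussian_q(J), gaussian_q(J)

X_data = np.random.binomial(1, 0.5, size=(N, J))  # placeholder responses
# KLqp minimizes KL(q || p) by stochastic maximization of the ELBO.
inference = ed.KLqp({theta: qtheta, a: qa, b: qb}, data={X: X_data})
inference.run(n_iter=1000)
```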