Preferences in college applications: A non-parametric Bayesian analysis of top-10 rankings

Alnur Ali (1), Thomas Brendan Murphy (2), Marina Meilă (3), Harr Chen (4)
(1) Microsoft, (2) University College Dublin, (3) University of Washington, (4) Massachusetts Institute of Technology

Outline
• Introduction: College Applications, Goals, Dataset
• Model: Data Coding, Generalized Mallows models, Dirichlet process mixture models, Gibbs sampler
• Findings: General properties, Overall trends
• Conclusions

College Applications
• Irish college applicants apply through a central system administered by the College Applications Office (CAO).
• Applicants list up to ten degree courses in order of preference.
• Applicants are awarded points on the basis of their Leaving Certificate results; these determine course entry.

Goals
• It has been postulated that a number of factors influence course choices:
  • Institution & Location
  • Degree subject
  • Degree type (Specific vs. General)
  • Points Requirement
  • Gender
[Figure: points (vertical axis, 300 to 500) against rank (horizontal axis, 1 to 10). Do points requirements influence ranks?]

Dataset
• We study the cohort of applicants to degree courses from the year 2000.
• The applications data has the following properties:
  • There were 55737 applicants;
  • They selected from a list of 533 courses;
  • Applicants selected up to 10 courses.

Data Coding
• The data coding $(s_1, s_2, \ldots, s_t)$ of $\pi \mid \sigma$ is defined by
  $s_j + 1 = \text{rank of } \pi^{-1}(j) \text{ in } \sigma \text{ after removing } \pi^{-1}(1 : j-1)$.
• Example: if $\sigma = [a\ b\ c\ d]$ and $\pi = [c\ a\ b\ d]$, then

  j   π^{-1}(j)   s_j   σ with earlier picks removed
  1   c           2     a b c d
  2   a           0     a b · d
  3   b           0     · b · d
  4   d           0     · · · d

• Kendall's distance is $d_{\text{Kendall}}(\pi, \sigma) = \sum_{j=1}^{t-1} s_j$.
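The data coding can be computed by walking through π and recording where each item sits among the items of σ that have not yet been removed. The following is a minimal sketch of that idea, not the authors' implementation; the function names `data_coding` and `kendall_distance` are hypothetical, and rankings are assumed to be plain Python lists of item labels.

```python
def data_coding(pi, sigma):
    """Return the code (s_1, ..., s_t) of a top-t ranking pi relative to sigma.

    s_j is the 0-based rank of the j-th item of pi among the items of sigma
    that remain after the first j-1 items of pi have been removed.
    """
    remaining = list(sigma)
    code = []
    for item in pi:
        position = remaining.index(item)  # 0-based, so this is rank - 1
        code.append(position)
        remaining.pop(position)
    return code


def kendall_distance(pi, sigma):
    """Sum s_j for j = 1, ..., t-1, following the formula on the slide.

    For a full permutation the last code s_t is always 0, so dropping it
    does not change the sum.
    """
    return sum(data_coding(pi, sigma)[:-1])


sigma = ["a", "b", "c", "d"]
pi = ["c", "a", "b", "d"]
print(data_coding(pi, sigma))       # codes for the slide's example
print(kendall_distance(pi, sigma))  # distance for the slide's example
```

Using `list.index` keeps the sketch short but costs O(n) per stage; for 533 courses and top-10 lists that is unlikely to matter.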

Generalized Mallows models
• The Mallows model assumes that
  $P(\pi \mid \sigma, \theta) = \frac{1}{\psi(\theta)} \exp\left( -\theta \sum_{j=1}^{t-1} s_j(\pi \mid \sigma) \right)$.
• The Mallows model can be extended to allow for varying precision in ranking:
  $P(\pi \mid \sigma, \vec{\theta}) = \frac{1}{\psi(\vec{\theta})} \exp\left( -\sum_{j=1}^{t-1} \theta_j\, s_j(\pi \mid \sigma) \right)$.
• Location parameter $\sigma$, scale parameters $(\theta_1, \ldots, \theta_{\max t - 1})$.
• $\psi(\vec{\theta})$ is a tractable normalization constant.
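As an illustration of why the normalization constant is tractable, the sketch below evaluates the generalized Mallows log-probability from a code vector. It assumes the usual factorized form in which stage j contributes a finite geometric sum over the n − j + 1 possible values of s_j; the slides do not spell this out, so treat the closed form, the function names `log_psi` and `gm_log_prob`, and the requirement θ_j ≥ 0 as assumptions of this example.

```python
import math


def log_psi(theta, n):
    """Log normalizer, assuming s_j takes values 0, ..., n - j independently.

    Stage j (1-based) contributes sum_{k=0}^{n-j} exp(-theta_j * k),
    which is a finite geometric series.
    """
    total = 0.0
    for j, th in enumerate(theta, start=1):
        m = n - j + 1  # number of values s_j can take
        if th == 0:
            total += math.log(m)
        else:
            total += math.log1p(-math.exp(-th * m)) - math.log1p(-math.exp(-th))
    return total


def gm_log_prob(code, theta, n):
    """log P(pi | sigma, theta) given the code s_j(pi | sigma) of the stages used."""
    if len(code) != len(theta):
        raise ValueError("code and theta must have the same length")
    energy = sum(th * s for th, s in zip(theta, code))
    return -energy - log_psi(theta, n)


# Example: n = 4 items, three stages entering the sum, code from pi = [c a b d].
print(gm_log_prob([2, 0, 0], [1.5, 1.0, 0.5], n=4))
```

Setting every θ_j to the same value recovers the single-parameter Mallows model from the first bullet.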

Dirichlet process mixture models
• $\vec{p} \sim \text{Dirichlet}(\alpha/K, \ldots, \alpha/K)$
• $c_i \sim \text{Multinomial}(p_1, \ldots, p_K)$
• $\sigma_c, \vec{\theta}_c \sim G_0 \propto P_0(\sigma, \vec{\theta}; \nu, \vec{r})$
• $\pi_i \sim GM(\pi_i \mid \sigma_c, \vec{\theta}_c)$
• Prior: conjugate to GM, informative w.r.t. $\vec{\theta}$.
• DPMM benefits: no need to specify K upfront, identifies both large and small clusters.
[Figure: plate diagram with nodes $\alpha$, $\vec{p}$, $G_0$, $c_i$, $(\sigma_c, \vec{\theta}_c)$ and $\pi_i$; plates over the K clusters and the N rankings.]
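The generative steps listed on this slide can be simulated for a finite K. The sketch below is illustrative only: it replaces the conjugate prior G_0 with a placeholder (a uniformly random central ranking per cluster and one fixed θ vector shared by all clusters), and the names `sample_gm_ranking` and `sample_finite_mixture` are hypothetical. It assumes `len(theta)` does not exceed the number of items.

```python
import math
import random


def sample_gm_ranking(sigma, theta, rng):
    """Draw a top-t ranking stage by stage, with P(s_j = k) proportional to exp(-theta_j * k)."""
    remaining = list(sigma)
    ranking = []
    for th in theta:
        weights = [math.exp(-th * k) for k in range(len(remaining))]
        k = rng.choices(range(len(remaining)), weights=weights)[0]
        ranking.append(remaining.pop(k))
    return ranking


def sample_finite_mixture(n_rankings, K, alpha, items, theta, rng):
    """Simulate p ~ Dirichlet(alpha/K), c_i ~ Multinomial(p), pi_i ~ GM."""
    # Dirichlet draw via normalized Gamma variates.
    gammas = [rng.gammavariate(alpha / K, 1.0) for _ in range(K)]
    total = sum(gammas)
    p = [g / total for g in gammas]
    # Placeholder for G_0 (an assumption of this sketch, not the slide's prior):
    # a uniformly random central ranking for each cluster.
    centres = [rng.sample(items, len(items)) for _ in range(K)]
    data = []
    for _ in range(n_rankings):
        c = rng.choices(range(K), weights=p)[0]
        data.append((c, sample_gm_ranking(centres[c], theta, rng)))
    return centres, data


rng = random.Random(0)
centres, data = sample_finite_mixture(5, K=4, alpha=2.0,
                                      items=list("abcdef"),
                                      theta=[2.0, 1.5, 1.0], rng=rng)
for cluster, ranking in data:
    print(cluster, ranking)
```

With small α/K most of the Dirichlet mass tends to fall on a few components, which is the finite-K analogue of the model producing both large and small clusters.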

Gibbs sampler
1. Resample cluster assignments:
   1.1 Draw an existing cluster w.p. $\propto \frac{N_c - 1}{N + \alpha - 1}\, GM(\pi \mid \sigma_c, \vec{\theta}_c)$, or use a Beta function approximation.
   1.2 Draw a new cluster w.p. $\propto \frac{\alpha}{N + \alpha - 1} \cdot \frac{(n - t)!}{n!}$.
2. Resample cluster parameters:
   2.1 Draw $\vec{\theta}_c$ by slice sampling or a Beta distribution approximation.
   2.2 Draw $\sigma_c$ "stage-wise" or by a Beta function approximation.
The sampler based on the Beta approximation (Beta-Gibbs) is faster than the slice-based sampler (Slice-Gibbs), both per iteration and in overall time to convergence.
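Step 1 of the sampler amounts to normalizing one weight per existing cluster plus one weight for a new cluster. The sketch below shows that arithmetic only. It assumes `counts[c]` is the number of other rankings currently in cluster c (one reading of the N_c − 1 term), that `likelihoods[c]` already holds GM(π | σ_c, θ_c) for the ranking being reassigned, and that the new-cluster likelihood is (n − t)!/n!; the function name `assignment_probs` is hypothetical.

```python
import math


def assignment_probs(counts, likelihoods, alpha, n, t):
    """Return normalized probabilities for each existing cluster, then a new one.

    counts[c]      -- rankings in cluster c, excluding the one being reassigned
    likelihoods[c] -- GM likelihood of that ranking under cluster c
    n, t           -- number of items and length of the top-t ranking
    """
    N = sum(counts) + 1            # all rankings, including the current one
    denom = N + alpha - 1
    weights = [cnt / denom * lik for cnt, lik in zip(counts, likelihoods)]
    # (n - t)! / n!, computed through log-gamma to stay stable for large n.
    new_lik = math.exp(math.lgamma(n - t + 1) - math.lgamma(n + 1))
    weights.append(alpha / denom * new_lik)
    total = sum(weights)
    return [w / total for w in weights]


probs = assignment_probs(counts=[3, 1], likelihoods=[0.1, 0.05],
                         alpha=1.0, n=4, t=2)
print(probs)  # last entry is the probability of opening a new cluster
```

A cluster index can then be drawn with `random.choices(range(len(probs)), weights=probs)`.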

General properties of the clusterings
• The DPMM found 164 clusters.
• Thirty-three of these clusters had nine or more members.
[Figure: cluster size (log scale) against cluster index, for the 33 largest clusters.]
• The clusters were characterized by a number of features.

  Cluster   Size   Description                 Male (%)   Points Average (SD)
  1         4536   CS & Engineering            77.2       369 (41)
  2         4340   Applied Business            48.5       366 (40)
  3         4077   Arts & Social Science       13.1       384 (42)
  4         3898   Engineering (Ex-Dublin)     85.2       374 (39)
  5         3814   Business (Ex-Dublin)        41.8       394 (32)
  6         3106   Cork Based                  48.9       397 (33)
  ...       ...    ...                         ...        ...
  33        9      Teaching (Home Economics)   0.0        417 (4)

Precision
• The precision parameters ($\theta_j$) were very high for top rankings.
[Figure: heat map of $\theta_j$ values by rank j (vertical axis, 1 to 10) and cluster (horizontal axis).]
• The $\theta_j$ values tended to decrease with j.
• In many cases, the $\theta_j$ values dropped suddenly after a particular point.
• The central ranking $\sigma$ for each cluster is of length 533; the $\theta_j$ values suggested a point to truncate the ranking.

Overall trends
• Subject
  • Subject matter is a key determinant of course choice.
  • The courses chosen are similar in subject area.
  • Some opt for general degrees (e.g. Science) and others opt for specific ones (e.g. Chemical Engineering).
• Gender
  • There is quite a difference in the percentage of male/female applicants in some clusters.
  • Males tend to dominate CS/Engineering clusters.
  • Females tend to dominate social science/education clusters.
• Geography
  • There is evidence of the college location influencing choice.
  • The sixth largest cluster is dominated by courses from colleges in Cork (CIT and UCC).
  • There is evidence of a mix of subject matter and geography having a joint effect; the fourth largest cluster is dominated by engineering courses outside Dublin.

Points
• The points requirements for the courses in the truncated central rankings were not monotonically decreasing in any cluster.
[Figure: heat map of points requirements by rank j in the truncated central ranking (vertical axis) and cluster (horizontal axis).]
• This suggests that points requirements are not important when students are ranking courses.

Conclusions & Lessons Learned
• The CAO system appears to be working more effectively than many suggest.
• The clusters revealed in this analysis tend to be cohesive in subject matter.
• The focus of possible improvements to the CAO system might be directed at how points are scored.
• The Generalized Mallows DPMM facilitated discovering small clusters that were missed in previous analyses.
• The model also allowed for the study of precision in rankings within clusters.

Questions? Thanks!


