
Ioannis Caragiannis University of Patras Joint work with George - PowerPoint PPT Presentation



  1. Ioannis Caragiannis, University of Patras. Joint work with George Krimpas and Alexandros Voudouris.

  2. MOOCs are massive open online courses. Massive: available to a large number of people (16-18 million students). Online: through the internet/web. Open: no cost for the students. Courses: series of lectures on a subject.

  3. www.edx.org, www.coursera.org, www.udacity.com: more than 100 employees each. Business model: verified certificates, head-hunting (connecting students to industry), specializations, corporate collaborations.

  4. 400+ universities; 2400+ courses; 22 of the top-25 US universities; 3000+ instructors; TAs, video assistants; 13 languages (80% English, 8.5% Spanish, French, Chinese); subjects: humanities, computer science, business & management.

  5. Daphne Koller, Andrew Ng (Coursera founders): “… courses in the humanities and social sciences - in which the material is more open to interpretation - have proven more complicated to translate into an online format, especially when it came to the assessment and grading of the students.”

  6. What? Assessment should result in quantitative information: successfully completed her class, achieved a 9/10 (A+), ranked in the top 1% of her class of 100,000, etc. Why? This information goes in the verified certificate and is important for employers (new revenue source). Who? Experts (graders, TAs) are costly. A common solution: automatic grading (multiple-choice questions).

  7. Automatic grading is highly unsatisfactory when evaluating the students' ability to prove a mathematical statement, express their critical thinking on an issue, or demonstrate their creative writing skills. In these cases, assessment and grading is a human computation task. Alternative solution: peer grading, i.e., outsourcing the grading task to the students.

  8. How does it work? Each student grades some of the other students' assignments (as part of her own assignment). Allowing the students to grade using cardinal scores is risky: they are not experienced in assessing their peers' performance in absolute terms, and they have strong incentives to assign low scores. Solution: ordinal peer grading.

  9. Cardinal peer grading: Piech, Huang, Chen, Do, Ng, & Koller (2013); Kulkarni, Wei, Le, Chia, Papadopoulos, Cheng, Koller, & Klemmer (2013); Walsh (2014); de Alfaro & Shavlovsky (2014); www.crowdgrader.org. Ordinal peer grading: Raman & Joachims (2014); Shah, Bradley, Parekh, Wainwright, & Ramachandran (2014).

  10. Setting: n students (exam papers). Distributing the exam papers: each student gets k << n exam papers to grade, so that each exam paper is given to k students. Grading: each student ranks the exam papers assigned to her. Rank aggregation: compute a global ranking from the partial rankings. Goal: to come up with a global ranking that is “as correct as possible”.

  11. Similarities with rank aggregation in voting: on input a profile of rankings, compute a final full ranking. Differences: each student is simultaneously an alternative and a voter; voters do not have to rank all alternatives; the alternatives to be ranked by each voter are decided externally.

  12. (n, k)-bundle graph: a k-regular bipartite graph G = (U, V, E) with |U| = |V| = n. U: exam papers (randomly assigned to nodes). V: graders. An edge (u, v) with u in U and v in V indicates that exam paper u will be given to student v. Warning: nodes corresponding to a grader and her own exam paper should not be connected.
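
As a minimal sketch of one possible bundle graph (this particular construction is an assumption for illustration, not taken from the slides), grader v can be given the k papers that follow her own in cyclic order. The function names `cyclic_bundle_graph` and `is_valid_bundle_graph` are hypothetical, and paper v is assumed to be grader v's own paper.

```python
def cyclic_bundle_graph(n, k):
    """Return assignment[v] = list of papers handed to grader v.

    Assumption: paper v belongs to grader v. Grader v receives papers
    v+1, ..., v+k (mod n), so she never receives her own paper.
    """
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    return [[(v + s) % n for s in range(1, k + 1)] for v in range(n)]


def is_valid_bundle_graph(assignment, k):
    """Check k-regularity on both sides and the no-own-paper rule."""
    n = len(assignment)
    load = [0] * n  # number of graders each paper is given to
    for v, papers in enumerate(assignment):
        if len(set(papers)) != k or v in papers:
            return False
        for u in papers:
            load[u] += 1
    return all(count == k for count in load)


assignment = cyclic_bundle_graph(7, 3)
print(assignment[6])                         # papers given to grader 6
print(is_valid_bundle_graph(assignment, 3))
```

Note that this cyclic graph contains 4-cycles once k is at least 3 (neighbouring graders share several papers), so it is only an example of the "general" case, not of a 4-cycle-free bundle graph.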

  13. The students participate in the exam and submit their papers. Scenario I: the instructor announces indicative solutions and grading instructions, and the students use this information when grading. Scenario II: the instructor gives no information, and the students' grading performance is similar to their performance in the exam.

  14. Basic assumption: there is a ground truth ranking of the exam papers. Perfect grading: each grader ranks the k exam papers she gets consistently with the ground truth.

  15. Quality measure: the number of pairs of exam papers that compare in the global ranking as in the ground truth, i.e., the total number of pairs minus the Kendall tau distance. (Bad) example: a random permutation correctly recovers 50% of the pairwise relations on average.
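
The quality measure can be sketched as follows, normalised to a fraction of pairs. The function name `fraction_correct_pairs` is hypothetical, and both rankings are assumed to be lists of the same paper identifiers, best first.

```python
from itertools import combinations


def fraction_correct_pairs(global_ranking, ground_truth):
    """Fraction of paper pairs ordered the same way in both rankings.

    Equals 1 - (Kendall tau distance) / (total number of pairs).
    """
    pos_g = {paper: i for i, paper in enumerate(global_ranking)}
    pos_t = {paper: i for i, paper in enumerate(ground_truth)}
    pairs = list(combinations(ground_truth, 2))
    correct = sum(
        (pos_g[a] < pos_g[b]) == (pos_t[a] < pos_t[b]) for a, b in pairs
    )
    return correct / len(pairs)


truth = [0, 1, 2, 3]
print(fraction_correct_pairs([0, 2, 1, 3], truth))  # one swapped pair out of six
```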

  16. Find the minimum-degree (n, k)-bundle graph that guarantees that the whole ground truth is always recovered if perfect grading is used: $k = \Theta(n^{1/2})$. [Figure: bipartite bundle graph with graders 1-7 and exam papers 1-7.]

  17. Find the minimum-degree (n, k)-bundle graph that guarantees that the whole ground truth is always recovered if perfect grading is used. Equivalently: find a minimum-degree diameter-3 bipartite graph; $k = \Theta(n^{1/2})$. Miller and Siran (2013). [Figure: bipartite bundle graph with graders 1-7 and exam papers 1-7.]
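
A counting argument (added here as a sketch; it does not appear on the slides) suggests why k cannot be smaller than order $n^{1/2}$. Two papers that are adjacent in the ground truth cannot be ordered by transitivity through other papers, so some grader must receive both. Since any pair may turn out to be adjacent, every pair of papers must share a grader, which in the bundle graph means any two paper nodes are at distance 2. Each of the n graders compares $\binom{k}{2}$ pairs, hence

\[
n \binom{k}{2} \;\ge\; \binom{n}{2}
\quad\Longleftrightarrow\quad
k(k-1) \;\ge\; n-1
\quad\Longrightarrow\quad
k = \Omega\!\left(n^{1/2}\right).
\]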

  18. Use much simpler bundle graphs, e.g., any k-regular bipartite graph for small values of k, even one obtained by putting together copies of $K_{k,k}$, or a k-regular bipartite graph not containing a 4-cycle. Aggregation rules: plurality, approval; Borda; random serial dictatorship (RSD); Markov-chain-based aggregation rules.

  19. Borda: each grader gives k-i+1 points to the exam paper she ranks i-th (k points to her top paper, 1 point to her last). The global ranking is obtained by sorting the exam papers in order of non-increasing total points (Borda score). Ties are broken randomly.
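
A minimal sketch of Borda on partial rankings follows, assuming the convention that a grader's top paper gets k points and her last paper gets 1 point. The function name `borda_aggregate` and the optional `rng` parameter are hypothetical.

```python
import random
from collections import defaultdict


def borda_aggregate(partial_rankings, rng=None):
    """Aggregate partial rankings (each a list of papers, best first).

    Returns (global_ranking, scores). Ties are broken randomly by
    shuffling before a stable sort on the Borda score.
    """
    rng = rng or random.Random()
    scores = defaultdict(int)
    for ranking in partial_rankings:
        k = len(ranking)
        for i, paper in enumerate(ranking, start=1):
            scores[paper] += k - i + 1  # k points for 1st, 1 point for k-th
    papers = list(scores)
    rng.shuffle(papers)                      # random tie-breaking
    papers.sort(key=lambda p: -scores[p])    # stable sort keeps shuffled order in ties
    return papers, dict(scores)


# Four papers with ground truth 0 > 1 > 2 > 3, k = 2, perfect grading.
rankings = [[1, 2], [2, 3], [0, 3], [0, 1]]
ranking, scores = borda_aggregate(rankings, random.Random(1))
print(scores)   # papers 1 and 2 tie, so their relative order is random
```

Even with perfect grading, papers 1 and 2 end up with equal scores in this tiny instance, which illustrates why only a fraction of the pairwise relations is guaranteed to be recovered.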

  20. Theorem: When Borda is applied on partial rankings that are consistent with the ground truth, the expected fraction of correctly recovered pairwise relations is at least $1 - O(1/k)$ when the bundle graph is 4-cycle-free and at least $1 - O(1/k^{1/2})$ in general.
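
The statement can be probed empirically with a small simulation. This is only a sanity-check sketch under stated assumptions: the bundle graph is a simple cyclic one (grader v gets papers v+1, ..., v+k mod n, which is not 4-cycle-free), grading is perfect, Borda gives k-i+1 points to the paper ranked i-th, and the name `simulate_borda` is hypothetical.

```python
import random
from itertools import combinations


def simulate_borda(n, k, seed=0):
    """Fraction of pairwise relations recovered by Borda under perfect grading."""
    rng = random.Random(seed)
    # Random ground truth: true_pos[u] = position of paper u (0 = best).
    order = list(range(n))
    rng.shuffle(order)
    true_pos = {paper: pos for pos, paper in enumerate(order)}

    scores = [0] * n
    for v in range(n):
        bundle = [(v + s) % n for s in range(1, k + 1)]  # never contains v
        bundle.sort(key=lambda u: true_pos[u])           # perfect grading
        for i, u in enumerate(bundle, start=1):
            scores[u] += k - i + 1                       # Borda points

    papers = list(range(n))
    rng.shuffle(papers)                                  # random tie-breaking
    papers.sort(key=lambda u: -scores[u])
    est_pos = {u: i for i, u in enumerate(papers)}

    correct = sum(
        (est_pos[a] < est_pos[b]) == (true_pos[a] < true_pos[b])
        for a, b in combinations(range(n), 2)
    )
    return correct / (n * (n - 1) // 2)


for k in (2, 4, 8, 16):
    print(k, round(simulate_borda(200, k), 3))
```

Running the loop for increasing k gives a rough feel for how the recovered fraction grows with the bundle size; it does not replace the analysis behind the theorem.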

  21. Students have qualities in [1/2, 1]: the ability to compare two exam papers correctly (the probability of finding the correct outcome). Qualities define the ground truth ranking $\sigma^*$. Grading follows a Mallows noise model for generating random rankings: a grader of quality p ranks each pair among the k exam papers she gets as in $\sigma^*$ with probability p and incorrectly with probability 1 - p; if the pairwise outcomes do not define a ranking, she repeats. C., Procaccia, & Shah (2013).
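
The noisy grader described above can be sketched with rejection sampling. The function name `noisy_ranking` and its parameters are hypothetical; the check that the pairwise outcomes "define a ranking" uses the fact that a set of pairwise outcomes is transitive exactly when the win counts are 0, 1, ..., k-1.

```python
import random
from itertools import combinations


def noisy_ranking(bundle, true_pos, p, rng):
    """Sample a ranking of `bundle` from a grader of quality p.

    true_pos[paper] is the paper's position in the ground truth (0 = best).
    Each pair is ordered as in the ground truth with probability p; the
    whole draw is repeated until the outcomes form a ranking. For p close
    to 1/2 and large k this may take many repetitions.
    """
    papers = sorted(bundle, key=lambda x: true_pos[x])
    k = len(papers)
    while True:
        wins = {x: 0 for x in papers}
        for a, b in combinations(papers, 2):   # a precedes b in the ground truth
            winner = a if rng.random() < p else b
            wins[winner] += 1
        if sorted(wins.values()) == list(range(k)):    # transitive outcome
            return sorted(papers, key=lambda x: -wins[x])


true_pos = {paper: paper for paper in range(10)}   # ground truth 0 > 1 > ... > 9
rng = random.Random(0)
print(noisy_ranking([7, 2, 5, 3], true_pos, 1.0, rng))   # perfect grader
print(noisy_ranking([7, 2, 5, 3], true_pos, 0.7, rng))   # noisy grader
```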

  22.  Comparison of Borda and RSD in 500 executions ( n = 1000, k = 8)

  23. Theory: Is a $1 - O(1/k^2)$ fraction (or better) possible? Upper bounds? Analysis for noisy grading? Impact of incentives? Practice: Which is the most realistic noise model for grading? How do the methods considered perform in practice (with real students)?

