Automating Programming Assessments: Things I Learned Porting 15-150 to Autolab
Iliano Cervesato

Thanks!
• Jorge Sacchini
• Bill Maynes
• Ian Voysey
• Generations of 15-150, 15-210 and 15-212 teaching assistants

Outline
• Autolab
• The challenges of 15-150
• Automating Autolab
• Test generation
• Lessons learned and other thoughts

Autolab
• Tool to automate assessing programming assignments
  • Student submits solution
  • Autolab runs it against reference solution
  • Student gets immediate feedback
    » Learns from mistakes while on task
• Used in 80+ editions of 30+ courses
• Customizable

The promises of Autolab
• Enhance learning
  • By pointing out errors while students are on task
  • Not when the assignment is returned
    » Students are busy with other things
    » They don't have time to care
• Streamline the work of course staff … maybe
  • Solid solution must be in place from day 1
  • Enables automated grading
    » Controversial

How Autolab works, typically
[Diagram. Labels: virtual machine, student solution (submission), compiler, test cases, reference solution, autograding script, comparison (=), outcome.]

The Challenges of 15-150

15-150
Use the mathematical structure of a problem to program its solution
• Core CS course
• Programming and theory assignments
• Qatar
  • 20-30 students
  • 0-2 TAs
• Pittsburgh (x 2)
  • 150-200 students
  • 18-30 TAs

Autolab in 15-150q
• Used as
  • Submission site
  • Immediate feedback for coding components
  • Cheating monitor via MOSS integration
• Each student has 5 to 10 submissions
  • Used 50.1% in Fall 2014
• Grade is not determined by Autolab
  • All code is read and commented on by staff

The Challenges of 15-150  15-150 relies on Standard ML (common to 15-210, 15-312, 15-317, …)  Used as an interpreted language » no I/O  Strongly typed » No “eval”  Strict module system » Abstract types  11, very diverse, programming assignments  Grader for hw- (x+1) very different from hw- x 9

Autograding SML code
• Traditional model does not work well
  • Requires students to write unnatural code
  • Needs complex parsing and other infrastructure
    » But SML interpreter already comes with a parser for SML
• Instead, make everything happen within SML
  • Running test cases
  • Establishing outcome
  • Dealing with errors
Student and reference code become modules
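To make "everything happens within SML" concrete, here is a minimal sketch of a harness that runs test cases, establishes an outcome, and deals with errors, all inside the language. It is an illustration under assumptions, not the course's tester: `outcome`, `runOne`, `compareWithRef`, `score`, and the toy `refDouble`/`stuDouble` functions are hypothetical names invented for this example.

```sml
(* Hypothetical sketch; none of these names come from the course's code. *)
datatype outcome = Passed | Failed of string

(* Run the student function on one input and compare its result with the
   reference. Exceptions raised while running or comparing the student's
   result are caught and reported; the reference is assumed not to raise. *)
fun runOne inToString outToString eq stuF refF x =
  let
    val expected = refF x
  in
    let
      val actual = stuF x
    in
      if eq (actual, expected) then Passed
      else Failed ("input " ^ inToString x ^ ": expected " ^
                   outToString expected ^ ", got " ^ outToString actual)
    end
    handle e => Failed ("input " ^ inToString x ^ ": raised " ^ exnName e)
  end

(* Run every test input and collect the outcomes. *)
fun compareWithRef inToString outToString eq stuF refF inputs =
  map (runOne inToString outToString eq stuF refF) inputs

(* Summarize as (number passed, number of tests). *)
fun score outcomes =
  (length (List.filter (fn Passed => true | Failed _ => false) outcomes),
   length outcomes)

(* Usage: a toy "student" function that raises an exception on 0. *)
fun refDouble n = 2 * n
fun stuDouble 0 = raise Div
  | stuDouble n = n + n

val results =
  compareWithRef Int.toString Int.toString (op =) stuDouble refDouble [0, 1, 2]
val (passed, total) = score results
```

Passing the printers and the equality test as arguments keeps the harness polymorphic, which matters in a strongly typed language with no "eval". This sketch does not guard against student code that fails to terminate.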

Running Autolab with SML
[Diagram. Labels: virtual machine, SML interpreter, student solution (submission), test cases, autograder, reference solution, comparison (=), outcome.]
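One way to picture student and reference code living side by side as modules inside the interpreter is sketched below. It is an assumption-laden toy, not the 15-150 setup: the `COUNTER` signature, the `Ref` and `Stu` structures, and the `Grade` functor are all hypothetical. It also shows why abstract types are a challenge: the grader can observe results only through the operations the signature exposes.

```sml
(* Hypothetical assignment interface; the names are illustrative only. *)
signature COUNTER =
sig
  type t                      (* abstract: the grader cannot inspect it *)
  val zero : t
  val incr : t -> t
  val value : t -> int
end

(* Stand-in for the reference solution. *)
structure Ref :> COUNTER =
struct
  type t = int
  val zero = 0
  fun incr n = n + 1
  fun value n = n
end

(* Stand-in for a student submission that uses a different representation. *)
structure Stu :> COUNTER =
struct
  type t = unit list
  val zero = []
  fun incr l = () :: l
  fun value l = length l
end

(* The grader is a functor over both modules. S.t and R.t are unrelated
   types, so results are compared only after projecting them to int. *)
functor Grade (structure S : COUNTER structure R : COUNTER) =
struct
  (* Apply f to x exactly n times. *)
  fun iterate f 0 x = x
    | iterate f n x = iterate f (n - 1) (f x)

  fun agreeAfter n =
    S.value (iterate S.incr n S.zero) = R.value (iterate R.incr n R.zero)

  val allAgree = List.all agreeAfter [0, 1, 5, 10]
end

structure G = Grade (structure S = Stu structure R = Ref)
```

Here `G.allAgree` records whether the two modules agree on the chosen inputs. Swapping in a different submission means rebinding `Stu` and re-applying the functor.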

Making it work is non-trivial
• Done for 15-210
  • But 15-150 has much more assignment diversity
• No documentation
  • Initiation rite of TAs by older TAs
    » Cannot work on the Qatar campus!
• Demanding on the course staff
  • TA-run
  • Divergent code bases
Too important to be left to rotating TAs

What's in a typical autograder?
• A working autograder took 3 days to write
• Tedious, thankless job
• Proceed by trial and error
• Lots of repetitive parts
• Cognitively complex
• Each assignment brings new challenges
• Time taken away from helping students
• Discourages developing new assignments

Files (simplified): grader.cm, handin.cm, handin.sml, autosol.cm, autosol.sml, HomeworkTester.sml, xyz-test.sml, aux/, allowed.sml, xyz.sig, sources.cm, support.cm

HomeworkTester.sml (Fall 2013)

    (* Stu (student code), Hw04, Hw04Tests and OurTester are defined in
       other files of the autograder. *)
    structure HomeworkTester =
    struct
      exception FatalError of string

      structure Stu = StuHw04Code
      structure Our = Hw04Tests (Hw04 (Stu))

      (* Printers used to report inputs and outputs *)
      fun bool_toString true = "true"
        | bool_toString false = "false"

      fun pair_toString fst_ts snd_ts (x,y) =
        "(" ^ (fst_ts x) ^ ", " ^ (snd_ts y) ^ ")"

      fun triple_toString fst_ts snd_ts trd_ts (x,y,z) =
        "(" ^ (fst_ts x) ^ ", " ^ (snd_ts y) ^ ", " ^ (trd_ts z) ^ ")"

      fun list_toString toString l =
        let fun lts [] = ""
              | lts [x] = toString x
              | lts (x::l) = toString x ^ ",\n " ^ lts l
        in "[" ^ lts l ^ "]" end

      fun compareReal (x: real, y: real): bool = Real.abs (x-y) < 0.0001

      (* Test inputs for each function *)
      val studTests_traverseS = Our.treeSList1
      val studTests_canonical = Our.treeSList1
      val studTests_simplify = Our.treeSList1
      val studTests_simplify_safe = studTests_simplify
      val studTests_traverseC = Our.treeCList1
      val studTests_convertCan = Our.treeSList3
      val studTests_convertCan_safe = studTests_convertCan
      val studTests_convertSloppy = Our.treeSList1
      val studTests_convert = Our.treeCList1
      val studTests_convert_safe = studTests_convert
      val studTests_splitN = Our.treeIntList1
      val studTests_leftmost = Our.treeList3
      val studTests_halves = Our.treeList3
      val studTests_rebalance = Our.treeList1

      (* One test per function: input printer, output printer, equality,
         student function, reference function, test inputs *)
      fun test_traverseS () = OurTester.testFromRef
        (Our.treeS_toString) (list_toString Char.toString)
        (op =)
        (Stu.traverseS) (Our.traverseS)
        (studTests_traverseS)

      fun test_canonical () = OurTester.testFromRef
        (Our.treeS_toString) (bool_toString)
        (op =)
        (Stu.canonical) (Our.canonical)
        (studTests_canonical)

      fun test_simplify () = OurTester.testFromRef
        (Our.treeS_toString) (Our.treeS_toString)
        (op =)
        (Stu.simplify) (Our.simplify)
        (studTests_simplify)

      fun test_simplify_safe () = OurTester.testFromRef
        (Our.treeS_toString) (Our.treeS_toString)
        (op =)
        (Stu.simplify_safe) (Our.simplify_safe)
        (studTests_simplify_safe)

      fun test_traverseC () = OurTester.testFromRef
        (Our.treeC_toString) (list_toString Char.toString)
        (op =)
        (Stu.traverseC) (Our.traverseC)
        (studTests_traverseC)

      fun test_convertCan () = OurTester.testFromRef
        (Our.treeS_toString) (Our.treeC_toString)
        (op =)
        (Stu.convertCan) (Our.convertCan)
        (studTests_convertCan)

      fun test_convertCan_safe () = OurTester.testFromRef
        (Our.treeS_toString) (Our.treeC_toString)
        (op =)
        (Stu.convertCan_safe) (Our.convertCan_safe)
        (studTests_convertCan_safe)

      fun test_convertSloppy () = OurTester.testFromRef
        (Our.treeS_toString) (Our.treeC_toString)
        (op =)
        (Stu.convertSloppy) (Our.convertSloppy)
        (studTests_convertSloppy)

      fun test_convert () = OurTester.testFromRef
        (Our.treeC_toString) (Our.tree_toString)
        (Our.tree_eq)
        (Stu.convert) (Our.convert)
        (studTests_convert)

      fun test_convert_safe () = OurTester.testFromRef
        (Our.treeC_toString) (Our.tree_toString)
        (Our.tree_eq)
        (Stu.convert_safe) (Our.convert_safe)
        (studTests_convert_safe)

      fun test_splitN () = OurTester.testFromRef
        (pair_toString Our.tree_toString Int.toString)
        (pair_toString Our.tree_toString Our.tree_toString)
        (op =)
        (Stu.splitN) (Our.splitN)
        (studTests_splitN)

      fun test_leftmost () = OurTester.testFromRef
        (Our.tree_toString)
        (pair_toString Char.toString Our.tree_toString)
        (op =)
        (Stu.leftmost) (Our.leftmost)
        (studTests_leftmost)

      fun test_halves () = OurTester.testFromRef
        (Our.tree_toString)
        (triple_toString Our.tree_toString Char.toString Our.tree_toString)
        (op =)
        (Stu.halves) (Our.halves)
        (studTests_halves)

      fun test_rebalance () = OurTester.testFromRef
        (Our.tree_toString) (Our.tree_toString)
        (op =)
        (Stu.rebalance) (Our.rebalance)
        (studTests_rebalance)
    end

Autograder development cycle
[Diagram: a cycle with the stages Exhaustion, Gratification, Frustration and Dread.]
Work of course staff hardly streamlined

Automating Autolab for 15-150

However …
• Most files can be generated automatically from function types
• Some files stay the same
• Others are trivial
  • Given a working solution

Files (simplified): grader.cm, handin.cm, handin.sml, autosol.cm, autosol.sml, HomeworkTester.sml, xyz-test.sml, aux/, allowed.sml, xyz.sig, sources.cm, support.cm
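As a rough illustration of generating files "from function types", the sketch below turns a small description of a function's domain and codomain into the text of a test function in the style of HomeworkTester.sml. It is a hypothetical miniature, not the actual generator: the `ty` datatype, `toStringExp`, and `genTest` are invented here, and it assumes every result can be compared with `op =`, which does not hold for reals or for types that need a custom equality such as `Our.tree_eq`.

```sml
(* Hypothetical miniature of type-directed generation. *)
datatype ty = TInt | TBool | TChar
            | TNamed of string        (* assignment-specific type, e.g. tree *)
            | TPair of ty * ty
            | TList of ty

(* Text of an SML expression that prints values of the given type. *)
fun toStringExp TInt = "Int.toString"
  | toStringExp TBool = "Bool.toString"
  | toStringExp TChar = "Char.toString"
  | toStringExp (TNamed s) = "Our." ^ s ^ "_toString"
  | toStringExp (TPair (a, b)) =
      "(pair_toString " ^ toStringExp a ^ " " ^ toStringExp b ^ ")"
  | toStringExp (TList a) = "(list_toString " ^ toStringExp a ^ ")"

(* Text of a test function for f : dom -> cod. *)
fun genTest (f, dom, cod) =
  String.concatWith "\n"
    [ "fun test_" ^ f ^ " () = OurTester.testFromRef",
      "  " ^ toStringExp dom ^ " " ^ toStringExp cod,
      "  (op =)",
      "  (Stu." ^ f ^ ") (Our." ^ f ^ ")",
      "  (studTests_" ^ f ^ ")" ]

(* Usage: traverseS maps a treeS to a char list. *)
val generated = genTest ("traverseS", TNamed "treeS", TList TChar)
```

Printing `generated` gives a test function with the same shape as the hand-written ones, which is the repetitive part the slide suggests can be automated.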

Significant opportunity for automation
• Summer 2013:
  • Hired a TA to deconstruct 15-210 infrastructure
• Fall 2013:
  • Ran 15-150 with Autolab
  • Early automation
• Fall 2014:
  • Full automation of large fragment
  • Documentation
• Summer 2015:
  • Further automation
  • Automated test generation
Fall 2015 was loaded on Autolab by first day of class
