Jumbo ML: Smooth Sailing to Module Mastery
Norman Ramsey, Tufts University



On July 31, 2014, I talked about Jumbo ML with a very vigorous audience of researchers and teachers from Harvard and Northeastern. Many interesting things were said, and my notes are both distributed and collected at the end, in italics.

Problem, Part I: Teaching programming

A computer scientist should be able to prove theorems and write programs. Most introductory instruction focuses on programming. A great strength of this instruction is that students actually build programs. But building requires materials: a technology for teaching programming. And too often, we ask students to use the same technology that industrial engineers use. Unfortunately, when beginning students use an industrial-strength programming language, the difficulties of mastering the language divert students from intended learning outcomes. Beginning students should be provided with a “teaching language” tailored to their needs. Using a suitable teaching language, the essential principles that are taught in an introductory course should be clearly and easily made manifest.

At present there is a thriving ecosystem of languages designed for teaching absolute beginners, usually in middle school or high school. One well-known example is Scratch. At the university level, I am aware only of How to Design Programs and its three teaching languages: Beginning Student Language, Intermediate Student Language, and Advanced Student Language. These languages ship with tools, a textbook, and a design method, and the result is very effective. They are a great way to get students started with deep ideas about programming. But they can only carry you so far: they are missing much of what we’d like to teach in the second course.

Problem, Part II: The second course

To talk about the first and second courses in computing, ACM curricula use the words “CS1” and “CS2.” Instructors generally agree that CS2 means some sort of course in basic data structures, but they may differ on the details. I, too, view CS2 as a data-structure course, but I don’t view data structures as foundational. I believe that data structures follow from two more fundamental concerns: abstractions and cost models. And the most critical abstraction is the module abstraction. The fundamental ideas are laid out nicely in Butler Lampson’s Hints for Computer System Design: programs are composed of interfaces and implementations, interfaces define abstractions, and client code uses the abstraction. A good abstraction provides not just a clean interface but also a perspicuous and helpful cost model.

What do all these ideas have to do with data structures? A data structure follows from a choice of abstraction and a cost model. For example, if you want a bag abstraction with fast access to the smallest element, you want a heap. Similarly, if you want an ordered-list abstraction with fast insertion anywhere but with removal only of the first element, you also want a heap! My slogan is

    Abstraction + Cost Model = Data Structure

With this organization in mind, here are my goals for CS2:

• Students will build programs from modules, will understand how modules are connected, and will have an idea how to solve large problems by connecting modules.
• Students will be able to estimate how much time and space a program needs for its execution. Moreover, students will be able to manage time and space costs by shifting them to the most appropriate part of a system.
• Students will become comfortable with some data structures that are widely used in many modules. These data structures are a common currency of late 20th-century industrial computing culture, and they are very popular with phone screeners and job interviewers, as well as programmers.

The role of Jumbo ML is to support these learning goals while remaining as simple and as easy to learn as possible.

Language needs for CS2

What kind of language we want depends on what we want students to learn. The key learning goals that affect my own language choices are programming with abstraction, reasoning about costs, and understanding the decomposition of programs into modules.

Students will be able to build substantial programs only if they can use abstraction. The most fundamental abstraction is procedural abstraction: students need to be able to call a procedure (function, subroutine) knowing only its specification, not its implementation. I call this specification a contract; other writers use purpose statement or precondition and postcondition. If contracts are important, then we should lean toward pure functional languages: contracts for pure code are much simpler than contracts for impure code. (Try writing the specification for a mutable stack. Then try an immutable stack.)

If we want students to be able to reason about costs, then the programming language needs a perspicuous cost model. If this were the only criterion, I would teach the second course in C, which enjoys a cost model of unparalleled perspicuity. Among popular functional languages, Scheme probably has the simplest cost model. Haskell’s cost model makes grown persons weep.

Finally, if students need to understand how programs are decomposed into modules, they should be able to look at interfaces. And I want interfaces to be separately compiled. A person could limp along with C’s .h files, but I want to rule out the lines taken by C++, Clu, Haskell, Oberon, Racket, and a bunch of others, where all the compiler understands is an implementation, and certain parts of the implementation are called public, exported, or provided. If we want students to learn information hiding, they need to look at code that hides information.

My colleagues and I are not aware of any available language that meets all our criteria. Among the languages that are available, the best choice seems to be the dead research language Standard ML:

• It militates toward pure code but also supports impure code and mutable abstractions.
• I wouldn’t call the cost model perspicuous, but at least it’s discoverable.
• It provides first-class, separately compiled interfaces.
• While significantly simpler than its living relatives Haskell and OCaml, it is nowhere near as simple as real teaching languages. Fortunately, having taught using Standard ML, I know some of the pitfalls.

What simplicity looks like: *SL

The teaching languages developed by the Racket team for How to Design Programs are called Beginning Student Language, Intermediate Student Language, and Advanced Student Language. Individually they can be referred to as BSL, ISL, and ASL; collectively they are referred to as *SL. The quality of the language design bowls me over. Here is a summary:

• Data is either atomic, or it is defined by parts (product type) or defined by choices (sum type). ISL adds arrow types.
• There are just two syntactic categories: expression (term) and definition.
• An expression is a variable, a literal, a function application, or McCarthy’s cond. (There are also short-circuit and and or forms.) There is no let-binding, and functions have no local variables.
• A definition introduces a function, a variable, or a structure. A structure definition introduces a type predicate, a constructor function, and one selector function per field.
• A final “definition” form is check-expect, which is actually a unit test. There are also variations check-within and check-error.

In BSL, functions are second-class: they are not values. ISL makes functions first class, and it adds lambda and local forms. The local form, like the top-level definition forms, seems to enjoy a combination of let-star and letrec semantics. ASL adds mutation and sequencing. I haven’t studied it.

Irreducible complexity: types and modules

If I want separately compiled modules with static type checking, I’m going to need a lot more syntactic forms.

What are we working with?    Values                           Modules

What are we describing?
  Data                       Types                            Module types
                             type and datatype definitions    module type definitions
  Computation                Values                           Modules
                             def and redef definitions        module expressions, definitions

I’ll need these syntactic forms:

• Terms (expressions), to compute values
• Types, to classify terms
• Definitions, to associate names with terms, types, modules, and module types
• Declarations, to summarize definitions
• Modules, to collect definitions
• Module types, to collect declarations (and classify modules)
• Compilation units, to be the unit of compilation

In addition, to manage the construction of systems, I’ll need these concepts:

• Components, to group compilation units (think CM or MLB)
• Programs, to be run

I’ll talk loosely about a number of languages:

• The type language
• The term language
• The module language
• The component language
• An interactive language

Finally, I too plan on “language levels.”

• Basic Jumbo ML is the simplest possible language, for beginners.
• Full Jumbo ML is all the shiny objects.

We’ll see if I can avoid intermediate layers.
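
As an illustration of the slogan Abstraction + Cost Model = Data Structure, here is a minimal sketch in OCaml, used only as a stand-in because no Jumbo ML syntax appears in this text. The names MIN_BAG, Leftist_heap, and drain are hypothetical, and the leftist heap is just one implementation that happens to meet the stated costs. The module type plays the role of the interface, with the cost model written next to each operation, and the client code sees nothing else.

```ocaml
(* Interface: a bag of integers with fast access to the smallest element.
   The cost model is part of the interface (hypothetical names throughout). *)
module type MIN_BAG = sig
  type t                                  (* abstract: representation hidden *)
  val empty : t
  val insert : int -> t -> t              (* O(log n) *)
  val smallest : t -> int option          (* O(1) *)
  val remove_smallest : t -> t option     (* O(log n); None on an empty bag *)
end

(* Implementation: a leftist heap, sealed by the interface above. *)
module Leftist_heap : MIN_BAG = struct
  (* Node (rank, element, left, right); rank is the length of the right spine *)
  type t = Leaf | Node of int * int * t * t

  let empty = Leaf
  let rank = function Leaf -> 0 | Node (r, _, _, _) -> r

  (* Build a node, placing the child of larger rank on the left *)
  let make x a b =
    if rank a >= rank b then Node (rank b + 1, x, a, b)
    else Node (rank a + 1, x, b, a)

  (* Merging walks only the right spines, which have logarithmic length *)
  let rec merge h1 h2 =
    match h1, h2 with
    | Leaf, h | h, Leaf -> h
    | Node (_, x, a1, b1), Node (_, y, a2, b2) ->
        if x <= y then make x a1 (merge b1 h2)
        else make y a2 (merge h1 b2)

  let insert x h = merge (Node (1, x, Leaf, Leaf)) h
  let smallest = function Leaf -> None | Node (_, x, _, _) -> Some x
  let remove_smallest = function
    | Leaf -> None
    | Node (_, _, a, b) -> Some (merge a b)
end

(* Client code: uses only the operations named in MIN_BAG *)
let rec drain bag =
  match Leftist_heap.smallest bag, Leftist_heap.remove_smallest bag with
  | Some x, Some rest -> x :: drain rest
  | _ -> []

let sorted =
  drain (List.fold_right Leftist_heap.insert [5; 1; 4; 2; 3] Leftist_heap.empty)
```

Because the signature hides the representation, a different structure with the same costs could replace the leftist heap without any change to drain.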
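
The earlier challenge to specify a mutable stack and then an immutable one can be tried concretely. The sketch below is in OCaml rather than Jumbo ML, and every name in it (STACK, List_stack, MUTABLE_STACK, Ref_stack) is hypothetical. The contracts are written as comments: the immutable ones are equations over arguments and results, while the mutable ones have to talk about the state before and after each call.

```ocaml
(* Immutable stack: contracts relate results to arguments, nothing else. *)
module type STACK = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  (* Contract: pop (push x s) = Some (x, s), and pop empty = None *)
  val pop : 'a t -> ('a * 'a t) option
end

module List_stack : STACK = struct
  type 'a t = 'a list
  let empty = []
  let push x s = x :: s
  let pop = function [] -> None | x :: s -> Some (x, s)
end

(* Mutable stack: contracts must describe state before and after a call. *)
module type MUTABLE_STACK = sig
  type 'a t
  val create : unit -> 'a t
  (* Contract: after push x s, s holds x followed by what s held before. *)
  val push : 'a -> 'a t -> unit
  (* Contract: if s was empty, returns None and leaves s unchanged;
     otherwise returns Some x, where x was on top, and s holds the rest. *)
  val pop : 'a t -> 'a option
end

module Ref_stack : MUTABLE_STACK = struct
  type 'a t = 'a list ref
  let create () = ref []
  let push x s = s := x :: !s
  let pop s =
    match !s with
    | [] -> None
    | x :: rest -> s := rest; Some x
end

(* Pushing onto an immutable stack leaves the old stack usable *)
let old_stack = List_stack.push 1 List_stack.empty
let new_stack = List_stack.push 2 old_stack

(* Pushing onto a mutable stack changes what every holder of it sees *)
let shared : int Ref_stack.t = Ref_stack.create ()
let () = Ref_stack.push 1 shared; Ref_stack.push 2 shared
```

The immutable contract fits on one line; the mutable one needs the words "before" and "after", which is the extra burden the text attributes to impure code.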
