CS 242: Syntax and Semantics of Programs - PDF document



  1. CS 242: Syntax and Semantics of Programs
     John Mitchell
     Reading: Chapter 4

     Fundamentals
     • Syntax
       • The symbols used to write a program
     • Semantics
       • The actions that occur when a program is executed
     • Programming language implementation
       • Syntax → Semantics
       • Transform program syntax into machine instructions that can be executed to cause the correct sequence of actions to occur

     Interpreter vs Compiler
     [Diagram] Interpreter: Source Program + Input → Interpreter → Output
     [Diagram] Compiler: Source Program → Compiler → Target Program; Input → Target Program → Output

     Typical Compiler
     [Diagram] Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program
     See summary in course text, compiler books

     Brief look at syntax
     • Grammar
         e ::= n | e+e | e−e
         n ::= d | nd
         d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
     • Expressions in language
         e → e−e → e−e+e → n−n+n → nd−d+d → dd−d+d → … → 27 − 4 + 3
     Grammar defines a language
     Expressions in language derived by sequence of productions
     Many of you are familiar with this to some degree

     Parse tree
     • Derivation represented by tree
         e → e−e → e−e+e → n−n+n → nd−d+d → dd−d+d → … → 27 − 4 + 3

                 e
               / | \
              e  −  e
              |   / | \
             27  e  +  e
                 |     |
                 4     3

     Tree shows parenthesization of expression

  2. Parsing
     • Given expression find tree
     • Ambiguity
       • Expression 27 − 4 + 3 can be parsed two ways
       • Problem: 27 − (4 + 3) ≠ (27 − 4) + 3
     • Ways to resolve ambiguity
       • Precedence
         – Group * before +
         – Parse 3*4 + 2 as (3*4) + 2
       • Associativity
         – Parenthesize operators of equal precedence to left (or right)
         – Parse 3 − 4 + 5 as (3 − 4) + 5
     See book for more info

     Theoretical Foundations
     • Many foundational systems
       • Computability Theory
       • Program Logics
       • Lambda Calculus
       • Denotational Semantics
       • Operational Semantics
       • Type Theory
     • Consider two of these methods
       • Lambda calculus (syntax, operational semantics)
       • Denotational semantics

     Plan for next 1.5 lectures
     • Lambda calculus
     • Denotational semantics
     • Functional vs imperative programming
     For type theory, take CS258 in winter

     Lambda Calculus
     • Formal system with three parts
       • Notation for function expressions
       • Proof system for equations
       • Calculation rules called reduction
     • Additional topics in lambda calculus
       • Mathematical semantics (=model theory)
       • Type systems
     We will look at syntax, equations and reduction
     There is more detail in the book than we will cover in class

     History
     • Original intention
       • Formal theory of substitution (for FOL, etc.)
     • More successful for computable functions
       • Substitution --> symbolic computation
       • Church/Turing thesis
     • Influenced design of Lisp, ML, other languages
       • See Boost Lambda Library for C++ function objects
     • Important part of CS history and foundations

     Why study this now?
     • Basic syntactic notions
       • Free and bound variables
       • Functions
       • Declarations
     • Calculation rule
       • Symbolic evaluation useful for discussing programs
       • Used in optimization (in-lining), macro expansion
         – Correct macro processing requires variable renaming
       • Illustrates some ideas about scope of binding
         – Lisp originally departed from standard lambda calculus, returned to the fold through Scheme, Common Lisp
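The precedence and associativity rules on the Parsing slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the course's parser: the function names (`tokenize`, `parse_term`, `parse_left`, `parse_right`, `evaluate`) and the nested-tuple tree format are assumptions made for the example. `parse_term` groups `*` before `+` and `−`; `parse_left` and `parse_right` group operators of equal precedence to the left or to the right.

```python
import re

def tokenize(text):
    # Split the input into integer literals and the operators + - *
    return re.findall(r"\d+|[-+*]", text)

def parse_term(tokens, pos):
    # term ::= number ('*' number)*   (precedence: * groups before + and -)
    tree = int(tokens[pos])
    pos += 1
    while pos < len(tokens) and tokens[pos] == "*":
        tree = ("*", tree, int(tokens[pos + 1]))
        pos += 2
    return tree, pos

def parse_left(tokens):
    # e ::= term (('+' | '-') term)*, parenthesized to the left
    tree, pos = parse_term(tokens, 0)
    while pos < len(tokens):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        tree = (op, tree, right)
    return tree

def parse_right(tokens, pos=0):
    # Same grammar, but parenthesized to the right
    left, pos = parse_term(tokens, pos)
    if pos < len(tokens):
        return (tokens[pos], left, parse_right(tokens, pos + 1))
    return left

def evaluate(tree):
    # Walk the parse tree; leaves are integers
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    return a * b

tokens = tokenize("27 - 4 + 3")
left_tree = parse_left(tokens)    # (27 - 4) + 3
right_tree = parse_right(tokens)  # 27 - (4 + 3)
print(left_tree, evaluate(left_tree))
print(right_tree, evaluate(right_tree))
```

The two trees for the same token sequence evaluate to different numbers, which is the ambiguity problem from the slide; fixing an associativity picks one tree.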

  3. Expressions and Functions
     • Expressions
         x + y        x + 2*y + z
     • Functions
         λx. (x+y)        λz. (x + 2*y + z)
     • Application
         (λx. (x+y)) 3 = 3 + y
         (λz. (x + 2*y + z)) 5 = x + 2*y + 5
     Parsing: λx. f (f x) = λx.( f (f (x)) )

     Higher-Order Functions
     • Given function f, return function f ∘ f
         λf. λx. f (f x)
     • How does this work?
         (λf. λx. f (f x)) (λy. y+1)
           = λx. (λy. y+1) ((λy. y+1) x)
           = λx. (λy. y+1) (x+1)
           = λx. (x+1)+1
     Same result if step 2 is altered.

     Same procedure, Lisp syntax
     • Given function f, return function f ∘ f
         (lambda (f) (lambda (x) (f (f x))))
     • How does this work?
         ((lambda (f) (lambda (x) (f (f x)))) (lambda (y) (+ y 1)))
           = (lambda (x) ((lambda (y) (+ y 1)) ((lambda (y) (+ y 1)) x)))
           = (lambda (x) ((lambda (y) (+ y 1)) (+ x 1)))
           = (lambda (x) (+ (+ x 1) 1))

     Declarations as “Syntactic Sugar”
         function f(x)
           return x+2
         end;
         f(5);
       is sugar for
         (λf. f(5)) (λx. x+2)
       where (λf. f(5)) is the block body and (λx. x+2) is the declared function
         let x = e₁ in e₂ = (λx. e₂) e₁
     Extra reading: Tennent, Language Design Methods Based on Semantics Principles. Acta Informatica, 8:97-112, 197

     Free and Bound Variables
     • Bound variable is “placeholder”
       • Variable x is bound in λx. (x+y)
       • Function λx. (x+y) is same function as λz. (z+y)
     • Compare
         ∫ x+y dx = ∫ z+y dz        ∀x P(x) = ∀z P(z)
     • Name of free (=unbound) variable does matter
       • Variable y is free in λx. (x+y)
       • Function λx. (x+y) is not same as λx. (x+z)
     • Occurrences
       • y is free and bound in λx. ((λy. y+2) x) + y

     Reduction
     • Basic computation rule is β-reduction
         (λx. e₁) e₂ → [e₂/x]e₁
       where substitution involves renaming as needed (next slide)
     • Reduction:
       • Apply basic computation rule to any subexpression
       • Repeat
     • Confluence:
       • Final result (if there is one) is uniquely determined
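The higher-order function and "syntactic sugar" slides translate almost directly into Python lambdas. The sketch below is only an illustration: the names `twice`, `add_one`, `add_two`, `block_with_declaration`, `desugared` and `let_result` are invented for the example, and Python evaluates arguments eagerly, which is one particular reduction order rather than the general "reduce any subexpression" rule.

```python
# λf. λx. f (f x): given f, return f ∘ f
twice = lambda f: lambda x: f(f(x))

# λy. y+1
add_one = lambda y: y + 1

# (λf. λx. f (f x)) (λy. y+1) behaves like λx. (x+1)+1
add_two = twice(add_one)
print(add_two(5))

# Declaration form: function f(x) return x+2 end; f(5)
def block_with_declaration():
    def f(x):
        return x + 2
    return f(5)

# Desugared form: (λf. f(5)) (λx. x+2)
desugared = (lambda f: f(5))(lambda x: x + 2)

# let x = e1 in e2  is  (λx. e2) e1, here with e1 = 3+1 and e2 = x*x
let_result = (lambda x: x * x)(3 + 1)

print(block_with_declaration(), desugared, let_result)
```

The declaration and its desugared form return the same value, which is the sense in which the declaration adds convenience but no new expressive power.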

  4. Rename Bound Variables
     • Function application
         (λf. λx. f (f x)) (λy. y+x)
       where (λf. λx. f (f x)) means "apply twice" and (λy. y+x) means "add x to argument"
     • Substitute “blindly”
         λx. [ (λy. y+x) ((λy. y+x) x) ] = λx. x+x+x
     • Rename bound variables
         (λf. λz. f (f z)) (λy. y+x)
           = λz. [ (λy. y+x) ((λy. y+x) z) ] = λz. z+x+x
     Easy rule: always rename variables to be distinct

     1066 and all that
     • 1066 And All That, Sellar & Yeatman, 1930
       1066 is a lovely parody of English history books, "Comprising all the parts you can remember including one hundred and three good things, five bad kings and two genuine dates.”
     • Battle of Hastings Oct. 14, 1066
       • Battle that ended in the defeat of Harold II of England by William, duke of Normandy, and established the Normans as the rulers of England

     Main Points about Lambda Calculus
     • λ captures “essence” of variable binding
       • Function parameters
       • Declarations
       • Bound variables can be renamed
     • Succinct function expressions
     • Simple symbolic evaluator via substitution
     • Can be extended with
       • Types
       • Various functions
       • Stores and side-effects
       (But we didn’t cover these)

     Denotational Semantics
     • Describe meaning of programs by specifying the mathematical
       • Function
       • Function on functions
       • Value, such as natural numbers or strings
       defined by each construct

     Original Motivation for Topic
     • Precision
       • Use mathematics instead of English
     • Avoid details of specific machines
       • Aim to capture “pure meaning” apart from implementation details
     • Basis for program analysis
       • Justify program proof methods
         – Soundness of type system, control flow analysis
       • Proof of compiler correctness
       • Language comparisons

     Why study this in CS 242?
     • Look at programs in a different way
     • Program analysis
       • Initialize before use, …
     • Introduce historical debate: functional versus imperative programming
       • Program expressiveness: what does this mean?
       • Theory versus practice: we don’t have a good theoretical understanding of programming language “usefulness”
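The difference between "blind" substitution and substitution with renaming can be checked mechanically. The following is a minimal sketch of a symbolic evaluator, not the course's implementation: the tuple encoding of terms, the helper names (`var`, `lam`, `app`, `add`, `free_vars`, `fresh`, `subst`, `step`, `normalize`) and the `rename` flag are all assumptions made for this example, and `+` is treated as an uninterpreted binary node.

```python
import itertools

# Hypothetical term encoding: nested tuples tagged "var", "lam", "app", "add"
def var(n): return ("var", n)
def lam(n, body): return ("lam", n, body)
def app(f, a): return ("app", f, a)
def add(a, b): return ("add", a, b)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])  # "app" or "add"

_counter = itertools.count(1)

def fresh(name, avoid):
    # Produce a variable name that does not occur in `avoid`
    while True:
        candidate = f"{name}{next(_counter)}"
        if candidate not in avoid:
            return candidate

def subst(t, x, s, rename=True):
    """Compute [s/x]t. With rename=False this is the 'blind' version."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] in ("app", "add"):
        return (t[0], subst(t[1], x, s, rename), subst(t[2], x, s, rename))
    y, body = t[1], t[2]
    if y == x:
        return t  # x is rebound here, so nothing to substitute below
    if rename and y in free_vars(s):
        # Rename the bound variable so it cannot capture a free variable of s
        z = fresh(y, free_vars(s) | free_vars(body) | {x})
        body = subst(body, y, var(z), rename)
        y = z
    return lam(y, subst(body, x, s, rename))

def step(t, rename):
    """One leftmost-outermost β-step, or None if no redex remains."""
    if t[0] == "var":
        return None
    if t[0] == "lam":
        body = step(t[2], rename)
        return None if body is None else lam(t[1], body)
    if t[0] == "app" and t[1][0] == "lam":
        return subst(t[1][2], t[1][1], t[2], rename)  # (λx. e1) e2 → [e2/x]e1
    left = step(t[1], rename)
    if left is not None:
        return (t[0], left, t[2])
    right = step(t[2], rename)
    return None if right is None else (t[0], t[1], right)

def normalize(t, rename=True, limit=100):
    for _ in range(limit):
        nxt = step(t, rename)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("no normal form found within the step limit")

twice = lam("f", lam("x", app(var("f"), app(var("f"), var("x")))))
add_x = lam("y", add(var("y"), var("x")))  # λy. y+x, with x free

blind = normalize(app(twice, add_x), rename=False)
renamed = normalize(app(twice, add_x), rename=True)
print(blind, free_vars(blind))
print(renamed, free_vars(renamed))
```

In the blind result the free `x` of `λy. y+x` has been captured by the binder of `twice`, so the term has no free variables left; with renaming, `x` stays free, matching `λz. z+x+x` on the slide up to the choice of the fresh name.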

  5. Basic Principle of Denotational Sem.
     • Compositionality
       • The meaning of a compound program must be defined from the meanings of its parts (not the syntax of its parts).
     • Examples
       • P; Q
           composition of two functions, state → state
       • letrec f(x) = e₁ in e₂
           meaning of e₂ where f denotes function ...

     Trivial Example: Binary Numbers
     • Syntax
         b ::= 0 | 1
         n ::= b | nb
         e ::= n | e+e
     • Semantics value function E : exp -> numbers
         E[[ 0 ]] = 0        E[[ 1 ]] = 1
         E[[ nb ]] = 2*E[[ n ]] + E[[ b ]]
         E[[ e₁+e₂ ]] = E[[ e₁ ]] + E[[ e₂ ]]
     Obvious, but different from compiler evaluation using registers, etc.
     This is a simple machine-independent characterization ...

     Second Example: Expressions w/vars
     • Syntax
         d ::= 0 | 1 | 2 | … | 9
         n ::= d | nd
         e ::= x | n | e + e
     • Semantics value E : exp x state -> numbers
         state s : vars -> numbers
         E[[ x ]] s = s(x)
         E[[ 0 ]] s = 0        E[[ 1 ]] s = 1        …
         E[[ nd ]] s = 10*E[[ n ]] s + E[[ d ]] s
         E[[ e₁ + e₂ ]] s = E[[ e₁ ]] s + E[[ e₂ ]] s

     Semantics of Imperative Programs
     • Syntax
         P ::= x:=e | if B then P else P | P;P | while B do P
     • Semantics
       • C : Programs → (State → State)
       • State = Variables → Values
           would be locations → values if we wanted to model aliasing
     Every imperative program can be translated into a functional program in a relatively simple, syntax-directed way.

     Semantics of Assignment
         C[[ x:= e ]] is a function states → states
         C[[ x:= e ]] s = s’
       where s’ : variables → values is identical to s except
         s’(x) = E[[ e ]] s gives the value of e in state s

     Semantics of Conditional
         C[[ if B then P else Q ]] is a function states → states
         C[[ if B then P else Q ]] s = C[[ P ]] s   if E[[ B ]] s is true
                                       C[[ Q ]] s   if E[[ B ]] s is false
     Simplification: assume B cannot diverge or have side effects
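The semantic equations on these slides can be read as a small functional program, which is one way to see the claim that imperative programs translate into functional ones in a syntax-directed way. The sketch below is an illustration under several assumptions that are not in the slides: the tuple encoding of syntax, the function names `E_binary`, `E` and `C`, the use of Python dicts for states, a `<` comparison standing in for the unspecified boolean expressions B, and a `while` clause implemented by plain iteration (the slides list `while` in the syntax but do not give its meaning).

```python
def E_binary(numeral):
    """Value of a binary numeral given as a string of 0/1 digits."""
    if len(numeral) == 1:
        return int(numeral)  # E[[0]] = 0, E[[1]] = 1
    # E[[nb]] = 2*E[[n]] + E[[b]]
    return 2 * E_binary(numeral[:-1]) + E_binary(numeral[-1])

def E(e):
    """Meaning of an expression: a function from states to values."""
    if isinstance(e, int):
        return lambda s: e
    if isinstance(e, str):
        return lambda s: s[e]  # E[[x]] s = s(x)
    op, e1, e2 = e
    m1, m2 = E(e1), E(e2)  # built only from the meanings of the parts
    if op == "+":
        return lambda s: m1(s) + m2(s)
    if op == "<":  # assumed form of boolean expression B
        return lambda s: m1(s) < m2(s)
    raise ValueError(f"unknown operator {op!r}")

def C(p):
    """Meaning of a program: a function from states to states."""
    tag = p[0]
    if tag == "assign":  # C[[x := e]] s = s' with s'(x) = E[[e]] s
        _, x, e = p
        me = E(e)
        return lambda s: {**s, x: me(s)}
    if tag == "seq":  # C[[P; Q]] = composition of two functions
        mp, mq = C(p[1]), C(p[2])
        return lambda s: mq(mp(s))
    if tag == "if":  # C[[if B then P else Q]]
        mb, mp, mq = E(p[1]), C(p[2]), C(p[3])
        return lambda s: mp(s) if mb(s) else mq(s)
    if tag == "while":  # assumption: iterate until B is false
        mb, mp = E(p[1]), C(p[2])
        def loop(s):
            while mb(s):
                s = mp(s)
            return s
        return loop
    raise ValueError(f"unknown program form {tag!r}")

# x := 0; while x < 3 do x := x + 1; if x < 5 then y := 1 else y := 2
program = ("seq",
           ("seq",
            ("assign", "x", 0),
            ("while", ("<", "x", 3), ("assign", "x", ("+", "x", 1)))),
           ("if", ("<", "x", 5), ("assign", "y", 1), ("assign", "y", 2)))

final_state = C(program)({})
print(E_binary("101"), final_state)
```

Each clause of `C` calls `C` or `E` only on the immediate sub-phrases and combines the resulting functions, which is the compositionality principle; assignment returns a new dict rather than mutating its argument, mirroring "s’ is identical to s except at x".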
