Semantic Analysis

Compile Design I (2011)

Outline
• The role of semantic analysis in a compiler
  – A laundry list of tasks
• Scope
  – Static vs. dynamic scoping
  – Implementation: symbol tables
• Types
  – Statically vs. dynamically typed languages

Where we are

The Compiler so far
Lexical analysis: program is lexically well-formed
  – Tokens are legal (e.g. identifiers have valid names, no stray characters, etc.)
  – Detects inputs with illegal tokens
Parsing: program is syntactically well-formed
  – Declarations have correct structure, expressions are syntactically valid, etc.
  – Detects inputs with ill-formed syntax
Semantic analysis:
  – Last “front end” compilation phase
  – Catches all remaining errors

Why have a Separate Semantic Analysis?
Parsing cannot catch some errors
Some language constructs are not context-free
  – Example: Identifier declaration and use
  – An abstract version of the problem is:
      { wcw | w ∈ (a + b)* }
  – The 1st w represents the identifier’s declaration; the 2nd w represents a use of the identifier

What Does Semantic Analysis Do?
Performs checks of many kinds ...
Examples:
  1. All used identifiers are declared
  2. Identifiers declared only once
  3. Types
  4. Procedures and functions defined only once
  5. Procedures and functions used with the right number and type of arguments
And others . . .
The requirements depend on the language

What’s Wrong?
Example 1
    let string y ← "abc" in y + 42
Example 2
    let integer y in x + 42

Semantic Processing: Syntax-Directed Translation
Basic idea: Associate information with language constructs by attaching attributes to the grammar symbols that represent these constructs
  – Values for attributes are computed using semantic rules associated with grammar productions
  – An attribute can represent anything (reasonable) that we choose; e.g. a string, number, type, etc.
  – A parse tree showing the values of attributes at each node is called an annotated parse tree

Attributes of an Identifier
name: character string (obtained from scanner)
scope:
type:
  – integer
  – array:
      • number of dimensions
      • upper and lower bounds for each dimension
      • type of elements
  – function:
      • number and type of parameters (in order)
      • type of returned value
      • size of stack frame

Scope
• The scope of an identifier (a binding of a name to the entity it names) is the textual part of the program in which the binding is active
• Scope matches identifier declarations with uses
  – Important static analysis step in most languages

Scope (Cont.)
• The scope of an identifier is the portion of a program in which that identifier is accessible
• The same identifier may refer to different things in different parts of the program
  – Different scopes for same name don’t overlap
• An identifier may have restricted scope

Static vs. Dynamic Scope
• Most languages have static (lexical) scope
  – Scope depends only on the physical structure of program text, not its run-time behavior
  – The determination of scope is made by the compiler
  – C, Java, ML have static scope; so do most languages
• A few languages are dynamically scoped
  – Lisp, SNOBOL
  – Lisp has changed to mostly static scoping
  – Scope depends on execution of the program

Static Scoping Example
    let integer x ← 0 in
    {
      x;
      let integer x ← 1 in
        x;
      x;
    }
Uses of x refer to closest enclosing definition

Dynamic Scope
• A dynamically-scoped variable refers to the closest enclosing binding in the execution of the program
Example
    g(y) = let integer a ← 42 in f(3);
    f(x) = a;
  – When invoking g(54) the result will be 42

Static vs. Dynamic Scope
    Program scopes (input, output);
    var a: integer;
    procedure first;
      begin a := 1; end;
    procedure second;
      var a: integer;
      begin first; end;
    begin
      a := 2; second; write(a);
    end.
With static scope rules, it prints 1
With dynamic scope rules, it prints 2

Dynamic Scope (Cont.)
• With dynamic scope, bindings cannot always be resolved by examining the program because they are dependent on calling sequences
• Dynamic scope rules are usually encountered in interpreted languages
• Also, these languages usually do not have static type checking, as type determination is not always possible when dynamic rules are in effect
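
The two outcomes of the "scopes" program can be reproduced with a small simulation. The following is a minimal sketch in Python, not part of the original slides: the names run, call_stack and lookup_env are hypothetical, and each procedure activation is modelled as a dictionary of its local variables.

```python
def run(dynamic: bool) -> int:
    """Return the value written by the 'scopes' program under the chosen rule."""
    global_env = {"a": None}      # var a: integer;  (global declaration)
    call_stack = [global_env]     # one environment per active procedure

    def lookup_env(name, defining_env):
        if dynamic:
            # Dynamic scope: the most recent binding on the call stack wins.
            for env in reversed(call_stack):
                if name in env:
                    return env
            raise NameError(name)
        # Static scope: use the environment enclosing the procedure's text.
        # (Simplification: the caller passes that environment in directly.)
        return defining_env

    def first():
        call_stack.append({})                    # first declares no locals
        lookup_env("a", global_env)["a"] = 1     # a := 1
        call_stack.pop()

    def second():
        call_stack.append({"a": None})           # var a: integer;  (local)
        first()
        call_stack.pop()

    global_env["a"] = 2                          # a := 2
    second()
    return global_env["a"]                       # write(a)


print(run(dynamic=False))  # static scope rules
print(run(dynamic=True))   # dynamic scope rules
```

Under the static rule, first updates the global a. Under the dynamic rule, it updates the a local to second, because that is the most recent binding at the time of the call, so the global a keeps the value 2.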

Scope of Identifiers
• In most programming languages identifier bindings are introduced by
  – Function declarations (introduce function names)
  – Procedure definitions (introduce procedure names)
  – Identifier declarations (introduce identifiers)
  – Formal parameters (introduce identifiers)

Scope of Identifiers (Cont.)
• Not all kinds of identifiers follow the most-closely nested scope rule
• For example, function declarations
  – often cannot be nested
  – are globally visible throughout the program
• In other words, a function name can be used before it is defined

Example: Use Before Definition
    foo (integer x)
    {
      integer y
      y ← bar(x)
      ...
    }
    bar (integer i): integer
    {
      ...
    }

Other kinds of Scope
• In O-O languages, method and attribute names have more sophisticated (static) scope rules
• A method need not be defined in the class in which it is used, but in some parent class
• Methods may also be redefined (overridden)
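
Returning to the use-before-definition example (foo calling bar), one common way to accept such programs is to collect all function names first and check the calls afterwards. The following is a minimal sketch in Python under that assumption. The function check_calls and its input format (a list of function names paired with the names they call) are hypothetical and chosen only for illustration.

```python
def check_calls(functions):
    """Check calls in two passes.

    functions: list of (name, [names of called functions]) in source order.
    Returns a list of error messages (empty if the program is fine).
    """
    errors = []

    # Pass 1: gather every function name, wherever it appears in the source.
    declared = set()
    for name, _calls in functions:
        if name in declared:
            errors.append(f"function defined more than once: {name}")
        declared.add(name)

    # Pass 2: every call must refer to a name gathered in pass 1.
    for name, calls in functions:
        for callee in calls:
            if callee not in declared:
                errors.append(f"{name} calls undefined function: {callee}")

    return errors


# foo calls bar before bar is defined; this is accepted.
print(check_calls([("foo", ["bar"]), ("bar", [])]))
# baz is never defined; this is reported.
print(check_calls([("foo", ["baz"])]))
```

A single pass over the same input would have to reject the first program, since bar is not yet known when the body of foo is checked.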

Implementing the Most-Closely Nested Rule
• Much of semantic analysis can be expressed as a recursive descent of an AST
  – Process an AST node n
  – Process the children of n
  – Finish processing the AST node n
• When performing semantic analysis on a portion of the AST, we need to know which identifiers are defined

Implementing Most-Closely Nesting (Cont.)
• Example: the scope of variable declarations is one subtree
    let integer x ← 42 in E
• x can be used in subtree E

Symbol Tables
Purpose: To hold information about identifiers that is computed at some point and looked up at later times during compilation
Examples:
  – type of a variable
  – entry point for a function
Operations: insert, lookup, delete
Common implementations: linked lists, hash tables

Symbol Tables
• Consider again:
    let integer x ← 42 in E
• Idea:
  – Before processing E, add definition of x to current definitions, overriding any other definition of x
  – After processing E, remove definition of x and restore old definition of x
• A symbol table is a data structure that tracks the current bindings of identifiers
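
The add-then-restore idea for let can be sketched as a recursive descent over a tiny AST. This is an illustrative Python sketch, not the course's implementation. The tuple-based node shapes ("num", "var", "add", "let") and the function check are assumptions, as is the choice that the initializer of a let sees only the outer definitions.

```python
def check(node, defs, errors):
    """Report uses of undeclared identifiers.

    node   : ("num", n) | ("var", name) | ("add", left, right)
             | ("let", type, name, init_or_None, body)
    defs   : dict mapping currently visible names to their types
    errors : list that collects error messages
    """
    kind = node[0]
    if kind == "num":
        return
    if kind == "var":
        if node[1] not in defs:
            errors.append(f"undeclared identifier: {node[1]}")
        return
    if kind == "add":
        check(node[1], defs, errors)
        check(node[2], defs, errors)
        return
    if kind == "let":
        _, typ, name, init, body = node
        if init is not None:
            check(init, defs, errors)   # assumption: init sees outer scope only
        had_old = name in defs
        old = defs.get(name)
        defs[name] = typ                # before processing E: add definition
        check(body, defs, errors)       # process E
        if had_old:
            defs[name] = old            # after E: restore old definition
        else:
            del defs[name]              # after E: remove definition
        return
    raise ValueError(f"unknown node kind: {kind}")


# let integer y in x + 42   (x is never declared)
errors = []
check(("let", "integer", "y", None, ("add", ("var", "x"), ("num", 42))), {}, errors)
print(errors)
```

Because every let restores the previous state on the way out, the dictionary always reflects exactly the bindings visible at the node being processed.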

A Simple Symbol Table Implementation
• Structure is a stack
• Operations
  – add_symbol(x): push x and associated info, such as x’s type, on the stack
  – find_symbol(x): search stack, starting from top, for x. Return first x found or NULL if none found
  – remove_symbol(): pop the stack
• Why does this work?

Limitations
• The simple symbol table works for variable declarations
  – Symbols added one at a time
  – Declarations are perfectly nested
• Doesn’t work for
    foo(x: integer, x: float);
• Other problems?

A Fancier Symbol Table
• enter_scope(): start/push a new nested scope
• find_symbol(x): finds current x (or null)
• add_symbol(x): add a symbol x to the table
• check_scope(x): true if x defined in current scope
• exit_scope(): exits/pops the current scope

Function/Procedure Definitions
• Function names can be used prior to their definition
• We can’t check that for function names
  – using a symbol table
  – or even in one pass
• Solution
  – Pass 1: Gather all function/procedure names
  – Pass 2: Do the checking
• Semantic analysis requires multiple passes
  – Probably more than two
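
The five operations of the fancier symbol table can be sketched as a stack of dictionaries, one per scope. This is a minimal Python sketch under that assumption. The class name SymbolTable, the extra info argument of add_symbol, and the helper declare_params are hypothetical and not taken from the slides.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its associated info."""

    def __init__(self):
        self.scopes = [{}]                 # start with one global scope

    def enter_scope(self):
        self.scopes.append({})             # start/push a new nested scope

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()                  # exit/pop the current scope

    def add_symbol(self, name, info):
        self.scopes[-1][name] = info       # add to the current scope

    def find_symbol(self, name):
        # Search from the innermost scope outwards; None if not found.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

    def check_scope(self, name):
        return name in self.scopes[-1]     # defined in the current scope?


def declare_params(table, params):
    """Add formal parameters to the current scope, reporting duplicates."""
    errors = []
    for name, typ in params:
        if table.check_scope(name):
            errors.append(f"duplicate parameter: {name}")
        else:
            table.add_symbol(name, typ)
    return errors


table = SymbolTable()
table.enter_scope()                        # scope for the parameters of foo
print(declare_params(table, [("x", "integer"), ("x", "float")]))
table.exit_scope()
```

The check_scope operation is what the simple stack lacks: it distinguishes a legal redefinition in a nested scope from an illegal duplicate in the same scope, as in foo(x: integer, x: float).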

Types
• What is a type?
  – This is a subject of some debate
  – The notion varies from language to language
• Consensus
  – A type is a set of values and
  – A set of operations on those values
• Type errors arise when operations are performed on values that do not support that operation

Why Do We Need Type Systems?
Consider the assembly language fragment
    addi $r1, $r2, $r3
What are the types of $r1, $r2, $r3?

Types and Operations
• Certain operations are legal for values of each type
  – It doesn’t make sense to add a function pointer and an integer in C
  – It does make sense to add two integers
  – But both have the same assembly language implementation!

Type Systems
• A language’s type system specifies which operations are valid for which types
• The goal of type checking is to ensure that operations are used with the correct types
  – Enforces intended interpretation of values, because nothing else will!
• Type systems provide a concise formalization of the semantic checking rules
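
The statement that a type system specifies which operations are valid for which types can be made concrete with a rule table. The following is a minimal Python sketch and not a real language's type system. The names RULES, TypeCheckError and type_of, and the tuple-based expression format, are assumptions made for illustration.

```python
class TypeCheckError(Exception):
    """Raised when an operation is applied to operands of the wrong types."""


# (operator, left type, right type) -> result type.
# Only integer addition is declared valid in this sketch.
RULES = {
    ("+", "integer", "integer"): "integer",
}


def type_of(expr, env):
    """Return the type of expr, or raise TypeCheckError.

    expr: ("int", n) | ("str", s) | ("var", name) | ("+", left, right)
    env : dict mapping declared identifiers to their types
    """
    kind = expr[0]
    if kind == "int":
        return "integer"
    if kind == "str":
        return "string"
    if kind == "var":
        if expr[1] not in env:
            raise TypeCheckError(f"undeclared identifier: {expr[1]}")
        return env[expr[1]]
    if kind == "+":
        left = type_of(expr[1], env)
        right = type_of(expr[2], env)
        result = RULES.get(("+", left, right))
        if result is None:
            raise TypeCheckError(f"cannot apply + to {left} and {right}")
        return result
    raise ValueError(f"unknown expression kind: {kind}")


# y + 42 where y was declared as a string, as in: let string y ← "abc" in y + 42
try:
    type_of(("+", ("var", "y"), ("int", 42)), {"y": "string"})
except TypeCheckError as err:
    print(err)
```

Both additions would compile to the same machine instruction; only the rule table separates the meaningful one from the meaningless one.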