
CS502: Compiler Design Semantic Analysis (Cont.) Manas Thakur Fall 2020



  1. CS502: Compiler Design Semantic Analysis (Cont.) Manas Thakur Fall 2020

  2. Recap
     ● Syntax analysis can only find, well, syntax errors.
     ● We are interested in being able to find various other kinds of errors:

         bar(int a, char* s) {...}

         int foo() {
           int f[3];
           int i, j, k;
           char q, *p;
           float k;
           bar(f[6], 10, x);
           break;
           i->val = 5;
           q = k + p;
           printf("%s, %s.\n", p, k);
           goto label2;
         }

  3. Program checking
     ● When are checks performed?
     ● Static checking
       – At compile-time
       – Detect and report errors by analyzing the program offline
     ● Dynamic checking
       – At run-time
       – Detect and report/handle errors as they occur
     ● Pros and cons?
       – Efficiency?
       – Completeness?
       – Developer and user experience?
       – Language flexibility?

  4. What all can be checked statically?
     ● Uniqueness checks
       – Certain names must be unique
       – Many languages require variable declarations
     ● Control-flow checks
       – Match control-flow operators with structures
       – Example: break applies to innermost loop/switch
     ● Type checks
       – Check compatibility of operators and operands
       – Example: Does 3.5 + “foobar” make sense?
     ● What kind of check is “array bounds”?
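The control-flow check on `break` can be sketched as a walk over the syntax tree that tracks how many loops or switches enclose the current statement. This is only an illustrative sketch: the tuple-based AST and the name `check_breaks` are hypothetical and not part of the course material.

```python
# Hypothetical toy AST: each node is a tuple whose first element is its kind.
#   ("block", [stmts])   ("while", body)   ("switch", body)
#   ("break",)           ("stmt", text)

def check_breaks(node, depth=0, errors=None):
    """Collect an error message for every break not enclosed by a loop/switch."""
    if errors is None:
        errors = []
    kind = node[0]
    if kind == "break":
        if depth == 0:
            errors.append("break outside loop or switch")
    elif kind in ("while", "switch"):
        # Entering a loop or switch: breaks inside its body are legal.
        check_breaks(node[1], depth + 1, errors)
    elif kind == "block":
        for child in node[1]:
            check_breaks(child, depth, errors)
    return errors

ok = ("block", [("while", ("block", [("stmt", "i++"), ("break",)]))])
bad = ("block", [("stmt", "int i"), ("break",)])

print(check_breaks(ok))   # []
print(check_breaks(bad))  # ['break outside loop or switch']
```

The second program mirrors the stray `break;` in the `foo()` example from the recap: it parses fine, but no enclosing loop or switch exists for it to exit.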

  5. Uniqueness checks
     ● What does a name in a program denote?
       – Variable
       – Function
       – Class
       – Label
     ● Information maintained in bindings
       – A binding from the name to the corresponding entity
       – Bindings have scope:
         ● the region of the program in which they are valid
     ● Uniqueness checks
       – Analyze the bindings
       – Make sure they obey the rules

  6. Namespace abstractions
     ● What is a function/procedure/method? What is a class?
       – Do they exist at the machine-code level?
       – Not really!
     ● Functions/procedures/methods and classes essentially define namespaces.
     ● Helpful in
       – Identifying scopes
       – Defining bindings

  7. Procedures as namespaces
     ● Each procedure creates its own namespace
       – Names can be declared locally
       – Local names hide identical non-local (global) names (shadowing)
       – Local names cannot be seen outside the procedure
     ● Such a set of rules is called lexical (or static) scoping.
       – There must then exist a dynamic scoping!
         ● Ask those who have taken CS302!
     ● e.g., C has global, static, local, and block scopes
       – Blocks can be nested, procedures cannot.

  8. Lexical scoping
     ● Why is it good?
       – Flexibility for programmer (reuse of variable names)

           {
             for (int i = 0; i < 100; ++i) { ... }
             for (Iterator i = list.iterator(); i.hasNext();) { ... }
           }

         (Callout on the code: different because of lexical scoping)
       – Easy to “see” a binding!
     ● Compiler’s headache to differentiate same-name variables at different points
       – Implementation: Lexically scoped symbol tables

  9. Symbol Table
     [Figure: phases of a compiler. Character stream → Lexical Analyzer → Token stream → Syntax Analyzer → Syntax tree → Semantic Analyzer → Syntax tree → Intermediate Code Generator → Intermediate representation → Machine-Independent Code Optimizer → Intermediate representation → Code Generator → Target machine code → Machine-Dependent Code Optimizer → Target machine code, drawn alongside the Symbol Table.]

  10. Lexically scoped symbol tables
      ● Tasks at hand
        – Keep track of names
        – At the use of a name, find its information (e.g., which one?)
      ● The problem
        – Compiler needs a distinct entry for each declaration
        – Nested lexical scopes allow duplicate entries
      ● Let’s see an example.

  11. Scopes
      Source program:

          class p {
            int a, b, c;
            method q {
              int v, b, x, w;
              for (r = 0; ...) {
                int x, y, z;
                ...
              }
              while (s) {
                int x, a, v;
                ...
              }
              ... r ... s
            }
            ... q ...
          }

      Corresponding scopes:

          S_p: {
            int a, b, c;
            S_q: {
              int v, b, x, w;
              S_r: { int x, y, z; ... }
              S_s: { int x, a, v; ... }
              ...
            }
            ...
          }

  12. Chained implementation
      ● Create a new table for each scope
      ● Chain tables together for lookup
      ● enter() creates a new table
      ● insert() adds at current level
      ● lookup() walks chain of tables and returns first occurrence of name
      ● exit() throws away the table for the current level
      ● How would one implement the individual tables?
      [Figure: chain of per-scope tables for r, q, and p from the Scopes example]
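A minimal sketch of the chained scheme follows, assuming each individual table is a hash table (a Python `dict`) and the chain is a list whose last element is the innermost scope. The class name `SymbolTable`, the `DuplicateDeclaration` exception, and the choice to reject a redeclaration in the same scope are illustrative assumptions, not something the slides prescribe.

```python
class DuplicateDeclaration(Exception):
    """Raised when a name is declared twice in the same scope (assumed rule)."""

class SymbolTable:
    def __init__(self):
        self.scopes = []              # chain of tables; innermost scope is last

    def enter(self):
        self.scopes.append({})        # new table for a new scope

    def exit(self):
        self.scopes.pop()             # throw away the current level

    def insert(self, name, info):
        current = self.scopes[-1]
        if name in current:           # uniqueness check within one scope
            raise DuplicateDeclaration(name)
        current[name] = info

    def lookup(self, name):
        for table in reversed(self.scopes):   # walk the chain outward
            if name in table:
                return table[name]            # first occurrence wins
        return None

# Replay the Scopes example; the stored info is just the declaring scope's name.
st = SymbolTable()
st.enter()
for n in ("a", "b", "c"):
    st.insert(n, "p")
st.enter()
for n in ("v", "b", "x", "w"):
    st.insert(n, "q")
st.enter()
for n in ("x", "y", "z"):
    st.insert(n, "r")
found_in_r = [st.lookup(n) for n in ("x", "b", "a")]
st.exit()
st.enter()
for n in ("x", "a", "v"):
    st.insert(n, "s")
found_in_s = [st.lookup(n) for n in ("x", "a", "y")]

print(found_in_r)  # ['r', 'q', 'p']
print(found_in_s)  # ['s', 's', None]
```

Inside scope r, `b` resolves to q's declaration because q's `b` shadows p's; once r has been exited, `y` is no longer visible from s.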

  13. Tomorrow
      ● Extensions to symbol tables for OO languages
        – Classes
        – Objects
        – Object fields
        – Inheritance
      ● Implementation:
        – Your compiler is taking shape now.
      ● Poll on Teams for doubt session.

  14. CS502: Compiler Design Semantic Analysis (Cont.) Manas Thakur Fall 2020

  15. Virtual White Board
      ● Designing a symbol table
      ● Extending for new scopes
      ● Classes and inheritance
      ● Assignment 2: Not overweight, but under-tall
        – Try feeding lasagne to Garfield
        – Deadline: Oct 18th

  16. CS502: Compiler Design Semantic Analysis (Cont.) Manas Thakur Fall 2020

  17. Uniqueness checks: More complications
      ● Forward references – need multiple passes
      ● includes, packages, modules, interfaces – need to import/export
      ● Various coding conveniences
        – int a = sizeof(a);
          ● Declare “a” in the namespace before parsing the initializer
        – int b, c[sizeof(b)];
          ● Declare “b” with a type before parsing “c”
      ● Multiple inheritance?
      ● Summary: Language features complicate the life of compiler designers even for a seemingly simple check!
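To see why forward references call for more than one pass, compare a single left-to-right scan with a two-pass scheme that first collects every declaration and only then resolves uses. The program representation (a list of `("decl", name)` and `("use", name)` pairs) and both function names are hypothetical, chosen only for this sketch.

```python
# Hypothetical flattened program, in source order.
program = [
    ("use", "helper"),        # forward reference: helper is declared later
    ("decl", "main"),
    ("decl", "helper"),
    ("use", "undefined_fn"),  # never declared anywhere
]

def one_pass(items):
    """Resolve names while scanning once; forward references look undeclared."""
    declared, errors = set(), []
    for kind, name in items:
        if kind == "decl":
            declared.add(name)
        elif name not in declared:
            errors.append(name)
    return errors

def two_pass(items):
    """Pass 1 collects all declarations; pass 2 checks every use against them."""
    declared = {name for kind, name in items if kind == "decl"}
    return [name for kind, name in items if kind == "use" and name not in declared]

print(one_pass(program))  # ['helper', 'undefined_fn']
print(two_pass(program))  # ['undefined_fn']
```

The single pass wrongly reports `helper`, while the two-pass version flags only the name that is truly missing.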

  18. Type checking
      ● Big topic
        – Type expressions
        – Type equivalence
        – Type systems
        – Type inference
      ● What is a type?
        – A collection of values and the set of operations on those values.
        – Remember why you said a door can’t kick or a ship can’t die?
      ● Types define capabilities.

  19. Purpose of types
      ● Identify and prevent errors
        – Avoid meaningless or harmful computations
        – Meaningless: (x < 6) + 1 - “bathtub”
        – Harmful?
      ● Program organization and documentation
        – Separate types for separate concepts
        – Types indicate programmers’ intent
      ● Support implementation
        – Allocate right amount of space for variables
        – Select right machine operands
        – Optimization: e.g., use fewer bits when possible
      ● Key idea: types can be checked

  20. Type errors
      ● Problem:
        – Underlying memory has no concept of type
        – Everything is just a string of bits: 0100 0000 0101 1000 0000 0000 0000 0000
        – The floating point number: 3.375
        – The 32-bit integer: 1,079,508,992
        – Two 16-bit integers: 16472 and 0
        – Four ASCII characters: @, X, NULL and NULL
      ● Without type checking:
        – Machine will let you store 3.375 and later load 1,079,508,992
        – Violates the intended semantics of the program
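The four readings of that bit string can be reproduced with Python's standard `struct` module. The sketch assumes big-endian byte order and IEEE 754 single precision for the float; the variable names are illustrative.

```python
import struct

# The 32 bits from the slide, as four bytes (big-endian order assumed).
bits = bytes([0b01000000, 0b01011000, 0b00000000, 0b00000000])

as_float = struct.unpack(">f", bits)[0]   # one 32-bit IEEE 754 float
as_int32 = struct.unpack(">i", bits)[0]   # one 32-bit signed integer
as_int16s = struct.unpack(">hh", bits)    # two 16-bit signed integers
as_chars = [chr(b) for b in bits]         # four ASCII characters

print(as_float)    # 3.375
print(as_int32)    # 1079508992
print(as_int16s)   # (16472, 0)
print(as_chars)    # ['@', 'X', '\x00', '\x00']
```

The bytes never change; only the format string, playing the role of the type, decides what value they denote.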

  21. Type system
      ● Idea:
        – Provide clear interpretation for bits in memory
        – Impose constraints on the use of variables and data
        – Expressed as a set of rules
        – Automatically check the rules
        – Report errors to programmers
      ● Key questions:
        – What types are built into the language?
        – Can the programmer build new types?
        – What are the typing rules?
        – When does type checking occur?
        – How strictly are the rules enforced?

  22. When are checks performed?
      ● Statically typed languages
        – Types of all the variables are determined ahead of time
        – Examples?
          ● C, C++, Java
      ● Dynamically typed languages
        – Type of a variable can vary at run-time
        – Examples?
          ● Python, JavaScript, bash, Scheme
      ● Our focus:
        – Static typing – corresponds to standard static compilation

  23. Expressiveness
      ● Consider this Scheme function:

          (define myfunc
            (lambda (x)
              (if (list? x)
                  (myfunc (car x))
                  (+ x 1))))

      ● What is the type of x?
        – Sometimes a list, sometimes an atom
        – Downside?
      ● What would happen in static typing?
        – Cannot assign a type to x at compile-time
        – Cannot write this function
        – Static typing is conservative
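For readers less familiar with Scheme, here is a rough Python rendering of the same function. Python is also dynamically typed, so it accepts the function just as Scheme does. The translation (`list?` as `isinstance(x, list)`, `car` as `x[0]`) is an approximation made for this sketch.

```python
def myfunc(x):
    # x is sometimes a list and sometimes a number; nothing declares which.
    if isinstance(x, list):
        return myfunc(x[0])   # recurse on the first element, like (car x)
    return x + 1              # only valid when x is a number

print(myfunc(41))             # 42
print(myfunc([[1, 2], 3]))    # 2
```

The second call descends through `[[1, 2], 3]` to `[1, 2]` to `1` before adding one, so the type of `x` differs at each level of recursion.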

  24. Types and Compilers
      ● Suppose the task is to generate code for:

          a = b + c * d;
          arr[i] = *p + 2;

        – What does the compiler need to know?
      ● Duties of a compiler:
        – Enforce type rules of the language
        – Choose operations to be performed
          ● Can a certain computation be done in one machine instruction?
        – Provide concrete representation (bits)
      ● What if a check can’t be performed at compile-time?
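As a rough illustration of "enforce type rules" and "choose operations", the sketch below type-checks the right-hand side `b + c * d` and records which typed operation would be selected at each step. The tuple-based AST, the operation names such as `float_mul`, and the implicit int-to-float widening rule (loosely modeled on C) are all assumptions of this sketch.

```python
NUMERIC = ("int", "float")
OP_NAMES = {"+": "add", "*": "mul"}

def check(expr, env):
    """Return (type, ops): the expression's type and the operations chosen."""
    if isinstance(expr, str):            # a variable: look up its declared type
        return env[expr], []
    op, left, right = expr               # a binary node: (op, left, right)
    ltype, lops = check(left, env)
    rtype, rops = check(right, env)
    if ltype not in NUMERIC or rtype not in NUMERIC:
        raise TypeError(f"cannot apply {op} to {ltype} and {rtype}")
    ops = lops + rops
    if ltype == rtype:
        result = ltype
    else:                                # mixed int/float: widen the int operand
        result = "float"
        ops.append("int_to_float")
    ops.append(f"{result}_{OP_NAMES[op]}")
    return result, ops

expr = ("+", "b", ("*", "c", "d"))       # b + c * d

print(check(expr, {"b": "int", "c": "int", "d": "int"}))
# ('int', ['int_mul', 'int_add'])
print(check(expr, {"b": "int", "c": "int", "d": "float"}))
# ('float', ['int_to_float', 'float_mul', 'int_to_float', 'float_add'])
```

The same source expression yields different instruction choices depending only on the declared types, and an operand of a non-numeric type is rejected before any code is generated.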

  25. Strong vs weak typing
      ● A strongly typed language does not allow variables to be used in a way inconsistent with their types (no loopholes).
        – Example: Java.
      ● A weakly typed language allows many ways to bypass/violate the type system.
        – Classic example: C. How?
          ● Pointer arithmetic.
      ● C’s motto: just trust the programmer!
