
Description Given as grammatical rules States what strings are - PowerPoint PPT Presentation



  1. Description of a language
     ● Syntax
       – Describes the structure of a program
       – Given as grammatical rules
       – States what strings are legitimate programs of the given language
     ● Syntax check (parsing)
       – Generating a parse tree
       – Does the input string correspond to the grammar of the language?
     ● Semantics
       – What is the meaning of a given legal program
       – What kind of computation does a legal program produce
       – Static semantics (compile time), "contextual" checking (types, scopes)
       – Run-time semantics (program operation)

  2. Phases of compilation
     ● Compilation is usually divided into separate phases: easier, simpler, clearer
     ● Output of the previous phase is input of the next
     ● Grammar of each phase defines the programming language
     ● Symbol table collects info on user-defined constructs (variables, functions, types, ...)

  3. Phases of code analysis
     ● If applied to natural language:
       – lexical analysis: ”Dog cha5es the c?a”
       – syntax analysis: ”Man blue drive car.”
       – semantic analysis: ”Car hides moon”
     ● Phases of compilation (analysis: process):
       – lexical: scanning
       – syntactic: parsing
       – contextual (semantic): type compatibility etc.
       – code generation

  4. Compilation
     [Figure: compilation pipeline]
     Source code
       → characters → Lexical analysis
       → tokens (maybe) → Syntactic analysis
       → parse tree → Semantic analysis
       → Intermediate codegen
       → intermediate code → Optimisation
       → intermediate code (maybe same) → Code generation
       → executable code in the target language (assembly, machine)
     Side structure: symbol table

  5. Interpretation
     [Figure: interpretation]
     Source code and inputs → Interpreter (with symbol table) → Results

  6. Hybrid process
     [Figure: hybrid compilation and interpretation]
     Source
       → Lexical analysis
       → tokens → Syntax analysis
       → parse tree → Intermediate code gen.
       → byte code → Interpreter (with input)
       → output

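  7. Precompiler: example on macros
       #define MAX_LOOP 100
       /* No space between the macro name and '(': with a space, INCR would be
          an object-like macro whose replacement text starts with "(a)". */
       #define INCR(a) (a)++
       /* Opens a counting loop; the block is closed by END_FOR. */
       #define FOR_LOOP(var, from, to) \
         for (var = from; var <= to; INCR(var)) {
       #define END_FOR }
     C-code:
       #define NULL   /* expands to nothing, so "NULL;" is an empty statement */
       FOR_LOOP(n, 1, MAX_LOOP)
         NULL;
       END_FOR;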

  8. Lexical analysis
     characters (source code) → Lexer → list of tokens
     Lexer:
     ● grammar: regular
     ● format: regular expressions
     ● (implementation: finite state machine)
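The "finite state machine" implementation mentioned on this slide can be sketched by hand. The following is a minimal illustration, not the lecture's own lexer: the state names (`start`, `in_ident`, `in_number`), the function name `scan`, and the restriction to whitespace-separated identifiers and unsigned integers are all assumptions made for the example.

```python
import string

LETTERS = string.ascii_letters
DIGITS = string.digits


def scan(text):
    """Turn characters into (token type, lexeme) pairs with an explicit state machine."""
    tokens = []
    state = "start"   # hypothetical state names: start, in_ident, in_number
    lexeme = ""
    # A trailing space makes sure the last lexeme is flushed.
    for ch in text + " ":
        if state == "start":
            if ch in LETTERS:
                state, lexeme = "in_ident", ch
            elif ch in DIGITS:
                state, lexeme = "in_number", ch
            elif not ch.isspace():
                raise ValueError(f"unexpected character {ch!r}")
        elif state == "in_ident":
            if ch in LETTERS or ch in DIGITS:
                lexeme += ch
            elif ch.isspace():
                tokens.append(("identifier", lexeme))
                state = "start"
            else:
                raise ValueError(f"unexpected character {ch!r}")
        elif state == "in_number":
            if ch in DIGITS:
                lexeme += ch
            elif ch.isspace():
                tokens.append(("integer", lexeme))
                state = "start"
            else:
                raise ValueError(f"unexpected character {ch!r}")
    return tokens
```

Each branch corresponds to one transition of the machine; a regular expression such as `[A-Za-z][A-Za-z0-9]*` describes the same language as the `start` to `in_ident` path.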

  9. Lexical analysis
     ● Grouping of input strings
       – lexeme
         • Some item in the input text
       – token
         • Classification of lexemes, output of lexical analysis
         • A "name/type" given to a lexeme
         • Tokens have a type (keyword IF, number literal, operator, etc.) and value (IF, 123, +, i.e. interpreted lexeme)
     ● Token
       – A terminal symbol in the language syntax, further syntactic structures are built on tokens
     Example: index = 2 * count;
       Lexeme   Token
       index    identifier: index
       =        assignment
       2        integer: 2
       *        mult. operator
       count    identifier: count
       ;        semicolon
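The lexeme/token split for `index = 2 * count;` can be reproduced with a small regex-driven tokenizer. This is a sketch under assumptions: the token names (`identifier`, `integer`, `assignment`, `mult_operator`, `semicolon`) follow the slide's table, while `TOKEN_SPEC`, `MASTER`, and `tokenize` are names invented for the example.

```python
import re

# (token type, pattern) pairs; names are illustrative, taken from the slide's table.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("integer", r"[0-9]+"),
    ("assignment", r"="),
    ("mult_operator", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),          # white space separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (lexeme, token type) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}: {text[pos]!r}")
        if match.lastgroup != "skip":
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


for lexeme, token_type in tokenize("index = 2 * count;"):
    print(f"{lexeme:<8}{token_type}")
```

The lexeme is the matched text, and the token type is the name of the alternative that matched, which is the classification the parser later works with.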

  10. Tokens
     ● Keywords: Reserved words
     ● Identifiers: Names chosen by the programmer
     ● Literals: Values for constants
     ● Operators: Arithmetic and similar operations
     ● Separators: Symbols and strings that separate language constructs
     ● Other things to consider in lexical analysis: comments, white space, indentation
     ● Grammar for tokens: regular expressions
       – [0-9][0-9]*
       – [A-Za-z][A-Za-z0-9]*
       – ".*"
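The three token patterns above can be tried directly with Python's `re` module. In this sketch the labels `integer`, `identifier`, and `string` and the helper `classify` are assumptions for illustration; the slide gives only the patterns.

```python
import re

# The three patterns from the slide, with assumed labels.
PATTERNS = [
    ("integer", re.compile(r"[0-9][0-9]*")),
    ("identifier", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("string", re.compile(r'".*"')),
]


def classify(lexeme):
    """Return the label of the first pattern matching the whole lexeme, else None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(lexeme):
            return name
    return None


for sample in ["123", "x1", '"hello"', "1x"]:
    print(repr(sample), classify(sample))
```

One caveat worth noting: `".*"` is greedy, so the whole text `"a" + "b"` matches as a single string. A practical lexer would more likely use something like `"[^"]*"` for string literals.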

  11. Regular expressions ("regexes")
     ● "normal" characters: keyword
     ● . = any (1) char: k..word
     ● [abc] = set of chars: k[aeu]yword
     ● [a-z] = range of chars: k[a-e]yword
     ● Above can be combined: k[a-e0-9]y
     ● * = any number of previous (incl. zero): ke*yw[o-u]*rd.*
     ● () = group chars together: key(word)*
     ● ? = 0 or 1 of previous: keyword[0-9]?
     ● + = at least 1 of prev: key(word)+
     ● | = alternative regexes: key(word|phrase)
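Each operator above can be checked against a string that should match and one that should not. The patterns are the slide's own; the sample strings and the helper `matches` are made up for this sketch.

```python
import re


def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# pattern -> (string that matches, string that does not); sample strings are illustrative.
EXAMPLES = {
    r"keyword": ("keyword", "keywords"),
    r"k..word": ("keyword", "kword"),
    r"k[aeu]yword": ("kuyword", "kiyword"),
    r"k[a-e]yword": ("kbyword", "kfyword"),
    r"k[a-e0-9]y": ("k7y", "kzy"),
    r"ke*yw[o-u]*rd.*": ("kywrd", "keywxrd"),
    r"key(word)*": ("keywordword", "keywor"),
    r"keyword[0-9]?": ("keyword7", "keyword77"),
    r"key(word)+": ("keyword", "key"),
    r"key(word|phrase)": ("keyphrase", "key"),
}

for pattern, (good, bad) in EXAMPLES.items():
    print(f"{pattern:<20} {good!r}: {matches(pattern, good)}  {bad!r}: {matches(pattern, bad)}")
```

`re.fullmatch` is used so that the pattern must cover the entire string; `re.search` would also accept matches inside a longer string.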

  12. Lexical analysis: identifying the lexemes
     Pascal-code:
       program gcd ( input, output );
       var i, j: integer;
       begin
         read ( i, j );
         while i <> j do
           if i > j then i := i - j
           else j := j - i;
         writeln ( i )
       end .
     Tokens:
       kw-program, ident( gcd ), lparen, ident( input ), comma, ident( output ), rparen, semicolon,
       kw-var, ident( i ), comma, ident( j ), colon, ident( integer ), semicolon,
       kw-begin,
       ident( read ), lparen, ident( i ), comma, ident( j ), rparen, semicolon,
       kw-while, ident( i ), noteq, ident( j ), kw-do,
       kw-if, ident( i ), greater, ident( j ), kw-then, ident( i ), assign, ident( i ), minus, ident( j ),
       kw-else, ident( j ), assign, ident( j ), minus, ident( i ), semicolon,
       ident( writeln ), lparen, ident( i ), rparen,
       kw-end, fullstop
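The token list for the gcd program can be regenerated mechanically. The sketch below is an illustration only: it covers just the lexemes that occur in this one program, treats keywords as case-sensitive (real Pascal does not), and the names `KEYWORDS`, `TOKEN_SPEC`, `SOURCE`, and `tokenize` are assumptions rather than part of the slides.

```python
import re

# Reserved words that appear in the sample; "integer", "read" and "writeln"
# are ordinary identifiers, as in the slide's token list.
KEYWORDS = {"program", "var", "begin", "end", "while", "do", "if", "then", "else"}

# Order matters: ":=" must be tried before ":".
TOKEN_SPEC = [
    ("word", r"[A-Za-z][A-Za-z0-9]*"),
    ("assign", r":="),
    ("noteq", r"<>"),
    ("greater", r">"),
    ("minus", r"-"),
    ("lparen", r"\("),
    ("rparen", r"\)"),
    ("comma", r","),
    ("semicolon", r";"),
    ("colon", r":"),
    ("fullstop", r"\."),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

SOURCE = """\
program gcd ( input, output );
var i, j: integer;
begin
  read ( i, j );
  while i <> j do
    if i > j then i := i - j
    else j := j - i;
  writeln ( i )
end .
"""


def tokenize(text):
    """Return token names in the slide's notation, e.g. 'kw-if' or 'ident( i )'."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}: {text[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        if kind == "word":
            # Keywords and identifiers share one pattern; a lookup separates them.
            tokens.append(f"kw-{lexeme}" if lexeme in KEYWORDS else f"ident( {lexeme} )")
        elif kind != "skip":
            tokens.append(kind)
        pos = match.end()
    return tokens


print(", ".join(tokenize(SOURCE)))
```

Matching keywords with the identifier pattern and then checking a keyword table is a common design choice, since it avoids one regex alternative per reserved word.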
