Description of a programming language
• Syntax
  – describes the structure of a language
  – given as grammatical rules
  – which streams of symbols (characters) form a legal program
• Syntax checking (syntax analysis, parsing)
  – construction of a parse tree
  – does the input program follow the grammatical rules
  – checking requires transforming the program from a character string into a stream of tokens (lexical analysis, scanning)
• Semantics
  – what is the meaning of a given legal program
  – which kind of computation does a legal program produce
Principles of programming languages, TUT Pervasive Computing, Maarit Harsu / Matti Rintala / Henri Hansen
Phases of compilation
• Compilation is usually divided into separate phases: easier, simpler, clearer
• The output of one phase is the input of the next
• The symbol table collects information on user-defined constructs (variables, functions, types, …)
Analyses (check-ups)
• Natural language levels:
  – lexical: "It iß seven o'clock."
  – syntactic: "It seven o'clock."
  – semantic: "It is thirty o'clock."
• Programming language levels:
    Analysis     Process
    lexical      scanning
    syntactic    parsing
    contextual   e.g. type checking
    semantic     code generation
Compilation process
    Source program
      → (characters) Lexical analyzer
      → (tokens) Syntax analyzer
      → (parse tree) Semantic analyzer
      → (parse tree, maybe the same) Intermediate code generator
      → (intermediate code) Optimization
      → Code generator
      → Executable target language (assembly)
The phases share the symbol table.
Interpretation process
    Source program + Input data → Interpreter → Results
Hybrid process
    Source program
      → Lexical analyzer
      → (tokens) Syntax analyzer
      → (parse tree) Intermediate code generator
      → (intermediate code) Interpreter, which also reads the Input data
      → Results
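CPython is one familiar instance of the hybrid process: source text is first translated to bytecode (the intermediate code), which a virtual machine then interprets. The sketch below only illustrates that split using the built-ins `compile` and `exec` and the standard `dis` module; the one-line program and the file name `<demo>` are made up for the example.

```python
import dis

source = "result = 6 * 7"

# Front end: source text -> code object holding bytecode (intermediate code).
code = compile(source, "<demo>", "exec")

# Back end: the interpreter executes the intermediate code.
namespace = {}
exec(code, namespace)
print(namespace["result"])

# Optional: list the bytecode instructions that were generated.
dis.dis(code)
```

The exact instructions printed by `dis.dis` vary between Python versions, so they are not shown here.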
Preprocessor
An example of macros (C code):
    #define MAX_LOOP 100
    /* Function-like macros: no space between the macro name and '('. */
    #define INCR(a) (a)++
    #define FOR_LOOP(var, from, to) \
        for (var = from; var <= to; INCR(var)) {
    #define END_FOR }
    #define NULL
    FOR_LOOP(n, 1, MAX_LOOP) NULL; END_FOR;
    /* expands to: for (n = 1; n <= 100; (n)++) { ; }; */
Lexical analysis
    characters (source code) → Lexer → list of tokens
• grammar: regular
• format: regular expressions
• implementation: finite state machine
Examples of regular expressions
Digits and letters
    <digit> → 0 | 1 | ... | 9
    <letter> → a | ... | z | A | ... | Z
Numbers (at least one digit)
    <unsigned int> → <digit> <digit>*
Identifiers
    <id> → <letter> | <id> <letter> | <id> <digit>
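The same definitions can be tried out with Python's `re` module. This is a minimal sketch, not part of the original slides: it assumes an unsigned integer needs at least one digit, and it writes the recursive `<id>` rule in the equivalent form "a letter followed by any number of letters or digits". The helper names are hypothetical.

```python
import re

DIGIT = r"[0-9]"
LETTER = r"[a-zA-Z]"

# One or more digits; the empty string is not accepted as a number.
UNSIGNED_INT = re.compile(rf"{DIGIT}+")
# A letter followed by any mix of letters and digits.
IDENTIFIER = re.compile(rf"{LETTER}(?:{LETTER}|{DIGIT})*")


def is_unsigned_int(text: str) -> bool:
    """True if the whole string matches <unsigned int>."""
    return UNSIGNED_INT.fullmatch(text) is not None


def is_identifier(text: str) -> bool:
    """True if the whole string matches <id>."""
    return IDENTIFIER.fullmatch(text) is not None
```

For example, `is_identifier("count2")` holds, while `is_identifier("2count")` does not, because an identifier must start with a letter.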
Lexical analysis
• Grouping of input characters
  – lexeme
    • a unit that can be detected from a program text
  – token
    • classification of lexemes
    • a name given to a lexeme
• Lexeme
  – terminal symbol
Example: index = 2 * count;
    Lexeme   Token
    index    identifier
    =        equal_sign
    2        int_literal
    *        mult_op
    count    identifier
    ;        semicolon
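A scanner for the example statement can be sketched with one regular expression per token class. The token names come from the table above; the patterns, the `skip` class for white space, and the function name `scan` are assumptions made for this illustration.

```python
import re

# (token name, pattern) pairs; order matters when patterns could overlap.
TOKEN_SPEC = [
    ("int_literal", r"[0-9]+"),
    ("identifier", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("semicolon", r";"),
    ("skip", r"\s+"),  # white space separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source: str) -> list[tuple[str, str]]:
    """Split source into (lexeme, token) pairs, left to right."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


for lexeme, token in scan("index = 2 * count;"):
    print(f"{lexeme:8}{token}")
```

A character that fits no pattern, such as `$`, is reported as a lexical error instead of being silently dropped.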
Lexemes/tokens
• Keywords
  – reserved words
• Identifiers
  – names chosen by the programmer
• Literals
  – constant values
• Operators
  – shorthand symbols for (e.g. arithmetic) functions
• Separators
  – characters and strings between language constructs
• Other things to be considered in lexical analysis:
  – comments
  – white spaces
  – indentations
Lexical analysis: identifying the lexemes (pattern matching)
Pascal code:
    program gcd ( input, output );
    var i, j: integer;
    begin
      read ( i, j );
      while i <> j do
        if i > j then i := i - j
        else j := j - i;
      writeln ( i )
    end.
Lexemes:
    program gcd ( input , output ) ; var i , j : integer ; begin read ( i , j ) ; while i <> j do if i > j then i := i - j else j := j - i ; writeln ( i ) end .
Syntactic analysis
    list of tokens → Parser → parse tree (the parser also uses the symbol table)
• grammar: context free
• format: BNF
• implementation: push-down (stack) automaton
Describing syntax
Context-free grammar: G = (N, T, P, S)
    N: set of nonterminals
    T: set of terminals
    P: set of productions (rules)
    S: start symbol, S ∈ N
The rules do not depend on the context in which they appear.
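The four components translate directly into a small data structure. The sketch below is only one possible encoding: the class name `Grammar`, the field names, and the well-formedness check are assumptions, and the sample grammar treats `integer` as a single terminal token for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset[str]                          # N
    terminals: frozenset[str]                             # T
    productions: tuple[tuple[str, tuple[str, ...]], ...]  # P as (lhs, rhs) pairs
    start: str                                            # S

    def is_well_formed(self) -> bool:
        """Check S ∈ N, N ∩ T = ∅, and that each rule has one nonterminal on the left."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and self.nonterminals.isdisjoint(self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


EXPR = Grammar(
    nonterminals=frozenset({"expr", "term", "factor"}),
    terminals=frozenset({"+", "-", "*", "/", "(", ")", "integer"}),
    productions=(
        ("expr", ("expr", "+", "term")),
        ("expr", ("expr", "-", "term")),
        ("expr", ("term",)),
        ("term", ("term", "*", "factor")),
        ("term", ("term", "/", "factor")),
        ("term", ("factor",)),
        ("factor", ("integer",)),
        ("factor", ("(", "expr", ")")),
    ),
    start="expr",
)
```

Having exactly one nonterminal on the left-hand side of every production is what makes the grammar context free: a rule can be applied wherever that nonterminal occurs.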
Examples of grammar rules (BNF)
Variable definition
    <variable def> ::= <identifier> <identifier> = <expr>
    (the first <identifier> is the type, the second is the name)
while-loop
    <iteration stmt> ::= while ( <expr> ) <stmt>
Statements
    <stmt> ::= <iteration stmt>
    <stmt> ::= <compound stmt>
    <compound stmt> ::= { <statement seq> }
    <statement seq> ::= <stmt>
    <statement seq> ::= <stmt> <statement seq>
Example of a grammar
Grammar rules in BNF:
    <expr> ::= <expr> + <term>
    <expr> ::= <expr> - <term>
    <expr> ::= <term>
    <term> ::= <term> * <factor>
    <term> ::= <term> / <factor>
    <term> ::= <factor>
    <factor> ::= <integer>
    <factor> ::= ( <expr> )
Nonterminals: expr, term, factor, integer
Terminals: (, ), +, -, *, / (and integer instances)
Start symbol: expr
Derivation of 2 * ( 12 + 3 )
Grammar:
    <expr> ::= <expr> + <term>
    <expr> ::= <expr> - <term>
    <expr> ::= <term>
    <term> ::= <term> * <factor>
    <term> ::= <term> / <factor>
    <term> ::= <factor>
    <factor> ::= <integer>
    <factor> ::= ( <expr> )
Derivation:
    <expr>
    ⇒ <term>
    ⇒ <term> * <factor>
    ⇒ <factor> * <factor>
    ⇒ <integer> * <factor>
    ⇒ 2 * <factor>
    ⇒ 2 * ( <expr> )
    ⇒ 2 * ( <expr> + <term> )
    ⇒ 2 * ( <term> + <term> )
    ⇒ 2 * ( <factor> + <factor> )
    ⇒ 2 * ( <integer> + <integer> )
    ⇒ 2 * ( 12 + 3 )
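A hand-written parser for this grammar can be sketched as a recursive-descent parser. Two assumptions are worth stating: recursive descent cannot use the left-recursive rules directly, so the repetition in `<expr>` and `<term>` is written as a loop that still groups to the left, and the result is a simplified tree of nested tuples rather than the full parse tree. All names (`tokenize`, `Parser`, `parse`) are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    # Simplification: characters outside the grammar are silently ignored.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens: list[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # <expr> ::= <expr> + <term> | <expr> - <term> | <term>
        tree = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            tree = (op, tree, self.term())  # left-associative grouping
        return tree

    def term(self):
        # <term> ::= <term> * <factor> | <term> / <factor> | <factor>
        tree = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            tree = (op, tree, self.factor())
        return tree

    def factor(self):
        # <factor> ::= <integer> | ( <expr> )
        token = self.take()
        if token == "(":
            tree = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return tree
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text: str):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("2 * ( 12 + 3 )"))
```

Each method corresponds to one nonterminal, so the call sequence during parsing mirrors the derivation steps.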
Derivation as a parse tree for 2 * ( 12 + 3 )
(E = expr, T = term, F = factor, I = integer)
    E
    └ T
      ├ T
      │ └ F
      │   └ I: 2
      ├ *
      └ F
        ├ (
        ├ E
        │ ├ E
        │ │ └ T
        │ │   └ F
        │ │     └ I: 12
        │ ├ +
        │ └ T
        │   └ F
        │     └ I: 3
        └ )
Ambiguous grammar
• There exist several parse trees for one input
  – the input can be derived in several ways
    <expr> ::= <expr> + <expr>
    <expr> ::= <expr> - <expr>
    <expr> ::= <expr> * <expr>
    <expr> ::= <expr> / <expr>
    <expr> ::= ( <expr> )
    <expr> ::= <integer>
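The effect of the ambiguity can be made concrete by enumerating every tree this grammar allows for a short input. The sketch below is an illustration under simplifying assumptions: the input is a flat list of integers and operators without parentheses, and the helper names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def parse_trees(tokens):
    """Yield every tree the ambiguous grammar allows for a flat token list."""
    if len(tokens) == 1:
        yield tokens[0]
        return
    # Operators sit at odd positions; any of them may be the root.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                yield (tokens[i], left, right)


def evaluate(tree):
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))


for tree in parse_trees([8, "-", 3, "-", 2]):
    print(tree, "=", evaluate(tree))
```

For `8 - 3 - 2` there are two trees: grouping as `8 - (3 - 2)` gives 7 and grouping as `(8 - 3) - 2` gives 3, so the grammar alone does not fix the meaning of the input.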
A real example of ambiguity: dangling else
Pascal code, two readings of the same token sequence:
    (a) else belongs to the inner if
        if E1 then
          if E2 then
            S1
          else
            S2
    (b) else belongs to the outer if
        if E1 then
          if E2 then
            S1
        else
          S2
Corresponding grammar:
    <stmt> ::= <if_stmt> | ...
    <if_stmt> ::= if <expr> then <stmt>
    <if_stmt> ::= if <expr> then <stmt> else <stmt>
"Dangling else" in parse trees
Tree 1 (else attached to the outer if):
    <if_stmt>: if <expr> then <stmt> else <stmt>
      └ the <stmt> after then is an <if_stmt>: if <expr> then <stmt>
Tree 2 (else attached to the inner if):
    <if_stmt>: if <expr> then <stmt>
      └ the <stmt> after then is an <if_stmt>: if <expr> then <stmt> else <stmt>
Grammar:
    <stmt> ::= <if_stmt> | ...
    <if_stmt> ::= if <expr> then <stmt>
    <if_stmt> ::= if <expr> then <stmt> else <stmt>
Solutions for the dangling else problem
• (The same problem exists in C)
• Semantic rule:
  – an else branch belongs to the latest condition that does not yet have an else branch
• The programmer can use compound statements to state the intended grouping explicitly
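In a recursive-descent parser the "latest condition" rule falls out of a greedy choice: whichever call is parsing the innermost if sees the else first and takes it. The sketch below assumes a toy, well-formed token list in which conditions and simple statements are single words such as `E1` and `S1`; the function name `parse_stmt` is hypothetical.

```python
def parse_stmt(tokens, pos=0):
    """Return (tree, next_pos); an else binds to the nearest if without one."""
    if tokens[pos] != "if":
        return tokens[pos], pos + 1  # a simple statement such as S1
    cond = tokens[pos + 1]
    if tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then'")
    then_branch, pos = parse_stmt(tokens, pos + 3)
    else_branch = None
    # Greedy choice: the innermost pending if claims the else.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_stmt(tokens, pos + 1)
    return ("if", cond, then_branch, else_branch), pos


tree, _ = parse_stmt("if E1 then if E2 then S1 else S2".split())
print(tree)
```

The resulting tree nests `S2` under the condition `E2`, and the outer if has no else branch, which matches the semantic rule stated above.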