A Simple Syntax-Directed Translator Compilers Design Sukree Sinthupinyo Department of Computer Engineering, Chulalongkorn University A Simple Syntax-Directed Translator – p. 1/34
Learning Objectives
• Understand the basic concepts of grammars
• Be able to write simple grammars
• Know the main parsing methods
• Understand the basic concepts of lexical analysis
Introduction
• To specify syntax, we use context-free grammars.
• For example, an if-else statement has the form
      if ( expression ) statement else statement
• This rule can be expressed as
      stmt → if ( expr ) stmt else stmt
• The arrow → is read as "can have the form".
• Such a rule is called a production.
• The keywords if and else and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of terminals and are called nonterminals.
Definition of Grammars
• A context-free grammar has four components:
  • A set of terminal symbols: the elementary symbols of the language defined by the grammar.
  • A set of nonterminals: each nonterminal represents a set of strings of terminals.
  • A set of productions: each production consists of a nonterminal called the head or left side of the production, an arrow, and a sequence of terminals and/or nonterminals called the body or right side.
  • A designation of one of the nonterminals as the start symbol.
Syntax Example
• Consider expressions such as 9 − 5 + 2, 3 − 1, or 7.
• We can refer to this kind of expression as "lists of digits separated by plus or minus signs."
• The productions are
      list → list + digit
      list → list − digit
      list → digit
      digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Syntax Example
• The bodies of the three productions with nonterminal list as head can equivalently be grouped:
      list → list + digit | list − digit | digit
• The terminals of the grammar are the symbols
      + − 0 1 2 3 4 5 6 7 8 9
• The nonterminals are list and digit.
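As a rough illustration, the sketch below checks whether a string belongs to the language of the list grammar. The name `is_list` is hypothetical, the ASCII hyphen stands in for the minus sign, and the grammar's left recursion is replaced by a loop; it is one possible recognizer, not something taken from the slides.

```python
DIGITS = "0123456789"

def is_list(s: str) -> bool:
    """Return True if s can be derived from `list` in the grammar
    list -> list + digit | list - digit | digit."""
    # Every list starts with a single digit (list -> digit).
    if not s or s[0] not in DIGITS:
        return False
    i = 1
    # Each further step consumes one operator followed by one digit
    # (list -> list + digit | list - digit).
    while i < len(s):
        if s[i] not in "+-" or i + 1 >= len(s) or s[i + 1] not in DIGITS:
            return False
        i += 2
    return True

print(is_list("9-5+2"))  # True
print(is_list("9-"))     # False
```

Strings such as 95 are rejected because the grammar only allows single digits separated by operators.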
Syntax Example
• Grammar for the list of parameters in a function call.
• For example: max(x,y) or System.out.println()
      call → id ( optparams )
      optparams → params | ε
      params → params , param | param
Parse Tree Example [figure]
Ambiguity
• A grammar is ambiguous if some string of terminals has more than one parse tree. For example, with the following grammar, a string such as 9 − 5 + 2 has two parse trees.
      string → string + string | string − string | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
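To make the ambiguity concrete, the following sketch enumerates every parse tree of a string under the grammar string → string + string | string − string | digit and evaluates each tree. The functions `parse_trees` and `value` are hypothetical helpers written for this illustration, and the ASCII hyphen stands in for the minus sign.

```python
def parse_trees(s):
    """Return all parse trees of s as nested (left, op, right) tuples."""
    # string -> digit
    if len(s) == 1 and s in "0123456789":
        return [s]
    trees = []
    # string -> string op string: try every operator as the root.
    for i in range(1, len(s) - 1):
        if s[i] in "+-":
            for left in parse_trees(s[:i]):
                for right in parse_trees(s[i + 1:]):
                    trees.append((left, s[i], right))
    return trees

def value(tree):
    """Evaluate a tree produced by parse_trees."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    return value(left) + value(right) if op == "+" else value(left) - value(right)

trees = parse_trees("9-5+2")
print(len(trees))                         # 2
print(sorted(value(t) for t in trees))    # [2, 6]
```

The two trees correspond to the groupings 9 − (5 + 2) and (9 − 5) + 2, which is why an ambiguous grammar cannot be used directly to assign a meaning to the expression.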
Associativity of Operators
• In an expression such as 9 + 5 + 2, the operand 5 has an operator on each side. By convention, 5 belongs to the operator on its left, so the expression is grouped as (9 + 5) + 2.
• We say that the operator + associates to the left.
• Addition, subtraction, multiplication, and division are left-associative.
Associativity of Operators
• In a = b = c, the operand b belongs to the = on its right, so the expression is grouped as a = (b = c).
• We say that the operator = is right-associative.
• This can be expressed by:
      right → letter = right | letter
      letter → a | b | . . . | z
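The difference between the two kinds of associativity shows up in the shape of the tree. The sketch below, with hypothetical function names `parse_left` and `parse_right`, builds nested tuples for the left-recursive list grammar and for the right-recursive grammar for assignments. It assumes single-character operands and the ASCII hyphen for minus.

```python
import string

def parse_left(s):
    """list -> list + digit | list - digit | digit (tree grows to the left)."""
    if not s or s[0] not in string.digits:
        raise SyntaxError("digit expected")
    tree = s[0]
    i = 1
    while i < len(s):
        if s[i] not in "+-" or i + 1 >= len(s) or s[i + 1] not in string.digits:
            raise SyntaxError("operator followed by digit expected")
        # The tree built so far becomes the LEFT child of the new operator.
        tree = (tree, s[i], s[i + 1])
        i += 2
    return tree

def parse_right(s):
    """right -> letter = right | letter (tree grows to the right)."""
    if not s or s[0] not in string.ascii_lowercase:
        raise SyntaxError("letter expected")
    if len(s) == 1:
        return s[0]
    if s[1] != "=":
        raise SyntaxError("'=' expected")
    # The rest of the input becomes the RIGHT child of this operator.
    return (s[0], "=", parse_right(s[2:]))

print(parse_left("9-5-2"))   # (('9', '-', '5'), '-', '2')
print(parse_right("a=b=c"))  # ('a', '=', ('b', '=', 'c'))
```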
Precedence of Operators
• Normally, ∗ and / have higher precedence than + and −.
• We use two nonterminals, expr and term, for the two levels of precedence, and factor for generating basic units in expressions.
      expr → expr + term | expr − term | term
      term → term ∗ factor | term / factor | factor
      factor → digit | ( expr )
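One way to see how the expr/term/factor grammar encodes precedence is a small evaluator with one method per nonterminal. This is a sketch under assumptions: the class `ExprParser` and function `evaluate` are hypothetical, the left-recursive productions are implemented as loops, ASCII `-` and `*` stand in for − and ∗, and division uses Python's `/` (so it yields floats).

```python
class ExprParser:
    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def expr(self):
        # expr -> expr + term | expr - term | term
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term -> term * factor | term / factor | factor
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor -> digit | ( expr )
        ch = self.peek()
        if ch is not None and ch in "0123456789":
            self.pos += 1
            return int(ch)
        if ch == "(":
            self.pos += 1
            value = self.expr()
            if self.peek() != ")":
                raise SyntaxError("')' expected")
            self.pos += 1
            return value
        raise SyntaxError(f"unexpected {ch!r}")

def evaluate(text):
    parser = ExprParser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return value

print(evaluate("9-5*2"))    # -1
print(evaluate("(9-5)*2"))  # 8
```

Because expr can only reach ∗ and / through term, a multiplication is always completed before the surrounding addition or subtraction is applied.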
Java Statements
• A subset of Java statements can be expressed by
      stmt → id = expression ;
           | if ( expression ) stmt
           | if ( expression ) stmt else stmt
           | while ( expression ) stmt
           | do stmt while ( expression ) ;
           | { stmts }
      stmts → stmts stmt | ε
Syntax-Directed Translation
• For the production
      expr → expr₁ + term
  we can translate expr by
  • translating expr₁
  • translating term
  • handling +
Postfix Notation
• The postfix notation for an expression E can be defined inductively as follows:
  • If E is a variable or constant, then the postfix notation for E is E itself.
  • If E is an expression of the form E₁ op E₂, where op is any binary operator, then the postfix notation for E is E′₁ E′₂ op, where E′₁ and E′₂ are the postfix notations for E₁ and E₂, respectively.
  • If E is a parenthesized expression of the form ( E₁ ), then the postfix notation for E is the same as the postfix notation for E₁.
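The inductive definition translates almost line for line into a recursive function. In the sketch below the representation is an assumption made for illustration: a variable or constant is a string, E₁ op E₂ is a 3-tuple, and a parenthesized expression ( E₁ ) is a 1-tuple. The function name `postfix` is hypothetical, and the ASCII hyphen stands in for the minus sign.

```python
def postfix(e):
    """Return the postfix notation of expression e as a string."""
    # Case 1: a variable or constant is its own postfix notation.
    if isinstance(e, str):
        return e
    # Case 3: ( E1 ) has the same postfix notation as E1.
    if len(e) == 1:
        return postfix(e[0])
    # Case 2: E1 op E2 becomes E1' E2' op.
    e1, op, e2 = e
    return postfix(e1) + postfix(e2) + op

# (9 - 5) + 2
print(postfix(((("9", "-", "5"),), "+", "2")))  # 95-2+
# 9 - (5 + 2)
print(postfix(("9", "-", (("5", "+", "2"),))))  # 952+-
```

No parentheses are needed in the output because the position of each operator already determines its operands.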
Synthesized Attributes
• We can associate quantities (attributes) with programming constructs through the grammar that describes them.
• We attach rules to the productions of the grammar; these rules describe how the attributes are computed at the nodes of the parse tree where the productions are used.
• A syntax-directed definition associates
  1. with each grammar symbol, a set of attributes, and
  2. with each production, a set of semantic rules for computing the values of the attributes associated with the symbols appearing in the production.
Synthesized Attributes
• An attribute is said to be synthesized if its value at a parse-tree node N is determined from attribute values at the children of N and at N itself.
Synthesized Attributes [figure]
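Translation Schemes
• A syntax-directed definition builds up a translation by attaching strings as attributes to the nodes in the parse tree.
• A translation scheme is another approach that produces the same translation. Instead of manipulating strings, it executes program fragments (semantic actions) embedded in the production bodies and emits the translation incrementally.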
Translation Schemes
      expr → expr₁ + term   { print('+') }
      expr → expr₁ − term   { print('−') }
      expr → term
      term → 0              { print('0') }
      term → 1              { print('1') }
      . . .
      term → 9              { print('9') }
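The scheme above can be sketched as a small translator in which each semantic action becomes an "emit" executed at the point where the action sits in the production body. The function `translate` is a hypothetical name, the left-recursive expr productions are implemented as a loop, output is collected in a list rather than printed, and the ASCII hyphen stands in for the minus sign.

```python
def translate(text):
    """Translate an infix list of digits into postfix by executing actions."""
    out = []
    pos = 0

    def term():
        nonlocal pos
        # term -> digit { print(digit) }
        if pos < len(text) and text[pos] in "0123456789":
            out.append(text[pos])
            pos += 1
        else:
            raise SyntaxError("digit expected")

    # expr -> term, then repeatedly: op term { print(op) }
    term()
    while pos < len(text) and text[pos] in "+-":
        op = text[pos]
        pos += 1
        term()
        out.append(op)  # the action runs after both operands are translated
    if pos != len(text):
        raise SyntaxError("unexpected trailing input")
    return "".join(out)

print(translate("9-5+2"))  # 95-2+
```

No parse tree and no string attributes are built; the output appears in the order in which the actions are executed.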
Parsing
• Most parsing methods fall into one of two classes: top-down and bottom-up methods.
• In top-down parsers, construction of the parse tree starts at the root and proceeds towards the leaves.
• In bottom-up parsers, construction starts at the leaves and proceeds towards the root.
Top-Down Parsing
• A top-down parser builds the parse tree by starting with the root, labeled with the starting nonterminal stmt, and repeatedly performing the following two steps:
  1. At node N, labeled with nonterminal A, select one of the productions for A and construct children at N for the symbols in the production body.
  2. Find the next node at which a subtree is to be constructed, typically the leftmost unexpanded nonterminal of the tree.
Top-Down Parsing
      stmt → expr ;
           | if ( expr ) stmt
           | for ( optexpr ; optexpr ; optexpr ) stmt
           | other
      optexpr → ε
              | expr
Top-Down Parsing [figure]
Predictive Parsing
• A simple form of top-down parsing
• We will study recursive-descent parsing
• It uses a set of recursive procedures, one for each nonterminal
• For example, the production
      stmt → for ( optexpr ; optexpr ; optexpr ) stmt
  can be handled by
      match(for); match('('); optexpr(); match(';'); optexpr(); match(';'); optexpr(); match(')'); stmt();
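A recursive-descent predictive parser for the statement grammar with productions stmt → expr ; | if ( expr ) stmt | for ( optexpr ; optexpr ; optexpr ) stmt | other and optexpr → ε | expr might look like the sketch below. It assumes the input is already a list of token strings and that expr and other are treated as terminals; the class `PredictiveParser` and function `parse` are hypothetical names.

```python
class PredictiveParser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def match(self, terminal):
        # Consume the lookahead if it is the expected terminal.
        if self.lookahead != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {self.lookahead!r}")
        self.pos += 1

    def stmt(self):
        # The lookahead alone decides which production to use.
        if self.lookahead == "expr":
            self.match("expr"); self.match(";")
        elif self.lookahead == "if":
            self.match("if"); self.match("("); self.match("expr")
            self.match(")"); self.stmt()
        elif self.lookahead == "for":
            self.match("for"); self.match("(")
            self.optexpr(); self.match(";")
            self.optexpr(); self.match(";")
            self.optexpr(); self.match(")")
            self.stmt()
        elif self.lookahead == "other":
            self.match("other")
        else:
            raise SyntaxError(f"unexpected {self.lookahead!r}")

    def optexpr(self):
        # optexpr -> expr | epsilon: do nothing if the lookahead is not expr.
        if self.lookahead == "expr":
            self.match("expr")

def parse(tokens):
    parser = PredictiveParser(tokens)
    parser.stmt()
    if parser.lookahead is not None:
        raise SyntaxError("unexpected trailing input")
    return True

print(parse("for ( ; expr ; expr ) other".split()))  # True
```

The ε-production is chosen by default: when no other production of optexpr applies, the procedure returns without consuming input.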
Predictive Parsing [figures]
Prefix to Postfix (1) [figure]
Prefix to Postfix (2) [figure]
Lexical Analysis
• A lexical analyzer reads characters from the input and groups them into "token objects."
• A sequence of input characters that comprises a single token is called a lexeme.
• The tasks of the lexical analyzer comprise:
  • Skipping white space and comments
  • Handling numbers
  • Handling reserved words and identifiers
  • Returning tokens
Remove White Space and Comments [figure]
Handle Constant
• When a sequence of digits appears in the input stream, the lexical analyzer passes to the parser a token consisting of the terminal num along with an integer-valued attribute computed from the digits.
• For example, the input 31 + 28 + 59 is transformed into the sequence
      <num, 31> <+> <num, 28> <+> <num, 59>
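A minimal sketch of this step, assuming tokens are represented as Python tuples (a representation chosen here for illustration) and using the hypothetical function name `tokenize`: white space is skipped, a run of digits is folded into one num token whose attribute is the integer value, and any other character becomes a token by itself.

```python
DIGITS = "0123456789"

def tokenize(text):
    """Group characters into ('num', value) tokens and single-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1                      # skip white space
        elif ch in DIGITS:
            value = 0
            # Accumulate the integer value digit by digit.
            while i < len(text) and text[i] in DIGITS:
                value = 10 * value + int(text[i])
                i += 1
            tokens.append(("num", value))
        else:
            tokens.append((ch,))        # e.g. ('+',)
            i += 1
    return tokens

print(tokenize("31 + 28 + 59"))
# [('num', 31), ('+',), ('num', 28), ('+',), ('num', 59)]
```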
Handle Constant [figure]
Recognizing Keywords and Identifiers
• For example, the input
      count = count + increment;
  is transformed into the sequence
      <id, "count"> <=> <id, "count"> <+> <id, "increment"> <;>
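Keywords and identifiers can be told apart by looking each lexeme up in a table of reserved words. The sketch below assumes a small illustrative keyword set (`KEYWORDS` is not taken from the slides), represents tokens as tuples, and uses the hypothetical function name `scan`; a lexeme is a letter followed by letters or digits.

```python
# Assumed reserved words, chosen only for this example.
KEYWORDS = {"if", "else", "while", "do", "for"}

def scan(text):
    """Return keyword tokens, ('id', lexeme) tokens, and single-character tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isalpha():
            start = i
            # Collect the whole lexeme: a letter followed by letters or digits.
            while i < len(text) and text[i].isalnum():
                i += 1
            lexeme = text[start:i]
            if lexeme in KEYWORDS:
                tokens.append((lexeme,))       # reserved word
            else:
                tokens.append(("id", lexeme))  # identifier with its lexeme
        else:
            tokens.append((ch,))
            i += 1
    return tokens

print(scan("count = count + increment;"))
# [('id', 'count'), ('=',), ('id', 'count'), ('+',), ('id', 'increment'), (';',)]
```

Because the table is consulted only after the whole lexeme has been read, an identifier such as iffy is not mistaken for the keyword if.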
Recognizing Keywords and Identifiers [figure]