Syntactical analysis
Faculteit Wiskunde en Informatica, 15-9-2010

• Context-free grammars
• Derivations
• Parse trees
• Left-recursive grammars
• Top-down parsing
  • non-recursive predictive parsers
  • construction of parse tables
• Bottom-up parsing
  • shift/reduce parsers
  • LR parsers
  • GLR parsers
  • SGLR parsers

A context-free grammar is a 4-tuple G = (N, Σ, P, S)
1. N is a set of non-terminals
2. Σ is a set of terminals (disjoint from N)
3. P is a subset of (N ∪ Σ)* × N
   An element (α, A) ∈ P is called a production, written A ::= α or α → A
4. S ∈ N is the start symbol
The sets N, Σ, P are finite

A context-free grammar can be considered as a simple rewrite system:
  β A γ ⇒ β α γ   if α → A ∈ P   (α, β, γ ∈ (N ∪ Σ)*, A ∈ N)

The language L(G) generated by the context-free grammar G = (N, Σ, P, S) is:
  L(G) = { w ∈ Σ* | S ⇒+ w }
A sentence w ∈ L(G) contains only terminals

A sentential form is a string of terminals and non-terminals which can be derived from S:
  S ⇒* α with α ∈ (N ∪ Σ)*
A sentence in L(G) is a sentential form in which no non-terminals occur

Example
N = { E }, Σ = { +, *, (, ), -, a }, S = E,
P = { E + E → E
      E * E → E
      ( E ) → E
      - E → E
      a → E }

Derivation: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
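The rewrite-system view of a grammar can be tried out directly. The following is a minimal sketch in Python, assuming every grammar symbol is a single character; the names PRODUCTIONS, one_step, is_derivation and is_sentence are illustrative and do not come from the slides. It stores the example grammar as (left-hand side, right-hand side) pairs and checks that each step of the derivation of -(a+a) applies exactly one production.

```python
# Productions of the example grammar as (lhs, rhs) pairs.
# Assumption of this sketch: every symbol is a single character.
PRODUCTIONS = [
    ("E", "E+E"),
    ("E", "E*E"),
    ("E", "(E)"),
    ("E", "-E"),
    ("E", "a"),
]
NONTERMINALS = {"E"}


def one_step(form):
    """Return every sentential form reachable from `form` in one rewrite step."""
    results = set()
    for i, symbol in enumerate(form):
        for lhs, rhs in PRODUCTIONS:
            if symbol == lhs:
                # Rewrite the occurrence at position i: beta A gamma => beta alpha gamma
                results.add(form[:i] + rhs + form[i + 1:])
    return results


def is_derivation(forms):
    """Check that each form follows from its predecessor by one production."""
    return all(b in one_step(a) for a, b in zip(forms, forms[1:]))


def is_sentence(form):
    """A sentence is a sentential form without non-terminals."""
    return not any(symbol in NONTERMINALS for symbol in form)


derivation = ["E", "-E", "-(E)", "-(E+E)", "-(a+E)", "-(a+a)"]
print(is_derivation(derivation), is_sentence(derivation[-1]))
```

Only the last form of the derivation is a sentence; all earlier forms still contain the non-terminal E.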
A parse tree for a context-free grammar G = (N, Σ, P, S) is a tree:
1. The root is labeled with S (the start non-terminal)
2. Each leaf is labeled with a terminal (∈ Σ) or ε
3. All other nodes are labeled with a non-terminal
• If A is the label of a node and X1,…,Xn are the labels of the children (from left to right) then
    X1 … Xn → A
  must be a production rule in G (where each Xi is either a terminal or a non-terminal)
• Special case: ε → A, a node with label A which has exactly one child with label ε

Left/right derivations
There are choices to be made for each derivation step:
• which non-terminal must be replaced?
• which alternative of the selected non-terminal must be applied?
• Always selecting the leftmost non-terminal in the sentential form gives a leftmost derivation: ⇒lm
• There exists also a rightmost derivation: ⇒rm
Consider the context-free grammar for expressions:
• Leftmost derivation for -(a+a)
    E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(a+E) ⇒lm -(a+a)
• Rightmost derivation for -(a+a)
    E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+a) ⇒rm -(a+a)

Example:
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
(Figure: one parse tree per derivation step, growing from the single node E to the complete tree for -(a+a).)
The parse tree abstracts from the derivation order

Acceptor and parser
For each grammar G there exists a decision procedure (acceptor) AG for L(G):
  AG : STRING → { true, false }
such that
  AG(w) = true ⇔ w ∈ L(G)
A parser is an acceptor which constructs a parse tree as well.
• A top-down parser constructs the tree starting from the root
• A bottom-up parser constructs the tree starting from the leaves
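That a parse tree abstracts from the derivation order can be made concrete by reading both derivations off one tree. The sketch below is an illustration in Python, not part of the slides: the tree encoding (a pair of label and child list, with terminals as plain strings) and the names TREE, is_parse_tree, tree_yield and derivation are assumptions of this example.

```python
# Productions of the expression grammar as (lhs, rhs) pairs.
PRODUCTIONS = {
    ("E", ("E", "+", "E")),
    ("E", ("E", "*", "E")),
    ("E", ("(", "E", ")")),
    ("E", ("-", "E")),
    ("E", ("a",)),
}

# An inner node is (label, children); a leaf is a plain string (a terminal).
TREE = ("E", ["-", ("E", ["(", ("E", [("E", ["a"]), "+", ("E", ["a"])]), ")"])])


def label(node):
    return node if isinstance(node, str) else node[0]


def is_parse_tree(node):
    """Every inner node with its children must correspond to a production."""
    if isinstance(node, str):
        return True
    lhs, children = node
    rhs = tuple(label(child) for child in children)
    return (lhs, rhs) in PRODUCTIONS and all(is_parse_tree(c) for c in children)


def tree_yield(node):
    """Concatenate the leaves from left to right."""
    if isinstance(node, str):
        return node
    return "".join(tree_yield(child) for child in node[1])


def derivation(tree, leftmost=True):
    """Read a leftmost or rightmost derivation off the tree."""
    form = [tree]  # current sentential form as a list of nodes
    steps = ["".join(label(n) for n in form)]
    while True:
        open_positions = [i for i, n in enumerate(form) if not isinstance(n, str)]
        if not open_positions:
            return steps
        i = open_positions[0] if leftmost else open_positions[-1]
        form[i:i + 1] = form[i][1]  # replace the node by its children
        steps.append("".join(label(n) for n in form))


print(derivation(TREE, leftmost=True))
print(derivation(TREE, leftmost=False))
```

Both calls walk the same tree; they differ only in the fourth step, where the leftmost variant passes through -(a+E) and the rightmost variant through -(E+a).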
During parsing the following problems may occur:
• The grammar is ambiguous
• The grammar is left recursive
• The grammar contains cycles
A grammar G is ambiguous if some word w ∈ L(G) has at least two parse trees
• Expression grammar without associativities and priorities
• Dangling else problem

• A grammar is immediate left recursive if the grammar contains a rule of the form
    A α → A
• A grammar is left recursive if there exists a non-terminal A and a string α ∈ (N ∪ Σ)* such that A ⇒+ A α
• This means that after one or more steps in a derivation an occurrence of A leads again to an occurrence of A in leftmost position, without any terminal of the input sentence having been recognized.

Examples of indirect left recursion
    B α → A
    A β → B
or worse
    B α → A
    D A β → B
    ε → D

Elimination of left recursion
The rules
    A α → A
    β → A
(where β does not start with A) produce the sentential forms: β αⁿ (n ≥ 0)
A set of equivalent (non left recursive) rules is
    β A' → A
    α A' → A'
    ε → A'
It is easy to remove left recursion from a context-free grammar
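The elimination of immediate left recursion is mechanical enough to sketch in a few lines. The Python function below is one possible rendering, assuming a rule set is given as a list of right-hand sides (each a list of symbols) for a single non-terminal, that the empty list stands for ε, and that no alternative is just the non-terminal itself. The function name and the primed name for the fresh non-terminal are choices of this sketch.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Turn  A ::= A alpha | beta  into  A ::= beta A',  A' ::= alpha A' | epsilon.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    The empty list [] stands for epsilon.
    """
    # The alpha parts: what follows the leading A in left recursive alternatives.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to eliminate
    fresh = nonterminal + "'"  # name of the new non-terminal A'
    return {
        nonterminal: [beta + [fresh] for beta in betas],
        fresh: [alpha + [fresh] for alpha in alphas] + [[]],
    }


# A ::= A x | y   (written  A x -> A,  y -> A  in the notation of the slides)
print(eliminate_immediate_left_recursion("A", [["A", "x"], ["y"]]))
```

The original rules generate y followed by any number of x, and so do the new ones, but A' now recurses on the right, so a top-down parser consumes a terminal before each recursive call.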
Example:
(1) E + T → E   (immediate left rec.)
(2) T → E
(3) T * F → T   (immediate left rec.)
(4) F → T
(5) ( E ) → F
(6) a → F

Applying the left recursion elimination transformation:
(1) E α → E   (with α = + T)
(2) β → E     (with β = T)
gives
(1')  T E' → E
(2')  + T E' → E'
(2'') ε → E'
and the same for (3) and (4):
(3')  F T' → T
(4')  * F T' → T'
(4'') ε → T'
Rules (5) and (6) remain unchanged.

Indirect left recursion elimination
• Suppose we have a rule of the form
    B α → A
  and the rules for B are
    β1 → B
    β2 → B
    …
    βn → B
• The rule B α → A is now transformed into:
    β1 α → A
    β2 α → A
    …
    βn α → A
This process is repeated until either
• t β → A (the rule starts with a terminal t); the process stops, or
• A β → A; the immediate left recursion elimination rule can be applied
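The substitution step for indirect left recursion can be illustrated as follows. This is a hedged sketch in Python: the grammar is assumed to be a dictionary from non-terminals to lists of right-hand sides, and both the function name substitute_leading and the small grammar over x, y, z, w are made up for the illustration.

```python
def substitute_leading(grammar, a, b):
    """Replace each alternative  B alpha  of A by  beta_1 alpha, ..., beta_n alpha,
    where beta_1 ... beta_n are the alternatives of B."""
    new_alternatives = []
    for alt in grammar[a]:
        if alt and alt[0] == b:
            alpha = alt[1:]
            new_alternatives.extend(beta + alpha for beta in grammar[b])
        else:
            new_alternatives.append(alt)
    result = dict(grammar)  # the rules of all other non-terminals are kept
    result[a] = new_alternatives
    return result


# Indirect left recursion: A ::= B x | z  and  B ::= A y | w
grammar = {
    "A": [["B", "x"], ["z"]],
    "B": [["A", "y"], ["w"]],
}
step = substitute_leading(grammar, "A", "B")
print(step["A"])
```

After the substitution the first alternative of A starts with A itself, so the recursion has become immediate and the transformation for immediate left recursion applies.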
Left factorization
• In general it is efficient to move the difference between the alternatives of a non-terminal as far as possible to the left
• Productions of the form
    α β1 → A
    α β2 → A
    …
    α βn → A
• are equivalent to
    α A' → A
    β1 → A'
    …
    βn → A'

Example
    if b then S else S → S
    if b then S → S
Only at the occurrence of else can it be decided which alternative should have been selected
An equivalent grammar is
    if b then S S' → S
    else S → S'
    ε → S'

Top-down parsing
• A top-down parser "guesses" the next alternative to be recognized, and verifies whether this alternative can be recognized in the input. If not, another alternative will be tried
• Constructs the parse tree starting at the root
• Finds the leftmost derivation of the sentence
• Alternative types of top-down parsers:
  • recursive descent parser with backtracking
  • recursive descent parser without backtracking ("predictive parser")
  • non-recursive predictive parser (uses push-down automaton)

Recursive descent parser with backtracking
• Grammar
    c A d → S
    a → A
    a b → A
• Parser
    bool proc S() {
      if input = 'c'
      then inptr +:= 1;
           if A()
           then if input = 'd'
                then inptr +:= 1; check for EOF; return(true)
                fi
           fi
      fi;
      return(false)
    }
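A runnable variant of the backtracking parser for this grammar might look as follows. This is a sketch in Python under stated assumptions, not a transcription of the procedure above: instead of returning a boolean, the hypothetical parse_A yields every input position at which an A can end, so that parse_S can fall back to the next alternative of A when the first one leads to a dead end. It only accepts or rejects and builds no parse tree.

```python
def parse_A(text, pos):
    """Yield every position at which an A starting at `pos` can end."""
    if text.startswith("a", pos):
        yield pos + 1  # alternative  a -> A
    if text.startswith("ab", pos):
        yield pos + 2  # alternative  a b -> A


def parse_S(text):
    """Accept exactly the sentences of  S ::= c A d."""
    if not text.startswith("c"):
        return False
    for end in parse_A(text, 1):
        # After A we need 'd' and then the end of the input (the EOF check).
        if text.startswith("d", end) and end + 1 == len(text):
            return True
        # Otherwise backtrack: the loop tries the next alternative of A.
    return False


for sentence in ["cad", "cabd", "cd", "cab"]:
    print(sentence, parse_S(sentence))
```

On the input cabd the first alternative of A consumes only a, the following d is missing, and the parser backtracks to the alternative a b, which succeeds. A parser in which A simply commits to its first matching alternative would reject this sentence.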