Syntactical analysis

• Context-free grammars
• Derivations
• Parse trees
• Left-recursive grammars
• Top-down parsing
  • non-recursive predictive parsers
  • construction of parse tables
• Bottom-up parsing
  • shift/reduce parsers
  • LR parsers
• Generalized parsing
  • GLL parsers
  • GLR parsers
  • SGLR parsers

/ Faculteit Wiskunde en Informatica 22-9-2011

A context-free grammar is a 4-tuple G = (N, Σ, P, S):
1. N is a set of non-terminals
2. Σ is a set of terminals (disjoint from N)
3. P is a subset of (N ∪ Σ)* × N
   An element (α, A) ∈ P is called a production, written A ::= α or α → A
4. S ∈ N is the start symbol
The sets N, Σ, P are finite.

A context-free grammar can be considered as a simple rewrite system:
  βAγ ⇒ βαγ if α → A ∈ P (α, β, γ ∈ (N ∪ Σ)*, A ∈ N)

Example
N = { E }, Σ = { +, *, (, ), -, a }, S = E,
P = { E + E → E
      E * E → E
      ( E ) → E
      - E → E
      a → E }
Derivation: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)

The language L(G) generated by the context-free grammar G = (N, Σ, P, S) is:
  L(G) = { w ∈ Σ* | S ⇒+ w }
• A sentence w ∈ L(G) contains only terminals
• A sentential form α is a string of terminals and non-terminals which can be derived from S: S ⇒* α with α ∈ (N ∪ Σ)*
• A sentence in L(G) is a sentential form in which no non-terminals occur
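The rewrite-system view above can be tried out directly. The following is a minimal Python sketch, not part of the slides: it stores the example grammar as the 4-tuple (N, Σ, P, S), keeps each production as a pair (α, A) in the slides' α → A notation, and replays the derivation of -(a+a). The names `rewrite_leftmost`, `derive` and `is_sentence` are made up for this illustration, and always rewriting the leftmost non-terminal is an assumption chosen to keep the sketch short.

```python
# The example grammar G = (N, SIGMA, P, S).
# Each production is a pair (alpha, A), read as "alpha -> A".
N = {"E"}
SIGMA = {"+", "*", "(", ")", "-", "a"}
P = [
    (("E", "+", "E"), "E"),  # 0: E + E -> E
    (("E", "*", "E"), "E"),  # 1: E * E -> E
    (("(", "E", ")"), "E"),  # 2: ( E ) -> E
    (("-", "E"), "E"),       # 3: - E   -> E
    (("a",), "E"),           # 4: a     -> E
]
S = "E"


def rewrite_leftmost(form, production):
    """One rewrite step: replace the leftmost non-terminal A in form by alpha."""
    alpha, lhs = production
    for i, symbol in enumerate(form):
        if symbol in N:
            if symbol != lhs:
                raise ValueError(f"leftmost non-terminal is {symbol}, not {lhs}")
            return form[:i] + alpha + form[i + 1:]
    raise ValueError("no non-terminal left to rewrite")


def derive(production_indices):
    """Return all sentential forms of the derivation that starts at S."""
    form = (S,)
    steps = [form]
    for index in production_indices:
        form = rewrite_leftmost(form, P[index])
        steps.append(form)
    return steps


def is_sentence(form):
    """A sentence is a sentential form without non-terminals."""
    return all(symbol in SIGMA for symbol in form)


steps = derive([3, 2, 0, 4, 4])
print(" => ".join("".join(form) for form in steps))
# prints: E => -E => -(E) => -(E+E) => -(a+E) => -(a+a)
```

Only the last form in `steps` satisfies `is_sentence`; all earlier ones still contain E and are sentential forms only.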
A parse tree for a context-free grammar G = (N, Σ, P, S) is a tree:
1. The root is labeled with S (the start non-terminal)
2. Each leaf is labeled with a terminal (∈ Σ) or ε
3. All other nodes are labeled with a non-terminal
• If A is the label of a node and X1, …, Xn are the labels of the children (from left to right), then X1 … Xn → A must be a production rule in G (where each Xi is either a terminal or a non-terminal)
• Special case: ε → A, a node with label A which has exactly one child with label ε
• Also called a nullable non-terminal

Left/right derivations
There are choices to be made for each derivation step:
• which non-terminal must be replaced?
• which alternative of the selected non-terminal must be applied?
• Always selecting the leftmost non-terminal in the sentential form gives a leftmost derivation: ⇒lm
• There exists also a rightmost derivation: ⇒rm
• Consider the context-free grammar for expressions:
  • Leftmost derivation for -(a+a):
    E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
  • Rightmost derivation for -(a+a):
    E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+a) ⇒ -(a+a)

Example:
E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(a+E) ⇒ -(a+a)
[Figure: the parse tree for -(a+a), grown one derivation step at a time]
The parse tree abstracts from the derivation order.

Acceptor and parser
For each grammar G there exists a decision procedure (acceptor) AG for L(G):
  AG : STRING → { true, false }
such that AG(w) = true ⇔ w ∈ L(G)
A parser is an acceptor which constructs a parse tree as well.
• A top-down parser constructs the tree starting from the root
• A bottom-up parser constructs the tree starting from the leaves
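That a parse tree abstracts from the derivation order can be illustrated with a small Python sketch (an illustration under stated assumptions, not code from the slides). The tree for -(a+a) is built by hand, and both the leftmost and the rightmost derivation are read off from that one tree. `Node`, `derivation` and `frontier` are hypothetical names; a node without children is treated as a terminal leaf, and ε leaves are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)  # empty list: terminal leaf


def derivation(root, rightmost=False):
    """Read a derivation off the tree by repeatedly expanding the
    leftmost (or rightmost) inner node of the current sentential form."""
    form = [root]
    steps = ["".join(node.label for node in form)]
    while True:
        inner = [i for i, node in enumerate(form) if node.children]
        if not inner:
            return steps
        i = inner[-1] if rightmost else inner[0]
        form[i:i + 1] = form[i].children
        steps.append("".join(node.label for node in form))


def frontier(node):
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return node.label
    return "".join(frontier(child) for child in node.children)


# Parse tree of -(a+a) for the expression grammar of the slides.
left_a = Node("E", [Node("a")])
right_a = Node("E", [Node("a")])
plus = Node("E", [left_a, Node("+"), right_a])
paren = Node("E", [Node("("), plus, Node(")")])
tree = Node("E", [Node("-"), paren])

print(" => ".join(derivation(tree)))
print(" => ".join(derivation(tree, rightmost=True)))
```

Both calls walk the same tree; the two derivations differ only in the fifth sentential form, -(a+E) versus -(E+a).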
During parsing the following problems may occur:
• The grammar is ambiguous
• The grammar is left recursive
• The grammar contains cycles

A grammar G is ambiguous if one word w ∈ L(G) has at least two parse trees
• Expression grammar without associativities and priorities
• Dangling else problem

• A grammar is immediately left recursive if the grammar contains a rule of the form A α → A
• A grammar is left recursive if there exists a non-terminal A and a string α ∈ (N ∪ Σ)* such that A ⇒+ A α
• This means that after one or more steps in a derivation an occurrence of A reduces again to an occurrence of A without recognizing any terminal in the input sentence.

Examples of indirect left recursion
  B α → A
  A β → B
where … ⇏ ε
or worse
  B α → A
  D A β → B
  ε → D
  G → D

Elimination of left recursion
  A α → A
  β → A
where … ⇏ ε
produce the sentential forms: β αⁿ
A set of equivalent (non left recursive) rules are
  β A' → A
  α A' → A'
  ε → A'
It is relatively easy to remove left recursion from a grammar.
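The definition A ⇒+ A α can be checked mechanically. The sketch below is one possible Python formulation, not taken from the slides: it computes the nullable non-terminals, records which non-terminals can appear first in a derivation from A, and reports A as left recursive when A reaches itself. Rules are stored in the conventional direction (non-terminal mapped to its alternatives), the concrete terminals x, y, b and g merely stand in for the Greek strings of the slides, and all function names are hypothetical.

```python
def nullable_set(grammar):
    """Non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            if a in nullable:
                continue
            if any(all(x in nullable for x in alt) for alt in alternatives):
                nullable.add(a)
                changed = True
    return nullable


def left_recursive(grammar):
    """Return every non-terminal A with A =>+ A alpha."""
    nullable = nullable_set(grammar)
    # starts[a]: non-terminals that can stand first in one step from a,
    # looking past nullable symbols (this catches the "or worse" case).
    starts = {a: set() for a in grammar}
    for a, alternatives in grammar.items():
        for alt in alternatives:
            for x in alt:
                if x in grammar:
                    starts[a].add(x)
                if x not in nullable:
                    break
    result = set()
    for a in grammar:
        seen, todo = set(), list(starts[a])
        while todo:
            b = todo.pop()
            if b not in seen:
                seen.add(b)
                todo.extend(starts[b])
        if a in seen:
            result.add(a)
    return result


# Indirect left recursion: B x -> A, A y -> B (plus b -> B).
indirect = {"A": [("B", "x")], "B": [("A", "y"), ("b",)]}
# "Or worse": D A y -> B with a nullable D in front of A.
hidden = {"A": [("B", "x")], "B": [("D", "A", "y"), ("b",)], "D": [(), ("g",)]}

print(left_recursive(indirect))
print(left_recursive(hidden))
```

In both example grammars A and B are reported. In the second one the recursion is only visible once D is known to be nullable, which is why the nullable set is computed first.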
Example:
(1) E + T → E (immediate left rec.)
(2) T → E
(3) T * F → T (immediate left rec.)
(4) F → T
(5) ( E ) → F
(6) a → F
Applying the left recursion elimination transformation:
(1) E α → E (with α = + T)
(2) β → E (with β = T)

Example:
(1') T E' → E
(2') + T E' → E'
(2'') ε → E'
the same for:
(3') F T' → T
(4') * F T' → T'
(4'') ε → T'

Indirect left recursion elimination
• Suppose we have a rule of the form
    B α → A
  where B is defined by the rules
    β1 → B
    β2 → B
    …
    βn → B
• The rule B α → A is now transformed into:
    β1 α → A
    β2 α → A
    …
    βn α → A

This process is repeated until either
• t γ → A (with t a terminal); the process stops, or
• A γ → A; the immediate left recursion elimination rule can be applied
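The immediate left recursion transformation used in this example is short enough to write out. The following Python sketch is one possible formulation under stated assumptions: rules are stored as a mapping from a non-terminal to its alternatives, the fresh non-terminal is named by appending a prime and is assumed not to exist yet, and the degenerate rule A → A is assumed absent. `eliminate_immediate` is a hypothetical name.

```python
def eliminate_immediate(grammar, a):
    """Replace  A alpha -> A, beta -> A  by
    beta A' -> A, alpha A' -> A', eps -> A'  (returns a new grammar)."""
    alphas = [alt[1:] for alt in grammar[a] if alt and alt[0] == a]
    betas = [alt for alt in grammar[a] if not alt or alt[0] != a]
    if not alphas:
        return dict(grammar)  # a is not immediately left recursive
    fresh = a + "'"  # assumed to be unused in the grammar
    result = dict(grammar)
    result[a] = [beta + (fresh,) for beta in betas]
    result[fresh] = [alpha + (fresh,) for alpha in alphas] + [()]
    return result


# Rules (1)-(6) of the example; () would denote an epsilon alternative.
expression = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",)],
}

transformed = eliminate_immediate(eliminate_immediate(expression, "E"), "T")
for nonterminal, alternatives in transformed.items():
    for alt in alternatives:
        print(" ".join(alt) or "ε", "->", nonterminal)
```

For E the function finds α = + T and β = T, which yields exactly the three rules for E and E' listed above; T is handled the same way and F is left untouched.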
Left factorization
• In general it is efficient to move the difference between the alternatives of a non-terminal as far as possible to the left
• Productions of the form
    α β1 → A
    α β2 → A
    …
    α βn → A
• Are equivalent with
    α A' → A
    β1 → A'
    …
    βn → A'

Example
  if b then S else S → S
  if b then S → S
Only at the occurrence of else can it be decided which alternative should have been selected.
An equivalent grammar is
  if b then S S' → S
  else S → S'
  ε → S'

• Left recursion elimination and left factorization:
  • introduce new (extra) non-terminals
  • change the structure of the derivation tree
  • may influence semantic actions connected to grammar rules

Top-down parsing
• A top-down parser "guesses" the next alternative to be recognized, and verifies whether this alternative can be recognized in the input. If not, another alternative will be tried
• Constructs the parse tree starting at the root
• Finds the leftmost derivation of the sentence
• Alternative types of top-down parsers:
  • recursive descent parser with backtracking
  • recursive descent parser without backtracking ("predictive parser")
  • non-recursive predictive parser (uses push-down automaton)
  • generalized parser
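As an illustration of the "predictive parser" variant, here is a minimal recursive descent sketch in Python for the left-recursion-free expression grammar of the earlier example (T E' → E, + T E' → E', ε → E', F T' → T, * F T' → T', ε → T', ( E ) → F, a → F). It is an assumption-laden sketch rather than the course's implementation: tokens are single characters, the parse tree is a nested tuple, and `parse`, `accepts` and `ParseError` are hypothetical names. One token of lookahead is enough to choose an alternative, so no backtracking is needed.

```python
class ParseError(Exception):
    pass


def parse(text):
    """Return the parse tree of text as nested tuples, or raise ParseError."""
    tokens = list(text) + [None]  # None marks the end of the input
    pos = 0

    def peek():
        return tokens[pos]

    def eat(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise ParseError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1
        return expected

    def E():  # T E' -> E
        return ("E", T(), E_rest())

    def E_rest():  # + T E' -> E'   or   eps -> E'
        if peek() == "+":
            return ("E'", eat("+"), T(), E_rest())
        return ("E'", "ε")

    def T():  # F T' -> T
        return ("T", F(), T_rest())

    def T_rest():  # * F T' -> T'   or   eps -> T'
        if peek() == "*":
            return ("T'", eat("*"), F(), T_rest())
        return ("T'", "ε")

    def F():  # ( E ) -> F   or   a -> F
        if peek() == "(":
            return ("F", eat("("), E(), eat(")"))
        return ("F", eat("a"))

    tree = E()  # the tree is built from the root downwards
    eat(None)   # all input must have been consumed
    return tree


def accepts(text):
    """The acceptor: true exactly when text is a sentence of the grammar."""
    try:
        parse(text)
        return True
    except ParseError:
        return False


print(accepts("a+a*a"), accepts("a+"))
```

Each function corresponds to one non-terminal, and the original left-recursive rules could not be coded this way: a function for E + T → E would call itself before consuming any input.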