CSCI 3136 Principles of Programming Languages
Syntactic Analysis and Context-Free Grammars - 2
Summer 2013
Faculty of Computer Science, Dalhousie University
Language defined by a CFG
L(G) = { w ∈ Σ* | S ⇒* w }
If G is a grammar, the language of the grammar, denoted L(G), is the set of terminal strings that have derivations from the start symbol S.
• Languages defined by CFGs are called context-free languages.
• Example: Our 'Jim-ate-cheese' grammar generates the following language:
  L(G) = { "Jim ate cheese", "Jim ate Jim", "cheese ate cheese", "cheese ate Jim", "big Jim ate cheese", "big Jim ate Jim", "big cheese ate cheese", ... }
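The definition of L(G) can be explored with a minimal sketch that enumerates short sentences by breadth-first search over leftmost derivations. The grammar is assumed to be the one shown on the Parse Trees slide; the dictionary layout and the names GRAMMAR and language are illustrative only. Pruning by length is safe here only because no production shrinks a sentential form (there are no ε-productions).

```python
from collections import deque

# Assumed 'Jim-ate-cheese' grammar (taken from the Parse Trees slide).
# Keys are non-terminals; every other symbol is a terminal.
GRAMMAR = {
    "S": [["P", "V", "P"]],
    "P": [["N"], ["A", "P"]],
    "A": [["big"], ["green"]],
    "N": [["cheese"], ["Jim"]],
    "V": [["ate"]],
}

def language(grammar, start, max_len):
    """Return every terminal string of at most max_len symbols with S =>* w."""
    results = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, or None if the form is all terminals.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            results.add(" ".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            # Sentential forms never shrink in this grammar, so long ones can be dropped.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results

for sentence in sorted(language(GRAMMAR, "S", 3)):
    print(sentence)
```

Raising max_len to 4 adds the sentences with one adjective, such as "big Jim ate cheese". The language itself is infinite, so only a bounded slice can ever be listed.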
CF Languages: Example (1)
What language is defined by the following CFG?
  S → ε
  S → 0 S 1
G = ({S}, {0, 1}, {S → 0 S 1 | ε}, S)
• Is ε in L(G)?
• Is 01 in L(G)?
• Is 0011 in L(G)?
• Is 0^n 1^n in L(G)?
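One way to experiment with the membership questions above is a recursive check that mirrors the two productions directly. This is a sketch for this particular grammar only, not a general CFG parser; the function name derives is illustrative.

```python
def derives(w: str) -> bool:
    """Return True if S =>* w for the grammar S -> 0 S 1 | ε."""
    if w == "":
        return True                      # S -> ε
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1":
        return derives(w[1:-1])          # S -> 0 S 1, then derive the middle part
    return False                         # no production can produce w

for candidate in ["", "01", "0011", "010", "10"]:
    print(repr(candidate), derives(candidate))
```

Each successful recursive call corresponds to one derivation step, so the recursion depth for an accepted string equals the number of times S → 0 S 1 is applied.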
CF Languages: Example (2)
What language is defined by the following CFG?
  S → ε
  S → 0 S 0
  S → 1 S 1
Neither this language nor the one in Example (1) is regular.
Every regular language is context-free, but not every context-free language is regular.
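To get a feel for this grammar, the sketch below generates every string derivable with at most k applications of the two recursive productions. The function name generate and the bound k are illustrative assumptions, not part of the slides.

```python
def generate(k: int) -> set:
    """Strings derivable from S using at most k applications of
    S -> 0 S 0 or S -> 1 S 1, followed by S -> ε."""
    if k == 0:
        return {""}                              # only S -> ε is allowed
    shorter = generate(k - 1)
    wrapped = {c + s + c for c in "01" for s in shorter}
    return {""} | wrapped                        # stop now, or wrap a shorter string

for s in sorted(generate(2), key=lambda t: (len(t), t)):
    print(repr(s))
```

Inspecting the printed strings for small k is a reasonable way to form a conjecture about the language before trying to prove it.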
Classes of the Chomsky Hierarchy
Regular grammar (1)
A right regular grammar (or right linear grammar) is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms:
1. B → a, where B ∈ N and a ∈ Σ
2. B → aC, where B, C ∈ N and a ∈ Σ
3. B → ε, where B ∈ N and ε denotes the empty string
In a left regular grammar (or left linear grammar), all rules obey the forms:
1. B → a, where B ∈ N and a ∈ Σ
2. B → Ca, where B, C ∈ N and a ∈ Σ
3. B → ε, where B ∈ N and ε denotes the empty string
A (context-free) grammar is regular if it is a left or right regular grammar.
Regular grammars are too weak to express programming languages: for example, they cannot describe arbitrarily deep nesting of balanced parentheses.
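The three rule forms can be checked mechanically. The sketch below assumes a grammar is given as a set N of non-terminals, a set sigma of terminals, and a list P of (left-hand side, right-hand side) pairs with the right-hand side as a tuple of symbols; this encoding and the function names are illustrative choices.

```python
def is_right_regular(N, sigma, P):
    """True if every rule has the form B -> a, B -> aC, or B -> ε."""
    for lhs, rhs in P:
        ok = lhs in N and (
            len(rhs) == 0                                            # B -> ε
            or (len(rhs) == 1 and rhs[0] in sigma)                   # B -> a
            or (len(rhs) == 2 and rhs[0] in sigma and rhs[1] in N)   # B -> aC
        )
        if not ok:
            return False
    return True

def is_left_regular(N, sigma, P):
    """True if every rule has the form B -> a, B -> Ca, or B -> ε."""
    for lhs, rhs in P:
        ok = lhs in N and (
            len(rhs) == 0                                            # B -> ε
            or (len(rhs) == 1 and rhs[0] in sigma)                   # B -> a
            or (len(rhs) == 2 and rhs[0] in N and rhs[1] in sigma)   # B -> Ca
        )
        if not ok:
            return False
    return True

def is_regular(N, sigma, P):
    """A grammar is regular if it is left regular or right regular."""
    return is_right_regular(N, sigma, P) or is_left_regular(N, sigma, P)

# Hypothetical right regular grammar: S -> 0 S | 1
print(is_regular({"S"}, {"0", "1"}, [("S", ("0", "S")), ("S", ("1",))]))
# The grammar S -> 0 S 1 | ε from Example (1) has a rule with three symbols.
print(is_regular({"S"}, {"0", "1"}, [("S", ("0", "S", "1")), ("S", ())]))
```

Note that this tests the form of a grammar, not the language: a grammar that fails the check may still generate a regular language, because some other grammar for the same language could be regular.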
Regular grammar (2)
Is the following grammar (with start non-terminal S) regular or context-free?
  S → 0 S
  S → 1 A
  A → ε
  A → 2 A
If it is a regular grammar, what is the equivalent regular expression? Otherwise, write down the context-free language defined by the above grammar.
Regular grammar (3)
What language is defined by the following grammar (with start non-terminal S)?
  S → 0 A
  A → S 1
  S → ε
Is the language defined by the above grammar regular or context-free? Prove your answer.
Parse Trees
Every derivation can be represented by a parse tree:

Grammar:
  S → P V P
  P → N
  P → A P
  A → big | green
  N → cheese | Jim
  V → ate

Derivation:
  S ⇒ P V P ⇒ N V P ⇒ N V N ⇒ Jim V N ⇒ Jim ate N ⇒ Jim ate cheese

Parse tree:
          S
       /  |  \
      P   V   P
      |   |   |
      N  ate  N
      |       |
     Jim    cheese

• The root is the start symbol S.
• The children of each node are the symbols (terminals and non-terminals) it is replaced with.
• The internal nodes are non-terminals. The leaves are terminals; read from left to right, they form the yield of the parse tree.
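A parse tree maps naturally onto a nested data structure. In the sketch below, an internal node is a pair (symbol, children) and a leaf is a plain string; this encoding and the helper names are assumptions made for illustration.

```python
# Grammar from this slide: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["P", "V", "P"]],
    "P": [["N"], ["A", "P"]],
    "A": [["big"], ["green"]],
    "N": [["cheese"], ["Jim"]],
    "V": [["ate"]],
}

# Parse tree for "Jim ate cheese".
TREE = ("S", [
    ("P", [("N", ["Jim"])]),
    ("V", ["ate"]),
    ("P", [("N", ["cheese"])]),
])

def label(node):
    """Symbol at a node: the string itself for a leaf, else the first component."""
    return node if isinstance(node, str) else node[0]

def tree_yield(node):
    """Leaves of the tree, read from left to right."""
    if isinstance(node, str):
        return [node]
    leaves = []
    for child in node[1]:
        leaves.extend(tree_yield(child))
    return leaves

def is_parse_tree(node, grammar):
    """Check that leaves are terminals and every internal node applies a production."""
    if isinstance(node, str):
        return node not in grammar
    symbol, children = node
    rhs = [label(child) for child in children]
    return rhs in grammar.get(symbol, []) and all(
        is_parse_tree(child, grammar) for child in children
    )

print(" ".join(tree_yield(TREE)))
print(is_parse_tree(TREE, GRAMMAR))
```

The check in is_parse_tree restates the second bullet: the children of a node must spell out the right-hand side of some production for that node's symbol.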
Parse Tree (another example)

Grammar:
  S → P V P
  P → N
  P → A P
  A → big | green
  N → cheese | Jim
  V → ate

Derivation:
  S ⇒ P V P ⇒ A P V P ⇒ big P V P ⇒ big N V P ⇒ big Jim V P ⇒ big Jim ate P
    ⇒ big Jim ate A P ⇒ big Jim ate green P ⇒ big Jim ate green N
    ⇒ big Jim ate green cheese

Parse tree:
               S
        /      |      \
       P       V       P
      / \      |      / \
     A   P    ate    A   P
     |   |           |   |
    big  N         green N
         |               |
        Jim            cheese
Ambiguity: Example
A CFG for arithmetic expressions:
  E → n
  E → i
  E → E + E
  E → E − E
  E → E ∗ E
  E → E / E
  E → ( E )

Two parse trees for 2 + 3 ∗ 4:

          E                      E
       /  |  \                /  |  \
      E   ∗   E              E   +   E
    / | \     |              |     / | \
   E  +  E    4              2    E  ∗  E
   |     |                        |     |
   2     3                        3     4

The tree on the left violates precedence rules.
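The practical consequence of the two trees can be seen by evaluating them. In this sketch a tree is either a number or a triple (left, operator, right); the encoding and names are illustrative, and only the binary-operator productions are modelled.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(tree):
    """Evaluate a parse tree bottom-up: operands first, then the operator."""
    if isinstance(tree, (int, float)):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))

times_at_root = ((2, "+", 3), "*", 4)   # groups as (2 + 3) * 4
plus_at_root = (2, "+", (3, "*", 4))    # groups as 2 + (3 * 4)

print(evaluate(times_at_root))  # 20
print(evaluate(plus_at_root))   # 14
```

Both trees have the same yield, 2 + 3 ∗ 4, yet they evaluate to different values, which is why the choice of parse tree matters.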
Ambiguity
• There may be more than one parse tree for the same sentence generated by a grammar G. If this is the case, we call G ambiguous.
• Problem with ambiguous grammars: a single sentence can have different interpretations.
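For a small grammar, ambiguity of a particular sentence can be demonstrated by counting its parse trees. The sketch below assumes the expression grammar restricted to E → E op E | n and a parenthesis-free token list; the function name count_parse_trees is illustrative. A count above 1 for any sentence shows that the grammar is ambiguous.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*", "/"}

def count_parse_trees(tokens):
    """Number of parse trees for tokens under E -> E op E | n (no parentheses)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Ways to derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return 0 if tokens[lo] in OPERATORS else 1    # E -> n
        total = 0
        for i in range(lo + 1, hi - 1):
            if tokens[i] in OPERATORS:                    # E -> E op E, root operator at i
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees("2 + 3 * 4".split()))      # 2
print(count_parse_trees("2 + 3 * 4 - 5".split()))  # 5
```

Each choice of root operator splits the sentence into a left and a right operand, so the counts multiply; memoising on (lo, hi) keeps the computation polynomial in the sentence length.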
Some facts about multiple representations
• There are infinitely many context-free grammars generating a given context-free language. (How?)
• There may be more than one parse tree for the same sentence generated by a grammar.
• For programming languages, we require that the CFG be unambiguous.