Introduction Context-Free Grammars Radboud University Nijmegen Regular Grammars The CYK Algorithm Formal Languages, Grammars and Automata Lecture 5 Helle Hvid Hansen helle@cs.ru.nl http://www.cs.ru.nl/~helle/ Foundations Group – Intelligent Systems Section Institute for Computing and Information Sciences Radboud University Nijmegen 6 June 2014 Helle Hvid Hansen 6 June 2014 FLGA 1 / 19

Midterm Survey Results

Lecture level      Count      Exercise level      Count
0 (very slow)      0          0 (very easy)       0
1                  4          1                   3
2 (good)           7          2 (appropriate)     9
3                  5          3                   6
4 (very fast)      3          4 (very hard)       0
average: 2.4                  average: 2.2

Comments:
• Sometimes too fast, sometimes too slow (3 students).
• Solutions online (2 students).

Overview
• Applications of finite automata: text search, natural language processing, lexical analysis (parsing), biology, video games (PacMan), internet protocols (TCP), ... (see notes on webpage).
• Programming languages (like Java etc.) and natural languages are generally not regular.
Formal languages: grammars (generators) and automata (acceptors).

Today
Topics:
• Context-free grammars and context-free languages
• Regular grammars
Motivation/Application: compilation of programming languages.

Compilation and Parsing
source code (ASCII) → [lexical analysis (DFA)] → token string (context-free language) → [parsing] → parse tree (data structure) → [code generation] → executable code

Generating Strings with Production Rules
Example:
  S → O | E
  E → λ | aEa | bEb
  O → a | b | aOa | bOb
Derivations always start with S:
  S → E → aEa → abEba → abba
  S → E → bEb → baEab → babEbab → babaEabab → babaabab
  S → O → bOb → bab
  S → O → bOb → baOab → babab
We can generate exactly the set of words w with w = w^R (w is a palindrome).
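
The palindrome claim can be checked by brute force for short words. The following is a minimal sketch, not part of the lecture: the names RULES, generate and palindromes are assumptions made for illustration, the empty string stands for λ, and non-terminals are single upper-case letters.

```python
from itertools import product

# Production rules of the example grammar; "" stands for the empty word λ.
RULES = {
    "S": ["O", "E"],
    "E": ["", "aEa", "bEb"],
    "O": ["a", "b", "aOa", "bOb"],
}

def generate(max_len):
    """Return all terminal words of length <= max_len derivable from S."""
    words = set()
    todo = ["S"]
    seen = {"S"}
    while todo:
        form = todo.pop()
        # Position of the leftmost non-terminal, or None if form is a word.
        pos = next((i for i, c in enumerate(form) if c in RULES), None)
        if pos is None:
            words.add(form)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so forms with too many can be dropped.
            terminals = sum(1 for c in new if c not in RULES)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                todo.append(new)
    return words

def palindromes(max_len):
    """Brute-force reference: all w over {a, b} with w equal to its reverse."""
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product("ab", repeat=n)
        if t == t[::-1]
    }

print(generate(5) == palindromes(5))
```

The comparison only covers words up to the chosen length, so it supports the claim without proving it.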

Context-Free Grammar
Def. A context-free grammar (CFG) G = (V, Σ, S, P) consists of
  V   a set of non-terminal symbols
  Σ   a set of terminal symbols
  S   a start symbol, S ∈ V
  P   a set of production rules of the form X → w, where X ∈ V, w ∈ (V ∪ Σ)*
Notation (Backus-Naur Form or BNF): group together rules for the same non-terminal:
  E → λ | aEa | bEb
is shorthand for three rules:
  E → λ,  E → aEa,  E → bEb
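
One possible way to hold such a 4-tuple in a program is sketched below. The class CFG, its field names, and the helper expand_bnf are hypothetical names chosen for this illustration. The check that V and Σ are disjoint is an extra assumption that the definition above does not state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # V
    terminals: frozenset      # Σ
    start: str                # S
    productions: tuple        # P, as pairs (X, w) with w a tuple of symbols

    def is_well_formed(self):
        """Check S ∈ V, V and Σ disjoint, and X ∈ V, w ∈ (V ∪ Σ)* for each rule."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )

def expand_bnf(lhs, alternatives):
    """Turn the shorthand X → w1 | w2 | ... into separate rules (X, w1), (X, w2), ..."""
    return tuple((lhs, tuple(alt)) for alt in alternatives)

# The palindrome grammar; "" encodes the empty right-hand side λ.
palindrome_grammar = CFG(
    nonterminals=frozenset("SEO"),
    terminals=frozenset("ab"),
    start="S",
    productions=(
        expand_bnf("S", ["O", "E"])
        + expand_bnf("E", ["", "aEa", "bEb"])
        + expand_bnf("O", ["a", "b", "aOa", "bOb"])
    ),
)

print(palindrome_grammar.is_well_formed(), len(palindrome_grammar.productions))
```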

Derivations and Language
Let G = (V, Σ, S, P) be a context-free grammar, and let u, v, w ∈ (V ∪ Σ)* be arbitrary.
• Given a string uXv ∈ (V ∪ Σ)*, we can apply a rule X → w:
    uXv → uwv
• A derivation of u from v is a sequence of rule applications:
    v → v′ → v′′ → · · · → u
  We say that u can be derived from v in G if there is a derivation of u from v in G, and write v ⇒ u.
• Why "context-free"? A rule X → w can be applied in any context to replace X with w. (There are also context-sensitive grammars.)
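
A single rule application uXv → uwv amounts to a string replacement at one position. The sketch below uses the palindrome grammar from the earlier example; the function names one_step and is_derivation are assumptions made for illustration. The surrounding context u and v is copied unchanged and never inspected, which is the sense in which the rules are context-free.

```python
# Palindrome grammar from the earlier example; "" stands for λ.
RULES = {
    "S": ["O", "E"],
    "E": ["", "aEa", "bEb"],
    "O": ["a", "b", "aOa", "bOb"],
}

def one_step(form, rules):
    """All strings uwv obtainable from form = uXv by applying one rule X → w."""
    results = []
    for i, symbol in enumerate(form):
        # Only non-terminals have rules; u = form[:i] and v = form[i + 1:].
        for rhs in rules.get(symbol, []):
            results.append(form[:i] + rhs + form[i + 1:])
    return results

def is_derivation(steps, rules):
    """Check that each string in steps follows from the previous one by one rule."""
    return all(b in one_step(a, rules) for a, b in zip(steps, steps[1:]))

print(one_step("aEa", RULES))
print(is_derivation(["S", "E", "aEa", "abEba", "abba"], RULES))
```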

Language and Parsing
• The language generated by G is
    L(G) = { w ∈ Σ* | S ⇒ w }
• A parser is an algorithm that determines, for given G and w, whether w ∈ L(G).
Note: in general there can be many derivations of a word w in G.
Def. A derivation is leftmost if in each step a rule is applied to the leftmost non-terminal. (Think of reading from left to right.) Rightmost derivations are defined analogously.
Lemma: For a CFG G and word w, w ∈ L(G) iff there is a leftmost derivation of w in G.
(So we can restrict to searching for a leftmost derivation.)
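
The lemma suggests a naive parser: search only over leftmost derivations. The sketch below is one way to do this with a breadth-first search. It is not the CYK algorithm and not a general parser. The name leftmost_derivation is hypothetical, and the pruning assumes that sentential forms cannot grow indefinitely without producing terminals, which holds for the palindrome grammar used here.

```python
from collections import deque

# Palindrome grammar from the earlier example; "" stands for λ.
RULES = {
    "S": ["O", "E"],
    "E": ["", "aEa", "bEb"],
    "O": ["a", "b", "aOa", "bOb"],
}

def leftmost_derivation(word, rules, start="S"):
    """Return a leftmost derivation of word as a list of strings, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        form = path[-1]
        # Position of the leftmost non-terminal, or None if form is a word.
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:
            if form == word:
                return path
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals left of the first non-terminal are final: they must
            # form a prefix of word, and there must not be too many terminals.
            first_nt = next((i for i, c in enumerate(new) if c in rules), len(new))
            n_terminals = sum(1 for c in new if c not in rules)
            if (new not in seen and n_terminals <= len(word)
                    and word.startswith(new[:first_nt])):
                seen.add(new)
                queue.append(path + [new])
    return None

print(leftmost_derivation("abba", RULES))
print(leftmost_derivation("ab", RULES))
```

A result of None means that no leftmost derivation exists within the pruned search space, so by the lemma the word is not in L(G).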

Another Example
G: S → aSB | λ,  B → b
Derivations of aabb:
  d1: S → aSB → aaSBB → aaBB → aabB → aabb
  d2: S → aSB → aSb → aaSBb → aaSbb → aabb
  d3: S → aSB → aSb → aaSBb → aaBb → aabb
(Beware of typo in derivations in [Silva], Example 5.2.4.)
Derivation d1 is leftmost, d2 is rightmost, d3 is neither.
Derivation Trees (on blackboard).
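
The classification of d1, d2 and d3 can be checked mechanically. The sketch below finds, for each step, which non-terminal position was rewritten and compares it with the leftmost and rightmost non-terminal. The function name classify is an assumption made for illustration.

```python
# Grammar G from the example; "" stands for λ.
RULES = {"S": ["aSB", ""], "B": ["b"]}

def classify(steps, rules):
    """Return (is_leftmost, is_rightmost) for a derivation given as a list of strings."""
    leftmost = rightmost = True
    for before, after in zip(steps, steps[1:]):
        # Positions of all non-terminals in the current string.
        nts = [i for i, x in enumerate(before) if x in rules]
        # Positions whose rewriting by some rule yields the next string.
        hits = [
            i for i in nts
            if any(before[:i] + rhs + before[i + 1:] == after
                   for rhs in rules[before[i]])
        ]
        if not hits:
            raise ValueError(f"{before} -> {after} is not a rule application")
        leftmost = leftmost and nts[0] in hits
        rightmost = rightmost and nts[-1] in hits
    return leftmost, rightmost

d1 = ["S", "aSB", "aaSBB", "aaBB", "aabB", "aabb"]
d2 = ["S", "aSB", "aSb", "aaSBb", "aaSbb", "aabb"]
d3 = ["S", "aSB", "aSb", "aaSBb", "aaBb", "aabb"]

for name, d in [("d1", d1), ("d2", d2), ("d3", d3)]:
    print(name, classify(d, RULES))
```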

Ambiguity
• Def. A CFG G is unambiguous if for each w ∈ L(G) there exists a unique leftmost derivation of w in G. Otherwise, G is ambiguous.
• A syntactically correct sentence can have more than one meaning. Examples:
  – "Time flies like an arrow; fruit flies like a banana"
  – "The peasants are revolting" ("revolting" = disgusting / in rebellion)
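
For small inputs, ambiguity can be observed by counting leftmost derivations. The grammar S → S+S | a below is a hypothetical example that does not appear in the lecture; it is chosen only because it is a common illustration of ambiguity. The function name count_leftmost is likewise an assumption, and the pruning relies on every growing rule adding at least one terminal.

```python
# Hypothetical ambiguous grammar (not from the lecture): S → S+S | a
AMBIGUOUS = {"S": ["S+S", "a"]}

# Palindrome grammar from the earlier example; "" stands for λ.
PALINDROMES = {
    "S": ["O", "E"],
    "E": ["", "aEa", "bEb"],
    "O": ["a", "b", "aOa", "bOb"],
}

def count_leftmost(word, rules, form="S"):
    """Count the leftmost derivations of word starting from form."""
    # Position of the leftmost non-terminal, or None if form is a word.
    pos = next((i for i, c in enumerate(form) if c in rules), None)
    if pos is None:
        return 1 if form == word else 0
    # Prune: the terminal prefix must match, and terminals must not exceed |word|.
    if not word.startswith(form[:pos]):
        return 0
    if sum(1 for c in form if c not in rules) > len(word):
        return 0
    return sum(
        count_leftmost(word, rules, form[:pos] + rhs + form[pos + 1:])
        for rhs in rules[form[pos]]
    )

print(count_leftmost("a+a+a", AMBIGUOUS))
print(count_leftmost("abba", PALINDROMES))
```

A count greater than 1 for some word shows that the grammar is ambiguous. A count of 1 for a few words is only evidence of unambiguity, since the definition quantifies over all words of L(G).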