
Categorial Grammar, Çağrı Çöltekin (c.coltekin@rug.nl) - PowerPoint PPT Presentation



  1. Categorial Grammar Çağrı Çöltekin c.coltekin@rug.nl November 18, 2008 1 / 28

  2. Overview ◮ A review of CFGs and Chomsky hierarchy ◮ Categorial Grammar ◮ Categorial Grammar and semantics ◮ Learning Categorial Grammars 2 / 28

  3. Grammars ◮ A grammar is a set of rules governing the use of a given natural language. ◮ Formal grammars are precise descriptions of formal languages. They are commonly used to describe components of natural language grammar, such as syntax. ◮ The grammar of a language recognizes and generates all and only the strings (sentences, phrases) that belong to that language. 3 / 28

  4. Context-free grammars Formally, a Context-free grammar (CFG) is specified by a tuple ( V , S , Σ , R ), where: ◮ V is a finite set of non-terminal symbols. ◮ S ∈ V is the start symbol (sentence). ◮ Σ is a finite set of terminal symbols. ◮ R is a set of rules of the form X → y where X is a single symbol from V , and y is a (possibly empty) string of terminal and non-terminal symbols. 4 / 28

  5. CFGs for natural language syntax Example: derivation of the sentence ‘She read a nice book’.
     The grammar: S → NP VP, VP → V NP, NP → DET N, N → ADJ N, NP → she, V → read, DET → a, ADJ → nice, N → book.
     The derivation (expanding the leftmost non-terminal at each step):
     S ⇒ NP VP ⇒ she VP ⇒ she V NP ⇒ she read NP ⇒ she read DET N ⇒ she read a N ⇒ she read a ADJ N ⇒ she read a nice N ⇒ she read a nice book
     (The corresponding tree has S at the root, branching into NP ‘she’ and VP; VP into V ‘read’ and NP; that NP into DET ‘a’ and N; and that N into ADJ ‘nice’ and N ‘book’.) 5 / 28
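The derivation on this slide can be replayed mechanically. A minimal Python sketch (not part of the slides; the function name and the choice-list encoding are my own): the grammar is a dictionary from non-terminals to their alternatives, and at each step the leftmost non-terminal is expanded with a chosen alternative.

```python
# The example CFG: each non-terminal maps to a list of alternative right-hand sides.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["DET", "N"], ["she"]],
    "N": [["ADJ", "N"], ["book"]],
    "V": [["read"]],
    "DET": [["a"]],
    "ADJ": [["nice"]],
}
NONTERMINALS = set(RULES)

def derive(start, choices):
    """Expand the leftmost non-terminal using the given alternative index at each step."""
    form = [start]
    for i in choices:
        pos = next(j for j, sym in enumerate(form) if sym in NONTERMINALS)
        form[pos:pos + 1] = RULES[form[pos]][i]
    return form

# The leftmost derivation of 'she read a nice book' from the slide:
sentence = derive("S", [0, 1, 0, 0, 0, 0, 0, 0, 1])
```

After these nine rule applications, `sentence` is `["she", "read", "a", "nice", "book"]`.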

  6. Chomsky hierarchy of (formal) languages
     Grammar                     Language                 Automaton
     Unrestricted (type-0)       Recursively enumerable   Turing machine
     Context-sensitive (type-1)  Context-sensitive        Linear-bounded
     Context-free (type-2)       Context-free             Push-down
     Regular (type-3)            Regular                  Finite state
     ◮ Each language class in the hierarchy is a proper subset of the ones higher in the hierarchy. ◮ We try to find the most restrictive grammar that is adequate to describe the language. ◮ The syntax of natural languages is known to be (slightly) more complex than context-free; such languages are generally referred to as mildly context-sensitive. 6 / 28

  7. Categorial grammars: overview ◮ CG has a long history: the basic ideas date back to 1935. ◮ CG is ‘radically’ lexicalized: all language-specific information resides in the lexicon. ◮ The generative power of CG is equivalent to that of CFG. ◮ CG assumes a strong relation between syntax and semantics. 7 / 28

  8. Categorial grammars: categories Formally a CG is specified by the tuple ( A , S , Σ), where: ◮ A is a set of atomic (or basic) categories. ◮ S ∈ A is the start symbol. ◮ Σ is a lexicon containing lexical items of the form word := category, where category can be any valid CG category. ◮ The set of valid CG categories, C , is the smallest set such that: ◮ every basic category is a category: A ⊆ C ; ◮ if X , Y ∈ C , then ( X \ Y ) , ( X / Y ) ∈ C . 8 / 28
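The recursive definition of categories translates directly into a validity check. A sketch under an assumed encoding (not from the slides): atomic categories are strings, and X/Y and X\Y are tuples whose first element names the slash.

```python
# Atomic categories as used later in the deck.
ATOMS = {"S", "N", "NP"}

def is_category(c):
    """A category is an atom, or a tuple ('/', X, Y) or ('\\', X, Y)
    where X and Y are themselves categories."""
    if isinstance(c, str):
        return c in ATOMS
    return (isinstance(c, tuple) and len(c) == 3
            and c[0] in ("/", "\\")
            and is_category(c[1]) and is_category(c[2]))

# The transitive-verb category (S\NP)/NP:
transitive_verb = ("/", ("\\", "S", "NP"), "NP")
```

Note that the recursion freely mixes atomic and complex subparts, which is why the definition needs a single clause over all categories rather than separate clauses for atoms and complex categories.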

  9. Categorial grammars: rules CG has two operations (combinators): ◮ Forward Application: X / Y Y ⇒ X ( > ) ◮ Backward Application: Y X \ Y ⇒ X ( < ) 9 / 28

  10. CG for natural languages Basic categories are S , N and NP . Example lexicon:
      she := NP
      read := (S \ NP)/NP
      a := NP/N
      nice := N/N
      book := N
      An example derivation of ‘she read a nice book’:
      nice (N/N) and book (N) combine by forward application, giving N ( > );
      a (NP/N) and that N give NP ( > );
      read ((S \ NP)/NP) and that NP give S \ NP ( > );
      she (NP) and S \ NP combine by backward application, giving S ( < ). 10 / 28
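The two application rules and this derivation fit in a few lines of code. A minimal sketch (the tuple encoding, function names, and the greedy right-to-left reduction strategy are my own assumptions; the strategy happens to suffice for this example but is not a general CG parser):

```python
# Categories: strings for atoms, ('/', X, Y) for X/Y, ('\\', X, Y) for X\Y.
LEXICON = {
    "she": "NP",
    "read": ("/", ("\\", "S", "NP"), "NP"),   # (S\NP)/NP
    "a": ("/", "NP", "N"),                    # NP/N
    "nice": ("/", "N", "N"),                  # N/N
    "book": "N",
}

def apply_rule(left, right):
    """One application step: X/Y Y => X (>), or Y X\\Y => X (<); None if neither fits."""
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        return left[1]
    if isinstance(right, tuple) and right[0] == "\\" and right[2] == left:
        return right[1]
    return None

def parse(words):
    """Repeatedly combine the rightmost reducible adjacent pair of categories."""
    cats = [LEXICON[w] for w in words]
    while len(cats) > 1:
        for i in range(len(cats) - 2, -1, -1):
            result = apply_rule(cats[i], cats[i + 1])
            if result is not None:
                cats[i:i + 2] = [result]
                break
        else:
            return None   # no adjacent pair combines: not a sentence
    return cats[0]
```

`parse("she read a nice book".split())` reduces to `"S"`, reproducing the derivation above step by step.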

  11. CG lexical categories: more examples
      Conventional name    CG category                CG category (alternative)       Example
      Proper nouns         NP                                                         Mary
      Common nouns         N                                                          book
      Determiners          NP/N                                                       the
      Adjectives           N/N                                                        green
      Intransitive verbs   S \ NP                                                     sleep
      Transitive verbs     (S \ NP)/NP                                                read
      Ditransitive verbs   ((S \ NP)/NP)/NP                                           give
      Adverbs              (S \ NP) \ (S \ NP)                                        well
      Prepositions         (N \ N)/NP                 ((S \ NP) \ (S \ NP))/NP        with
      11 / 28

  12. CG lexical categories: more derivations Derivation of ‘she saw the boy with a book’, with ‘with’ as a noun modifier, (N \ N)/NP:
      a (NP/N) and book (N) give NP ( > );
      with ((N \ N)/NP) and that NP give N \ N ( > );
      boy (N) and N \ N give N ( < );
      the (NP/N) and that N give NP ( > );
      saw ((S \ NP)/NP) and that NP give S \ NP ( > );
      she (NP) and S \ NP give S ( < ). 12 / 28

  13. CG lexical categories: more derivations (2) Derivation of ‘she saw the boy with a telescope’, with ‘with’ as a verb-phrase modifier, ((S \ NP) \ (S \ NP))/NP:
      the (NP/N) and boy (N) give NP ( > );
      a (NP/N) and telescope (N) give NP ( > );
      saw ((S \ NP)/NP) and ‘the boy’ give S \ NP ( > );
      with and ‘a telescope’ give (S \ NP) \ (S \ NP) ( > );
      the two combine by backward application, giving S \ NP ( < );
      she (NP) and S \ NP give S ( < ). 13 / 28

  14. CG and semantics ◮ We extend categories to include semantic types. ◮ The function application rules become: Forward Application: X / Y : f Y : a ⇒ X : fa ( > ) Backward Application: Y : a X \ Y : f ⇒ X : fa ( < ) ◮ Example lexicon extended with semantic types:
      she := NP : she ′
      read := (S \ NP)/NP : λ x λ y . read ′ xy
      a := NP/N : λ x . a ′ x
      nice := N/N : λ x . nice ′ x
      book := N : book ′ 14 / 28
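The semantic side of the two application rules is ordinary function application, so Python lambdas model it directly. A sketch (the tuple encoding of the primed constants is my own assumption):

```python
# Semantic constants as strings; functional meanings as Python lambdas.
she = "she'"
book = "book'"
nice = lambda x: ("nice'", x)              # N/N   : λx. nice' x
a = lambda x: ("a'", x)                    # NP/N  : λx. a' x
read = lambda x: lambda y: ("read'", x, y) # (S\NP)/NP : λx λy. read' x y

# Building the meaning of 'she read a nice book' along the derivation:
obj = a(nice(book))   # forward application twice: a'(nice'(book'))
vp = read(obj)        # S\NP : λy. read' obj y
s = vp(she)           # backward application: read' obj she'
```

The final term `s` is `("read'", ("a'", ("nice'", "book'")), "she'")`, i.e. read′(a′(nice′ book′))(she′), matching the order in which the syntactic derivation consumes arguments.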

  15. Yet another example derivation ◮ Lexicon:
      walk := S \ NP : λ x . walk ′ x
      kitties := NP : cats ′
      milk := NP : milk ′
      eat := (S \ NP)/NP : λ x λ y . eat ′ xy
      ◮ An example derivation of ‘kitties eat milk’:
      eat and milk combine by forward application, giving S \ NP : λ y . eat ′ milk ′ y ( > );
      kitties and that S \ NP combine by backward application, giving S : eat ′ milk ′ cats ′ ( < ). 15 / 28

  16. CFG vs. CG: the lexicon and rules
      CFG:                 CG:
      NP → she             she := NP
      V → read             read := (S \ NP)/NP
      DET → a              a := NP/N
      ADJ → nice           nice := N/N
      N → book             book := N
      S → NP VP
      VP → V NP            X / Y Y ⇒ X ( > )
      NP → DET N           Y X \ Y ⇒ X ( < )
      N → ADJ N
      16 / 28

  17. CFG vs. CG: derivation CFG parse tree of ‘she read a nice book’: S branches into NP (she) and VP; VP into V (read) and NP; that NP into DET (a) and N; and that N into ADJ (nice) and N (book). CG derivation of the same sentence: nice book ⇒ N ( > ); a [N] ⇒ NP ( > ); read [NP] ⇒ S \ NP ( > ); she [S \ NP] ⇒ S ( < ). 17 / 28

  18. Beyond context-free power CG has extensions that provide the expressive capacity for covering non-context-free phenomena in natural languages. Combinatory Categorial Grammar (CCG), a popular extension of CG, adds a few more rules. Function composition rules:
      Forward          X / Y : f  Y / Z : g ⇒ X / Z : λ x . f ( gx ) ( > B )
      Backward         Y \ Z : g  X \ Y : f ⇒ X \ Z : λ x . f ( gx ) ( < B )
      Forward crossed  X / Y : f  Y \ Z : g ⇒ X \ Z : λ x . f ( gx ) ( > B × )
      Backward crossed Y / Z : g  X \ Y : f ⇒ X / Z : λ x . f ( gx ) ( < B × )
      Type-raising rules:
      Forward          X : a ⇒ T / (T \ X) : λ f . fa ( > T )
      Backward         X : a ⇒ T \ (T / X) : λ f . fa ( < T )
      18 / 28
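On the semantic side, all four composition rules share one combinator (B) and both type-raising rules share another (T). A sketch over plain Python functions, showing only the semantics (the category bookkeeping that distinguishes the four composition variants is omitted; the numeric example functions are mine):

```python
def compose(f, g):
    """B combinator: the semantics of all four composition rules, λx. f(g(x))."""
    return lambda x: f(g(x))

def type_raise(a):
    """T combinator: the semantics of both type-raising rules, λf. f(a)."""
    return lambda f: f(a)

# Illustration with simple numeric functions standing in for lexical meanings:
inc = lambda n: n + 1
double = lambda n: 2 * n

h = compose(inc, double)      # λx. inc(double(x))
raised_three = type_raise(3)  # λf. f(3): an 'argument' turned into a function over functions
```

Here `h(5)` is `11` and `raised_three(inc)` is `4`: type raising lets an argument apply itself to its functor, which is what allows CCG's non-standard constituents.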

  19. Why learning with categorial grammars? ◮ Highly lexicalized ◮ Based on sound mathematical formalisms ◮ Transparency between syntax and semantics ◮ Encouraging formal results from learning theory ◮ Extensions (e.g. CCG) are possible for wider coverage of human languages 19 / 28

  20. Learning CG ◮ Assume the combinators (operations) are given ◮ Input is (somewhat noisy) valid sentences kitties eat milk penguin eats cookies ◮ Output is a lexicalized grammar milk := NP : milk ′ cookies := NP : cookies ′ penguin := NP : penguin ′ kitties := NP : cats ′ eat := (S \ NP)/NP : λ x λ y . eat ′ x y 20 / 28

  21. Learning CG: generating hypotheses Assuming the input ‘ kitties eat milk ’, and that the only possible lexical categories are NP and (S \ NP)/NP :
      milk := NP 0.8          milk := (S \ NP)/NP 0.2
      kitties := NP 0.7       kitties := (S \ NP)/NP 0.2
      eat := NP 0.3           eat := (S \ NP)/NP 0.6
      This is overly simplified; hypothesis generation is considerably more complicated in practice. 21 / 28

  22. Learning CG: problems Assume we have K categories and input of length N .
      ◮ The number of lexical hypotheses to generate is N × K .
      ◮ This amounts to K^N possible lexical category assignments for every input sentence.
      ◮ To validate (parse) the input, we need to consider C_N = (2N)! / ((N+1)! N!) different bracketings (the Catalan number).
      ◮ And K can be infinite! 22 / 28
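These counts are easy to check numerically. A short sketch (function names are mine) computing the Catalan number from the formula above and the number of lexical assignments:

```python
from math import factorial

def catalan(n):
    """C_n = (2n)! / ((n+1)! n!), the number of binary bracketings."""
    return factorial(2 * n) // (factorial(n + 1) * factorial(n))

def lexical_assignments(k, n):
    """K candidate categories per word, N words: K**N full assignments."""
    return k ** n

# Growth is rapid: catalan(3) = 5, catalan(10) = 16796, and even with only
# K = 2 categories a 20-word sentence already has 2**20 assignments.
```

Both factors multiply, so the learner faces K^N × C_N candidate analyses per sentence before any pruning, which is why the constraints on the next slide matter.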

  23. Learning CG: some possible solutions We do not have any labeled data, but we have certain cues/constraints that may help: ◮ Some hypotheses are impossible. ◮ Lexical items consistently occurring in the same context are likely to have the same categories. ◮ Sentences have to parse to S . ◮ When learning with semantics, the semantic output has to ‘make sense’. ◮ Certain category structures are likely to occur in natural languages. ◮ Certain languages tend to use certain category structures. ◮ We expect a tendency towards unambiguous lexical items. ◮ We expect the lexicon to be compact. 23 / 28

  24. Learning CG: a short example from learning morphology Input: a-dam-lar : plural ( man ). The lexicon already contains: adam := N : man
      1. Generate all lexical hypotheses (every split point, every candidate category):
         a := N : man               a := N_plu / N : λ x . plural ( x )
         a.dam := N : man           a.dam := N_plu / N : λ x . plural ( x )
         dam.lar := N : man         dam.lar := N_plu \ N : λ x . plural ( x )
         lar := N : man             lar := N_plu \ N : λ x . plural ( x )
      2. Parse the input:
         (1) adam := N : man with lar := N_plu \ N : λ x . plural ( x ) ⇒ N_plu : plural ( man ) ( < )
         (2) adam := N_plu / N : λ x . plural ( x ) with lar := N : man ⇒ N_plu : plural ( man ) ( > )
         (3) a := N : man with damlar := N_plu \ N : λ x . plural ( x ) ⇒ N_plu : plural ( man ) ( < )
         (4) a := N_plu / N : λ x . plural ( x ) with damlar := N : man ⇒ N_plu : plural ( man ) ( > )
      3. Parse (1) scores highest, since it is supported by the existing lexicon.
         3.1 The item lar := N_plu \ N : λ x . plural ( x ) is inserted into the lexicon.
         3.2 The weight of the item adam := N : man is increased. 24 / 28


  26. Example: cross-serial dependencies A Dutch example (matching indices mark the crossing dependencies): dat Jan(1) Marie(2) het boek(3) wil(1) laten(2) lezen(3) — gloss: that Jan Marie the book wants let read — translation: ‘that Jan wants to let Marie read the book’ 26 / 28
