The big question: How do we infer and reason about meanings of sentences?


  1. The big question: How do we infer and reason about meanings of sentences? Conceptual importance: Discovering the process of cognition and intelligence. Applications: Automating language-related tasks, such as document search.

  2. The big challenge: Meaning of a sentence $\neq$ collection of meanings of its words, e.g. John likes Mary $\neq$ {John, likes, Mary}. Instead: meaning of a sentence = a function of the meanings of its words, where sentence = process depending on grammatical structure. [Diagram: word 1, word 2, ..., word n with types A, B, ..., Z combining into a sentence of type S.]

  3. Two complementary approaches to meaning
     1- The logical or symbolic model: meaning of a sentence = a truth function of its words; words = $\emptyset$.
     2- The vector space or distributional model: words = vectors built from context; function = $\emptyset$.

  4. Logical vs Vector Space Models
     (I) Logical Models
     Pros: Compositional; model-theoretic semantics (Montague); automated inferences.
     Cons: Qualitative (true-false); not very suitable for real world text; says very little about lexical semantics; forgets some of the syntactic structure.
     (II) Vector Space Model
     Pros: Quantitative; all about lexical semantics.
     Cons: Non-compositional.

  5. A formalism with the best of the two: Compositional & Distributional. $\overrightarrow{\text{Meaning of a sentence}}$ = a function of the vectors of its words, where sentence = process depending on grammatical structure. [Diagram: word 1, word 2, ..., word n with types A, B, ..., Z combining into a sentence of type S.]

  6. Compositional Distributional Models of Meaning Clark, Coecke, Grefenstette, Pulman, Sadrzadeh Computing and Computer Laboratories Oxford and Cambridge

  7. Aim: Understanding this model. Theoretical Preliminaries: 0- Some Category Theory; 1- Pregroup Grammars; 2- Vector Space Models; 3- Pregroups and Vector Spaces Categorically; 5- Combining the two: Categorical Semantics for Compositional Distributional Models. Ed - Concrete: Implementation, Evaluation, Experiments.

  8. Some Category Theory. A category has
     - Objects: $A, B, C$
     - Morphisms: $f, g, h$, written $A \xrightarrow{f} B$ and $B \xrightarrow{g} C$
     - The morphisms must compose: if $A \xrightarrow{f} B$ and $B \xrightarrow{g} C$ then $\exists h$, $A \xrightarrow{h} C$ such that $h = f ; g$.
     - Each object has an identity morphism: $A \xrightarrow{1_A} A$, $B \xrightarrow{1_B} B$. This is the unit of composition, i.e. for $A \xrightarrow{f} B$ we have $1_A ; f = f ; 1_B = f$.

  9. Example (objects / morphisms): systems / processes; sets / relations; sets / functions; formulas / proofs; grammatical types / grammatical reductions; vector spaces / linear maps.

  10. Sets and Relations. Objects: sets, e.g. $A = \{x, y\}$, $B = \{z, w\}$, $C = \{s, t\}$. Morphisms: relations. $A \xrightarrow{f} B$ is defined by $f \subseteq \{(a, b) \mid a \in A, b \in B\}$. For instance, $A \xrightarrow{f} B$ given by $f = \{(x, z), (x, w), (y, z)\}$ and $B \xrightarrow{g} C$ given by $g = \{(z, s), (w, s)\}$.

  11. Sets and Relations. Composition: composing relations. For $A \xrightarrow{f} B$ and $B \xrightarrow{g} C$, $\exists h$, $A \xrightarrow{h} C$ such that $h = f ; g$. In general $f ; g = \{(a, c) \mid \exists b, (a, b) \in f \ \&\ (b, c) \in g\}$. For instance, in our example $f ; g = {?}$
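
The composition rule for relations can be tried out directly. The following is a minimal Python sketch, assuming relations are modelled as sets of pairs; the helper name `compose` is hypothetical and not part of the slides. It uses the relations f and g from the running example.

```python
# Relations are modelled as Python sets of pairs (an assumption of this sketch).
# compose() is a hypothetical helper name, not taken from the slides.
def compose(f, g):
    """Return f ; g = {(a, c) | there is a b with (a, b) in f and (b, c) in g}."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

# The relations f : A -> B and g : B -> C from the running example.
f = {("x", "z"), ("x", "w"), ("y", "z")}
g = {("z", "s"), ("w", "s")}

h = compose(f, g)
print(sorted(h))  # [('x', 's'), ('y', 's')]
```

Both x and y reach s through some intermediate element of B, so the composite relates each of them to s; nothing is related to t.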

  12. Sets and Relations. Identity: the diagonal relation $1_A = \{(a, a) \mid a \in A\}$. For our example $1_A = \{(x, x), (y, y)\}$ and $1_B = \{(z, z), (w, w)\}$. These must satisfy $1_A ; f = f ; 1_B = f$. For instance, compute $1_A ; f = \{(x, x), (y, y)\} ; \{(x, z), (x, w), (y, z)\}$ and verify that it is $= f$.
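
The unit law can be checked mechanically. This is a small Python sketch under the same assumption as before (relations as sets of pairs); `compose` and `identity` are hypothetical helper names introduced only for illustration.

```python
# Hypothetical helpers; relations are modelled as sets of pairs.
def compose(f, g):
    """Relational composition f ; g."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}

def identity(s):
    """Diagonal relation 1_S = {(a, a) | a in S}."""
    return {(a, a) for a in s}

A = {"x", "y"}
B = {"z", "w"}
f = {("x", "z"), ("x", "w"), ("y", "z")}

left = compose(identity(A), f)    # 1_A ; f
right = compose(f, identity(B))   # f ; 1_B
print(left == f and right == f)   # True
```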

  13. Monoidal Category. A category with a binary operation called tensor and denoted by $\otimes$. This operator acts on two objects and returns their composite $A \otimes B$. It also acts on morphisms and turns them parallel: if $A \xrightarrow{f} Q$ and $B \xrightarrow{g} W$ then $A \otimes B \xrightarrow{f \otimes g} Q \otimes W$. The tensor has a unit $I$, that is, $A \otimes I = I \otimes A = A$.

  14. Sets and Relations. There is more than one $\otimes$ here, but for our purposes, given two sets $A, B$, we take their tensor product to be the cartesian product $A \otimes B = \{(a, b) \mid a \in A, b \in B\}$. For our previous example we have $A \otimes B = \{(x, z), (x, w), (y, z), (y, w)\}$. The unit is the singleton set $I = \{*\}$: $A \otimes I = A \times I = \{(a, *) \mid a \in A\} \cong \{a \mid a \in A\} = A$. Tensor on morphisms is the cartesian product of relations.
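
The tensor on sets and relations can be sketched in Python as follows. The helper names `tensor_objects` and `tensor_relations` are hypothetical, and the pairing convention for the tensor of two relations (relate (a, b) to (q, w) exactly when f relates a to q and g relates b to w) is the usual reading of "cartesian product of relations", stated here as an assumption.

```python
from itertools import product

# Hypothetical helper names; sets and relations are plain Python sets.
def tensor_objects(a, b):
    """A (x) B as the cartesian product of two sets."""
    return set(product(a, b))

def tensor_relations(f, g):
    """f (x) g: relate (a, b) to (q, w) iff (a, q) in f and (b, w) in g."""
    return {((a, b), (q, w)) for (a, q) in f for (b, w) in g}

A = {"x", "y"}
B = {"z", "w"}
I = {"*"}  # the unit: a singleton set

AB = tensor_objects(A, B)
AI = tensor_objects(A, I)
forget_unit = {a for (a, _) in AI}  # the isomorphism A (x) I ~ A

f = {("x", "z"), ("x", "w"), ("y", "z")}  # f : A -> B
g = {("z", "s"), ("w", "s")}              # g : B -> C
fg = tensor_relations(f, g)               # f (x) g : A (x) B -> B (x) C

print(sorted(AB))
print(forget_unit == A)
print(len(fg))
```

Note that `AI` is not literally equal to `A`; it is only in bijection with it, which is what the $\cong$ on the slide records.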

  15. Diagrammatic Calculus. The objects and morphisms of a monoidal category are usually depicted as follows. [Figure: string diagrams for $1_A$, $f$, $g ; f$, $1_A \otimes 1_B$, $f \otimes 1_C$, $f \otimes g$ and $(f \otimes g) ; h$.]

  16. Diagrammatic Calculus. The elements within the objects (e.g. elements of a set) can be depicted using the unit $I$ as follows: $\psi : I \to A$, $\pi : A \to I$, $\pi \circ \psi : I \to I$. [Figure: diagrams for $\psi$, $\pi$ and $\pi \circ \psi$.] For instance, the morphism $I \to A$ can be the element $x$ of $A = \{x, y\}$, that is, $x : I \to \{x, y\}$.

  17. Compact Category. A monoidal category where each object $A$ has a left adjoint $A^l$ and a right adjoint $A^r$. This means that for each object $A$, we have 4 morphisms in the category: $\epsilon^l : A^l \otimes A \to I$, $\epsilon^r : A \otimes A^r \to I$, $\eta^l : I \to A \otimes A^l$, $\eta^r : I \to A^r \otimes A$. [Figure: the $\epsilon$ maps drawn as cups and the $\eta$ maps drawn as caps.]

  18. Compact Category. These morphisms should satisfy:
     $(\eta^l \otimes 1_A) ; (1_A \otimes \epsilon^l) = 1_A$
     $(1_A \otimes \eta^r) ; (\epsilon^r \otimes 1_A) = 1_A$
     $(1_{A^l} \otimes \eta^l) ; (\epsilon^l \otimes 1_{A^l}) = 1_{A^l}$
     $(\eta^r \otimes 1_{A^r}) ; (1_{A^r} \otimes \epsilon^r) = 1_{A^r}$
     [Figure: the four equations drawn as string diagrams, each straightening a bent wire.]

  19. Pregroups $(P, \le, \cdot, I, (-)^l, (-)^r)$
     $\forall p \in P$, $\exists p^r \in P$, $\exists p^l \in P$ such that $p^l \cdot p \le I \le p \cdot p^l$ and $p \cdot p^r \le I \le p^r \cdot p$.
     Adjoints are unique and antitone: $p \le q \implies q^l \le p^l,\ q^r \le p^r$.
     Unit is self-adjoint: $I^l = I^r = I$.
     So is multiplication: $(p \cdot q)^l = q^l \cdot p^l$ and $(p \cdot q)^r = q^r \cdot p^r$.
     Same adjoints do not cancel out: $(p^r)^r \neq p \neq (p^l)^l$.
     But opposite adjoints do: $(p^l)^r = p = (p^r)^l$.

  20. Example of a Proof: adjoints are unique. Suppose $p$ has another left adjoint, call it $x$. This means $x \cdot p \le I \le p \cdot x$. Now we have $x = x \cdot I \le x \cdot (p \cdot p^l) = (x \cdot p) \cdot p^l \le I \cdot p^l = p^l$. Hence $x \le p^l$. Similarly, $p^l = p^l \cdot I \le p^l \cdot (p \cdot x) = (p^l \cdot p) \cdot x \le I \cdot x = x$. Hence $p^l \le x$, and so $x = p^l$.

  21. Example of a Proof: $\cdot$ is self-dual. We want to show the following (also for the right adjoint): $(p \cdot q)^l = q^l \cdot p^l$. Compute $(q^l \cdot p^l) \cdot (p \cdot q) = q^l \cdot (p^l \cdot p) \cdot q \le q^l \cdot I \cdot q = q^l \cdot q \le I$. Also $(p \cdot q) \cdot (q^l \cdot p^l) = p \cdot (q \cdot q^l) \cdot p^l \ge p \cdot I \cdot p^l = p \cdot p^l \ge I$. Hence we have $(q^l \cdot p^l) \cdot (p \cdot q) \le I \le (p \cdot q) \cdot (q^l \cdot p^l)$. So $q^l \cdot p^l$ is a left adjoint to $p \cdot q$, but so is $(p \cdot q)^l$. Since adjoints are unique, we get $q^l \cdot p^l = (p \cdot q)^l$.

  22. Examples of a Pregroup
     (0) A pregroup in which $p^l = p^r = p^{-1}$ is a (po)-group.
     (1) The set of all unbounded monotone functions on integers, $f : \mathbb{Z} \to \mathbb{Z}$ with $m \le n \implies f(m) \le f(n)$ and $m \to \infty \implies f(m) \to \infty$.
     The order is defined pointwise: $f \le g$ iff $f(n) \le g(n)$ $\forall n \in \mathbb{Z}$.
     The $\cdot$ is function composition and its unit is the identity: $(f \cdot g)(n) = f(g(n))$ and $I(n) = n$.
     Adjoints are defined canonically ($\vee$ is max, $\wedge$ is min): $f^r(x) = \vee \{y \in \mathbb{Z} \mid f(y) \le x\}$ and $f^l(x) = \wedge \{y \in \mathbb{Z} \mid x \le f(y)\}$.

  23. Example
     1) Take $f(x) = 2x$. Define adjoints as follows: $f^r(x) = \vee \{y \in \mathbb{Z} \mid 2y \le x\}$ and $f^l(x) = \wedge \{y \in \mathbb{Z} \mid x \le 2y\}$. Then $f^r(x) = \lfloor x/2 \rfloor$ and $f^l(x) = \lfloor (x+1)/2 \rfloor$, where $\lfloor x \rfloor$ is the biggest integer less than or equal to $x$.
     2) Restrict to $\mathbb{N}$ and a nice example is $\pi(x)$ = the $x$'th prime, with $\pi^r(x) = {?}$ For instance $\pi(5) = 11$ and $\pi^r(5) = 3$.
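
The closed forms for the adjoints of $f(x) = 2x$ can be checked numerically against the max/min definitions. The Python sketch below is an illustration only: all function names are hypothetical, the brute-force search runs over a finite window of integers (an assumption that holds for the small inputs tested), and `prime_r` reads $\pi^r(x)$ as the largest $y$ with $\pi(y) \le x$, i.e. the number of primes up to $x$.

```python
# All names here are hypothetical helpers, not notation from the slides.
def f(x):
    return 2 * x

def f_r(x):
    return x // 2        # floor(x / 2); Python's // floors for negatives too

def f_l(x):
    return (x + 1) // 2  # floor((x + 1) / 2)

WINDOW = range(-50, 51)  # finite search window (assumption of this sketch)

def brute_r(fn, x):
    """max{y | fn(y) <= x}, searched over WINDOW."""
    return max(y for y in WINDOW if fn(y) <= x)

def brute_l(fn, x):
    """min{y | x <= fn(y)}, searched over WINDOW."""
    return min(y for y in WINDOW if x <= fn(y))

closed_forms_ok = all(
    f_r(x) == brute_r(f, x) and f_l(x) == brute_l(f, x) for x in range(-20, 21)
)
# Pregroup inequalities: f^l . f <= I <= f . f^l and f . f^r <= I <= f^r . f
adjoint_laws_ok = all(
    f_l(f(n)) <= n <= f(f_l(n)) and f(f_r(n)) <= n <= f_r(f(n))
    for n in range(-20, 21)
)

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def nth_prime(x):
    """pi(x): the x'th prime, for x >= 1."""
    count, n = 0, 1
    while count < x:
        n += 1
        if is_prime(n):
            count += 1
    return n

def prime_r(x):
    """pi^r(x) read as max{y | pi(y) <= x}: the number of primes <= x."""
    return sum(1 for n in range(2, x + 1) if is_prime(n))

print(closed_forms_ok, adjoint_laws_ok, nth_prime(5), prime_r(5))
```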

  24. Application to Linguistics. Let $\Sigma$ be the set of words of a natural language and $B$ their types.
     Def. A pregroup dictionary for $\Sigma$ based on $B$ is a binary relation $D \subseteq \Sigma \times T(B)$, where $T(B)$ is the free pregroup generated over the partial order $B$.
     Def. A pregroup grammar is a pair $G = \langle D, s \rangle$ of a pregroup dictionary and a distinguished element $s \in B$.
     Def. A string of words $w_1 \dots w_n$ of $\Sigma$ is a grammatical sentence if and only if $t_1 \cdot \dots \cdot t_n \le s$ for $(w_i, t_i)$ an element in $D$.

  25. Example. A simple dictionary has basic types $B = \{\pi, o, w, s, q, \bar{q}, j, \sigma\}$: $\pi, o, w$ stand for subject, direct object, indirect object; $s, j$ stand for statement, infinitive of a verb; $q, \bar{q}$ stand for yes-no and wh questions; $\sigma$ is an index type. Partial order: $\pi \le n$, $o \le n$.
     Dictionary:
     John: $\pi$
     Mary: $o$
     likes: $\pi^r s o^l$
     does: $\pi^r s j^l \sigma$
     like: $\sigma^r j o^l$
     not: $\sigma^r j j^l \sigma$

  26. Examples. Compose the types of the constituents (writing 1 for the unit $I$).
     John likes Mary $\to$ statement: $\pi \, (\pi^r s o^l) \, o \le s$. Compute: $\pi \pi^r s o^l o \le 1 s o^l o \le 1 s 1 = s$.
     John does not like Mary $\to$ statement: $\pi \, (\pi^r s j^l \sigma) \, (\sigma^r j j^l \sigma) \, (\sigma^r j o^l) \, o \le s$. Compute: $\pi \pi^r s j^l \sigma \sigma^r j j^l \sigma \sigma^r j o^l o \le 1 s j^l 1 j j^l 1 j 1 = s j^l j j^l j \le s 1 1 = s$.
     Can you think of a simpler way to compute the above?
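
The two reductions can be replayed in code. This is a minimal Python sketch, not a full pregroup parser: a simple type is encoded as a hypothetical pair `(base, z)` with `z = -1` for a left adjoint, `0` for a plain type and `+1` for a right adjoint; only contractions of adjacent pairs are applied, greedily from left to right; and the partial order on basic types is ignored. The names `LEXICON`, `reduce_types` and `is_sentence` are illustrative, and the word types are the ones used in the reductions above. The greedy strategy is an assumption that suffices for these examples and is not claimed to be complete in general.

```python
# Encoding (assumption of this sketch): (base, z) with z = -1 left adjoint,
# z = 0 plain type, z = +1 right adjoint.
LEXICON = {
    "John":  [("pi", 0)],
    "Mary":  [("o", 0)],
    "likes": [("pi", 1), ("s", 0), ("o", -1)],
    "does":  [("pi", 1), ("s", 0), ("j", -1), ("sigma", 0)],
    "not":   [("sigma", 1), ("j", 0), ("j", -1), ("sigma", 0)],
    "like":  [("sigma", 1), ("j", 0), ("o", -1)],
}

def reduce_types(types):
    """Greedy left-to-right contraction: x^(z) x^(z+1) <= 1 is cancelled."""
    stack = []
    for base, z in types:
        if stack and stack[-1][0] == base and stack[-1][1] + 1 == z:
            stack.pop()                # e.g. pi pi^r <= 1 or o^l o <= 1
        else:
            stack.append((base, z))
    return stack

def is_sentence(words):
    """True if the concatenated word types reduce to the sentence type s."""
    types = [t for w in words for t in LEXICON[w]]
    return reduce_types(types) == [("s", 0)]

print(is_sentence("John likes Mary".split()))          # True
print(is_sentence("John does not like Mary".split()))  # True
print(is_sentence("John likes".split()))               # False
```

Processing the string once with a stack is one possible answer to the closing question: each incoming simple type either cancels the type on top of the stack or is pushed, so no intermediate strings need to be rewritten by hand.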

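  27. Depicting the Reduction. Each reduction corresponds to a diagram. [Figure: reduction diagrams for "John likes Mary", typed $\pi \; \pi^r s o^l \; o$, and for "John does not like Mary", typed $\pi \; \pi^r s j^l \sigma \; \sigma^r j j^l \sigma \; \sigma^r j o^l \; o$, with arcs joining each contracted pair.]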
