Learning with Partially Ordered Representations

Jane Chandlee, Remi Eyraud, Jeffrey Heinz, Adam Jardine, Jonathan Rawski


  1. Learning with Partially Ordered Representations
     Jane Chandlee, Remi Eyraud, Jeffrey Heinz, Adam Jardine, Jonathan Rawski

  2. Jane Chandlee (Haverford), Remi Eyraud (Aix-Marseille), Jeff Heinz (Stony Brook), Adam Jardine (Rutgers)

  3-4. The Talk in a Nutshell
     Previously:
     ◮ Efficient learning of subregular languages and functions
     ◮ Question: how to extend these learners to multiple, shared properties?
     Today:
     ◮ Describe a model-theoretic characterization of strings and trees
     ◮ Describe the partial order structure of the space of feature-based hypotheses
     ◮ Showcase a learning algorithm which exploits this structure to generalize from data to grammars

  5. Finite Word Models
     Here 'word' is synonymous with 'structure.'
     ◮ A model of a word is a representation of it.
     ◮ A (relational) model contains two kinds of elements:
       a domain (a finite set of elements) and relations over domain elements.
     ◮ Every word has a model.
     ◮ Different words have different models.

  6. Finite Word Models [Figure.]

  7. Finite Word Models
     1. Successor (immediate precedence): domain {1, 2, 3, 4} labeled a, b, b, a, with ⊳ holding between adjacent positions (1⊳2, 2⊳3, 3⊳4).
     2. General precedence: the same labeled domain with < holding between every earlier position and every later one (1<2, 1<3, 1<4, 2<3, 2<4, 3<4).
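
As a concrete illustration, here is a minimal sketch of these two word models in Python, assuming a simple dict-of-relations encoding (the encoding is ours for exposition, not the authors' implementation):

```python
def word_model(word):
    """Build the successor and precedence models of a string:
    a domain of positions, a labeling, and two order relations."""
    domain = list(range(1, len(word) + 1))
    label = {i: word[i - 1] for i in domain}                   # position labels
    succ = {(i, i + 1) for i in domain[:-1]}                   # immediate precedence
    prec = {(i, j) for i in domain for j in domain if i < j}   # general precedence
    return {"domain": domain, "label": label, "succ": succ, "prec": prec}

m = word_model("abba")
print(m["label"])         # {1: 'a', 2: 'b', 3: 'b', 4: 'a'}
print(sorted(m["succ"]))  # [(1, 2), (2, 3), (3, 4)]
```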

  8. Tree Models (Rogers 2003) [Figure courtesy of Rogers' 2014 ESSLLI course.]

  9. Subregular Hierarchy (Rogers et al. 2013) [Figure.]

  10. Local Factors [Figures courtesy of Heinz and Rogers' 2014 ESSLLI course.]

  11. Locality and Projection
     Theorem (Medvedev). A set of strings is regular iff it is a homomorphic image of a strictly 2-local set.
     Theorem (Thatcher). A set of Σ-labeled trees is recognizable by a finite-state tree automaton (i.e., regular) iff it is a projection of a local set of trees.
     Theorem (Thatcher). A set of strings L is the yield of a local set of trees (equivalently, the yield of a recognizable set of trees) iff it is context-free.

  12. Unconventional Word Models
     Successor (immediate precedence) over featural labels: positions 1⊳2⊳3⊳4 each carry a set of features rather than a single segment symbol, e.g. {voiced, labial, stop} on consonant positions and {vowel, back, low} on vowel positions.
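
A minimal sketch of such a featural word model, extending the dict encoding above; the feature assignments are illustrative assumptions, not a real feature system:

```python
# Toy feature table (an assumption for illustration only).
FEATURES = {
    "b": frozenset({"voiced", "labial", "stop"}),
    "a": frozenset({"vowel", "back", "low"}),
}

def featural_model(word):
    """Like a word model, but each position is labeled with a
    set of features instead of a single segment symbol."""
    domain = list(range(1, len(word) + 1))
    label = {i: FEATURES[word[i - 1]] for i in domain}
    succ = {(i, i + 1) for i in domain[:-1]}
    return {"domain": domain, "label": label, "succ": succ}

m = featural_model("ba")
print(sorted(m["label"][1]))  # ['labial', 'stop', 'voiced']
```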

  13-14. The Challenge of Features
     Distinctive feature theory:
     "part of the heart of phonology" (Rice 2003)
     "The most fundamental insight gained during the last century" (Ladefoged & Halle 1988)
     *NT → { *nt, *np, *nk, *mt, *mp, *mk, ... }
     Wilson & Gallagher (2018): "Could there be a non-statistical model that learns by memorizing feature sequences? The problem confronting such a model is that any given segment sequence has many different featural representations. Without a method for deciding which representations are relevant for assessing well-formedness (the role that statistics plays in MaxEnt), learning is doomed."

  15-17. Example
     Imagine the sequence nt is not present in a corpus. There are many possible equivalent constraints:
     *nt
     *[+nasal][+coronal]
     *[+consonant][+coronal, -continuant]
     *[+sonorant][-sonorant]
     ...
     How can a learner decide which of these constraints is responsible for the absence of nt?
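
The multiplicity is easy to see computationally: every pair of sub-bundles of the two segments' feature sets is a constraint that nt satisfies. A hedged sketch, using a toy feature table of our own (not a real feature system):

```python
from itertools import chain, combinations

# Toy feature specifications (illustrative assumptions).
FEATURES = {
    "n": {"+nasal", "+coronal", "+sonorant"},
    "t": {"+coronal", "-continuant", "-sonorant"},
}

def sub_bundles(feats):
    """All non-empty subsets of a feature set."""
    f = sorted(feats)
    return chain.from_iterable(combinations(f, r) for r in range(1, len(f) + 1))

constraints = [(a, b)
               for a in sub_bundles(FEATURES["n"])
               for b in sub_bundles(FEATURES["t"])]
print(len(constraints))  # 49 candidate bigram constraints for nt alone
```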

  18. Constraint Explosion (Hayes and Wilson 2008)
     As we add segments and features, the number of possible hypotheses grows. How much larger does it get?
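
A back-of-the-envelope count, under the simplifying assumption that each of n binary features can be specified +, specified -, or left unspecified at each position of a bigram constraint (an illustration of the growth rate, not Hayes and Wilson's exact search space):

```python
for n in (5, 10, 20):
    bundles = 3 ** n        # +, -, or unspecified for each feature
    bigrams = bundles ** 2  # one bundle for each of the two positions
    print(f"{n} features: {bundles:,} bundles, {bigrams:,} bigram constraints")
```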

  19. Some Definitions
     Definition (Restriction). A = ⟨D^A; ≻, R_1^A, ..., R_n^A⟩ is a restriction of B = ⟨D^B; ≻, R_1^B, ..., R_n^B⟩ iff D^A ⊆ D^B and, for each m-ary relation R_i, R_i^A = {(x_1, ..., x_m) ∈ R_i^B | x_1, ..., x_m ∈ D^A}.
     Intuition: identify a subset A of the domain of B and strip B of all elements and relations that are not wholly within A.
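
A minimal sketch of restriction over the dict encoding used above (ours, for exposition):

```python
def restrict(model, subset):
    """Keep only the domain elements in `subset`, their labels, and
    the relation tuples that lie wholly within `subset`."""
    subset = set(subset)
    out = {"domain": [x for x in model["domain"] if x in subset],
           "label": {i: l for i, l in model["label"].items() if i in subset}}
    for name, rel in model.items():
        if name not in ("domain", "label"):
            out[name] = {t for t in rel if set(t) <= subset}
    return out

m = {"domain": [1, 2, 3, 4],
     "label": {1: "a", 2: "b", 3: "b", 4: "a"},
     "succ": {(1, 2), (2, 3), (3, 4)}}
print(restrict(m, {2, 3}))  # keeps positions 2 and 3 and the edge (2, 3)
```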

  20. Some Definitions
     Definition (Subfactor). Structure A is a subfactor of structure B (A ⊑ B) iff A is connected and there exist a restriction B′ of B and a map h : A → B′ such that for all a_1, ..., a_m ∈ A and for all R_i in the model signature: if R_i(a_1, ..., a_m) holds in A, then R_i(h(a_1), ..., h(a_m)) holds in B′. If A ⊑ B we also say that B is a superfactor of A.
     Intuition: properties that hold of the connected structure A also hold, in a related way, within B.
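
A brute-force sketch of the subfactor test for tiny featural models: try every map from A's positions into B's and check that feature labels and successor tuples are preserved. It is exponential and assumes rather than checks connectivity of A, so it is exposition only, not the paper's procedure:

```python
from itertools import product

def is_subfactor(a, b):
    """Does some map h : A -> B preserve A's feature labels
    (as subsets) and A's successor tuples?"""
    for image in product(b["domain"], repeat=len(a["domain"])):
        h = dict(zip(a["domain"], image))
        labels_ok = all(a["label"][x] <= b["label"][h[x]] for x in a["domain"])
        succ_ok = all((h[x], h[y]) in b["succ"] for (x, y) in a["succ"])
        if labels_ok and succ_ok:
            return True
    return False

a = {"domain": [1], "label": {1: frozenset({"vowel"})}, "succ": set()}
b = {"domain": [1, 2],
     "label": {1: frozenset({"vowel", "low"}), 2: frozenset({"voiced", "stop"})},
     "succ": {(1, 2)}}
print(is_subfactor(a, b))  # True: position 1 of b carries a superset of a's features
```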

  21. Subfactor Ideals
     Definition (Ideal). A non-empty subset S of a poset ⟨A, ≤⟩ is an ideal iff
     ◮ for every x ∈ S, y ≤ x implies y ∈ S, and
     ◮ for all x, y ∈ S there is some z ∈ S such that x ≤ z and y ≤ z.
     Example: the diamond poset with [-N] at the bottom, [-N,+V] and [-N,+C] above it, and [-N,+V,+C] at the top.
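
A small sketch of the ideal test over an explicit poset, encoding the diamond example as feature sets ordered by inclusion (our encoding, not the authors'):

```python
def is_ideal(S, poset, leq):
    """Check downward closure and directedness of S within poset."""
    S = set(S)
    if not S:
        return False
    down = all(y in S for x in S for y in poset if leq(y, x))
    directed = all(any(leq(x, z) and leq(y, z) for z in S)
                   for x in S for y in S)
    return down and directed

poset = [frozenset(s) for s in ({"-N"}, {"-N", "+V"}, {"-N", "+C"},
                                {"-N", "+V", "+C"})]
leq = lambda a, b: a <= b  # order by feature-set inclusion
print(is_ideal(poset[:2], poset, leq))  # True: {[-N], [-N,+V]} is an ideal
print(is_ideal(poset[:3], poset, leq))  # False: [-N,+V], [-N,+C] lack an upper bound in S
```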

  22-29. Grammatical Entailments (built up over eight slides)
     Subfactor ideals: if s is a subfactor of t and G generates t, then G generates s.
     Example, on the diamond poset: observing that G generates [-N,+C] entails (✓) that it generates the subfactor [-N]; marking [-N,+V] as ungrammatical (*) entails that its superfactor [-N,+V,+C] is ungrammatical too. Final state: * [-N,+V,+C], * [-N,+V], ✓ [-N,+C], ✓ [-N].
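
These entailments suggest how a learner can generalize: positive observations close downward in the subfactor order, and bans close upward (the contrapositive). A sketch in the same encoding as above; it illustrates the entailment itself, not the paper's full algorithm:

```python
def propagate(poset, leq, generated, banned):
    """Downward-close the generated structures and upward-close the banned ones."""
    gen = {x for x in poset if any(leq(x, t) for t in generated)}
    ban = {x for x in poset if any(leq(s, x) for s in banned)}
    return gen, ban

poset = [frozenset(s) for s in ({"-N"}, {"-N", "+V"}, {"-N", "+C"},
                                {"-N", "+V", "+C"})]
leq = lambda a, b: a <= b
gen, ban = propagate(poset, leq,
                     generated=[frozenset({"-N", "+C"})],
                     banned=[frozenset({"-N", "+V"})])
print(sorted(map(sorted, gen)))  # [['+C', '-N'], ['-N']]
print(sorted(map(sorted, ban)))  # [['+C', '+V', '-N'], ['+V', '-N']]
```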

  30. Example with Singular Segments [Figure.]
