
A computational model of S-selection
Aaron Steven White (1, 2) and Kyle Rawlins (1)
Semantics and Linguistic Theory 26, University of Texas, Austin, 14th May 2016
Johns Hopkins University: (1) Department of Cognitive Science, (2) Center for Language and Speech Processing


1. Lexical idiosyncrasy

Observed syntactic distributions are not a perfect reflection of semantic type + projection rules.

Example: Some Q(uestion)-selecting verbs allow concealed questions...

(4) a. Mary asked what time it was.
    b. Mary asked the time.

...others do not (Grimshaw 1979; Pesetsky 1982, 1991; Nathan 2006; Frana 2010, a.o.)

(5) a. Mary wondered what time it was.
    b. *Mary wondered the time.

2. Two kinds of lexical idiosyncrasy

Grimshaw (1979): Verbs are related to semantic type signatures (S-selection) and syntactic type signatures (C-selection).

Pesetsky (1982, 1991): Verbs are related to semantic type signatures (S-selection); C-selection is an epiphenomenon of verbs' abstract case.

Shared core: Lexical noise (idiosyncrasy) alters verbs' idealized syntactic distributions.

5. A model of S-selection and projection

[Diagram: Semantic Type → (Projection Rules) → Idealized Syntactic Distribution → (Lexical Noise) → Observed Syntactic Distribution → (Noise Model) → Acceptability Judgment Data]

6. Specifying the model

Question: How do we represent each object in the model?

A minimalistic answer: Every object is a matrix of boolean values.

Strategy:
1. Give the model in terms of sets and functions
2. Convert this model into a boolean matrix model


11. A boolean model of S-selection

think → {[P]}
know → {[P], [Q]}
wonder → {[Q]}

S (rows: verbs, columns: semantic type signatures):

         [P]  [Q]  ...
think     1    0   ...
know      1    1   ...
wonder    0    1   ...

15. A boolean model of projection

[P] → {[that S], [NP], ...}
[Q] → {[whether S], [NP], ...}

Π (rows: semantic type signatures, columns: syntactic types):

       [that S]  [whether S]  [NP]  ...
[P]       1           0         1   ...
[Q]       0           1         1   ...

17. A boolean model of idealized syntactic distribution

D̂(VERB, SYNTYPE) = ⋁_{t ∈ SEMTYPES} S(VERB, t) ∧ Π(t, SYNTYPE)

For example:

D̂(know, [that S]) = ⋁_{t ∈ {[P], [Q], ...}} S(know, t) ∧ Π(t, [that S])
D̂(wonder, [NP]) = ⋁_{t ∈ {[P], [Q], ...}} S(wonder, t) ∧ Π(t, [NP])

Applying this to S and Π gives D̂ (rows: verbs, columns: syntactic types):

         [that S]  [whether S]  [NP]  ...
think        1          0         1   ...
know         1          1         1   ...
wonder       0          1         1   ...
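The definition above is just boolean matrix multiplication, so it can be sketched directly; a minimal example with the inventories truncated to the slide's verbs and types:

```python
import numpy as np

verbs = ["think", "know", "wonder"]
sem_types = ["[P]", "[Q]"]
syn_types = ["[that S]", "[whether S]", "[NP]"]

# S: which semantic type signatures each verb selects
S = np.array([[1, 0],    # think  -> {[P]}
              [1, 1],    # know   -> {[P], [Q]}
              [0, 1]],   # wonder -> {[Q]}
             dtype=bool)

# Pi: which syntactic types each semantic type projects to
Pi = np.array([[1, 0, 1],   # [P] -> {[that S], [NP], ...}
               [0, 1, 1]],  # [Q] -> {[whether S], [NP], ...}
              dtype=bool)

# Idealized distribution: D_hat[v, s] = OR_t (S[v, t] AND Pi[t, s]),
# i.e., boolean matrix multiplication
D_hat = (S.astype(int) @ Pi.astype(int)) > 0

print(D_hat.astype(int))   # rows: think, know, wonder
# [[1 0 1]
#  [1 1 1]
#  [0 1 1]]
```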


24. A boolean model of observed syntactic distribution

∀t ∈ SYNTYPES: D(wonder, t) = D̂(wonder, t) ∧ N(wonder, t)

D̂ (idealized):
         [that S]  [whether S]  [NP]
think        1          0         1
know         1          1         1
wonder       0          1         1

N (lexical noise):
think        1          1         1
know         1          1         1
wonder       1          1         0

D = D̂ ∧ N (observed):
think        1          0         1
know         1          1         1
wonder       0          1         0
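Continuing the sketch, the observed distribution is the elementwise conjunction of the idealized distribution with a noise matrix; the values below are copied from the slide's example:

```python
import numpy as np

# Idealized distribution D_hat over [that S], [whether S], [NP]
# for think, know, wonder (as on the slide)
D_hat = np.array([[1, 0, 1],
                  [1, 1, 1],
                  [0, 1, 1]], dtype=bool)

# Lexical noise: N[v, s] = 0 knocks out a frame the idealized
# distribution would otherwise license (here, *wonder + NP)
N = np.array([[1, 1, 1],
              [1, 1, 1],
              [1, 1, 0]], dtype=bool)

# Observed distribution: elementwise conjunction
D = D_hat & N
print(D.astype(int))
# [[1 0 1]
#  [1 1 1]
#  [0 1 0]]
```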

28. Animating abstractions

Question: What is this model useful for?

Answer: In conjunction with modern computational techniques, this model allows us to scale distributional analysis to an entire lexicon.

Basic idea: Distributional analysis corresponds to reversing the model's arrows.

29. A model of S-selection and projection

[Model diagram repeated.]


  31. The MegaAttitude data set

32. MegaAttitude materials

Ordinal (1-7 scale) acceptability ratings for 1000 clause-embedding verbs × 50 syntactic frames


34. Verb selection

[Figure]


36. Sentence construction

Challenge: Automate construction of a very large set of frames in a way that is sufficiently general to apply to many verbs.

Solution: Construct semantically bleached frames using indefinites.

(6) Examples of responsives
    a. know + NP V {that, whether} S
       Someone knew {that, whether} something happened.
    b. tell + NP V NP {that, whether} S
       Someone told someone {that, whether} something happened.
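Bleached stimuli like those in (6) can be generated mechanically by filling frame slots with indefinites. A minimal sketch; the space-separated frame notation and the `bleach` helper are hypothetical, not the authors' actual tooling:

```python
# Indefinite fillers for semantically bleached frames (assumed inventory)
FILLERS = {"NP": "someone", "S": "something happened"}

def bleach(verb_past, frame):
    """Render a frame like 'NP V that S' with indefinite fillers."""
    out = []
    for slot in frame.split():
        if slot == "V":
            out.append(verb_past)
        elif slot in FILLERS:
            out.append(FILLERS[slot])
        else:               # complementizers etc. pass through verbatim
            out.append(slot)
    return " ".join(out).capitalize() + "."

print(bleach("knew", "NP V that S"))
# Someone knew that something happened.
print(bleach("told", "NP V NP whether S"))
# Someone told someone whether something happened.
```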

37. Frame construction

[Figure: frames are built by combining syntactic components.]

Syntactic types: [NP], [NP PP], [PP], [PP S], [NP S], [S]
Constituents: NP, PP, S; voice: ACTIVE, PASSIVE
COMP: that, for, whether, which NP, ∅ ([+Q])
TENSE: [+FIN]: -ed, would; [-FIN]: to, -ing, ∅
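The componentwise construction can be sketched as a cross product over the inventories; the component lists below are simplified stand-ins for illustration, not the paper's exact 50 frames:

```python
from itertools import product

# Simplified inventories (assumptions, not the actual MegaAttitude sets)
SUBCATS = ["NP V S", "NP V NP S", "NP V PP S"]
COMPS   = ["that", "whether", ""]           # complementizer choices
CLAUSES = ["something happened",            # [+FIN], -ed
           "something would happen",        # [+FIN], would
           "something to happen"]           # [-FIN], to

frames = []
for subcat, comp, clause in product(SUBCATS, COMPS, CLAUSES):
    embedded = (comp + " " + clause).strip()
    frames.append(subcat.replace("S", embedded))

print(len(frames))  # 3 * 3 * 3 = 27 candidate frames (before filtering
                    # out ungrammatical combinations)
```

In practice many combinations would be filtered out (e.g., *"that something to happen"), which is presumably how a large cross product is pruned down to a fixed frame set.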


51. Data collection

• 1,000 verbs × 50 syntactic frames = 50,000 sentences
• 1,000 lists of 50 items each
• Each verb only once per list
• Each frame only once per list
• 727 unique Mechanical Turk participants
• Annotators allowed to do multiple lists, but never the same list twice
• 5 judgments per item
• No annotator sees the same sentence more than once
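One way to satisfy the per-list constraints above is a Latin-square-style offset assignment. This is a sketch of such a scheme, not necessarily the authors' exact procedure:

```python
# Pack 1000 verbs x 50 frames into 1000 lists of 50 items, with each
# verb and each frame occurring at most once per list, via an offset.
N_VERBS, N_FRAMES = 1000, 50

lists = [[((i + j) % N_VERBS, j) for j in range(N_FRAMES)]
         for i in range(N_VERBS)]

# Within a list, the 50 verbs and 50 frames are all distinct...
for items in lists[:5]:
    vs = [v for v, f in items]
    fs = [f for v, f in items]
    assert len(set(vs)) == len(vs) and len(set(fs)) == len(fs)

# ...and the 50,000 verb-frame pairs are covered exactly once overall,
# since pair (v, f) appears only in list (v - f) mod 1000.
all_items = {pair for items in lists for pair in items}
print(len(all_items))  # 50000
```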

59. Task

Turktools (Erlewine & Kotek 2015)

60. Validating the data

Interannotator agreement: Spearman rank correlation, calculated by list, on a 30-verb pilot.

Pilot verb selection: Same verbs used by White (2015) and White et al. (2015), selected based on Hacquard & Wellwood's (2012) attitude verb classification.

1. Linguist-to-linguist median: 0.70, 95% CI: [0.62, 0.78]
2. Linguist-to-annotator median: 0.55, 95% CI: [0.52, 0.58]
3. Annotator-to-annotator median: 0.56, 95% CI: [0.53, 0.59]
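The agreement statistic above can be computed with a small self-contained function; the two annotators' 1-7 ratings below are invented toy data:

```python
def ranks(xs):
    # Average ranks, handling ties (ordinal 1-7 ratings tie often)
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # 1-based average rank of tie block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    # Spearman rho = Pearson correlation of the two rank vectors
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

annotator_a = [7, 6, 2, 5, 1, 3]   # toy ratings for one list
annotator_b = [6, 7, 1, 5, 2, 3]
print(round(spearman(annotator_a, annotator_b), 3))  # 0.886
```

In practice one would use a library routine (e.g., `scipy.stats.spearmanr`), computing the correlation per list and then taking the median across lists as on the slide.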

61. Results

[Figure: scatter plot of mean acceptability in the frame NP V whether S (y-axis, 1-7) against NP V S (x-axis, 1-7); labeled verbs include wonder and know (high on NP V whether S), think, and want (low on NP V whether S)]

  63. Model fitting and results

64. A model of S-selection and projection

[Model diagram repeated.]


67. Fitting the model

Goal: Find representations of verbs' semantic type signatures and projection rules that best explain the acceptability judgments.

Challenges:
1. Infeasible to search over 2^(1000·T) × 2^(50·T) possible configurations (T = # of type signatures)
2. Finding the best boolean model fails to capture the uncertainty inherent in judgment data

69. Fitting the model

Solution: Search probability distributions over verbs' semantic type signatures and projection rules.

Going probabilistic: Wrap boolean expressions in probability measures.

71. A boolean model of idealized syntactic distribution

Boolean version: D̂(VERB, SYNTYPE) = ⋁_{t ∈ SEMTYPES} S(VERB, t) ∧ Π(t, SYNTYPE)

Probabilistic version (shown for know + [that S]):

D̂(know, [that S]) = 1 − ∏_{t ∈ {[P], [Q], ...}} (1 − S(know, t) × Π(t, [that S]))

with S and Π now holding probabilities (e.g., S(think, [P]) = 0.94, Π([P], [that S]) = 0.99) in place of boolean values.

73. Wrapping with probabilities

P(S[VERB, t] ∧ Π[t, SYNTYPE])
  = P(S[VERB, t]) · P(Π[t, SYNTYPE] | S[VERB, t])
  = P(S[VERB, t]) · P(Π[t, SYNTYPE])                      (by independence)

P(⋁_t S[VERB, t] ∧ Π[t, SYNTYPE])
  = P(¬ ⋀_t ¬(S[VERB, t] ∧ Π[t, SYNTYPE]))
  = 1 − P(⋀_t ¬(S[VERB, t] ∧ Π[t, SYNTYPE]))
  = 1 − ∏_t P(¬(S[VERB, t] ∧ Π[t, SYNTYPE]))
  = 1 − ∏_t (1 − P(S[VERB, t] ∧ Π[t, SYNTYPE]))
  = 1 − ∏_t (1 − P(S[VERB, t]) · P(Π[t, SYNTYPE]))
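The resulting noisy-or computation vectorizes cleanly. The probabilities below are illustrative placeholders in the spirit of the slide's example matrices, not fitted values:

```python
import numpy as np

# P(S[v, t]): probability each verb selects each semantic type
# (columns: [P], [Q]; rows: think, know, wonder; illustrative values)
P_S = np.array([[0.94, 0.03],
                [0.97, 0.91],
                [0.17, 0.93]])

# P(Pi[t, s]): probability each type projects to each syntactic type
# (columns: [that S], [whether S], [NP]; illustrative values)
P_Pi = np.array([[0.99, 0.12, 1.00],
                 [0.07, 0.98, 1.00]])

# Noisy-or: P(D_hat[v, s]) = 1 - prod_t (1 - P_S[v, t] * P_Pi[t, s])
P_D_hat = 1 - np.prod(1 - P_S[:, :, None] * P_Pi[None, :, :], axis=1)
print(np.round(P_D_hat, 2))
```

Because the per-type terms are multiplied as independent "failure" probabilities, any single high-probability licensing route pushes the frame's probability toward 1, mirroring the boolean disjunction.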

74. Fitting the model

Algorithm: Projected gradient descent with adaptive gradient (AdaGrad; Duchi et al. 2011)

Remaining challenge: We don't know the number of type signatures T.

Standard solution: Fit the model with many type signatures and compare using an information criterion, e.g., the Akaike Information Criterion (AIC).
