MORE THAN WORDS: A DISCRIMINATIVE LEARNING MODEL WITH LEXICAL BUNDLES



  1. MORE THAN WORDS: A DISCRIMINATIVE LEARNING MODEL WITH LEXICAL BUNDLES
     March 8th, 2017
     Saskia E. Lensink, R. Harald Baayen
     s.e.lensink@hum.leidenuniv.nl

  2. Contents
     ■ Multi-word units and their cognitive reality
     ■ Experimental methods
     ■ Computational model of multi-word units
     ■ Eye-tracking study
     ■ Production study
     ■ Results and implications

  3. A typology of multi-word units (Wray, 2012)

  4. Multi-word units
     ■ Indicator of nativeness
     ■ Thought to be represented as a whole
     ■ How can we experimentally test for the cognitive reality of these multi-word units?

  5. Multi-word frequencies
     Previous studies have found frequency effects for regular multi-word units, which suggests storage of wholes

  6. Previous studies
     ■ self-paced reading: Tremblay, Derwing, Libben, & Westbury, 2011
     ■ phrasal decision tasks: Arnon & Snider, 2010; Ellis & Simpson-Vlach, 2009
     ■ priming of the last word of the n-gram: Ellis & Simpson-Vlach, 2009
     ■ word reading tasks: Arnon & Priva, 2013; Ellis & Simpson-Vlach, 2009; Han, 2015; Tremblay & Tucker, 2011
     ■ picture naming: Janssen & Barber, 2012
     ■ sentence recall: Tremblay et al., 2011
     ■ immediate free recall: Tremblay & Baayen, 2010
     ■ eye-tracking: Siyanova-Chanturia, Conklin, & Van Heuven, 2011
     ■ ERPs: Tremblay & Baayen, 2010
     ■ L1 language acquisition: Bannard & Matthews, 2008
     ■ L2 speakers: Conklin & Schmitt, 2012; Han, 2015; Jiang & Nekrasova, 2007; Siyanova-Chanturia et al., 2011

  7. Frequency is an impoverished measure
     ■ Collapses counts of homophones
     ■ Collapses counts of different senses
     ■ Language always occurs in context: prediction also plays a large role in processing
     ■ Salience and recency also play a role

  8. Mind the neighbors!
     ■ When studying words, we pay attention to
       – Frequency effects
       – Length
       – Neighborhood density effects
     ■ When studying multi-word units, we pay attention to
       – Frequency effects
       – Length
       – But not to neighborhood density effects!

  9. Motivation for our study
     ■ We know that the framework of discriminative learning has given us some new insights into language
     ■ A computational model implementing discriminative learning, NDL, provides us with a measure reflecting neighborhood density effects
     ■ When adding features of discriminative learning to our models of the processing of multi-word units, we might gain new insights into the processing of multi-word units
     ■ We conducted both an eye-tracking and a production study to study comprehension and production

  10. NDL (Baayen et al., 2011)
      ■ Naïve Discriminative Learning
      ■ Implements the Rescorla-Wagner equations, which specify how experience alters the strength of association of a cue to a given outcome
      ■ Uses the distributional properties of corpus data, following basic principles of error-driven learning
      ■ Weights from cues to outcomes are adjusted depending on the correct/incorrect prediction of an outcome given a certain cue
      This approach successfully predicted word frequency effects, morphological family size effects, inflectional entropy effects, and phrasal frequency effects
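
The error-driven update described on this slide can be sketched in a few lines of Python. This is a minimal illustration of a Rescorla-Wagner learning step, not the authors' implementation: the choice of words as cues and trigrams as outcomes, the single `learning_rate` (standing in for the product of the cue and outcome salience parameters), the value of `lam`, and the three toy learning events are all assumptions made for the example.

```python
from collections import defaultdict


def rescorla_wagner_update(weights, cues, outcomes_present, all_outcomes,
                           learning_rate=0.1, lam=1.0):
    """Apply one Rescorla-Wagner learning event in place.

    weights maps (cue, outcome) pairs to association strengths.
    learning_rate and lam are illustrative values, not taken from the slides.
    """
    cues = set(cues)
    for outcome in all_outcomes:
        # Prediction: summed weight from all cues present in this event.
        prediction = sum(weights[(cue, outcome)] for cue in cues)
        # Target is lam when the outcome occurs, 0 when it does not.
        target = lam if outcome in outcomes_present else 0.0
        # The prediction error is shared by every cue that was present.
        delta = learning_rate * (target - prediction)
        for cue in cues:
            weights[(cue, outcome)] += delta


# Toy learning events (invented for illustration): word cues -> trigram outcome.
events = [
    (["in", "de", "buurt"], "in de buurt"),
    (["in", "de", "weg"], "in de weg"),
    (["in", "de", "buurt"], "in de buurt"),
]
all_outcomes = {outcome for _, outcome in events}
weights = defaultdict(float)
for cues, outcome in events:
    rescorla_wagner_update(weights, cues, {outcome}, all_outcomes)

print(weights[("buurt", "in de buurt")])  # a discriminating cue gains weight
print(weights[("buurt", "in de weg")])    # and loses weight for the competitor
```

Cues that occur with several outcomes (here "in" and "de") end up with weaker weights than cues that single out one outcome, which is the discriminative part of the learning rule.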

  11. NDL (Baayen et al., 2011)
      ■ Outcomes are thought of as pointers to locations in a multi-dimensional semantic space
      ■ These locations are constantly updated by the experiences a language user has

  12. NDL with lexical bundles

  13. Weight word X: bottom-up information

  14. Total activation trigram (act): bottom-up information
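
A small sketch may help relate the per-word weight and the total activation. It assumes, as is usual in NDL, that the activation of an outcome is the sum of the weights from the cues present in the input; the weight values below are invented placeholders, and the use of words as cues is likewise an assumption rather than something the slides state.

```python
def total_activation(weights, cues, outcome):
    """Sum the weights from the given cues to one outcome (bottom-up support)."""
    return sum(weights.get((cue, outcome), 0.0) for cue in cues)


# Hypothetical learned weights from word cues to a trigram outcome.
weights = {
    ("in", "in de buurt"): 0.25,
    ("de", "in de buurt"): 0.125,
    ("buurt", "in de buurt"): 0.5,
}

trigram = "in de buurt"
weight_word3 = weights[("buurt", trigram)]  # support from the third word alone
act = total_activation(weights, ["in", "de", "buurt"], trigram)
print(weight_word3, act)
```

Under these assumptions, "weight word X" is a single entry of the weight matrix, while the total activation aggregates the support from all words in the trigram.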

  15. Prior activation trigram: top-down information

  16. Activation diversity: competing trigrams (neighborhood density)
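
The slides do not spell out how the prior and the activation diversity are computed. The sketch below assumes definitions that are common in the NDL literature: the prior of an outcome as the summed absolute weight on all of its incoming connections (independent of the current input), and the activation diversity as the summed absolute activation that the current cues give to all outcomes. Both definitions, and the toy weights, are assumptions for illustration.

```python
def prior(weights, outcome):
    """Assumed definition: L1 norm of all weights pointing to one outcome."""
    return sum(abs(w) for (_, out), w in weights.items() if out == outcome)


def activation_diversity(weights, cues, outcomes):
    """Assumed definition: L1 norm of the activations over all outcomes."""
    total = 0.0
    for outcome in outcomes:
        act = sum(weights.get((cue, outcome), 0.0) for cue in cues)
        total += abs(act)
    return total


# Hypothetical weights for two competing trigrams.
weights = {
    ("in", "in de buurt"): 0.25,
    ("de", "in de buurt"): 0.125,
    ("buurt", "in de buurt"): 0.5,
    ("in", "in de weg"): 0.25,
    ("de", "in de weg"): 0.125,
    ("weg", "in de weg"): 0.5,
    ("buurt", "in de weg"): -0.125,
}
outcomes = ["in de buurt", "in de weg"]

print(prior(weights, "in de weg"))
print(activation_diversity(weights, ["in", "de", "buurt"], outcomes))
```

With these definitions, a trigram whose words also support many other trigrams yields a higher activation diversity, which is why the measure can stand in for neighborhood density.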

  17. Eye tracking
      Eye-tracking experiment
      ■ [Picture of an eye-tracker, an eye, or similar]

  18. Stimuli
      ■ Most common n-grams (trigrams) from corpus
      ■ OpenSoNaR corpus
      ■ Use frequencies extracted from a corpus of Dutch subtitles (N = 109,807,716)
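
Selecting the most common trigrams comes down to counting overlapping three-word windows. The following sketch shows the idea on a few invented Dutch sentences; it uses plain whitespace tokenization and does not reflect how the OpenSoNaR or subtitle corpora were actually processed.

```python
from collections import Counter


def count_trigrams(sentences):
    """Count word trigrams within each sentence (no windows across sentences)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - 2):
            counts[tuple(tokens[i:i + 3])] += 1
    return counts


# Invented example sentences, for illustration only.
sentences = [
    "ik weet het niet",
    "ik weet het wel",
    "dat weet ik niet",
]
trigram_counts = count_trigrams(sentences)
print(trigram_counts.most_common(1))
```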

  19. Procedure
      ■ Silent reading
      ■ Comprehension questions to ascertain attentive reading
      ■ 30 participants (10 male)
      ■ Analyzed using generalized additive mixed-effects models (GAMMs)

  20. Modeling data
      ■ See if and to what extent NDL measures give us more insight over and above more traditional frequency measures
      ■ Some frequency and NDL measures show a high amount of collinearity, e.g. ‘freqABC’ and ‘prior’
      ■ Models with just frequencies performed worse than models with both frequencies and NDL measures
      ■ Neighborhood density effects are best reflected by the Activation Diversity measure, which was a significant predictor in several models
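
A quick way to check for the kind of collinearity mentioned above is to correlate the two predictors before entering them into a model. The sketch below computes a Pearson correlation by hand; the two lists are placeholder numbers invented to exercise the function, not values from the study, and the variable names only mirror the predictor names on the slide.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder predictor values (invented): log trigram frequency and NDL prior.
freq_abc = [2.1, 3.4, 4.0, 5.2, 6.3]
prior_values = [0.8, 1.1, 1.5, 1.7, 2.2]

r = pearson_r(freq_abc, prior_values)
print(round(r, 2))
```

When two predictors correlate this strongly, their separate effects are hard to tell apart in a regression model, which is one reason to compare models with and without the NDL measures rather than rely on individual coefficients.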

  21. First fixation durations
      [Figure; axis labels: FreqC, ActDivTrigram, FreqABC, firstFixX]

  22. Second fixation durations
      [Figure; axis labels: length, secondFixX, prior, weight word 3]

  23. Number of fixations
      [Figure; axis labels: secondFixX, firstFixX]

  24. Discussion eye-tracking data
      ■ Effects of the trigram frequencies and of the third word are already present in the first fixation
      ■ Processes of top-down information (frequency effects), bottom-up information (activations) and uncertainty reduction (activation diversity/neighborhood effects)
      ■ Knowledge verification (frequencies): with higher frequencies, and if enough information is available, a reader spends more time in early measures; if not, a new fixation is planned as soon as possible
      ■ Bottom-up information (w3): when further into the trigram at your second fixation, it pays to spend more time to resolve things locally if the third word provides a lot of support for the trigram. If not, participants are faster to refixate
      ■ Uncertainty reduction (neighborhood density): if there are many competing trigrams, shorter looking times in first fixations and a higher number of fixations

  25. General discussion
      ■ Multi-word units are a relevant unit of storage (also in Dutch)
      ■ Both single words and the full trigram play a role
      ■ Adding measures from a discriminative model provides us with new insights into the processing of MWUs
      ■ Considering neighborhood density effects provides us with more insights into the workings of MWU processing
      ■ In processing of multi-word units, opposing forces of top-down information, bottom-up information and uncertainty reduction are at work

  26. Questions?

  27. Extra slides: production

  28. Production experiments

  29. Procedure
      ■ Same stimuli as used in the eye-tracking study
      ■ Word reading task
      ■ 30 participants (8 male)
      ■ Onsets and durations measured using Praat
      ■ Analyzed using generalized additive mixed-effects models (GAMMs)

  30. Production onsets

  31. Production durations

  32. A trade-off
      [Figure; axis labels: naming latencies, durations]

  33. Discussion production data
      ■ Processes of top-down information (frequency effects), bottom-up information (activations) and uncertainty reduction (activation diversity/neighborhood effects)
      ■ There is a trade-off between starting early and being able to pronounce the trigram fast
      ■ Top-down information slows you down at first, but makes total durations shorter (longer to plan, but easier motor program to execute)
      ■ Bottom-up information gives you a quick start but slows you down later (shorter to plan, but harder motor program to execute)
      ■ Neighborhood effects apparent in production durations: longer durations when the number of neighbors is different from the average (less motor practice)
