Motivating Example 2: Other languages display still more variation

Compositional Morphology for Word Representations and Language Modelling. Jan Botha, Phil Blunsom. ICML 2014, Beijing. Outline: Motivation, Proposed Method, Experiments.


  1. Compositional Morphology for Word Representations and Language Modelling. Jan Botha, Phil Blunsom. ICML 2014, Beijing.

  6. Motivating example
     What we see: "The king finally abdicated after years of unkingly conduct."
     Wait what – unkingly?
     unkingly /ʌnˈkɪŋli/: a word you have probably never seen, but still understand
     ⇒ compositional morphology in action
     What our models see (mostly): 10 2 95 529 11 88 21 50 74 239

  9. Motivating example 2: Other languages display still more variation
     Czech (conjugation): čistit (to clean), čistím, čistíš, čistí, čistíme, čistíte, čistil, čištěn, čisti, čistěte, čistěme
     Turkish (productive derivation):
     - Avrupa (Europe)
     - Avrupalı (of Europe)
     - Avrupalılaş (become of Europe)
     - Avrupalılaştır (to Europeanise)
     - Avrupalılaştırama (be unable to Europeanise)
     - Avrupalılaştıramadık (we were unable to Europeanise)
     - ...
     ⇒ we should model morphemes!

  11. Representing words
      - Discrete set? {a, aardvark, ..., account, accounted, accounting, ...}
      - Vector space? [Figure: the words a, aardvark, account and accounted placed in a plane with axes x1 and x2]

  12. Extract from Collobert & Weston embeddings

  17. Morpheme vectors
      Existing word vectors already capture some morphology:
      - $\overrightarrow{\text{banks}} - \overrightarrow{\text{bank}} \approx \overrightarrow{\text{kings}} - \overrightarrow{\text{king}} \approx \overrightarrow{\text{queens}} - \overrightarrow{\text{queen}}$ (Mikolov et al. 2013)
      Logical extension:
      - $\overrightarrow{\text{kings}} \approx \overrightarrow{\text{king}} + \overrightarrow{\text{-s}}$
      - $\overrightarrow{\text{unkingly}} \approx \overrightarrow{\text{un-}} + \overrightarrow{\text{king}} + \overrightarrow{\text{-ly}}$
      How to ...
      - obtain morpheme vectors
      - compose morpheme vectors
      - do it all within a language model usable in an MT decoder

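  21. Morphological composition as addition
      Literally, word = sum of its parts? Problems:
      - bag of morphemes: $\overrightarrow{\text{hang}} + \overrightarrow{\text{over}} \neq \overrightarrow{\text{over}} + \overrightarrow{\text{hang}}$ (addition ignores morpheme order)
      - non-compositionality: $\overrightarrow{\text{greenhouse}} \neq \overrightarrow{\text{green}} + \overrightarrow{\text{house}}$
      Pragmatic solution: include word identity as a component too:
      - $\overrightarrow{\text{greenhouse}} \equiv \overrightarrow{\text{greenhouse}}_{\text{id}} + \overrightarrow{\text{green}}_{\text{stem}} + \overrightarrow{\text{house}}_{\text{stem}}$
      - $\overrightarrow{\text{unkingly}} \equiv \overrightarrow{\text{unkingly}}_{\text{id}} + \overrightarrow{\text{un}}_{\text{pre}} + \overrightarrow{\text{king}}_{\text{stem}} + \overrightarrow{\text{ly}}_{\text{suf}}$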
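
The additive composition on this slide can be illustrated with a minimal sketch. The 2-dimensional vectors, the `FACTORS` table and the `compose` helper are hypothetical toy constructions, not values or code from the talk; the point is only that plain addition cannot tell "hang + over" from "over + hang", whereas adding a per-word identity vector can.

```python
from typing import Dict, List, Sequence

Vector = List[float]

# Toy 2-dimensional factor vectors (illustrative numbers, not learned values).
FACTORS: Dict[str, Vector] = {
    "hang_stem": [1.0, 0.0],
    "over_stem": [0.0, 1.0],
    "hangover_id": [0.5, -0.5],
    "overhang_id": [-0.5, 0.5],
}


def compose(word: str, morphemes: Sequence[str], use_identity: bool = True) -> Vector:
    """Sum the morpheme vectors of `word`, optionally adding its identity vector."""
    keys = list(morphemes)
    if use_identity:
        keys.append(word + "_id")
    # Element-wise sum over all selected factor vectors.
    return [sum(components) for components in zip(*(FACTORS[k] for k in keys))]


# Pure addition is order-invariant, so both words collapse to one vector.
bag_hangover = compose("hangover", ["hang_stem", "over_stem"], use_identity=False)
bag_overhang = compose("overhang", ["over_stem", "hang_stem"], use_identity=False)

# With the identity term, the two words get distinct representations.
full_hangover = compose("hangover", ["hang_stem", "over_stem"])
full_overhang = compose("overhang", ["over_stem", "hang_stem"])

print(bag_hangover == bag_overhang)    # True
print(full_hangover == full_overhang)  # False
```

In a trained model the identity vector would presumably be learned alongside the morpheme vectors, so it can absorb whatever the sum of parts fails to express.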

  22. Simplest vector-based probabilistic LM
      LBL (log-bilinear model) (Mnih & Hinton, 2007; Mnih & Teh, 2012)
      “colorless green ideas sleep furiously .”

  23. Add morpheme vectors inside LM
      LBL++
      “colorless green ideas sleep furiously .”
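
As a rough illustration of putting morpheme vectors inside a log-bilinear model, the sketch below scores each candidate word by a dot product between a predicted vector (built from the history) and the candidate's composed vector, then normalises with a softmax. Several things here are assumptions rather than details from the slides: the diagonal per-position weights, the separate context and target factor tables, the two-word history, and every name and number in the toy setup.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def compose(factors: Sequence[str], table: Dict[str, Vector]) -> Vector:
    """Word vector = element-wise sum of its factor vectors."""
    return [sum(components) for components in zip(*(table[f] for f in factors))]


def next_word_distribution(
    history: Sequence[str],
    word_factors: Dict[str, List[str]],
    context_table: Dict[str, Vector],
    target_table: Dict[str, Vector],
    position_weights: Sequence[Vector],
    bias: Dict[str, float],
) -> Dict[str, float]:
    """Log-bilinear next-word distribution over the whole vocabulary."""
    dim = len(position_weights[0])
    predicted = [0.0] * dim
    # Predicted vector: position-weighted sum of composed context-word vectors.
    for weights, word in zip(position_weights, history):
        q = compose(word_factors[word], context_table)
        for i in range(dim):
            predicted[i] += weights[i] * q[i]
    # Score each candidate by a dot product with its composed target vector.
    scores = {
        w: sum(p * r for p, r in zip(predicted, compose(fs, target_table))) + bias[w]
        for w, fs in word_factors.items()
    }
    # Softmax over the vocabulary (shifted by the max score for stability).
    top = max(scores.values())
    exps = {w: math.exp(s - top) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


# Toy setup: every number below is made up for illustration.
word_factors = {
    "king": ["king_id", "king_stem"],
    "kings": ["kings_id", "king_stem", "s_suf"],
    "unkingly": ["unkingly_id", "un_pre", "king_stem", "ly_suf"],
}
factor_names = sorted({f for fs in word_factors.values() for f in fs})
context_table = {f: [0.1 * (i + 1), -0.05 * i] for i, f in enumerate(factor_names)}
target_table = {f: [0.2 - 0.03 * i, 0.1 * i] for i, f in enumerate(factor_names)}
position_weights = [[1.0, 0.5], [0.5, 1.0]]  # one weight vector per history position
bias = {w: 0.0 for w in word_factors}

dist = next_word_distribution(
    ["king", "kings"], word_factors, context_table, target_table, position_weights, bias
)
print(dist)
```

Because "king", "kings" and "unkingly" all share the `king_stem` factor, an update to that one vector would move all three word representations at once, which is the intuition behind sharing morpheme vectors.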

  25. Computational efficiency
      Problem: each probability query requires normalisation over the vocabulary.
      - $O(\text{vocab size})$
      - rich morphology ⇒ large vocabulary
      Solution: decompose the model using word classes
      $$P(\text{word} \mid \text{history}) = P(\text{class}(\text{word}) \mid \text{history}) \times P(\text{word} \mid \text{class}(\text{word}), \text{history})$$
      - use unsupervised Brown clustering
      - each LM query becomes $2 \times O(\sqrt{\text{vocab size}})$ ⇒ fast enough for MT decoding
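
The class decomposition above can be sketched as two small softmaxes instead of one large one. The classes here are assigned by hand rather than by Brown clustering, and the scores stand in for model outputs given one fixed history; all names and numbers are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalise raw scores into probabilities (shifted for stability)."""
    top = max(scores.values())
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def class_factored_prob(
    word: str,
    word_to_class: Dict[str, str],
    class_members: Dict[str, List[str]],
    class_scores: Dict[str, float],
    word_scores: Dict[str, float],
) -> float:
    """P(word | history) = P(class | history) * P(word | class, history)."""
    cls = word_to_class[word]
    # First normalisation: over the classes only.
    p_class = softmax(class_scores)[cls]
    # Second normalisation: over the words inside the chosen class only.
    p_word = softmax({w: word_scores[w] for w in class_members[cls]})[word]
    return p_class * p_word


# Hand-assigned toy classes and made-up scores for one fixed history.
class_members = {"c0": ["king", "kings"], "c1": ["unkingly", "conduct"]}
word_to_class = {w: c for c, ws in class_members.items() for w in ws}
class_scores = {"c0": 1.2, "c1": 0.3}
word_scores = {"king": 0.9, "kings": 0.1, "unkingly": -0.4, "conduct": 0.7}

probs = {
    w: class_factored_prob(w, word_to_class, class_members, class_scores, word_scores)
    for w in word_to_class
}
print(probs)
print(sum(probs.values()))
```

Each query touches only the class scores plus the scores of one class's members. With roughly $\sqrt{V}$ classes of roughly $\sqrt{V}$ words each, that is about $2\sqrt{V}$ terms rather than $V$, and the factored probabilities still sum to one over the vocabulary.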

  27. Evaluation overview
      Setup:
      - 4-gram models
      - Czech, English, French, German, Spanish, Russian
      - train on 20–50m tokens
      - large vocabularies (exclude 5% of singletons)
      Three evaluation contexts:
      - Perplexity on test data
      - Word similarity rating
      - Machine translation

  29. Perplexity improvements by language, CLBL → CLBL++
      [Bar chart: % perplexity improvement (y-axis, 0 to 6) for CS, DE, EN, ES, FR, RU. Bar labels (perplexity, CLBL → CLBL++): 683 → 643, 422 → 404, 313 → 300, 281 → 273, 207 → 203, 232 → 227.]
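
The percentages on the chart's y-axis can be recovered from the perplexity pairs on this slide as relative reductions. The sketch below only does that arithmetic; the helper name `relative_reduction` is an illustrative choice, and the pairs are listed in the order they appear in the slide text, without assigning them to languages.

```python
from typing import List, Tuple

# Perplexity pairs (CLBL, CLBL++) as they appear on the slide.
PAIRS: List[Tuple[int, int]] = [
    (683, 643),
    (422, 404),
    (313, 300),
    (281, 273),
    (207, 203),
    (232, 227),
]


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which perplexity drops from `before` to `after`."""
    return 100.0 * (before - after) / before


for before, after in PAIRS:
    print(f"{before} -> {after}: {relative_reduction(before, after):.1f}%")
# 683 -> 643: 5.9%
# 422 -> 404: 4.3%
# 313 -> 300: 4.2%
# 281 -> 273: 2.8%
# 207 -> 203: 1.9%
# 232 -> 227: 2.2%
```

All six reductions fall between roughly 2% and 6%, consistent with the 0 to 6 range of the chart's axis.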

  30. Perplexity improvements on German, CLBL → CLBL++ (break-down by token frequency)
      [Bar chart: % improvement (y-axis, 0 to 20) against bins of test token frequency (x-axis, bins from 0 up to <10^7).]
