Non-Monotonic Sequential Text Generation

Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho



  1. Non-Monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

  2. Sequential Text Generation: $Y = (y_1, y_2, \ldots, y_N)$, e.g. (hi, how, are, you, ?)

  3. Sequential Text Generation: Unconditional. A policy $\pi$ generates sequences, $Y \sim \pi$: (hi, how, are, you, ?), (good, to, see, you, !), (what, time, is, it, ?), …

  4. Sequential Text Generation: Conditional. A policy $\pi$ (Transformer, LSTM, …) maps an input $X$ to an output $Y$, $X \to Y$: 元気ですか? → (how, are, you, ?)

  5. Sequential Text Generation: Monotonic. The tokens of "how are you ?" are generated left to right by $\pi(a_1 \mid s_1)$, $\pi(a_2 \mid s_2)$, $\pi(a_3 \mid s_3)$, $\pi(a_4 \mid s_4)$. Action: a token. State: e.g. (how, are, X).

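  6. Sequential Text Generation: Non-Monotonic. The tokens are generated in an order other than left to right, here are, how, ?, you, by $\pi(a_1 \mid s_1)$, $\pi(a_2 \mid s_2)$, $\pi(a_3 \mid s_3)$, $\pi(a_4 \mid s_4)$. The final sentence is still "how are you ?".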

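  7. Binary Tree Generating Policy [figure: the root token "are" is chosen from the vocabulary (…, how, are, you, ?, the, …); its two child slots are still empty, one with candidates (…, how, …) and the other with candidates (…, you, ?, …)]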

  8. Binary Tree Generating Policy [figure: root "are" with children "how" and "?"; "you" is generated below "?" from candidates (…, you, …); the five remaining slots are closed with ∅]

  9. Binary Tree Generating Policy [figure: the finished tree with nodes are, how, ?, you and five ∅ leaves]. An in-order traversal of the tree yields "how are you ?".

  10. Binary Tree Generating Policy [figure: several differently shaped trees over the tokens how, are, you, ? with ∅ leaves; the individual tree layouts are not recoverable from the extracted text]
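
The tree slides above can be illustrated with a small sketch. It assumes the tree is filled in level order (breadth first) and uses a hypothetical `END` string in place of the ∅ token; the names `Node`, `build_level_order`, and `in_order` are illustrative and are not taken from the authors' code.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional

END = "<end>"  # hypothetical stand-in for the ∅ token on the slides


@dataclass
class Node:
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_level_order(actions: List[str]) -> Optional[Node]:
    """Fill a binary tree slot by slot in level order (an assumption here).

    Each action is either a token, which opens two new child slots,
    or END, which closes the current slot.
    """
    it = iter(actions)
    first = next(it)
    if first == END:
        return None
    root = Node(first)
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for side in ("left", "right"):
            tok = next(it)
            if tok != END:
                child = Node(tok)
                setattr(node, side, child)
                queue.append(child)
    return root


def in_order(node: Optional[Node]) -> List[str]:
    """Read the sentence off the tree: left subtree, node, right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.token] + in_order(node.right)


# Generation order from the slides: are, how, ?, you, plus five END actions.
actions = ["are", "how", "?", END, END, "you", END, END, END]
sentence = in_order(build_level_order(actions))
```

Under these assumptions, the nine actions build the tree from slide 8, and the traversal recovers the tokens in sentence order.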

  11. Imitation Learning. Define an oracle $\pi^*(a_t \mid s_t, X, Y)$. Sample sequences $(a_1, \ldots, a_T) \sim \pi^*$. Minimize the cost $\mathrm{KL}[\pi^*(\cdot \mid s_t), \pi_\theta(\cdot \mid s_t)]$.
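
As a rough illustration of the per-state cost on this slide, the sketch below computes a KL divergence between two distributions over actions stored as plain dictionaries. The function name `kl_divergence`, the `eps` floor, and the example probabilities are assumptions made for illustration only.

```python
import math
from typing import Dict


def kl_divergence(p: Dict[str, float], q: Dict[str, float],
                  eps: float = 1e-12) -> float:
    """KL[p, q] = sum over a of p(a) * log(p(a) / q(a)).

    Actions with p(a) = 0 contribute nothing; q(a) is floored at eps
    (an illustrative choice) to avoid dividing by zero.
    """
    total = 0.0
    for action, pa in p.items():
        if pa > 0.0:
            total += pa * math.log(pa / max(q.get(action, 0.0), eps))
    return total


# Hypothetical distributions at one state s_t.
oracle = {"how": 0.5, "you": 0.5, "are": 0.0}   # plays the role of pi*
policy = {"how": 0.25, "you": 0.25, "are": 0.5}  # plays the role of pi_theta
cost = kl_divergence(oracle, policy)
```

With the oracle as the first argument, only actions the oracle supports enter the sum, so the cost falls as the learned policy moves probability onto those actions.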

  12. Oracles. Oracle: only puts mass on valid actions. $\pi^*_{\text{uniform}}$ [figure: a partially built tree over the target sequence A B C D E, with ∅ leaves]

  13. Oracles. Oracle: only puts mass on valid actions. $\pi^*_{\text{uniform}}$ [figure: a partially built tree over the target sequence A B C D E, with ∅ leaves] $\mathcal{L}_1 = \mathrm{KL}(\pi^*_{\text{uniform}}, \pi_\theta)$

  14. Oracles. Left-right: only puts mass on the "left-most" valid action. $\pi^*_{\text{left-right}}$ [figure: a tree over the target sequence A B C D E built as a chain A, B, C, D, with ∅ leaves]
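
The two oracles above can be sketched as functions of the span of target tokens that a tree slot still has to produce. This is a minimal sketch under stated assumptions: "valid actions" is read as "any token in the slot's span, or ∅ when the span is empty", the uniform oracle spreads mass over the distinct tokens of the span, and all names (`uniform_oracle`, `left_right_oracle`, `split_span`, `END`) are hypothetical.

```python
from typing import Dict, List, Tuple

END = "<end>"  # hypothetical stand-in for the ∅ token


def uniform_oracle(span: List[str]) -> Dict[str, float]:
    """Equal mass on every distinct token of the span; END if it is empty."""
    if not span:
        return {END: 1.0}
    valid = sorted(set(span))
    return {tok: 1.0 / len(valid) for tok in valid}


def left_right_oracle(span: List[str]) -> Dict[str, float]:
    """All mass on the left-most token of the span; END if it is empty."""
    if not span:
        return {END: 1.0}
    return {span[0]: 1.0}


def split_span(span: List[str], index: int) -> Tuple[List[str], List[str]]:
    """After emitting span[index], the left child inherits the tokens
    before it and the right child inherits the tokens after it."""
    return span[:index], span[index + 1:]


target = ["A", "B", "C", "D", "E"]
uniform = uniform_oracle(target)         # 0.2 on each of A..E
left_right = left_right_oracle(target)   # {"A": 1.0}
left_span, right_span = split_span(target, target.index("B"))
```

Choosing B at the root leaves the span `["A"]` for the left child and `["C", "D", "E"]` for the right child, and each child slot then applies the same oracle to its own span. Always taking the left-most token gives every node an empty left span, which yields a chain-shaped tree.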

  15. Coaching. Weight correct actions by the learned policy: $\pi^*_{\text{coaching}} \propto \pi^*_{\text{uniform}} \odot \pi_\theta$ [figure: a partially built tree over the target sequence A B C D E]

  16. Coaching. Weight valid actions by the learned policy: $\pi^*_{\text{coaching}} \propto \pi^*_{\text{uniform}} \odot \pi_\theta$ [figure: a partially built tree over the target sequence A B C D E]. The loss reinforces preferred orders: $\mathrm{KL}(\pi^*_{\text{coaching}}, \pi_\theta)$
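
The coaching rule above, an element-wise product followed by renormalization, can be sketched as follows. The function name `coaching_oracle`, the example probabilities, and the fallback used when the learned policy puts zero mass on every valid action are assumptions for illustration and are not taken from the slides.

```python
from typing import Dict


def coaching_oracle(oracle: Dict[str, float],
                    policy: Dict[str, float]) -> Dict[str, float]:
    """pi*_coaching(a) proportional to pi*_uniform(a) * pi_theta(a)."""
    weighted = {a: p * policy.get(a, 0.0)
                for a, p in oracle.items() if p > 0.0}
    z = sum(weighted.values())
    if z == 0.0:
        # Assumed fallback: return the oracle's own valid-action mass.
        return {a: p for a, p in oracle.items() if p > 0.0}
    return {a: w / z for a, w in weighted.items()}


# Hypothetical state: A and C are valid, B is not.
uniform = {"A": 0.5, "C": 0.5}
policy = {"A": 0.6, "B": 0.2, "C": 0.2}
coached = coaching_oracle(uniform, policy)  # about 0.75 on A, 0.25 on C
```

Invalid actions still receive no mass. Among the valid ones, the target leans toward the action the learned policy already prefers, which is how the loss can reinforce a preferred generation order.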

  17. Results | Unconditional

  18. Results | Unconditional

  19. Results | Conditional Word Reordering

  20. Results | Conditional Machine Translation

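  21. Results | Variable-Sized Text Infilling [figure: infilling samples drawn from a Left-Right policy and from a Non-Monotonic policy; the conditioning contexts are not recoverable from the extracted text]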

  22. Results | Variable-Sized Text Infilling

  23. • Code & Pre-trained Models: https://github.com/wellecks/nonmonotonic_text
      • Poster #45 (Pacific Ballroom)

  24. • Code & Pre-trained Models: https://github.com/wellecks/nonmonotonic_text
      • Poster #45 (Pacific Ballroom)
      Thank you!
