A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators


  1. A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators Presenter: Lu Ji Zhihao Fan¹, Zhongyu Wei¹, Siyuan Wang¹, Yang Liu², Xuanjing Huang³ ¹School of Data Science, Fudan University, China ²Liulishuo Company ³School of Computer Science, Fudan University, China

  2. Outline § Introduction § Framework § Experiment § Conclusion

  3. Natural question generation § Generating a natural question which can potentially engage a human in starting a conversation (Mostafazadeh et al., 2016). [Mostafazadeh et al., 2016] Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, and Lucy Vanderwende. 2016. Generating natural questions about an image. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

  4. Existing approaches Approaches: • Retrieval-based models • Seq2Seq and its variants, which are committed to better fitting the labeled data with an NLL loss. Limitations: • Naturalness is not emphasized in these models • No knowledge about unnatural questions • Hard to measure progress in generating natural questions

  5. Compare with Questions in VQA and VQG § VQA questions are much simpler and can be easily answered using information from the source image directly. § VQG questions are more complex and their answers are not trivial.

  6. Compare with Questions in VQA and VQG § Regard questions from VQA as negative samples for VQG to train the generator in an adversarial learning fashion.

  7. Contributions § We consider question generation as a language generation task with specific attributes in terms of content and linguistics, i.e., interesting and human-written. § For the attribute of being human-written, we use a generative adversarial network (GAN) to learn a dynamic discriminator that distinguishes human-generated questions from machine-generated questions. § For the attribute of being natural, we use questions from VQA as negative samples and questions from VQG as positive samples to train a static discriminator.

  8. Outline § Introduction § Framework § Experiment § Conclusion

  9. Framework

  10. Structure for Question Distribution § An overall domain $\mathcal{Q}$ for all the questions. § According to the linguistic attribute, we split $\mathcal{Q}$ into two antithetic domains $\mathcal{Q}_m$ (machine-generated) and $\mathcal{Q}_h$ (human-written). § According to the content attribute natural, we further split $\mathcal{Q}_h$ into two antithetic domains $\mathcal{Q}_{hn}$ (natural) and $\mathcal{Q}_{hd}$ (descriptive). • $\mathcal{Q} = \mathcal{Q}_m \cup \mathcal{Q}_h$, $\mathcal{Q}_h = \mathcal{Q}_{hn} \cup \mathcal{Q}_{hd}$ • $\mathcal{Q}_{hn} \subset \mathcal{Q}_h \subset \mathcal{Q}$

  11. Bi-discriminator Configuration § Dynamic discriminator $D_1$ is proposed to distinguish human-written questions from machine-generated questions. § It is used to guide the generator to produce questions closer to samples from the domain $\mathcal{Q}_h$. § $L_{D_1} = -\mathbb{E}_{q \sim \mathcal{Q}_m}[\log(1 - D_1(q))] - \mathbb{E}_{q \sim \mathcal{Q}_h}[\log D_1(q)]$
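The $L_{D_1}$ objective above is the standard binary cross-entropy used to train a GAN discriminator. Below is a minimal PyTorch sketch, assuming $D_1$ is any model that maps a batch of questions to probabilities of being human-written; the function and argument names are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def d1_loss(d1, human_questions, machine_questions):
    """L_{D1} = -E_{q~Q_m}[log(1 - D1(q))] - E_{q~Q_h}[log D1(q)]."""
    p_human = d1(human_questions)      # probabilities in (0, 1), shape: (batch,)
    p_machine = d1(machine_questions)  # probabilities in (0, 1), shape: (batch,)
    # Human-written questions are the "real" class (target 1);
    # machine-generated questions are the "fake" class (target 0).
    loss_real = F.binary_cross_entropy(p_human, torch.ones_like(p_human))
    loss_fake = F.binary_cross_entropy(p_machine, torch.zeros_like(p_machine))
    return loss_real + loss_fake
```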

  12. Bi-discriminator Configuration § Static discriminator $D_2$ is proposed to distinguish natural questions from descriptive questions. § $p_n(q, I) = \begin{cases} P(q \in \mathcal{Q}_{hn} \mid I), & q \in \mathcal{Q}_{hn} \\ P(q \in \mathcal{Q}_{hd} \mid I), & q \in \mathcal{Q}_{hd} \end{cases}$ § $L_{D_2} = -\left(1 - p_n(q, I)\right)^{\gamma} \log p_n(q, I)$, a focal-style loss on the probability of the correct class.
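Read this way, the $(1 - p_n)^{\gamma}$ factor down-weights easy, well-classified examples, which helps when the descriptive (VQA) questions heavily outnumber the natural (VQG) ones. A minimal sketch under that reading; the names and the value of gamma are assumptions for illustration, not taken from the paper.

```python
import torch

def d2_loss(p_natural, labels, gamma=2.0):
    """Focal-style loss L_{D2} = -(1 - p_t)^gamma * log(p_t).

    p_natural: predicted probability that each (question, image) pair is natural.
    labels:    1 for natural (VQG) questions, 0 for descriptive (VQA) questions.
    gamma:     focusing exponent; 2.0 is an assumed default.
    """
    # Probability assigned to the true class of each example.
    p_t = torch.where(labels == 1, p_natural, 1.0 - p_natural)
    return (-(1.0 - p_t) ** gamma * torch.log(p_t + 1e-8)).mean()
```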

  13. Optimize with Reinforcement Learning
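This slide presents the training loop only as a figure. In the framework described above, the generator can be viewed as a policy rewarded by the two discriminators and updated with a REINFORCE-style gradient; the sketch below illustrates that idea. The `generator.sample` interface, the weighted-sum reward, and the coefficient `alpha` are assumptions for illustration, not the authors' exact formulation.

```python
import torch

def reinforce_step(generator, d1, d2, images, optimizer, alpha=0.5):
    # Sample one question per image together with its per-token log-probabilities.
    # `generator.sample` is an assumed interface returning (questions, log_probs).
    questions, log_probs = generator.sample(images)
    with torch.no_grad():
        # Combine the dynamic (human-written) and static (natural) scores into a
        # single scalar reward per question; a weighted sum is assumed here.
        reward = alpha * d1(questions) + (1.0 - alpha) * d2(questions, images)
    # REINFORCE: minimize -reward * log p(question | image).
    loss = -(reward * log_probs.sum(dim=1)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), reward.mean().item()
```

In practice a baseline is usually subtracted from the reward to reduce the variance of the gradient estimate.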

  14. Outline § Introduction § Framework § Experiment § Conclusion

  15. Dataset § MSCOCO part of Visual Question Generation (VQG) § It contains 2,500, 1,250 and 1,250 images for training, validation and testing, respectively. § Each image is accompanied by 5 natural questions produced by human annotators. § VQA is used to train the static discriminator. § For each image in VQA, three questions are collected. § It contains about 80,000, 40,000 and 80,000 images for training, validation and testing, respectively.

  16. Models for Comparison § KNN: retrieves a question from those of similar images. § Img2Seq: generates a question from image features in a Seq2Seq fashion. § Img2Seq_pretrain: Img2Seq pre-trained on VQA. § MIXER-BLEU-4: optimizes BLEU-4 directly with RL and curriculum learning. § Reinforce_D1: utilizes $D_1$ to guide the training of the generator. § Reinforce_D2: utilizes $D_2$ to guide the training of the generator. § Reinforce_D1+D2: our proposed model, guided by both discriminators.

  17. Automatic Evaluation

  Model             | BLEU-4 | METEOR | ROUGE  | CIDEr  | Corpus BLEU-4
  KNN               | 37.062 | 19.799 | 22.413 | 52.324 | 50.199
  Img2Seq           | 36.744 | 21.028 | 23.125 | 54.089 | 51.171
  Img2Seq_pretrain  | 37.522 | 22.106 | 23.877 | 53.310 | 54.076
  MIXER-BLEU-4      | 41.674 | 24.808 | 24.382 | 57.777 | 60.527
  Reinforce_D1      | 38.945 | 24.420 | 24.665 | 56.196 | 59.513
  Reinforce_D2      | 40.063 | 25.237 | 25.492 | 57.503 | 61.745
  Reinforce_D1+D2   | 41.098 | 26.265 | 25.634 | 57.679 | 63.388

  18. Human Evaluation § 200 images are sampled. § Questions from different systems are presented for annotation. § 2 annotators rate each question with 3-level grades, where 3 is the most interesting.

  Model            | # of 1 | # of 2 | # of 3 | Avg. score
  KNN              |  214   |  120   |   66   | 1.63
  Img2Seq          |  182   |  147   |   71   | 1.72
  MIXER-BLEU-4     |  153   |  172   |   75   | 1.81
  Reinforce_D1     |  167   |  153   |   80   | 1.78
  Reinforce_D1+D2  |  149   |  160   |   91   | 1.86
  Ground-Truth     |   50   |   70   |  271   | 2.55
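As a sanity check on the averages: each system receives 400 ratings (200 images × 2 annotators), and the average score is the rating-weighted mean. For KNN, for example, (1×214 + 2×120 + 3×66) / 400 = 652 / 400 ≈ 1.63, matching the table.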

  19. Examples

  20. Outline § Introduction § Framework § Experiment § Conclusion

  21. Conclusion and Future Work • We propose a reinforcement learning framework for natural question generation which incorporates two discriminators to take two specific attributes of natural questions into consideration. • It can be generalized to other attributes easily. • It relies on labeled datasets to train the discriminators. • An unsupervised approach is needed.

  22. A Reinforcement Learning Framework for Natural Question Generation Using Bi-discriminators For more information, please contact 1430080043@fudan.edu.cn http://www.sdspeople.fudan.edu.cn/zywei/ Zhihao Fan¹, Zhongyu Wei¹, Siyuan Wang¹, Yang Liu², Xuanjing Huang³ ¹School of Data Science, Fudan University, China ²Liulishuo Company ³School of Computer Science, Fudan University, China
