A Transition-Based Directed Acyclic Graph Parser for Universal Conceptual Cognitive Annotation

Daniel Hershcovich, Omri Abend and Ari Rappoport, ACL 2017


  1. A Transition-Based Directed Acyclic Graph Parser for Universal Conceptual Cognitive Annotation. Daniel Hershcovich, Omri Abend and Ari Rappoport. ACL 2017.

  4. TUPA: Transition-based UCCA Parser. The first parser to support the combination of three properties: (1) non-terminal nodes: entities and events over the text; (2) reentrancy: allows argument sharing; (3) discontinuity: conceptual units are split. All three are needed for many semantic schemes (e.g. AMR, UCCA). [Figure: example sentence "You want to take a long bath".]

  5. Introduction

  6. Linguistic Structure Annotation Schemes: syntactic dependencies; semantic dependencies (Oepen et al., 2016). [Figure: bilexical dependencies over "You want to take a long bath". Syntactic (UD) edge labels: root, dobj, xcomp, det, nsubj, mark, amod. Semantic (DM) edge labels: top, ARG1, ARG2, BV.]

  7. Linguistic Structure Annotation Schemes: syntactic dependencies; semantic dependencies (Oepen et al., 2016); semantic role labeling (PropBank, FrameNet); AMR (Banarescu et al., 2013); UCCA (Abend and Rappoport, 2013); other semantic representation schemes (see the recent survey by Abend and Rappoport, 2017). Semantic representation schemes attempt to abstract away from syntactic detail that does not affect meaning: "... bathed" = "... took a bath".

  8. The UCCA Semantic Representation Scheme

  9. Universal Conceptual Cognitive Annotation (UCCA): cross-linguistically applicable (Abend and Rappoport, 2013); stable in translation (Sulem et al., 2015). [Figure: English and Hebrew example annotations.]

  10. Universal Conceptual Cognitive Annotation (UCCA): rapid and intuitive annotation interface (Abend et al., 2017), usable by non-experts: ucca-demo.cs.huji.ac.il. Facilitates semantics-based human evaluation of machine translation (Birch et al., 2016): ucca.cs.huji.ac.il/mteval

  11. Graph Structure: UCCA generates a directed acyclic graph (DAG). Text tokens are terminals; complex units are non-terminal nodes. Remote edges enable reentrancy for argument sharing. Phrases may be discontinuous (e.g., multi-word expressions). [Figure: UCCA graph for "You want to take a long bath", with solid primary edges and dashed remote edges. Edge labels: A = participant, P = process, C = center, D = adverbial, F = function.]

  12. Transition-based UCCA Parsing

  15. Transition-Based Parsing: first used for dependency parsing (Nivre, 2004). Parse text w_1 ... w_n to graph G incrementally by applying transitions to the parser state: stack, buffer and constructed graph. [Figure: initial stack and buffer for "You want to take a long bath".] TUPA transitions: {Shift, Reduce, Node_X, Left-Edge_X, Right-Edge_X, Left-Remote_X, Right-Remote_X, Swap, Finish}, where X is an edge label. Together they support non-terminal nodes, reentrancy and discontinuity.

  16. Example: slides 16 to 48 replay the parse of "You want to take a long bath" one transition per slide; each slide's figure (stack, buffer and partial graph) is not reproduced here. ⇒ Shift

  17. Example ⇒ Right-Edge A

  18. Example ⇒ Shift

  19. Example ⇒ Swap

  20. Example ⇒ Right-Edge P

  21. Example ⇒ Reduce

  22. Example ⇒ Shift

  23. Example ⇒ Shift

  24. Example ⇒ Node F

  25. Example ⇒ Reduce

  26. Example ⇒ Shift

  27. Example ⇒ Shift

  28. Example ⇒ Node C

  29. Example ⇒ Reduce

  30. Example ⇒ Shift

  31. Example ⇒ Right-Edge P

  32. Example ⇒ Shift

  33. Example ⇒ Right-Edge F

  34. Example ⇒ Reduce

  35. Example ⇒ Shift

  36. Example ⇒ Swap

  37. Example ⇒ Right-Edge D

  38. Example ⇒ Reduce

  39. Example ⇒ Swap

  40. Example ⇒ Right-Edge A

  41. Example ⇒ Reduce

  42. Example ⇒ Reduce

  43. Example ⇒ Shift

  44. Example ⇒ Shift

  45. Example ⇒ Left-Remote A

  46. Example ⇒ Shift

  47. Example ⇒ Right-Edge C

  48. Example ⇒ Finish

  49. Training: an oracle provides the transition sequence given the correct graph. [Figure: gold UCCA graph for "You want to take a long bath".] ⇓ Shift, Right-Edge A, Shift, Swap, Right-Edge P, Reduce, Shift, Shift, Node F, Reduce, Shift, Shift, Node C, Reduce, Shift, Right-Edge P, Shift, Right-Edge F, Reduce, Shift, Swap, Right-Edge D, Reduce, Swap, Right-Edge A, Reduce, Reduce, Shift, Shift, Left-Remote A, Shift, Right-Edge C, Finish

  50. TUPA Model: learn to greedily predict the next transition based on the current state. Experimenting with three classifiers. Sparse: perceptron with sparse features (Zhang and Nivre, 2011). MLP: embeddings + feedforward NN (Chen and Manning, 2014). BiLSTM: embeddings + deep bidirectional LSTM + MLP (Kiperwasser and Goldberg, 2016). Features: words, POS, syntactic dependencies, existing edge labels from the stack and buffer + parents, children, grandchildren; ordinal features (height, number of parents and children). [Figure: stack and buffer feeding the classifier.]
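
The transition set on slide 15 and the oracle sequence on slide 49 can be tried out with a minimal Python sketch. The slides do not spell out what each transition does, so the semantics below are assumptions: Shift moves the first buffer item onto the stack, Reduce pops the stack, Node X creates a new parent of the stack top and places it at the front of the buffer, Right-Edge X and Left-Edge X link the top two stack items (with the second-from-top or the top as parent, respectively), the Remote variants do the same with a remote edge, and Swap moves the second stack item back to the buffer. The names `State`, `apply`, `parse`, the `root` placeholder and the generated node names `n1`, `n2` are hypothetical and are not taken from TUPA's code.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Parser state: stack, buffer and the graph built so far."""
    stack: list
    buffer: list
    # Each edge is (parent, child, label, is_remote).
    edges: list = field(default_factory=list)
    new_nodes: int = 0
    finished: bool = False


def apply(state, transition):
    """Apply one transition such as 'Shift' or 'Right-Edge A' (assumed semantics)."""
    name, _, label = transition.partition(" ")
    stack, buffer = state.stack, state.buffer
    if name == "Shift":
        stack.append(buffer.pop(0))
    elif name == "Reduce":
        stack.pop()
    elif name == "Node":
        # New non-terminal becomes the parent of the stack top and waits on the buffer.
        state.new_nodes += 1
        node = f"n{state.new_nodes}"
        state.edges.append((node, stack[-1], label, False))
        buffer.insert(0, node)
    elif name in ("Right-Edge", "Right-Remote"):
        # Second-from-top is the parent, top is the child.
        state.edges.append((stack[-2], stack[-1], label, name.endswith("Remote")))
    elif name in ("Left-Edge", "Left-Remote"):
        # Top is the parent, second-from-top is the child.
        state.edges.append((stack[-1], stack[-2], label, name.endswith("Remote")))
    elif name == "Swap":
        # Move the second stack item back to the buffer (handles discontinuity).
        buffer.insert(0, stack.pop(-2))
    elif name == "Finish":
        state.finished = True
    else:
        raise ValueError(f"unknown transition: {transition}")


def parse(tokens, predict):
    """Greedy loop: ask `predict` for the next transition until Finish."""
    state = State(stack=["root"], buffer=list(tokens))
    while not state.finished:
        apply(state, predict(state))
    return state


# Oracle sequence copied from slide 49.
ORACLE = (
    "Shift, Right-Edge A, Shift, Swap, Right-Edge P, Reduce, Shift, Shift, "
    "Node F, Reduce, Shift, Shift, Node C, Reduce, Shift, Right-Edge P, Shift, "
    "Right-Edge F, Reduce, Shift, Swap, Right-Edge D, Reduce, Swap, "
    "Right-Edge A, Reduce, Reduce, Shift, Shift, Left-Remote A, Shift, "
    "Right-Edge C, Finish"
).split(", ")

steps = iter(ORACLE)
# The "classifier" here simply replays the oracle and ignores the state.
final = parse("You want to take a long bath".split(), lambda _state: next(steps))

for parent, child, label, remote in final.edges:
    print(f"{parent} -{label}{' (remote)' if remote else ''}-> {child}")
```

Swapping the oracle lambda for a trained classifier's prediction would give the greedy parsing loop described on slide 50. Under the assumed semantics, the replay consumes the whole buffer and builds ten edges, one of them remote, which appears consistent with the ten edge labels shown in the example graph (A P A, F P D, A, C C F).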
