

  1. Towards the Naive Classification of Rhetorical Relations at Scale Georg Rehm DFKI GmbH Alt-Moabit 91c, 10559 Berlin Workshop on Coherence Relations Humboldt-Universität zu Berlin January 17-18, 2020

  2. Storytelling Theory • Storytelling = a human technique for ordering a series of events in the world and finding meaningful patterns in them (Bruner 1991) • Organise events into a schematic structure, for example in terms of topic, locality or causal relationships, and construct explanatory models of the world and the events happening in it • Semantic Storytelling = an attempt to translate the theories of storytelling into a formal, machine-processable scheme

  3. Semantic Storytelling • Develop a system that, given an incoming document collection, is able to (semi-)automatically extract or generate different story paths or plot lines • The goal is to support knowledge workers (journalists, authors, scholars, politicians, business analysts, etc.) in their daily work of processing huge amounts of incoming content • Helps them quickly grasp what is going on in a collection

  4. Previous work: NLP-Pipeline based approach • Combine various text analysis procedures in a pipeline (NER, Coreference Resolution, Relation Extraction, etc.) • Connect extracted entities to knowledge bases • Use rule-based story grammars

  5. Previous work: NLP-Pipeline based approach [Figure: pipeline diagram] 1. NER (entities like persons, locations, organizations, temporal expressions) 2. Relation Extraction (detect relations between entities) 3. Timelining (anchor entities and relations in time) 4. Event Detection 5. Topic Detection 6. Building datasets for patterns of narration 7. Train model on the basis of the dataset 8. Visualizing results

  6. Now: Discourse-parsing inspired approach • Scalable: text segments can be phrases, sentences, paragraphs or whole texts • Relate text segments to each other using sense taxonomies from research on coherence relations • Goal: automate storytelling by detecting discourse relations between text segments from different sources on the same topic • Makes it possible to detect and create new storylines extracted from a document collection • In future work: combine both approaches

  7. Semantic Storytelling: Technical Description • Initialization: the user defines a topic T, initialized as a sentence, keyword or named entity • The Semantic Storytelling tool will: 1. Determine the relevance of a segment for the topic 2. Determine the importance of a segment 3. Determine the discourse or semantic relation between two segments
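The three-step procedure above can be sketched as a small Python skeleton. Everything below is illustrative: the scoring heuristics (word overlap, segment length) and the function names are stand-ins for the actual components, and the relation classifier is stubbed out.

```python
# Hypothetical skeleton of the three-step Semantic Storytelling loop.
# The scoring heuristics are placeholders, not the authors' implementation.

PDTB_SENSES = ["Temporal", "Contingency", "Comparison", "Expansion", "None"]

def relevance(segment: str, topic: str) -> float:
    """Step 1: crude relevance via word overlap with the topic."""
    seg, top = set(segment.lower().split()), set(topic.lower().split())
    return len(seg & top) / len(seg | top) if seg | top else 0.0

def importance(segment: str) -> float:
    """Step 2: placeholder importance score (here simply segment length)."""
    return float(len(segment.split()))

def discourse_relation(segment: str, topic: str) -> str:
    """Step 3: stub for the PDTB2 top-level sense classifier."""
    return "None"  # a trained model would predict one of PDTB_SENSES

def storytelling(segments, topic, k=3):
    """Rank segments by relevance, keep the top k, attach scores and relations."""
    ranked = sorted(segments, key=lambda s: relevance(s, topic), reverse=True)
    return [(s, importance(s), discourse_relation(s, topic)) for s in ranked[:k]]
```

For example, given the topic "Berlin district Moabit", a segment mentioning Moabit and Berlin ranks above an unrelated one.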

  8. Semantic Storytelling Architecture [Figure: architecture diagram. Incoming content (a self-contained document collection, web content, Wikipedia) feeds three steps: (1) determine the relevance of a segment for the topic T, yielding document relevance, segment relevance and a ranked list of text segments; (2) determine the importance of a segment (isMoreImportantThan / isLessImportantThan); (3) determine the discourse relation between a segment and the topic, e.g. Comparison or Expansion. Possible instantiations of T: complete document, sentence, summary, claim or fact, segment, named entity, event. The results support a user generating stories via the "Explore The Neighbourhood!" GUI.]

  9. Step 1: Relevance of a Segment • Is segment x relevant for segment t? • Use, for example: – Topic modelling – Topic overlap or entity overlap – Text similarity or document similarity
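The entity-overlap option can be illustrated with a Jaccard coefficient over entity sets. The helper names and the threshold below are assumptions, and a real system would obtain the entity sets via NER rather than by hand.

```python
# Sketch of the "entity overlap" option for Step 1: Jaccard similarity
# between the entity (or keyword) sets of two segments. The entity sets
# here are given by hand; a real system would obtain them via NER.

def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; 0.0 for two empty sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def is_relevant(entities_x, entities_t, threshold=0.2):
    """Treat segment x as relevant for topic segment t above a threshold.
    The 0.2 threshold is an invented example value."""
    return jaccard(set(entities_x), set(entities_t)) >= threshold
```

For instance, a segment mentioning {Tucholsky, Moabit} overlaps with a topic segment mentioning {Moabit, Berlin} with Jaccard 1/3 and would count as relevant at this threshold.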

  10. Step 2: Importance of a Segment • How important or central is the information contained in a segment for a topic? • In RST terms: determine the nucleus (vs. satellite) • Possible application in a Question Answering task: Is segment x a potential answer for segment t?

  11. Step 3: Discourse Relations • Find the discourse or semantic relation between a text segment and the topic T • From Rhetorical Structure Theory (Mann and Thompson 1988) we borrow the idea that discourse relations exist between larger sequences of text (i.e., non-elementary discourse units) • These relations contribute to the coherence of a text

  12. Step 3: Discourse Relations • For our experiments, we adopt the top-level senses of the Penn Discourse Treebank, with which we can describe these discourse relations: – Temporal – Contingency – Comparison – Expansion, and an additional label – None

  13. Discourse Relations according to Penn Discourse Treebank (2.0)

  14. Step 3: Discourse Relations • For training, we use the two arguments of a relation, but at a later point we deploy the classifier on individual sentences • We argue that the sentence level is the most appropriate level of input for our classifier and that the discrepancy between argument shapes and typical sentence lengths is tolerable

  15. Use Case: “Explore the Neighbourhood!” • The goal is to help a knowledge worker develop a mobile app that includes interesting stories about important persons, places, etc. related to a district in Berlin • The district of Moabit was chosen due to its rich history and lively present • Here, a story about the author Kurt Tucholsky and his connection to Moabit is shown • Screenshots of a demo app are provided by 3pc

  16. Use Case: “Explore the Neighbourhood!” • Curated stories can be published to the app • Stories may contain geographical points of interest within Moabit which are connected through an overall story arc, such as a biography

  17.–20. Use Case: “Explore the Neighbourhood!” (app screenshots)

  21. User Interface for creating curated stories

  22. Experiments: Dataset “Moabit Stories” • Created the data set “Moabit Stories” from crawled English webpages • Used focused crawling based on keywords (= topics) and manual postprocessing • Removed boilerplate and extracted metadata (author, date, URL, language, etc.) • Result: a data set of more than 100 documents containing relevant information and stories connected to the district of Moabit in Berlin, grouped by topics
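The grouping of crawled documents by keyword topics might look like the following stdlib-only sketch. The function name, the example documents and the simple substring matching are assumptions for illustration, not the actual crawling setup.

```python
# Hypothetical sketch of the topic grouping step: crawled documents are
# assigned to the topic (crawl query term) whose keyword they mention.
# The matching here is plain substring search; a real pipeline would use
# proper tokenisation and the focused crawler's own query terms.
from collections import defaultdict

def group_by_topic(documents, topics):
    """Map each topic keyword to the documents whose text mentions it.
    A document may end up in several topic groups."""
    groups = defaultdict(list)
    for doc_id, text in documents.items():
        lowered = text.lower()
        for topic in topics:
            if topic.lower() in lowered:
                groups[topic].append(doc_id)
    return dict(groups)
```

A document mentioning both "Moabit" and "Tucholsky" would then appear in both topic groups.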

  23. Experiments: Discourse Relations Classifier • The discourse relations classifier is trained on PDTB2 (Prasad et al. 2008) • The text is encoded as deep contextual representations with a language model based on the transformer architecture (the pre-trained DistilBERT language model, Sanh et al. 2019)

  24. Architecture of the Siamese BERT model • Architecture of the Siamese BERT model used for the classification of discourse relations between two text segments d₁ and d₂ • The output of the classification layer ŷ holds the predicted semantic relation according to the top-level PDTB2 senses: Temporal, Contingency, Comparison, Expansion and, additionally, None

  25. Architecture of the Siamese BERT model • BERT used in a Siamese fashion: 6 hidden layers, each consisting of 768 units, with last hidden states h₁, h₂ • A concatenation layer takes both last hidden states h₁, h₂ as input; its output is the combined concatenation of the two text representations • A multi-layer perceptron consisting of two fully connected layers, each with 100 units • Activation with ReLU
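The head of this architecture (concatenation, two 100-unit ReLU layers, and a 5-way classification layer for the four PDTB2 senses plus None) can be sketched in plain Python. The encoder states h₁ and h₂ are stand-ins for DistilBERT outputs, the dimensionality is shrunk from 768 to 8 for readability, and the random weights are of course untrained; this is a structural sketch, not the authors' code.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, w, b):
    """Fully connected layer: w is an out×in weight matrix, b the bias."""
    return [sum(wi * xi for wi, xi in zip(row, v)) + bi for row, bi in zip(w, b)]

def rand_layer(n_in, n_out, rng):
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def siamese_head(h1, h2, params):
    """Concatenate the two last hidden states, apply two 100-unit ReLU
    layers, then a 5-way classification layer (PDTB2 senses + None)."""
    v = h1 + h2  # concatenation layer
    for w, b in params[:-1]:
        v = relu(linear(v, w, b))
    w, b = params[-1]
    logits = linear(v, w, b)
    return logits.index(max(logits))  # argmax → predicted sense index

rng = random.Random(0)
DIM = 8  # stand-in for the 768-dimensional DistilBERT hidden states
params = [rand_layer(2 * DIM, 100, rng),   # concat (2·DIM) → 100, ReLU
          rand_layer(100, 100, rng),       # 100 → 100, ReLU
          rand_layer(100, 5, rng)]         # 100 → 5 classes (ŷ)
```

A real implementation would express the same layers in a deep-learning framework and train them jointly with the shared encoder.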

  26. Experiments: Results of PDTB2 Training

  27. Experiments: “Moabit Stories” Steps: • Group documents by topics based on the query terms for the focused crawler • Split documents into sentences • Find document pairs within the topic groups by representing documents as tf-idf vectors and using cosine similarity, with cosine(dₐ, d_b) > 0.15 for document pairs • 19,796 sentence pairs passed to the classifier
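The document-pairing step (tf-idf vectors, cosine similarity, threshold 0.15) can be reproduced with a stdlib-only sketch. The tokenised example documents are invented, and a real setup would use a proper tokeniser and a standard tf-idf weighting scheme.

```python
# Sketch of the document-pairing step: represent documents as tf-idf
# vectors and keep pairs whose cosine similarity exceeds 0.15.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse tf-idf vectors for a list of tokenised documents."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]

def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def candidate_pairs(docs, threshold=0.15):
    """Keep document pairs (a, b) with cosine(d_a, d_b) above the threshold."""
    vecs = tfidf_vectors(docs)
    return [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))
            if cosine(vecs[i], vecs[j]) > threshold]
```

With three toy documents, two Moabit-related ones survive the 0.15 threshold while the unrelated one is filtered out.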

  28. Towards the Naive Classification of Rhetorical Relations at Scale
