Discourse and Summarization
Prof. Sameer Singh
CS 295: STATISTICAL NLP, WINTER 2017
March 16, 2017
Based on slides from Dan Jurafsky, Jacob Eisenstein, and everyone else they copied from.
Upcoming…
• Final report due in a week: March 20, 2017
• Project Instructions up: ACL style, 5 pages (+references)
Outline
• Discourse
• Summarization
• Wrapup
Discourse
• Coreference: Resolving entities and events.
• Coherence: What makes the text coherent?
• Relations: Rhetorical and narrative links between units.
Coherence
Coherence vs Semantics
• A meaningless sentence can be grammatical: "Colorless green ideas sleep furiously."
• The discourse equivalent of grammaticality is coherence.
• Can a coherent text be without meaning?
Example Essay
The second reason for the five-paragraph theme is that it makes you focus on a single topic. Some people start writing on the usual topic, like TV commercials, and they wind up all over the place, talking about where TV came from or capitalism or health foods or whatever. But with only five paragraphs and one topic you’re not tempted to get beyond your original idea, like commercials are a good source of information about products. You give your three examples, and zap! you’re done. This is another way the five-paragraph theme keeps you from thinking too much.
Detecting “Coherency”
Discourse Connectors
Lexical Chains
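A lexical chain links words across sentences that are related in meaning. As a minimal sketch, assuming exact word repetition is the only relation considered (fuller approaches also use thesaurus relations such as synonymy), the Python below finds chains in the example essay from the earlier slide. The names `lexical_chains`, `ESSAY`, and `STOPWORDS` are hypothetical and chosen only for this illustration; the stopword list is hand-picked, not a standard one.

```python
import re
from collections import defaultdict

# Hand-picked stopword list (an assumption for this sketch, not a standard resource).
STOPWORDS = {
    "the", "is", "you", "on", "a", "like", "and", "about", "from", "your",
    "you're", "it", "that", "for", "or", "of", "to", "this", "but", "with",
    "not", "are", "they", "some", "all", "up", "only", "one", "too",
}

# The five sentences of the example essay, one string per sentence.
ESSAY = [
    "The second reason for the five-paragraph theme is that it makes you "
    "focus on a single topic.",
    "Some people start writing on the usual topic, like TV commercials, and "
    "they wind up all over the place, talking about where TV came from or "
    "capitalism or health foods or whatever.",
    "But with only five paragraphs and one topic you’re not tempted to get "
    "beyond your original idea, like commercials are a good source of "
    "information about products.",
    "You give your three examples, and zap! you’re done.",
    "This is another way the five-paragraph theme keeps you from thinking "
    "too much.",
]


def lexical_chains(sentences, min_length=2):
    """Map each repeated content word to the indices of sentences containing it."""
    positions = defaultdict(list)
    for index, sentence in enumerate(sentences):
        # Normalise curly apostrophes, then keep hyphenated words and
        # contractions as single tokens.
        text = sentence.lower().replace("’", "'")
        tokens = set(re.findall(r"[a-z]+(?:[-'][a-z]+)*", text))
        for word in sorted(tokens - STOPWORDS):
            positions[word].append(index)
    # A chain needs the word to occur in at least `min_length` sentences.
    return {w: idx for w, idx in positions.items() if len(idx) >= min_length}


if __name__ == "__main__":
    for word, sentence_ids in lexical_chains(ESSAY).items():
        print(word, sentence_ids)
```

On this essay the sketch finds chains for "five-paragraph", "theme", "topic", and "commercials", with "topic" running through the first three sentences. Because it matches surface forms only, it misses the link between "five-paragraph" and "five paragraphs".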
Discourse Relations
1. In today’s society, college is ambiguous.
2. We need it to live,
3. but we also need it to love.
4. Moreover, without college most of the world’s learning would be egregious.
5. College, however, has myriad costs.
6. One of the most important issues facing the world is how to reduce college costs.
7. Some have argued that college costs are due to the luxuries students now expect.
8. Others have argued that the costs are a result of athletics.
9. In reality, high college costs are the result of excessive pay for teaching assistants
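Some of the relations between these units are signalled explicitly by discourse connectors. As a rough sketch, the Python below scans the nine numbered units for a small connector list. The list `CONNECTORS` and the function `find_connectors` are hypothetical names introduced for this illustration, and the list covers only the connectors that appear in this example.

```python
import re

# Illustrative connector list (an assumption; real inventories are much larger).
CONNECTORS = ["but", "moreover", "however", "in reality"]

# The nine numbered units from the slide.
UNITS = [
    "In today’s society, college is ambiguous.",
    "We need it to live,",
    "but we also need it to love.",
    "Moreover, without college most of the world’s learning would be egregious.",
    "College, however, has myriad costs.",
    "One of the most important issues facing the world is how to reduce college costs.",
    "Some have argued that college costs are due to the luxuries students now expect.",
    "Others have argued that the costs are a result of athletics.",
    "In reality, high college costs are the result of excessive pay for teaching assistants",
]


def find_connectors(units, connectors=CONNECTORS):
    """Return (unit number, connector) pairs, numbering units from 1."""
    alternatives = "|".join(re.escape(c) for c in connectors)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    found = []
    for number, unit in enumerate(units, start=1):
        for match in pattern.finditer(unit):
            found.append((number, match.group(1).lower()))
    return found


if __name__ == "__main__":
    for number, connector in find_connectors(UNITS):
        print(number, connector)
```

Here the sketch flags units 3, 4, 5, and 9. String matching alone does not say which relation a connector signals, and it cannot see relations that have no explicit connector, such as the link between units 7 and 8.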
Coherence Structure
The same nine-unit essay as on the Discourse Relations slide, annotated with three aspects of structure:
• Segmentation
• Zoning/Ordering
• Centering/Salience