Statistical NLP, Spring 2011, Lecture 25: Summarization (Dan Klein)

  1. Statistical NLP, Spring 2011, Lecture 25: Summarization. Dan Klein, UC Berkeley. Document Summarization

  2. Multi-document Summarization (figure residue: "… 27,000+ more"). Extractive Summarization

  3. Selection (mid-'90s to present)
     • Maximum Marginal Relevance [Carbonell and Goldstein, 1998]: greedy search over sentences (diagram: sentences s1 to s4 around a query Q). Maximize similarity to the query; minimize redundancy.
     • Graph algorithms [Mihalcea 05++]
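
The greedy MMR search on this slide can be sketched as follows. This is a minimal illustration, not the formulation from the cited paper: the bag-of-words cosine similarity, the trade-off weight `lam`, and the function names `cosine` and `mmr_select` are all assumptions made for the example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mmr_select(sentences, query, k, lam=0.7):
    """Greedily pick up to k sentence indices.

    Each step takes the sentence with the best trade-off between
    similarity to the query and redundancy with earlier picks.
    `lam` (assumed parameter) weights relevance against redundancy.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, -math.inf
        for i, bag in enumerate(bags):
            if i in selected:
                continue
            # Redundancy: similarity to the closest already-selected sentence.
            redundancy = max((cosine(bag, bags[j]) for j in selected), default=0.0)
            score = lam * cosine(bag, q) - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected
```

With a low `lam` the second pick avoids near-duplicates of the first; with `lam=1.0` the search ignores redundancy and ranks by query similarity alone.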

  4. Selection (mid-'90s to present)
     • Maximum Marginal Relevance
     • Graph algorithms: nodes are sentences (s1 to s4); edges are similarities.

  5. Selection (mid-'90s to present)
     • Maximum Marginal Relevance
     • Graph algorithms: nodes are sentences, edges are similarities, and the stationary distribution represents node centrality.
     • Word distribution models: input document distribution P_D(w) ~ summary distribution P_A(w).

       w         P_D(w)   P_A(w)
       Obama     0.017    ?
       speech    0.024    ?
       health    0.009    ?
       Montana   0.002    ?
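
To make "stationary distribution represents node centrality" concrete, here is a small power-iteration sketch over a sentence similarity matrix. The damping factor, the iteration count, and the function name `centrality` are assumptions in the style of PageRank; the slides do not specify them.

```python
def centrality(sim, damping=0.85, iters=100):
    """Approximate the stationary distribution of a random walk on the
    sentence graph, where sim[i][j] is the similarity of sentences i and j.
    """
    n = len(sim)
    # Row-normalize similarities into transition probabilities,
    # dropping self-loops; isolated nodes jump uniformly.
    trans = []
    for i in range(n):
        row = [sim[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        trans.append([x / total for x in row] if total else [1.0 / n] * n)
    rank = [1.0 / n] * n
    for _ in range(iters):
        rank = [
            (1 - damping) / n
            + damping * sum(rank[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return rank
```

Sentences with the highest score are the most central and would be the first candidates for the summary.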

  6. Selection (mid-'90s to present)
     • Maximum Marginal Relevance
     • Graph algorithms
     • Word distribution models: SumBasic [Nenkova and Vanderwende, 2005]
         1. Value(w_i) = P_D(w_i)
         2. Value(s_i) = sum of its word values
         3. Choose the s_i with the largest value
         4. Adjust P_D(w)
         5. Repeat until the length constraint is reached
     • Regression models: F(x) scores each sentence from its features; frequency is just one of many features.

            word values   position   length
       s1   12            1          24
       s2   4             2          14
       s3   6             3          18
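
The SumBasic loop on this slide can be sketched as below. The slide only says "Adjust P_D(w)"; squaring the probability of every word in the chosen sentence is the update assumed here. Whitespace tokenization and the name `sumbasic` are also assumptions for the example.

```python
from collections import Counter

def sumbasic(sentences, max_words):
    """Return indices of selected sentences, in selection order."""
    tokenized = [s.lower().split() for s in sentences]
    counts = Counter(w for sent in tokenized for w in sent)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}  # P_D(w)

    chosen, length = [], 0
    while True:
        # Only consider unused sentences that still fit the length limit.
        candidates = [
            i for i, sent in enumerate(tokenized)
            if i not in chosen and sent and length + len(sent) <= max_words
        ]
        if not candidates:
            break
        # Value(s_i) = sum of its word values, as on the slide.
        best = max(candidates, key=lambda i: sum(prob[w] for w in tokenized[i]))
        chosen.append(best)
        length += len(tokenized[best])
        # Assumed adjustment: down-weight words already covered.
        for w in set(tokenized[best]):
            prob[w] = prob[w] ** 2
    return chosen
```

Because covered words lose most of their value after each pick, later picks drift toward sentences that introduce new frequent words.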

  7. Selection (mid-'90s to present)
     • Maximum Marginal Relevance
     • Graph algorithms
     • Word distribution models
     • Regression models
     • Topic model-based [Haghighi and Vanderwende, 2009]

  11. PYTHY; H & V 09

  12. Selection (mid-'90s to present)
      • Maximum Marginal Relevance
      • Graph algorithms
      • Word distribution models
      • Regression models
      • Topic models
      • Globally optimal search [McDonald, 2007]: optimal search using MMR, as an Integer Linear Program.

  13. Selection [Gillick and Favre, 2008]
      s1: The health care bill is a major test for the Obama administration.
      s2: Universal health care is a divisive issue.
      s3: President Obama remained calm.
      s4: Obama addressed the House on Tuesday.

      concept   value
      obama     3

  14. Selection [Gillick and Favre, 2008]: concept values for sentences s1 to s4.

      concept   value
      obama     3
      health    2
      house     1

  15. Selection [Gillick and Favre, 2008]
      Concept values: obama 3, health 2, house 1. Length limit: 18 words.

      summary                 length   value
      greedy    {s1, s3}      17       5
      optimal   {s2, s3, s4}  17       6

      Maximize Concept Coverage
      Optimization problem (set coverage): $\max_{s \in S(D)} \sum_{c \in C(s)} v_c$, where $v_c$ is the value of concept c, $C(s)$ is the set of concepts present in summary s, and $S(D)$ is the set of extractive summaries of document set D.

      Results [Gillick and Favre 09]
                 Bigram Recall   Pyramid
      Baseline   4.00            23.5
      2009       6.85            35.0
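
The greedy versus optimal comparison can be reproduced with a brute-force sketch. The concept values and the length limit come from the slide. The per-sentence lengths are assumptions chosen so that the totals agree with the slide's table (a plain whitespace count may differ), and breaking ties in favor of the longer summary is an arbitrary choice made for this example.

```python
from itertools import combinations

CONCEPT_VALUE = {"obama": 3, "health": 2, "house": 1}
# (assumed length in words, concepts contained in the sentence)
SENTENCES = {
    "s1": (13, {"health", "obama"}),
    "s2": (7, {"health"}),
    "s3": (4, {"obama"}),
    "s4": (6, {"obama", "house"}),
}
LIMIT = 18

def length(summary):
    return sum(SENTENCES[s][0] for s in summary)

def value(summary):
    """Each covered concept counts once, however often it appears."""
    covered = set().union(*(SENTENCES[s][1] for s in summary))
    return sum(CONCEPT_VALUE[c] for c in covered)

def greedy():
    """Repeatedly add the fitting sentence that yields the highest value."""
    summary = []
    while True:
        fits = [s for s in SENTENCES
                if s not in summary and length(summary + [s]) <= LIMIT]
        if not fits:
            return summary
        summary.append(max(fits, key=lambda s: value(summary + [s])))

def optimal():
    """Enumerate every feasible subset; fine for four sentences only."""
    feasible = (list(c)
                for r in range(len(SENTENCES) + 1)
                for c in combinations(SENTENCES, r)
                if length(c) <= LIMIT)
    return max(feasible, key=lambda s: (value(s), length(s)))
```

Greedy commits to s1 because it covers the two most valuable concepts, which leaves no room for the sentence that covers "house". Exhaustive search avoids that trap, but it is exponential in the number of sentences, which motivates the ILP formulation.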

  16. Selection: Integer Linear Program for the maximum coverage model [Gillick, Riedhammer, Favre, Hakkani-Tur, 2008]
      • Objective: total concept value
      • Constraint: summary length limit
      • Constraints: maintain consistency between selected sentences and concepts

      This ILP is tractable for reasonable problems [Gillick and Favre, 2009].
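
The ILP itself did not survive extraction. A sketch of a program matching the three labels on the slide is given below. The symbol names are assumptions: $c_i$ and $x_j$ are binary indicators for concept i and sentence j being in the summary, $w_i$ is the value of concept i, $l_j$ is the length of sentence j, $L$ is the length limit, and $\mathrm{Occ}_{ij}$ is 1 when concept i occurs in sentence j.

```latex
\begin{align*}
\max_{c,\,x} \quad & \sum_i w_i c_i
    && \text{(total concept value)} \\
\text{s.t.} \quad & \sum_j l_j x_j \le L
    && \text{(summary length limit)} \\
& x_j \,\mathrm{Occ}_{ij} \le c_i \quad \forall i, j
    && \text{(a selected sentence brings all its concepts)} \\
& \sum_j x_j \,\mathrm{Occ}_{ij} \ge c_i \quad \forall i
    && \text{(a counted concept needs a selected sentence)} \\
& c_i \in \{0, 1\}, \quad x_j \in \{0, 1\}
\end{align*}
```

The last two constraint families together keep the concept indicators consistent with the sentence indicators, so a concept is counted exactly when some selected sentence contains it.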

  17. Results [G & F, 2009]
      • 52 submissions
      • 27 teams
      • 44 topics
      • 10 input docs
      • 100-word summaries

      Four evaluation plots highlighting Gillick & Favre (metric names not recoverable):
      • Rating scale 1-10; humans in [8.3, 9.3]
      • Rating scale 0-1; humans in [0.62, 0.77]
      • Rating scale 1-10; humans in [8.5, 9.3]
      • Rating scale 0-1; humans in [0.11, 0.15]

      Error Breakdown? [Gillick and Favre, 2008]

  18. Selection: how to include sentence position?
      • First sentences are unique.

      Selection: sentence ordering
      • Some interesting work on sentence ordering [Barzilay et al., 1997; 2002]
      • But choosing independent sentences is easier:
          • First sentences usually stand alone well
          • Sentences without unresolved pronouns
          • Classifier trained on OntoNotes: <10% error rate
      • The baseline ordering module (chronological) is not obviously worse than anything fancier.

  19. Problems with Extraction: what would a human do?
      "It is therefore unsurprising that Lindsay pleaded not guilty yesterday afternoon to the charges filed against her, according to her publicist."

  20. Sentence Rewriting [Berg-Kirkpatrick, Gillick, and Klein 11]

  21. Sentence Rewriting [Berg-Kirkpatrick, Gillick, and Klein 11]
      New optimization problem: safe deletions. Equation labels: value of deletion d; set of branch cut deletions made in creating summary s.
      How do we know how much a given deletion costs?
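
The equation for the new optimization problem did not survive extraction. One plausible reading of its labels, assuming the deletion term is simply added to the concept-coverage objective, is sketched below. Here $S(D)$ is the set of candidate summaries of document set D, $C(s)$ the set of concepts present in summary s, and $v_c$ the value of concept c. The symbol $\Delta(s)$ for the set of branch cut deletions made in creating s is hypothetical notation introduced for this sketch.

```latex
\[
\max_{s \in S(D)} \;\; \sum_{c \in C(s)} v_c \;+\; \sum_{d \in \Delta(s)} v_d
\]
```

Under this reading, a deletion with negative $v_d$ is only worth making when the space it frees lets the summary cover concepts worth more than the deletion costs.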

  22. Learning Features: embed the ILP in a cutting plane algorithm.

      Results [Berg-Kirkpatrick, Gillick, and Klein 11]
                 Bigram Recall   Pyramid
      Baseline   4.00            23.5
      2009       6.85            35.0
      Now        7.75            41.3

      Beyond Extraction / Compression?
      Sentence extraction is limiting ... and boring! But abstractive summaries are much harder to generate… in 25 words?

  23. http://www.rinkworks.com/bookaminute/
