Text Summarization of Review Sentiments Eric Jensen Summize, Inc.

Outline
• Opinions on the web
• Opinion mining
• Text summarization
  - The problem
  - Proposed algorithm
  - Results
• Conclusions

Growth of Amazon, IMDB, and Blogs
[Figure: user reviews and blog reviews per year, 1999 to 2007; y-axis from 0 to 3.5M]

Opinions on the web
[Figure: opinion sources arranged by focus and length: Consumer Reports, Amazon Users, Four Word Film Review, Yahoo Answers, Twitter, Blogs]

Support (or lack of?)
[Figure: cumulative proportion (40% to 100%) versus number of reviews (1 to 121)]

How many are you willing to read?

Opinion mining
• Sentiment analysis
• Facet mining
• Text summarization

Sentiment analysis (Pang et al. EMNLP 2002; Dave et al. WWW 2003)
I Am Legend: “I won't review the movie because this has already been done. What I will rate is the 2-disc ‘Special Edition’ of this movie… Overall, I feel this 2-disc edition is not worth the extra money it costs.”

Facet mining (Hu and Liu KDD 2004, Popescu and Etzioni EMNLP 2005, Titov and McDonald WWW 2008)
• Digital camera
  - Resolution
  - Zoom
  - User interface
• I Am Legend
  - Acting
  - Special effects
  - 2-disc special edition?

Text summarization
The problem: understand the prevailing sentiments as quickly as possible
• Leverage the ratings users provide to produce more meaningful summaries
• Don’t restrict to fixed categories/facets
• Why did the users rate it this way?

Example I Am Legend riveting movie • hollywood ending • amazing story • excellent character • riveting performance • dark sci-fi • grotesque film

Experimentation
• Dataset
• Evaluation
• Baseline
• Results
• Consensus Building

Experimentation: Dataset
• Amazon and IMDB
• 10 million user reviews
• 3.6 million products
• Books, movies, music, and others

Evaluation
• Sampled 30 products
  - Stratified by category
  - Minimum of 10 reviews each
• Task: ideal 10-word summary of the prevailing sentiments about that product
  - Mix positive and negative in appropriate ratio
  - Arbitrary length phrases
• E.g. vacuum cleaner: high suction, heavy, do not buy
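The sampling step above could be sketched roughly as follows. This is only an illustration: the record fields `category` and `n_reviews` are hypothetical, and the slides do not say how the 30 products were allocated across categories, so the near-equal split per category is an assumption.

```python
import random
from collections import defaultdict


def stratified_sample(products, total=30, min_reviews=10, seed=0):
    """Sample up to `total` products, spread evenly across categories.

    `products` is a list of dicts with hypothetical keys "category" and
    "n_reviews". Equal allocation per category is an assumption, not
    something the slides specify.
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for product in products:
        # Keep only products with enough reviews to summarize.
        if product["n_reviews"] >= min_reviews:
            by_category[product["category"]].append(product)

    categories = sorted(by_category)
    if not categories:
        return []

    # Split the quota as evenly as possible across categories.
    per_category, remainder = divmod(total, len(categories))
    sample = []
    for i, category in enumerate(categories):
        quota = per_category + (1 if i < remainder else 0)
        pool = by_category[category]
        sample.extend(rng.sample(pool, min(quota, len(pool))))
    return sample
```

If a category has fewer eligible products than its quota, this sketch simply returns fewer than `total` products rather than topping up from other categories.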

Evaluation: Metrics
• Text Analysis Conference (formerly DUC)
• Overlap with reference summaries is highly correlated with manual evaluation (Lin & Hovy HLT-NAACL 2003)

$$\text{ROUGE-N} = \frac{\sum_{\text{gram}_n \in \text{reference}} \text{Count}_{\text{match}}(\text{gram}_n)}{\sum_{\text{gram}_n \in \text{reference}} \text{Count}(\text{gram}_n)}$$
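As a rough illustration of the ROUGE-N ratio on this slide, the sketch below counts clipped n-gram matches against one or more reference summaries. Lowercasing and whitespace tokenization are assumptions made here for brevity; the official ROUGE toolkit has its own preprocessing, and the precision and F scores reported later are not computed by this function.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, references, n=1):
    """ROUGE-N recall: matched reference n-grams / total reference n-grams."""
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = total = 0
    for reference in references:
        ref_counts = ngrams(reference.lower().split(), n)
        total += sum(ref_counts.values())
        # A reference n-gram matches at most as often as the candidate has it.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
    return matched / total if total else 0.0
```

For example, scoring the candidate "high suction but heavy" against the reference "high suction heavy do not buy" matches 3 of 6 reference unigrams and 1 of 5 reference bigrams.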

Framework
[Figure: framework diagram, Input to Output]
Output: riveting movie • hollywood ending • amazing story • excellent character • riveting performance • dark sci-fi • grotesque film

Baseline: Adapted facet-oriented mining (Hu and Liu KDD 2004)
1. Identify noun phrases and treat adjacent adjectives as opinion words
2. Rank noun phrases by TF×IDF
3. Choose top opinion word by frequency
4. Choose top summary phrases by frequency
(Steps 3 and 4 are our adaptation.)
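A simplified sketch of a facet-style baseline along these lines is shown below. It is not the Hu and Liu system or the exact adaptation used in the talk: single nouns stand in for noun phrases, only the immediately preceding adjective counts as "adjacent", the input is assumed to be already POS-tagged with hypothetical tags "ADJ" and "NOUN", and the final phrase selection is collapsed into taking the top-ranked facets.

```python
import math
from collections import Counter, defaultdict


def baseline_summary(tagged_reviews, background_docs, top_k=3):
    """Facet-style baseline on pre-tagged text (illustrative only).

    tagged_reviews: reviews of one product, each a list of (word, tag)
        pairs with hypothetical tags "ADJ" and "NOUN".
    background_docs: token lists for other products, used only for IDF.
    """
    noun_tf = Counter()
    adjectives = defaultdict(Counter)  # noun -> preceding adjective counts
    for review in tagged_reviews:
        for i, (word, tag) in enumerate(review):
            if tag != "NOUN":
                continue
            noun_tf[word] += 1
            if i > 0 and review[i - 1][1] == "ADJ":
                adjectives[word][review[i - 1][0]] += 1

    n_docs = len(background_docs) + 1  # +1 for the product itself

    def tfidf(noun):
        # Document frequency counts the product plus matching background docs.
        df = 1 + sum(noun in doc for doc in background_docs)
        return noun_tf[noun] * math.log(n_docs / df)

    # Rank nouns that have at least one opinion word by TFxIDF.
    ranked = sorted(adjectives, key=lambda noun: (-tfidf(noun), noun))
    phrases = []
    for noun in ranked[:top_k]:
        opinion, _ = adjectives[noun].most_common(1)[0]
        phrases.append(f"{opinion} {noun}")
    return phrases
```

The IDF term is what pushes generic nouns such as "camera" down the ranking when they appear in many other products' reviews.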

Proposed algorithm
1. Identify each opinion word and treat the following word as a “facet” word
2. Rank facet words by frequency
3. Choose top opinion word by frequency
4. Choose top phrases by frequency
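One way to read the four steps above is sketched below. The slides do not say how opinion words are identified, so the `opinion_words` lexicon is a hypothetical input here, as are the whitespace tokenization and the tie-breaking rule (phrase count, then facet count, then alphabetical order).

```python
from collections import Counter, defaultdict


def summarize(reviews, opinion_words, top_k=3):
    """Opinion-word-first summary phrases (illustrative sketch).

    reviews: list of review strings for one product.
    opinion_words: hypothetical lexicon of opinion words (a set).
    """
    facet_counts = Counter()
    opinions_by_facet = defaultdict(Counter)
    for review in reviews:
        tokens = review.lower().split()
        # Step 1: the word following an opinion word is its facet word.
        for opinion, facet in zip(tokens, tokens[1:]):
            if opinion in opinion_words:
                facet_counts[facet] += 1
                opinions_by_facet[facet][opinion] += 1

    candidates = []
    for facet, facet_count in facet_counts.most_common():  # step 2
        # Step 3: most frequent opinion word for this facet.
        opinion, phrase_count = opinions_by_facet[facet].most_common(1)[0]
        candidates.append((phrase_count, facet_count, f"{opinion} {facet}"))

    # Step 4: keep the most frequent phrases.
    candidates.sort(key=lambda c: (-c[0], -c[1], c[2]))
    return [phrase for _, _, phrase in candidates[:top_k]]
```

Unlike the facet-oriented baseline, this version needs no noun-phrase detection: any word that follows an opinion word can become a facet.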

Results
Method    Metric       Precision   Recall   F 0.5   Change vs. Facets
Facets    ROUGE-1      0.329       0.189    0.215
Summize   ROUGE-1      0.293       0.263    0.273   +26.81%
Facets    ROUGE-2      0.105       0.025    0.033
Summize   ROUGE-2      0.050       0.044    0.045   +36.25%
Facets    ROUGE-SU4    0.161       0.054    0.059
Summize   ROUGE-SU4    0.107       0.088    0.091   +55.03%

Consensus Building
[Figure: fraction of products and cluster probability (0 to 1) versus review count (10^0 to 10^4, log scale)]

Conclusions
• The number of opinions on the web is growing faster than anyone wants to read
• Text summarization reveals the why behind the ratings
• Facets do not capture the ideal summaries (sentiment-oriented ones are 26% closer by ROUGE-1)
• Scaling is both a problem and an opportunity

Future Directions
• Scale to more and more reviews
• Analyze opinions from unstructured sources (blogs, Twitter, etc.)

Plugging my other work
• Semi-automatic evaluation (ACM TOIS ’07)
• Query classification (ACM TOIS ’07)
• Query log analysis (SIGIR ’04)
