Review Mining

Soo-Min Kim and Eduard Hovy (2006). Automatic Identification of Pro and Con Reasons in Online Reviews. COLING-ACL 2006.
Oscar Täckström and Ryan McDonald (2011). Discovering Fine-Grained Sentiment with Latent Variable Structured Prediction Models. ECIR 2011.

Automatic Identification of Pro and Con Reasons in Online Reviews
Overview
● Goal:
  ○ Extract sentences that explain the sentiment of reviews (pros/cons)
● Difficulties:
  ○ Little or no labeled data
  ○ Pros/cons may be objective sentences
    ■ e.g., “the battery life lasts 3 hours”
  ○ Domain-specificity

Automatic Identification of Pro and Con Reasons in Online Reviews
Overview
● Focus on reasons for opinions
  ○ A reason may be an objective statement
● 2 steps:
  ○ Generate training data by aligning pros and cons with opinion-bearing sentences
  ○ Train a MaxEnt classifier to automatically identify pros and cons
● Training data: epinions.com, <review text, pros, cons> triplets
● MaxEnt classification in 2 parts:
  ○ Identification phase
  ○ Classification phase
    ■ Features: lexical, positional, opinion-bearing words
● Testing data: complaints.com
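The first of the two steps (turning <review text, pros, cons> triplets into labeled sentences) can be illustrated with a minimal sketch. The slides do not describe the paper's actual alignment procedure, so the word-overlap heuristic, the `min_overlap` threshold, and the function names below are all assumptions made for illustration only.

```python
import re

def tokens(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def align(review_sentences, pros, cons, min_overlap=0.5):
    """Label each sentence 'pro', 'con', or 'none' by word overlap with the
    pro/con phrases. Hypothetical heuristic, not the paper's procedure."""
    labeled = []
    for sent in review_sentences:
        sent_toks = tokens(sent)
        best_label, best_score = "none", 0.0
        for label, phrases in (("pro", pros), ("con", cons)):
            for phrase in phrases:
                phrase_toks = tokens(phrase)
                if not phrase_toks:
                    continue
                # Fraction of the phrase's words that appear in the sentence.
                score = len(phrase_toks & sent_toks) / len(phrase_toks)
                if score >= min_overlap and score > best_score:
                    best_label, best_score = label, score
        labeled.append((sent, best_label))
    return labeled

# Toy triplet in the spirit of an epinions.com review (invented data).
review = [
    "The battery life lasts 3 hours.",
    "The screen is bright and sharp.",
    "I bought it last month.",
]
labeled = align(review, pros=["bright screen"], cons=["battery life"])
for sentence, label in labeled:
    print(label, "|", sentence)
```

The labeled sentences produced this way would then serve as training data for the classifier in the second step; an objective sentence such as the battery-life example can still receive a label because the alignment relies on the pro/con fields rather than on opinion words.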

Automatic Identification of Pro and Con Reasons in Online Reviews
Intuitions
● MaxEnt: “best model is the one that is consistent with the set of constraints imposed by the evidence but otherwise is as uniform as possible”
● Lexical features: “there are certain words that are frequently used in pro and con sentences which are likely to represent reasons why an author writes a review”
● Positional features: “important sentences that contain topics in a text have certain positional patterns”
● Opinion-bearing word features: capture pro and con sentences which contain opinion-bearing expressions (objective sentences should be captured by the lexical and positional features)
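The three feature families can be sketched as a feature dictionary of the kind a MaxEnt (multinomial logistic regression) classifier would consume. This is only an illustration: the feature names, the first/last positional encoding, and the tiny `OPINION_WORDS` lexicon are assumptions, not the paper's feature set.

```python
import re

# Hypothetical toy lexicon of opinion-bearing words (assumption).
OPINION_WORDS = {"great", "terrible", "love", "hate", "poor", "excellent"}

def sentence_features(sentences, index):
    """Build lexical, positional, and opinion-word features for one sentence."""
    words = re.findall(r"[a-z0-9']+", sentences[index].lower())
    feats = {}
    # Lexical features: one indicator per word in the sentence.
    for word in words:
        feats["lex=" + word] = 1
    # Positional features: is this the first or last sentence of the review?
    feats["pos=first"] = int(index == 0)
    feats["pos=last"] = int(index == len(sentences) - 1)
    # Opinion-bearing word feature: count of lexicon hits.
    feats["opinion_count"] = sum(1 for word in words if word in OPINION_WORDS)
    return feats

doc = ["I love this player.", "The battery life lasts 3 hours."]
f0 = sentence_features(doc, 0)
f1 = sentence_features(doc, 1)
print(f0)
print(f1)
```

The second sentence has no opinion-bearing word, so only its lexical and positional features can signal that it is a reason, which matches the intuition stated in the last bullet.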

Automatic Identification of Pro and Con Reasons in Online Reviews
Discussion
● Novel part of the paper is the alignment step, but there is no explicit evaluation of this step
● Pro/con dictionary baseline for identification?
● Why were identification and classification separate steps?
  ○ Could do identification of cons, identification of pros
● Training set balanced differently than test set
  ○ epinions.com -- more positive reviews
  ○ complaints.com -- mostly negative
● “The average accuracy 68.0% is comparable with the pair-wise human agreement 82.1%” (baseline 59.9%) -- ???
● Best accuracy and recall on restaurant complaints, best precision on mp3 complaints
● Captured both opinion-bearing and objective pro/con statements

Discovering fine-grained sentiment with latent variable structured prediction models
Overview
● Fine-grained sentiment analysis from coarse-grained supervision
● This is important because
  ○ Applications like opinion summarization and search need analysis at fine-grained levels
  ○ Available data usually has document-level labels
● Goal: better sentence-level performance than lexicon-based and document-centric ML approaches

Discovering fine-grained sentiment with latent variable structured prediction models
Overview
● Hidden Conditional Random Fields (HCRF) model analyzes sentence-level sentiment
● Training set: 143,580 positive, negative and neutral reviews from five different domains: books, dvds, electronics, music, and videogames
● Test set: 294 positive, negative and neutral reviews

Discovering fine-grained sentiment with latent variable structured prediction models
Intuitions
● Documents may have a dominant class without having uniform sentiment. A document will likely have a majority of sentences with one sentiment, some neutral sentences, and a minority with the other sentiment.
● Sequential relationship between the sentiments of adjacent sentences
● Document sentiment is influenced by all sentences and vice versa

Discovering fine-grained sentiment with latent variable structured prediction models
Overview
● Hidden CRF model
  ○ $y^d$: observable variable for document sentiment
  ○ $y^s_i$ ($i = 1..n$): latent variables for sentence sentiment
  ○ Training: HCRF is trained on document-level labels
  ○ Decoding: sentence-level labels are obtained from the latent variables

Discovering fine-grained sentiment with latent variable structured prediction models
Discussion
● Sentence analysis without sentence-level supervision
● Diverse set of review subjects
● Performance increase on larger data sets
● Comparison to baseline system trained on sentence-level sentiment data
● Little about choice of features
● Little about training process

Comparing Papers
● Both are similar tasks: sentence-level sentiment from document-level labels
● (Kim, Hovy) exploits structure of epinions.com
  ○ Better surface-level results, but more questionable methodology and evaluation
  ○ Straightforward
  ○ Task seems harder
● (Täckström, McDonald) uses a machine learning model with latent variables
  ○ Doesn’t need special structure of text
  ○ Requires more data

Discovering fine-grained sentiment with latent variable structured prediction models
Optimization
We model the probability of the label vector $y = (y^d, y^s)$ conditioned on the input sentences $s$:
● $p_\theta(y^d, y^s \mid s) = \exp\{\langle \phi(y^d, y^s, s), \theta \rangle - A_\theta(s)\}$
● From the independence assumptions:
  $\phi(y^d, y^s, s) = \bigoplus_{i=1}^{n} \phi(y^d, y^s_i, y^s_{i-1}, s)$
  $\phi(y^d, y^s_i, y^s_{i-1}, s) = \phi(y^d, y^s_i, y^s_{i-1}) \oplus \phi(y^s_i, s)$
● Conditional probability of the observable variable, marginalizing over the hidden variables:
  $p_\theta(y^d \mid s) = \sum_{y^s} p_\theta(y^d, y^s \mid s)$
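The marginalization over hidden sentence labels can be made concrete with a brute-force sketch on a toy document. Everything numeric here is invented: the per-sentence scores in `emissions` stand in for the $\phi(y^s_i, s)$ part, and the structural part is simplified (an assumption) into a document/sentence agreement weight plus a sentence/sentence transition weight rather than the full triple factor. Enumerating all label sequences is exponential in the number of sentences, so it is only suitable for illustration; a dynamic program over the chain would presumably be used in practice.

```python
import itertools
import math

LABELS = ("pos", "neg", "neu")

def score(y_d, y_s, emissions, w_doc, w_trans):
    """Toy stand-in for <phi(y^d, y^s, s), theta>: a sum of local scores."""
    total = 0.0
    prev = None
    for i, label in enumerate(y_s):
        total += emissions[i][label]         # sentence-vs-input score
        total += w_doc[(y_d, label)]         # document/sentence coupling
        if prev is not None:
            total += w_trans[(prev, label)]  # adjacent-sentence coupling
        prev = label
    return total

def doc_posterior(emissions, w_doc, w_trans):
    """p(y^d | s): sum the joint over every hidden sentence labeling y^s."""
    n = len(emissions)
    unnorm = {}
    for y_d in LABELS:
        unnorm[y_d] = sum(
            math.exp(score(y_d, y_s, emissions, w_doc, w_trans))
            for y_s in itertools.product(LABELS, repeat=n)
        )
    z = sum(unnorm.values())  # equals exp(A_theta(s)), the normalizer
    return {y_d: value / z for y_d, value in unnorm.items()}

# Invented toy weights: reward agreement between labels.
w_doc = {(d, s): (1.0 if d == s else 0.0) for d in LABELS for s in LABELS}
w_trans = {(a, b): (0.5 if a == b else 0.0) for a in LABELS for b in LABELS}
# Two sentences whose (invented) scores both lean positive.
emissions = [
    {"pos": 2.0, "neg": 0.0, "neu": 0.0},
    {"pos": 1.0, "neg": 0.0, "neu": 0.0},
]
posterior = doc_posterior(emissions, w_doc, w_trans)
print(posterior)
```

Training on document-level labels alone would adjust the weights to raise this marginal for the observed document label, while the sentence labels stay latent throughout.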
