Content Models with Attitude Christina Sauper, Aria Haghighi, Regina Barzilay MIT 1
Review Aggregation • Hundreds of reviews for each product • Opinions vary widely → Need aggregate statistics • Histograms show sentiment distribution, but it’s not enough 2
Aspect-based Analysis Prior work: Use a set of predefined domain-specific product aspects (e.g., Snyder and Barzilay 2007) → Coarse level analysis 3
Informative Aggregation Useful information: – What’s the best dish at this restaurant? – What do people dislike about this restaurant? – Which dishes do people disagree about? 4
Informative Aggregation
Aggregation of product-specific aspects
Japanese Restaurant
– We had a great time last night at this restaurant. The sushi was so incredibly fresh. We had a bad experience at the bar, though. My chocolate martini was absolutely terrible. We will be back, but we’ll skip the drinks.
– Wow, I can’t believe how much this place has changed! They used to be mediocre, but now they never fail to amaze. We started off at the bar with awesome sake bombs. When we got to our table, the sushi was fantastic.
– I have such mixed things to say about this restaurant. On one hand, their sushi is unquestionably the best in the city. On the other, the atmosphere isn’t that great. Plus, their drinks are completely watered down.
Sushi 100% positive; Chicken 33% positive
[Figure labels: Relevant aspects, User sentiment] 5
Corpus-driven Aspect Definition
Define aspects dynamically based on reviews
[Figure panels: Japanese Restaurant, Bakery]
– We had a great time last night at this restaurant. The sushi was so incredibly fresh. We had a bad experience at the bar, though. My chocolate martini was absolutely terrible. We will be back, but we’ll skip the drinks.
– Wow, I can’t believe how much this place has changed! They used to be mediocre, but now they never fail to amaze. We started off at the bar with awesome sake bombs. When we got to our table, the sushi was fantastic.
– I have such mixed things to say about this restaurant. On one hand, their sushi is unquestionably the best in the city. On the other, the atmosphere isn’t that great. Plus, their drinks are completely watered down.
Aspects listed: Sushi, Cookies, Cakes, Sake, Dessert, Pies
→ Aspects specific to each product 6
Corpus-driven Aspect Definition
Allows comparison across multiple reviews
Bakery
– I buy all of my baked goods from this bakery. Their bread is so delicious! It’s also good for all kinds of baked goods. They also have some truly beautiful cakes on display. Even their cookies are great!
– I picked up a birthday cake for my son here yesterday. It was the most amazing cake I’ve ever seen! The decorations were outstanding, and all the kids loved the chocolate icing. I’ll definitely come back!
– This place is nice for some baked goods, but some things are really nasty. The loaf of bread I bought was stale! They were happy to take it back and give me another, but I’ll be watching next time.
Excerpts: “… truly beautiful cakes on display.” / “… most amazing cake I’ve ever seen!”
– Consensus (both positive and negative): What’s the best/worst aspect of this product? 7
Corpus-driven Aspect Definition
Allows comparison across multiple reviews
Bakery
– I buy all of my baked goods from this bakery. Their bread is so delicious! It’s also good for all kinds of baked goods. They also have some truly beautiful cakes on display. Even their cookies are great!
– I picked up a birthday cake for my son here yesterday. It was the most amazing cake I’ve ever seen! The decorations were outstanding, and all the kids loved the chocolate icing. I’ll definitely come back!
– This place is nice for some baked goods, but some things are really nasty. The loaf of bread I bought was stale! They were happy to take it back and give me another, but I’ll be watching next time.
Excerpts: “Their bread is so delicious!” / “The loaf of bread I bought was stale!”
– Consensus (both positive and negative): What’s the best/worst aspect of this product?
– Conflicts of opinion: What aspects do people disagree about? 8
Task: Input
Input:
– Food-related snippets from restaurant reviews
• Concise description of a user’s opinion
– Automatically extracted from full review text (Sauper et al. 2010)
Example: We went to the restaurant, and the sushi was incredibly fresh.
– Segmented by restaurant, but no additional annotation
Japanese Restaurant:
the sushi was so incredibly fresh
best chicken katsu in town
drinks are fun, fresh, and delicious
Bakery:
I’d recommend the apple pie
the bread was disappointingly stale
chocolate torte is the stuff of dreams 9
Task: Output
Output:
– Relevant aspects for each restaurant
– Aspect label for each snippet
– Sentiment label for each snippet
Mexican Restaurant
Burrito:
+ they had a decent burrito
− the burrito was mediocre at best
− the burrito was heavily cilantroed
Salsa:
+ the salsa is incredible
+ the mango salsa is perfectly diced
+ hola free chips & salsa 10
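Once every snippet carries an aspect label and a sentiment label, per-aspect statistics of the kind shown on the aggregation slide follow by simple counting. The sketch below is illustrative only: the function name `percent_positive` and the tuple layout are assumptions, not part of the presented system, and the data is the Mexican Restaurant example from this slide.

```python
from collections import defaultdict

# Labeled snippets from the slide, as (aspect, sentiment, text) tuples.
# The tuple layout is an assumption made for this illustration.
snippets = [
    ("Burrito", "+", "they had a decent burrito"),
    ("Burrito", "-", "the burrito was mediocre at best"),
    ("Burrito", "-", "the burrito was heavily cilantroed"),
    ("Salsa", "+", "the salsa is incredible"),
    ("Salsa", "+", "the mango salsa is perfectly diced"),
    ("Salsa", "+", "hola free chips & salsa"),
]


def percent_positive(labeled):
    """Return {aspect: rounded percentage of snippets labeled '+'}."""
    counts = defaultdict(lambda: [0, 0])  # aspect -> [positive, total]
    for aspect, sentiment, _text in labeled:
        counts[aspect][0] += sentiment == "+"
        counts[aspect][1] += 1
    return {a: round(100 * pos / total) for a, (pos, total) in counts.items()}


summary = percent_positive(snippets)
for aspect, pct in summary.items():
    print(f"{aspect}: {pct}% positive")
```

Sorting the resulting dictionary by percentage would answer the best/worst-aspect question; aspects near 50% are candidates for conflicts of opinion.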
Possible Solution
Use clustering based on lexical similarity
Partial output of state-of-the-art clustering system:
– the martinis were very good
– the sushi was the best I’d ever had
– the martinis were tasty
– best paella I’d ever had
– the fillet was the best steak we’d ever had
– the wine list was pricey
– it’s the best soup I’ve ever had
– their wine selection is horrible
Problem: Clusters and aspects are not aligned! 11
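To see why lexical similarity can pull snippets about different dishes together, consider a toy similarity measure. The slide does not name the clustering system or its similarity function; Jaccard overlap on lowercased whitespace tokens is used here purely as an assumed stand-in. The third snippet is borrowed from the Task: Input slide.

```python
def tokens(snippet):
    """Lowercase a snippet and split it into a set of whitespace tokens."""
    return set(snippet.lower().split())


def jaccard(a, b):
    """Jaccard overlap between the token sets of two snippets."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


sushi_best = "the sushi was the best I'd ever had"
paella_best = "best paella I'd ever had"
sushi_fresh = "the sushi was so incredibly fresh"

# Same sentiment phrasing, different dish.
cross_dish = jaccard(sushi_best, paella_best)
# Same dish, different sentiment phrasing.
same_dish = jaccard(sushi_best, sushi_fresh)

print(f"sushi/paella: {cross_dish:.2f}, sushi/sushi: {same_dish:.2f}")
```

Under this measure the two "best ... I'd ever had" snippets score higher than the two sushi snippets, so a cluster built on it would group by opinion phrasing instead of by aspect.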
Our Solution
• Jointly model aspect and sentiment
• Leverage data to distinguish relevant words
Bakery:
Review 1: pies delicious; cookies fresh
Review 2: cakes fantastic; pies amazing
Review 3: cakes beautiful; bread stale
Japanese:
Review 1: salmon fantastic; sake smooth
Review 2: maki beautiful; salmon fresh
Review 3: maki delicious; miso bland 12
Model: Overview
• Each snippet has an aspect and a sentiment
• Each word is drawn from a topic distribution:
– Aspects are specific to a single product (e.g., pizza, dessert, pad thai)
– Sentiment is global across all products (e.g., great, horrible, amazing)
– Background distribution is global (e.g., was, our, food)
• Transition distribution encodes word topic transitions (example snippet: They had wonderful appetizers.) 13
Model: Generative Story 1. Global distributions 2. Restaurant-level distributions 3. Snippet-level latent structure 4. Words 14
Model: Generative Story
Globally,
a. Background distribution: word distribution for stop words and in-domain white noise
b. Sentiment distributions: word distributions over positive and negative sentiment words, with a small bias for seed words
c. Transition distribution: first-order Markov distribution of word topic transitions
[Diagram: background distribution (B), sentiment distributions (+, −), transition distribution (Λ)] 15
Model: Generative Story
For each restaurant,
a. Aspect distributions: word distribution for each aspect
b. Aspect-sentiment binomials: probability of positive vs. negative sentiment for each aspect
c. Aspect multinomial: probability of each aspect
[Diagram: aspect multinomial (ψ), aspect distributions (1, 2, …, K), aspect-sentiment binomials (φ_1, φ_2, …, φ_K)] 16
Model: Generative Story
For each snippet,
a. Aspect: chosen from aspect multinomial ψ (in the example, aspect 2)
b. Sentiment: chosen from aspect-sentiment binomial (in the example, φ_2 yields +)
c. Sequence of word topics: Background, Aspect, or Sentiment, selected from transition distribution Λ
[Diagram: word topic sequence B A B S S] 17
Model: Generative Story
For each word,
a. Word: chosen from topic-specific distribution based on word topic sequence
[Diagram: aspect 2, sentiment +, word topic sequence B A B S S, generating “The pizza was really great”] 18
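The generative story across these slides can be sketched as a small sampler. Everything numeric here is an invented toy value, the variable and function names are hypothetical, and the fixed snippet length is an assumption (the slides do not say how length is chosen). The sketch shows only the order of the draws: aspect, sentiment, word topic sequence, then words.

```python
import random

# Global distributions (toy values, not learned parameters).
BACKGROUND = {"the": 0.4, "was": 0.4, "really": 0.2}
SENTIMENT = {
    "+": {"great": 0.6, "amazing": 0.4},
    "-": {"horrible": 0.7, "bland": 0.3},
}
# First-order Markov transitions over word topics:
# B = Background, A = Aspect, S = Sentiment.
TRANSITION = {
    "START": {"B": 0.7, "A": 0.2, "S": 0.1},
    "B": {"B": 0.3, "A": 0.4, "S": 0.3},
    "A": {"B": 0.6, "A": 0.2, "S": 0.2},
    "S": {"B": 0.2, "A": 0.2, "S": 0.6},
}
# Restaurant-level distributions for one hypothetical restaurant, K = 2.
ASPECT_WORDS = [{"pizza": 0.8, "crust": 0.2}, {"dessert": 0.7, "tiramisu": 0.3}]
ASPECT_POSITIVE = [0.9, 0.4]  # aspect-sentiment binomials: P(+ | aspect)
ASPECT_PRIOR = [0.6, 0.4]     # aspect multinomial


def draw(dist, rng):
    """Sample one outcome from a {outcome: probability} dict."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def generate_snippet(length, rng):
    """Sample (aspect, sentiment, topics, words) for one snippet."""
    aspect = rng.choices(range(len(ASPECT_PRIOR)), weights=ASPECT_PRIOR)[0]
    sentiment = "+" if rng.random() < ASPECT_POSITIVE[aspect] else "-"
    topics, prev = [], "START"
    for _ in range(length):
        prev = draw(TRANSITION[prev], rng)
        topics.append(prev)
    # Each word comes from the distribution selected by its topic.
    emit = {"B": BACKGROUND, "A": ASPECT_WORDS[aspect], "S": SENTIMENT[sentiment]}
    words = [draw(emit[t], rng) for t in topics]
    return aspect, sentiment, topics, words


aspect, sentiment, topics, words = generate_snippet(5, random.Random(0))
print(aspect, sentiment, " ".join(topics), "|", " ".join(words))
```

Inference runs this story in reverse: given only the words, it recovers the aspect, sentiment, and topic sequence that most plausibly produced them.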
Standard Variational Inference
• Desired posterior: distribution over model parameters and latent structure, given the observed data 19
Standard Variational Inference
• Desired posterior: over model parameters and latent structure, given the observed data
• Optimizing directly is intractable
• Instead, optimize variational objective, s.t. the approximating distribution factorizes 20
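For reference, the textbook mean-field form of this setup can be written as follows. The notation is generic and assumed for illustration (x for observed data, θ for model parameters, z for latent structure, Q for the approximating distribution); the authors' own symbols and the exact factorization they use may differ.

```latex
% Desired posterior over parameters and latent structure given data.
% The normalizer P(x) requires summing over all latent structures,
% which is what makes direct optimization intractable.
P(\theta, z \mid x) = \frac{P(x, z \mid \theta)\, P(\theta)}{P(x)}

% Variational objective: find the closest factorized approximation.
\hat{Q} = \arg\min_{Q} \; \mathrm{KL}\big( Q(\theta, z) \,\|\, P(\theta, z \mid x) \big)
\quad \text{s.t.} \quad Q(\theta, z) = Q(\theta)\, Q(z)
```

Minimizing this KL divergence is equivalent to maximizing the evidence lower bound, and the factorization lets each factor be updated in turn while the other is held fixed.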