Mining Product Features and Customer Opinions


  1. Mining Product Features and Customer Opinions
     Ana-Maria Popescu, University of Washington
     http://www.cs.washington.edu/research/knowitall/

     Outline
     1 Mining Customer Reviews: Related Work
     2 OPINE: Tasks and Results
     3 Product Feature Extraction
     4 Customer Opinion Extraction
     5 Conclusions and Future Work

     Mining Customer Reviews
     - Finding and analyzing subjective phrases or sentences
       Positive: "The hotel had a great location."
       (Takamura '05, Wilson '04, Turney '03, Riloff et al. '03, etc.)
     - Classifying consumer reviews: polarity classification, strength classification
       Trump International, Review #4: positive (***)
       (Turney '02, Pang et al. '05, Pang et al. '02, Kushal et al. '03, etc.)
     - Extracting product features and opinions from reviews
       hotel_location: great[+]
       (Hu & Liu '04, Kobayashi '04, Yi et al. '05, Gamon et al. '05, OPINE)

     OPINE: Tasks and Results
     - Identify product features: P = 94%, R = 77%
     - Identify the semantic orientation of potential opinion words (adjectives, nouns, etc.) in the context of product features and review sentences: P = 78%, R = 88%
     - Identify opinion phrases: P = 79%, R = 76%
     - Identify opinion phrase polarity: P = 86%, R = 89%

     OPINE
     - KnowItAll is a Web-based information extraction system (Etzioni et al. '05). Given a target class (e.g., Country):
       - The Extractor instantiates extraction rules ("country such as [X]") and uses a search engine to find candidate instances.
       - The Assessor eliminates incorrect candidates using high-precision lexical patterns, scoring each candidate by Web PMI (a sketch follows below):
         PMI("[X] and other countries", "Garth Brooks") = hits("Garth Brooks and other countries") / (hits("and other countries") * hits("Garth Brooks"))
     - OPINE is built on top of KnowItAll:
       - It uses and extends KnowItAll's architecture.
       - It extensively uses high-precision lexical patterns.
       - It uses the Web to collect statistics.
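The Assessor's check is just a ratio of hit counts. Below is a minimal sketch of that computation, assuming a caller-supplied hits() helper that returns the hit count a search engine reports for a quoted query (the talk does not specify a search client):

```python
def pmi(discriminator: str, candidate: str, hits) -> float:
    """Web-PMI score between a discriminator phrase and a candidate instance,
    in the style of KnowItAll's Assessor.

    discriminator contains an "[X]" placeholder, e.g. "[X] and other countries".
    hits is a caller-supplied function mapping a quoted query to a hit count.
    """
    together = hits(discriminator.replace("[X]", candidate))
    apart = hits(discriminator.replace("[X]", "").strip()) * hits(candidate)
    return together / apart if apart else 0.0

# pmi("[X] and other countries", "Garth Brooks", hits) comes out tiny, so
# the Assessor rejects "Garth Brooks" as an instance of Country.
```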

  2. Outline (section 3: Product Feature Extraction)

     Feature Extraction
     Product classes: Hotels. Instances: Trump International.

       Features                       Examples
       Properties                     Quality, Size
       Parts                          Room
       Features of parts              RoomSize
       Related concepts               Neighborhood
       Features of related concepts   NeighborhoodSafety

     Feature Extraction (running example review):
     "I loved the hot water and the clean bathroom. The fan was broken and our room was hot the entire time. I like a nice, hot room when the snow piles up outside."

     - Extract noun phrases np such that np contains only nouns and frequency(np) > 1 as potential features (a sketch follows below).
     - Assess potential features using bootstrapped lexical patterns (discriminators). Examples:
       X of Y, Y has X, Y's X, Y with X, Y comes with X, Y equipped with X, Y contains X, Y boasts X, Y offers X
     - Assess each potential feature by its Web PMI with instantiated discriminators (second sketch below):
       PMI(hotel's [Y], room) = hits("hotel's room") / (hits("hotel's") * hits("room")) = 0.54 * 10^-13
       PMI(hotel's [Y], snow) = 0.64 * 10^-16
       PMI(hotel's [Y], room) >> PMI(hotel's [Y], snow), so "room" is kept as a feature while "snow" is discarded.
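A minimal sketch of the candidate-generation rule above, assuming spaCy for part-of-speech tagging and approximating the noun-only NPs as maximal runs of noun tokens (the talk does not name a tagger; the frequency threshold is from the slide):

```python
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed POS tagger; any tagger would do

def potential_features(reviews):
    """Collect maximal runs of noun tokens as noun-only NPs and keep those
    with corpus frequency > 1, per the slide's candidate-generation rule."""
    counts = Counter()
    for doc in nlp.pipe(reviews):
        run = []
        for tok in list(doc) + [None]:          # None flushes the final run
            if tok is not None and tok.pos_ in ("NOUN", "PROPN"):
                run.append(tok.lemma_.lower())
            elif run:
                counts[" ".join(run)] += 1
                run = []
    return {np for np, c in counts.items() if c > 1}

# On the running example review, "room" occurs twice and survives, while
# single-occurrence nouns like "fan" and "snow" are filtered out here
# (in OPINE the threshold applies across a whole review corpus).
```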

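The assessment step reuses the same hit-count PMI, instantiated with a class-specific discriminator such as hotel's [Y]. A sketch under the same assumed hits() helper:

```python
def discriminator_pmi(candidate: str, hits) -> float:
    """PMI between the instantiated "hotel's [Y]" discriminator and a
    candidate feature, computed from caller-supplied hit counts."""
    denom = hits("hotel's") * hits(candidate)
    return hits(f"hotel's {candidate}") / denom if denom else 0.0

# Per the slide: discriminator_pmi("room", hits) ~= 0.54e-13 while
# discriminator_pmi("snow", hits) ~= 0.64e-16, so "room" passes assessment.
```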
  3. Results (feature extraction)
     - 5 consumer electronics product classes (Hu & Liu '04), 314 reviews:

                   Hu & Liu   OPINE-Web   OPINE
       Precision   0.72       0.79        0.94
       Recall      0.80       0.66        0.77

       1/3 of OPINE's precision increase is due to OPINE's assessment; 2/3 is due to the Web PMI statistics.
     - 2 product classes (Hotels, Scanners), 1307 reviews: P = 89%, R = 73%.

     Outline (section 4: Customer Opinion Extraction)

     Potential Opinions
     - Use syntax-based rules to extract potential opinions po for each feature f (similar intuition to Kim & Hovy '04 and Hu & Liu '04); two of the rules are sketched in code after this section:
       - If [subj = f, pred = be, arg], then po := arg
       - If [subj, pred, obj = f], then po := pred
       - ...

     Semantic Orientation
     - "The room was hot(-) and stuffy(-)." vs. "After freezing for hours, the room was nice(+) and hot(+)."
     - Other context-dependent opinion words: cold, basic, loud, visible, casual, modern, central, quiet.
     - Task: compute the SO label for a (word, feature, sentence) tuple.
     - Solution:
       I.   Compute the SO label for each word.
       II.  Compute the SO label for each (word, feature) pair.
       III. Compute the SO label for each (word, feature, sentence) tuple.
     - Each solution step is a labeling problem, solved with relaxation labeling.

     Relaxation Labeling
     - Input: objects, labels, an initial object-label mapping, each object's neighborhood, and a support function q for an object-label pair.
     - Output: final object-label mapping.
     - RL update equation (w = word, L = SO label; a one-step implementation is sketched below):
       P(w = L)^(m+1) = P(w = L)^m * (1 + q(w, L)^m) / Σ_L' P(w = L')^m * (1 + q(w, L')^m)

     Word Semantic Orientation
     - Building word neighborhoods: conjunctions, disjunctions, syntactic attachment rules, WordNet synonymy/antonymy, morphology information.
       "I loved the hot water and the clean bathroom." → neighbor(hot, love, synt_dep_path), neighbor(hot, clean, and)
       "The room was spacious but hot." → neighbor(hot, spacious, but)
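A sketch of the first two syntax-based rules above, assuming spaCy dependency parses and mapping the slide's subj/pred/obj/arg slots onto nsubj, head verb, dobj, and acomp arcs (OPINE's rule inventory and syntactic machinery are richer than this):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed parser

def potential_opinions(sentence: str, features: set):
    """Two of the slide's syntax-based rules:
      [subj = f, pred = be, arg] -> po := arg    ("our room was hot" -> hot)
      [subj, pred, obj = f]      -> po := pred   ("I loved the water" -> love)
    Returns (potential opinion, feature) pairs.
    """
    pairs = []
    for tok in nlp(sentence):
        # Rule 1: feature is the subject of copular "be"; take the complement.
        if tok.dep_ == "nsubj" and tok.lemma_ in features and tok.head.lemma_ == "be":
            pairs += [(c.lemma_, tok.lemma_) for c in tok.head.children
                      if c.dep_ == "acomp"]
        # Rule 2: feature is a direct object; take the governing verb.
        elif tok.dep_ == "dobj" and tok.lemma_ in features:
            pairs.append((tok.head.lemma_, tok.lemma_))
    return pairs

# With a typical parse:
# potential_opinions("Our room was hot the entire time.", {"room"})  -> [("hot", "room")]
# potential_opinions("I loved the hot water.", {"water"})            -> [("love", "water")]
```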

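The RL update equation maps directly onto a one-step function. A sketch in which the support values q(w, L) are supplied by the caller (in OPINE they are derived from the object's neighborhood; that derivation is not given on the slide):

```python
def rl_update(p: dict, q: dict) -> dict:
    """One relaxation-labeling step:
    P(w=L)^(m+1) = P(w=L)^m * (1 + q(w,L)^m) / sum_L' P(w=L')^m * (1 + q(w,L')^m)

    p: current label distribution for one object, e.g. {"+": 0.56, "-": 0.43, "|": 0.01}
    q: neighborhood support for each label, in [-1, 1]
    """
    unnorm = {L: p[L] * (1.0 + q[L]) for L in p}
    z = sum(unnorm.values())
    return {L: v / z for L, v in unnorm.items()}

# Iterate until the distribution stops changing, then take the argmax label.
```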
  4. Semantic Orientation
     - Potential opinion words can change orientation based on features and sentence context:
       "I loved the hot water and the clean bathroom."
       "The fan was broken and our room was hot the entire time."
       "Our room was really hot."

     Computing Word Semantic Orientation (labels: + positive, - negative, | neutral)
     - Initialize the word-label mapping (SO-PMI-based method):

                  +      -      |
       hot        0.56   0.43   0.01
       clean      0.94   0.06   0.01
       spacious   0.89   0.10   0.01
       love       0.98   0.01   0.01

     - Relaxation labeling update over the word neighborhoods (and, but, attach):

                  +      -      |
       hot        0.72   0.21   0.09

     Computing Feature-dependent Semantic Orientation
     - Initialize the (w, f)-label mapping from the word labels, then update over feature-level neighbors:

       initial hot(room):   + 0.72, - 0.21, | 0.07
       neighbors:           and: stuffy(room) (+ 0.13, - 0.78, | 0.09)
                            and: broken(fan) (+ 0.10, - 0.86, | 0.01)
                            attach: unbearably (+ 0.01, - 0.98, | 0.01)
                            attach: stiflingly (+ 0.06, - 0.94, | 0.01)
       relaxation labeling update → hot(room): + 0.20, - 0.68, | 0.12

     Computing Sentence-dependent Semantic Orientation
     - "I like(+) a nice(+), hot(+) room when the snow piles up outside."

       initial hot(room):   + 0.20, - 0.68, | 0.12
       neighbors:           and: nice(room) (+ 0.98, - 0.01, | 0.01)
                            attach: like(room) (+ 0.93, - 0.06, | 0.01)
       relaxation labeling update → hot(room) in this sentence: + 0.65, - 0.24, | 0.11

       (A schematic sketch of this staged flow follows below.)

     Results (opinion extraction)
     - PMI++: version of the PMI-based method for finding the SO labels of words or (word, feature) pairs.
     - Hu++: version of Hu's WordNet-based method for finding word SO labels.
     - OP-1: OPINE version that only computes the dominant SO label of a word.

                   PMI++   Hu++   OP-1   OPINE
       Precision   0.72    0.74   0.69   0.78
       Recall      0.91    0.78   0.88   0.88

       OPINE's improvements are mostly due to its use of contextual information.

     Issues
     - Parsing errors (especially in long-range dependency cases) → missed candidate opinions, incorrect polarity assignments.
     - Sparse data for infrequent opinion words → incorrect polarity assignments.
     - Complicated opinion expressions: opinion nesting, conditionals, subjunctive expressions, etc.
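The three stages run the same relaxation-labeling loop at increasing specificity; only the objects and neighborhoods change. The schematic below reuses the rl_update() helper sketched earlier and plugs in the slide's hot(room) numbers; the support() function here is an illustrative stand-in (an average of neighbor label mass), not OPINE's actual support formula:

```python
def support(neighbors):
    """Illustrative support q: average the neighbors' label mass.
    Assumed for this sketch; OPINE's real q also weights relation types
    (e.g. flipping + and - across a "but")."""
    q = {"+": 0.0, "-": 0.0, "|": 0.0}
    for _relation, dist in neighbors:
        for L in q:
            q[L] += dist[L]
    n = max(len(neighbors), 1)
    return {L: v / n for L, v in q.items()}

# Stage II for hot(room): start from the word-level label of "hot" ...
p = {"+": 0.72, "-": 0.21, "|": 0.07}
neighbors = [  # the feature-level neighborhood from the slide
    ("and", {"+": 0.13, "-": 0.78, "|": 0.09}),     # stuffy(room)
    ("and", {"+": 0.10, "-": 0.86, "|": 0.01}),     # broken(fan)
    ("attach", {"+": 0.01, "-": 0.98, "|": 0.01}),  # unbearably
    ("attach", {"+": 0.06, "-": 0.94, "|": 0.01}),  # stiflingly
]
for _ in range(5):  # iterate the RL update toward convergence
    p = rl_update(p, support(neighbors))
# p now leans negative, in the direction of the slide's
# hot(room) = (+ 0.20, - 0.68, | 0.12); stage III repeats the loop with the
# in-sentence neighbors nice(room) and like(room) to recover the positive,
# sentence-level reading of "hot".
```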
