

  1. Peer-review analysis -- Comprehensive exam. Presented by: Wenting Xiong. Advisor: Diane Litman. Committee: Rebecca Hwa, Jingtao Wang

  2. Motivation • Goal: mine useful information in peers' feedback and represent it in an intuitive and concise way • Tasks and related research topics – Identifying review helpfulness → NLP: review analysis – Summarizing reviewers' comments → NLP: paraphrasing and summarization – Sense-making of review comments via interactive review exploration → HCI: visual text analytics

  3. Part 1. NLP -- Review Analysis

  4. Outline 1. Review helpfulness analysis 2. Sentiment analysis (opinion mining): aspect detection, sentiment orientation, sentiment classification & extraction

  5. 1 Review helpfulness analysis 1. Automatic prediction – Learning techniques – Feature utilities – The ground truth 2. Analysis of perceived review helpfulness – Users' bias when voting for helpfulness – Influence of the other reviews of the same product

  6. 1.1 -- Learning techniques • Problem formalization – Input: textual reviews – Output: helpfulness score • Learning algorithms – Supervised learning – regression • Product reviews (e.g. electronics) <Kim 2006>, <Zhang 2006>, <Liu 2007>, <Ghose 2010>, <O'Mahony 2010> • Trip reviews <Zhang 2006> • Movie reviews <Zhang 2006> – Unsupervised learning – clustering • Book reviews <Tsur 2009> • Focus – Predicting absolute scores vs. rankings – Identifying the most helpful <Liu 2007> vs. unhelpful <Tsur 2009> reviews
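As a sketch, the supervised regression formulation on this slide maps a textual review to a helpfulness score. The features, mini-corpus, and scores below are invented for illustration; real systems use much richer feature sets and proper learners:

```python
# Toy sketch of the supervised formulation: map simple review features
# to a helpfulness score with least-squares linear regression.
# All features and training data here are invented for illustration.

def extract_features(review):
    words = review.split()
    return [1.0,                       # bias term
            len(words) / 100.0,        # review length (scaled)
            review.count('.') / 10.0]  # rough sentence count (scaled)

def fit_linear(X, y, lr=0.1, epochs=2000):
    """Plain stochastic gradient descent on mean squared error."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = sum(wj * xj for wj, xj in zip(w, xi))
            err = pred - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w

# Invented mini-corpus: (review text, fraction of "helpful" votes).
train = [
    ("Great camera. Battery lasts all day. Lens is sharp.", 0.9),
    ("Bad.", 0.1),
    ("Works fine. Shipping was slow but product is solid.", 0.7),
    ("Terrible item.", 0.2),
]
X = [extract_features(r) for r, _ in train]
y = [score for _, score in train]
w = fit_linear(X, y)

def predict(review):
    return sum(wj * xj for wj, xj in zip(w, extract_features(review)))
```

On this toy data, longer and more detailed reviews receive higher predicted helpfulness, matching the intuition behind the length/structural features surveyed on the next slide.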

  7. 1.1 -- Feature utilities • Features used to model review helpfulness – Low level (linguistic): unigrams & bigrams; structural; syntactic; semantic (domain lexicons, subjectivity via sentiment analysis) – High level: readability metrics; social factors (reviewer profile); product ratings • Conflicting results on the effectiveness of subjectivity features – Term-based counts are not useful <Kim et al. 2006>, while category-based counts show positive words correlate with greater helpfulness <Ghose et al. 2010> – A data sparsity issue?
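To make the feature taxonomy concrete, a toy extractor for a few of the low-level and readability features might look like the following (the particular feature names and the sample text are invented; real systems add syntactic, semantic, and metadata features):

```python
# Illustrative extraction of a few feature types from the taxonomy above:
# structural (length, sentences), lexical (unigram counts), and a crude
# readability proxy. Everything here is a simplified stand-in.

from collections import Counter

def review_features(text):
    words = text.lower().split()
    sentences = [s for s in text.split('.') if s.strip()]
    unigrams = Counter(words)
    avg_sentence_len = len(words) / max(len(sentences), 1)
    return {
        "num_words": len(words),                  # structural
        "num_sentences": len(sentences),          # structural
        "top_unigrams": unigrams.most_common(3),  # lexical
        "avg_sentence_len": avg_sentence_len,     # readability proxy
    }

feats = review_features("The lens is sharp. The battery is weak. Overall a good buy.")
```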

  8. 1.1 -- The ground truth • Various gold standards for review helpfulness – Aggregated helpfulness votes → perceived helpfulness, e.g. <Kim 2006> – Manual annotations of helpfulness → real helpfulness <Liu 2007> • Problems – The percentage of helpful votes is not consistent with annotators' judgments based on helpfulness specifications – Error rate of preference pairs < 0.5 <Liu 2007>

  9. 1 Review helpfulness analysis 1. Automatic prediction – Learning techniques – Feature utilities – The ground truth 2. Analysis of perceived review helpfulness – Biased voting of review helpfulness on Amazon.com – Perceived helpfulness is not determined by the textual content alone

  10. 1.2 Analysis of perceived review helpfulness • Biased voting of review helpfulness on Amazon.com – Imbalanced votes – Winner-circle bias – Early-bird bias <Liu 2007> → "x/y" does not capture the true helpfulness of reviews • Perceived helpfulness is not determined by the textual content alone – Influence of the other reviews of the same product – Individual bias <Danescu-Niculescu-Mizil 2009>

  11. 1 Review helpfulness analysis • Summary – Effective features for identifying review helpfulness – Perceived helpfulness vs. real helpfulness • Comments – New features • Introduce domain knowledge and information from other dimensions – The data sparsity problem • High-level features • Deep learning from low-level features – Other machine learning techniques • Theory-based generative models

  12. Outline 1. Review helpfulness analysis 2. Sentiment analysis (opinion mining)

  13. 2 Sentiment analysis (opinion mining) How do people think about what? 1. Aspect detection 2. Sentiment orientation 3. Sentiment classification & extraction

  14. 2.1 Aspect detection • Frequency-based approaches – Most frequent noun phrases + sentiment-pivot expansion <Liu, 2004> – PMI (pointwise mutual information) with meronymy discriminators + WordNet <Popescu 2005> • Generative approaches – LDA, MG-LDA <Titov 2008>, sentence-level local LDA <Brody 2010> – Multiple-aspect sentiment model <Titov 2008> – Content-attitude model <Sauper 2011>
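The core of the frequency-based idea in <Liu, 2004> can be sketched very roughly: content words that recur across many review sentences are candidate aspects. The sketch below substitutes a small stopword list for real POS tagging and association mining, and the mini-corpus is invented:

```python
# Rough sketch of frequency-based aspect detection: treat frequent
# content words across review sentences as candidate aspects. Real
# systems keep noun phrases via a POS tagger and mine multiword
# aspects; the stopword list here is a crude stand-in.

from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "was", "it", "and", "but", "very",
             "i", "this", "great", "bad", "good", "poor", "too"}

def frequent_aspects(sentences, min_support=2):
    counts = Counter()
    for s in sentences:
        # One vote per sentence, so a repeated word counts once.
        tokens = {w.strip(".,!?").lower() for w in s.split()}
        counts.update(t for t in tokens if t and t not in STOPWORDS)
    return [term for term, c in counts.most_common() if c >= min_support]

reviews = [
    "The battery life is great.",
    "Battery drains too fast.",
    "The screen is bright and the battery is fine.",
    "Screen resolution is poor.",
]
aspects = frequent_aspects(reviews)
```

On this toy input, "battery" and "screen" clear the support threshold while incidental words do not.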

  15. 2.2 Sentiment orientation • Aggregating from subjective terms – Manually constructed subjective lexicons • Bootstrapping with PMI – Adjectives & adverbs <Turney 2001> – Opinion-bearing words <Liu 2004> • Graph-based approaches – Relaxation labeling <Popescu 2005> – Scoring <Brody 2010> • Domain adaptation – SCL algorithm <Blitzer 2007> • Through topic models – MAS: aspect-independent + aspect-dependent <Titov 2008> – Content-attitude models: predicted posterior of the sentiment distribution <Sauper, 2011>
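The PMI bootstrapping idea from <Turney 2001> orients a word by how much more it co-occurs with positive seed words than with negative ones. Turney estimated PMI from web hit counts; in this invented sketch, "co-occurrence" just means appearing in the same toy sentence:

```python
# Sketch of PMI-based semantic orientation: SO(w) = PMI(w, pos) - PMI(w, neg),
# estimated from co-occurrence counts in a tiny invented corpus.

import math

corpus = [
    "excellent sharp lens",
    "excellent responsive screen",
    "poor flimsy build",
    "poor flimsy strap",
    "sharp excellent photos",
]

def cooccur(word, seed):
    return sum(1 for s in corpus if word in s.split() and seed in s.split())

def count(word):
    return sum(1 for s in corpus if word in s.split())

def semantic_orientation(word, pos="excellent", neg="poor", eps=0.01):
    """eps smooths zero counts so the logs stay defined."""
    n = len(corpus)
    pmi_pos = math.log(((cooccur(word, pos) + eps) * n) /
                       ((count(word) + eps) * (count(pos) + eps)))
    pmi_neg = math.log(((cooccur(word, neg) + eps) * n) /
                       ((count(word) + eps) * (count(neg) + eps)))
    return pmi_pos - pmi_neg
```

Words that keep company with "excellent" come out positive, words that keep company with "poor" come out negative.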

  16. 2.3 Sentiment classification and extraction • Classification – Binary <Turney 2001> – Finer-grained, e.g. metric labeling <Pang 2005> • Data sparsity – Bag-of-words vs. bag-of-opinions <Qu 2010> • Opinion-oriented extraction – Topic of interest • Pre-defined • Automatically learned • User-specified
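For the binary classification setting, the standard supervised bag-of-words baseline is a Naive Bayes classifier (note <Turney 2001> itself is unsupervised; this sketch shows the conventional supervised alternative, on an invented four-review training set):

```python
# Minimal bag-of-words Naive Bayes for binary sentiment classification.
# Training data is invented; real systems train on thousands of reviews.

import math
from collections import Counter

train = [("great picture and sound", "pos"),
         ("love the battery life", "pos"),
         ("terrible screen", "neg"),
         ("awful battery and terrible sound", "neg")]

word_counts = {"pos": Counter(), "neg": Counter()}
doc_counts = Counter()
for text, label in train:
    word_counts[label].update(text.split())
    doc_counts[label] += 1

vocab = set(word_counts["pos"]) | set(word_counts["neg"])

def classify(text):
    scores = {}
    for label in ("pos", "neg"):
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / len(train))  # class prior
        for w in text.split():
            # Laplace smoothing over the shared vocabulary
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
```

The bag-of-opinions idea of <Qu 2010> would replace these unigram counts with composed opinion units to fight exactly the sparsity this unigram model suffers from.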

  17. 2 Summary Comparing review helpfulness and sentiment • In terms of automatic prediction, both are metric-inference problems that can be formalized as standard ML problems with the same input X but different outputs Y • The learned knowledge about opinion topics and the associated sentiments would help model the general utility of reviews

  18. Part 2. NLP -- Paraphrasing & Summarization

  19. Outline 1. Paraphrasing (paraphrases are semantically equivalent to each other) 1. Paraphrase recognition 2. Paraphrase generation 2. Summarization (a shorter representation of the same semantic information as the input text) 1. Informativeness computation 2. Extractive summarization of evaluative text

  20. 1.1 Paraphrase recognition • Discriminative approach – Various string similarity metrics – Different levels of abstraction of textual strings <Malakasiotis 2009> • Question: any useful existing resources for identifying equivalent semantic information? – Word level: dictionary, WordNet – Phrase level: ? – Sentence level: ?
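The discriminative recipe above feeds many string-similarity scores into a trained classifier. As a minimal stand-in, the sketch below combines just two metrics, word-level Jaccard overlap and character-sequence similarity from the standard library's difflib, with a hand-set threshold instead of a learned one:

```python
# Toy paraphrase recognizer: average of word-overlap (Jaccard) and
# character-sequence similarity, thresholded. Real systems use dozens
# of metrics at several abstraction levels plus a trained classifier;
# the threshold and example pairs here are invented.

from difflib import SequenceMatcher

def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def char_similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_paraphrase(a, b, threshold=0.5):
    score = 0.5 * jaccard(a, b) + 0.5 * char_similarity(a, b)
    return score >= threshold
```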

  21. 1.2 Paraphrase generation • Corpora – Monolingual vs. bilingual • Methods – Distributional-similarity based – Corpora based • Evaluation – Intrinsic vs. extrinsic evaluation

  22. 1.2 -- Corpora • Monolingual corpora – Parallel corpora • Translation candidates • Definitions of the same term – Comparable corpora • Summaries of the same event • Documents on the same topic • Bilingual parallel corpora

  23. 1.2 -- Methods.1 • Distributional-similarity based methods – DIRT: paths that frequently occur with the same words at their ends • Uses a single monolingual corpus • MI to measure the association strength between a slot and its arguments <Lin 2001> – Sentence lattices: argument similarity of multiple slots on sentence lattices • Uses a comparable monolingual corpus • Hierarchical clustering to group similar sentences • MSA to induce lattices <Barzilay 2003>

  24. 1.2 -- Methods.2 • Corpora-based methods – Monolingual parallel corpus • Monolingual MT <Quirk 2004> • Merging partial parse trees into FSAs <Pang 2003> • Paraphrasing from definitions <Hashimoto 2011> – Monolingual comparable corpus • MSR paraphrase corpus <Dolan 2005> • Edit distance, journalism conventions • Sentence lattices <Barzilay 2003> – Bilingual parallel corpus • Pivot approach <Callison-Burch 2005> <Zhao 2008> • Random-walk based HTP <Kok 2009>

  25. 1.2 -- Evaluation • Intrinsic evaluation – Responsiveness • Can assess precision, but not recall – Standard test references <Callison-Burch 2008> • Manually aligned corpus • Lower-bound precision & relative recall • Extrinsic evaluation – Alignment tasks in monolingual translation • Alignment error rate • Alignment precision, recall, F-measure <Dolan 2004> • Model-specific evaluation – FSA <Pang 2005>
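The alignment-style extrinsic evaluation reduces to comparing predicted alignment pairs against a gold set. A minimal sketch, with invented (source_index, target_index) pairs:

```python
# Precision, recall, and F-measure over word-alignment pairs, as in
# alignment-based evaluation. The gold and predicted sets are invented.

def precision_recall_f1(predicted, gold):
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(0, 0), (1, 2), (2, 1), (3, 3)}
predicted = {(0, 0), (1, 2), (3, 4)}
p, r, f = precision_recall_f1(predicted, gold)
```

Here 2 of 3 predicted pairs are correct (precision 2/3) and 2 of 4 gold pairs are recovered (recall 1/2).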

  26. 2 Summarization • Tasks in automatic summarization I. Content selection II. Information ordering III. Automatic editing, information fusion • Focus of this talk 1. Informativeness computation 2. Information selection (and generation) 3. Summarization evaluation

  27. 2.1 Computing informativeness • Semantic information (topic identification) – Word level • Frequency, TF-IDF <Liu 2004>, topic signatures <Lin 2001>, PMI(w, topic) <Wang 2011>, external domain knowledge <Zhuang 2006> – Sentence level • HMM content models <Barzilay 2004> • Category classification + sentence clustering <Abu-Jbara 2011> – Summary level • Sentiment-aspect match model + KL divergence <Lerman 2009> • Opinion-based sentiment scores for evaluative texts – Sentiment polarity, intensity, mismatch, diversity <Lerman 2009> • Discriminative approach to predicting informativeness – Combines statistical, semantic, and sentiment features in linear or log-linear models <Wang 2011>
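The simplest word-level informativeness signal above, TF-IDF, can be sketched by scoring each sentence by the summed TF-IDF of its words, treating each sentence as a "document" (the sentences are invented; topic signatures and content models are the more sophisticated variants listed on this slide):

```python
# Word-level informativeness sketch: rank sentences by TF-IDF weight,
# as a crude proxy for content selection. Input sentences are invented.

import math
from collections import Counter

def tfidf_rank(sentences):
    docs = [s.lower().split() for s in sentences]
    # Document frequency: in how many "documents" (sentences) a word appears.
    df = Counter(w for d in docs for w in set(d))
    n = len(docs)

    def score(doc):
        tf = Counter(doc)
        return sum(tf[w] / len(doc) * math.log(n / df[w]) for w in tf)

    return sorted(sentences, key=lambda s: score(s.lower().split()), reverse=True)

sents = ["the camera is fine",
         "the camera is fine overall",
         "zoom quality degrades badly in low light"]
ranked = tfidf_rank(sents)
```

The sentence made of rare, topic-specific words ranks highest, while near-duplicate generic sentences score low, which is the behavior a content-selection step wants.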
