Feature Design for Polarity Classification Presentation for FEAST (Saarland University) 22nd April 2009 By Michael Wiegand Spoken Language Systems (IRTG)
Outline of Talk
• Introduction to Polarity Classification
• Semi-Supervised Learning for Document-Level Classification
  • The Task
  • Related Work
  • Feature Design
  • Experiments
• Supervised Learning for Sentence-Level Classification
  • The Task
  • Related Work
  • Feature Design
  • Experiments
• Topic-Related Sentence-Level Classification
• Conclusion
What is Polarity Classification?
• Polarity Classification is a subtask in Opinion Mining
• 2 different types of text classification in Opinion Mining:
  • Subjectivity Detection
    • Does a text represent an opinion or a fact?
  • Polarity Classification
    • Given an opinionated text, is the opinion expressed in the text positive or negative?
Why Polarity Classification?
• Increasingly more opinionated content on the web (Web 2.0) → need for retrieving/classifying this kind of content
• What makes polarity classification difficult?
  • Different from common topic classification
  • Different kinds of cues: […]; not necessarily frequent content words!
  • […] of polar expressions
Semi-Supervised Learning - an Illustration [figure sequence]
The Task
• Document-level text classification of reviews
• Decide whether a document is a positive or a negative review
• Use labeled and unlabeled documents for training
• All documents, both labeled and unlabeled, are assumed to be subjective
• All documents, both labeled and unlabeled, are either positive or negative reviews
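One way to picture training on both labeled and unlabeled reviews is self-training. The sketch below is only an illustration under assumptions: the slides do not state which semi-supervised algorithm is used, and the Naive Bayes model, the function names (`train_nb`, `predict`, `self_train`) and the toy reviews are all hypothetical.

```python
import math
from collections import Counter

LABELS = ("pos", "neg")

def train_nb(docs, labels):
    """Train a multinomial Naive Bayes model from tokenized documents."""
    word_counts = {label: Counter() for label in LABELS}
    class_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, class_counts, vocab

def predict(model, tokens):
    """Return (label, margin); the margin is the gap between the log scores."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in LABELS:
        # Add-one smoothing for both the class prior and the word likelihoods.
        score = math.log((class_counts[label] + 1) / (total_docs + len(LABELS)))
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    label = max(scores, key=scores.get)
    return label, abs(scores["pos"] - scores["neg"])

def self_train(labeled, unlabeled, rounds=3, per_round=1):
    """Repeatedly add the most confidently classified unlabeled documents."""
    docs = [tokens for tokens, _ in labeled]
    labels = [label for _, label in labeled]
    pool = list(unlabeled)
    model = train_nb(docs, labels)
    for _ in range(rounds):
        if not pool:
            break
        ranked = sorted(pool, key=lambda d: predict(model, d)[1], reverse=True)
        confident, pool = ranked[:per_round], ranked[per_round:]
        for tokens in confident:
            docs.append(tokens)
            labels.append(predict(model, tokens)[0])  # use the predicted label
        model = train_nb(docs, labels)
    return model

# Toy data (hypothetical): two labeled and two unlabeled reviews.
labeled = [("great wonderful plot".split(), "pos"),
           ("boring awful plot".split(), "neg")]
unlabeled = ["wonderful great superb".split(), "awful boring dull".split()]
model = self_train(labeled, unlabeled, rounds=2, per_round=1)
print(predict(model, ["superb"])[0], predict(model, ["dull"])[0])
```

In this toy run, "superb" and "dull" occur only in the unlabeled reviews, so the classifier can only learn their polarity through the self-training rounds.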
Related Work
• Supervised Learning:
  • Different algorithms and feature selection/extraction methods [Pang 2002; Salvetti 2006; Ng 2006; Gamon 2004]
• Unsupervised Learning:
  • Induction of polarity lexicons (i.e. identification of polar expressions) using […] [Turney 2002]
• Semi-Supervised Learning:
  • Extending Turney's web-mining approach with labeled data [Beineke 2004]
  • EM in the context of domain adaptation [Aue 2005]
Contribution of this work
• First extensive study of semi-supervised learning for polarity classification
• Comparison of different feature sets
• Evaluation on various domains
Why is feature selection more important in semi-supervised learning than in supervised learning?
• Less information is contained in small labeled datasets → the intrinsic predictiveness of features is important
• Inappropriate feature sets may lead semi-supervised classifiers astray
• In polarity classification there is the danger that topic information interferes
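To make the point about topic interference concrete, the sketch below contrasts a plain bag-of-words representation with one restricted to polar expressions. It is an illustration under assumptions: the slides do not say which feature sets are compared, and the tiny lexicon `POLAR_WORDS` and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon; a real polarity lexicon would be far larger.
POLAR_WORDS = {"good", "great", "excellent", "bad", "awful", "boring"}

def bag_of_words(tokens):
    """All tokens as features, including topic words such as 'camera'."""
    return Counter(tokens)

def lexicon_features(tokens, lexicon=POLAR_WORDS):
    """Only tokens found in the polarity lexicon; topic words are dropped."""
    return Counter(tok for tok in tokens if tok in lexicon)

review = "the camera has a great lens but awful battery".split()
print(sorted(bag_of_words(review)))
print(sorted(lexicon_features(review)))
```

With the restricted feature set, a classifier that sees only a few labeled reviews cannot latch onto topic words like "camera" or "battery"; the cost is that polar cues missing from the lexicon are lost.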