Mining Domain-Specific Dictionaries


  1. Mining Domain-Specific Dictionaries
     Pantelis Agathangelou (Open University of Cyprus), Ioannis Katakis (University of Athens), Fotios Kokkoras (Technological Educational Institute of Thessaly), Konstantinos Ntonas (International Hellenic University)
     15th International Conference on Web Information Systems Engineering (WISE 2014)

  2. Summary
     • The Opinion Mining Problem
     • Introduction
     • Proposed Method
     • Experimental Evaluation
     • Interface
     Pantelis Agathangelou, Mining Domain Specific Dictionaries

  3. The Opinion Mining Problem in Dictionary-Based Solutions
     Collected opinions, gathered from:
     • Social Networks
     • Blogs
     • Discussion Boards
     Analyze opinions: data analysis with a domain lexicon
     Classify opinions: classification of each opinion, e.g. 1. Positive, 2. Positive, 3. Negative, 4. Positive
     Output: summarized results

  4. DOMAIN SPECIFIC OPINION LEXICON
     LEXICON ATTRIBUTES
     • List of terms of known polarity (Positive or Negative)
     • Strength or Sentiment Tension
     SAMPLE
     Positive       Sentiment Tension
     Beautiful      10
     Astonishing    4
     Cool           -4
     Negative       Tension
     Slow           -6
     Ugly           -3
     Low            +3
     Callout on the sample: differences in comparison to generic lexicons
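To make the role of such a lexicon concrete, here is a minimal sketch of dictionary-based classification: each opinion is scored by summing the strengths of the lexicon terms it contains. The six scores are the sample values from the slide; the function names `score_opinion` and `classify`, the tokenization, and the "sum then take the sign" rule are assumptions for illustration, not the authors' classifier.

```python
# Sample lexicon: term -> signed sentiment strength (values from the slide).
LEXICON = {
    "beautiful": 10, "astonishing": 4, "cool": -4,
    "slow": -6, "ugly": -3, "low": 3,
}

def score_opinion(text, lexicon=LEXICON):
    """Sum the strengths of all lexicon terms found in the text."""
    tokens = text.lower().split()
    # Strip trailing punctuation so "slow." still matches "slow".
    return sum(lexicon.get(tok.strip(".,!?;:"), 0) for tok in tokens)

def classify(text, lexicon=LEXICON):
    """Map the summed score to a polarity label (assumed rule)."""
    score = score_opinion(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these values, "low price" scores +3, which is the kind of domain-specific polarity a generic lexicon would likely miss.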

  5. INTRODUCTION
     What we do in this paper?
     • We mine a domain specific dictionary
     • We implement a multiple stage approach
     • We utilize language patterns for the extraction process
     What is the innovation?
     • The designed algorithm can operate with a small initial seed list
     • The method is unsupervised
     • It can operate in multiple languages, provided the appropriate patterns
     • Produces fast and accurate results

  6. Proposed Method
     • Opinion Preprocessing
     • Auxiliary List Preparation (Modules)
     • Seed Import and Filtered Seed Extraction
     • Conjunction Based Extraction
     • Double Propagation & Opinion Word Validation

  7. OPINION PREPROCESSING
     • Receives user opinions in raw form.
     • Applies some form of preprocessing.
     • Sentence splitting (delimitation) by sentence splitters.
     • Additionally, a stemmer engine.
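The preprocessing step could be sketched roughly as follows. The slide does not name the splitter or the stemmer, so the regular expressions and the tiny suffix stripper `toy_stem` below are stand-ins chosen for illustration; a real system would plug in a proper stemmer for the target language.

```python
import re

def split_sentences(raw):
    """Split raw opinion text on ., !, ? and newlines; drop empty pieces."""
    parts = re.split(r"[.!?\n]+", raw)
    return [p.strip() for p in parts if p.strip()]

def toy_stem(word):
    """Very rough suffix stripper; a placeholder for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "es", "s"):
        # Keep at least three characters of the stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(raw, stem=False):
    """Return a list of sentences, each a list of lowercase tokens."""
    sentences = []
    for sent in split_sentences(raw):
        tokens = re.findall(r"[a-z']+", sent.lower())
        if stem:
            tokens = [toy_stem(t) for t in tokens]
        sentences.append(tokens)
    return sentences
```

The later sketches in this transcript assume input in this shape: a list of token lists, one per sentence.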

  8. AUXILIARY LIST PREPARATION - MODULES
     Articles    Basic Verbs    Comparatives    Decreasers
     the         be             cheaper         little
     a           bend           finer           clearer
     an          chose          newer           slower
     one         throw          stronger        poorer

     Future Words    Increasers    Negations    Pronouns
     will            better        none         my
     to              hotter        no           they
     let             harder        any          everybody
     if              darker        anyone       her

     Sum: 380 word constants
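One plausible way to hold these modules in code is a set per category plus an inverted lookup, so that each token can be tagged with its module before pattern matching. The words are the samples shown on the slide (the full lists total 380 constants); the names `MODULES`, `WORD_TO_MODULE` and `tag_tokens` are hypothetical.

```python
# Sample entries only; the slide reports 380 word constants in total.
MODULES = {
    "article": {"the", "a", "an", "one"},
    "basic_verb": {"be", "bend", "chose", "throw"},
    "comparative": {"cheaper", "finer", "newer", "stronger"},
    "decreaser": {"little", "clearer", "slower", "poorer"},
    "future_word": {"will", "to", "let", "if"},
    "increaser": {"better", "hotter", "harder", "darker"},
    "negation": {"none", "no", "any", "anyone"},
    "pronoun": {"my", "they", "everybody", "her"},
}

# Invert to word -> module name for constant-time lookup.
WORD_TO_MODULE = {w: name for name, words in MODULES.items() for w in words}

def tag_tokens(tokens):
    """Pair each token with its module name, or None if it is in no module."""
    return [(tok, WORD_TO_MODULE.get(tok)) for tok in tokens]
```

Tokens tagged `None` are the candidates of interest: they are neither function words nor modifiers, so they may be opinion words or opinion targets.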

  9. SEED IMPORT AND FILTERED SEED EXTRACTION
     FILTER SEED EXTRACTION PATTERNS
     • Positive Seed Patterns
     • Negative Seed Patterns
     • 21 positive, 12 negative polarity patterns in total
     EXTRACTION PROCESS
     • Inputs: Opinions, Seed, Patterns, Modules
     • Step: Seed Extraction
     • Output: Filter Seed Lexicon
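The slide gives only the pattern counts, not the patterns themselves. As a rough sketch of what a pattern-filtered seed step might look like, the snippet below keeps an imported seed word only if it actually occurs in the domain opinions, and flips its polarity when a negation word directly precedes it. The single negation pattern, the `NEGATIONS` set and the function name `filter_seeds` are all assumptions made for illustration.

```python
# Hypothetical negation list; in the method this would come from a module.
NEGATIONS = {"no", "none", "not"}

def filter_seeds(sentences, seeds):
    """Return {word: summed polarity} for seed words seen in the opinions.

    sentences: list of token lists; seeds: {word: +1 or -1}.
    """
    lexicon = {}
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in seeds:
                continue
            polarity = seeds[tok]
            # Illustrative pattern: a directly preceding negation flips polarity.
            if i > 0 and tokens[i - 1] in NEGATIONS:
                polarity = -polarity
            lexicon[tok] = lexicon.get(tok, 0) + polarity
    return lexicon
```

Seed words that never appear in the domain corpus simply drop out, which is one way a generic seed list could be narrowed to a domain.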

  10. CONJUNCTION-BASED EXTRACTION
      CONJUNCTION BASED EXTRACTION PATTERNS
      • Positive Conj Based Patterns
      • Negative Conj Based Patterns
      • 6 positive, 4 negative extraction patterns in total
      EXTRACTION PROCESS
      • Inputs: Opinions, Filter Seed, Patterns, Modules
      • Step: Conjunction Based Extraction
      • Output: Conjunction Based Lexicon
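A minimal sketch of the idea behind conjunction-based extraction: a word joined by "and" to a word of known polarity is assumed to share that polarity, while "but" is assumed to signal the opposite. These two rules are the classic conjunction heuristic and stand in for the 6 positive and 4 negative patterns, which the slide does not list; `conjunction_extract` is a hypothetical name.

```python
def conjunction_extract(sentences, lexicon):
    """Find new polar words conjoined with words already in the lexicon.

    sentences: list of token lists; lexicon: {word: polarity}.
    Returns {new_word: inherited polarity}.
    """
    found = {}
    for tokens in sentences:
        for i in range(1, len(tokens) - 1):
            conj = tokens[i]
            if conj not in ("and", "but"):
                continue
            left, right = tokens[i - 1], tokens[i + 1]
            sign = 1 if conj == "and" else -1  # "but" flips polarity
            # Propagate in both directions across the conjunction.
            for known, new in ((left, right), (right, left)):
                if known in lexicon and new not in lexicon and new not in found:
                    found[new] = sign * lexicon[known]
    return found
```

This sketch has no part-of-speech or stop-word check, so a phrase like "great and the" would wrongly add "the"; filtering candidates against auxiliary word lists would be needed in practice.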

  11. DOUBLE PROPAGATION EXTRACTION METHOD
      DOUBLE PROPAGATION PATTERNS
      • 6 extraction patterns in total
      EXTRACTION PROCESS
      • Inputs: Opinions, Patterns, Modules, Filter Seed + Conj Based Lexicon (e.g. wide and tall)
      • Opinion Target Extraction (e.g. nice phone, amazing screen), producing the Opinion Target List
      • Double Propagation Extraction over the Opinion Target List
      • Output: Double Propagation Lexicon
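The back-and-forth between opinion words and opinion targets can be sketched as a fixed-point loop. Here plain adjacency ("nice phone") replaces the six extraction patterns, which the slide does not spell out, so this is a deliberately crude approximation; the function name and the `stopwords` parameter are hypothetical.

```python
def double_propagation(sentences, seed_words, stopwords=frozenset()):
    """Alternate between finding targets and finding new opinion words.

    Rule 1: a word right after a known opinion word becomes a target.
    Rule 2: an unknown word right before a known target becomes an opinion word.
    Returns (new_opinion_words, targets).
    """
    opinion_words = set(seed_words)
    targets = set()
    changed = True
    while changed:  # repeat until a full pass adds nothing
        changed = False
        for tokens in sentences:
            for first, second in zip(tokens, tokens[1:]):
                if first in stopwords or second in stopwords:
                    continue
                if (first in opinion_words and second not in opinion_words
                        and second not in targets):
                    targets.add(second)
                    changed = True
                elif (second in targets and first not in opinion_words
                        and first not in targets):
                    opinion_words.add(first)
                    changed = True
    return opinion_words - set(seed_words), targets
```

Starting from the single seed "nice", the sentences "nice phone", "amazing screen", "nice screen" and "amazing phone" yield the targets phone and screen and the new opinion word amazing. Adjacency misfires on stacked modifiers such as "magnificent cool screen", which is one reason real patterns are needed.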

  12. DOUBLE PROPAGATION SENTIMENT EXTRACTION
      STEP 1
      • Intra-Sentential Sentiment Consistency
      INTRA SENTENTIAL EXAMPLE
      (magnificent cool screen)[+1]
      Opinion word cool is extracted from opinion target screen, inherits sentence polarity [+1]
      STEP 2
      • Inter-Sentential Sentiment Consistency (max depth = 3)
      INTER SENTENTIAL EXAMPLE
      (very good that mobile)[+1] (awesome screen)[0] (easy browsing fabulous graphics)[+2]
      awesome extracted from screen, inherits sentence polarity [+2] at depth 1
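The two consistency steps can be sketched as a single lookup: use the word's own sentence polarity if it is non-zero, otherwise search neighbouring sentences at increasing distance up to the maximum depth. The slide does not say how to choose when both neighbours at the same depth are polar; picking the one with the larger absolute polarity is an assumption made here because it reproduces the slide's example. The function name `inherit_polarity` is hypothetical.

```python
def inherit_polarity(polarities, index, max_depth=3):
    """Return (polarity, depth) for a new word found in sentence `index`.

    polarities: per-sentence polarity scores of one opinion, in order.
    Depth 0 means intra-sentential; depth None means nothing was found.
    """
    # Step 1: intra-sentential consistency.
    if polarities[index] != 0:
        return polarities[index], 0
    # Step 2: inter-sentential consistency, nearest sentences first.
    for depth in range(1, max_depth + 1):
        candidates = []
        for j in (index - depth, index + depth):
            if 0 <= j < len(polarities) and polarities[j] != 0:
                candidates.append(polarities[j])
        if candidates:
            # Assumed tie-break: strongest neighbour at this depth wins.
            return max(candidates, key=abs), depth
    return 0, None
```

For the inter-sentential example, `inherit_polarity([1, 0, 2], 1)` gives `(2, 1)`: the word in the neutral middle sentence inherits +2 at depth 1.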

  13. OPINION WORD VALIDATION
      • We use the extracted double propagation opinion word set and opinion target word set
      • Sentiment Threshold [Sent]: minimum accepted polarity
      • Frequency Threshold [Freq]: minimum accepted frequency of co-existence of opinion word and opinion target
      PROCESS
      • Inputs: Opinions, Opinion Target List, Double Propagation Opinion Word List
      • Filter: > [Sent] && [Freq]
      • Output: Filter Double Propagation Lexicon
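The validation filter amounts to two threshold checks per candidate word. In the sketch below the polarity test uses the absolute value so that negative words can also pass, and both comparisons are inclusive ("minimum accepted"); both choices, along with the name `validate` and the input layout, are assumptions.

```python
def validate(candidates, sent_threshold, freq_threshold):
    """Keep candidates that clear both thresholds.

    candidates: {word: (polarity, co-occurrence count with opinion targets)}.
    Returns {word: polarity} for the accepted words.
    """
    return {
        word: polarity
        for word, (polarity, freq) in candidates.items()
        if abs(polarity) >= sent_threshold and freq >= freq_threshold
    }
```

Raising either threshold trades recall for precision: fewer words survive, but those that do have stronger and more frequent evidence.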

  14. EXPERIMENTAL RESULTS
      ALGORITHM FEATURES
      • When Conjunction Based Extraction fails to discover seed words, double propagation fills in the extraction gap.
      • When the Conjunction Based Extraction value is balanced, so is the double propagation value.
      [Figure: Average Precision - Recall Metrics]
      Conclusion: The above results justify the unsupervised manner of the proposed method.
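For reference, precision and recall of an extracted lexicon against a hand-labelled gold list can be computed as below. This is the standard set-based definition; how the authors built their gold standard and averaged across datasets is not stated on the slide, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(extracted, gold):
    """Set-based precision and recall of extracted words against a gold list."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)  # correctly extracted words
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall
```

Precision drops when non-opinion words (for example, targets such as "phone") leak into the lexicon; recall drops when genuine opinion words are never reached by any extraction step.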

  15. EXPERIMENTAL RESULTS
      [Figures: Comparison Between Extraction Steps Evaluation; Quality of the Extracted Lexicon by Classification]
      • Filter Seed builds the base of classification, but double propagation extends it.
      • Conjunction-based extraction has a low impact on overall classification.

  16. EXPERIMENTAL RESULTS
      Evaluation Based on Sentiment Classification
      Conclusion: double propagation normalizes the quality of the lexicon upwards

  17. INTERFACE
      Screens shown: Welcome Screen; Polarity Classification Options
      http://deixto.com/niosto/

  18. INTERFACE STEMMER OPINION OPTIONS ALGORITHM EVALUATION OPTIONS http://deixto.com/niosto/

  19. More…
      Mining Domain-Specific Dictionaries
      Pantelis Agathangelou, Ioannis Katakis, Fotios Kokkoras, Konstantinos Ntonas
      Thank you for your attention!
      DEiXTo - Web Extraction Tool: http://deixto.com/
      NiosTo - Dictionary Extraction Tool: http://deixto.com/niosto/
      pandelisagathangelou@gmail.com
