Metaphor Detection through Term Relevance
  Marc Schulder (Saarland University), Eduard Hovy (Carnegie Mellon University)
Metaphor Detection
  “And yet we stand together as we did two centuries ago.”
Metaphor Detection: Challenges
  Required Knowledge
    • Conceptual Mappings: Target Domain ⨉ Mappings ⨉ Source Domains
    • Selectional Preference Violation: Preferences ⨉ Argument Domains
  Typical Restrictions
    • Low Coverage
    • POS Limitations
Metaphor Detection: Term Relevance
  • Simple
  • Robust
  • POS independent
  • Target Domain only
Term Relevance: Hypothesis
  If a word does not fit in the context, then it is probably not meant literally.
  See also Sporleder & Li (2009) and Li & Sporleder (2010).
Term Relevance: Overview
  • Relevance Metric
  • Domain Data (Web Corpus)
  • Evaluation (Metaphor Corpus)
  • Basic Classifier
  • Multi-Feature Classifier
Term Relevance: Metric
  Is the word common in all domains?
    yes → literal
    no  → Is the word typically literal for this domain?
            yes → literal
            no  → metaphor
Term Relevance: TF-IDF
  Standard TF-IDF
    • Term Frequency: How often does the term appear in a document?
    • Document Frequency: In how many documents does the term appear?
    • TF-IDF: Impact of the term on the document (term frequency ⨉ inverse document frequency)
  Applied to domains ("domain" takes the place of "document")
    • Term Frequency: How often does the term appear in a domain?
    • Document Frequency: In how many domains does the term appear?
    • Domain Relevance: Impact of the term on the domain (term frequency ⨉ inverse document frequency)
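
As a rough illustration of the slide above, the following sketch treats each domain as one large "document" and scores a term by term frequency times inverse document frequency. The slides do not give the exact formula, so the normalization (relative frequency, log-scaled IDF) and the names `domain_relevance` and `toy_domains` are assumptions made for this example.

```python
import math
from collections import Counter


def domain_relevance(term, domain, domains):
    """TF-IDF with domains in place of documents (assumed formulation).

    `domains` maps a domain name to its list of tokens.
    """
    tokens = domains[domain]
    if not tokens:
        return 0.0
    # Relative frequency of the term inside the chosen domain.
    tf = Counter(tokens)[term] / len(tokens)
    # Number of domains in which the term occurs at least once.
    df = sum(1 for toks in domains.values() if term in toks)
    if df == 0:
        return 0.0
    idf = math.log(len(domains) / df)
    return tf * idf


# Hypothetical toy data, not taken from the talk.
toy_domains = {
    "economy": "the budget and the tax plan".split(),
    "legislative": "the parliament will debate the law".split(),
    "pseudo": "the weather and the match".split(),
}

print(domain_relevance("budget", "economy", toy_domains))  # positive score
print(domain_relevance("the", "economy", toy_domains))     # 0.0, occurs in every domain
```

A word that occurs in every domain gets an IDF of zero under this formulation, which mirrors the first question of the metric: words common to all domains carry no domain signal.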
Term Relevance: Metric
  document frequency > δ?
    yes → literal
    no  → domain relevance > 𝜹?
            yes → literal
            no  → metaphor
  (δ is the document frequency threshold, 𝜹 the domain relevance threshold.)
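
The two-step decision on this slide can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, and the default thresholds are borrowed from the "Seed Set 1: Manual" slide later in the deck (δ = 0.1, 𝜹 = 0.02).

```python
def classify(document_frequency, domain_relevance,
             df_threshold=0.1, relevance_threshold=0.02):
    """Label a word as 'literal' or 'metaphor' from its two scores."""
    # Words that are common everywhere are treated as literal.
    if document_frequency > df_threshold:
        return "literal"
    # Words that are relevant to the target domain are treated as literal.
    if domain_relevance > relevance_threshold:
        return "literal"
    # Everything else does not fit the context and is flagged as metaphor.
    return "metaphor"


print(classify(0.50, 0.000))  # literal (common in all domains)
print(classify(0.01, 0.050))  # literal (relevant to the domain)
print(classify(0.01, 0.001))  # metaphor
```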
Domain Data: Web Corpus
  ClueWeb-09
    • 1 Billion Web Documents
    • 500 Million English Web Documents
  Segment en0000
    • 3 Million Documents
    • 1.8 Million Documents without Spam
Domain Data: Domain Clustering
  Components: ClueWeb-09, Domain Seeds, Lucene Database, Domain Data, Pseudo-Domain Data
  Domain Seeds
    • Legislative: pass, law, regulate, debate, parliament
    • Economy: budget, tax, spend, plan, finances
  Domain Data
    • Legislative: 10,000 docs
    • Economy: 10,000 docs
  Pseudo-Domain Data
    • Pseudo 1: 10,000 docs
    • Pseudo 2: 10,000 docs
    • Pseudo 3: 10,000 docs
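
The slide names the components (seed terms, a Lucene database, 10,000 documents per domain) but not the retrieval details. The sketch below is a pure-Python stand-in, not Lucene scoring: it assumes documents are ranked by how many seed-term tokens they contain and that the top `limit` documents form the domain. The function name, the toy documents, and the ranking rule are all assumptions.

```python
def select_domain_docs(documents, seeds, limit=10_000):
    """Return ids of the `limit` documents with the most seed-term hits.

    `documents` maps a document id to its text. Documents without any
    seed term are never selected.
    """
    seed_set = {seed.lower() for seed in seeds}
    scored = []
    for doc_id, text in documents.items():
        hits = sum(1 for token in text.lower().split() if token in seed_set)
        if hits > 0:
            scored.append((hits, doc_id))
    # Most hits first; ties broken by document id for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]


# Hypothetical toy collection.
docs = {
    "d1": "Parliament will debate the law before they pass it",
    "d2": "The tax plan shapes the budget",
    "d3": "Rain is expected on Sunday",
}

legislative = select_domain_docs(
    docs, ["pass", "law", "regulate", "debate", "parliament"], limit=2)
economy = select_domain_docs(
    docs, ["budget", "tax", "spend", "plan", "finances"], limit=2)
print(legislative, economy)
```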
Evaluation: Experimental Setup
  “And yet we stand together as we did two centuries ago.”
  Gold labels (one per word, M marks a metaphor): X X X M M X X X X X X
  Baseline (every word labeled as metaphor):      M M M M M M M M M M M
  Metrics
    • F-Measure
    • Precision
    • Recall
    • Accuracy
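
To make the word-level setup concrete, the sketch below computes the four metrics for the example sentence, comparing the gold labels with the all-metaphor baseline. It assumes the F-measure is the balanced F1 and that "M" is the positive class; the function name `evaluate` is hypothetical.

```python
def evaluate(gold, predicted, positive="M"):
    """Word-level precision, recall, F1, and accuracy for one label sequence."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    correct = sum(g == p for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


gold = "X X X M M X X X X X X".split()
baseline = ["M"] * len(gold)  # label every word as metaphor
scores = evaluate(gold, baseline)
print(scores)
```

On this single sentence the baseline finds both metaphors (recall 1.0) but only 2 of its 11 predictions are right (precision 2/11), which is why recall alone is not a useful measure here.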
Evaluation: Gold Corpus
  MICS Governance Corpus: 2510 Sentences
  Metaphors per sentence
    • 0 metaphors: 17 %
    • 1 metaphor: 23 %
    • 2+ metaphors: 60 %
  Examples
    • “And yet we stand together as we did two centuries ago.”
    • “Many Jewish voters will find themselves at a crossroads.”
Basic Classifier: Seeds & Thresholds
  Seeds (Economy): budget, tax, spend, plan, finances
  Domain data: Economy, 10,000 docs
  Thresholds: domain relevance > 𝜹, document frequency > δ
Basic Classifier: Seeds & Thresholds
  document frequency > δ?
    yes → literal
    no  → domain relevance > 𝜹?
            yes → literal
            no  → metaphor
  Seed Set 1: Manual
    • 8 Subdomains
    • 4-14 manual seeds
    • 8 ⨉ 10,000 Docs each
    • 𝜹 = 0.02; δ = 0.1
Basic Classifier: Seeds & Thresholds
  document frequency > δ?
    yes → literal
    no  → legislative relevance > 𝜹?
            yes → literal
            no  → economy relevance > 𝜹?
                    yes → literal
                    no  → ... → metaphor
  Seed Set 1: Manual
    • 8 Subdomains
    • 4-14 manual seeds
    • 8 ⨉ 10,000 Docs each
    • 𝜹 = 0.02; δ = 0.1
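
With several subdomains, the chain of checks on this slide amounts to "literal if the word is relevant to any subdomain". A minimal sketch under that reading follows; the function name, the dictionary layout, and the example scores are hypothetical.

```python
def classify_multi(document_frequency, relevance_by_subdomain,
                   df_threshold=0.1, relevance_threshold=0.02):
    """Chain the relevance check over all subdomains."""
    if document_frequency > df_threshold:
        return "literal"
    # One sufficiently relevant subdomain is enough to call the word literal.
    if any(score > relevance_threshold
           for score in relevance_by_subdomain.values()):
        return "literal"
    return "metaphor"


print(classify_multi(0.01, {"legislative": 0.000, "economy": 0.050}))  # literal
print(classify_multi(0.01, {"legislative": 0.000, "economy": 0.001}))  # metaphor
```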
Basic Classifier: Seeds & Thresholds
                Seed Set 1: Manual       Seed Set 2: Gold
  Domains       8 Subdomains             1 Domain
  Seeds         4-14 manual seeds        50 best gold metaphors
  Documents     8 ⨉ 10,000 Docs each     80,000 Docs
  Thresholds    𝜹 = 0.02; δ = 0.1        𝜹 = 0.01; δ = 0.1
Basic Classifier: Evaluation
                  F1      Precision   Recall
  All Metaphor    .249    .142        1.000
  Manual Seeds    .350    .276        .478
  Gold Seeds      .346    .245        .591
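
As a consistency check on the chart values, and assuming the reported F1 is the balanced harmonic mean of precision and recall, the All Metaphor baseline works out as follows:

```latex
F_1 = \frac{2 \cdot P \cdot R}{P + R}
    = \frac{2 \cdot 0.142 \cdot 1.000}{0.142 + 1.000}
    = \frac{0.284}{1.142}
    \approx 0.249
```

The same formula gives about .350 for Manual Seeds (P = .276, R = .478) and about .346 for Gold Seeds (P = .245, R = .591).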
Multi-Feature Classifier: Conditional Random Fields
  Setup
    • Bigram model
    • 10-fold cross validation
  Features
    • Relevance Weights (𝜹 = 0.02; δ = 0.79)
    • Part of Speech
    • Lexicographer Sense
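
The slide lists the feature types and the cross-validation scheme but not how they were encoded. The sketch below shows one plausible shape for the input: a feature dictionary per token and a plain 10-fold split over sentence indices. The feature names, the example values, and the contiguous unshuffled folds are assumptions, and no particular CRF toolkit is implied.

```python
def token_features(word, relevance, pos, lex_sense):
    """Bundle the three feature types from the slide for one token."""
    return {
        "word": word.lower(),
        "relevance": relevance,   # relevance weight (assumed to be numeric)
        "pos": pos,               # part-of-speech tag
        "lex_sense": lex_sense,   # lexicographer sense label
    }


def k_fold_splits(n_items, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        end = start + size
        test = list(range(start, end))
        train = [i for i in range(n_items) if i < start or i >= end]
        yield train, test
        start = end


# Illustrative values only; they are not taken from the talk.
example = token_features("stand", 0.004, "VB", "verb.stative")

splits = list(k_fold_splits(2510, k=10))  # 2510 sentences in the gold corpus
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```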
Multi-Feature Classifier: Evaluation
  Chart legend: All Metaphor, Manual Seeds, CRF: Basic, CRF: Relev, CRF: PosLex, CRF: PosLex + Relev
  Labeled values (CRF models)
                         F1      Precision   Recall
  CRF: Basic             .187    .706        .108
  CRF: Relev             .219    .683        .130
  CRF: PosLex            .340    .654        .230
  CRF: PosLex + Relev    .373    .640        .263
Multi-Feature Classifier: Training Size
  [Plot: F1 (0 to 0.4) against Number of Training Sentences (0 to 1800). Models: CRF Basic + Relevance, CRF PosLex + Relevance, Threshold Baseline.]