SemEval 2014 Task-3 Cross-Level Semantic Similarity MultiJEDI ERC 259234
Semantic Similarity
Semantic Similarity: mostly focused on similar types of lexical items
Semantic Similarity: what if we have different types of inputs?
CLSS: Cross-Level Semantic Similarity, a new type of similarity task
CLSS: Comparison Types: Paragraph to Sentence, Sentence to Phrase, Phrase to Word, Word to Sense
Task Data: 4000 pairs in total, split into training and test sets
Task Data: a wide range of domains and text styles
Word-to-Sense pairs
Rating Scale
Crafting an idealized similarity distribution (figure: ratings 0-4 assigned between the smaller side and the larger side of each pair)
Test and Training data IAA: Krippendorff's α per comparison type (Paragraph-Sentence, Sentence-Phrase, Phrase-Word, Word-Sense), for Training (all), Training (unadjudicated), Test (all), and Test (unadjudicated)
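The agreement statistic named on the slide, Krippendorff's α, can be sketched for the simple balanced case. This is a minimal illustration assuming interval-scale ratings and no missing values (the function name and input shape are my own; the full coefficient also handles missing data and other distance metrics):

```python
from itertools import combinations

def krippendorff_alpha_interval(ratings):
    """Approximate Krippendorff's alpha for interval data.

    ratings: list of per-item lists of rater scores (>= 2 raters per item,
    no missing values). alpha = 1 - D_o / D_e, where D_o is the mean squared
    difference between ratings of the same item and D_e is the mean squared
    difference between all ratings regardless of item.
    """
    # Observed disagreement: squared differences within each item
    within = [(a - b) ** 2
              for item in ratings
              for a, b in combinations(item, 2)]
    # Expected disagreement: squared differences over all rating values
    values = [v for item in ratings for v in item]
    across = [(a - b) ** 2 for a, b in combinations(values, 2)]
    d_o = sum(within) / len(within)
    d_e = sum(across) / len(across)
    return 1.0 - d_o / d_e
```

Perfect agreement yields α = 1, while systematic disagreement drives α toward (and below) 0, which is why the slide reports α separately for adjudicated and unadjudicated pairs.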
The annotation procedure produces a balanced rating distribution
Experimental Setup
Baselines: "The quick brown fox" • "The brown fox was quick"; "The quick brown fox" • "The brown foxes were quick"
Evaluation Measure:
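The string-overlap baselines on this slide can be sketched in a few lines. Below is a minimal normalized longest-common-substring (LCS) similarity, assuming the score is the LCS length divided by the shorter text's length, scaled to the task's 0-4 range (the exact normalization used by the task baseline may differ):

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # prev[j]: common suffix length at a[i-1], b[j-1]
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

def lcs_similarity(a: str, b: str, scale: float = 4.0) -> float:
    """Map LCS length onto the task's 0-4 similarity scale."""
    if not a or not b:
        return 0.0
    return scale * longest_common_substring(a, b) / min(len(a), len(b))
```

On the slide's example pair, the shared span " brown fox" gives a mid-range score, matching the intuition that surface overlap captures some, but not all, of the semantic similarity.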
Number of participants Paragraph-Sentence Sentence-Phrase Phrase-Word Word-Sense
Top 5 Systems and Baselines (chart, scores 0-4, per comparison type: paragraph-sentence, sentence-phrase, phrase-word, word-sense): Gold, LCS Baseline, GST Baseline, SemantiKLUE run1, UNAL-NLP run2, ECNU run1, SimCompass run1, Meerkat Mafia pw*
Where do the baselines stand? (chart, per comparison type: paragraph-sentence, sentence-phrase, phrase-word, word-sense): LCS Baseline, GST Baseline, SemantiKLUE run1, UNAL-NLP run2, ECNU run1, SimCompass run1, Meerkat Mafia pw*
Correlation per genre: paragraph-to-sentence
Correlation per genre: phrase-to-word
What makes the task difficult?
Handling OOV words and novel usages
Dealing with social media text
CLSS: Cross-Level Semantic Similarity
Similarity across different types of lexical items
High-quality dataset: 4000 pairs over four comparison types
38 systems from 19 teams
Thank you! MultiJEDI ERC 259234 David Jurgens Mohammad Taher Pilehvar Roberto Navigli