An Exploration of Embeddings for Generalized Phrases



  1. An Exploration of Embeddings for Generalized Phrases. Wenpeng Yin & Hinrich Schütze ... Prachi, 12485; Hrishikesh, 14111265. CSE IIT Kanpur

  2. Contents: 1. Motivation; 2. Embedding Learning for SkipBs; 3. Embedding Learning for Phrases; 4. Experiments; 5. Conclusion; 6. Appendix (Appendix I, Appendix II); Bibliography. CS671, NLP (CSE IIT Kanpur)

  3. Motivation: Generalized Phrases. Generalized phrases include (a) skip-bigrams (SkipBs) and (b) continuous and non-continuous linguistic phrases. For example, the skip-bigrams at a distance of 2 in the sentence “This tea helped me to relax” are “This*helped”, “tea*me”, “helped*to”, ... Among linguistic phrases, “cold cuts” and “White House” are continuous phrases, while “take over” and “turn off” are non-continuous.
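The skip-bigram definition above can be illustrated with a minimal sketch. The function name `skip_bigrams` and the whitespace tokenization are assumptions made for illustration, not part of the original slides.

```python
def skip_bigrams(tokens, k=2):
    """Return the SkipBs whose two enclosing words are k positions apart.

    Hypothetical helper: the "left*right" string format follows the
    examples on the slide; everything else is an illustrative choice.
    """
    return [f"{tokens[i]}*{tokens[i + k]}" for i in range(len(tokens) - k)]


sentence = "This tea helped me to relax".split()
print(skip_bigrams(sentence, k=2))
# ['This*helped', 'tea*me', 'helped*to', 'me*relax']
```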

  4. Motivation. A particular task involving a word can be solved based only on the context of that word. Generalized phrases can be used to infer the attributes of the context they enclose, for example: “He helped Xiulan to find a flat.” They can also capture non-compositional semantics, for example: “keep up”, “keep on”, “keep from”, etc. Embeddings of generalized phrases are better suited than word embeddings for a coreference resolution task and for paraphrase identification.

  5. Embedding Learning for SkipBs. word2vec is run on the English Gigaword corpus. The corpus is represented as a sequence of sentences, each consisting of two tokens: a SkipB and a word that occurs between the two enclosing words of the SkipB. The distance $k$ between the two enclosing words can be $k = 2$ or $2 \le k \le 3$. When $k = 2$, the trigram $w_{i-1}\ w_i\ w_{i+1}$ generates the single sentence “$w_{i-1}{*}w_{i+1}\ \ w_i$”. When $2 \le k \le 3$, the fourgram $w_{i-2}\ w_{i-1}\ w_i\ w_{i+1}$ generates four sentences: “$w_{i-2}{*}w_i\ \ w_{i-1}$”, “$w_{i-1}{*}w_{i+1}\ \ w_i$”, “$w_{i-2}{*}w_{i+1}\ \ w_{i-1}$” and “$w_{i-2}{*}w_{i+1}\ \ w_i$”.
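The corpus construction described above can be sketched as follows. This is a minimal illustration, assuming each training "sentence" is a pair made of a SkipB token and one enclosed word; the function name `skipb_sentences` and the `distances` parameter are hypothetical.

```python
def skipb_sentences(tokens, distances=(2,)):
    """Build two-token training sentences (SkipB, enclosed word).

    For every pair of enclosing words at distance k, one sentence is
    emitted per word lying strictly between them. Names and signature
    are illustrative assumptions, not the authors' code.
    """
    pairs = []
    for k in distances:
        for left in range(len(tokens) - k):
            right = left + k
            skipb = f"{tokens[left]}*{tokens[right]}"
            for mid in range(left + 1, right):
                pairs.append((skipb, tokens[mid]))
    return pairs


# k = 2: a trigram yields a single sentence.
print(skipb_sentences(["w1", "w2", "w3"], distances=(2,)))
# [('w1*w3', 'w2')]

# 2 <= k <= 3: a fourgram yields four sentences.
print(skipb_sentences(["w1", "w2", "w3", "w4"], distances=(2, 3)))
# [('w1*w3', 'w2'), ('w2*w4', 'w3'), ('w1*w4', 'w2'), ('w1*w4', 'w3')]
```

The resulting two-token sentences would then serve as the input corpus for an embedding learner such as word2vec.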

  6. Embedding Learning for Phrases: Phrase Collection. Two-word phrases defined in Wiktionary and two-word phrases defined in WordNet are extracted, forming a collection of 95218 continuous and non-continuous phrases.

  7. Embedding Learning for Phrases: Identification of Phrase Continuity. For each phrase “A B”, compute $[c_1, c_2, c_3, c_4, c_5]$, where $c_i$, $1 \le i \le 5$, is the number of occurrences of A and B in that order at a distance of $i$. If $c_1$ is 10 times higher than $(c_2 + c_3 + c_4 + c_5)/4$, classify “A B” as continuous, otherwise as discontinuous. For example, “pick off”: [1121, 632, 337, 348, 4052]; “Cornell University”: [14831, 16, 177, 331, 3471].
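The continuity rule above can be checked against the two example count vectors with a short sketch. The function name `is_continuous` is hypothetical, and reading "10 times higher" as "at least 10 times the mean of the other counts" is an assumption.

```python
def is_continuous(counts, factor=10):
    """Classify a phrase from its distance counts [c1, c2, c3, c4, c5].

    Assumption: "10 times higher" is read as c1 >= factor * mean(c2..c5).
    """
    c1, rest = counts[0], counts[1:]
    return c1 >= factor * sum(rest) / len(rest)


examples = {
    "pick off": [1121, 632, 337, 348, 4052],
    "Cornell University": [14831, 16, 177, 331, 3471],
}
for phrase, counts in examples.items():
    label = "continuous" if is_continuous(counts) else "discontinuous"
    print(f"{phrase}: {label}")
# pick off: discontinuous
# Cornell University: continuous
```

For “pick off” the mean of the last four counts is 1342.25, so $c_1 = 1121$ falls far short of the threshold; for “Cornell University” the mean is 998.75, and $c_1 = 14831$ exceeds ten times that value.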

  8. Embedding Learning for Phrases: Sentence Reformatting. A sentence “...A...B...” is reformatted into “...A B...A B...” (each of A and B is replaced by the single phrase token “A B”) if “A B” is a discontinuous phrase and A and B are separated by at most 4 words. A sentence “...AB...” is reformatted into “...A B...”, with “A B” again a single phrase token, if “A B” is a continuous phrase. word2vec is then run on the reformatted corpus.
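The reformatting step can be sketched as below. This is only an illustration under stated assumptions: the phrase token is written by joining the two words with an underscore (the joiner is not given in the transcript), the function name `reformat_sentence` is hypothetical, and for discontinuous phrases only occurrences with 1 to 4 intervening words are rewritten.

```python
def reformat_sentence(tokens, phrase, continuous, max_gap=4, joiner="_"):
    """Replace occurrences of a two-word phrase by a single phrase token.

    Assumptions (not from the slides): the joiner character, and that a
    discontinuous phrase is only rewritten when 1..max_gap words separate
    its two parts.
    """
    a, b = phrase
    merged = f"{a}{joiner}{b}"
    if continuous:
        out, i = [], 0
        while i < len(tokens):
            # Adjacent "A B" collapses into one token.
            if tokens[i] == a and i + 1 < len(tokens) and tokens[i + 1] == b:
                out.append(merged)
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        return out
    out, i = list(tokens), 0
    while i < len(out):
        if out[i] == a:
            last = min(i + max_gap + 1, len(out) - 1)
            for j in range(i + 2, last + 1):
                if out[j] == b:
                    # Both parts are replaced by the phrase token.
                    out[i] = out[j] = merged
                    i = j
                    break
        i += 1
    return out


print(reformat_sentence("he turned the machine on".split(),
                        ("turned", "on"), continuous=False))
# ['he', 'turned_on', 'the', 'machine', 'turned_on']
print(reformat_sentence("the White House said".split(),
                        ("White", "House"), continuous=True))
# ['the', 'White_House', 'said']
```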

  9. Embedding Learning for Phrases: Examples of Phrase Neighbors (figure not included)

  10. Experiments: Animacy classification for markables. Figure: Example of markables. A markable in coreference resolution refers to an entity in the real world or another linguistic expression. Classifying markables as animate/inanimate is useful for coreference resolution systems. Animate chains contain an animate pronoun markable and no inanimate pronoun markable; inanimate chains contain an inanimate pronoun markable and no animate pronoun markable.
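The chain definitions above amount to a simple labeling rule, sketched here. The pronoun lists and the function name `chain_label` are assumptions for illustration; the slides do not specify which pronouns were used.

```python
# Assumed pronoun inventories; the actual lists are not given in the slides.
ANIMATE_PRONOUNS = {"he", "she", "him", "her", "his", "hers", "himself", "herself"}
INANIMATE_PRONOUNS = {"it", "its", "itself"}


def chain_label(markables):
    """Label a coreference chain from the pronoun markables it contains.

    Returns "animate", "inanimate", or None when the chain has no pronoun
    markable or mixes both kinds (such chains get no label).
    """
    words = {m.lower() for m in markables}
    has_animate = bool(words & ANIMATE_PRONOUNS)
    has_inanimate = bool(words & INANIMATE_PRONOUNS)
    if has_animate and not has_inanimate:
        return "animate"
    if has_inanimate and not has_animate:
        return "inanimate"
    return None


print(chain_label(["Xiulan", "she", "her"]))    # animate
print(chain_label(["the company", "it"]))       # inanimate
print(chain_label(["the team", "it", "he"]))    # None
```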

  11. Experiments: Animacy classification for markables. Frequent errors: (a) unspecific SkipBs, for example “take*in” and “then*goes”; (b) untypical use of specific SkipBs, for example “...the southeastern area of Fujian whose economy is the most active”.

  12. Experiments: Examples of SkipB Neighbors (figure not included)

  13. Experiments: Paraphrase Identification Task. Standard approaches are unlikely to assign a high similarity score to the two sentences “he started the machine” and “he turned the machine on”. A sentence like “...A B...A B...” is considered as “A B”.

  14. Experiments: Comparison of Word and Phrase Embeddings (figure not included)

  15. Conclusion: Summary. Figure: Generalized Phrases for Linguistic Tasks.
