Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations

Fei Sun, Jiafeng Guo, Yanyan Lan, Jun Xu, and Xueqi Cheng
ofey.sunfei@gmail.com
CAS Key Lab of Network Data Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
October 22, 2015

Word Representations

Tasks that use word representations:
• Sentiment Analysis [Maas et al., 2011]
• Machine Translation [Kalchbrenner and Blunsom, 2013]
• Language Modeling [Bengio et al., 2003]
• POS Tagging [Collobert et al., 2011]
• Word-Sense Disambiguation [Collobert et al., 2011]
• Parsing [Socher et al., 2011]

Word Representations Models

Timeline (axis: 1940 to 2020):
• BOW [Harris, 1954]
• LSI [Deerwester et al., 1990]
• HAL [Lund et al., 1995]
• LDA [Blei et al., 2003]
• NPLMs [Bengio et al., 2003]
• LBL [Mnih and Hinton, 2007]
• Word2Vec [Mikolov et al., 2013]
• GloVe [Pennington et al., 2014]

Relations?

Relations

One Hypothesis, Two Interpretations

The Distributional Hypothesis [Harris, 1954, Firth, 1957]

“You shall know a word by the company it keeps.” (J.R. Firth)

Syntagmatic and Paradigmatic Relations [Sahlgren, 2008]

Albert Einstein was a physicist.
Richard Feynman was a physicist.

(Figure: within each sentence, the name and "physicist" are linked as syntagmatic; across the two sentences, "Einstein" and "Feynman" are linked as paradigmatic.)

• Syntagmatic: words co-occur in the same text region
• Paradigmatic: words occur in the same context, though not necessarily at the same time

Syntagmatic

Albert Einstein was a physicist. (d1)
Richard Feynman was a physicist. (d2)

Word-document matrix:

            d1  d2
Einstein     1   0
Feynman      0   1
physicist    1   1

(Figure: Einstein, Feynman, and physicist plotted as points in the d1-d2 plane.)

Models: LSI, LDA, PV-DBOW, ...
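
The word-document matrix above can be explored with a small sketch. The choice of cosine similarity is an assumption made here for illustration; the slide itself only shows the counts and the plot. Under that assumption, "Einstein" and "physicist" share a document and come out similar, while "Einstein" and "Feynman" never share a document and come out orthogonal.

```python
import math

# Toy word-document counts copied from the slide: columns are d1, d2.
word_doc = {
    "Einstein":  [1, 0],
    "Feynman":   [0, 1],
    "physicist": [1, 1],
}

def cosine(u, v):
    """Cosine similarity between two equal-length count vectors.

    Cosine is an illustrative choice, not something the slide specifies.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Syntagmatic view: words that appear in the same document look alike.
sim_einstein_physicist = cosine(word_doc["Einstein"], word_doc["physicist"])
sim_einstein_feynman = cosine(word_doc["Einstein"], word_doc["Feynman"])

print(sim_einstein_physicist, sim_einstein_feynman)
```

In this representation the syntagmatic pair is close, but the two physicists are not related at all.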

Paradigmatic

Albert Einstein was a physicist.
Richard Feynman was a physicist.

Word-word co-occurrence matrix:

            Einstein  Feynman  physicist
Einstein        0        0         1
Feynman         0        0         1
physicist       1        1         0

(Figure: Einstein, Feynman, and physicist plotted in the space spanned by these co-occurrence counts.)

Models: NLMs, Word2Vec, GloVe, ...
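
As a rough illustration of the paradigmatic view, the sketch below rebuilds the word-word co-occurrence matrix from the two example sentences. Two things are assumptions made here and not taken from the slide: the whole sentence is used as the co-occurrence window, and cosine similarity is used to compare rows.

```python
import math
from itertools import permutations

sentences = [
    "Albert Einstein was a physicist".split(),
    "Richard Feynman was a physicist".split(),
]
# Restrict to the three words shown on the slide.
vocab = ["Einstein", "Feynman", "physicist"]

# cooc[w][j] counts the sentences in which w and vocab[j] both appear.
# Assumption: the co-occurrence window is the whole sentence.
cooc = {w: [0] * len(vocab) for w in vocab}
for tokens in sentences:
    present = [w for w in vocab if w in tokens]
    for a, b in permutations(present, 2):
        cooc[a][vocab.index(b)] += 1

def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Paradigmatic view: words that share contexts look alike.
sim_einstein_feynman = cosine(cooc["Einstein"], cooc["Feynman"])
sim_einstein_physicist = cosine(cooc["Einstein"], cooc["physicist"])

print(cooc)
print(sim_einstein_feynman, sim_einstein_physicist)
```

Here the picture is reversed relative to the word-document case: the two names have identical rows, while a name and "physicist" have no shared context word within this three-word vocabulary.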

Motivation

Albert Einstein was a physicist.
Richard Feynman was a physicist.

(Figure: the same two sentences with their syntagmatic and paradigmatic links; Word2Vec is marked at the paradigmatic relation and PV-DBOW at the syntagmatic relation.)

Model

Parallel Document Context Model (PDC)

(Figure: the context words "the", "cat", "on", "the" predict the target word "sat" (paradigmatic); the document "... the cat sat on the ..." also predicts "sat" (syntagmatic).)

Objective:

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \Big( \log p(w_i^n \mid h_i^n) + \log p(w_i^n \mid d_n) \Big)
\]

Negative Sampling:

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \Big(
  \log \sigma(\vec{w}_i^n \cdot \vec{h}_i^n) + \log \sigma(\vec{w}_i^n \cdot \vec{d}_n)
  + k \cdot \mathbb{E}_{w' \sim P_{nw}} \log \sigma(-\vec{w}' \cdot \vec{h}_i^n)
  + k \cdot \mathbb{E}_{w' \sim P_{nw}} \log \sigma(-\vec{w}' \cdot \vec{d}_n)
\Big)
\]

\[
\sigma(x) = \frac{1}{1 + \exp(-x)}
\]

PDC     MF for W-D + W-C (matrix factorization of the word-document and word-context matrices)
PV-DM   not clear
[Levy and Goldberg, 2014]
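
To make the negative-sampling objective above concrete, here is a minimal sketch that evaluates one summand for a single target word. It is not the authors' implementation. The function name `pdc_term`, the plain-list vectors, and the toy inputs are hypothetical; each expectation term k · E[...] is approximated by a sum over k sampled negative vectors, and the same samples are reused for the context term and the document term as a simplification.

```python
import math

def sigmoid(x):
    """Logistic function, sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def pdc_term(w, h, d, negatives):
    """One summand of the negative-sampling objective (hypothetical helper).

    w: target word vector, h: context vector, d: document vector.
    negatives: k sampled word vectors; summing over them stands in for
    k * E_{w' ~ P_nw}[...] (Monte Carlo approximation, an assumption here).
    """
    positive = math.log(sigmoid(dot(w, h))) + math.log(sigmoid(dot(w, d)))
    negative = sum(
        math.log(sigmoid(-dot(w_neg, h))) + math.log(sigmoid(-dot(w_neg, d)))
        for w_neg in negatives
    )
    return positive + negative

# Toy inputs (made up for illustration), with k = 2 negatives.
zero = [0.0, 0.0]
baseline = pdc_term(zero, zero, zero, [zero, zero])
aligned = pdc_term(
    [1.0, 0.0], [1.0, 0.0], [1.0, 0.0],
    [[-1.0, 0.0], [-1.0, 0.0]],
)

print(baseline, aligned)
```

With all-zero vectors every sigmoid equals 0.5, so the term is 6 · log 0.5. Moving the target toward its context and document vectors, and the negatives away from them, raises the value, which is the direction training pushes in.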

Hierarchical Document Context Model (HDC)

(Figure: the document "... the cat sat on the ..." predicts the word "sat" (syntagmatic).)

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \log p(w_i^n \mid d_n)
\]