Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations

Fei Sun, Jiafeng Guo, Yanyan Lan, Jun Xu, and Xueqi Cheng
CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences
July 20, 2015

Word Representations

Word representations are used in:
• Sentiment Analysis [Maas et al., 2011]
• Machine Translation [Kalchbrenner and Blunsom, 2013]
• Language Modeling [Bengio et al., 2003]
• POS Tagging [Collobert et al., 2011]
• Word-Sense Disambiguation [Collobert et al., 2011]
• Parsing [Socher et al., 2011]

Word Representation Models

Timeline (axis: 1940 to 2020):
• BOW [Harris, 1954]
• LSI [Deerwester et al., 1990]
• HAL [Lund et al., 1995]
• LDA [Blei et al., 2003]
• NPLMs [Bengio et al., 2003]
• LBL [Mnih and Hinton, 2007]
• Word2Vec [Mikolov et al., 2013]
• GloVe [Pennington et al., 2014]

Relations?

Relations

One Hypothesis, Two Interpretations

The Distributional Hypothesis [Harris, 1954; Firth, 1957]

“You shall know a word by the company it keeps.” (J.R. Firth)

Syntagmatic and Paradigmatic Relations [Gabrilovich and Markovitch, 2007]

Albert Einstein was a physicist.
Richard Feynman was a physicist.

[Figure: within each sentence, the name (Einstein, Feynman) is linked to "physicist" by a syntagmatic arrow; across the two sentences, Einstein and Feynman are linked by a paradigmatic arrow.]

• Syntagmatic: words co-occur in the same text region.
• Paradigmatic: words occur in the same context, though not necessarily at the same time.

Syntagmatic

Albert Einstein was a physicist.   (d1)
Richard Feynman was a physicist.   (d2)

Word-document matrix:

             d1   d2
Einstein      1    0
Feynman       0    1
physicist     1    1

[Figure: Einstein, Feynman, and physicist plotted as vectors in the (d1, d2) plane.]

Models: LSI, LDA, PV-DBOW, ...

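The word-document matrix on this slide can be rebuilt in a few lines. The following is a minimal sketch, assuming each sentence is treated as one document and that a binary presence indicator is used; the names `docs`, `word_doc`, and `dot` are illustrative and not from the talk.

```python
# Toy corpus from the slide: each sentence is treated as one document (assumption).
docs = {
    "d1": "Albert Einstein was a physicist".split(),
    "d2": "Richard Feynman was a physicist".split(),
}
words = ["Einstein", "Feynman", "physicist"]

# Binary word-document matrix: one row per word, one column per document.
word_doc = {w: [1 if w in tokens else 0 for tokens in docs.values()] for w in words}


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


for w in words:
    print(w, word_doc[w])

# Words that share a document overlap; words that never co-occur do not.
print("Einstein . physicist =", dot(word_doc["Einstein"], word_doc["physicist"]))
print("Einstein . Feynman   =", dot(word_doc["Einstein"], word_doc["Feynman"]))
```

Under this representation Einstein and physicist overlap because they share d1, while Einstein and Feynman have no overlap at all, which is the syntagmatic view of similarity.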
Paradigmatic

Albert Einstein was a physicist.
Richard Feynman was a physicist.

Word-context (co-occurrence) matrix:

             Einstein   Feynman   physicist
Einstein         0         0          1
Feynman          0         0          1
physicist        1         1          0

[Figure: the three words plotted as vectors in the (Einstein, Feynman, physicist) context space.]

Models: NLMs, Word2Vec, GloVe, ...

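The co-occurrence matrix on this slide can be reproduced as follows. This is a minimal sketch, assuming the whole sentence serves as the context window and that only the three words shown on the slide are counted; `cooc` and `cosine` are illustrative names.

```python
import math
from itertools import permutations

sentences = [
    "Albert Einstein was a physicist".split(),
    "Richard Feynman was a physicist".split(),
]
words = ["Einstein", "Feynman", "physicist"]

# Word-context counts restricted to the three words on the slide.
# Assumption: the whole sentence is the context window.
cooc = {w: {c: 0 for c in words} for w in words}
for tokens in sentences:
    present = [w for w in words if w in tokens]
    for w, c in permutations(present, 2):
        cooc[w][c] += 1


def cosine(u, v):
    """Cosine similarity between two rows of the co-occurrence table."""
    num = sum(u[k] * v[k] for k in words)
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0


for w in words:
    print(w, [cooc[w][c] for c in words])

print("cos(Einstein, Feynman)   =", cosine(cooc["Einstein"], cooc["Feynman"]))
print("cos(Einstein, physicist) =", cosine(cooc["Einstein"], cooc["physicist"]))
```

Here Einstein and Feynman have identical rows because they share the context word physicist, even though they never appear together. That is the paradigmatic view of similarity.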
Motivation

Albert Einstein was a physicist.
Richard Feynman was a physicist.

• Paradigmatic relations (Einstein and Feynman): Word2Vec
• Syntagmatic relations (Einstein or Feynman with physicist): PV-DBOW

Model

Parallel Document Context Model (PDC)

[Figure: PDC architecture for the text "... the cat sat on the ...". The context words (the, cat, on, the) predict the target word "sat" (paradigmatic), and the document containing the text predicts the same target word (syntagmatic).]

Objective:

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \Big( \log p(w_i^n \mid h_i^n) + \log p(w_i^n \mid d_n) \Big)
\]

Negative Sampling:

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \Big(
    \log \sigma(\vec{w}_i^n \cdot \vec{h}_i^n) + \log \sigma(\vec{w}_i^n \cdot \vec{d}_n)
    + k \cdot \mathbb{E}_{w' \sim P_{nw}} \log \sigma(-\vec{w}' \cdot \vec{h}_i^n)
    + k \cdot \mathbb{E}_{w' \sim P_{nw}} \log \sigma(-\vec{w}' \cdot \vec{d}_n)
\Big)
\]

\[
\sigma(x) = \frac{1}{1 + \exp(-x)}
\]

Notes:
• PDC and PV-DM
• MF for W-D + W-C: not clear

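One summand of the negative-sampling objective can be evaluated as sketched below. This is an illustration under stated assumptions, not the authors' implementation: the expectation over the noise distribution P_nw is replaced by a sum over k sampled noise vectors, the context vector h is taken as given, and the function name `pdc_term` is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pdc_term(w, h, d, negatives):
    """Value of one summand of the negative-sampling objective.

    w: target word vector, h: context vector, d: document vector.
    negatives: vectors of k sampled noise words, used here as a Monte Carlo
    stand-in for k * E_{w' ~ P_nw} (assumption).
    """
    # Positive terms: the target should agree with its context and its document.
    value = math.log(sigmoid(dot(w, h))) + math.log(sigmoid(dot(w, d)))
    # Negative terms: noise words should disagree with both.
    for w_neg in negatives:
        value += math.log(sigmoid(-dot(w_neg, h)))
        value += math.log(sigmoid(-dot(w_neg, d)))
    return value


# Hypothetical 2-dimensional vectors, chosen only for illustration.
w = [0.5, -0.2]
h = [0.4, 0.1]
d = [0.3, -0.3]
negatives = [[-0.1, 0.2], [0.0, 0.6]]
print(pdc_term(w, h, d, negatives))
```

Every term is the log of a probability, so the summand is never positive; training would adjust the vectors to push it toward zero.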
Hierarchical Document Context Model (HDC)

[Figure: HDC architecture for the text "... the cat sat on the ...". The document predicts the target word "sat" (syntagmatic), and "sat" in turn predicts its context words (the, cat, on, the) (paradigmatic).]

Objective:

\[
\ell = \sum_{n=1}^{N} \sum_{w_i^n \in d_n} \Big( \log p(w_i^n \mid d_n) + \sum_{\substack{j = i-L \\ j \neq i}}^{i+L} \log p(c_j^n \mid w_i^n) \Big)
\]

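The two sums in the HDC objective correspond to two kinds of prediction pairs: (document, word) and (word, context word). The sketch below enumerates them for one document. It assumes the window is clipped at document boundaries, which the slide does not specify; `hdc_pairs` is a hypothetical helper name.

```python
def hdc_pairs(doc_id, tokens, L):
    """List the prediction pairs that appear in the HDC objective for one document.

    Returns (doc_word, word_ctx):
      doc_word: (document, word) pairs, one per log p(w_i | d_n) term.
      word_ctx: (word, context) pairs, one per log p(c_j | w_i) term.
    """
    doc_word, word_ctx = [], []
    for i, w in enumerate(tokens):
        doc_word.append((doc_id, w))
        for j in range(i - L, i + L + 1):
            # Skip the target itself; clip the window at the boundaries (assumption).
            if j != i and 0 <= j < len(tokens):
                word_ctx.append((w, tokens[j]))
    return doc_word, word_ctx


doc_word, word_ctx = hdc_pairs("d1", "the cat sat on the".split(), L=2)
print(doc_word)
print([c for w, c in word_ctx if w == "sat"])
```

For the target word "sat" with L = 2, the context pairs cover the, cat, on, the, matching the words drawn around "sat" in the slide's figure.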