Remember Bag of Words?

P(Hello) = 2/5, P(I) = 1/5 = P(Will) = P(Say)

Words are independent? This is called a unigram or 1-gram model:

$P(w_1, w_2, \ldots, w_n) = \prod_i P(w_i)$
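A minimal Python sketch of the unigram idea, using a toy five-word corpus chosen to match the counts above (the corpus and names are illustrative, not from the lecture): count each word, turn counts into probabilities, and score a sentence as the product of its word probabilities.

```python
from collections import Counter

corpus = "hello hello I will say".split()   # toy corpus: "hello" is 2 of 5 tokens
counts = Counter(corpus)
total = sum(counts.values())

def p_word(w):
    """Unigram probability P(w) estimated from corpus counts."""
    return counts[w] / total

def p_sentence(words):
    """Bag-of-words score: product of independent unigram probabilities."""
    p = 1.0
    for w in words:
        p *= p_word(w)
    return p

print(p_word("hello"))                          # 0.4  (= 2/5)
print(p_sentence("I will say hello".split()))   # 0.0032
```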
Can we get more from Bayes?

Distinguish between: "I will say hello" vs. "I hello say will"

$P(\text{"hello"} \mid \text{"I will say"}) > P(\text{"will"} \mid \text{"I hello say"})$

Words depend on the previous words: this is called an N-gram model.

$P(w_1, w_2, \ldots, w_n) = P(w_{1:n}) = \prod_i P(w_i \mid w_{1:(i-1)})$
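For example, applying this chain rule to the four-word sentence above:

$P(\text{I will say hello}) = P(\text{I}) \cdot P(\text{will} \mid \text{I}) \cdot P(\text{say} \mid \text{I will}) \cdot P(\text{hello} \mid \text{I will say})$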
Must Remember All Words That Came Before?

$P(\text{"1752"} \mid \text{"Thomas Bayes} \ldots \text{"}) = ?$

Markov assumption: only remember the last N words: N-gram.

$P(w_{1:k}) = \prod_{i=1}^{k} P(w_i \mid w_{(i-N):(i-1)})$
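A minimal bigram (N = 2) sketch of the Markov assumption in Python, with a toy two-sentence corpus (the corpus and the <s>/</s> boundary markers are illustrative assumptions): conditional probabilities come from counts of adjacent word pairs, and a sentence is scored by multiplying P(w_i | w_{i-1}).

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries.
corpus = "<s> I will say hello </s> <s> I will say goodbye </s>".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))   # the pair across the sentence break is counted too; harmless here

def p_bigram(w, prev):
    """P(w | prev) estimated as count(prev, w) / count(prev)."""
    return bigrams[(prev, w)] / unigrams[prev]

def p_sentence(words):
    """Markov (N = 2) approximation: each word depends only on the previous one."""
    p = 1.0
    for prev, w in zip(words, words[1:]):
        p *= p_bigram(w, prev)
    return p

print(p_sentence("<s> I will say hello </s>".split()))   # 0.5: the two training sentences differ only in the last word
```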
Let's Read Shakespeare... In Unigram (unigram = 1-gram)

Shakespeare In Bigram (N = 2: bigram)

Shakespeare In Trigram (N = 3: trigram)

Shakespeare In 4-gram
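Samples like those on the Shakespeare slides can be reproduced in spirit with a short generator: sample each next word from the counts conditioned on the previous N - 1 words. This is a sketch under stated assumptions; the toy corpus below stands in for the Shakespeare text the lecture's examples were trained on.

```python
import random
from collections import defaultdict, Counter

def train_ngrams(tokens, n):
    """Map each (n-1)-word context to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context, word = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
        model[context][word] += 1
    return model

def generate(model, n, length=15):
    """Start from a random context and repeatedly sample the next word
    from the counts conditioned on the last n-1 words."""
    out = list(random.choice(list(model)))
    for _ in range(length):
        counter = model.get(tuple(out[-(n - 1):]))
        if not counter:
            break
        words, weights = zip(*counter.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

# Toy stand-in corpus; the lecture's samples came from Shakespeare's full text.
tokens = ("to be or not to be that is the question "
          "whether tis nobler in the mind to suffer").split()
print(generate(train_ngrams(tokens, 2), 2))   # bigram sample
print(generate(train_ngrams(tokens, 3), 3))   # trigram sample
```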
Shakespeare N-gram Quiz

Find:
1 real quote
3x unigram picks
3x bigram picks
3x trigram picks
Bigram Probability Question

P("woe is me" | ^) = ?

Given that (^ is the symbol marking the start of a sentence):
P(woe_i | ^_{i-1}) = 0.0002
P(is_i | woe_{i-1}) = 0.07
P(me_i | is_{i-1}) = 0.0005

Answer: P("woe is me" | ^) = 0.0002 × 0.07 × 0.0005 = 7 × 10^{-9}
Other Tricks

Stationarity assumption: Context doesn't change over time.
Smoothing: Remember Laplace smoothing? (See the sketch below.)
Hidden variables: E.g., identify what a "noun" is.
Use abstractions: Group "New York City", or just look at letters.
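A minimal sketch of Laplace (add-one) smoothing on bigram counts, with a toy corpus; the names and numbers are illustrative. Unseen bigrams get a small nonzero probability instead of zero.

```python
from collections import Counter

corpus = "I will say hello I will say goodbye".split()   # toy corpus
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)                                         # vocabulary size

def p_laplace(w, prev):
    """Add-one (Laplace) smoothing: (count(prev, w) + 1) / (count(prev) + V)."""
    return (bigrams[(prev, w)] + 1) / (unigrams[prev] + V)

print(p_laplace("hello", "say"))    # seen bigram:   (1 + 1) / (2 + 5)
print(p_laplace("hello", "will"))   # unseen bigram: (0 + 1) / (2 + 5), not zero
```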
Smaller Than Words?

What if we cannot distinguish words?

English: "choosespain.com" → "Choose Spain" or "Chooses Pain"?

Segmentation: Dividing into words.

Use Bayes again:
$s^* = \arg\max P(w_{1:n}) = \arg\max \prod_i P(w_i \mid w_{1:(i-1)})$

Or use the Markov assumption (e.g., unigram):
$s^* = \arg\max \prod_i P(w_i)$
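A tiny illustration of the unigram scoring rule in Python; the word probabilities below are made up for the example, not taken from any real model. The candidate split whose words are jointly more probable wins.

```python
# Hypothetical unigram probabilities, purely for illustration.
P = {"choose": 1e-4, "spain": 2e-5, "chooses": 1e-6, "pain": 5e-5}

def score(words):
    """Unigram segmentation score: product of independent word probabilities."""
    p = 1.0
    for w in words:
        p *= P.get(w, 0.0)
    return p

print(score(["choose", "spain"]))    # 2e-09
print(score(["chooses", "pain"]))    # 5e-11  -> "Choose Spain" wins
```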
Segmentation Complexity

$s^* = \arg\max \prod_i P(w_i)$

What's the complexity of segmenting "nowisthetime"?
1. n - 1
2. (n - 1)^2
3. (n - 1)!
4. 2^{n-1}
5. (n - 1)^n

Solution: Place n - 1 potential division points between the characters; each segmentation corresponds to a choice of which divisions exist, giving 2^{n-1} possibilities (option 4).
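A quick check of that count in Python: enumerate every segmentation by toggling each of the n - 1 division points.

```python
from itertools import product

def segmentations(text):
    """Enumerate all 2^(n-1) splits by deciding, for each gap, whether to cut there."""
    n = len(text)
    for cuts in product([False, True], repeat=n - 1):
        words, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                words.append(text[start:i])
                start = i
        words.append(text[start:])
        yield words

print(sum(1 for _ in segmentations("now")))        # 4 = 2^(3-1)
print(sum(1 for _ in segmentations("nowisthetime")))   # 2048 = 2^(12-1)
```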
Reducing Segmentation Complexity

Exploit independence: "nowisthetime"?

Divide into a first word, f, and recurse on the rest, r:
$s^* = \max_{s = f + r} P(f) \cdot s^*(r)$

Gives 99% accuracy and an easy implementation!
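A minimal sketch of that recursion in Python, assuming a hypothetical unigram probability table (the entries are invented for the example) and memoizing on the remaining string so each suffix is segmented only once.

```python
from functools import lru_cache

# Hypothetical unigram probabilities; a real model would be estimated from a large corpus.
P = {"now": 1e-3, "is": 1e-2, "the": 2e-2, "time": 1e-3, "no": 1e-3, "wis": 1e-7}

def p_word(w):
    """Unigram probability with a tiny floor for unknown words (crude smoothing)."""
    return P.get(w, 1e-12)

def prob(words):
    p = 1.0
    for w in words:
        p *= p_word(w)
    return p

@lru_cache(maxsize=None)
def segment(text):
    """Best segmentation: try every possible first word f, recurse on the rest r,
    and keep the split with the highest unigram probability."""
    if not text:
        return ()
    candidates = ((text[:i],) + segment(text[i:]) for i in range(1, len(text) + 1))
    return max(candidates, key=prob)

print(segment("nowisthetime"))   # ('now', 'is', 'the', 'time') with these toy numbers
```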
Segmentation Problems

How can we improve?
1. More Data
2. Markov
3. Smoothing

Need to get the context.
Segmentation Problems (2)

How can we improve?
1. More Data
2. Markov
3. Smoothing

Need to know more words.
What Else Can We Do with Letters?

Language identification?
Bigram Recognition with Letters
Trigram Recognition with Letters

99% accuracy from trigrams!
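A minimal sketch of letter-based language identification in Python; the training snippets are tiny and purely illustrative (a real identifier would train on large per-language corpora). Each language gets a letter-trigram model, and the test string goes to the language with the highest smoothed log-probability.

```python
import math
from collections import Counter

def trigram_counts(text):
    """Counts of overlapping 3-letter sequences (spaces included)."""
    text = " " + text.lower() + " "
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

# Tiny illustrative training snippets, not real training data.
models = {
    "english": trigram_counts("hello there how are you doing today my friend"),
    "spanish": trigram_counts("hola como estas hoy mi amigo muy bien gracias"),
}

def log_prob(text, counts):
    """Log-probability of text under a letter-trigram model, with add-one smoothing."""
    text = text.lower()
    total = sum(counts.values())
    V = len(counts) + 1
    return sum(math.log((counts[text[i:i + 3]] + 1) / (total + V))
               for i in range(len(text) - 2))

def identify(text):
    return max(models, key=lambda lang: log_prob(text, models[lang]))

print(identify("how is your day"))     # expected: english
print(identify("como esta tu dia"))    # expected: spanish
```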
Can We Identify Categories Too?

Text classification
Text Classification

What algorithms can we use?
Naive Bayes: Spam vs. Ham (a sketch follows below)
k-Nearest Neighbor: Similar words
Support Vector Machines: Supervised learning
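A minimal bag-of-words Naive Bayes sketch for spam vs. ham; the training messages are made up for illustration. Each class is scored by log P(class) plus the sum of log P(word | class), with add-one smoothing over the vocabulary.

```python
import math
from collections import Counter

# Toy training data, purely for illustration.
train = [
    ("spam", "win money now claim prize"),
    ("spam", "free money offer click now"),
    ("ham",  "meeting schedule for monday"),
    ("ham",  "lunch with the project team"),
]

class_counts = Counter(label for label, _ in train)
word_counts = {c: Counter() for c in class_counts}
for label, text in train:
    word_counts[label].update(text.split())
vocab = {w for c in word_counts for w in word_counts[c]}

def log_posterior(text, c):
    """log P(c) + sum_i log P(w_i | c), with add-one smoothing."""
    total = sum(word_counts[c].values())
    lp = math.log(class_counts[c] / len(train))
    for w in text.split():
        lp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
    return lp

def classify(text):
    return max(class_counts, key=lambda c: log_posterior(text, c))

print(classify("claim your free prize now"))   # expected: spam
print(classify("schedule lunch on monday"))    # expected: ham
```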