
CS325 Artificial Intelligence: Natural Language Processing I (Ch. 22)

Dr. Cengiz Günay, Emory University, Spring 2013

AI in Natural Language Processing (NLP): What's NLP?


Remember Bag of Words?
P(Hello) = 2/5, P(I) = 1/5 = P(Will) = P(Say)
Words are independent? Called a unigram or 1-gram model:
$P(w_1, w_2, \ldots, w_n) = \prod_i P(w_i)$
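As a rough illustration of the unigram model above, here is a minimal Python sketch under assumed toy data; the five-word corpus and the identifiers (unigram_prob, sentence_prob) are inventions for this example, not material from the slides:

```python
from collections import Counter

# Toy corpus assumed for illustration; it reproduces P(hello) = 2/5.
corpus = "hello i will say hello".split()
counts = Counter(corpus)
total = sum(counts.values())

def unigram_prob(word):
    """P(w) estimated as count(w) / total number of tokens."""
    return counts[word] / total

def sentence_prob(words):
    """Unigram model: words are independent, so P(w_1..w_n) = prod_i P(w_i)."""
    p = 1.0
    for w in words:
        p *= unigram_prob(w)
    return p

print(unigram_prob("hello"))                      # 0.4 (= 2/5)
print(sentence_prob("i will say hello".split()))  # product of the word probabilities
```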

Can We Get More from Bayes?
Distinguish between:
“I will say hello”
“I hello say will”
$P(\text{“hello”} \mid \text{“I will say”}) > P(\text{“will”} \mid \text{“I hello say”})$
Words depend on the previous words: this is called an N-gram:
$P(w_1, w_2, \ldots, w_n) = P(w_{1:n}) = \prod_i P(w_i \mid w_{1:(i-1)})$

Must We Remember All Words That Came Before?
$P(\text{“1752”} \mid \text{“Thomas Bayes} \ldots\text{”}) = ?$
Markov assumption: only remember the last N words: N-gram.
$P(w_{1:k}) = \prod_i P(w_i \mid w_{(i-N):(i-1)})$
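A minimal sketch of an N-gram model under the Markov assumption, here with N = 2 (a bigram model). The tiny training text and the function names (bigram_prob, sentence_prob) are assumptions made only for illustration:

```python
from collections import Counter

# Tiny training text assumed for illustration.
tokens = "i will say hello i will say goodbye".split()

# Count bigrams and the contexts they condition on.
bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])

def bigram_prob(word, prev):
    """P(word | prev) estimated from counts (no smoothing)."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

def sentence_prob(words):
    """Markov assumption with N=2: P(w_1..w_n) ~= prod_i P(w_i | w_{i-1})."""
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= bigram_prob(word, prev)
    return p

print(sentence_prob("i will say hello".split()))   # higher: all bigrams were seen
print(sentence_prob("i hello say will".split()))   # 0.0: contains unseen bigrams
```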

Let's Read Shakespeare... in Unigram (1-gram)

Shakespeare in Bigram (N = 2)

Shakespeare in Trigram (N = 3)

Shakespeare in 4-gram
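The Shakespeare slides presumably show text sampled from n-gram models of increasing order. Below is a minimal sketch (for n >= 2) of how such sampling could work; the file name 'shakespeare.txt' and all identifiers are placeholders, not the actual models behind the slides:

```python
import random
from collections import defaultdict

def build_ngram_model(tokens, n):
    """Map each (n-1)-word context to the list of words observed after it."""
    model = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context].append(tokens[i + n - 1])
    return model

def generate(model, n, length=20):
    """Sample words one at a time, conditioning only on the last n-1 words."""
    context = random.choice(list(model.keys()))
    out = list(context)
    for _ in range(length):
        choices = model.get(tuple(out[len(out) - (n - 1):]))
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

# 'shakespeare.txt' is a placeholder file name, not provided with the slides.
tokens = open("shakespeare.txt").read().split()
print(generate(build_ngram_model(tokens, 2), 2))   # bigram-flavoured text
print(generate(build_ngram_model(tokens, 3), 3))   # trigram-flavoured text
```

Higher n makes the output look more like real Shakespeare, at the cost of mostly reproducing memorized phrases, which is what the slides' progression from unigram to 4-gram illustrates.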

Shakespeare N-gram Quiz. Find: 1 real quote, 3 unigram picks, 3 bigram picks, 3 trigram picks.

Bigram Probability Question
P(woe is me | ^) = ?  Given that:
^ is the symbol marking the start of a sentence
P(woe_i | ^_{i−1}) = 0.0002
P(is_i | woe_{i−1}) = 0.07
P(me_i | is_{i−1}) = 0.0005
Answer: P(woe is me | ^) = 0.0002 × 0.07 × 0.0005 = 7 × 10^{−9}
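A quick check of the product above, with the three conditional probabilities taken straight from the slide (the variable names are just labels for this example):

```python
# Bigram chain: P(woe | ^) * P(is | woe) * P(me | is)
p_woe_given_start, p_is_given_woe, p_me_given_is = 0.0002, 0.07, 0.0005
print(p_woe_given_start * p_is_given_woe * p_me_given_is)   # ~7e-09, up to float rounding
```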

Other Tricks
Stationarity assumption: the context does not change over time.
Smoothing: remember Laplace smoothing?
Hidden variables: e.g., identify what a “noun” is.
Use abstractions: group “New York City” as a unit, or just look at letters.
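The slide only name-drops Laplace smoothing, so here is a minimal sketch of add-one smoothing for a bigram estimate; the toy counts and vocabulary size are assumptions for the example:

```python
def laplace_bigram_prob(bigram_count, context_count, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(word | prev):
    unseen bigrams get a small non-zero probability instead of 0."""
    return (bigram_count + 1) / (context_count + vocab_size)

# Assumed toy numbers: the bigram ("say", "hello") seen once, the context "say"
# seen twice, over a 1,000-word vocabulary.
print(laplace_bigram_prob(1, 2, 1000))   # ~0.002 instead of the unsmoothed 0.5
print(laplace_bigram_prob(0, 2, 1000))   # an unseen bigram still gets ~0.001
```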

Smaller Than Words?
What if we cannot distinguish words?
English: “choosespain.com”. Is it “Choose Spain” or “Chooses Pain”?
Segmentation: dividing text into words.
Use Bayes again: $s^* = \max P(w_{1:n}) = \max \prod_i P(w_i \mid w_{1:(i-1)})$
Or apply the Markov assumption (e.g., unigram): $s^* = \max \prod_i P(w_i)$

Segmentation Complexity
$s^* = \max \prod_i P(w_i)$
What is the complexity of segmenting “nowisthetime”?
1. n − 1
2. (n − 1)^2
3. (n − 1)!
4. 2^{n − 1}
5. (n − 1)^n
Solution: place a possible division at each of the n − 1 positions between characters; a segmentation is a choice of which divisions to use, so there are 2^{n − 1} candidates.
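A brute-force sketch that makes the 2^(n−1) count concrete: it enumerates every way to place divisions between characters and scores each candidate with the unigram product. The toy probability table P and the floor value for unknown words are assumptions for this example:

```python
from itertools import product
from math import prod

# Assumed toy unigram probabilities; anything not listed gets a tiny floor value.
P = {"now": 0.01, "is": 0.05, "the": 0.08, "time": 0.01}

def unigram_score(words):
    return prod(P.get(w, 1e-9) for w in words)

def segment_bruteforce(text):
    """Try all 2^(n-1) ways of cutting between characters and keep the best."""
    n = len(text)
    best, best_score = [text], unigram_score([text])
    for cuts in product([False, True], repeat=n - 1):
        words, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                words.append(text[start:i])
                start = i
        words.append(text[start:])
        score = unigram_score(words)
        if score > best_score:
            best, best_score = words, score
    return best

print(segment_bruteforce("nowisthetime"))   # exponential; fine only for short strings
```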

Reducing Segmentation Complexity
Exploit independence: “nowisthetime”?
Split off the first word f and recurse on the rest r:
$s^* = \max_{s = f + r} P(f) \cdot s^*(r)$
Gives 99% accuracy and an easy implementation!
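A minimal sketch of that recursive decomposition with memoization (in the spirit of Norvig's segmenter, which the slides appear to follow); the toy probability table and the floor value for unknown words are again assumptions:

```python
from functools import lru_cache
from math import prod

# Assumed toy unigram probabilities; unknown words get a tiny floor value.
P = {"now": 0.01, "is": 0.05, "the": 0.08, "time": 0.01}

def Pw(word):
    return P.get(word, 1e-9)

@lru_cache(maxsize=None)
def segment(text):
    """Best segmentation of text: try every first word f, recurse on the rest."""
    if not text:
        return ()
    candidates = [(text[:i],) + segment(text[i:]) for i in range(1, len(text) + 1)]
    return max(candidates, key=lambda words: prod(Pw(w) for w in words))

print(segment("nowisthetime"))   # ('now', 'is', 'the', 'time'), without trying all 2^(n-1) splits
```

Because the unigram product factorizes over the split point, memoizing the best segmentation of each suffix turns the exponential search into polynomial work.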

Segmentation Problems
How can we improve?
1. More data
2. Markov
3. Smoothing
Need to get the context.

Segmentation Problems (2)
How can we improve?
1. More data
2. Markov
3. Smoothing
Need to know more words.

What Else Can We Do with Letters? Language identification?

Bigram Recognition with Letters

Trigram Recognition with Letters: 99% accuracy from trigrams!
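A minimal sketch of how letter n-gram counts could drive language identification, as the recognition slides suggest; the two tiny training strings, the crude smoothing, and all names are assumptions for illustration, and a real system would train on large corpora and score in log space:

```python
from collections import Counter

def letter_ngrams(text, n=3):
    """Counts of overlapping letter n-grams in the text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def score(text, model, n=3):
    """Toy score: product of crudely smoothed n-gram relative frequencies."""
    total = sum(model.values())
    p = 1.0
    for gram, count in letter_ngrams(text, n).items():
        p *= ((model[gram] + 1) / (total + 1)) ** count
    return p

# Assumed tiny training samples, one per language.
models = {
    "en": letter_ngrams("hello there how are you today my friend"),
    "es": letter_ngrams("hola que tal como estas hoy amigo mio"),
}

query = "how are you"
print(max(models, key=lambda lang: score(query, models[lang])))   # likely 'en'
```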

Can We Identify Categories Too? Text classification.

Text Classification
What algorithms can we use?
Naive Bayes: spam vs. ham
k-Nearest Neighbor: similar words
Support Vector Machines: supervised learning
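Since the slide pairs Naive Bayes with spam vs. ham, here is a minimal bag-of-words Naive Bayes sketch with Laplace smoothing; the four training messages and every identifier are assumptions for illustration:

```python
from collections import Counter
from math import log

# Assumed toy training data: (label, message).
train = [
    ("spam", "win money now"),
    ("spam", "free money offer"),
    ("ham",  "meeting at noon"),
    ("ham",  "lunch money tomorrow"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for label, text in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for c in word_counts.values() for w in c}

def classify(text):
    """Pick the class maximizing log P(class) + sum_w log P(w | class),
    with Laplace smoothing on the word likelihoods."""
    best, best_score = None, float("-inf")
    for label in word_counts:
        total = sum(word_counts[label].values())
        score = log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            score += log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

print(classify("free money"))          # likely 'spam'
print(classify("meeting tomorrow"))    # likely 'ham'
```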
