NLP Programming Tutorial 4 - Word Segmentation
Graham Neubig
Nara Institute of Science and Technology (NAIST)

Introduction

What is Word Segmentation
● Sentences in Japanese or Chinese are written without spaces
    単語分割を行う
● Word segmentation adds spaces between words
    単語 分割 を 行う
● For Japanese, there are tools like MeCab, KyTea

Tools Required: Substring
● In order to do word segmentation, we need to find substrings of a string
$ ./my-program.py
hello
world
lo wo
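The source code behind the output above did not survive on the slide. A minimal sketch that would print the same three lines, assuming the input string is "hello world" and using Python slice notation (the variable names are illustrative):

```python
# Assumed example string; the slide's own source code was not preserved.
my_str = "hello world"

first_five = my_str[:5]   # characters at positions 0 to 4
from_sixth = my_str[6:]   # everything from position 6 onward
middle = my_str[3:8]      # characters at positions 3 to 7

print(first_five)
print(from_sixth)
print(middle)
```

Slices are half-open: `my_str[a:b]` includes position `a` and stops before position `b`, which is the convention the segmentation graph relies on later.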

Handling Unicode Characters with Substr
● The “unicode()” and “encode()” functions handle UTF-8
$ cat test_file.txt
単語分割
$ ./my-program.py
str: ��
utf_str: 単語分割
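The "unicode()" built-in exists only in Python 2. The sketch below shows the same effect in Python 3, where `str.encode()` and `bytes.decode()` play the corresponding roles. It is an illustration under that assumption, not the slide's original program, and the variable names are hypothetical.

```python
text = "単語分割"             # str: a sequence of Unicode characters
raw = text.encode("utf-8")    # bytes: each of these kanji takes 3 bytes

# Slicing the raw bytes can cut a character in half, which is what
# produces the broken output on the "str:" line.
broken = raw[:2].decode("utf-8", errors="replace")

# Slicing the decoded string always cuts on character boundaries.
first_word = raw.decode("utf-8")[:2]

print(len(raw), len(text))
print(broken)
print(first_word)
```

Segmentation code should therefore take substrings of the decoded string, never of the encoded bytes.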

Word Segmentation is Hard!
● Many analyses for each sentence, only one correct
    農産物価格安定法
    o 農産 物 価格 安定 法 (agricultural product price stabilization law)
    x 農産 物価 格安 定法 (agricultural cost of living discount measurement)
● How do we choose the correct analysis?

One Solution: Use a Language Model!
● Choose the analysis with the highest probability (each line is a different segmentation of the sentence)
    P(農産物価格安定法) = 4.12*10^-23
    P(農産物価格安定法) = 3.53*10^-24
    P(農産物価格安定法) = 6.53*10^-25
    P(農産物価格安定法) = 6.53*10^-27
    …
● Here, we will use a unigram language model

Problem: HUGE Number of Possibilities
[Slide filled with alternative segmentations of 農産物価格安定法 …] (how many?)
● How do we find the best answer efficiently?
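One way to answer "how many?": each of the n - 1 gaps between adjacent characters is either a word boundary or not, so a sentence of n characters has 2^(n-1) segmentations. The brute-force sketch below enumerates them for the 8-character example. The function name is hypothetical, and it assumes a non-empty sentence.

```python
from itertools import product

def all_segmentations(sentence):
    """Yield every way to split a non-empty sentence into contiguous words."""
    gaps = len(sentence) - 1
    # Each gap between two characters either gets a boundary or does not.
    for cuts in product([False, True], repeat=gaps):
        words, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                words.append(sentence[start:i])
                start = i
        words.append(sentence[start:])
        yield words

segmentations = list(all_segmentations("農産物価格安定法"))
print(len(segmentations))  # 2 ** 7 = 128
```

The count doubles with every added character, so scoring each candidate separately quickly becomes impractical for full sentences.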

This Man Has an Answer!
Andrew Viterbi (Professor UCLA → Founder of Qualcomm)

Viterbi Algorithm

The Viterbi Algorithm
● Efficient way to find the shortest path through a graph
[Figure: a graph over nodes 0, 1, 2, 3 with edge weights 1.4, 2.3, 4.0, 2.5 and 2.1; Viterbi reduces it to the shortest path, which keeps the edges weighted 1.4 and 2.3]

Graph?! What?!
??? (Let Me Explain!)

Word Segmentations as Graphs
[Figure: graph for 農産物 with nodes 0, 1, 2, 3 at the character boundaries and edges 0→1 農 (2.5), 0→2 農産 (1.4), 1→2 産 (4.0), 1→3 産物 (2.1), 2→3 物 (2.3)]
● Each edge is a word
● Each edge weight is a negative log probability
    -log(P(農産)) = 1.4
● Why?! (hint, we want the shortest path)
● Each path is a segmentation for the sentence
● Each path weight is a sentence unigram negative log probability
    -log(P(農産)) + -log(P(物)) = 1.4 + 2.3 = 3.7
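The hint can be checked numerically. Under a unigram model the probability of a segmentation is the product of its word probabilities, and taking the negative log turns that product into a sum, so the most probable segmentation is the path with the smallest total weight. The sketch below assumes natural logarithms (the slides do not state the base) and recovers the probabilities from the two weights given above.

```python
import math

# Edge weights from the slide, read as negative log probabilities.
neg_log_prob = {"農産": 1.4, "物": 2.3}

# Probabilities implied by the weights (natural log is an assumption).
prob = {word: math.exp(-weight) for word, weight in neg_log_prob.items()}

path = ["農産", "物"]
path_weight = sum(neg_log_prob[w] for w in path)   # sum of edge weights
path_prob = math.prod(prob[w] for w in path)       # unigram sentence probability

print(round(path_weight, 1))
print(math.isclose(-math.log(path_prob), path_weight))
```

Working with sums of negative logs also avoids the numerical underflow that multiplying many small probabilities would cause.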

Ok Viterbi, Tell Me More!
● The Viterbi Algorithm has two steps
    ● In forward order, find the score of the best path to each node
    ● In backward order, create the best path

Forward Step

Forward Step
[Figure: the example graph with edges e1: 0→1 (2.5), e2: 0→2 (1.4), e3: 1→2 (4.0), e4: 1→3 (2.1), e5: 2→3 (2.3)]
best_score[0] = 0
for each node in the graph after node 0 (ascending order)
    best_score[node] = ∞
    for each incoming edge of node
        score = best_score[edge.prev_node] + edge.score
        if score < best_score[node]
            best_score[node] = score
            best_edge[node] = edge
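A minimal Python sketch of the forward step, run on the example graph. The `Edge` record and the `forward` function are hypothetical names whose fields mirror the pseudocode. The sketch assumes every edge goes from a lower-numbered node to a higher-numbered one, so that visiting nodes in ascending order guarantees each predecessor is already scored.

```python
import math
from collections import namedtuple

# Hypothetical edge record; field names mirror the pseudocode.
Edge = namedtuple("Edge", ["name", "prev_node", "next_node", "score"])

# The example graph from the slides.
edges = [
    Edge("e1", 0, 1, 2.5),
    Edge("e2", 0, 2, 1.4),
    Edge("e3", 1, 2, 4.0),
    Edge("e4", 1, 3, 2.1),
    Edge("e5", 2, 3, 2.3),
]

def forward(edges, num_nodes):
    """Return the best score and best incoming edge for every node."""
    best_score = [math.inf] * num_nodes
    best_edge = [None] * num_nodes
    best_score[0] = 0.0
    for node in range(1, num_nodes):
        for edge in edges:
            if edge.next_node != node:
                continue  # not an incoming edge of this node
            score = best_score[edge.prev_node] + edge.score
            if score < best_score[node]:
                best_score[node] = score
                best_edge[node] = edge
    return best_score, best_edge

best_score, best_edge = forward(edges, 4)
print([round(s, 1) for s in best_score])
print([e.name if e else None for e in best_edge])
```

Scanning the full edge list for every node keeps the sketch short; storing the incoming edges per node would avoid the repeated scan on larger graphs.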

Example:
[Figure: the example graph with edges e1: 0→1 (2.5), e2: 0→2 (1.4), e3: 1→2 (4.0), e4: 1→3 (2.1), e5: 2→3 (2.3); the best score of each node is updated step by step]
Initialize:
    best_score[0] = 0
    best_score = ( 0.0, ∞, ∞, ∞ )
Check e1:
    score = 0 + 2.5 = 2.5 (< ∞)
    best_score[1] = 2.5
    best_edge[1] = e1
    best_score = ( 0.0, 2.5, ∞, ∞ )
Check e2:
    score = 0 + 1.4 = 1.4 (< ∞)
    best_score[2] = 1.4
    best_edge[2] = e2
    best_score = ( 0.0, 2.5, 1.4, ∞ )
Check e3:
    score = 2.5 + 4.0 = 6.5 (> 1.4)
    No change!
Check e4:
    score = 2.5 + 2.1 = 4.6 (< ∞)
    best_score[3] = 4.6
    best_edge[3] = e4
    best_score = ( 0.0, 2.5, 1.4, 4.6 )
Check e5:
    score = 1.4 + 2.3 = 3.7 (< 4.6)
    best_score[3] = 3.7
    best_edge[3] = e5
    best_score = ( 0.0, 2.5, 1.4, 3.7 )

Result of Forward Step
[Figure: the example graph with best scores 0.0, 2.5, 1.4, 3.7 at nodes 0, 1, 2, 3]
best_score = ( 0.0, 2.5, 1.4, 3.7 )
best_edge = ( NULL, e1, e2, e5 )

Backward Step

Backward Step
[Figure: the example graph with best scores 0.0, 2.5, 1.4, 3.7 at nodes 0, 1, 2, 3]
best_path = [ ]
next_edge = best_edge[best_edge.length - 1]
while next_edge != NULL
    add next_edge to best_path
    next_edge = best_edge[next_edge.prev_node]
reverse best_path
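A minimal Python sketch of the backward step, starting from the `best_edge` table that the forward step produced for the example graph. The `Edge` record and the `backward` function are hypothetical names that follow the pseudocode.

```python
from collections import namedtuple

# Hypothetical edge record; field names mirror the pseudocode.
Edge = namedtuple("Edge", ["name", "prev_node", "next_node", "score"])

e1 = Edge("e1", 0, 1, 2.5)
e2 = Edge("e2", 0, 2, 1.4)
e5 = Edge("e5", 2, 3, 2.3)

# Result of the forward step: best incoming edge for nodes 0, 1, 2, 3.
best_edge = [None, e1, e2, e5]

def backward(best_edge):
    """Follow best_edge from the last node back to node 0."""
    best_path = []
    next_edge = best_edge[-1]
    while next_edge is not None:
        best_path.append(next_edge)
        next_edge = best_edge[next_edge.prev_node]
    best_path.reverse()  # edges were collected from the end to the start
    return best_path

best_path = backward(best_edge)
print([edge.name for edge in best_path])
```

The loop stops at node 0 because it has no best incoming edge, which is why `best_edge[0]` is left empty in the forward step.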

Example of Backward Step
[Figure: the example graph with best scores 0.0, 2.5, 1.4, 3.7 at nodes 0, 1, 2, 3]
Initialize:
    best_path = []
    next_edge = best_edge[3] = e5
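Putting the forward and backward steps together for word segmentation could look like the sketch below. Instead of a prebuilt edge list, each substring of the sentence that appears in the model acts as an edge from its start position to its end position. The model here is hypothetical: its values are simply the edge weights of the 農産物 example, read as negative log probabilities, and the sketch assumes every needed word is in the model (no unknown-word handling).

```python
import math

# Hypothetical unigram model stored as -log P(word); the values are the
# edge weights of the example graph.
unigram_neg_log_prob = {"農": 2.5, "農産": 1.4, "産": 4.0, "産物": 2.1, "物": 2.3}

def segment(sentence, neg_log_prob):
    """Return the lowest-weight split of sentence into known words."""
    n = len(sentence)
    best_score = [math.inf] * (n + 1)
    best_edge = [None] * (n + 1)  # (start, end) of the best incoming word
    best_score[0] = 0.0
    # Forward step: node `end` is the boundary after character end - 1.
    for end in range(1, n + 1):
        for start in range(end):
            word = sentence[start:end]
            if word not in neg_log_prob:
                continue
            score = best_score[start] + neg_log_prob[word]
            if score < best_score[end]:
                best_score[end] = score
                best_edge[end] = (start, end)
    # Backward step: follow the best edges from the last node to node 0.
    words = []
    edge = best_edge[n]
    while edge is not None:
        start, end = edge
        words.append(sentence[start:end])
        edge = best_edge[start]
    words.reverse()
    return words

print(" ".join(segment("農産物", unigram_neg_log_prob)))
```

If no sequence of known words covers the sentence, this sketch returns an empty list; a practical segmenter would need some probability for unknown words.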
