Natural Language Processing: Machine Translation III. Dan Klein – UC Berkeley 1
Syntactic Models 2
Syntactic Decoding 29
Flexible Syntax 49
Soft Syntactic MT: From Chiang 2010 50
Hiero Rules From [Chiang et al, 2005] 52
Exploiting GPUs 65
Lots to Parse ≈ 2.6 billion words 66
Lots to Parse ≈ 6 months (CPU) 67
Lots to Parse ≈ 3.6 days (GPU) 68
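As a rough check on these figures, the arithmetic below converts the two wall-clock estimates into throughput and an implied speedup. It is only a back-of-the-envelope sketch: "6 months" is assumed to mean half of a 365-day year, and the constant and function names are illustrative.

```python
# Back-of-the-envelope arithmetic for the corpus-scale figures on the slides.
# Assumption: "6 months" is read as half of a 365-day year.
CORPUS_WORDS = 2.6e9          # about 2.6 billion words
CPU_DAYS = 365 / 2            # about 6 months on CPU
GPU_DAYS = 3.6                # about 3.6 days on GPU
SECONDS_PER_DAY = 24 * 60 * 60


def words_per_second(total_words, days):
    """Average throughput needed to finish total_words in the given days."""
    return total_words / (days * SECONDS_PER_DAY)


cpu_rate = words_per_second(CORPUS_WORDS, CPU_DAYS)
gpu_rate = words_per_second(CORPUS_WORDS, GPU_DAYS)
speedup = CPU_DAYS / GPU_DAYS

print(f"CPU: {cpu_rate:,.0f} words/sec")
print(f"GPU: {gpu_rate:,.0f} words/sec")
print(f"Implied speedup: {speedup:.1f}x")
```

Under that assumption the two estimates differ by a factor of roughly 50.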
CPU Parsing [Petrov & Klein, 2007]: NLP algorithms achieve speed by exploiting sparsity (>98% sparsity). Slide credit: Slav Petrov 69
CPU Parsing: Skip Spans, Skip Rules [figure: grammar with S, NP, VP; skipped entries marked ×] 70
CPU Parsing CPU 71
The Future of Hardware 74
The Future of Hardware 16384 77
The Future of Hardware 32 Threads 78
The Future of Hardware 79
Warp:
    add.s32 %r1, %r631, %r0;
    ld.global.f32 %f81, [%r1];
    ld.global.f32 %f82, [%r34];
    mul.ftz.f32 %f94, %f82, %f81;
    mov.f32 %f95, 0f3E002E23;
    mov.f32 %f96, 0f00000000;
    mad.f32 %f93, %f94, %f95, %f96;
    shl.b32 %r2, %r646, 8;
    add.s32 %r3, %r658, %r2;
    shl.b32 %r4, %r3, 2;
    add.s32 %r5, %r631, %r4;
    mul.lo.s32 %r6, %r646, 588;
    shl.b32 %r7, %r6, 1;
    add.s32 %r8, %r5, %r7;
    ld.global.f32 %f83, [%r8];
    mul.ftz.f32 %f98, %f82, %f83;
Warps Warp 80
Warps Warp Divergence 81
Warps ✔ ✗ Coalescence 86
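The two warp effects named on these slides, divergence and coalescence, can be illustrated with a toy cost model. This is a simplified sketch, not a description of any particular GPU: the 32-thread warp comes from the slides, while the 128-byte memory segment, the 4-byte float, and all function names are assumptions made for illustration.

```python
# Toy cost model for warp divergence and memory coalescence.
# Assumptions (not from the slides): 128-byte memory segments, 4-byte floats.
WARP_SIZE = 32
SEGMENT_BYTES = 128
FLOAT_BYTES = 4


def warp_branch_cost(lane_takes_branch, cost_if, cost_else):
    """Lanes run in lockstep, so the warp pays for every branch any lane takes."""
    cost = 0
    if any(lane_takes_branch):
        cost += cost_if
    if not all(lane_takes_branch):
        cost += cost_else
    return cost


def warp_addresses(base, stride_in_floats):
    """Byte address read by each lane for a given stride between lanes."""
    return [base + lane * stride_in_floats * FLOAT_BYTES for lane in range(WARP_SIZE)]


def segments_touched(addresses):
    """Number of distinct memory segments needed to serve one warp-wide load."""
    return len({addr // SEGMENT_BYTES for addr in addresses})


uniform_cost = warp_branch_cost([True] * WARP_SIZE, cost_if=10, cost_else=10)
divergent_cost = warp_branch_cost(
    [lane % 2 == 0 for lane in range(WARP_SIZE)], cost_if=10, cost_else=10
)
coalesced_segments = segments_touched(warp_addresses(0, stride_in_floats=1))
strided_segments = segments_touched(warp_addresses(0, stride_in_floats=32))

print("branch cost, uniform vs divergent:", uniform_cost, divergent_cost)
print("segments, coalesced vs strided:", coalesced_segments, strided_segments)
```

In this model a divergent warp pays for both sides of the branch, and a strided load touches one segment per lane instead of one segment for the whole warp.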
Designing GPU Algorithms: Warp, Coalescence. Dense, Uniform Computation 87
Designing GPU Algorithms: CPU: irregular, sparse. GPU: regular, dense. [Canny, Hall, and Klein, 2013] 89
Designing GPU Algorithms CKY Algorithm 90
CKY Parsing 91
for each sentence:
    for each span (begin, end):
        for each split:
            for each rule (P -> L R):
                score[begin, end, P] += ruleScore[P -> L R]
                                        * score[begin, split, L]
                                        * score[split, end, R]
Item Queue: the sentence, span, and split loops. Grammar Application: the rule loop.
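The loop nest above can be written as a small runnable function. The following is a minimal sketch for one sentence, assuming a grammar in binary form stored as a dictionary from (parent, left, right) to a rule score, plus a lexicon that seeds the width-1 spans. The toy grammar, the lexicon, and the name `cky_inside` are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict


def cky_inside(words, lexicon, binary_rules):
    """Sum-product CKY: score[begin, end, P] accumulates over splits and rules."""
    n = len(words)
    score = defaultdict(float)

    # Width-1 spans come from the (hypothetical) lexicon.
    for i, word in enumerate(words):
        for tag, prob in lexicon.get(word, {}).items():
            score[i, i + 1, tag] += prob

    # Wider spans are built bottom-up from shorter ones.
    for length in range(2, n + 1):
        for begin in range(0, n - length + 1):
            end = begin + length
            for split in range(begin + 1, end):
                for (parent, left, right), rule_score in binary_rules.items():
                    left_score = score.get((begin, split, left), 0.0)
                    right_score = score.get((split, end, right), 0.0)
                    if left_score and right_score:
                        score[begin, end, parent] += rule_score * left_score * right_score
    return score


# Toy grammar and lexicon, for illustration only.
toy_rules = {("S", "NP", "VP"): 1.0, ("NP", "Det", "N"): 1.0, ("VP", "V", "NP"): 1.0}
toy_lexicon = {
    "the": {"Det": 1.0},
    "dog": {"N": 0.5},
    "cat": {"N": 0.5},
    "saw": {"V": 1.0},
}
sentence = "the dog saw the cat".split()
chart = cky_inside(sentence, toy_lexicon, toy_rules)
print(chart[0, len(sentence), "S"])
```

The `if left_score and right_score` test is the kind of sparsity shortcut that suits a CPU; it skips rules whose children have zero score.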
CKY Parsing 92
for each sentence:
    for each span (begin, end):
        for each split:
            applyGrammar(begin, split, end)
Item Queue: the sentence, span, and split loops. Grammar Application: applyGrammar.
CKY Parsing 93
for each parse item in sentence:
    applyGrammar(item)
Item Queue: the loop over parse items. Grammar Application: applyGrammar.
CKY Parsing 94
for each parse item in sentence:    (CPU)
    applyGrammar(item)              (GPU)
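The split between queuing items and applying the grammar can be sketched as follows. This is a sequential stand-in for the CPU/GPU division, not the authors' implementation: `parse_items` plays the role of the CPU-side queue, `apply_grammar` is the per-item work that the slides assign to the GPU, and the function names, toy grammar, and lexicon are all illustrative assumptions.

```python
def parse_items(n, length):
    """Queue side: every (begin, split, end) item for spans of one length."""
    return [
        (begin, split, begin + length)
        for begin in range(0, n - length + 1)
        for split in range(begin + 1, begin + length)
    ]


def apply_grammar(item, score, binary_rules):
    """Application side: apply every rule to one item, with no sparsity checks."""
    begin, split, end = item
    for (parent, left, right), rule_score in binary_rules.items():
        left_score = score.get((begin, split, left), 0.0)
        right_score = score.get((split, end, right), 0.0)
        key = (begin, end, parent)
        score[key] = score.get(key, 0.0) + rule_score * left_score * right_score


def parse(words, lexicon, binary_rules):
    n = len(words)
    score = {}
    for i, word in enumerate(words):
        for tag, prob in lexicon.get(word, {}).items():
            score[i, i + 1, tag] = prob
    for length in range(2, n + 1):
        queue = parse_items(n, length)   # built on the CPU in the slides
        for item in queue:               # a GPU would process these in parallel
            apply_grammar(item, score, binary_rules)
    return score


# Toy grammar and lexicon, for illustration only.
toy_rules = {("S", "NP", "VP"): 1.0, ("NP", "Det", "N"): 1.0, ("VP", "V", "NP"): 1.0}
toy_lexicon = {
    "the": {"Det": 1.0},
    "dog": {"N": 0.5},
    "cat": {"N": 0.5},
    "saw": {"V": 1.0},
}
sentence = "the dog saw the cat".split()
chart = parse(sentence, toy_lexicon, toy_rules)
print(parse_items(6, 3))
print(chart[0, len(sentence), "S"])
```

Items of the same length read only shorter spans, so they do not depend on one another. Several items can write to the same (begin, end, parent) cell, however, so a truly parallel version would need a reduction or atomic update there.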
GPU Parsing Pipeline: CPU fills a queue of items (i, k, j): (0, 1, 3), (0, 2, 3), (1, 2, 4), (1, 3, 4), …; GPU applies the grammar (S, NP, VP) to queued items 95
Parsing Speed (sentences per second): CPU 10, GPU 190 [Canny, Hall, and Klein, 2013] 96
Exploiting Sparsity: CPU Queuing, GPU Application [figure: grammar with S, NP, VP; skipped entries marked ×] 97
Exploiting Sparsity [figure: two grammars over S, NP, VP, each with its own GPU Application] 98
Exploiting Sparsity: queued items (0, 1, 3), (0, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 6), … [figure label: Warp] 99
Exploiting Sparsity 100
(0, 1, 3): S NP VP PP …
(0, 2, 3): S NP VP PP …
(1, 2, 4): S NP VP PP …
(1, 3, 4): S NP VP PP …
(2, 3, 5): S NP VP PP …
(2, 4, 5): S NP VP PP …
(3, 4, 6): S NP VP PP …
…