
DETERMINANTAL POINT PROCESSES FOR NATURAL LANGUAGE PROCESSING

Jennifer Gillenwater. Joint work with Alex Kulesza and Ben Taskar.


  1. DETERMINANTAL POINT PROCESSES FOR NATURAL LANGUAGE PROCESSING Jennifer Gillenwater Joint work with Alex Kulesza and Ben Taskar

  2. OUTLINE

  3. OUTLINE: Motivation & background on DPPs

  4. OUTLINE: Motivation & background on DPPs; Large-scale settings

  5. OUTLINE: Motivation & background on DPPs; Large-scale settings; Structured summarization

  6. OUTLINE: Motivation & background on DPPs; Large-scale settings; Structured summarization; Other potential NLP applications

  7. MOTIVATION & BACKGROUND

  8. SUMMARIZATION

  9. SUMMARIZATION ...

  10. SUMMARIZATION ...

  11. SUMMARIZATION ... Quality: relevance to the topic

  12. SUMMARIZATION ... Quality: relevance to the topic. Diversity: coverage of core ideas

  13. SUBSET SELECTION

  14. SUBSET SELECTION

  15. SUBSET SELECTION

  16. SUBSET SELECTION

  17. AREA AS SET-GOODNESS

  18. AREA AS SET-GOODNESS feature space

  19. AREA AS SET-GOODNESS feature space: $B_i$, $B_j$

  20. AREA AS SET-GOODNESS feature space: quality $= \sqrt{B_i^\top B_i}$, similarity $= B_i^\top B_j$

  21. AREA AS SET-GOODNESS feature space: quality $= \sqrt{B_i^\top B_i}$, similarity $= B_i^\top B_j$, area of the parallelogram with vertices $B_i$, $B_j$, $B_i + B_j$ $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  22. AREA AS SET-GOODNESS feature space: quality $= \sqrt{B_i^\top B_i}$, similarity $= B_i^\top B_j$, area of the parallelogram with vertices $B_i$, $B_j$, $B_i + B_j$ $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  23. AREA AS SET-GOODNESS feature space: quality $= \sqrt{B_i^\top B_i}$, similarity $= B_i^\top B_j$, area of the parallelogram with vertices $B_i$, $B_j$, $B_i + B_j$ $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  24. VOLUME AS SET-GOODNESS area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  25. VOLUME AS SET-GOODNESS area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  26. VOLUME AS SET-GOODNESS area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$, length $= \|B_i\|_2$

  27. VOLUME AS SET-GOODNESS area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$, length $= \|B_i\|_2$, volume = base × height

  28. VOLUME AS SET-GOODNESS area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$, length $= \|B_i\|_2$, volume = base × height: $\mathrm{vol}(B) = \text{height} \times \text{base} = \|B_1\|_2 \, \mathrm{vol}(\mathrm{proj}_{\perp B_1}(B_{2:N}))$

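  29. AREA AS A DET area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  30. AREA AS A DET area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2} = \det\begin{pmatrix} \|B_i\|_2^2 & B_i^\top B_j \\ B_i^\top B_j & \|B_j\|_2^2 \end{pmatrix}^{1/2}$

  31. AREA AS A DET area $= \sqrt{\|B_i\|_2^2 \|B_j\|_2^2 - (B_i^\top B_j)^2} = \det\begin{pmatrix} \|B_i\|_2^2 & B_i^\top B_j \\ B_i^\top B_j & \|B_j\|_2^2 \end{pmatrix}^{1/2} = \det\left( \begin{pmatrix} B_i^\top \\ B_j^\top \end{pmatrix} \begin{pmatrix} B_i & B_j \end{pmatrix} \right)^{1/2}$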

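  32. VOLUME AS A DET $\mathrm{vol}(B_{\{i,j\}}) = \det\left( \begin{pmatrix} B_i^\top \\ B_j^\top \end{pmatrix} \begin{pmatrix} B_i & B_j \end{pmatrix} \right)^{1/2}$

  33. VOLUME AS A DET $\mathrm{vol}(B_{\{i,j\}}) = \det\left( \begin{pmatrix} B_i^\top \\ B_j^\top \end{pmatrix} \begin{pmatrix} B_i & B_j \end{pmatrix} \right)^{1/2}$; $\mathrm{vol}(B) = \det\left( \begin{pmatrix} B_1^\top \\ \vdots \\ B_N^\top \end{pmatrix} \begin{pmatrix} B_1 & \cdots & B_N \end{pmatrix} \right)^{1/2}$; $\mathrm{vol}(B)^2 = \det(B^\top B) = \det(L)$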
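
The identities on the area and volume slides can be checked numerically. The following is a minimal NumPy sketch, assuming the feature vectors $B_i$ are the columns of a $D \times N$ matrix `B` filled with random values; the helper name `volume_recursive` and the sizes are illustrative and not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 4, 3                       # illustrative sizes (assumption)
B = rng.standard_normal((D, N))   # columns are the feature vectors B_i
L = B.T @ B                       # Gram matrix, L = B^T B

# Area of the parallelogram spanned by B_i and B_j, two ways.
Bi, Bj = B[:, 0], B[:, 1]
area_formula = np.sqrt((Bi @ Bi) * (Bj @ Bj) - (Bi @ Bj) ** 2)
L_ij = L[np.ix_([0, 1], [0, 1])]  # 2x2 block of L for items i and j
area_det = np.sqrt(np.linalg.det(L_ij))

def volume_recursive(B):
    """vol(B) = ||B_1||_2 * vol(projection of B_2..B_N orthogonal to B_1)."""
    b1 = B[:, 0]
    norm = np.linalg.norm(b1)
    if B.shape[1] == 1:
        return norm
    u = b1 / norm
    rest = B[:, 1:] - np.outer(u, u @ B[:, 1:])  # remove the B_1 component
    return norm * volume_recursive(rest)

vol_recursive = volume_recursive(B)
vol_det = np.sqrt(np.linalg.det(L))  # vol(B)^2 = det(B^T B) = det(L)

print(area_formula, area_det)
print(vol_recursive, vol_det)
```

Each printed pair should agree up to floating-point error, which is the base × height argument and the determinant formula giving the same number.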

  34. COMPLEX STATISTICS

  35. COMPLEX STATISTICS

  36. COMPLEX STATISTICS

  37. COMPLEX STATISTICS P

  38. COMPLEX STATISTICS P

  39. COMPLEX STATISTICS P

  40. COMPLEX STATISTICS P

  41. COMPLEX STATISTICS P

  42. COMPLEX STATISTICS P

  43. COMPLEX STATISTICS P

  44. COMPLEX STATISTICS P

  45. COMPLEX STATISTICS $N$ items $\Rightarrow$ $2^N$ sets

  46. EFFICIENT COMPUTATION

  47. EFFICIENT COMPUTATION $\det(\cdot)^{1/2}$

  48. EFFICIENT COMPUTATION $(\cdot)^2 = \det(\cdot)$

  49. EFFICIENT COMPUTATION $\sum (\cdot)^2 = \det(\cdot)$

  50. EFFICIENT COMPUTATION $(\cdot)^2 = \det(\cdot)$, computable in $O(N^3)$

  51. POINT PROCESSES $\mathcal{Y} = \{1, \ldots, N\}$

  52. POINT PROCESSES $\mathcal{Y} = \{1, \ldots, N\}$

  53. POINT PROCESSES $\mathcal{Y} = \{1, \ldots, N\}$, $\mathcal{P}(\cdot)$

  54. POINT PROCESSES $\mathcal{Y} = \{1, \ldots, N\}$, $\mathcal{P}(\cdot) = 0.2$

  55. DETERMINANTAL

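  56. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) \propto$

  57. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) \propto \begin{pmatrix} L_{11} & L_{12} & L_{13} & L_{14} & L_{15} \\ L_{21} & L_{22} & L_{23} & L_{24} & L_{25} \\ L_{31} & L_{32} & L_{33} & L_{34} & L_{35} \\ L_{41} & L_{42} & L_{43} & L_{44} & L_{45} \\ L_{51} & L_{52} & L_{53} & L_{54} & L_{55} \end{pmatrix}$

  58. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) \propto \begin{pmatrix} L_{11} & L_{12} & L_{13} & L_{14} & L_{15} \\ L_{21} & L_{22} & L_{23} & L_{24} & L_{25} \\ L_{31} & L_{32} & L_{33} & L_{34} & L_{35} \\ L_{41} & L_{42} & L_{43} & L_{44} & L_{45} \\ L_{51} & L_{52} & L_{53} & L_{54} & L_{55} \end{pmatrix}$

  59. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) \propto \begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix}$

  60. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) \propto \det\begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix}$

  61. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) = \det\begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix} \Big/ \cdots$

  62. DETERMINANTAL $\mathcal{P}(\{2,3,5\}) = \det\begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix} \Big/ \det(L + I)$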
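
As a sanity check on the normalizer $\det(L + I)$, the sketch below evaluates $\mathcal{P}(Y) = \det(L_Y) / \det(L + I)$ for the subset $\{2, 3, 5\}$ and sums the probability over all $2^N$ subsets by brute force. It assumes a random positive semidefinite kernel $L = B^\top B$ with $N = 5$; the function name `dpp_prob` is hypothetical, and indices are zero-based in the code.

```python
import itertools
import numpy as np

def dpp_prob(L, Y):
    """P(Y) = det(L_Y) / det(L + I); the empty set has det(L_{}) = 1."""
    Y = list(Y)
    numerator = np.linalg.det(L[np.ix_(Y, Y)]) if Y else 1.0
    return numerator / np.linalg.det(L + np.eye(L.shape[0]))

rng = np.random.default_rng(1)
N = 5
B = rng.standard_normal((4, N))   # illustrative features (assumption)
L = B.T @ B                       # positive semidefinite kernel

# Items {2, 3, 5} on the slide are indices [1, 2, 4] here.
p_235 = dpp_prob(L, [1, 2, 4])

# Brute-force check: probabilities of all 2^N subsets sum to 1.
total = sum(
    dpp_prob(L, Y)
    for r in range(N + 1)
    for Y in itertools.combinations(range(N), r)
)
print(p_235, total)
```

The brute-force sum costs $O(2^N)$ determinants and is only feasible for tiny $N$; the closed form needs a single $N \times N$ determinant.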

  63. EFFICIENT INFERENCE

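  64. EFFICIENT INFERENCE Normalizing: $\mathcal{P}_L(\mathbf{Y} = Y)$

  65. EFFICIENT INFERENCE Normalizing: $\mathcal{P}_L(\mathbf{Y} = Y)$. Marginalizing: $\mathcal{P}(Y \subseteq \mathbf{Y})$

  66. EFFICIENT INFERENCE Normalizing: $\mathcal{P}_L(\mathbf{Y} = Y)$. Marginalizing: $\mathcal{P}(Y \subseteq \mathbf{Y})$. Conditioning: $\mathcal{P}_L(\mathbf{Y} = B \mid A \subseteq \mathbf{Y})$, $\mathcal{P}_L(\mathbf{Y} = B \mid A \cap \mathbf{Y} = \emptyset)$

  67. EFFICIENT INFERENCE Normalizing: $\mathcal{P}_L(\mathbf{Y} = Y)$. Marginalizing: $\mathcal{P}(Y \subseteq \mathbf{Y})$. Conditioning: $\mathcal{P}_L(\mathbf{Y} = B \mid A \subseteq \mathbf{Y})$, $\mathcal{P}_L(\mathbf{Y} = B \mid A \cap \mathbf{Y} = \emptyset)$. Sampling: $Y \sim \mathcal{P}_L$

  68. EFFICIENT INFERENCE Normalizing: $\mathcal{P}_L(\mathbf{Y} = Y)$. Marginalizing: $\mathcal{P}(Y \subseteq \mathbf{Y})$. Conditioning: $\mathcal{P}_L(\mathbf{Y} = B \mid A \subseteq \mathbf{Y})$, $\mathcal{P}_L(\mathbf{Y} = B \mid A \cap \mathbf{Y} = \emptyset)$. Sampling: $Y \sim \mathcal{P}_L$. All in $O(N^3)$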
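
The marginalizing step can be illustrated with the marginal kernel $K = L(L + I)^{-1}$, for which $\mathcal{P}(A \subseteq \mathbf{Y}) = \det(K_A)$. This identity is standard in the DPP literature but is not spelled out on the slides, so treat the sketch as an illustration under that assumption. The kernel is random and the chosen set `A` is arbitrary.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
N = 5
B = rng.standard_normal((3, N))      # illustrative features (assumption)
L = B.T @ B
I = np.eye(N)
Z = np.linalg.det(L + I)             # normalizer
K = L @ np.linalg.inv(L + I)         # marginal kernel, one O(N^3) inverse

def prob(Y):
    """P(Y = Y) = det(L_Y) / det(L + I)."""
    Y = list(Y)
    return (np.linalg.det(L[np.ix_(Y, Y)]) if Y else 1.0) / Z

A = [0, 3]                           # arbitrary set of items to include

# Closed form: P(A subset of Y) = det(K_A).
marginal_closed_form = np.linalg.det(K[np.ix_(A, A)])

# Brute force: add up P(Y) over every subset Y that contains A.
marginal_brute_force = sum(
    prob(Y)
    for r in range(N + 1)
    for Y in itertools.combinations(range(N), r)
    if set(A) <= set(Y)
)
print(marginal_closed_form, marginal_brute_force)
```

The diagonal of `K` gives the single-item inclusion probabilities, which is often the first thing inspected when debugging a kernel.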

  69. LARGE-SCALE SETTINGS

  70. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010)

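  71. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010) $L = \begin{pmatrix} B_1^\top \\ B_2^\top \\ B_3^\top \\ \vdots \\ B_N^\top \end{pmatrix} \begin{pmatrix} B_1 & B_2 & B_3 & \cdots & B_N \end{pmatrix}$

  72. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010) $L = \begin{pmatrix} B_1^\top \\ B_2^\top \\ B_3^\top \\ \vdots \\ B_N^\top \end{pmatrix} \begin{pmatrix} B_1 & B_2 & B_3 & \cdots & B_N \end{pmatrix}$, an $N \times N$ matrix

  73. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010) $C = \begin{pmatrix} B_1 & B_2 & B_3 & \cdots & B_N \end{pmatrix} \begin{pmatrix} B_1^\top \\ B_2^\top \\ B_3^\top \\ \vdots \\ B_N^\top \end{pmatrix}$

  74. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010) $C = \begin{pmatrix} B_1 & B_2 & B_3 & \cdots & B_N \end{pmatrix} \begin{pmatrix} B_1^\top \\ B_2^\top \\ B_3^\top \\ \vdots \\ B_N^\top \end{pmatrix}$

  75. DUAL KERNEL KULESZA AND TASKAR (NIPS 2010) $C = \begin{pmatrix} B_1 & B_2 & B_3 & \cdots & B_N \end{pmatrix} \begin{pmatrix} B_1^\top \\ B_2^\top \\ B_3^\top \\ \vdots \\ B_N^\top \end{pmatrix}$, a $D \times D$ matrix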

  76. DUAL INFERENCE

  77. DUAL INFERENCE $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$

  78. DUAL INFERENCE $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$, $V = B^\top \hat{V} \Lambda^{-1/2}$

  79. DUAL INFERENCE $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$, $V = B^\top \hat{V} \Lambda^{-1/2}$. Normalizing, $\sum_Y \det(L_Y)$: $O(D^3)$

  80. DUAL INFERENCE $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$, $V = B^\top \hat{V} \Lambda^{-1/2}$. Normalizing, $\sum_Y \det(L_Y)$: $O(D^3)$. Marginalizing & Conditioning: $O(D^3 + D^2 k^2)$

  81. DUAL INFERENCE $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$, $V = B^\top \hat{V} \Lambda^{-1/2}$. Normalizing, $\sum_Y \det(L_Y)$: $O(D^3)$. Marginalizing & Conditioning: $O(D^3 + D^2 k^2)$. Sampling, $Y \sim \mathcal{P}_L$: $O(N D^2 k)$
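
The dual relations can be verified on a small example. The sketch below assumes $B$ is a random $D \times N$ matrix with $D \ll N$ (so $C$ is full rank and $\Lambda^{-1/2}$ exists), builds $L = B^\top B$ and $C = B B^\top$, and checks that $V = B^\top \hat{V} \Lambda^{-1/2}$ gives eigenvectors of $L$ and that the normalizer can be computed from the $D \times D$ matrix alone.

```python
import numpy as np

rng = np.random.default_rng(3)
D, N = 3, 50                       # few features, many items (assumption)
B = rng.standard_normal((D, N))    # columns are the feature vectors B_i

L = B.T @ B                        # N x N primal kernel
C = B @ B.T                        # D x D dual kernel

# Eigendecomposition of the small matrix: C = V_hat diag(lam) V_hat^T.
lam, V_hat = np.linalg.eigh(C)

# V = B^T V_hat Lambda^{-1/2}; dividing by sqrt(lam) scales each column.
V = B.T @ V_hat / np.sqrt(lam)

# The columns of V are orthonormal eigenvectors of L with eigenvalues lam.
is_eigen = np.allclose(L @ V, V * lam)
is_orthonormal = np.allclose(V.T @ V, np.eye(D))

# Normalizer two ways: det(L + I) is O(N^3), det(C + I) is O(D^3).
Z_primal = np.linalg.det(L + np.eye(N))
Z_dual = np.linalg.det(C + np.eye(D))
print(is_eigen, is_orthonormal, Z_primal, Z_dual)
```

Both normalizers agree because $L$ and $C$ share their nonzero eigenvalues, and every remaining eigenvalue of $L$ is zero and contributes a factor of 1 to $\det(L + I)$.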

  82. EXPONENTIAL N

  83. EXPONENTIAL N We want to select a diverse set of parses. $N = O(\{\text{sentence length}\}^{\{\text{sentence length}\}})$

  84. EXPONENTIAL N We want to select a diverse set of parses. $N = O(\{\text{sentence length}\}^{\{\text{sentence length}\}})$, $N = O(\{\text{node degree}\}^{\{\text{path length}\}})$

  85. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i =$

  86. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i =$; $B_i = q(i)\,\phi(i)$, with $q$ the quality and $\phi$ the similarity features

  87. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i = \{i_\alpha\}_{\alpha \in F}$, $c = 1$; $B_i = q(i)\,\phi(i)$, with $q$ the quality and $\phi$ the similarity features

  88. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i = \{i_\alpha\}_{\alpha \in F}$, $c = 2$; $B_i = q(i)\,\phi(i)$, with $q$ the quality and $\phi$ the similarity features

  89. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i = \{i_\alpha\}_{\alpha \in F}$, $c = 2$; $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \phi(i)$

  90. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i = \{i_\alpha\}_{\alpha \in F}$, $c = 2$; $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \left[ \sum_{\alpha \in F} \phi(i_\alpha) \right]$

  91. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $i = \{i_\alpha\}_{\alpha \in F}$, $c = 2$; $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \left[ \sum_{\alpha \in F} \phi(i_\alpha) \right]$. Sampling, $Y \sim \mathcal{P}_L$: $O(N D^2 k)$

  92. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $M =$, $R =$, $c = 2$; $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \left[ \sum_{\alpha \in F} \phi(i_\alpha) \right]$. Sampling, $Y \sim \mathcal{P}_L$: $O(D^2 k^3 + D k^2 M^c R)$

  93. STRUCTURE FACTORIZATION KULESZA AND TASKAR (NIPS 2010) $M =$, $R =$, $c = 2$; $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \left[ \sum_{\alpha \in F} \phi(i_\alpha) \right]$. Sampling, $Y \sim \mathcal{P}_L$: $O(D^2 k^3 + D k^2 M^c R)$. $M^c R = 4^2 \cdot 12 = 192 \ll N = 4^{12} = 16{,}777{,}216$
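
The size comparison on the last structure-factorization slide can be reproduced directly. The sketch below reads $M$ as the number of values each part of a structure can take and $R$ as the number of parts; that reading is an interpretation, since the slides define both pictorially, but it is consistent with $N = 4^{12}$.

```python
# Values taken from the slide: M = 4, c = 2, R = 12.
M = 4    # assumed meaning: values each part can take
c = 2    # maximum factor size
R = 12   # assumed meaning: number of parts in a structure

structured_term = M**c * R   # the M^c R factor in O(D^2 k^3 + D k^2 M^c R)
num_structures = M**R        # N, the number of possible structures

print(structured_term)                    # 192
print(num_structures)                     # 16777216
print(num_structures // structured_term)  # 87381
```

The structured sampler therefore touches a term roughly 87,000 times smaller than $N$ in this example, which is the point of the factorization.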

  94. LARGE FEATURE SETS?

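  95. LARGE FEATURE SETS? Columns, $N$ = # of items: large, exponential. Rows, $D$ = # of features. Small $D$: dual (large $N$), dual + structure (exponential $N$)

  96. LARGE FEATURE SETS? Columns, $N$ = # of items: large, exponential. Rows, $D$ = # of features. Small $D$: dual (large $N$), dual + structure (exponential $N$). Large $D$: ? (large $N$), ? (exponential $N$)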

  97. RANDOM PROJECTIONS GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012)

  98. RANDOM PROJECTIONS GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012) $\Phi$: $D \times N$

  99. RANDOM PROJECTIONS GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012) $\Phi$: $D \times M^c R$

  100. RANDOM PROJECTIONS GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012) $(d \times D)$ projection $\times$ $\Phi$ $(D \times M^c R)$
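
A random projection of the feature matrix can be sketched as follows. This assumes a Gaussian projection matrix with entries of variance $1/d$, a common choice that may differ from the construction in the cited paper; the sizes $D$ and $d$ and the random $\Phi$ are illustrative, with 192 columns chosen to match the $M^c R$ example.

```python
import numpy as np

rng = np.random.default_rng(4)
D, d, n_cols = 2000, 50, 192     # illustrative sizes (assumption)

Phi = rng.standard_normal((D, n_cols))        # D x (M^c R) feature matrix
G = rng.standard_normal((d, D)) / np.sqrt(d)  # d x D Gaussian projection

Phi_small = G @ Phi                           # d x (M^c R) projected features

# Column norms are approximately preserved; the ratios concentrate around 1.
ratios = np.linalg.norm(Phi_small, axis=0) / np.linalg.norm(Phi, axis=0)
print(Phi_small.shape, np.median(ratios))
```

After projection the dual kernel is $d \times d$ instead of $D \times D$, so the $D$-dependent terms in the dual inference costs shrink accordingly, at the price of an approximation error that falls as $d$ grows.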
