Truly Subcubic Algorithms for Language Edit Distance and RNA Folding

Truly Subcubic Algorithms for Language Edit Distance and RNA Folding via Fast Bounded-Difference Min-Plus Product. Karl Bringmann, Fabrizio Grandoni, Barna Saha, Virginia Vassilevska Williams. June 11, 2017.


  1. Truly Subcubic Algorithms for Language Edit Distance and RNA Folding via Fast Bounded-Difference Min-Plus Product. Karl Bringmann, Fabrizio Grandoni, Barna Saha, Virginia Vassilevska Williams. June 11, 2017.

  2. Bounded Differences (BD) Matrices. An integer matrix $M$ has bounded differences (BD) if for all $i, j$: $|M[i,j] - M[i,j+1]| \le 1$ and $|M[i,j] - M[i+1,j]| \le 1$. Example matrix from the slide, row by row: (2 2 3 2), (1 1 2 3), (2 1 2 3), (1 0 1 2). More generally, $M$ is $W$-BD when these differences are at most $W$.
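The definition can be checked mechanically. The following is a minimal sketch, not code from the talk; the function name `is_bounded_difference` is hypothetical, and the 4×4 example assumes that the digits on the slide are meant to be read as the rows of one matrix.

```python
def is_bounded_difference(M, w=1):
    """Return True if horizontally and vertically adjacent entries of M
    differ by at most w (the W-BD property with W = w)."""
    rows = len(M)
    cols = len(M[0]) if rows else 0
    for i in range(rows):
        for j in range(cols):
            # Compare with the right neighbour, if there is one.
            if j + 1 < cols and abs(M[i][j] - M[i][j + 1]) > w:
                return False
            # Compare with the neighbour below, if there is one.
            if i + 1 < rows and abs(M[i][j] - M[i + 1][j]) > w:
                return False
    return True


# Digits from the slide, read as rows (an assumption about the layout).
example = [
    [2, 2, 3, 2],
    [1, 1, 2, 3],
    [2, 1, 2, 3],
    [1, 0, 1, 2],
]
print(is_bounded_difference(example))
```

Passing `w=2` (or any larger bound) checks the more general W-BD property.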

  3. (min,+) Product. For $n \times n$ matrices $A, B$, their (min,+) product $C = A * B$ is defined by $C[i,j] = \min_k (A[i,k] + B[k,j])$. The (min,+) product is equivalent to All Pairs Shortest Paths [Fischer, Meyer '71]. Trivial algorithm: $O(n^3)$. Best known algorithm: $n^3 / 2^{\Omega(\sqrt{\log n})}$ [Williams '14]. For comparison, standard matrix multiplication, $C[i,j] = \sum_k A[i,k] \cdot B[k,j]$, takes time $O(n^\omega)$ where $\omega \le 2.373$.
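For reference, the trivial cubic algorithm is just three nested loops. This is a minimal sketch; the name `min_plus_product` and the use of `float("inf")` for missing entries are choices made here, not something taken from the slides.

```python
INF = float("inf")


def min_plus_product(A, B):
    """Naive (min,+) product: C[i][j] = min over k of A[i][k] + B[k][j]."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[INF] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            if a == INF:
                continue  # an infinite entry can never improve the minimum
            for j in range(p):
                candidate = a + B[k][j]
                if candidate < C[i][j]:
                    C[i][j] = candidate
    return C


A = [[0, 3], [2, 0]]
B = [[1, 5], [0, 1]]
print(min_plus_product(A, B))
```

On $n \times n$ inputs this performs $n^3$ additions and comparisons, which is the $O(n^3)$ baseline that the rest of the talk tries to beat for structured matrices.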

  4. (min,+) Product. For $n \times n$ matrices $A, B$, their (min,+) product $C = A * B$ is defined by $C[i,j] = \min_k (A[i,k] + B[k,j])$. The (min,+) product is equivalent to All Pairs Shortest Paths [Fischer, Meyer '71]. Trivial algorithm: $O(n^3)$. Best known algorithm: $n^3 / 2^{\Omega(\sqrt{\log n})}$ [Williams '14]. Big Open Problem: Is (min,+) product in time $O(n^{3-\varepsilon})$ for some $\varepsilon > 0$? Study special cases!

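  5. (min,+) Product for Structured Matrices. Matrices with small entries [Alon, Galil, Margalit '97]: If $A, B$ have entries in $\{-T, \ldots, T\} \cup \{\infty\}$, then $A * B$ can be computed in time $\tilde{O}(T n^\omega)$. Sketch: define $A'[i,j] = x^{A[i,j]}$ and $B'[i,j] = x^{B[i,j]}$, and compute the standard product $C'[i,j] = \sum_k A'[i,k] \cdot B'[k,j]$ over polynomials in $x$. Since $C[i,j] = \min_k (A[i,k] + B[k,j])$, the entry $C[i,j]$ is the degree of the lowest-degree monomial in $C'[i,j]$. (The slide says "highest"; with the encoding $x^{A[i,j]}$ the minimum corresponds to the lowest degree.)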
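The polynomial encoding can be tried out with exact integers. The sketch below is an illustration under stated assumptions rather than the algorithm of [Alon, Galil, Margalit '97] as published: it evaluates the polynomials at $x = n + 1$, shifts every exponent by the entry bound (called `T` here) to keep it non-negative, and uses a naive integer matrix product where the real algorithm would use fast matrix multiplication. The function name is hypothetical.

```python
INF = float("inf")


def min_plus_small_entries(A, B, T):
    """(min,+) product of n x n matrices whose finite entries are integers
    in {-T, ..., T}, via one ordinary integer matrix product."""
    n = len(A)
    base = n + 1  # each coefficient counts at most n terms, so no carries occur

    def encode(M):
        # Entry v becomes x^(v + T) evaluated at x = base; INF becomes 0.
        return [[0 if v == INF else base ** (v + T) for v in row] for row in M]

    A1, B1 = encode(A), encode(B)
    C = [[INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Ordinary (+, *) product; this is where fast matrix
            # multiplication would be plugged in.
            total = sum(A1[i][k] * B1[k][j] for k in range(n))
            if total == 0:
                continue  # every term was infinite
            # Position of the lowest non-zero base-(n+1) digit
            # = degree of the lowest-degree monomial.
            degree = 0
            while total % base == 0:
                total //= base
                degree += 1
            C[i][j] = degree - 2 * T  # undo the two shifts by T
    return C


A = [[0, -1], [INF, 2]]
B = [[1, INF], [-2, 0]]
print(min_plus_small_entries(A, B, T=2))
```

The numbers involved have about $T \log n$ bits, which is one way to see where the factor $T$ in the running time $\tilde{O}(T n^\omega)$ comes from.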

  6. (min,+) Product for Structured Matrices. Matrices with small entries [Alon, Galil, Margalit '97]: If $A, B$ have entries in $\{-T, \ldots, T\} \cup \{\infty\}$, then $A * B$ can be computed in time $\tilde{O}(T n^\omega)$. Matrices with few distinct entries [Yuster '09]: If each row of $A$ has a small number of distinct entries, then for arbitrary $B$ we can compute $A * B$ in truly subcubic time. Question: Is (min,+) product in time $O(n^{3-\varepsilon})$ for BD matrices? Why care about BD matrices?

  7. 1st Application: Language Edit Distance (LED). For simplicity: $|G| = O(1)$. CFG Parsing: Given a context-free grammar $G$ and a string $s$ of length $n$, is $s$ in $L(G)$? This is in time $\tilde{O}(n^\omega)$ [L. Valiant '75]. Language Edit Distance ("error-correcting CFG parsing"): Given a CFG $G$ and a string $s$, compute the minimum edit distance (insertions, deletions, substitutions) of $s$ to any string in $L(G)$. This is in time $O(n^3)$ [Aho, Peterson '72]. We show, using Valiant's approach (~8 page proof): If (min,+) product on BD matrices is in time $O(n^c)$, then LED is in time $\tilde{O}(n^c)$. Intuitive reason for BD: LED($s$) and LED($s\sigma$) differ by $\le 1$ for any symbol $\sigma$.

  8. 2nd Application: RNA Folding. RNA can be seen as a sequence of symbols from {A, C, G, U}. Biologists want to predict the secondary structure of RNA: A can pair with U, and C can pair with G. Given an RNA sequence, find the largest set of matching pairs such that no two pairs intersect. (The slide draws two pairings of AUUGCAG, one with crossing pairs, which is not allowed, and one without, which is okay.) This is in time $O(n^3)$ [Nussinov, Jacobson '80], and it can be cast as a LED problem (without substitutions). If (min,+) product on BD matrices is in time $O(n^c)$, then RNA Folding is in time $\tilde{O}(n^c)$. Disclaimer: No author of this paper is a biologist.
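The cubic baseline can be written as a short interval dynamic program. This is a minimal sketch in the spirit of the $O(n^3)$ algorithm cited on the slide, not a transcription of it; the function name is hypothetical, and it assumes that every A-U and C-G pair is allowed, with no minimum distance between paired positions.

```python
def max_noncrossing_pairs(rna):
    """Largest number of non-crossing A-U / C-G pairs in the sequence."""
    allowed = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = optimum for the substring rna[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Case 1: position j stays unpaired.
            value = best[i][j - 1]
            # Case 2: position j pairs with some k in [i, j); the pair splits
            # the interval into an independent left part and inside part.
            for k in range(i, j):
                if (rna[k], rna[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    inside = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    value = max(value, left + 1 + inside)
            best[i][j] = value
    return best[0][n - 1]


print(max_noncrossing_pairs("AUUGCAG"))
```

There are $O(n^2)$ intervals and $O(n)$ split points per interval, which gives the cubic running time that the talk improves on.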

  9. 3rd Application: Optimal Stack Generation. For simplicity: $|\Sigma| = O(1)$. Optimal Stack Generation: Given a string $s$ over alphabet $\Sigma$, determine the shortest sequence of stack operations push(.), emit, pop such that performing these operations starting from an empty stack will emit $s$ and end with an empty stack. Example: for $s$ = bab, the sequence push(b), emit, push(a), emit, pop, emit, pop emits b, a, b. This is in time $O(n^3)$ (dynamic programming) [Tarjan '05]. We show: If (min,+) product on $O(1)$-BD matrices is in time $O(n^c)$, then Optimal Stack Generation is in time $\tilde{O}(n^c)$. Intuitive reason for BD: OSG($s$) and OSG($s\sigma$) differ by $\le 3$ for any $\sigma \in \Sigma$.
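One possible cubic dynamic program for this problem is sketched below. It is an illustration written for this transcript, not necessarily the formulation of [Tarjan '05]; the function name and the table `H` are hypothetical. It assumes that emit outputs the symbol currently on top of the stack, as in the slide's example.

```python
def optimal_stack_generation(s):
    """Length of a shortest push/emit/pop sequence that emits s, starting
    and ending with an empty stack (emit outputs the top of the stack)."""
    n = len(s)
    if n == 0:
        return 0
    INF = float("inf")
    # H[i][j]: fewest operations to emit s[i..j] when s[i] is already on top
    # of the stack, ending with exactly the same stack as at the start.
    H = [[INF] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = 1  # a single emit
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best = INF
            for k in range(i, j):
                # Emit s[i..k], then handle s[k+1..j] in its own
                # push(s[k+1]) ... pop block on top of s[i].
                candidate = H[i][k] + 2 + H[k + 1][j]
                # If s[k+1] equals the current top s[i], no push/pop is needed.
                if s[k + 1] == s[i]:
                    candidate = min(candidate, H[i][k] + H[k + 1][j])
                best = min(best, candidate)
            H[i][j] = best
    # Two extra operations: the initial push(s[0]) and the final pop.
    return 2 + H[0][n - 1]


print(optimal_stack_generation("bab"))
```

For bab the table yields 7 operations, matching the sequence on the slide: the two b's share one stack entry, and a is pushed and popped in between.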

  10. Main Result. So we have seen that (min,+) product of BD matrices is well motivated. Main Result: We can compute the (min,+) product of BD matrices in randomized time $O(n^{2.83})$ and deterministic time $O(n^{2.87})$ (in this talk: $O(n^{2.9})$). Generalization: For a $W$-BD matrix $A$ with $W \ll n^{3-\omega} \approx n^{0.626}$ and arbitrary $B$, we can compute their (min,+) product in randomized truly subcubic time.

  11. Algorithm Sketch. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$. Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$ in time $O(n^{2.6})$: compute $C[i,j]$ exactly for all $i, j$ that are multiples of $n^{0.2}$, then set $D[i,j]$ to some $C[i',j']$ by rounding $i, j$ to such multiples $i', j'$. This works because if $A, B$ are BD, then their (min,+) product is also BD, and $(i,j)$ is within distance $n^{0.2}$ of $(i',j')$ in each coordinate.
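Step 1 can be illustrated on small inputs. The sketch below is an assumption-laden toy version: the grid spacing is an explicit parameter `g` rather than $n^{0.2}$, indices are rounded down to the grid, and `random_bd_matrix` is a hypothetical helper that produces only one simple family of BD matrices. Since the product of BD matrices is BD, each entry of the result is off by at most $2(g-1)$.

```python
import random


def random_bd_matrix(n, rng):
    # Hypothetical test helper: M[i][j] = r[i] + c[j], where r and c are
    # walks with steps in {-1, 0, 1}, so adjacent entries differ by at most 1.
    r, c = [0], [0]
    for _ in range(n - 1):
        r.append(r[-1] + rng.choice((-1, 0, 1)))
        c.append(c[-1] + rng.choice((-1, 0, 1)))
    return [[r[i] + c[j] for j in range(n)] for i in range(n)]


def min_plus_entry(A, B, i, j):
    """Exact value of one entry of the (min,+) product, in O(n) time."""
    return min(A[i][k] + B[k][j] for k in range(len(B)))


def approximate_product(A, B, g):
    """Evaluate the product exactly on a grid with spacing g and copy each
    grid value to the g x g block to its lower right."""
    n = len(A)
    grid = {}
    for i in range(0, n, g):
        for j in range(0, n, g):
            grid[i, j] = min_plus_entry(A, B, i, j)
    # Round every index down to the nearest multiple of g.
    return [[grid[i - i % g, j - j % g] for j in range(n)] for i in range(n)]


rng = random.Random(0)
A = random_bd_matrix(12, rng)
B = random_bd_matrix(12, rng)
D = approximate_product(A, B, g=4)
worst = max(abs(D[i][j] - min_plus_entry(A, B, i, j))
            for i in range(12) for j in range(12))
print(worst <= 2 * (4 - 1))
```

Only $(n/g)^2$ entries are computed exactly, each in $O(n)$ time; with $g = n^{0.2}$ this gives the $O(n^{2.6})$ bound stated on the slide.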

  12. Algorithm Sketch. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$. Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$. Then $A[i,k] + B[k,j] = C[i,j]$ implies $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$; call these triples $(i,k,j)$ relevant. Then $C[i,j] = \min_{k : (i,k,j)\ \text{relevant}} (A[i,k] + B[k,j])$.

  13. Algorithm Sketch. $(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$. Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$. Step 2) Cover most relevant triples: fix $i^*, j^*$ and define matrices $A^*, B^*$ by $A^*[i,k] := (A[i,k] + B[k,j^*] - D[i,j^*]) - (A[i^*,k] + B[k,j^*] - D[i^*,j^*])$ and $B^*[k,j] := A[i^*,k] + B[k,j] - D[i^*,j]$. The (min,+) product $C^*$ of $A^*, B^*$ satisfies $C^*[i,j] = \min_k (A^*[i,k] + B^*[k,j]) = C[i,j] - D[i,j^*] + D[i^*,j^*] - D[i^*,j]$, where the terms involving $D$ can be cancelled afterwards.
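The identity for $C^*$ follows by adding the two definitions and cancelling, as the following short derivation shows. It uses $A, B$ for the inputs, $D$ for the approximation, and $i^*, j^*$ for the fixed indices, which is the notation assumed throughout this transcript.

```latex
\begin{align*}
A^*[i,k] + B^*[k,j]
  &= \bigl(A[i,k] + B[k,j^*] - D[i,j^*]\bigr)
   - \bigl(A[i^*,k] + B[k,j^*] - D[i^*,j^*]\bigr)
   + \bigl(A[i^*,k] + B[k,j] - D[i^*,j]\bigr) \\
  &= A[i,k] + B[k,j] - D[i,j^*] + D[i^*,j^*] - D[i^*,j].
\end{align*}
```

The terms $B[k,j^*]$ and $A[i^*,k]$ cancel, and the remaining correction does not depend on $k$. Taking the minimum over $k$ on both sides therefore gives $C^*[i,j] = C[i,j] - D[i,j^*] + D[i^*,j^*] - D[i^*,j]$.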

  14. Algorithm Sketch. $(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$. Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$. Step 2) Cover most relevant triples: fix $i^*, j^*$ and define matrices $A^*, B^*$ by $A^*[i,k] := (A[i,k] + B[k,j^*] - D[i,j^*]) - (A[i^*,k] + B[k,j^*] - D[i^*,j^*])$ and $B^*[k,j] := A[i^*,k] + B[k,j] - D[i^*,j]$. If $(i,k,j^*)$, $(i^*,k,j^*)$, $(i^*,k,j)$ are all relevant, then $|A^*[i,k]|, |B^*[k,j]| = O(n^{0.2})$. Set all $\Omega(n^{0.2})$-entries of $A^*, B^*$ to $\infty$; then the (min,+) product of $A^*$ and $B^*$ can be computed in time $\tilde{O}(n^{\omega+0.2})$. A triple $(i,k,j)$ is "covered" if $A^*[i,k]$ and $B^*[k,j]$ are $O(n^{0.2})$, i.e., not set to $\infty$.

  15. Algorithm Sketch. $(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$. $(i,k,j)$ is "covered" if $A^*[i,k]$ and $B^*[k,j]$ are $O(n^{0.2})$ in some round. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$.
     Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$.
     Step 2) Cover most relevant triples: initialize $\hat{C}[i,j] := \infty$, then repeat for $O(n^{0.3} \log n)$ rounds, i.e. $\tilde{O}(n^{0.3})$ iterations:
       pick $i^*, j^*$ randomly;
       $A^*[i,k] := (A[i,k] + B[k,j^*] - D[i,j^*]) - (A[i^*,k] + B[k,j^*] - D[i^*,j^*])$;
       $B^*[k,j] := A[i^*,k] + B[k,j] - D[i^*,j]$;
       set all $\Omega(n^{0.2})$-entries of $A^*, B^*$ to $\infty$;
       compute the (min,+) product $C^* = A^* * B^*$ in time $\tilde{O}(n^{\omega+0.2}) = O(n^{2.6})$;
       $\hat{C}[i,j] := \min\{\hat{C}[i,j],\ C^*[i,j] + D[i,j^*] - D[i^*,j^*] + D[i^*,j]\}$.
     Total time: $\tilde{O}(n^{2.9})$.
     Lemma: After $O(n^\rho \log n)$ rounds there are $\tilde{O}(n^{3-\rho/3} + n^{2.5})$ uncovered relevant triples w.h.p.; for $\rho = 0.3$ this is $\tilde{O}(n^{2.9})$.

  16. Algorithm Sketch. $(i,k,j)$ relevant: $|A[i,k] + B[k,j] - D[i,j]| \le O(n^{0.2})$. $(i,k,j)$ is "covered" if $A^*[i,k]$ and $B^*[k,j]$ are $O(n^{0.2})$ in some round. Input: BD matrices $A, B$. Want: $C[i,j] = \min_k (A[i,k] + B[k,j])$. Step 1) Compute an approximation $D[i,j] = C[i,j] \pm O(n^{0.2})$. Step 2) Cover most relevant triples. Step 3) Enumerate uncovered relevant triples: "for each uncovered relevant $(i,k,j)$: $\hat{C}[i,j] := \min\{\hat{C}[i,j],\ A[i,k] + B[k,j]\}$". Now $\hat{C}$ is the correct output.
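The three steps can be put together in a small correctness-oriented sketch. It is not the subcubic algorithm itself: the product of the thresholded matrices and the enumeration of uncovered triples are done naively (cubic time, and a set of up to $n^3$ covered triples), whereas the talk uses the fast small-entry product and a more careful enumeration. The grid spacing `g`, the threshold `2 * delta`, the number of rounds, and the helper `random_bd_matrix` are all assumptions made for illustration.

```python
import random

INF = float("inf")


def random_bd_matrix(n, rng):
    # Hypothetical test helper: M[i][j] = r[i] + c[j] with +-1/0 step walks.
    r, c = [0], [0]
    for _ in range(n - 1):
        r.append(r[-1] + rng.choice((-1, 0, 1)))
        c.append(c[-1] + rng.choice((-1, 0, 1)))
    return [[r[i] + c[j] for j in range(n)] for i in range(n)]


def bd_min_plus(A, B, g, rounds, rng):
    n = len(A)
    delta = 2 * (g - 1)  # error bound of the grid approximation for BD inputs
    # Step 1: exact values on a grid with spacing g, copied to whole blocks.
    grid = {(i, j): min(A[i][k] + B[k][j] for k in range(n))
            for i in range(0, n, g) for j in range(0, n, g)}
    D = [[grid[i - i % g, j - j % g] for j in range(n)] for i in range(n)]
    # Step 2: random rounds that cover most relevant triples.
    C_hat = [[INF] * n for _ in range(n)]
    covered = set()
    limit = 2 * delta  # entries larger than this play the role of "infinity"
    for _ in range(rounds):
        i0, j0 = rng.randrange(n), rng.randrange(n)
        A_s = [[(A[i][k] + B[k][j0] - D[i][j0])
                - (A[i0][k] + B[k][j0] - D[i0][j0]) for k in range(n)]
               for i in range(n)]
        B_s = [[A[i0][k] + B[k][j] - D[i0][j] for j in range(n)]
               for k in range(n)]
        # Naive stand-in for the fast product of the thresholded matrices.
        for i in range(n):
            for k in range(n):
                if abs(A_s[i][k]) > limit:
                    continue
                for j in range(n):
                    if abs(B_s[k][j]) > limit:
                        continue
                    covered.add((i, k, j))
                    # Undo the correction terms; equals A[i][k] + B[k][j].
                    value = (A_s[i][k] + B_s[k][j]
                             + D[i][j0] - D[i0][j0] + D[i0][j])
                    C_hat[i][j] = min(C_hat[i][j], value)
    # Step 3: handle every relevant triple that no round covered.
    for i in range(n):
        for k in range(n):
            for j in range(n):
                total = A[i][k] + B[k][j]
                if (i, k, j) not in covered and abs(total - D[i][j]) <= delta:
                    C_hat[i][j] = min(C_hat[i][j], total)
    return C_hat


rng = random.Random(1)
A = random_bd_matrix(10, rng)
B = random_bd_matrix(10, rng)
exact = [[min(A[i][k] + B[k][j] for k in range(10)) for j in range(10)]
         for i in range(10)]
print(bd_min_plus(A, B, g=3, rounds=4, rng=rng) == exact)
```

The output is exact regardless of how the random rounds go: every value written to `C_hat` is some $A[i,k] + B[k,j]$, and the minimizing triple for each $(i,j)$ is relevant, so it is either covered in Step 2 or enumerated in Step 3. The randomness only affects how much work is left for Step 3.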
