
(Nearly) Efficient Algorithms for the Graph Matching Problem - PowerPoint PPT Presentation

(Nearly) Efficient Algorithms for the Graph Matching Problem Tselil Schramm (Harvard/MIT) with Boaz Barak, Chi-Ning Chou, Zhixian Lei & Yueqi Sheng (Harvard) graph matching problem (approximate graph isomorphism) input: two graphs on


  1. (Nearly) Efficient Algorithms for the Graph Matching Problem Tselil Schramm (Harvard/MIT) with Boaz Barak, Chi-Ning Chou, Zhixian Lei & Yueqi Sheng (Harvard)

  2. graph matching problem (approximate graph isomorphism). input: two graphs $G_0, G_1$ on $n$ vertices. goal: find the permutation $\pi$ of vertices that maximizes # shared edges, $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle$.

  3. graph matching problem (approximate graph isomorphism). input: two graphs $G_0, G_1$ on $n$ vertices. goal: find the permutation $\pi$ of vertices that maximizes # shared edges, $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle$. example permutation: # matched = 4.

  4. graph matching problem (approximate graph isomorphism). input: two graphs $G_0, G_1$ on $n$ vertices. goal: find the permutation $\pi$ of vertices that maximizes # shared edges, $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle$.

  5. graph matching problem (approximate graph isomorphism). input: two graphs $G_0, G_1$ on $n$ vertices. goal: find the permutation $\pi$ of vertices that maximizes # shared edges, $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle$. example permutation: # matched = 5.
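
The objective on these slides can be made concrete with a tiny brute-force sketch. The function names and the toy graphs below are illustrative assumptions, not taken from the talk; the permutation is read as mapping vertices of $G_1$ to vertices of $G_0$, and trying all $n!$ permutations is only feasible for very small $n$.

```python
from itertools import permutations

def shared_edges(edges0, edges1, perm):
    """Count edges {u, v} of G1 whose image {perm[u], perm[v]} is an edge of G0."""
    e0 = {frozenset(e) for e in edges0}
    return sum(frozenset((perm[u], perm[v])) in e0 for u, v in edges1)

def brute_force_match(n, edges0, edges1):
    """Try all n! permutations and return the best one with its score."""
    best = max(permutations(range(n)),
               key=lambda p: shared_edges(edges0, edges1, p))
    return best, shared_edges(edges0, edges1, best)

# Toy instance (hypothetical): G0 is a 4-cycle plus a chord, G1 is a 4-cycle.
g0 = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
g1 = [(0, 2), (2, 1), (1, 3), (3, 0)]
perm, matched = brute_force_match(4, g0, g1)
```

On this toy instance the identity permutation matches 3 edges, while the best permutation matches all 4 edges of `g1`.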

  6. computationally hard (of course). NP-hard: reduction from quadratic assignment problem (non-simple graphs) [Lawler'63]. also: reduction from sparse random 3-SAT to approximate version [O'Donnell-Wright-Wu-Zhou'14]

  7. practitioners: undeterred • computational biology [e.g. Singh-Xu-Berger'08] • de-anonymization [e.g. Narayanan-Shmatikov'09] • social networks [e.g. Korula-Lattanzi'14] • image alignment [e.g. Cho-Lee'12] • machine learning [e.g. Cour-Srinivasan-Shi'07] • pattern recognition, e.g. "thirty years of graph matching in pattern recognition" [Conte-Foggia-Sansone-Vento'04]

  8. "robust average-case graph isomorphism". average case: correlated random graphs. structured model: sample $G \sim G(n,p)$; subsample edges w/prob $\gamma$ (independently for each copy) to get $G_0, G_1$, each with avg. degree $p\gamma \cdot n$; apply a random permutation $\pi$. then $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle \approx p\gamma^2 \cdot \binom{n}{2}$. [e.g. Pedarsani-Grossglauser'11, Lyzinski-Fishkind-Priebe'14, Korula-Lattanzi'14]

  9. "robust average-case graph isomorphism". average case: correlated random graphs. structured model vs. "null" model. structured: sample $G \sim G(n,p)$; subsample edges w/prob $\gamma$ to get $G_0, G_1$, each with avg. degree $p\gamma \cdot n$; permute by $\pi$; # shared edges $\approx p\gamma^2 \cdot \binom{n}{2}$.

  10. "robust average-case graph isomorphism". average case: correlated random graphs. structured model: sample $G \sim G(n,p)$; subsample edges w/prob $\gamma$ to get $G_0, G_1$, each with avg. degree $p\gamma \cdot n$; permute by $\pi$; $\max_\pi \langle A_{G_0}, \pi(A_{G_1}) \rangle \approx p\gamma^2 \cdot \binom{n}{2}$. "null" model: sample $G_0, G_1 \sim G(n, p\gamma)$ independently, each with avg. degree $p\gamma \cdot n$; # shared edges $\approx (p\gamma)^2 \cdot \binom{n}{2}$.
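
The two models can be sampled with a few lines of Python. This is a minimal sketch under assumptions: the function names are hypothetical, edges are stored as sorted vertex pairs, and `perm[u]` is taken to be the label that vertex `u` of the base graph receives in $G_1$.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Erdos-Renyi G(n, p): keep each of the C(n, 2) pairs with probability p."""
    return {e for e in combinations(range(n), 2) if rng.random() < p}

def subsample(edges, gamma, rng):
    """Keep each edge independently with probability gamma."""
    return {e for e in edges if rng.random() < gamma}

def relabel(edges, perm):
    """Apply the vertex relabeling u -> perm[u], keeping pairs sorted."""
    return {tuple(sorted((perm[u], perm[v]))) for u, v in edges}

def sample_structured(n, p, gamma, rng):
    """Correlated pair: two independent subsamples of one base graph, one relabeled."""
    base = sample_gnp(n, p, rng)
    g0 = subsample(base, gamma, rng)
    perm = list(range(n))
    rng.shuffle(perm)
    g1 = relabel(subsample(base, gamma, rng), perm)
    return g0, g1, perm

def sample_null(n, p, gamma, rng):
    """Independent pair with the same edge density p * gamma."""
    return sample_gnp(n, p * gamma, rng), sample_gnp(n, p * gamma, rng)
```

With `gamma = 1.0` the structured pair is exactly isomorphic, which gives a quick sanity check of the relabeling convention.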

  11. information theoretic limit (structured model: $G \sim G(n,p)$, subsampled w/prob $\gamma$ into $G_0, G_1$, permuted by $\pi$). for which $p, \gamma$ can we recover $\pi$? Theorem [Cullina-Kiyavash'16&17]: iff $p\gamma^2 > \frac{\log n}{n}$, with high probability $\pi$ is the unique maximizing permutation.

  12. algorithms for robust average case? average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-$k$ neighborhoods in $G_0$ and $G_1$?

  13. algorithms for robust average case? average-case graph isomorphism algorithms fail. e.g. matching local neighborhoods: match radius-$k$ neighborhoods in $G_0$ and $G_1$? (edges of each graph are subsampled w/prob $\gamma$)

  14. algorithms for robust average case? average-case graph isomorphism algorithms fail. e.g. spectral algorithm: unique entries in the top eigenvector of each of $G_0$, $G_1$ give isomorphism?

  15. algorithms for robust average case? average-case graph isomorphism algorithms fail. e.g. spectral algorithm: unique entries in the top eigenvector of each of $G_0$, $G_1$ give isomorphism? subsampling w/prob $\gamma$ perturbs eigenvectors by $\approx \gamma$

  16. actual algorithms for robust average case?

  17. starting from a seed: $\pi|_S$ known on a seed set $S$ ($S$ in $G_0$, $\pi(S)$ in $G_1$)

  18. starting from a seed: match vertices with similar adjacency into $S$

  19. starting from a seed: match vertices with similar adjacency into $S$, i.e. set $\pi(u)$ to the vertex of $G_1$ whose adjacency into $\pi(S)$ resembles that of $u$ into $S$. iff seed size $\geq \Omega(n^{\epsilon})$, the seeded algorithm approximately recovers $\pi$ [Yartseva-Grossglauser'13]. need $2^{\tilde{O}(n^{\epsilon})}$ time to guess a seed.
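
The seeded idea can be illustrated with a simplified greedy sketch. This is not the Yartseva-Grossglauser algorithm; it is an assumed one-round variant that scores each unmatched pair $(u, v)$ by how many seeds agree on adjacency and then matches greedily. All names and the toy graph are hypothetical.

```python
def adjacency(n, edges):
    """Adjacency sets for an undirected graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def seeded_match(n, edges0, edges1, seed):
    """seed maps seed vertices of G0 to their known images in G1."""
    adj0, adj1 = adjacency(n, edges0), adjacency(n, edges1)
    seed_images = set(seed.values())
    free0 = [u for u in range(n) if u not in seed]
    free1 = [v for v in range(n) if v not in seed_images]
    scored = []
    for u in free0:
        for v in free1:
            # Number of seeds s on which "u ~ s in G0" agrees with "v ~ pi(s) in G1".
            agree = sum((s in adj0[u]) == (t in adj1[v]) for s, t in seed.items())
            scored.append((agree, u, v))
    scored.sort(key=lambda item: -item[0])
    match, taken = dict(seed), set(seed_images)
    for _, u, v in scored:  # greedy: best-agreeing pairs first
        if u not in match and v not in taken:
            match[u] = v
            taken.add(v)
    return match

# Toy instance (hypothetical): G1 is an exact relabeled copy of G0.
true_perm = [2, 0, 1, 5, 3, 4]
g0 = [(3, 0), (4, 1), (4, 2), (5, 0), (5, 1), (5, 2), (3, 4)]
g1 = [(true_perm[u], true_perm[v]) for u, v in g0]
recovered = seeded_match(6, g0, g1, seed={0: 2, 1: 0, 2: 1})
```

In this noiseless toy case the three non-seed vertices have distinct adjacency patterns into the seed, so the greedy pass recovers `true_perm`; with subsampling noise a much larger seed is needed, as the slide states.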

  20. our results (structured model: $G \sim G(n,p)$, subsampled w/prob $\gamma$ into $G_0, G_1$, permuted by $\pi$). Theorem: For any $\epsilon > 0$, if $p \in \left[\frac{n^{o(1)}}{n}, \frac{n^{1/153}}{n}\right] \cup \left[\frac{n^{2/3}}{n}, \frac{n^{1-\epsilon}}{n}\right]$ and $\gamma = \Omega(1)$,* there is an $n^{O(\log n)}$ time algorithm that recovers $\pi$ on $n - o(n)$ of the vertices w/prob $\geq 0.99$. *we allow $\gamma = \Omega\left(\frac{1}{\log\log n}\right)$. [figure: axis of average degrees in $G(n,p)$, with marks $\log n$, $n^{o(1)}$, $n^{1/153}$, $n^{1/3}$, $n^{1/2}$, $n^{3/5}$, $n^{2/3}$, $n^{1-\epsilon}$, $n$]

  21. our results (structured model: $G \sim G(n,p)$, subsampled w/prob $\gamma$ into $G_0, G_1$, permuted by $\pi$). Theorem: For any $\epsilon > 0$, if $p \in \left[\frac{n^{o(1)}}{n}, \frac{n^{1/153}}{n}\right] \cup \left[\frac{n^{2/3}}{n}, \frac{n^{1-\epsilon}}{n}\right]$ and $\gamma = \Omega(1)$,* there is an $n^{O(\log n)}$ time algorithm that recovers $\pi$ on $n - o(n)$ of the vertices w/prob $\geq 0.99$. *we allow $\gamma \geq \frac{1}{\log^{o(1)} n}$. Theorem (hypothesis testing): If $p, \gamma$ are as above then there is a poly($n$) time distinguishing algorithm for the structured vs null ($G_0, G_1 \sim G(n, p\gamma)$) distributions.

  22. our approach: small subgraphs hypothesis testing: correlation of subgraph counts recovery: match rare subgraphs seedless algorithms!

  23. outline • distinguishing/hypothesis testing • recovery • concluding

  24. outline • distinguishing/hypothesis testing • recovery • concluding

  25. distinguishing/hypothesis testing. Given $G_0, G_1$ sampled equally likely from structured ($G \sim G(n,p)$, subsampled w/prob $\gamma$, permuted by $\pi$) or null ($G_0, G_1 \sim G(n, p\gamma)$), decide w/prob $1 - o(1)$ from which. brute force: is there a $\pi$ with $\geq p\gamma^2 \binom{n}{2}$ matched edges?

  26. …counting triangles? $\mathrm{cor}_{K_3}(G_0, G_1) := (\#\,K_3 \text{ in } G_0) \cdot (\#\,K_3 \text{ in } G_1)$. structured ($G \sim G(n,p)$, subsampled w/prob $\gamma$) or null ($G_0, G_1 \sim G(n, p\gamma)$)?

  27. …counting triangles? $\mathrm{cor}_{K_3}(G_0, G_1) := (\#\,K_3 \text{ in } G_0) \cdot (\#\,K_3 \text{ in } G_1)$. null: triangle counts in $G_0, G_1$ are independent, so $\mathbb{E}\,\mathrm{cor}_{K_3}(G_0, G_1) \approx (p\gamma n)^6$.

  28. …counting triangles? structured: triangle counts in $G_0, G_1$ are correlated, so $\mathbb{E}\,\mathrm{cor}_{K_3}(G_0, G_1) \approx (p\gamma n)^6 + (\gamma^2 p n)^3$.

  29. …counting triangles? structured: $\mathbb{E}\,\mathrm{cor}_{K_3}(G_0, G_1) \approx (p\gamma n)^6 + (\gamma^2 p n)^3$. null: $\mathbb{E}\,\mathrm{cor}_{K_3}(G_0, G_1) \approx (p\gamma n)^6$. Variance? Optimistically, in null case, $\mathbb{V}\left[\mathrm{cor}_{K_3}(G_0, G_1)\right]^{1/2} \approx (p\gamma n)^3$.
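
The triangle statistic is easy to compute directly. The sketch below uses assumed names and counts unlabeled triangles with a cubic-time loop; the slides count labeled copies, which differs only by a constant factor of 6 per graph.

```python
from itertools import combinations

def count_triangles(n, edges):
    """Number of (unlabeled) triangles in an undirected graph on 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return sum(1 for a, b, c in combinations(range(n), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

def cor_k3(n, edges0, edges1):
    """Product of the two triangle counts, as in the definition of cor_{K_3}."""
    return count_triangles(n, edges0) * count_triangles(n, edges1)
```

For example, $K_4$ has 4 triangles and a single triangle has 1, so `cor_k3` on that pair is 4. A test would compare this statistic against a threshold between the null and structured expectations.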

  30. "independent trials". Suppose we had $T$ "independent trials" of $\mathrm{cor}_{K_3}$: $\mathrm{cor}_T(G_0, G_1) = \frac{1}{T} \sum_{i=1}^{T} \mathrm{cor}^{(i)}_{K_3}(G_0, G_1)$. structured: $\mathbb{E}\,\mathrm{cor}_T(G_0, G_1) \approx (p\gamma n)^6 + (\gamma^2 p n)^3$. null: $\mathbb{E}\,\mathrm{cor}_T(G_0, G_1) \approx (p\gamma n)^6$ and $\mathbb{V}\left[\mathrm{cor}_T(G_0, G_1)\right]^{1/2} \approx \frac{1}{\sqrt{T}} (p\gamma n)^3$. if $T > 1/\gamma^6$, $\mathrm{cor}_T$ is a good test.
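
The threshold $T > 1/\gamma^6$ follows from comparing the gap between the two expectations with the standard deviation. The short derivation below is a sketch that assumes the structured-case standard deviation is of the same order as the optimistic null-case estimate on the slide.

```latex
\begin{align*}
\text{gap} &= \mathbb{E}_{\text{structured}}[\mathrm{cor}_T] - \mathbb{E}_{\text{null}}[\mathrm{cor}_T]
  \approx (\gamma^2 p n)^3 = \gamma^6 (pn)^3, \\
\text{std} &= \mathbb{V}[\mathrm{cor}_T]^{1/2}
  \approx \tfrac{1}{\sqrt{T}} (p\gamma n)^3 = \tfrac{\gamma^3}{\sqrt{T}} (pn)^3, \\
\frac{\text{gap}}{\text{std}} &\approx \gamma^3 \sqrt{T} > 1
  \iff T > \frac{1}{\gamma^6}.
\end{align*}
```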

  31. near-independent subgraphs. Suppose we had $T$ "independent" subgraphs $H_1, \ldots, H_T$: $\mathrm{cor}_T(G_0, G_1) = \frac{1}{T} \sum_{i=1}^{T} \mathrm{cor}_{H_i}(G_0, G_1)$. what properties must $H_1, \ldots, H_T$ have to be "independent"?

  32. surprisingly delicate (concentration). $p = n^{-5/7}$, $G \sim G(n,p)$, $H$ a fixed graph with 5 vertices and 7 edges. How many labeled copies of $H$ in $G$? $\mathbb{E}[\#_H(G)] = \frac{5!}{|\mathrm{aut}(H)|} \cdot \binom{n}{5} \cdot p^7 \approx n^5 p^7 = \Theta(1)$

  33. surprisingly delicate (concentration). $p = n^{-5/7}$, $G \sim G(n,p)$, $H$ a fixed graph with 5 vertices and 7 edges. $\#_H(G)$ does not concentrate! How many labeled copies of $H$ in $G$? $\mathbb{E}[\#_H(G)] = \frac{5!}{|\mathrm{aut}(H)|} \cdot \binom{n}{5} \cdot p^7 \approx n^5 p^7 = \Theta(1)$. How many labeled copies of $K_4$ in $G$? $\mathbb{E}[\#_{K_4}(G)] = \frac{4!}{|\mathrm{aut}(K_4)|} \cdot \binom{n}{4} \cdot p^6 \approx n^4 p^6 = \Theta(n^{-2/7})$
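
The two exponents on this slide can be checked with exact rational arithmetic. The helper name below is an assumption for illustration; it computes the exponent $a$ with $n^v p^e = n^a$ when $p = n^{-5/7}$.

```python
from fractions import Fraction

def count_exponent(num_vertices, num_edges, p_exponent):
    """Exponent a such that n^v * p^e = n^a when p = n^(p_exponent)."""
    return num_vertices + num_edges * p_exponent

p_exp = Fraction(-5, 7)                 # p = n^(-5/7)
h_exp = count_exponent(5, 7, p_exp)     # H: 5 vertices, 7 edges
k4_exp = count_exponent(4, 6, p_exp)    # K_4: 4 vertices, 6 edges
```

Here `h_exp` is 0, matching $\Theta(1)$, and `k4_exp` is $-2/7$, matching $\Theta(n^{-2/7})$.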

  34. variance of subgraph counts ($G \sim G(n,p)$). Lemma: For a constant-sized subgraph $H$, $\mathbb{V}[\#_H(G)] = \Theta(1) \cdot \frac{\mathbb{E}[\#_H(G)]^2}{\min_{J \subset H} \mathbb{E}[\#_J(G)]}$, where the minimizing $J$ is the subgraph of $H$ with fewest expected appearances.
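
As a worked instance of the lemma, take the earlier example with $p = n^{-5/7}$ and assume, as the comparison with $K_4$ suggests, that $H$ contains $K_4$ as a subgraph. The lemma then explains why $\#_H(G)$ fails to concentrate:

```latex
\begin{align*}
\min_{J \subset H} \mathbb{E}[\#_J(G)] &\le \mathbb{E}[\#_{K_4}(G)] = \Theta(n^{-2/7}), \\
\mathbb{V}[\#_H(G)] &= \Theta(1) \cdot \frac{\mathbb{E}[\#_H(G)]^2}{\min_{J \subset H} \mathbb{E}[\#_J(G)]}
  \ge \Omega\!\left(\frac{1}{n^{-2/7}}\right) = \Omega(n^{2/7}), \\
\mathbb{V}[\#_H(G)]^{1/2} &\ge \Omega(n^{1/7}) \gg \mathbb{E}[\#_H(G)] = \Theta(1).
\end{align*}
```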
