(Nearly) Efficient Algorithms for the Graph Matching Problem Tselil Schramm (Harvard/MIT) with Boaz Barak, Chi-Ning Chou, Zhixian Lei & Yueqi Sheng (Harvard)
graph matching problem (approximate graph isomorphism)

input: two graphs G₀, G₁ on n vertices
goal: find permutation of vertices that maximizes # shared edges

max_π ⟨A_{G₀}, π A_{G₁} πᵀ⟩

[animation: two candidate matchings of the example graphs, with # matched = 4, then # matched = 5]
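On toy instances the objective above can be evaluated by exhaustive search; a minimal Python sketch (illustrative only, not from the talk — `matched_edges` and `best_matching` are hypothetical names):

```python
from itertools import permutations

def matched_edges(edges0, edges1, perm):
    """# edges {u,v} of G0 whose image {perm[u], perm[v]} is an edge of G1."""
    e1 = {frozenset(e) for e in edges1}
    return sum(frozenset((perm[u], perm[v])) in e1 for u, v in edges0)

def best_matching(n, edges0, edges1):
    """Exhaustive search over all n! vertex bijections -- only feasible for tiny n."""
    return max(
        ((matched_edges(edges0, edges1, p), p) for p in permutations(range(n))),
        key=lambda t: t[0],
    )
```

For 5-vertex examples like the one on the slide this search is instant; at realistic n the n! blow-up is exactly why the algorithmic machinery of the talk is needed.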
computationally hard (of course)

NP-hard: reduction from the quadratic assignment problem (non-simple graphs) [Lawler'63]
also: reduction from sparse random 3-SAT to the approximate version [O'Donnell-Wright-Wu-Zhou'14]
practitioners: undeterred

• computational biology [e.g. Singh-Xu-Berger'08]
• de-anonymization [e.g. Narayanan-Shmatikov'09]
• social networks [e.g. Korula-Lattanzi'14]
• image alignment [e.g. Cho-Lee'12]
• machine learning [e.g. Cour-Srinivasan-Shi'07]
• pattern recognition, e.g. "thirty years of graph matching in pattern recognition" [Conte-Foggia-Sansone-Vento'04]
"robust average-case graph isomorphism"
average case: correlated random graphs

structured model:
  sample G ∼ G(n, p)
  G₀, G₁: independently subsample each edge of G w/prob δ
  relabel G₁ by a random permutation π
  avg. degree npδ =: d
  # matched edges under π ≈ δ²p·n²/2
  [e.g. Pedarsani-Grossglauser'11, Lyzinski-Fishkind-Priebe'14, Korula-Lattanzi'14]

"null" model:
  sample G₀, G₁ ∼ G(n, pδ) independently
  avg. degree npδ — the same marginals as the structured model
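Both distributions are easy to simulate; a minimal sketch of the two samplers under the notation above (G ∼ G(n,p), subsample probability δ; function names are illustrative):

```python
import random

def sample_structured(n, p, delta, seed=0):
    """Structured pair: parent G ~ G(n, p); each parent edge survives
    independently in G0 and in G1 with prob delta; G1 is then relabeled
    by a uniformly random permutation pi."""
    rng = random.Random(seed)
    parent = [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]
    g0 = [e for e in parent if rng.random() < delta]
    g1 = [e for e in parent if rng.random() < delta]
    pi = list(range(n))
    rng.shuffle(pi)
    g1 = [(pi[u], pi[v]) for u, v in g1]
    return g0, g1, pi

def sample_null(n, p, delta, seed=1):
    """Null pair: two independent draws from G(n, p*delta) -- each graph
    has the same marginal law as in the structured model."""
    rng = random.Random(seed)
    q = p * delta

    def draw():
        return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < q]

    return draw(), draw()
```

Note that each graph in isolation is G(n, pδ) in both models, so any test must use the pair jointly.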
information theoretic limit

for which p, δ can we recover π ?

Theorem [Cullina-Kiyavash'16&17]
Iff npδ² > log n (up to lower-order terms), with high probability π is the unique maximizing permutation.
algorithms for robust average case?
average-case graph isomorphism algorithms fail.

e.g. matching local neighborhoods: match radius-r neighborhoods?
→ the δ-subsampling changes every local neighborhood, so exact neighborhood-matching breaks
e.g. spectral algorithm: do unique entries in the top eigenvector v_max give the isomorphism?
→ the δ-subsampling perturbs the eigenvectors by ≈ δ, scrambling the entries
actual algorithms for robust average case?
starting from a seed

suppose π is known on a small "seed" set of vertices (say n^{Θ(ε)} of them)
starting from a seed

match vertices with similar adjacency into the seed: set π(v) = w when v's edges into the seed map onto w's edges into the matched seed.

If the seed has size ≥ Ω(n^ε), the seeded algorithm approximately recovers π [Yartseva-Grossglauser'13].
But one needs 2^{Ω̃(n^ε)} time to guess a seed.
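A toy, one-round sketch of the seeded idea — greedily match each vertex to the candidate sharing the most neighbors inside the already-matched seed. The cited algorithms iterate and threshold far more carefully; all names here are illustrative:

```python
def extend_from_seed(n, adj0, adj1, seed):
    """One greedy round: match v -> w when v's G0-neighbors inside the seed
    map (under the seed) onto w's G1-neighbors better than any other w.
    adj0/adj1: dict vertex -> set of neighbors; seed: partial dict v -> pi(v)."""
    match = dict(seed)
    used = set(match.values())
    for v in range(n):
        if v in match:
            continue
        sig_v = {match[u] for u in adj0[v] if u in match}  # v's footprint in G1
        best, best_score = None, 0
        for w in range(n):
            if w in used:
                continue
            score = len(sig_v & adj1[w])
            if score > best_score:
                best, best_score = w, score
        if best is not None:
            match[v] = best
            used.add(best)
    return match
```

On a noiseless toy graph this already percolates from a 2-vertex seed; under δ-subsampling one would compare scores against a threshold rather than take a bare argmax.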
our results

Theorem (recovery)
For any ε > 0, if the average degree npδ lies in [n^{o(1)}, n^{1/153}] ∪ [n^{2/3}, n^{1−ε}] and δ = Ω(1),* there is an n^{O(log n)}-time algorithm that recovers π on n − o(n) of the vertices w/prob ≥ 0.99.
  *we can also allow δ = Ω(1/ log log n)

Theorem (hypothesis testing)
If p, δ are as above, then there is a poly(n)-time algorithm distinguishing the structured distribution from the null distribution (G₀, G₁ ∼ G(n, pδ) independently).
our approach: small subgraphs

hypothesis testing: correlation of subgraph counts
recovery: match rare subgraphs
seedless algorithms!
outline

• distinguishing/hypothesis testing
• recovery
• concluding
distinguishing/hypothesis testing

Given G₀, G₁ sampled equally likely from structured or null, decide from which w/prob 1 − o(1).

brute force: is there a π with ≥ δ²p·n²/2 matched edges?
…counting triangles?

define X_{K₃}(G₀, G₁) := (# K₃ in G₀) · (# K₃ in G₁)

null: triangle counts in G₀, G₁ are independent
  E_null[X_{K₃}(G₀, G₁)] ≈ (npδ)⁶

structured: triangle counts in G₀, G₁ are correlated (a triangle of the parent G can survive in both)
  E_struct[X_{K₃}(G₀, G₁)] ≈ (npδ)⁶ + (npδ²)³

Variance? Optimistically, in the null case, σ_null(X_{K₃}(G₀, G₁)) ≈ (npδ)³
"independent trials"

Suppose we had m "independent trials": X̄_m(G₀, G₁) = (1/m) Σ_{i=1}^m X_{K₃}^{(i)}(G₀, G₁)

structured: E[X̄_m(G₀, G₁)] ≈ (npδ)⁶ + (npδ²)³
null: E[X̄_m(G₀, G₁)] ≈ (npδ)⁶, with σ_null(X̄_m(G₀, G₁)) ≈ (npδ)³ / m^{1/2}

⇒ if m > 1/δ⁶, X̄_m is a good test
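A single "trial" of the triangle statistic is mechanical to compute on small graphs; a hedged sketch using brute-force triple enumeration (fine for toy n; names are illustrative):

```python
from itertools import combinations

def triangle_count(n, edges):
    """# of K3's in the graph: check every vertex triple (O(n^3), toy-scale only)."""
    e = {frozenset(x) for x in edges}
    return sum(
        frozenset((a, b)) in e and frozenset((b, c)) in e and frozenset((a, c)) in e
        for a, b, c in combinations(range(n), 3)
    )

def triangle_statistic(n, edges0, edges1):
    """One trial of the test: product of the two triangle counts.
    Structured pairs inflate this above its null expectation."""
    return triangle_count(n, edges0) * triangle_count(n, edges1)
```

Averaging m such trials over (near-)independent subgraph families, as on the next slide, is what drives the variance down by the 1/m^{1/2} factor.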
near-independent subgraphs

Suppose instead we had m "independent" subgraphs H₁, …, H_m:
  X̄_m(G₀, G₁) = (1/m) Σ_{i=1}^m X_{H_i}(G₀, G₁)

what properties must H₁, …, H_m have to be "independent"?
surprisingly delicate (concentration)

take p = n^{−5/7}, G ∼ G(n, p), and H = K₄ plus a pendant edge (5 vertices, 7 edges)

How many labeled copies of H in G?
  E[#H(G)] = (n choose 5) · 5! · p⁷ ≈ n⁵p⁷ = Θ(1)

How many labeled copies of K₄ in G?
  E[#K₄(G)] = (n choose 4) · 4! · p⁶ ≈ n⁴p⁶ = Θ(n^{−2/7})

so w.h.p. G contains no K₄, hence no copy of H: #H(G) does not concentrate around its constant expectation!
variance of subgraph counts

Lemma
For a constant-sized subgraph H,
  Var(#H(G)) = Θ( E[#H(G)]² · 1 / min_{K⊆H} E[#K(G)] )
where the min ranges over subgraphs K of H with at least one edge — the subgraph of H with fewest expected appearances dominates the variance.
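The expectation formula and the lemma's heuristic can be evaluated mechanically: plug in E[#K(G)] = n(n−1)⋯(n−v+1) · p^e and minimize over the edge-subsets of H. An illustrative sketch (not the talk's code):

```python
from itertools import chain, combinations

def expected_copies(n, p, num_vertices, num_edges):
    """E[# labeled copies of a subgraph with v vertices, e edges in G(n,p)]:
    n*(n-1)*...*(n-v+1) * p**e."""
    falling = 1
    for i in range(num_vertices):
        falling *= n - i
    return falling * p ** num_edges

def rel_std_heuristic(n, p, edges):
    """Lemma's heuristic: sigma(#H) / E[#H] ~ 1 / sqrt(min over subgraphs K of E[#K]),
    minimizing over all non-empty edge subsets of H."""
    best = float("inf")
    for k in range(1, len(edges) + 1):
        for sub in combinations(edges, k):
            verts = set(chain.from_iterable(sub))
            best = min(best, expected_copies(n, p, len(verts), k))
    return best ** -0.5
```

For the slide's example (p = n^{−5/7}, H = K₄ plus a pendant edge), E[#H(G)] is Θ(1) but the K₄ edge-subset drives the minimum down to ≈ n^{−2/7}, so the relative standard deviation blows up, matching the failure of concentration.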