Optimal Learning of Joint Alignments with a Faulty Oracle
Charalampos E. Tsourakakis (ctsourak@bu.edu), Boston University
ISIT 2020
Joint work with: Kasper Green Larsen and Michael Mitzenmacher
Datasets modeled as graphs: World Wide Web, Internet, social networks, connectomes, airline networks, images
Graphs from probing/testing pairs of items: (a) humans in the loop for entity resolution; (b) protein-protein interactions
Joint alignment from pairwise differences [Next four slides use material from Y. Chen's slides]
• n unknown variables g(0), …, g(n−1)
• k possible states, described by the latent function g : [n] → [k], e.g., g(0) = 5, g(1) = 7, …
• Think of [n] as a set of nodes, and g(u) as the cluster id corresponding to node u ∈ [n]
Joint alignment from pairwise differences
• Goal: learn the latent function g : [n] → [k]
• We obtain a noisy measurement of each pairwise difference: f̃(i, j) := (g(i) − g(j) + some iid noise) mod k
Joint alignment from pairwise differences
Typical input to a multi-image alignment problem. We may compute pairwise noisy estimates of relative angles of rotation.
Joint alignment from noisy pairwise differences
Desired output
Joint alignment from pairwise differences
• Clusters: k groups, numbered {0, 1, …, k − 1}, which we think of as being arranged modulo k
• Cluster ids: g(u) refers to the cluster number associated with a vertex u
• Query/measurement: when we query an edge e = (x, y), we obtain

    f̃(x, y) = (g(x) − g(y) + η_xy) mod k    (1)

  where the additive noise values η_xy are i.i.d. random variables supported on {0, 1, …, k − 1}.
• Problem: recover g (up to a cyclic offset) with high probability, using as few measurements as possible and as fast as possible.
Noise probability distribution
• When we query an edge e = (x, y), we obtain f̃(x, y) = (g(x) − g(y) + η_xy) mod k, where the additive noise values η_xy are i.i.d. random variables supported on {0, 1, …, k − 1}, with

    Pr[η_xy = i] = 1/k + δ,          if i = 0;
                   1/k − δ/(k − 1),  for each i ≠ 0.    (2)

• We choose which pairs to query in a non-adaptive way.
• We obtain a set of noisy measurements {f̃(i, j) = (g(i) − g(j) + noise) mod k}_(i,j)∈Ω, where Ω ⊆ ([n] choose 2) is a symmetric index set, wlog a set of pairs {i, j} with i < j.
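As a concrete illustration of the measurement model, here is a minimal Python sketch of the faulty oracle defined by (1) and (2). The function name `make_oracle` is hypothetical (not from the paper); it simply draws the noise η with probability 1/k + δ of being 0 and spreads the remaining mass uniformly over the k − 1 nonzero values.

```python
import random

def make_oracle(g, k, delta, seed=0):
    """Simulate the faulty pairwise-difference oracle of Eqs. (1)-(2).

    Querying (x, y) returns (g(x) - g(y) + eta) mod k, where the noise
    eta is 0 with probability 1/k + delta and equals each nonzero value
    with probability 1/k - delta/(k-1).
    """
    rng = random.Random(seed)

    def query(x, y):
        if rng.random() < 1.0 / k + delta:
            eta = 0
        else:
            eta = rng.randrange(1, k)  # uniform over the k-1 nonzero values
        return (g[x] - g[y] + eta) % k

    return query
```

With δ = 1/2 and k = 4, for instance, each query returns the true difference with probability 1/k + δ = 3/4.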
Remark: our MLE problem is a discrete, non-convex problem.
Joint alignment – k = 2
• Let V = [n] be the set of items
• (Unknown) g : V → {−1, +1}
  • Red: R = {v ∈ V(G) : g(v) = −1}
  • Blue: B = {v ∈ V(G) : g(v) = +1}
• Observation: define τ(u, v) = g(u)g(v) ∈ {±1} for any u, v ∈ V. If τ(u, v) = −1, then u and v lie in different clusters.
Joint alignment – k = 2
• Model: we can query any pair of nodes {u, v} once to get a noisy measurement of τ(u, v). The oracle returns
  • τ̃(u, v) = g(u)g(v)η_uv, where
  • η_uv ∈ {±1} is iid noise in the edge observations, with
  • E[η_uv] = δ for all pairs u, v ∈ V
• Equivalently, each query returns the correct answer with probability 1 − q = 1/2 + δ/2, where q > 0 is the corruption probability.
• Problem (k = 2): recover g whp with as few queries to the oracle as possible.
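The k = 2 oracle can be sketched in the same style. The name `make_sign_oracle` is again hypothetical; note that E[η] = (+1)(1/2 + δ/2) + (−1)(1/2 − δ/2) = δ, matching the model above.

```python
import random

def make_sign_oracle(g, delta, seed=0):
    """k = 2 oracle: returns tau(u,v) = g(u)*g(v) multiplied by +/-1 noise.

    The noise eta is +1 with probability 1/2 + delta/2, so E[eta] = delta
    and each query is correct with probability 1 - q = 1/2 + delta/2.
    """
    rng = random.Random(seed)

    def query(u, v):
        eta = 1 if rng.random() < 0.5 + delta / 2 else -1
        return g[u] * g[v] * eta

    return query
```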
Related work – Overview
Related Work – k = 2, Correlation Clustering
• Correlation Clustering: given an undirected signed graph, partition the nodes into clusters so that the total number of disagreements is minimized [Bansal et al., 2004, Shamir et al., 2004] (NP-hard)
• Excellent survey by Bonchi et al. [Bonchi et al., 2014]
• Mathieu and Schudy initiated the study of noisy correlation clustering [Mathieu and Schudy, 2010]
  • complete information (all (n choose 2) signs)
  • cardinality constraints on clusters (Ω(√n))
Related Work – k = 2, Planted Partition
Planted Partition Model
• Two groups (clusters) of nodes
• A graph is generated as follows. Edge probabilities are
  • p within each cluster,
  • and q < p across the clusters.
• Problem: recover the two clusters given such a graph.
Results
• If the two clusters are balanced, i.e., each cluster has Θ(n) nodes, then one can recover the clusters whp, see [McSherry, 2001, Vu, 2014, Abbe et al., 2016, Hajek et al., 2016].
Related Work – k = 2, # queries as a function of the imbalance
• Matrix completion techniques [Candès et al., 2006] can be used to predict signs of edges [Chiang et al., 2014]
• Let γ = n / max_C |C| measure the imbalance (n over the largest cluster size)
• The number of queries needed for exact recovery is then O(γ⁴ n log² n)
• Finally, Mazumdar and Saha study the case k = 2 and achieve recovery in poly-time using O(n log n / δ⁴) queries [Mazumdar and Saha, 2016]
• State-of-the-art is due to [Larsen, Mitzenmacher, Tsourakakis, WebConf 2020]
Related Work – k ≥ 3
• Joint alignment: Chen and Candès consider a similar setting to ours, and propose a projected power method to solve the non-convex maximum likelihood estimation problem [Chen and Candès, 2016].
Related Work – k ≥ 3
• Chen and Candès formulate the problem as a constrained PCA problem, and show that a non-convex projected power method solves the problem with high probability when the random queries form an Erdős–Rényi graph.
Related Work – k ≥ 3
• The Chen–Candès algorithm is non-adaptive, and the underlying queries form a random binomial graph
• They show that, in the setting where queries form a random binomial graph, the minimax probability of error tends to 1 if the number of queries falls below c · n log n / (k δ²) for a suitable constant c
• Their query complexity matches this lower bound
• Earlier, weaker results were obtained by Mitzenmacher and Tsourakakis.
Older result (2018) – Mitzenmacher–T.
We prove the following result. Our proof uses BFS as a subroutine.

Theorem. There exists a polynomial-time algorithm that performs O(n^(1+o(1))) queries and recovers g (up to some global offset) whp for any 1 − q = (1 + δ)/2, where 0 < δ ≤ 1 is any positive constant.

• The o(1) term in the exponent is 1/(log log n).
Upper bound – Larsen–Mitzenmacher–T. (2019)

Theorem 1 (extremely small bias). If (lg n/(nk))^(1/4) ≤ δ ≤ 1/(2k) and k ≤ n^o(1), then there is a non-adaptive and deterministic query algorithm that makes O(n lg n/(δ² k)) queries, runs in O(n lg n/(δ² k)) time, and is correct whp.

Theorem 2 (larger bias). If 1/(2k) ≤ δ ≤ 1/4 and k ≤ n^o(1), then there is a non-adaptive and deterministic query algorithm that makes O(n lg n/δ) queries, runs in O(n lg n/δ) time, and is correct whp.
Proposed algorithm – Step 1: O(n log n/(k δ²)) queries
Proposed algorithm – Step 2: “grounding”
Proposed algorithm – learn {g(x)}_{x∈S} up to a cyclic offset
Proposed algorithm – learn {g(x)}_{x∈V\S} up to (the same) cyclic offset
Learning Joint Alignment with a Faulty Oracle
1. Choose S ⊆ V such that |S| = O(log n/(k δ²)) if 0 ≤ δ ≤ 1/(2k), and |S| = O(lg n/δ) if 1/(2k) ≤ δ ≤ 1/4.
2. Perform all queries between S and V \ S.
3. Fix a node s ∈ S and assign it the label ĝ(s) = 0.
4. For each s′ ∈ S \ {s}, compute an estimate μ_s′ of (g(s′) − g(s)) mod k using the plurality vote among the queries {f̃(s′, b) − f̃(s, b)}_{b∈V\S}, and assign s′ the label ĝ(s′) = μ_s′.
5. For each v ∈ V \ S, assign it the label given by the plurality vote among {ĝ(s) + f̃(v, s)}_{s∈S}.
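The five steps above can be sketched compactly in Python. This is a minimal illustration, not the authors' exact pseudocode: `learn_alignment` and `s_size` are hypothetical names, the constants hidden in |S| are left to the caller, and S is taken wlog to be the first `s_size` nodes. It also uses the symmetric extension f̃(v, s) = −f̃(s, v) mod k, so only the S × (V \ S) queries of Step 2 are ever issued.

```python
import random
from collections import Counter

def learn_alignment(n, k, query, s_size):
    """Plurality-vote alignment sketch; query(x, y) is the faulty oracle."""
    S = list(range(s_size))          # wlog the first s_size nodes play the role of S
    rest = list(range(s_size, n))    # V \ S

    # Step 2: perform all queries between S and V \ S.
    f = {(s, b): query(s, b) for s in S for b in rest}

    # Step 3: ground one node of S at label 0.
    s0 = S[0]
    g_hat = {s0: 0}

    # Step 4: label each s' in S by the plurality vote of f(s', b) - f(s0, b).
    for sp in S[1:]:
        votes = Counter((f[(sp, b)] - f[(s0, b)]) % k for b in rest)
        g_hat[sp] = votes.most_common(1)[0][0]

    # Step 5: label each v in V \ S by the plurality vote of g_hat(s) + f(v, s),
    # using f(v, s) = -f(s, v) mod k.
    for v in rest:
        votes = Counter((g_hat[s] - f[(s, v)]) % k for s in S)
        g_hat[v] = votes.most_common(1)[0][0]
    return g_hat
```

On a simulated instance with the noise model of Eq. (2) and a healthy bias δ, the returned labels agree with g up to a single cyclic offset, as the theorems predict.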