Condition Number for Joint Optimization of Cycle-Consistent Networks
Leonidas Guibas (1), Qixing Huang (2), Zhenxiao Liang (2)
(1) Stanford University, (2) University of Texas at Austin
Basic Idea and Cycle Consistency
◮ Cycle consistency can be exploited to jointly improve multiple neural networks mapping between several domains whenever the transformations form cycles.
◮ Applications: translation, shape matching, CycleGAN, 3D model representations, etc.
◮ When there are many domains, the choice of cycles along which consistency is enforced matters.
Mapping Graph and Cycle Bases
◮ A mapping graph is a directed graph G_f = (V, E) in which each node u ∈ V is associated with a domain D_u and each edge (u, v) ∈ E with a function f_uv : D_u → D_v.
◮ Cycle bases can be defined in several ways, depending on which binary operation on cycles is used to compose new cycles.
◮ The most common choices are binary cycle bases and fundamental cycle bases.
Cycle Bases
◮ It is known that binary cycle bases of size |E| − |V| + 1 always exist.
◮ In particular, a fundamental cycle basis can be constructed easily from a spanning tree of G; see the sketch below.
◮ Not all binary bases are cycle-consistent.
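As a concrete illustration, here is a minimal sketch of the spanning-tree construction using networkx's cycle_basis, which closes one fundamental cycle per non-tree edge. The toy graph is hypothetical, and the mapping graph is treated as undirected:

```python
# Minimal sketch: fundamental cycle basis from a spanning tree (networkx).
import networkx as nx

G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)])

# networkx builds the basis from a spanning tree rooted at `root`;
# each non-tree edge closes exactly one fundamental cycle.
basis = nx.cycle_basis(G, root=0)
print(basis)  # e.g. [[1, 2, 0], [2, 3, 0]], as lists of cycle nodes

# Basis size matches the cyclomatic number |E| - |V| + 1 = 5 - 4 + 1 = 2.
assert len(basis) == G.number_of_edges() - G.number_of_nodes() + 1
```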
Cycle-Consistency Bases
◮ A mapping graph G_f is called cycle-consistent if the composition of f along each cycle (u_1, u_2, ..., u_k, u_1) in G_f is the identity, i.e., f_{u_k u_1} ∘ f_{u_{k−1} u_k} ∘ ··· ∘ f_{u_2 u_3} ∘ f_{u_1 u_2} = Id.
◮ The number of cycles in a graph can be exponentially large, so it is infeasible to enforce consistency on all cycles directly in large graphs.
◮ A cycle basis B = {C_1, ..., C_|B|} is cycle-consistent if, for any function family f, consistency of f along the cycles in B guarantees cycle consistency over all cycles in G; a numerical check is sketched below.
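To make the definition concrete, here is a minimal sketch (with hypothetical linear maps as the f_uv) that checks cycle consistency numerically by composing the edge maps along a cycle and comparing to the identity:

```python
# Minimal sketch: verify cycle consistency along one cycle (toy linear maps).
import numpy as np

d = 3
rng = np.random.default_rng(0)
f12 = rng.standard_normal((d, d))
f23 = rng.standard_normal((d, d))
f31 = np.linalg.inv(f23 @ f12)  # chosen so this 3-cycle is consistent

def composition_along_cycle(maps):
    """Compose linear maps f_{u1 u2}, f_{u2 u3}, ... in traversal order."""
    out = np.eye(d)
    for f in maps:
        out = f @ out
    return out

residual = np.linalg.norm(composition_along_cycle([f12, f23, f31]) - np.eye(d))
print(f"cycle-consistency residual: {residual:.2e}")  # ~0 up to round-off
```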
Cycle-Consistency Bases
◮ Fundamental bases always work, but they are not ideal.
◮ Intuitively, it is harder to optimize f along a longer cycle. Because fundamental bases come from spanning trees of the graph, they can contain many long cycles.
Simple Case of Translation Synchronization
Specifically, we consider translation functions f_ij(x) := x + t_ij, where the t_ij are the parameters to be optimized and t^0_ij are their initial values.
Loss function:

\min_{\{t_{ij},\,(i,j)\in\mathcal{E}\}} \sum_{(i,j)\in\mathcal{E}_0} (t_{ij} - t^0_{ij})^2 + \sum_{c=(i_1 \cdots i_k i_1)\in\mathcal{C}} w_c \Big( \sum_l t_{i_l i_{l+1}} \Big)^2    (1)

We want the final t_ij to stay close to t^0_ij while preserving cycle consistency.
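As a toy illustration, here is a minimal sketch of objective (1) on a single 3-cycle; the edge list, initial offsets, and cycle weights below are hypothetical:

```python
# Minimal sketch of the translation-synchronization objective (1).
edges = [(0, 1), (1, 2), (2, 0)]               # E_0
t0 = {(0, 1): 1.1, (1, 2): 2.0, (2, 0): -2.9}  # noisy initial offsets t^0_ij
cycles = [[(0, 1), (1, 2), (2, 0)]]            # C, here a single 3-cycle
w = {0: 1.0}                                   # cycle weights w_c

def loss(t):
    data = sum((t[e] - t0[e]) ** 2 for e in edges)
    consistency = sum(w[i] * sum(t[e] for e in c) ** 2
                      for i, c in enumerate(cycles))
    return data + consistency

t = dict(t0)
print(loss(t))  # consistency term penalizes 1.1 + 2.0 - 2.9 = 0.2
```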
Condition Number for Translation Case
(1) can be rewritten in matrix form:

\min_t \; t^T H t - 2 t^T t^0 + \|t^0\|^2,    (2)

H := \sum_{e\in\mathcal{E}_0} v_e v_e^T + \sum_{c\in\mathcal{C}} w_c v_c v_c^T.

◮ The behavior of this quadratic optimization problem is governed by the condition number κ(H) = λ_max(H)/λ_min(H).
◮ The deviation between the optimal solution t* and the ground-truth solution t^gt satisfies ‖t* − t^gt‖ ≤ (1/λ_min(H)) ‖t^0 − t^gt‖, where t^0 is the initial translation vector.
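Here is a minimal numerical sketch of H and κ(H) for the toy 3-cycle above; v_e is the indicator vector of edge e and v_c the indicator vector of the cycle's edges, and all values are hypothetical:

```python
# Minimal sketch: form H from (2) and read off its condition number.
import numpy as np

m = 3                                # |E|, edges indexed 0..2
V_e = np.eye(m)                      # columns: indicator vectors v_e
v_c = np.ones(m)                     # one 3-cycle using all three edges
w_c = 1.0

H = V_e @ V_e.T + w_c * np.outer(v_c, v_c)

eigvals = np.linalg.eigvalsh(H)      # ascending eigenvalues
lam_min, lam_max = eigvals[0], eigvals[-1]
print("condition number:", lam_max / lam_min)  # kappa(H) = 4 here

# The bound above reads: ||t* - t_gt|| <= ||t0 - t_gt|| / lam_min.
```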
Sampling Process (Step I - C_sup Generation)
We construct C_sup by computing the breadth-first spanning tree T(v_i) rooted at each vertex v_i ∈ V; see the sketch below. The resulting C_sup has two desirable properties:
◮ The cycles in C_sup are kept as short as possible.
◮ If G is sparse, C_sup contains a mixture of short and long cycles. The long cycles address the accumulation of errors that occurs when the cycle-consistency constraint is enforced only along short cycles.
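A minimal sketch of this construction, assuming (as in the fundamental-basis construction earlier) that each non-tree edge of a BFS tree closes one candidate cycle; deduplicating by node set is a simplification:

```python
# Minimal sketch: C_sup from BFS spanning trees rooted at every vertex.
import networkx as nx

def candidate_cycles(G):
    cycles = {}
    for root in G.nodes:
        T = nx.bfs_tree(G, root).to_undirected()
        for u, v in G.edges:
            if not T.has_edge(u, v):
                path = nx.shortest_path(T, u, v)  # unique tree path u -> v
                cycles.setdefault(frozenset(path), path)  # dedup by node set
    return list(cycles.values())

G = nx.cycle_graph(6)  # a 6-cycle ...
G.add_edge(0, 3)       # ... plus a chord, creating short and long cycles
for c in candidate_cycles(G):
    print(c)           # each printed path closes into a cycle via (v, u)
```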
Sampling Process (Step II - Weight Optimization)
We formulate the following semidefinite program to optimize the cycle weights:

\min_{w_c \ge 0,\, s_1,\, s_2} \; s_2 - s_1    (3)

\text{subject to } \; s_1 I \preceq \sum_{e\in\mathcal{E}_0} v_e v_e^T + \sum_{c\in\mathcal{C}_{sup}} w_c v_c v_c^T \preceq s_2 I,

\sum_{c\in\mathcal{C}_{sup}} |v_c|^2 w_c = \lambda, \quad w_c \ge \delta \;\; \forall\, c \in \mathcal{C}_{min}    (4)

◮ (3) drives H toward a (scaled) identity matrix, i.e., a small condition number.
◮ The constraint w_c ≥ δ for c ∈ C_min guarantees that the cycles in C_min are taken into account.
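A minimal cvxpy sketch of (3)-(4); the edge part A, the cycle vectors in V, and the constants lam, delta, n_min are all hypothetical placeholders:

```python
# Minimal sketch of the weight SDP (3)-(4) in cvxpy.
import cvxpy as cp
import numpy as np

m, k = 4, 6                                # dim of H, |C_sup|
rng = np.random.default_rng(0)
A = np.eye(m)                              # sum_e v_e v_e^T (edge part)
V = rng.integers(-1, 2, size=(m, k)).astype(float)  # columns: cycle vectors
lam, delta, n_min = 10.0, 0.1, 2           # first n_min cycles play C_min

w = cp.Variable(k, nonneg=True)
s1, s2 = cp.Variable(), cp.Variable()
H = A + sum(w[i] * np.outer(V[:, i], V[:, i]) for i in range(k))

constraints = [H >> s1 * np.eye(m),        # s1 I <= H
               H << s2 * np.eye(m),        # H <= s2 I
               np.sum(V**2, axis=0) @ w == lam,
               w[:n_min] >= delta]         # keep C_min cycles in play
prob = cp.Problem(cp.Minimize(s2 - s1), constraints)
prob.solve()
print(prob.value, np.round(w.value, 3))
```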
Importance Sampling
The semidefinite program above controls the condition number of H, but it does not control the number of cycles with positive weights. We therefore select a subset C_sample ⊂ C_sup and compute new weights w̃_c, c ∈ C_sample, so that

\sum_{c\in\mathcal{C}_{sample}} \tilde{w}_c v_c v_c^T \approx \sum_{c\in\mathcal{C}_{sup}} w_c v_c v_c^T.    (5)
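One standard way to realize (5), sketched below, is independent Bernoulli sampling with inclusion probabilities proportional to w_c |v_c|^2 (capped at 1), followed by inverse-probability reweighting. This is a generic scheme under our assumptions, not necessarily the paper's exact estimator:

```python
# Minimal sketch: importance sampling of cycles with unbiased reweighting.
import numpy as np

def sample_cycles(V, w, L, rng):
    """V: (m, k) cycle vectors as columns; w: (k,) weights; L: target size."""
    scores = w * np.sum(V**2, axis=0)          # w_c |v_c|^2
    p = np.minimum(1.0, L * scores / scores.sum())
    keep = rng.random(len(w)) < p
    w_new = np.zeros_like(w)
    w_new[keep] = w[keep] / p[keep]            # E[w_new_c] = w_c
    return keep, w_new

rng = np.random.default_rng(1)
m, k, L = 5, 40, 12
V = rng.integers(-1, 2, size=(m, k)).astype(float)
w = rng.random(k)
keep, w_new = sample_cycles(V, w, L, rng)
approx = (V * w_new) @ V.T                     # sum over sampled cycles
exact = (V * w) @ V.T                          # sum over all of C_sup
print(int(keep.sum()), np.linalg.norm(approx - exact))
```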
Main Results of Sampling
Under mild assumptions, the sampling is unbiased in expectation ((6) and (7)) and concentrated with high probability ((8) and (9)):

\mathbb{E}[\,|\mathcal{C}_{sample}|\,] = L    (6)

\mathbb{E}\Big[\sum_{c\in\mathcal{C}_{sample}} \tilde{w}_c v_c v_c^T\Big] = \sum_{c\in\mathcal{C}_{sup}} w_c v_c v_c^T    (7)

\big|\,|\mathcal{C}_{sample}| - L\,\big| \le O(\log n)\,\sigma_1    (8)

\Big\|\sum_{c\in\mathcal{C}_{sample}} \tilde{w}_c v_c v_c^T - \sum_{c\in\mathcal{C}_{sup}} w_c v_c v_c^T\Big\| \le O(\log n)\,\sigma_2    (9)

where n is the number of domains and σ₁² and σ₂² are the unweighted and weighted variances of |C_sample|, respectively.
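A quick Monte Carlo sanity check of (6) and (7) under the hypothetical Bernoulli sampler from the previous sketch (the expected size equals L exactly only when no inclusion probability is clipped at 1):

```python
# Minimal sketch: empirically verify unbiasedness of the sampled sum.
import numpy as np

rng = np.random.default_rng(2)
m, k, L, trials = 5, 40, 12, 5000
V = rng.integers(-1, 2, size=(m, k)).astype(float)
w = rng.random(k)
scores = w * np.sum(V**2, axis=0)
p = np.minimum(1.0, L * scores / scores.sum())

exact = (V * w) @ V.T
acc, sizes = np.zeros((m, m)), 0
for _ in range(trials):
    keep = rng.random(k) < p
    w_new = np.zeros_like(w)
    w_new[keep] = w[keep] / p[keep]
    acc += (V * w_new) @ V.T
    sizes += keep.sum()
print(sizes / trials)                        # ~ sum(p), close to L, cf. (6)
print(np.linalg.norm(acc / trials - exact))  # shrinks with more trials, cf. (7)
```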
Experimental Results - Consistent Shape Correspondence
[Plot: ShapeCoSeg baseline evaluation, % correspondences vs. geodesic error (0.03 to 0.18), comparing Ours, NoWeight, Zhang19, Cosmo17, Huang14.]
◮ Encode the map from shape S_i to shape S_j as a functional map X_ij : F(S_i) → F(S_j).
◮ We consider two shape collections from ShapeCoSeg: Alien (200 shapes) and Vase (300 shapes).
◮ G is constructed by connecting every shape with k = 25 randomly chosen shapes.
Experimental Results - Consistent Neural Networks
[Plot: PASCAL3D baseline evaluation, % correspondences vs. Euclidean error (0.03 to 0.18), comparing Ours, NoWeight, IdenticalNet, Zhang19, Zhou16, Dosovitsky15, Zhou15.]
◮ The nodes V represent image objects viewed from similar camera poses.
◮ We jointly learn the neural networks associated with the edges.