Reconstruction of full rank algebraic branching programs
Vineet Nair
Joint work with: Neeraj Kayal, Chandan Saha, Sebastien Tavenas
Arithmetic circuits
Reconstruction problem
➢ f(X) ∈ Q[X] is an m-variate degree-d polynomial computable by a size-s circuit in a circuit class C.
➢ Input: blackbox access to f, that is, a query α ∈ F^m returns f(α).
Reconstruction problem
➢ Input: blackbox access to f (a query α ∈ F^m returns f(α)).
➢ Output: a small arithmetic circuit computing f.
➢ The algorithm should run in time poly(m, s, d, b), where b is the bit length of the coefficients of f.
Polynomial identity testing (PIT)
➢ Input: blackbox access to f (a query α ∈ F^m returns f(α)).
➢ Question: is f(X) = 0?
A randomized algorithm for PIT follows easily from the Schwartz-Zippel lemma. Unlike for PIT, no efficient randomized algorithm is known for reconstruction.
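The randomized PIT mentioned above can be sketched as follows. This is a minimal illustration, not the talk's code: the function name `probably_zero`, the sample set S = {0, …, 100·d − 1}, and the number of trials are all assumptions chosen for the example. By the Schwartz-Zippel lemma, a nonzero polynomial of degree at most d vanishes at a uniformly random point of S^m with probability at most d/|S|.

```python
import random

def probably_zero(blackbox, m, d, trials=20, seed=0):
    """Randomized identity test for an m-variate polynomial of degree <= d.

    `blackbox` maps a list of m integers to the value of f at that point.
    Returns False as soon as a nonzero value is seen (a certificate that
    f != 0); returns True if every sampled value is zero.
    """
    rng = random.Random(seed)
    size = 100 * max(d, 1)  # |S|; per-trial error is at most d / |S| <= 1/100
    for _ in range(trials):
        point = [rng.randrange(size) for _ in range(m)]
        if blackbox(point) != 0:
            return False
    return True
```

The test has one-sided error: a "nonzero" answer is always correct, while a "zero" answer is wrong with probability at most (d/|S|)^trials.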
Previous works
Over finite fields: [Shp07], [KS09] gave a quasi-polynomial-time deterministic reconstruction algorithm for depth-three circuits with a constant number of product gates.
Previous works
Over characteristic-zero fields: [Sinha16] gave a polynomial-time randomized algorithm for depth-three circuits with two product gates. [GKL12] gave a polynomial-time randomized algorithm for multilinear depth-four circuits with two top-level product gates.
Previous works
[SV09], [MV16] gave deterministic polynomial-time reconstruction algorithms for read-once formulas. [KS03], [FS13] gave deterministic quasi-polynomial-time reconstruction algorithms for ROABPs, set-multilinear ABPs and non-commutative ABPs.
Average-case reconstruction
Progress in reconstruction is slow. Can we do reconstruction for most circuits in a circuit class C?
[Figure: the class C with a region labelled "Efficiently reconstructed".]
Average-case reconstruction
Problem definition: The input f is an m-variate degree-d polynomial picked according to a distribution D on a circuit class C. Give an efficient reconstruction algorithm for f.
[GKL11], [GKQ13] gave randomized polynomial-time algorithms for average-case reconstruction of multilinear formulas and formulas, respectively.
Algebraic branching programs (ABP)
Definition: Consider a product of d matrices X_1 · X_2 · … · X_d, where X_1 is a row vector of length w, X_d is a column vector of length w, and X_2, …, X_{d-1} are w × w matrices.
Each entry of X_i, i ∈ [d], is an affine form in the variables X, where |X| = m; for example, a_0 + a_1 x_1 + … + a_m x_m.
The polynomial computed by the ABP is the single entry of the 1 × 1 matrix obtained from this product.
The width and length of the ABP are w and d, respectively.
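To make the definition concrete, here is a small sketch of evaluating an ABP at a point. The encoding is an assumption made for illustration: an affine form a_0 + a_1 x_1 + … + a_m x_m is stored as the list [a_0, a_1, …, a_m], and the ABP is a list of matrices of such forms. The helper names `eval_affine`, `mat_mul` and `eval_abp` are hypothetical.

```python
def eval_affine(form, point):
    """form = [a0, a1, ..., am] encodes a0 + a1*x1 + ... + am*xm."""
    return form[0] + sum(a * x for a, x in zip(form[1:], point))

def mat_mul(A, B):
    """Product of two numeric matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def eval_abp(layers, point):
    """Evaluate X_1 * X_2 * ... * X_d at `point` and return the 1x1 entry."""
    product = None
    for layer in layers:
        numeric = [[eval_affine(form, point) for form in row] for row in layer]
        product = numeric if product is None else mat_mul(product, numeric)
    return product[0][0]

# Width 2, length 2, m = 2: X_1 = [x1, 1], X_2 = [x2, x1 + 3]^T,
# so the ABP computes f = x1*x2 + x1 + 3.
example = [
    [[[0, 1, 0], [1, 0, 0]]],        # X_1: 1 x 2
    [[[0, 0, 1]], [[3, 1, 0]]],      # X_2: 2 x 1
]
value = eval_abp(example, [2, 5])    # 2*5 + 2 + 3
```

A function like `eval_abp` also plays the role of the blackbox in the reconstruction problem: the algorithm sees only the returned values, never the layers.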
Distribution on ABPs
Random ABP: Fix w, d and m. Pick the coefficients of the affine forms independently and uniformly at random from a large set S ⊆ Q.
Average-case reconstruction: Design a reconstruction algorithm for a random(m,w,d,S) ABP.
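Sampling from this distribution might look like the sketch below. The function name `random_abp` and the coefficient-list encoding of affine forms ([a_0, a_1, …, a_m]) are assumptions made for the example; it also assumes d ≥ 2 and that the constant term is sampled from S like every other coefficient.

```python
import random

def random_abp(m, w, d, S, seed=None):
    """Sample a width-w, length-d ABP in m variables.

    Every coefficient of every affine form is drawn independently and
    uniformly from the finite sequence S. Returns a list of d matrices:
    a 1 x w row, d - 2 matrices of size w x w, and a w x 1 column.
    """
    rng = random.Random(seed)

    def form():
        return [rng.choice(S) for _ in range(m + 1)]

    shapes = [(1, w)] + [(w, w)] * (d - 2) + [(w, 1)]
    return [[[form() for _ in range(cols)] for _ in range(rows)]
            for rows, cols in shapes]
```

The total number of affine forms produced is w^2(d-2) + 2w, one per edge of the ABP.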
Average-case reconstruction for ABPs
➢ Input: blackbox access to f(X) computable by a random(m,w,d,S) ABP (a query α ∈ F^m returns f(α)).
➢ Output: a small ABP computing f, with high probability.
➢ The algorithm should run in time poly(m, w, d, ρ), where ρ is the bit length of an element in S.
Pseudo-random family
A distribution D on m-variate degree-d polynomials with seed length s = (md)^{O(1)} generates a pseudo-random family if every algorithm that distinguishes a polynomial drawn from D from a uniformly random m-variate polynomial with non-negligible bias runs in time exponential in s.
Candidate family
[Aar08] conjectures that the family Det_n(AX), where every entry of A ∈ F^{t × m} is chosen uniformly at random from a finite field and m << t = n^2, is pseudo-random.
Example (m = 2, n = 4): the determinant of the 4 × 4 matrix of linear forms
[  x1 + x2     6x1 + x2     x1 + 3x2    5x1 + 4x2 ]
[ 8x1 + x2    10x1 + x2    8x1 + 3x2    3x1 + 2x2 ]
[ 8x1 + 2x2    5x1 + 4x2   7x1 + 9x2   11x1 + x2  ]
[ 4x1 + 3x2    9x1 + 3x2   5x1 + 6x2    9x1 + 7x2 ]
Iterated matrix multiplication
Definition: Consider a product of d matrices X_1 · X_2 · … · X_d, where X_1 is a row vector of length w, X_d is a column vector of length w, and X_2, …, X_{d-1} are w × w matrices.
Each entry of X_i, i ∈ [d], is a distinct variable. The variables are disjoint across matrices.
IMM_{w,d} is the single entry of the 1 × 1 matrix obtained from this product.
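As an illustration of the definition, the sketch below evaluates IMM_{w,d} at a numeric assignment in two ways: by multiplying the matrices, and by summing, over all index paths through the layers, the product of the entries on the path. Both function names are hypothetical, and the path-sum view is a standard expansion of the matrix product rather than something stated on the slide.

```python
from itertools import product

def imm_by_matrices(mats):
    """mats[0] is 1 x w, mats[-1] is w x 1, the rest are w x w (numbers)."""
    row = mats[0][0]
    for M in mats[1:]:
        row = [sum(row[k] * M[k][j] for k in range(len(M)))
               for j in range(len(M[0]))]
    return row[0]

def imm_by_paths(mats):
    """Sum over index paths (i_1, ..., i_{d-1}) of the product of the
    entries X_1[i_1] * X_2[i_1][i_2] * ... * X_d[i_{d-1}]."""
    w, d = len(mats[0][0]), len(mats)
    total = 0
    for path in product(range(w), repeat=d - 1):
        term = mats[0][0][path[0]] * mats[-1][path[-1]][0]
        for k in range(1, d - 1):
            term *= mats[k][path[k - 1]][path[k]]
        total += term
    return total

# w = 2, d = 3: eight entries in total, matching w^2(d-2) + 2w = 8.
assignment = [[[1, 2]], [[3, 4], [5, 6]], [[7], [8]]]
```

The path enumeration takes time w^(d-1), so it is useful only as a cross-check on tiny instances; the matrix product is the efficient way to evaluate.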
Consequence
Det_n and IMM_{w,d} are affine projections of each other [Mahajan, Vinay 97].
Hence, it makes sense to ask whether IMM_{w,d}(AX), where every entry of A ∈ F^{t × m} is chosen uniformly at random from a finite S ⊆ Q and m << t = w^2(d-2) + 2w, is pseudo-random.
Our Contribution
Main result
Remarks
• Does not resolve Aaronson's conjecture: for IMM_{w,d} the conjecture concerns the regime m << w^2 d, whereas our result holds when m ≥ w^2 d.
• Our result works even if the matrices are not of uniform width.
Full rank ABPs
If m ≥ w^2 d, then the affine forms in the random ABP are Q-linearly independent with high probability.
Full rank ABP: the linear forms in X_1, X_2, …, X_d are Q-linearly independent.
Example (w = 3, d = 3; 15 linear forms in 16 variables):
X_1 = [ x1 + x2    x2 + x3    x3 + x4 ]
X_2 = [ x4 + x5    x5 + x6    x6 + x7   ]
      [ x7 + x8    x8 + x9    x9 + x10  ]
      [ x10 + x11  x11 + x12  x12 + x13 ]
X_3 = [ x13 + x14  x14 + x15  x15 + x16 ]^T
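The full rank condition can be checked by computing the rank over Q of the coefficient matrix of the linear forms. The sketch below does this with exact rational Gaussian elimination for the 15 forms x_i + x_{i+1} of the example; the function name `rank` and the row encoding (one coefficient vector per form) are assumptions made for the illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank over Q of a matrix given as a list of rows (Gaussian elimination)."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Coefficient vectors of the forms x_i + x_{i+1}, i = 1..15, in 16 variables.
forms = [[1 if j in (i, i + 1) else 0 for j in range(16)] for i in range(15)]
is_full_rank = rank(forms) == len(forms)
```

The ABP is full rank exactly when the rank equals the number of forms, here w^2(d-2) + 2w = 15.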
Full rank ABPs
Main result: We design an efficient randomized algorithm to reconstruct full rank ABPs.
Equivalent polynomials
An n-variate polynomial f is equivalent to an n-variate polynomial g if there exists an invertible A ∈ F^{n × n} such that f(X) = g(AX).
Equivalence test: given f(X) and g(X), is there an invertible A ∈ F^{n × n} such that f(X) = g(AX)?
Equivalent polynomials
Equivalence test for IMM: given f(X), is there an invertible A ∈ F^{n × n} such that f(X) = IMM(AX)?
Remark: Computing a full rank ABP for f is the same as designing an efficient randomized equivalence test for IMM.
Group of symmetries of IMM
Group of symmetries: For an n-variate polynomial g(X), it is the set of all invertible A ∈ F^{n × n} such that g(AX) = g(X).
Characterization by symmetries: g(X) is characterized by its group of symmetries if the groups of symmetries of f(X) and g(X) are equal if and only if f(X) is a constant multiple of g(X).
Main theorem 2: IMM_{w,d} is characterized by its group of symmetries.
Proof Ideas
Template of the reconstruction algorithm
➢ Assume the input polynomial f is computable by a full rank ABP.
➢ Compute a full rank ABP:
1. Find the layer spaces
2. Glue them together
➢ Do a polynomial identity test to check if the polynomial computed by the ABP is f.
• yes: output the full rank ABP computing f.
• no: output 'f is not computable by a full rank ABP'.
Pre-processing
Let an m-variate polynomial f be computed by a full rank ABP of width w and length d. The number of edges is n = w^2(d-2) + 2w, and m ≥ n.
Two steps of pre-processing:
• Variable reduction: at the end of this step we get an n-variate f computable by a full rank ABP.
• Translation equivalence test: at the end of this step the entries in the matrices of the full rank ABP computing f are linear forms (constant term is 0).
Multiple full rank ABPs for f
Suppose f is computable by a full rank ABP X_1 · X_2 · … · X_d. This full rank ABP for f is not unique; the following transformations still compute f:
• Transposition
• Left-right multiplication
• Corner translations
Transposition
Recall that X_1 and X_d are row and column vectors. Since the eventual product is a 1 × 1 matrix, the transpose of the product still computes f.
Hence f is also computed by X_d^T · X_{d-1}^T · … · X_1^T.
Left-right multiplication
Let A be a w × w invertible matrix with entries from Q. Replace X_2 with X'_2 = X_2 · A and X_3 with X'_3 = A^{-1} · X_3.
Then f is computed by the product X_1 · X'_2 · X'_3 · … · X_d.
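A quick numeric sanity check of this transformation is sketched below on matrices with integer entries, which can be read as the layers evaluated at some fixed point. The specific matrices and the helper `mat_mul` are assumptions chosen for the example; since A · A^{-1} is the identity, the same cancellation holds at every point and hence for the polynomial itself.

```python
def mat_mul(A, B):
    """Product of two numeric matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1], [1, 1]]
A_inv = [[1, -1], [-1, 2]]           # inverse of A, since det(A) = 1

X1 = [[1, 2]]                        # 1 x 2
X2 = [[3, 4], [5, 6]]                # 2 x 2
X3 = [[7], [8]]                      # 2 x 1 (here d = 3, so X3 is the last layer)

X2_new = mat_mul(X2, A)              # X'_2 = X_2 * A
X3_new = mat_mul(A_inv, X3)          # X'_3 = A^{-1} * X_3

original = mat_mul(mat_mul(X1, X2), X3)[0][0]
transformed = mat_mul(mat_mul(X1, X2_new), X3_new)[0][0]
```

Both products agree even though the individual layers X'_2 and X'_3 differ from X_2 and X_3.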
Corner translations
Let B be an anti-symmetric w × w matrix; then X_1 · B · X_1^T = 0.
Example (w = 3): with X_1 = [ x1 + x2   x2 + x3   x3 + x4 ] and
B = [  0   4   5 ]
    [ -4   0   8 ]
    [ -5  -8   0 ]
we get X_1 · B · X_1^T = 0.
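The example above can be checked numerically. The identity holds because B^T = -B makes the scalar X_1 · B · X_1^T equal to its own negative. The sketch below evaluates the quadratic form at 100 random integer points; the helper names and the sampling range are assumptions made for the illustration.

```python
import random

# The anti-symmetric matrix from the example.
B = [[0, 4, 5], [-4, 0, 8], [-5, -8, 0]]

def quadratic_form(v, B):
    """Return v * B * v^T for a row vector v."""
    n = len(v)
    return sum(v[i] * B[i][j] * v[j] for i in range(n) for j in range(n))

def x1_row(x):
    """X_1 = [x1 + x2, x2 + x3, x3 + x4] evaluated at x = (x1, x2, x3, x4)."""
    return [x[0] + x[1], x[1] + x[2], x[2] + x[3]]

rng = random.Random(0)
values = [quadratic_form(x1_row([rng.randint(-50, 50) for _ in range(4)]), B)
          for _ in range(100)]
```

Every entry of `values` is zero, whatever point is sampled, because the terms B[i][j]·v_i·v_j and B[j][i]·v_j·v_i cancel in pairs.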
Corner translations
Let B_1, B_2, …, B_w be anti-symmetric w × w matrices. Let Y be the w × w matrix whose i-th column is B_i · X_1^T.
Corner translations
Replace X_2 with X'_2 = X_2 + Y. Observe that X_1 · X'_2 = X_1 · X_2, since X_1 · Y = 0: the i-th entry of X_1 · Y is X_1 · B_i · X_1^T = 0.
Hence f is computed by the product X_1 · X'_2 · X_3 · … · X_d = X_1 · (X_2 + Y) · X_3 · … · X_d.
Similarly, we can define corner translations for X_{d-1}.
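The construction of Y can be sketched at a fixed evaluation point as follows. The function names and the small w = 2 instance are assumptions chosen for illustration; `x1` holds the values of the linear forms of X_1 at the point, and `x2` the values of the entries of X_2.

```python
def corner_translation(x1, x2, Bs):
    """Return X_2 + Y, where column i of Y is B_i * X_1^T.

    x1: list of w numbers (the row X_1 at a point)
    x2: w x w matrix of numbers (X_2 at the same point)
    Bs: list of w anti-symmetric w x w matrices
    """
    w = len(x1)
    cols = [[sum(B[r][k] * x1[k] for k in range(w)) for r in range(w)]
            for B in Bs]
    return [[x2[r][i] + cols[i][r] for i in range(w)] for r in range(w)]

def row_times(x1, M):
    """Product of the row vector x1 with the matrix M."""
    return [sum(x1[k] * M[k][j] for k in range(len(x1)))
            for j in range(len(M[0]))]

x1 = [2, 5]
x2 = [[1, 2], [3, 4]]
Bs = [[[0, 1], [-1, 0]], [[0, -3], [3, 0]]]   # two anti-symmetric 2 x 2 matrices

x2_new = corner_translation(x1, x2, Bs)
before = row_times(x1, x2)
after = row_times(x1, x2_new)
```

Here `x2_new` differs from `x2`, yet `before` and `after` coincide, so the remaining layers X_3, …, X_d see the same row vector and the ABP still computes f.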
Uniqueness of the layer spaces
Suppose f is computable by a full rank ABP X_1 · X_2 · … · X_d.
By abuse of notation, let X_i also denote the Q-linear space spanned by the linear forms in the matrix X_i (the i-th layer space).
Let X_{1,2} and X_{d-1,d} denote the Q-linear spaces spanned by the linear forms in X_1, X_2 and in X_{d-1}, X_d, respectively.
Uniqueness of the layer spaces
If X'_1 · X'_2 · … · X'_d is a full rank ABP computing f, then its layer spaces satisfy either
➢ X'_i = X_i for i ∈ [d]\{2, d-1}, X'_{1,2} = X_{1,2} and X'_{d-1,d} = X_{d-1,d}, or
➢ X'_i = X_{d+1-i} for i ∈ [d]\{2, d-1}, X'_{1,2} = X_{d-1,d} and X'_{d-1,d} = X_{1,2}.
Uniqueness of the layer spaces
[Figure: the layer spaces in order: X_1, X_{1,2}, X_3, X_4, …, X_{d-2}, X_{d-1,d}, X_d.]
Uniqueness of the layer spaces
[Figure: the layer spaces in reverse order: X_d, X_{d-1,d}, X_{d-2}, X_{d-3}, …, X_3, X_{1,2}, X_1.]