Limits on Representing Functions by Linear Combinations of Simple Functions
Ryan Williams, MIT
[Figure: is $f : \{0,1\}^n \to \{0,1\}$ equivalent to a sum $\Sigma$ of "simple" functions?]
The $\mathbb{R}$-linear Representation Problem
Let $\mathcal{C}$ be a class of "simple" functions (they take Boolean inputs, but need not be Boolean-valued).
Which "interesting" functions $f$ can(not) be represented by "short" $\mathbb{R}$-linear combinations of functions from $\mathcal{C}$?
[Figure: $f : \{0,1\}^n \to \{0,1\}$ compared (≡?) with a $\Sigma$ gate whose incoming edges carry real coefficients and whose inputs are poly($n$) many "simple" functions; poly($n$) is the "size".]
Call this a $\Sigma \circ \mathcal{C}$ circuit.
Note: If $\mathcal{C}$ spans the vector space of all functions $f : \{0,1\}^n \to \mathbb{R}$, then there is always a $\Sigma \circ \mathcal{C}$ circuit of size $\le 2^n$.
The $\mathbb{R}$-linear Representation Problem
Which "interesting" functions $f$ can(not) be represented by "short" $\mathbb{R}$-linear combinations of functions from $\mathcal{C}$?
If $\mathcal{C}$ is the class of $2^n$ AND functions on $n$ variables: $\Sigma \circ \mathrm{AND}$ ≡ $0/1$ polynomials over $\mathbb{R}$.
If $\mathcal{C}$ is the class of $2^n$ PARITY functions on $n$ variables: $\Sigma \circ \mathrm{PARITY}$ ≡ $-1/1$ polynomials over $\mathbb{R}$ (Fourier analysis of Boolean functions).
These are well-understood: $\mathcal{C}$ is a basis for the vector space of functions $f : \{0,1\}^n \to \mathbb{R}$
⇒ the $\mathbb{R}$-linear representation of $f$ is unique, so the "shortest" is also the "longest".
More interesting cases: representations are not unique.
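To make the uniqueness claim for the AND basis concrete, here is a minimal sketch that recovers the unique coefficients of a function in the AND (monomial) basis by Möbius inversion over subsets. The helper names `and_basis_coefficients` and `evaluate` are illustrative and do not come from the slides; subsets of variables are encoded as bitmasks.

```python
from itertools import product

def and_basis_coefficients(f, n):
    """Return {mask: c} with f(x) = sum_S c_S * AND_{i in S} x_i on {0,1}^n.

    Uses Mobius inversion: c_S = sum over T subset of S of (-1)^(|S|-|T|) * f(1_T).
    """
    coeffs = {}
    for mask in range(2 ** n):
        total = 0
        sub = mask
        while True:
            x = tuple((sub >> i) & 1 for i in range(n))
            diff = bin(mask).count("1") - bin(sub).count("1")
            total += (-1 if diff % 2 else 1) * f(x)
            if sub == 0:
                break
            sub = (sub - 1) & mask  # next smaller subset of mask
        if total != 0:
            coeffs[mask] = total
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the linear combination of AND functions at the point x."""
    xmask = sum(bit << i for i, bit in enumerate(x))
    # The AND over S equals 1 exactly when S is a subset of the ones of x.
    return sum(c for mask, c in coeffs.items() if mask & xmask == mask)

# OR on two variables is x1 + x2 - x1*x2.
or_coeffs = and_basis_coefficients(lambda x: int(any(x)), 2)
```

Because the AND functions form a basis, the number of nonzero entries returned is the only possible size of a $\Sigma \circ \mathrm{AND}$ representation of that function.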
This Paper: Three Simple Classes
1. Linear Threshold Functions [LTF]
2. Rectified Linear Units [ReLU]
3. GF($p$)-Polynomials of Degree-$d$ [$\mathrm{POLY}_p[d]$] ($p$ prime and $d \ge 2$)
For all three classes:
• There are $\gg 2^n$ functions on $n$ variables, so $\mathbb{R}$-linear representations are not unique: $2^{\Theta(n^2)}$ LTFs, $2^{\Theta(n^d)}$ degree-$d$ polys, $\infty$ ReLU functions
• $\mathbb{R}$-linear representations have been studied!
  $\Sigma \circ \mathrm{LTF}$ = Special Case of Depth-2 Threshold Circuits
  $\Sigma \circ \mathrm{ReLU}$ = "Depth-2 Neural Net with ReLU activation"
  $\Sigma \circ \mathrm{POLY}_p[d]$ = "Higher-Order" Fourier Analysis for $d \ge 2$
Sums of Linear Threshold Functions
Def. $f : \{0,1\}^n \to \{0,1\}$ is an LTF if $\exists\, w_1, \ldots, w_n, t \in \mathbb{R}$ such that $\forall (x_1, \ldots, x_n) \in \{0,1\}^n$, $f(x_1, \ldots, x_n) = 1 \iff \sum_i w_i x_i \ge t$.
Depth-Two LTF Circuits ($\mathrm{LTF} \circ \mathrm{LTF}$): Major problem to find "nice" functions without $n^k$-gate $\mathrm{LTF} \circ \mathrm{LTF}$ circuits, for all $k$.
[Hajnal et al.'91] exp($n$) depth-two lower bounds for small $w_i$'s.
[Roychowdhury-Orlitsky-Siu'94] What about $\Sigma \circ \mathrm{LTF}$?
Special case of $\mathrm{LTF} \circ \mathrm{LTF}$: the linear form for the output LTF must always evaluate to 0 or 1.
Still, no $n^{?.?}$-gate lower bounds were known for $\Sigma \circ \mathrm{LTF}$!
We prove:
Thm $\forall k$, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\Sigma \circ \mathrm{LTF}$
Thm $\exists f \in \mathrm{NTIME}[n^{\log^* n}]$ without poly($n$)-size $\Sigma \circ \mathrm{LTF}$
Note: It is a major open problem to prove $\exists f \in \mathrm{NP}$ without $n^{?}$-size (unrestricted) circuits.
Sums of ReLUs
Def. $f : \mathbb{R}^n \to \mathbb{R}^+$ is a ReLU if $\exists\, w_1, \ldots, w_n, t \in \mathbb{R}$ such that $\forall (x_1, \ldots, x_n) \in \mathbb{R}^n$, $f(x_1, \ldots, x_n) = \max(0, \sum_i w_i x_i + t)$.
$\Sigma \circ \mathrm{ReLU}$ generalizes $\Sigma \circ \mathrm{LTF}$.
$\Sigma \circ \mathrm{ReLU}$ = "Depth-Two Neural Nets with ReLU Activations". Very widely studied, thousands of references.
Several recent references [see paper] give lower bounds for some "weird" $f : \mathbb{R}^n \to \mathbb{R}$ which vary sharply / are sensitive.
No lower bounds were known for discrete-domain / Boolean functions (note: the "most sensitive" Boolean function PARITY has $O(n)$-size $\Sigma \circ \mathrm{LTF}$).
We can generalize the $\Sigma \circ \mathrm{LTF}$ limits to $\Sigma \circ \mathrm{ReLU}$:
Thm $\forall k$, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\Sigma \circ \mathrm{ReLU}$
Thm $\exists f \in \mathrm{NTIME}[n^{\log^* n}]$ without poly($n$)-size $\Sigma \circ \mathrm{ReLU}$
Again: it is a major open problem to prove $\exists f \in \mathrm{NP}$ without $n^{?}$-size (unrestricted) circuits.
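Two claims on this slide can be checked by brute force on small inputs. The sketch below shows one standard way to write PARITY as a signed sum of $n$ LTFs, and one way to write an LTF as a difference of two ReLUs. The second construction assumes integer weights and an integer threshold, which is an assumption of this example and not a restriction stated on the slide. All function names are illustrative.

```python
from itertools import product

def ltf(weights, t):
    """LTF: outputs 1 iff sum_i w_i x_i >= t."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(weights, x)) >= t else 0

def relu(weights, t):
    """ReLU: outputs max(0, sum_i w_i x_i + t)."""
    return lambda x: max(0, sum(w * xi for w, xi in zip(weights, x)) + t)

def parity_as_sum_of_ltfs(n):
    # PARITY(x) = sum_{j=1..n} (-1)^(j+1) * [x_1 + ... + x_n >= j]
    return [((-1) ** (j + 1), ltf([1] * n, j)) for j in range(1, n + 1)]

def ltf_as_two_relus(weights, t):
    # For an integer-valued z = w.x and integer t (assumed here):
    # [z >= t] = ReLU(z - t + 1) - ReLU(z - t)
    return [(1, relu(weights, -t + 1)), (-1, relu(weights, -t))]

def evaluate(terms, x):
    """Evaluate a linear combination given as (coefficient, function) pairs."""
    return sum(c * g(x) for c, g in terms)

# Check both identities on every Boolean input.
parity_ok = all(evaluate(parity_as_sum_of_ltfs(5), x) == sum(x) % 2
                for x in product((0, 1), repeat=5))
```

The alternating sum works because an input of Hamming weight $w$ satisfies exactly the first $w$ thresholds, and $1 - 1 + 1 - \cdots$ over $w$ terms is $w \bmod 2$.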
Sums of Low-Degree GF(p)-Polys
$\Sigma \circ \mathrm{POLY}_p[d]$: Linear combination of functions $f : \{0,1\}^n \to \{0, 1, \ldots, p-1\}$, where for every $f$ there is a degree-$d$ polynomial $q(x)$ such that $\forall x \in \{0,1\}^n$, $f(x) = q(x) \bmod p$.
The case of $p = 2$, $d = 2$ is already very interesting!
Compelling Conjecture ["Degree-Two Uncertainty Principle"]: AND (on $n$ inputs) requires $2^{\Omega(n)}$-size $\Sigma \circ \mathrm{POLY}_2[2]$.
Known: AND requires $\Omega(2^n)$-size $\Sigma \circ \mathrm{POLY}_2[1]$; AND has $O(2^{n/2})$-size $\Sigma \circ \mathrm{POLY}_2[2]$.
No non-trivial lower bounds were known for $\Sigma \circ \mathrm{POLY}_p[d]$.
We prove:
Thm $\forall d, k$, $\forall p$ prime, $\exists f_k \in \mathrm{NP}$ without $n^k$-size $\Sigma \circ \mathrm{POLY}_p[d]$
Thm $\exists f \in \mathrm{NTIME}[n^{\log^* n}]$ without poly($n$)-size $\Sigma \circ \mathrm{POLY}_p[d]$, for all fixed $d$ and fixed prime $p$
A Key Theorem
A new instance of "Circuit Analysis Algorithms ⇒ Circuit Lower Bounds"
Key Theorem: Let $\mathcal{C}$ be a class of functions $f : \{0,1\}^n \to \mathbb{R}$.
Assume: there is an $\varepsilon > 0$ and an algorithm $A$ so that for any given $f_1, \ldots, f_k \in \mathcal{C}$, $A$ can compute the "sum-product"
$\sum_{x \in \{0,1\}^n} \prod_{i=1}^{k} f_i(x)$
in $2^{(1-\varepsilon)n}$ time.
Then: $\forall k$, $\exists f \in \mathrm{NP}$ without $n^k$-size $\Sigma \circ \mathcal{C}$, and $\exists f \in \mathrm{NTIME}[n^{\log^* n}]$ without poly($n$)-size $\Sigma \circ \mathcal{C}$.
Applies the new Easy Witness Lemma of [Murray-W'18].
We show how to compute sum-products in $2^{(1-\varepsilon)n}$ time for LTFs, ReLUs, and low-degree polynomials.
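For reference, the sum-product in the Key Theorem can always be computed by brute force in roughly $2^n \cdot k$ evaluations. The sketch below is only that trivial baseline, which the hypothesized algorithm $A$ must beat; the name `sum_product_bruteforce` and the two example LTFs are illustrative.

```python
from itertools import product
from math import prod

def sum_product_bruteforce(fs, n):
    """Compute sum over x in {0,1}^n of prod_i f_i(x) by full enumeration."""
    return sum(prod(f(x) for f in fs) for x in product((0, 1), repeat=n))

# Example with two LTFs on 3 variables (hypothetical choices):
majority = lambda x: 1 if x[0] + x[1] + x[2] >= 2 else 0
first_bit = lambda x: 1 if x[0] >= 1 else 0

common = sum_product_bruteforce([majority, first_bit], 3)
```

For 0/1-valued functions the sum-product counts the inputs on which all $k$ functions output 1, which is why it can stand in for a counting oracle.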
Major Ideas in the Key Theorem
Assume:
(1) There is a $2^{(1-\varepsilon)n}$-time sum-product algorithm $A$ for $\mathcal{C}$
(2) For some fixed $k$, all $f \in \mathrm{NP}$ have $n^k$-size $\Sigma \circ \mathcal{C}$
Goal: Derive a contradiction.
(1) and (2) ⇒ Given an (unrestricted) circuit $T$ with $n$ inputs and size $s$, we can guess-and-check an $n^k$-size $\Sigma \circ \mathcal{C}$ computing $T$, in $2^{(1-\varepsilon)n} \cdot s^{O(k)}$ time.
Note: to guess, we need that the coefficients in our linear combinations have "small" bit complexity, WLOG.
(1) ⇒ Can solve Circuit-UNSAT in nondeterministic $2^{(1-\varepsilon)n} \cdot s^{O(k)}$ time.
We can even solve #Circuit-SAT, because we can compute
$\sum_{x \in \{0,1\}^n} \sum_j \alpha_j f_j(x) = \sum_j \alpha_j \sum_{x \in \{0,1\}^n} f_j(x)$
by solving sum-product $n^k$ times.
[Murray-W'18] ⇒ $\forall k$, $\exists f \in \mathrm{NP}$ without $n^k$-size unrestricted circuits.
Contradicts (2) when $\Sigma \circ \mathcal{C}$ can be simulated by Boolean circuits!
The proof crucially relies on $\Sigma \circ \mathcal{C}$ computing a circuit exactly.
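The counting step uses only linearity: once a Boolean function is written exactly as a linear combination of functions from the class, its number of satisfying assignments is the same linear combination of the individual sums. A minimal sketch follows, with brute force standing in for the fast algorithm $A$ and a PARITY representation by threshold functions as the example; both choices, and all names, are assumptions of this illustration.

```python
from itertools import product

def sum_over_cube(f, n):
    # Stand-in for algorithm A with k = 1 (plain enumeration here).
    return sum(f(x) for x in product((0, 1), repeat=n))

def count_ones(terms, n):
    """terms: (alpha_j, f_j) pairs whose combination is 0/1-valued.

    Returns sum_x sum_j alpha_j f_j(x), computed as sum_j alpha_j * sum_x f_j(x).
    """
    return sum(alpha * sum_over_cube(f, n) for alpha, f in terms)

def at_least(j):
    """Threshold function [x_1 + ... + x_n >= j]."""
    return lambda x: 1 if sum(x) >= j else 0

n = 4
parity_terms = [((-1) ** (j + 1), at_least(j)) for j in range(1, n + 1)]
num_odd_inputs = count_ones(parity_terms, n)
```

The number of calls to the stand-in equals the number of terms in the representation, matching the "solve sum-product once per term" accounting on the slide.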
Sum-Product Algorithm for LTF
Uses (old) fact that #Subset-Sum is solvable in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
Thm [HS'76] #Subset-Sum on $n$ numbers is in $\mathrm{poly}(n) \cdot 2^{n/2}$ time.
Proof Given $w_1, \ldots, w_n, t$, we want to know the number of $S \subseteq [n]$ such that $\sum_{i \in S} w_i = t$.
1. Enumerate all possible $2^{n/2}$ subsets $S$ of $\{w_1, \ldots, w_{n/2}\}$. Make a list $L_1$ of the $2^{n/2}$ subset sums, and SORT all sums in $L_1$.
2. Enumerate all possible $2^{n/2}$ subsets $T$ of $\{w_{n/2+1}, \ldots, w_n\}$. For each $T$ summing to a value $v$, BINARY SEARCH for a value $v'$ in $L_1$ such that $v + v' = t$.
3. To compute the total number of subsets summing to $t$: for each sum value $v'$ appearing in $L_1$, store the number $n_{v'}$ of subsets in $L_1$ which have value $v'$. Later, if value $v'$ is found in the binary search, add $n_{v'}$ to a running sum.
Takes $\mathrm{poly}(n) \cdot 2^{n/2}$ time in total.
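The three steps above can be sketched as follows, assuming integer inputs. Instead of storing a separate multiplicity table $n_{v'}$, this version reads the multiplicity off the sorted list with two binary searches (`bisect_left` and `bisect_right`), which is a small variation on step 3 and not the slide's exact bookkeeping. The function names are illustrative.

```python
from bisect import bisect_left, bisect_right

def subset_sums(nums):
    """Return the list of all 2^len(nums) subset sums (with repetition)."""
    sums = [0]
    for a in nums:
        sums += [s + a for s in sums]
    return sums

def count_subset_sum(nums, target):
    """Count subsets of nums summing to target by meet-in-the-middle."""
    half = len(nums) // 2
    left = sorted(subset_sums(nums[:half]))   # step 1: enumerate and sort
    count = 0
    for v in subset_sums(nums[half:]):        # step 2: enumerate other half
        need = target - v
        # step 3: number of left-half subsets with sum exactly `need`
        count += bisect_right(left, need) - bisect_left(left, need)
    return count

example = count_subset_sum([1, 2, 3, 4, 5], 5)
```

Sorting dominates the cost of step 1 and each lookup in step 2 is logarithmic in the list length, which is where the $\mathrm{poly}(n) \cdot 2^{n/2}$ bound comes from.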
Sum-Product Algorithm for LTF
Uses (old) fact that #Subset-Sum is solvable in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
Thm For any $f_1, \ldots, f_k \in \mathrm{LTF}$, we can compute $\sum_{x \in \{0,1\}^n} \prod_{i=1}^{k} f_i(x)$ in $\mathrm{poly}(n) \cdot 2^{n/2}$ time.
Proof An Exact LTF (ELTF) has the form $f(x) = 1 \iff \sum_i w_i x_i = t$.
#Subset-Sum in $\mathrm{poly}(n) \cdot 2^{n/2}$ time ⇒ $\sum_x f(x)$ in $\mathrm{poly}(n) \cdot 2^{n/2}$ time for an ELTF $f$.
[HP, CCC'10]: Every LTF $f_i$ on $n$ inputs can be written as $f_i(x) = \sum_{j=1}^{\mathrm{poly}(n)} g_{i,j}(x)$ for ELTFs $g_{i,j}$.
So we can write
$\sum_{x \in \{0,1\}^n} \prod_{i=1}^{k} f_i(x) = \sum_{x \in \{0,1\}^n} \prod_{i=1}^{k} \sum_{j=1}^{\mathrm{poly}(n)} g_{i,j}(x)$
Simple algebra:
$= \sum_{x \in \{0,1\}^n} \sum_{j' \in [\mathrm{poly}(n)]^k} \prod_{i=1}^{k} g_{i,j'_i}(x) = \sum_{j' \in [\mathrm{poly}(n)]^k} \sum_{x \in \{0,1\}^n} \prod_{i=1}^{k} g_{i,j'_i}(x)$
Each $\prod_{i=1}^{k} g_{i,j'_i}(x) = g(x)$ for some ELTF $g$.
Can compute in $\mathrm{poly}(n) \cdot 2^{n/2}$ time!
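The slide does not spell out why a product of ELTFs is again an ELTF. One standard way to see it, assuming integer weights and thresholds (an assumption of this sketch, not something the slide states), is to pack the $k$ equations into a single equation using powers of a large enough base. The helper names below are illustrative, and the brute-force counter is only there to check the packing on a small example.

```python
from itertools import product

def combine_eltfs(eltfs):
    """eltfs: list of (weights, t) with integer entries, each meaning [w.x == t].

    Returns (big_w, big_t) such that [big_w.x == big_t] is the AND of all inputs.
    """
    n = len(eltfs[0][0])
    # bound >= |w.x - t| for every x in {0,1}^n and every given ELTF
    bound = max(sum(abs(w) for w in ws) + abs(t) for ws, t in eltfs)
    base = 2 * bound + 1  # digits in [-bound, bound] cannot cancel across positions
    big_w = [sum(base ** i * ws[j] for i, (ws, _) in enumerate(eltfs))
             for j in range(n)]
    big_t = sum(base ** i * t for i, (_, t) in enumerate(eltfs))
    return big_w, big_t

def eltf(ws, t):
    """Exact LTF: outputs 1 iff sum_i w_i x_i == t."""
    return lambda x: 1 if sum(w * xi for w, xi in zip(ws, x)) == t else 0

def product_count_bruteforce(eltfs, n):
    """Count inputs on which every ELTF in the list outputs 1."""
    gs = [eltf(ws, t) for ws, t in eltfs]
    return sum(all(g(x) for g in gs) for x in product((0, 1), repeat=n))

pair = [([1, 1, 1, 1], 2), ([1, -1, 0, 0], 0)]  # weight 2 and x1 == x2
packed = combine_eltfs(pair)
```

After packing, counting the inputs accepted by the single ELTF is one #Subset-Sum instance on the numbers `big_w` with target `big_t`, so the meet-in-the-middle bound applies to each term of the expanded sum.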