Learning Algorithms from Natural Lower Bounds
CCC 2016, June 11, 2016
Marco Carmosino (UCSD), Russell Impagliazzo (UCSD), Valentine Kabanets (SFU), Antonina Kolokolova (MUN)
Natural Proof of Circuit Lower Bounds for C
⇓
Learning Algorithm for C
Natural Proof = Barrier?
However: Natural Proof = Algorithm
Not A Surprise
"As long as we use natural proofs we have to cope with a duality: any lower bound proof must implicitly argue a proportionately strong upper bound." – Razborov, Rudich 1997
"With this duality in mind, it is no coincidence that the technical lemmas of [Hås87, Smo87, Raz87] yield much of the machinery for the learning result of [LMN93]." – Razborov, Rudich 1997
Theorem (Main Application of this work): There is a quasi-polynomial time membership-query learning algorithm for AC^0[p].
Open since:
Theorem ([LMN93]): There is a quasi-polynomial time uniform-sample learning algorithm for AC^0.
Algorithms ↔ Lower Bounds
Reminder: Circuits
Complexity Petting Zoo: an AC^0 circuit [figure]
Circuit Zoo
AC^0 ⊆ AC^0[p] ⊆ ACC^0 ⊆ TC^0 ⊆ P/poly
◮ AC^0: AND, OR, NOT; constant depth, poly size
◮ AC^0[p]: add counting modulo a prime p
◮ ACC^0: add counting modulo any m
◮ TC^0: add MAJORITY gates
◮ P/poly: unrestricted poly-size circuits
Circuit Lower Bounds: The Final Frontier
◮ P/poly: ??!?!?!?!?!!??
◮ TC^0: ????
◮ ACC^0: NEXP ⊄ ACC^0 [Wil11]
◮ AC^0[p]: AC^0[p] ⊊ ACC^0 [Raz87, Smo87]
◮ AC^0: AC^0 ⊊ AC^0[2] [FSS81]
Reminder: Pseudorandom Generators
Reminder: Natural Proofs
Natural Lower Bounds Against C (Razborov, Rudich 1997)
Natural Proofs embed an efficient algorithm for some property R:
◮ input: (truth table of) f : {0,1}^n → {0,1}
◮ task: distinguish "every f of C-circuit complexity ≤ u(n) has f ∉ R" (usefulness) versus "Pr[random f ∈ R] > 1/5" (largeness)
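To make the definition concrete, here is a minimal sketch (ours, not from the talk) of one classical natural property: the low-degree Fourier concentration test implicit in [LMN93], which is useful against AC^0 rather than the Razborov-Smolensky property against AC^0[p] that the main result relies on. The function names and thresholds (degree cutoff, mass cutoff 1/2) are illustrative choices.

```python
# Sketch of a Razborov-Rudich "natural property" R on truth tables.
# Constructive: runs in time polynomial in the truth-table length 2^n.
# Large: a random function puts most of its Fourier mass above any low degree cutoff.
# Useful against AC^0: by [LMN93], small constant-depth circuits concentrate
# almost all Fourier mass on low degrees, so they are rejected.

def walsh_hadamard(values):
    """Unnormalized fast Walsh-Hadamard transform of a +/-1 valued list."""
    a = list(values)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


def high_degree_mass(truth_table, degree_cutoff):
    """Fraction of f's Fourier weight on characters of degree > degree_cutoff."""
    signs = [1 - 2 * b for b in truth_table]          # 0/1 bits -> +1/-1 signs
    coeffs = walsh_hadamard(signs)
    total = sum(c * c for c in coeffs)                # Parseval: total = 2^(2n)
    heavy = sum(c * c for s, c in enumerate(coeffs)
                if bin(s).count("1") > degree_cutoff)
    return heavy / total


def natural_property_R(truth_table, degree_cutoff):
    """R = { f : at least half of f's Fourier mass lies above degree_cutoff }."""
    return high_degree_mass(truth_table, degree_cutoff) >= 0.5


if __name__ == "__main__":
    n = 4
    parity = [bin(x).count("1") % 2 for x in range(2 ** n)]   # maximally hard for AC^0
    and_all = [int(x == 2 ** n - 1) for x in range(2 ** n)]   # a tiny AC^0 circuit
    print(natural_property_R(parity, n // 2))    # True: all mass at degree n
    print(natural_property_R(and_all, n // 2))   # False: mass concentrated at degree 0
```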
[Figure: a bit string of length 2^n = (the truth table of) a function on n bits]
Prior Work
Prior Work: C-Lower Bounds to Algorithms
C-Lower Bounds ⇒ C-Meta-Algorithms (SAT, Learning, Compression):
◮ Formula-SAT [San10]
◮ AC^0-SAT [IMP12]
◮ AC^0-Learning [LMN93]
◮ AC^0-Compression [CKK+15]
◮ AC^0[p]-Compression [Sri15]
Prior Work: C-Algorithms to Lower Bounds
C-Meta-Algorithms (SAT, Learning, Compression) ⇒ C-Lower Bounds:
◮ fast C-LEARN ⇒ EXP ⊄ C [FK09, HH11, KKO13]
◮ fast C-SAT ⇒ NEXP ⊄ C [Wil10]
◮ NEXP ⊄ ACC^0 [Wil11]
◮ C-Compression ⇒ NEXP ⊄ C [CKK+15]
Prior Work: The Pattern
Implications run in both directions between C-Meta-Algorithms (SAT, Learning, Compression) and C-Lower Bounds.
C-Meta-Algorithms (SAT? Learning? Compression?)
? ⇑ ?
???
Randomized C-Learning Algorithm
⇑
(for "powerful enough" circuit classes C: C closed under poly-size AC^0[p]-reductions)
C-Learning f
Learning model:
◮ Membership queries: the learner may query f(x) for any x
◮ Uniform PAC: output a circuit H such that Pr_x[H(x) = f(x)] ≈ 1
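Below is a minimal sketch (ours; names such as estimate_agreement are placeholders, not from the paper) of what this model asks of a learner: query access to f while building the hypothesis, and success measured by agreement under the uniform distribution.

```python
import random
from typing import Callable

# Inputs x are encoded as n-bit integers; a "function" is any 0/1-valued callable.
BooleanFunction = Callable[[int], int]


def estimate_agreement(f: BooleanFunction, h: BooleanFunction,
                       n: int, samples: int = 10_000) -> float:
    """Estimate Pr_{x uniform in {0,1}^n}[h(x) = f(x)] by sampling.

    In the membership-query model the learner may also call f(x) on inputs x
    of its own choosing while constructing h; the success criterion is that
    this agreement probability is close to 1.
    """
    xs = (random.getrandbits(n) for _ in range(samples))
    return sum(f(x) == h(x) for x in xs) / samples
```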
Razborov, Rudich 1997
Natural Lower Bound Against C
⇓
Inverting Algorithm for every g ∈ C
MESSAGE: No Natural Lower Bounds for C ⊇ TC^0
This Work
Natural Lower Bound Against C
⇓
Learning Algorithm for every g ∈ C
MESSAGE: Hmmm.
Natural Lower Bound for AC^0[p] [RS]
⇓
Quasi-Polytime Learning Algorithm for AC^0[p]
Learning Algorithms: Compare & Contrast
[LMN] algorithm for AC^0:
◮ Uniform PAC model
◮ n^{poly(log n)} runtime
◮ Switching Lemma (an exp(n^{1/Ω(d)}) lower bound for depth d)
Our algorithm for AC^0[p]:
◮ Membership queries
◮ n^{poly(log n)} runtime
◮ RS lower bound (an exp(n^{1/Ω(d)}) lower bound for depth d)
The Proof
Tools
◮ NW Generator
◮ XOR Lemma
◮ Natural Properties
Notation
Agree-o-Meter
Let f : {0,1}^n → {0,1}. Define: [agreement-measure definition given as a figure]
Hardness
Hardness: [figure, a scale marking "Somewhat Hard" and "Very Hard" (not to scale)]
NW Generator Definition
NW Generator Definition: [figure labeling the seed s and the stretch of the generator's output]
NW Generator Definition: the Almost Disjoint Subset Multiplexer [figure]
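As a concrete illustration of the definition just sketched, here is a toy implementation (ours; nw_generator and greedy_design are made-up names, and the greedy design search is only a stand-in for the algebraic designs used in the real construction): the i-th output bit is h applied to the seed bits selected by the i-th set of an almost disjoint family, which is exactly the role of the subset multiplexer.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def nw_generator(h: Callable[[Sequence[int]], int],
                 seed: Sequence[int],
                 design: List[Sequence[int]]) -> List[int]:
    """NW_h(seed)_i = h(seed restricted to the i-th design set S_i)."""
    return [h([seed[j] for j in S]) for S in design]


def greedy_design(d: int, m: int, ell: int, max_overlap: int) -> List[Sequence[int]]:
    """Toy greedy search for ell size-m subsets of {0,...,d-1} with pairwise
    intersections of size <= max_overlap.  Exponential time; illustration only."""
    chosen: List[Sequence[int]] = []
    for S in combinations(range(d), m):
        if all(len(set(S) & set(T)) <= max_overlap for T in chosen):
            chosen.append(S)
            if len(chosen) == ell:
                return chosen
    raise ValueError("toy search found no design with these parameters")


if __name__ == "__main__":
    # Tiny example: h = parity on m = 3 bits, seed length d = 9, stretch ell = 6.
    def parity(bits: Sequence[int]) -> int:
        return sum(bits) % 2

    design = greedy_design(d=9, m=3, ell=6, max_overlap=1)
    seed = [1, 0, 1, 1, 0, 0, 1, 1, 0]
    print(nw_generator(parity, seed, design))   # 6 output bits from a 9-bit seed
```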
NW Generator
THEOREM: If h is a very hard function, then NW_h(s) is a PRG.
PROOF: Assume NW_h(s) is NOT a PRG. Then ∃ D: [figure]
NW Generator. Using D: [figure; D is turned into a small circuit that predicts h too well, contradicting h's hardness]
Yao's XOR Lemma
h^{⊕k}(x_1, ..., x_k) = h(x_1) ⊕ ··· ⊕ h(x_k)
THEOREM: If h is a somewhat hard function, then h^{⊕k} is a very hard function.
PROOF: Assume h^{⊕k} is NOT very hard. Then ∃ C: [figure]
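For reference, one standard quantitative form of the lemma (our paraphrase, with constants and the exact polynomial size loss elided; the talk only needs the qualitative "somewhat hard ⇒ very hard" version):

```latex
% "h is \delta-hard for size s": every circuit of size s disagrees with h on
% at least a \delta fraction of inputs.
\[
  h \text{ is } \delta\text{-hard for size } s
  \;\Longrightarrow\;
  h^{\oplus k} \text{ is } \big(\tfrac12 - \varepsilon\big)\text{-hard for size } s',
\]
\[
  \text{for any } \gamma > 0, \text{ with }
  \varepsilon = (1-\delta)^{k} + \gamma
  \text{ and } s' \text{ polynomially smaller than } s
  \text{ (the loss depends on } \gamma \text{ and } \delta\text{)}.
\]
```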
Yao's XOR Lemma Proof. Using C: [figure; C is turned into a circuit agreeing with h on most inputs, contradicting h being somewhat hard]
Hardness to Randomness
THEOREM: If h is a somewhat hard function, then NW_{h^{⊕k}}(s) is a PRG.
PROOF: Assume NW_{h^{⊕k}}(s) is NOT a PRG. Then compose the previous two proofs!
Main Proof Idea: Play to Lose
◮ PRG: ⊥ (the reduction yields no approximating circuit) = victory
◮ Learning: a circuit (approximating f) = victory
Learning Algorithm Idea: if f ∈ C ...
[Figure 1: composed arguments for easy f]
What's Missing?
CONDITION: ∀ f ∈ C, ∃ D a UNIFORM circuit such that...
[Figure 2: the circuit we need to learn f]
NATURAL PROPERTY R vs. C: ∀ f′ ∈ C[u], f′ ∉ R [figure]
COMPARE. [Figure 3: HAVE vs. WANT]
Key Lemma
f ∈ C[poly] ⇒ for every seed s, g_s has small C-circuits
Requirement: g_s ∉ R, which follows (by usefulness) once size(g_s) ≤ u(log ℓ)
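Our reading of why an easy f keeps each g_s easy, in the notation of the NW-generator slides (the design sets S_i, the amplified function f^{⊕k}); a sketch, not a verbatim statement from the paper:

```latex
% For a fixed seed s, the generator's output string is read as the truth table
% of a function on \log\ell bits:
\[
  g_s(i) \;=\; \big(\mathrm{NW}^{\,f^{\oplus k}}(s)\big)_i
        \;=\; f^{\oplus k}\!\big(s\restriction_{S_i}\big),
  \qquad i \in \{0,1\}^{\log \ell}.
\]
% With s hardwired, g_s is the almost disjoint subset multiplexer
% i \mapsto s\restriction_{S_i} composed with a circuit for f^{\oplus k}.
% So if the multiplexer has small AC^0[p] circuits and C is closed under
% AC^0[p]-reductions, then f \in C[\mathrm{poly}] keeps g_s in C at a size
% below the usefulness threshold u(\log\ell).
```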
Proof of Key Lemma: Almost Disjoint Subset Multiplexer
Runtime Analysis
Learning Runtime = poly(ℓ) (recall, ℓ is the stretch of our PRG)
[Figure: a bit string = (the truth table of) a function on n bits]
Constraints
1. Natural Property: need C-size(g_s) ≤ u(log ℓ) (recall: f ∈ C[poly] ⇒ for every seed s, g_s has small C-circuits)
2. Runtime (want to minimize): poly(ℓ)
Learning Algorithm Runtime: Usefulness vs. Size
◮ Exponential usefulness 2^{Ω(n)} ⇒ polynomial learning runtime
◮ Subexponential usefulness 2^{n^ε} ⇒ quasipolynomial learning runtime
◮ Superpolynomial usefulness n^{ω(1)} ⇒ subexponential learning runtime
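A back-of-the-envelope reading of this table (our simplification, assuming the C-size of each g_s is n^{O(1)} when f ∈ C[poly(n)]): the constraint n^{O(1)} ≤ u(log ℓ) determines how small the stretch ℓ, and hence the poly(ℓ) runtime, can be.

```latex
\[
  u(m) = 2^{\Omega(m)} \;\Rightarrow\; \log\ell = O(\log n)
         \;\Rightarrow\; \mathrm{poly}(\ell) = \mathrm{poly}(n),
\]
\[
  u(m) = 2^{m^{\varepsilon}} \;\Rightarrow\; \log\ell = (\log n)^{O(1/\varepsilon)}
         \;\Rightarrow\; \mathrm{poly}(\ell) = n^{\mathrm{poly}(\log n)},
\]
\[
  u(m) = m^{\omega(1)} \;\Rightarrow\; \log\ell = n^{o(1)}
         \;\Rightarrow\; \mathrm{poly}(\ell) = 2^{n^{o(1)}}.
\]
```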
[Figure: a poly(n) lower bound yields exponential learning runtime; a 2^{Ω(n)} lower bound yields polynomial runtime]
Randomized C-Learning Algorithm
⇑
(for "powerful enough" circuit classes C: C closed under poly-size AC^0[p]-reductions)
Main Application: Learning AC^0[p] in quasi-polynomial time
Summary
◮ Natural Proofs ⇒ Learning (and Compression) Algorithms
◮ BPP-Natural Proofs for C ⇔ Randomized Compression for C (answers an open question of [CKKSZ15])
◮ Randomized quasi-polytime learning algorithm for AC^0[p] for any prime p
Future Work
◮ Natural proof of Williams' ACC^0 lower bound?
◮ Natural Proofs for C ⇒? C-SAT algorithms
◮ Other learning models?
◮ Banish serendipity?
◮ Derandomization?
◮ Limits to "grey-box" algorithm design?
References
[CKK+15] Ruiwen Chen, Valentine Kabanets, Antonina Kolokolova, Ronen Shaltiel, and David Zuckerman. Mining circuit lower bound proofs for meta-algorithms. Computational Complexity, 24(2):333–392, 2015.
[FK09] Lance Fortnow and Adam R. Klivans. Efficient learning algorithms yield circuit lower bounds. J. Comput. Syst. Sci., 75(1):27–36, 2009.
[FSS81] Merrick L. Furst, James B. Saxe, and Michael Sipser. Parity, circuits, and the polynomial-time hierarchy. In 22nd Annual Symposium on Foundations of Computer Science (FOCS 1981), pages 260–270. IEEE Computer Society, 1981.
[Hås87] Johan Håstad. Computational Limitations of Small-Depth Circuits. MIT Press, Cambridge, MA, USA, 1987.
[HH11] Ryan C. Harkins and John M. Hitchcock. Exact learning algorithms, betting games, and circuit lower bounds. In Automata, Languages and Programming (ICALP 2011), Proceedings, Part I, volume 6755 of Lecture Notes in Computer Science, pages 416–423. Springer, 2011.
[IMP12] Russell Impagliazzo, William Matthews, and Ramamohan Paturi. A satisfiability algorithm for AC^0. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), pages 961–972. SIAM, 2012.
[KKO13] Adam Klivans, Pravesh Kothari, and Igor Carboni Oliveira. Constructing hard functions using learning algorithms.