Preserving Randomness for Adaptive Algorithms
William M. Hoza¹ and Adam R. Klivans
August 20, 2018
RANDOM
¹ Supported by the NSF GRFP under Grant DGE-1610403 and by a Harrington Fellowship from UT Austin

Randomized estimation algorithms
◮ Algorithm $\mathrm{Est}(C)$ estimates some value $\mu(C) \in \mathbb{R}^d$:
    $\Pr[\|\mathrm{Est}(C) - \mu(C)\|_\infty > \varepsilon] \le \delta$
◮ Canonical example:
  ◮ $C$ is a Boolean circuit
  ◮ $\mu(C) \stackrel{\mathrm{def}}{=} \Pr_x[C(x) = 1]$ ($d = 1$)
  ◮ $\mathrm{Est}(C)$ evaluates $C$ at several randomly chosen points
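
As a minimal sketch of the canonical example, assume the circuit is modelled as a Python function from a tuple of bits to 0 or 1. The names `est`, `num_samples`, and `majority3` are illustrative and do not come from the talk. The sample count uses Hoeffding's inequality, which is one standard way to meet the $(\varepsilon, \delta)$ guarantee; the slides do not say how many points Est evaluates.

```python
import math
import random


def num_samples(eps, delta):
    """Samples sufficient for |mean - mu| <= eps with probability >= 1 - delta,
    by Hoeffding's inequality: 2 * exp(-2 * m * eps**2) <= delta."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))


def est(circuit, num_inputs, eps, delta, rng):
    """Estimate Pr_x[circuit(x) = 1] by averaging over uniformly random inputs."""
    m = num_samples(eps, delta)
    hits = 0
    for _ in range(m):
        # One uniformly random input point: num_inputs fresh random bits.
        x = tuple(rng.getrandbits(1) for _ in range(num_inputs))
        hits += circuit(x)
    return hits / m


def majority3(x):
    """Toy stand-in for a Boolean circuit: majority of three bits (mu = 1/2)."""
    return 1 if sum(x) >= 2 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    print(est(majority3, 3, eps=0.05, delta=0.01, rng=rng))
```

In this sketch the number of random bits $n$ consumed by one call is `num_samples(eps, delta) * num_inputs`.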

Using Est as a subroutine
[Diagram: Owner and Steward exchange messages over $k$ rounds. In round $i$ the message $C_i$ is paired with the reply $Y_i \approx \mu(C_i)$, for $i = 1, 2, \dots, k$.]
◮ Suppose Est uses $n$ random bits
◮ Naïvely, total number of random bits = $nk$
◮ Can we do better?
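
The naïve approach can be sketched as follows, assuming a toy Est that reads its input points directly from an explicit list of random bits. The names `est`, `naive_steward`, and `toy_owner`, and the rule the owner uses to pick its next circuit, are hypothetical and serve only to show the adaptive interaction and the $nk$ bit count.

```python
import random


def est(circuit, num_inputs, bits):
    """Toy Est(C, X): average C over the input points packed into the bit list X."""
    points = [tuple(bits[j:j + num_inputs])
              for j in range(0, len(bits), num_inputs)]
    return sum(circuit(p) for p in points) / len(points)


def naive_steward(owner, k, n, num_inputs, rng):
    """Answer k adaptively chosen queries, drawing n fresh random bits for each."""
    bits_used = 0
    replies = []
    previous = None
    for _ in range(k):
        circuit = owner(previous)  # the owner may adapt to the last reply
        x = [rng.getrandbits(1) for _ in range(n)]
        bits_used += n
        previous = est(circuit, num_inputs, x)
        replies.append(previous)
    return replies, bits_used


def or2(x):
    return x[0] | x[1]


def and2(x):
    return x[0] & x[1]


def toy_owner(previous):
    """Hypothetical owner: choose the next circuit based on the previous estimate."""
    if previous is None or previous < 0.5:
        return or2
    return and2


replies, bits_used = naive_steward(toy_owner, k=5, n=200, num_inputs=2,
                                   rng=random.Random(1))
print(bits_used)  # 1000, i.e. n * k
```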

Main result
◮ Theorem (informal): There is a steward that uses just $n + O(k \log(d + 1))$ random bits!
◮ Mild increases in both error and failure probability
◮ Prior work:
  ◮ [Saks, Zhou '99], [Impagliazzo, Zuckerman '89] both imply stewards
  ◮ Our steward has better parameters
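
To get a feel for the savings, here is a worked comparison under assumed parameters. The values $n = 3180$, $k = 100$, $d = 1$, the base-2 logarithm, and the constant $c$ standing in for the one hidden by the $O(\cdot)$ are all illustrative assumptions; none of them come from the talk.

```latex
\begin{align*}
\text{na\"ive: } nk &= 3180 \cdot 100 = 318{,}000 \text{ bits}, \\
\text{steward: } n + c\,k \log_2(d + 1) &= 3180 + c \cdot 100 \cdot \log_2 2 = 3180 + 100c \text{ bits}.
\end{align*}
```

For any moderate $c$, the steward's count is dominated by the single $n$ term rather than growing as $nk$.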

Outline of our steward
1. Compute pseudorandom bits $X_i \in \{0, 1\}^n$
2. Compute $W_i := \mathrm{Est}(C_i, X_i)$
3. Compute $Y_i$ by carefully modifying $W_i$

Pseudorandom bits
◮ $\mathrm{Gen} : \{0, 1\}^s \to \{0, 1\}^{nk}$: variant of the INW pseudorandom generator
◮ Before the first round, the steward computes $(X_1, X_2, \dots, X_k) = \mathrm{Gen}(U_s)$
◮ In round $i$, the steward runs $\mathrm{Est}(C_i, X_i)$

Shifting and rounding
[Figure: animation labelled with the length $(d + 1) \cdot 2\varepsilon$, the true value $\mu(C_i)$, and the estimate $W_i$; in the final frame $W_i$ is replaced by $Y_i$.]

Analysis
◮ Theorem (informal): With high probability, for every $i$, $\|Y_i - \mu(C_i)\|_\infty \le O(\varepsilon d)$.
◮ Notation: For $W \in \mathbb{R}^d$ and $\Delta \in [d + 1]$, define $\lfloor W \rceil_\Delta \in \mathbb{R}^d$ by rounding each coordinate to the nearest value $y$ such that $y \equiv 2\varepsilon\Delta \pmod{(d + 1) \cdot 2\varepsilon}$
◮ In this notation, $Y_i = \lfloor W_i \rceil_\Delta$ for a suitable $\Delta \in [d + 1]$
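
The rounding operation can be sketched in a few lines. The slides say only that $\Delta$ is "suitable"; the selection rule below (pick the shift that keeps every coordinate of $W_i$ as far as possible from a rounding boundary) is an assumption made for illustration, and the function names are hypothetical. A pigeonhole argument suggests why $d + 1$ shifts are enough: taken over all shifts, the rounding boundaries are spaced $2\varepsilon$ apart, so each of the $d$ coordinates is strictly closer than $\varepsilon$ to a boundary for at most one shift.

```python
import math


def round_with_shift(w, shift, eps):
    """Round each coordinate of w to the nearest y with
    y ≡ 2*eps*shift (mod (d+1)*2*eps), where d = len(w)."""
    d = len(w)
    period = (d + 1) * 2 * eps
    offset = 2 * eps * shift
    return [offset + period * math.floor((x - offset) / period + 0.5)
            for x in w]


def margin(w, shift, eps):
    """Smallest distance from any coordinate of w to a rounding boundary."""
    d = len(w)
    period = (d + 1) * 2 * eps
    rounded = round_with_shift(w, shift, eps)
    # Boundaries sit half a period away from each allowed value.
    return min(period / 2 - abs(x - y) for x, y in zip(w, rounded))


def choose_shift(w, eps):
    """Assumed rule: take the shift in {1, ..., d+1} with the largest margin."""
    d = len(w)
    return max(range(1, d + 2), key=lambda shift: margin(w, shift, eps))


w = [0.37, 0.95]   # an estimate W_i with d = 2
eps = 0.1
shift = choose_shift(w, eps)
print(shift, round_with_shift(w, shift, eps))
```

Rounding moves each coordinate by at most $(d + 1)\varepsilon$, half the spacing between allowed values, which is consistent with the $O(\varepsilon d)$ error in the theorem.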

Analysis (continued)
◮ $Y_i = \lfloor W_i \rceil_\Delta$
◮ If $X_i$ = fresh randomness, then w.h.p., $\lfloor W_i \rceil_\Delta = \lfloor \mu(C_i) \rceil_\Delta$
◮ Consider $d + 2$ cases:
  ◮ $Y_i = \lfloor \mu(C_i) \rceil_1$, or
  ◮ $Y_i = \lfloor \mu(C_i) \rceil_2$, or
  ...
  ◮ $Y_i = \lfloor \mu(C_i) \rceil_{d+1}$, or
  ◮ $Y_i$ = something else.

Block decision tree
[Figure: tree with root $C_1$. Its children are $C_2^{(1)}, C_2^{(2)}, C_2^{(3)}, \bot$; the children of $C_2^{(2)}$ are $C_3^{(2,1)}, C_3^{(2,2)}, C_3^{(2,3)}, \bot$.]
◮ A sequence $(X_1, \dots, X_k)$ determines:
  ◮ A transcript $(C_1, Y_1, C_2, Y_2, \dots, C_k, Y_k)$
  ◮ A path $P$ through the tree
◮ If we pick $X_1, \dots, X_k$ independently and u.a.r.,
    $\Pr_{(X_1, \dots, X_k)}[P \text{ has a } \bot \text{ node}] \le k\delta$
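
The $k\delta$ bound is a union bound over the $k$ rounds. The sketch below assumes, as the tree picture suggests, that a $\bot$ node in round $i$ corresponds to the "something else" case, and that this case can only occur when Est errs by more than $\varepsilon$ in that round.

```latex
\begin{align*}
\Pr_{(X_1, \dots, X_k)}[P \text{ has a } \bot \text{ node}]
  &\le \sum_{i=1}^{k} \Pr\bigl[\|\mathrm{Est}(C_i, X_i) - \mu(C_i)\|_\infty > \varepsilon\bigr] \\
  &\le \sum_{i=1}^{k} \delta = k\delta .
\end{align*}
```

The second inequality uses that $C_i$ depends only on $X_1, \dots, X_{i-1}$, so $X_i$ is fresh randomness for the query $C_i$.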

Fooling the tree
[Figure: the same block decision tree as on the previous slide.]
◮ Tree has low memory
◮ So when $X_1, \dots, X_k$ are pseudorandom,
    $\Pr_{(X_1, \dots, X_k)}[P \text{ has a } \bot \text{ node}] \le k\delta + \gamma$