


  1. Sublinear Algorithms, Lecture 4
     Last time
     • Testing if a graph is connected.
     • Estimating the number of connected components.
     • Estimating the weight of an MST.
     Today
     • Limitations of sublinear-time algorithms
     • Yao’s Minimax Principle
     9/15/2020, Sofya Raskhodnikova, Boston University

  2. Query Complexity
     • Query complexity of an algorithm is the maximum number of queries the algorithm makes.
       – Usually expressed as a function of input length (and other parameters)
       – Example: the test for sortedness (from Lecture 2) had query complexity $O(\log n)$ for constant $\varepsilon$, more precisely $O\left(\frac{\log n}{\varepsilon}\right)$
       – running time ≥ query complexity
     • Query complexity of a problem $P$, denoted $q(P)$, is the query complexity of the best algorithm for the problem.
       – What is $q(\text{testing sortedness})$? How do we know that there is no better algorithm?
     Today: Techniques for proving lower bounds on $q(P)$.
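To make the definition of query complexity concrete, here is a minimal Python sketch that counts queries through an oracle wrapper. It is an illustration only: `CountingOracle`, `binary_search`, and `query_complexity` are hypothetical names, and binary search is used as a stand-in algorithm, not as the sortedness test from Lecture 2.

```python
class CountingOracle:
    """Wraps an input list and counts how many positions are queried."""

    def __init__(self, data):
        self._data = list(data)
        self.queries = 0

    def __len__(self):
        return len(self._data)

    def query(self, i):
        self.queries += 1
        return self._data[i]


def binary_search(oracle, target):
    """Return the first index whose value is >= target (input must be sorted)."""
    lo, hi = 0, len(oracle)
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle.query(mid) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def query_complexity(n):
    """Maximum number of queries over all targets, on the sorted input 0..n-1."""
    worst = 0
    for target in range(-1, n + 1):
        oracle = CountingOracle(range(n))
        binary_search(oracle, target)
        worst = max(worst, oracle.queries)
    return worst
```

The maximum is taken over the instances tried, which mirrors the "maximum number of queries" in the definition; running time can only be larger, since every query costs at least one step.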

  3. Yao’s Principle: A Method for Proving Lower Bounds

  4. Yao’s Minimax Principle
     Consider a computational problem on a finite domain.
     • The following statements are equivalent.
       Statement 1: For any probabilistic algorithm A of complexity $q$ there exists an input $x$ s.t. $\Pr_{\text{coin tosses of } A}[A(x) \text{ is wrong}] > 1/3$.
       Statement 2: There is a distribution D on the inputs, s.t. for every deterministic algorithm A of complexity $q$, $\Pr_{x \leftarrow D}[A(x) \text{ is wrong}] > 1/3$.
     • Need for lower bounds: Yao’s Minimax Principle (easy direction): Statement 2 ⇒ Statement 1.

  5. Proof of Easy Direction of Yao’s Principle
     • Consider a finite set of inputs $X$ (e.g., all inputs of length $n$).
     • Consider a randomized algorithm that takes an input $x \in X$, makes $\le q$ queries to $x$ and outputs accept or reject.
     • Every randomized algorithm can be viewed as a distribution $\mu$ on deterministic algorithms (which are decision trees).
     • Let $Y$ be the set of all $q$-query deterministic algorithms that run on inputs in $X$.

  6. Proof of Easy Direction of Yao’s Principle
     • Consider a matrix $M$ with
       – rows indexed by inputs $x$ from $X$,
       – columns indexed by algorithms $y$ from $Y$,
       – entry $M(x, y) = 1$ if algorithm $y$ is correct on input $x$, and $M(x, y) = 0$ if algorithm $y$ is wrong on input $x$.

                $y_1$   $y_2$   …
       $x_1$      1       0
       $x_2$      1       1
       …                        ⋱

     • Then an algorithm A is a distribution $\mu$ over columns $Y$ with probabilities satisfying $\sum_{y \in Y} \mu(y) = 1$.

  7. Rephrasing Statements 1 and 2 in Terms of M
     Statement 1: For any probabilistic algorithm A of complexity $q$ there exists an input $x$ s.t. $\Pr_{\text{coin tosses of } A}[A(x) \text{ is wrong}] > 1/3$.
     • For all distributions $\mu$ over columns $Y$, there exists a row $x$ s.t. $\Pr_{y \leftarrow \mu}[M(x, y) = 0] > 1/3$.
     Statement 2: There is a distribution D on the inputs, s.t. for every deterministic algorithm A of complexity $q$, $\Pr_{x \leftarrow D}[A(x) \text{ is wrong}] > 1/3$.
     • There is a distribution D over rows $X$, s.t. for all columns $y$, $\Pr_{x \leftarrow D}[M(x, y) = 0] > 1/3$.

  8. Statement 2 ⇒ Statement 1
     • Suppose there is a distribution D over $X$, s.t. for all columns $y$, $\Pr_{x \leftarrow D}[M(x, y) = 0] > 1/3$.
     • Then for all distributions $\mu$ over $Y$, $\Pr_{x \leftarrow D,\, y \leftarrow \mu}[M(x, y) = 0] > 1/3$.
     • Then for all distributions $\mu$ over $Y$, there exists a row $x$ s.t. $\Pr_{y \leftarrow \mu}[M(x, y) = 0] > 1/3$.
     (Figure: the matrix $M$ with rows $x_1, x_2, \dots$ and columns $y_1, y_2, \dots$.)
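The two "Then" steps on this slide are averaging arguments. Spelled out, assuming $x \leftarrow D$ and $y \leftarrow \mu$ are drawn independently and writing $D(x)$ for the probability D assigns to row $x$ (notation introduced here for the derivation):

```latex
\begin{align*}
\Pr_{x \leftarrow D,\, y \leftarrow \mu}[M(x,y)=0]
  &= \sum_{y \in Y} \mu(y) \Pr_{x \leftarrow D}[M(x,y)=0]
   > \sum_{y \in Y} \mu(y) \cdot \tfrac{1}{3} = \tfrac{1}{3}, \\
\Pr_{x \leftarrow D,\, y \leftarrow \mu}[M(x,y)=0]
  &= \sum_{x \in X} D(x) \Pr_{y \leftarrow \mu}[M(x,y)=0]
   \le \max_{x \in X} \Pr_{y \leftarrow \mu}[M(x,y)=0].
\end{align*}
```

Combining the two lines, the row $x$ attaining the maximum has error probability greater than $1/3$ under $\mu$, which is Statement 1.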

  9. Yao’s Principle (Easy Direction)
     Statement 1: For any probabilistic algorithm A of complexity $q$ there exists an input $x$ s.t. $\Pr_{\text{coin tosses of } A}[A(x) \text{ is wrong}] > 1/3$.
     Statement 2: There is a distribution D on the inputs, s.t. for every deterministic algorithm A of complexity $q$, $\Pr_{x \leftarrow D}[A(x) \text{ is wrong}] > 1/3$.
     • Need for lower bounds: Yao’s Minimax Principle (easy direction): Statement 2 ⇒ Statement 1.
     NOTE: Also applies to restricted algorithms
     • 1-sided error tests
     • nonadaptive tests

  10. Yao’s Minimax Principle as a game
     Players: Evil algorithms designer Al and poor lower bound prover Lola.
     Game 1
       Move 1. Al selects a $q$-query randomized algorithm A for the problem.
       Move 2. Lola selects an input on which A errs with largest probability.
     Game 2
       Move 1. Lola selects a distribution on inputs.
       Move 2. Al selects a $q$-query deterministic algorithm with as large probability of success on Lola’s distribution as possible.
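The relation between the two games can be checked numerically on a toy correctness matrix. The sketch below is illustrative only: the 3×3 matrix `M`, the distributions `D` and `mu`, and all function names are made up for the example. It verifies that Al's best deterministic error against Lola's distribution is at most the joint error, which in turn is at most the error of Al's randomized algorithm on Lola's worst input.

```python
from fractions import Fraction


def column_errors(M, D):
    """Error of each deterministic algorithm (column) when the input is drawn from D."""
    rows, cols = len(M), len(M[0])
    return [sum(D[x] for x in range(rows) if M[x][y] == 0) for y in range(cols)]


def row_errors(M, mu):
    """Error of the randomized algorithm mu on each fixed input (row)."""
    rows, cols = len(M), len(M[0])
    return [sum(mu[y] for y in range(cols) if M[x][y] == 0) for x in range(rows)]


def joint_error(M, D, mu):
    """Error when the input is drawn from D and the algorithm from mu, independently."""
    rows, cols = len(M), len(M[0])
    return sum(D[x] * mu[y] for x in range(rows) for y in range(cols) if M[x][y] == 0)


# Toy instance (an assumption for illustration): each algorithm errs on exactly one input.
M = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
D = [Fraction(1, 3)] * 3                               # Lola's distribution on inputs
mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # Al's randomized algorithm

game2_value = min(column_errors(M, D))  # Al's best deterministic error against D
game1_value = max(row_errors(M, mu))    # error of mu on Lola's worst input
```

For this instance, `game2_value <= joint_error(M, D, mu) <= game1_value` holds, as the easy direction of the principle requires for every choice of `D` and `mu`.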

  11. Toy Example: a Lower Bound for Testing 0*
     Input: string of $n$ bits
     Question: Does the string contain only 0’s or is it $\varepsilon$-far from the all-0 string?
     Claim. Any algorithm needs $\Omega(1/\varepsilon)$ queries to answer this question w.p. $\ge 2/3$.
     Proof: By Yao’s Minimax Principle, enough to prove Statement 2.
     Distribution D on $n$-bit strings
     • Divide the input string into $1/\varepsilon$ blocks of size $\varepsilon n$.
     • Let $y_i$ be the string where the $i$th block is 1s and remaining bits are 0.
     • Distribution D gives the all-0 string w.p. 1/2 and $y_i$ w.p. 1/2, where $i$ is chosen uniformly at random from $1, \dots, 1/\varepsilon$.
     (Figure: the string $y_2$ with four blocks of size $\varepsilon n$: 0000000 | 1111111 | 0000000 | 0000000.)

  12. A Lower Bound for Testing 0*
     Claim. Any $\varepsilon$-test for 0* needs $\Omega(1/\varepsilon)$ queries.
     Proof (continued): Now fix a deterministic tester A making $q < \frac{1}{3\varepsilon}$ queries.
     1. A must accept if all answers are 0. Otherwise, it would be wrong on all-0 string, that is, with probability 1/2 with respect to D.
     2. Let $i_1, \dots, i_q$ be the positions A queries when it sees only 0s. The test can choose its queries based on previous answers. However, since all these answers are 0 and since A is deterministic, the query positions are fixed.
     • At least $\frac{1}{\varepsilon} - q > \frac{2}{3\varepsilon}$ of the blocks do not hold any queried indices.
     • Therefore, A accepts $> 2/3$ of the inputs $y_i$. Thus, it is wrong with probability $> \frac{2}{3\varepsilon} \cdot \frac{\varepsilon}{2} = \frac{1}{3}$.
     (Figure: the string $y_2$ with four blocks of size $\varepsilon n$: 0000000 | 1111111 | 0000000 | 0000000.)
     Context: [Alon Krivelevich Newman Szegedy 99] Every regular language can be tested in $O(1/\varepsilon \cdot \mathrm{polylog}(1/\varepsilon))$ time.
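The counting in this proof can be reproduced exactly for small parameters. The following Python sketch is a hedged illustration: it assumes the tester accepts exactly when every queried bit is 0 (the most favorable behavior given step 1), indexes blocks from 0, and uses hypothetical function names. It computes the error under D from the set of queried positions.

```python
from fractions import Fraction
from itertools import combinations


def error_probability(num_blocks, block_size, queried):
    """Exact error under D of a tester that accepts iff every queried bit is 0.

    D gives the all-0 string w.p. 1/2, and otherwise y_i for a uniformly
    random block i; the tester errs exactly on those y_i whose block
    contains no queried position.
    """
    hit_blocks = {pos // block_size for pos in queried}
    missed = num_blocks - len(hit_blocks)
    return Fraction(1, 2) * Fraction(missed, num_blocks)


def smallest_error(num_blocks, block_size, q):
    """Minimum error over all choices of q query positions (brute force)."""
    n = num_blocks * block_size
    return min(error_probability(num_blocks, block_size, qs)
               for qs in combinations(range(n), q))
```

With $\varepsilon = 1/10$ (10 blocks) and $q = 3 < 1/(3\varepsilon)$, `smallest_error(10, 2, 3)` evaluates to 7/20, which exceeds 1/3; with $q = 4$ the best query set already brings the error down to 3/10.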

  13. A Lower Bound for Testing Sortedness
     Input: a list of $n$ numbers $x_1, x_2, \dots, x_n$
     Question: Is the list sorted or $\varepsilon$-far from sorted?
     Already saw: two different $O((\log n)/\varepsilon)$ time testers.
     Known [Ergün Kannan Kumar Rubinfeld Viswanathan 98, Fischer 01]: $\Omega(\log n)$ queries are required for all constant $\varepsilon \le 1/2$.
     Today: $\Omega(\log n)$ queries are required for all constant $\varepsilon \le 1/2$ for every 1-sided error nonadaptive test.
     • A test has 1-sided error if it always accepts all YES instances.
     • A test is nonadaptive if its queries do not depend on answers to previous queries.
     (Figure: 1-sided Error Property Tester. YES instances: accept with probability $\ge 2/3$. Instances $\varepsilon$-far from YES: reject with probability $\ge 2/3$. In between: don’t care.)

  14. 1-Sided Error Tests Must Catch “Mistakes”
     • A pair $(i, j)$ is violated if $i < j$ but $x_i > x_j$.
     Claim. A 1-sided error test can reject only if it finds a violated pair.
     Proof: Every sorted partial list can be extended to a sorted list.
     (Figure: a partial list 1 ? ? 4 … 7 ? ? 9.)

  15. Yao’s Principle Game [Jha]
     Lola’s distribution is uniform over the following $\log n$ lists (shown for $n = 16$):

       $\ell_1$:           1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
       $\ell_2$:           1 1 1 1 0 0 0 0 2 2 2 2 1 1 1 1
       $\ell_3$:           1 1 0 0 2 2 1 1 3 3 2 2 4 4 3 3
       …
       $\ell_{\log n}$:    1 0 2 1 3 2 4 3 5 4 6 5 7 6 8 7

     Claim 1. All lists above are 1/2-far from sorted.
     Claim 2. Every pair $(i, j)$ is violated in exactly one list above.
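Both claims can be checked by brute force for small $n$. The Python sketch below is a hedged reconstruction: the general rule in `lola_lists` (list $k$ uses blocks of size $n/2^k$, grouped in pairs, with the left block one larger than the right) is inferred from the $n = 16$ example above and is an assumption, as are all function names. Positions are 0-indexed.

```python
from bisect import bisect_right
from itertools import combinations


def lola_lists(n):
    """Return the log2(n) lists for n a power of two (rule inferred from n = 16)."""
    lists = []
    block = n // 2
    while block >= 1:
        row = []
        for pos in range(n):
            b = pos // block   # index of the block containing pos
            pair = b // 2      # blocks are grouped into consecutive pairs
            row.append(pair + 1 if b % 2 == 0 else pair)
        lists.append(row)
        block //= 2
    return lists


def longest_sorted_subsequence(row):
    """Length of the longest non-decreasing subsequence (patience sorting)."""
    tails = []
    for v in row:
        k = bisect_right(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(tails)


def violation_counts(n):
    """For each pair i < j, the number of lists in which it is violated."""
    lists = lola_lists(n)
    return {(i, j): sum(1 for row in lists if row[i] > row[j])
            for i, j in combinations(range(n), 2)}
```

For Claim 1, a list of length $n$ whose longest non-decreasing subsequence has length $n/2$ needs at least $n/2$ entries changed to become sorted. For Claim 2, every value in `violation_counts(n)` should equal 1.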

  16. Yao’s Principle Game: Al’s Move
     Al picks a set $Q = \{a_1, a_2, \dots, a_{|Q|}\}$ of positions to query, with $a_1 < a_2 < \dots < a_{|Q|}$.
     (Figure: a list with unknown entries ? ? ? ? at positions $a_1, a_2, a_3, \dots, a_{|Q|}$.)
     • His test must be correct, i.e., must find a violated pair with probability $\ge 2/3$ when input is picked according to Lola’s distribution.
     • $Q$ contains a violated pair ⇔ $(a_i, a_{i+1})$ is violated for some $i$.
       By the Union Bound,
       $\Pr_{\ell \leftarrow \text{Lola's distribution}}[(a_i, a_{i+1}) \text{ for some } i \text{ is violated in list } \ell] \le \frac{|Q| - 1}{\log n}$.
     • If $|Q| \le \frac{2}{3} \log n$ then this probability is $< \frac{2}{3}$.
     • So, $|Q| = \Omega(\log n)$.
     • By Yao’s Minimax Principle, every randomized 1-sided error nonadaptive test for sortedness must make $\Omega(\log n)$ queries.
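The bound $(|Q| - 1)/\log n$ can be confirmed by enumeration for $n = 16$. This Python sketch is illustrative only: `lola_lists` rebuilds Lola's lists using a rule inferred from the $n = 16$ example (an assumption), positions are 0-indexed, and the other function names are hypothetical.

```python
from itertools import combinations


def lola_lists(n):
    """Lola's log2(n) lists for n a power of two (rule inferred from n = 16)."""
    lists = []
    block = n // 2
    while block >= 1:
        lists.append([(p // block) // 2 + (1 if (p // block) % 2 == 0 else 0)
                      for p in range(n)])
        block //= 2
    return lists


def lists_caught(lists, positions):
    """Number of lists in which some adjacent queried pair is violated."""
    q = sorted(set(positions))
    return sum(1 for row in lists
               if any(row[a] > row[b] for a, b in zip(q, q[1:])))


def max_caught(n, size):
    """Largest number of lists any query set of the given size can catch."""
    lists = lola_lists(n)
    return max(lists_caught(lists, qs) for qs in combinations(range(n), size))
```

Dividing `lists_caught` by $\log n$ gives Al's detection probability under Lola's uniform distribution. For $n = 16$, no query set of size $s \le 5$ catches more than $s - 1$ of the four lists, and the set `[0, 8, 12, 14, 15]` catches all four.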

  17. Testing Monotonicity of Functions on the Hypercube: Nonadaptive 1-Sided Error Lower Bound

  18. Boolean Functions $f : \{0,1\}^n \to \{0,1\}$
     Graph representation: $n$-dimensional hypercube
     • $2^n$ vertices: bit strings of length $n$
     • $2^{n-1} n$ edges: $(x, y)$ is an edge if $y$ can be obtained from $x$ by increasing one bit from 0 to 1, e.g., $x = 001001$, $y = 011001$
     • each vertex $x$ is labeled with $f(x)$
     (Figure: the 3-dimensional hypercube with vertices labeled $f(000)$, $f(100)$, $f(010)$, $f(001)$, $f(110)$, $f(101)$, $f(011)$, $f(111)$.)

  19. Boolean Functions $f : \{0,1\}^n \to \{0,1\}$
     Graph representation: $n$-dimensional hypercube
     • $2^n$ vertices: bit strings of length $n$
     • $2^{n-1} n$ edges: $(x, y)$ is an edge if $y$ can be obtained from $x$ by increasing one bit from 0 to 1, e.g., $x = 001001$, $y = 011001$
     • each vertex $x$ is labeled with $f(x)$
     (Figure: the hypercube drawn with vertices in order of increasing weight, from $f(00 \cdots 00)$ at the bottom to $f(11 \cdots 11)$ at the top.)
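The graph representation is easy to build explicitly for small $n$. The Python sketch below is illustrative: it represents vertices as tuples of bits, and `is_monotone` assumes the standard definition of monotonicity ($f(x) \le f(y)$ on every edge), which these slides have not yet stated. Function names are hypothetical.

```python
from itertools import product


def hypercube_edges(n):
    """All pairs (x, y) where y is x with one bit raised from 0 to 1."""
    edges = []
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                edges.append((x, y))
    return edges


def is_monotone(f, n):
    """Check f(x) <= f(y) on every edge (assumed definition of monotone)."""
    return all(f(x) <= f(y) for x, y in hypercube_edges(n))
```

Each of the $2^n$ vertices has $n$ incident edges and every edge is counted from its lower endpoint only, which gives the $2^{n-1} n$ edge count on the slide; `len(hypercube_edges(n))` reproduces it.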
