
Strengths and weaknesses of quantum examples
Srinivasan Arunachalam (MIT)
Joint with Ronald de Wolf (CWI, Amsterdam) and others


Complexity of learning

How do we measure the efficiency of a classical or quantum learner?
- Sample complexity: the number of labeled examples used by the learner
- Time complexity: the number of time steps used by the learner

In this talk:

Strengths of quantum examples
- ACLW'18: sample complexity of learning Fourier-sparse Boolean functions under the uniform distribution D
- Bshouty-Jackson'95: quantum polynomial-time learnability of DNFs under the uniform distribution D
- ACKW'18: quantum examples can help the coupon collector

Weaknesses of quantum examples
- AW'17: quantum examples are not more powerful than classical examples for PAC learning

Fourier sampling: a useful trick under the uniform distribution D

Let c : {0,1}ⁿ → {−1,1}. The Fourier coefficients of c are

  ĉ(S) = (1/2ⁿ) ∑_{x ∈ {0,1}ⁿ} c(x)(−1)^{S·x}   for all S ∈ {0,1}ⁿ

Parseval's identity: ∑_S ĉ(S)² = E_x[c(x)²] = 1, so {ĉ(S)²}_S forms a probability distribution.

Given a quantum example under the uniform distribution D, a Hadamard transform maps

  (1/√2ⁿ) ∑_x |x, c(x)⟩   →   ∑_S ĉ(S) |S⟩

Measuring then allows us to sample from the Fourier distribution {ĉ(S)²}_S.
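A minimal classical simulation of the Fourier-sampling trick (a sketch, not from the talk): it computes all Fourier coefficients of a Boolean function by direct summation and then samples S with probability ĉ(S)², which Parseval guarantees is a valid distribution. The example concept `majority3` is a hypothetical choice.

```python
import itertools
import numpy as np

def fourier_coefficients(c, n):
    """All 2^n Fourier coefficients c_hat(S) of c : {0,1}^n -> {-1,1}."""
    coeffs = {}
    for S in itertools.product([0, 1], repeat=n):
        total = sum(c(x) * (-1) ** (sum(s * xi for s, xi in zip(S, x)) % 2)
                    for x in itertools.product([0, 1], repeat=n))
        coeffs[S] = total / 2 ** n
    return coeffs

def fourier_sample(coeffs, rng):
    """Sample S with probability c_hat(S)^2 (sums to 1 by Parseval)."""
    labels = list(coeffs)
    probs = np.array([coeffs[S] ** 2 for S in labels])
    return labels[rng.choice(len(labels), p=probs)]

def majority3(x):
    """Hypothetical example concept: majority of 3 bits, +/-1-valued."""
    return 1 if sum(x) >= 2 else -1

rng = np.random.default_rng(0)
coeffs = fourier_coefficients(majority3, 3)
print({S: v for S, v in coeffs.items() if abs(v) > 1e-9})  # Fourier support
print([fourier_sample(coeffs, rng) for _ in range(5)])     # samples from {c_hat(S)^2}
```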

Applications of Fourier sampling

Consider the concept class of linear functions C₁ = {c_S(x) = S·x}_{S ∈ {0,1}ⁿ}:
- Classical: Ω(n) classical examples are needed.
- Quantum: 1 quantum example suffices to learn C₁ (Bernstein-Vazirani'93); see the sketch below.

Consider C₂ = {c : c is an ℓ-junta}, i.e., c(x) depends on only ℓ bits of x:
- Classical: efficient learning is notoriously hard for ℓ = O(log n), even under the uniform D.
- Quantum: C₂ can be learned exactly using Õ(2^ℓ) quantum examples and in time Õ(n·2^ℓ + 2^{2ℓ}) (Atıcı-Servedio'09).

Generalizing both of these concept classes:

Definition: we say c is k-Fourier-sparse if |{S : ĉ(S) ≠ 0}| ≤ k.

Note that C₁ is 1-Fourier-sparse and C₂ is 2^ℓ-Fourier-sparse. Consider the concept class C = {c : {0,1}ⁿ → {−1,1} : c is k-Fourier-sparse}. Then C₁ ⊆ C (C contains the linear functions) and C₂ ⊆ C (C contains the (log k)-juntas).
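A quick check of the Bernstein-Vazirani claim above (a sketch reusing the hypothetical `fourier_coefficients` and `fourier_sample` helpers from the previous block): for a linear function, written in the ±1 convention as c_S(x) = (−1)^{S·x}, all the Fourier weight sits on the single coefficient ĉ(S) = 1, so one Fourier sample reveals the hidden S with certainty.

```python
import numpy as np

S_hidden = (1, 0, 1, 1)  # hypothetical hidden string defining c_S

def c_S(x):
    """Linear concept c_S(x) = (-1)^{S.x} in the +/-1 convention."""
    return (-1) ** (sum(s * xi for s, xi in zip(S_hidden, x)) % 2)

rng = np.random.default_rng(1)
coeffs = fourier_coefficients(c_S, 4)   # point mass: c_hat(S_hidden) = 1
print(fourier_sample(coeffs, rng))      # always prints (1, 0, 1, 1)
```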

Learning C = {c : c is k-Fourier-sparse}

Exact learning of C under the uniform distribution D:
- Classically (Haviv-Regev'15): Θ̃(nk) classical examples (x, c(x)) are necessary and sufficient to learn the concept class C.
- Quantumly (ACLW'18): Õ(k^{1.5}) quantum examples (1/√2ⁿ) ∑_x |x, c(x)⟩ suffice to learn C (independent of the universe size n), and Ω̃(k) quantum examples are necessary.

Sketch of the upper bound (the span-collection step is illustrated in the code below):
- Use Fourier sampling to sample S ∼ {ĉ(S)²}_S.
- Collect S's until the learner knows the Fourier span of c, V = span{S : ĉ(S) ≠ 0}.
- If dim(V) = r, then Õ(rk) quantum examples suffice to find V.
- Use the result of [HR'15] to learn c completely using Õ(rk) classical examples.
- Since r ≤ Õ(√k) for every c ∈ C [Sanyal'15], this gives the Õ(k^{1.5}) upper bound.
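A toy illustration of the span-collection step above (hypothetical code, reusing the helpers from the Fourier-sampling block; the stopping rule is a simple heuristic, not the talk's analysis): keep Fourier-sampling and track the GF(2) span of the sampled S's by Gaussian elimination until it stops growing.

```python
import numpy as np

def gf2_rank(vectors):
    """Rank over GF(2) of a list of 0/1 vectors, by Gaussian elimination."""
    rows = [list(v) for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def collect_fourier_span(coeffs, rng, patience=25):
    """Fourier-sample until the GF(2) span of the samples stops growing."""
    samples, rank, stale = [], 0, 0
    while stale < patience:                 # heuristic stopping rule
        samples.append(fourier_sample(coeffs, rng))
        new_rank = gf2_rank(samples)
        stale = 0 if new_rank > rank else stale + 1
        rank = new_rank
    return rank

rng = np.random.default_rng(2)
# Dimension of the Fourier span of majority3 (its support spans all of F_2^3)
print(collect_fourier_span(fourier_coefficients(majority3, 3), rng))
```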

Learning Disjunctive Normal Forms (DNFs)

A DNF is simply an OR of ANDs of variables, e.g. (x₁ ∧ x₄ ∧ x₃) ∨ (x₄ ∧ x₆ ∧ x₇ ∧ x₈). We say a DNF on n variables is an s-term DNF if its number of terms is at most s.

Learning C = {c : c is an s-term DNF on n variables} under the uniform D:
- Classically: efficient learning from examples is a longstanding open question; the best known upper bound is n^{O(log n)} [Verbeurgt'90].
- Quantumly: Bshouty-Jackson'95 gave a polynomial-time quantum algorithm!

Proof sketch of the quantum upper bound (a toy version of the weak learner follows this list):
- Structural property: if c is an s-term DNF, then there exists U such that |ĉ(U)| ≥ 1/s.
- Fourier sampling: sample T ∼ {ĉ(T)²}_T poly(s) many times to see such a U.
- Construct a "weak learner" that outputs χ_U such that Pr[χ_U(x) = c(x)] ≥ 1/2 + 1/s.
- This is not good enough: we want a hypothesis that agrees with c on most inputs x.
- Boosting: run the weak learner many times, in a suitable manner, to obtain a strong learner that outputs h satisfying Pr[h(x) = c(x)] ≥ 2/3.
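A toy check of the weak-learner step (a sketch; the 2-term DNF and the reuse of `fourier_coefficients` are hypothetical, and a full algorithm would boost this learner under reweighted distributions): find the largest Fourier coefficient of a small DNF and measure how well the corresponding signed parity χ_U agrees with c.

```python
import itertools
import numpy as np

def dnf(x):
    """Hypothetical 2-term DNF (x0 AND x1) OR (x2 AND x3), +/-1-valued."""
    return 1 if (x[0] and x[1]) or (x[2] and x[3]) else -1

n = 4
coeffs = fourier_coefficients(dnf, n)                  # from the earlier sketch
U, cU = max(coeffs.items(), key=lambda kv: abs(kv[1]))
sign = 1 if cU >= 0 else -1

# Agreement of the weak hypothesis sign * chi_U with c over all inputs
agree = np.mean([
    sign * (-1) ** (sum(u * xi for u, xi in zip(U, x)) % 2) == dnf(x)
    for x in itertools.product([0, 1], repeat=n)
])
print(U, round(cU, 3), agree)   # agreement noticeably above 1/2
```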

Pretty Good Measurement for state identification

Consider a concept class C consisting of n-bit Boolean functions, and let D : {0,1}ⁿ → [0,1] be a distribution. For c ∈ C, a quantum example is

  |ψ_c⟩ = ∑_{x ∈ {0,1}ⁿ} √D(x) |x, c(x)⟩

State identification: for a uniformly random (unknown) c ∈ C, given |ψ_c⟩^{⊗T}, identify c.

The optimal measurement could be quite complicated, but we can always use the Pretty Good Measurement (PGM). If P_opt is the success probability of the optimal measurement and P_pgm is the success probability of the PGM, then

  P_opt ≥ P_pgm ≥ P_opt²   (Barnum-Knill'02)
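A small numerical sketch of the PGM (an illustration with arbitrary example states, not the talk's construction): for an ensemble of pure states |ψ_c⟩ with uniform prior p_c, the PGM elements are E_c = p_c ρ^{−1/2} |ψ_c⟩⟨ψ_c| ρ^{−1/2} with ρ = ∑_c p_c |ψ_c⟩⟨ψ_c|, and its success probability is ∑_c p_c ⟨ψ_c|E_c|ψ_c⟩.

```python
import numpy as np

def pgm_success(states):
    """Success probability of the PGM for a uniform ensemble of pure states."""
    p = 1.0 / len(states)
    rho = sum(p * np.outer(v, v.conj()) for v in states)
    # rho^{-1/2} on the support of rho (pseudo-inverse square root)
    w, V = np.linalg.eigh(rho)
    inv_sqrt = V @ np.diag([x ** -0.5 if x > 1e-12 else 0.0 for x in w]) @ V.conj().T
    total = 0.0
    for v in states:
        E_v = p * inv_sqrt @ np.outer(v, v.conj()) @ inv_sqrt   # PGM element for v
        total += p * np.real(v.conj() @ E_v @ v)
    return total

# Two arbitrary non-orthogonal states: the PGM succeeds with probability
# strictly between 1/2 (random guessing) and 1 (orthogonal states)
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0]) / np.sqrt(2)
print(pgm_success([a, b]))
```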

Quantum examples help the coupon collector

Standard coupon collector problem: suppose there are N coupons. How many coupons must we draw (with replacement) before having seen each coupon at least once?
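The classical answer, which the ACKW'18 quantum result improves on, is Θ(N log N) draws in expectation (exactly N·H_N, where H_N is the N-th harmonic number). A quick simulation of the classical process (a sketch, not from the slides):

```python
import numpy as np

def coupon_draws(N, rng):
    """Uniform draws with replacement until all N coupons have been seen."""
    seen, draws = set(), 0
    while len(seen) < N:
        seen.add(int(rng.integers(N)))
        draws += 1
    return draws

rng = np.random.default_rng(3)
N = 100
trials = [coupon_draws(N, rng) for _ in range(200)]
print(np.mean(trials), N * sum(1 / k for k in range(1, N + 1)))  # empirical mean vs. N*H_N
```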
