Complexity of learning

How to measure the efficiency of the classical or quantum learner?
- Sample complexity: number of labeled examples used by the learner
- Time complexity: number of time-steps used by the learner

In this talk

Strengths of quantum examples
- ACLW'18: sample complexity of learning Fourier-sparse Boolean functions under uniform $D$
- Bshouty-Jackson'95: quantum polynomial-time learnability of DNFs under uniform $D$
- ACKW'18: quantum examples can help the coupon collector

Weaknesses of quantum examples
- AW'17: quantum examples are not more powerful than classical examples for PAC learning
Fourier sampling: a useful trick under uniform $D$

Let $c : \{0,1\}^n \to \{-1,1\}$. Then the Fourier coefficients are
$$\hat{c}(S) = \frac{1}{2^n} \sum_{x \in \{0,1\}^n} c(x)\,(-1)^{S \cdot x} \quad \text{for all } S \in \{0,1\}^n$$
- Parseval's identity: $\sum_S \hat{c}(S)^2 = \mathbb{E}_x[c(x)^2] = 1$
- So $\{\hat{c}(S)^2\}_S$ forms a probability distribution
- Given a quantum example under uniform $D$:
$$\frac{1}{\sqrt{2^n}} \sum_x |x, c(x)\rangle \;\xrightarrow{\text{Hadamard}}\; \sum_S \hat{c}(S)\,|S\rangle$$
- Measuring allows the learner to sample from the Fourier distribution $\{\hat{c}(S)^2\}_S$
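For small $n$, the Fourier-sampling step can be simulated classically by brute force (a sketch; the helper names are illustrative, not from the talk):

```python
import itertools
import random

def fourier_coefficients(c, n):
    """Brute-force Fourier coefficients of c: {0,1}^n -> {-1,1}."""
    points = list(itertools.product([0, 1], repeat=n))
    return {S: sum(c(x) * (-1) ** sum(s * xi for s, xi in zip(S, x))
                   for x in points) / 2 ** n
            for S in points}

# Example: the parity c(x) = (-1)^(x1 XOR x3) on n = 3 bits
n = 3
c = lambda x: (-1) ** (x[0] ^ x[2])
coeffs = fourier_coefficients(c, n)

# Parseval: the squared coefficients sum to 1, so they form a distribution
probs = {S: v ** 2 for S, v in coeffs.items()}
assert abs(sum(probs.values()) - 1) < 1e-9

# "Fourier sampling": draw S with probability c_hat(S)^2
S = random.choices(list(probs), weights=list(probs.values()))[0]
print(S)  # (1, 0, 1): the only nonzero coefficient of this parity
```

The quantum learner gets such a sample from one quantum example; classically this simulation needs all $2^n$ values of $c$.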
Applications of Fourier sampling

Consider the concept class of linear functions $C_1 = \{c_S(x) = S \cdot x\}_{S \in \{0,1\}^n}$
- Classical: $\Omega(n)$ classical examples needed
- Quantum: 1 quantum example suffices to learn $C_1$ (Bernstein-Vazirani'93)

Consider $C_2 = \{c \text{ is an } \ell\text{-junta}\}$, i.e., $c(x)$ depends on only $\ell$ bits of $x$
- Classical: efficient learning is notoriously hard for $\ell = O(\log n)$ and uniform $D$
- Quantum: $C_2$ can be exactly learnt using $\tilde{O}(2^\ell)$ quantum examples and in time $\tilde{O}(n 2^\ell + 2^{2\ell})$ (Atıcı-Servedio'09)

Generalizing both these concept classes?
- Definition: we say $c$ is $k$-Fourier-sparse if $|\{S : \hat{c}(S) \neq 0\}| \leq k$.
- Note that $C_1$ is $1$-Fourier-sparse and $C_2$ is $2^\ell$-Fourier-sparse
- Consider the concept class $C = \{c : \{0,1\}^n \to \{-1,1\} : c \text{ is } k\text{-Fourier-sparse}\}$
- Observe that $C_1 \subseteq C$: $C$ contains linear functions
- Observe that $C_2 \subseteq C$: $C$ contains $(\log k)$-juntas
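The two sparsity claims can be checked by brute force on a toy instance (function names are mine; this is feasible only for small $n$):

```python
import itertools

def fourier_sparsity(c, n):
    """Count nonzero Fourier coefficients of c: {0,1}^n -> {-1,1}."""
    points = list(itertools.product([0, 1], repeat=n))
    count = 0
    for S in points:
        total = sum(c(x) * (-1) ** sum(s * xi for s, xi in zip(S, x))
                    for x in points)
        if abs(total) > 1e-9:
            count += 1
    return count

n = 4
# A linear function c_S with S = (1,1,0,0): exactly 1 nonzero coefficient
linear = lambda x: (-1) ** (x[0] ^ x[1])
print(fourier_sparsity(linear, n))  # 1

# A 2-junta depending on x1, x2 only (an AND): at most 2^2 = 4 coefficients
junta = lambda x: -1 if (x[0] and x[1]) else 1
print(fourier_sparsity(junta, n))  # 4
```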
Learning $C = \{c \text{ is } k\text{-Fourier-sparse}\}$

Exact learning of $C$ under the uniform distribution $D$
- Classically (Haviv-Regev'15): $\tilde{\Theta}(nk)$ classical examples $(x, c(x))$ are necessary and sufficient to learn the concept class $C$
- Quantumly (ACLW'18): $\tilde{O}(k^{1.5})$ quantum examples $\frac{1}{\sqrt{2^n}} \sum_x |x, c(x)\rangle$ are sufficient to learn $C$ (independent of the universe size $n$), and $\tilde{\Omega}(k)$ examples are necessary

Sketch of upper bound
- Use Fourier sampling to sample $S \sim \{\hat{c}(S)^2\}_S$
- Collect $S$'s until the learner learns the Fourier span of $c$, $V = \mathrm{span}\{S : \hat{c}(S) \neq 0\}$
- Suppose $\dim(V) = r$; then $\tilde{O}(rk)$ quantum examples suffice to find $V$
- Use the result of [HR'15] to learn $c$ completely using $\tilde{O}(rk)$ classical examples
- Since $r \leq \tilde{O}(\sqrt{k})$ for every $c \in C$ [Sanyal'15], we get the $\tilde{O}(k^{1.5})$ upper bound
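The span-collection step above reduces to linear algebra over GF(2): the learner keeps sampling $S$'s and tracks the rank of the set collected so far. A toy sketch (the Fourier support below is hand-picked for illustration, not from the talk):

```python
import random

def rank_gf2(vectors):
    """Rank of a set of 0/1 vectors over GF(2), via Gaussian elimination."""
    rows = [list(v) for v in vectors]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# Hand-picked Fourier support of a 4-Fourier-sparse function on 4 bits;
# sampling S ~ {c_hat(S)^2} is mimicked here by uniform draws from the support
support = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
rng = random.Random(0)
samples = [rng.choice(support) for _ in range(50)]

# The collected S's span the Fourier span V; here dim(V) = r = 2
print(rank_gf2(samples))  # 2
```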
Learning Disjunctive Normal Forms (DNFs)

DNFs
- Simply an OR of ANDs of variables. For example, $(x_1 \wedge x_4 \wedge x_3) \vee (x_4 \wedge x_6 \wedge x_7 \wedge x_8)$
- We say a DNF on $n$ variables is an $s$-term DNF if the number of clauses is $\leq s$

Learning $C = \{c \text{ is an } s\text{-term DNF on } n \text{ variables}\}$ under uniform $D$
- Classically: efficient learning from examples is a longstanding open question. The best known upper bound is $n^{O(\log n)}$ [Verbeurgt'90]
- Quantumly: Bshouty-Jackson'95 gave a polynomial-time quantum algorithm!

Proof sketch of quantum upper bound
- Structural property: if $c$ is an $s$-term DNF, then there exists $U$ s.t. $|\hat{c}(U)| \geq 1/s$
- Fourier sampling! Sample $T \sim \{\hat{c}(T)^2\}_T$ poly$(s)$ many times to see such a $U$
- Construct a "weak learner" that outputs $\chi_U$ s.t. $\Pr[\chi_U(x) = c(x)] \geq \frac{1}{2} + \frac{1}{s}$
- Not good enough! We want a hypothesis that agrees with $c$ on most inputs $x$
- Boosting: run the weak learner many times, in a suitable manner, to obtain a strong learner that outputs $h$ satisfying $\Pr[h(x) = c(x)] \geq 2/3$
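The structural property can be verified by brute force on a toy DNF. Note that the $1/s$ stated above is a simplification: the precise constant in Jackson's harmonic-sieve analysis is $1/(2s+1)$, and that is the bound this sketch checks:

```python
import itertools

def fourier_max(c, n):
    """Largest |c_hat(S)| of c: {0,1}^n -> {-1,1}, by brute force."""
    points = list(itertools.product([0, 1], repeat=n))
    return max(abs(sum(c(x) * (-1) ** sum(s * xi for s, xi in zip(S, x))
                       for x in points)) / 2 ** n
               for S in points)

# A 2-term DNF on 4 bits: (x1 AND x2) OR (x3 AND x4), with outputs in {-1,1}
s = 2
dnf = lambda x: -1 if ((x[0] and x[1]) or (x[2] and x[3])) else 1

# Some coefficient has magnitude >= 1/(2s+1), Jackson's precise constant
assert fourier_max(dnf, 4) >= 1 / (2 * s + 1)
print(fourier_max(dnf, 4))  # 0.375, comfortably above 1/5
```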
Pretty good measurement for state identification

- Consider a concept class $C$ consisting of $n$-bit Boolean functions, and let $D : \{0,1\}^n \to [0,1]$ be a distribution
- For $c \in C$, a quantum example is $|\psi_c\rangle = \sum_{x \in \{0,1\}^n} \sqrt{D(x)}\, |x, c(x)\rangle$
- State identification: for uniformly random $c \in C$ (unknown), given $|\psi_c\rangle^{\otimes T}$, identify $c$
- The optimal measurement could be quite complicated, but we can always use the Pretty Good Measurement (PGM)
- If $P_{\mathrm{opt}}$ is the success probability of the optimal measurement and $P_{\mathrm{pgm}}$ is the success probability of the PGM, then $P_{\mathrm{opt}} \geq P_{\mathrm{pgm}} \geq P_{\mathrm{opt}}^2$ (Barnum-Knill'02)
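A numerical sketch of the Barnum-Knill sandwich, using the Gram-matrix form of the PGM success probability for a uniform pure-state ensemble, $P_{\mathrm{pgm}} = \sum_i \big(\sqrt{G/m}\big)_{ii}^2$ (for two equiprobable pure states the PGM actually achieves the Helstrom optimum, so the left inequality is tight in this example):

```python
import numpy as np

def pgm_success(states):
    """P_pgm for identifying a uniformly random pure state from `states`,
    via the Gram matrix: P_pgm = sum_i sqrt(G/m)_{ii}^2."""
    m = len(states)
    G = np.array([[np.dot(a, b) for b in states] for a in states]) / m
    w, U = np.linalg.eigh(G)
    sqrtG = U @ np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T
    return float(np.sum(np.diag(sqrtG) ** 2))

# Two real pure states with overlap cos(theta)
theta = 0.4
psi1 = np.array([1.0, 0.0])
psi2 = np.array([np.cos(theta), np.sin(theta)])

p_pgm = pgm_success([psi1, psi2])
# Helstrom bound: optimal success probability for two equiprobable pure states
p_opt = 0.5 * (1 + np.sqrt(1 - np.dot(psi1, psi2) ** 2))

# Barnum-Knill sandwich: P_opt >= P_pgm >= P_opt^2
assert p_opt + 1e-9 >= p_pgm >= p_opt ** 2 - 1e-9
print(p_pgm, p_opt)  # equal here: the PGM is optimal for two pure states
```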
Quantum examples help the coupon collector

Standard coupon collector problem: Suppose there are $N$ coupons. How many coupons must we draw (with replacement) before having seen each coupon at least once?
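For reference, the classical answer (not stated on this slide) is $N \cdot H_N \approx N \ln N$ draws in expectation, where $H_N$ is the $N$-th harmonic number. A quick simulation, with illustrative parameters:

```python
import random

def collect(N, rng):
    """Draw uniform coupons with replacement until all N have been seen."""
    seen, draws = set(), 0
    while len(seen) < N:
        seen.add(rng.randrange(N))
        draws += 1
    return draws

N = 20
expected = N * sum(1 / i for i in range(1, N + 1))  # N * H_N ~ N ln N

rng = random.Random(1)
avg = sum(collect(N, rng) for _ in range(2000)) / 2000
print(round(expected, 1), round(avg, 1))  # the two agree closely
```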