Stochastic Subgradient Descent

Convex function $f : C \to \mathbb{R}$ on a convex set $C$ (not necessarily differentiable).

Subgradient at $x$: slope $g(x)$ of any line that lies below the graph of $f$ and touches it at $x$.

Stochastic subgradient at $x$: random variable $\tilde{g}(x)$ satisfying $\mathbb{E}[\tilde{g}(x)] = g(x)$.

(Projected) stochastic subgradient descent: take a step $x_t - \eta\,\tilde{g}(x_t)$, then project back onto $C$ to obtain $x_{t+1}$.

If $\tilde{g}(x)$ has low variance, then the number of steps is the same as if we were using $g(x)$.
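A minimal sketch of the projected method, with an illustrative toy problem that is not from the talk (minimizing $\mathbb{E}|a^\top x - b|$ over the Euclidean unit ball); the oracle and projection are assumptions chosen for the example.

```python
import numpy as np

def projected_ssgd(stoch_subgrad, project, x0, eta, num_steps):
    """Projected stochastic subgradient descent.

    stoch_subgrad(x): unbiased estimate of a subgradient of f at x.
    project(y):       Euclidean projection of y onto the convex set C.
    Returns the average iterate (the standard output for non-smooth SGD).
    """
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for _ in range(num_steps):
        g = stoch_subgrad(x)          # E[g] is a subgradient at x
        x = project(x - eta * g)      # step along -g, then project onto C
        avg += x
    return avg / num_steps

# Toy example (assumed for illustration): minimize E|a^T x - b| over the unit ball.
rng = np.random.default_rng(0)

def stoch_subgrad(x):
    a = rng.standard_normal(x.shape)
    b = rng.standard_normal()
    return np.sign(a @ x - b) * a     # subgradient of |a^T x - b|

def project_ball(y):                  # projection onto {x : ||x||_2 <= 1}
    norm = np.linalg.norm(y)
    return y if norm <= 1 else y / norm

x_hat = projected_ssgd(stoch_subgrad, project_ball, np.zeros(5), eta=0.1, num_steps=1000)
```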
Stochastic Subgradient for the Lovász extension

For the Lovász extension $f$, there exists a subgradient $g(x)$ such that:
• each coordinate $g(x)_i$ can be computed with two queries to $F$
• subgradient descent requires $O(n/\epsilon^2)$ steps to get an $\epsilon$-minimizer of $f$ (Jegelka, Bilmes 2011) and (Hazan, Kale 2012)

A stochastic subgradient $\tilde{g}(x)$ for $g(x)$ can be computed in time:
• Chakrabarty, Lee, Sidford, Wong 2017: $\tilde{O}(n^{2/3})$
• Our result: $\tilde{O}(n^{1/2})$ (classical) or $\tilde{O}(n^{1/4}/\epsilon^{1/2})$ (quantum)
• Axelrod, Liu, Sidford 2019: $\tilde{O}(1)$
Stochastic Subgradient for the Lovász extension

First attempt to construct $\tilde{g}(x)$:

For any non-zero vector $u \in \mathbb{R}^n$, define the random variable
$\hat{u} = (0, \dots, 0, \operatorname{sgn}(u_i)\,\|u\|_1, 0, \dots, 0)$ (non-zero in the $i$-th coordinate),
where $i$ is sampled with probability $p_i = |u_i| / \|u\|_1$.

Unbiased: $\mathbb{E}[\hat{u}] = \sum_i \frac{|u_i|}{\|u\|_1} \cdot \operatorname{sgn}(u_i)\,\|u\|_1 \cdot e_i = \sum_i u_i\, e_i = u$

2nd moment: $\mathbb{E}[\|\hat{u}\|^2] = \sum_i \frac{|u_i|}{\|u\|_1} \cdot \operatorname{sgn}(u_i)^2\, \|u\|_1^2 = \|u\|_1^2$

For the Lovász extension: $u = g(x)$ and $\|g(x)\|_1 = O(1)$ (low variance) (Jegelka, Bilmes 2011)

✗ Hard to sample $i$ with probability $|u_i| / \|u\|_1$ (importance sampling).
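A small sketch of this one-sparse estimator, assuming the vector $u$ is given explicitly; for $u = g(x)$ that assumption is exactly what fails, which is why the sampling step above is hard.

```python
import numpy as np

def coordinate_estimator(u, rng=np.random.default_rng()):
    """One-sparse unbiased estimator u_hat of a vector u:
    pick coordinate i with probability |u_i| / ||u||_1 and put
    sgn(u_i) * ||u||_1 on that coordinate.
    Then E[u_hat] = u and E[||u_hat||^2] = ||u||_1^2,
    so the variance is small whenever ||u||_1 = O(1)."""
    norm1 = np.abs(u).sum()
    i = rng.choice(len(u), p=np.abs(u) / norm1)   # importance sampling step
    u_hat = np.zeros(len(u))
    u_hat[i] = np.sign(u[i]) * norm1
    return u_hat
```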
Stochastic Subgradient for the Lovász extension

Second attempt:

Tool: there is an unbiased estimate $\tilde{d}(x, y)$ of $d(x, y) = g(y) - g(x)$ that can be computed efficiently when $x - y$ is sparse. (Chakrabarty, Lee, Sidford, Wong 2017)

Our construction ($T$ = parameter to be optimized):
$x_0 \longrightarrow \hat{g}(x_0)$
$x_1 \longrightarrow \hat{g}(x_0) + \tilde{d}(x_0, x_1)$
$x_2 \longrightarrow \hat{g}(x_0) + \tilde{d}(x_0, x_2)$
  ⋮
$x_{T-1} \longrightarrow \hat{g}(x_0) + \tilde{d}(x_0, x_{T-1})$
$x_T \longrightarrow \hat{g}(x_T)$
$x_{T+1} \longrightarrow \hat{g}(x_T) + \tilde{d}(x_T, x_{T+1})$
  ⋮
$x_{2T-1} \longrightarrow \hat{g}(x_T) + \tilde{d}(x_T, x_{2T-1})$
$x_{2T} \longrightarrow \hat{g}(x_{2T})$, and so on.

Within each block of $T$ iterates, the difference estimates are $T$ independent samples.
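A minimal sketch of this epoch structure, under two assumptions spelled out here and not in the talk: full_subgradient(x) is a placeholder for the exact subgradient $\hat{g}(x)$ (expensive, about $n$ queries to $F$), and sparse_diff_estimate(x, y) is a placeholder for the unbiased estimate $\tilde{d}(x, y)$ (cheap when $x - y$ is sparse). For simplicity the sketch consumes a given sequence of iterates; in the actual algorithm each estimate feeds back into the next descent step.

```python
def epoch_stochastic_subgradients(iterates, T, full_subgradient, sparse_diff_estimate):
    """Refresh the exact subgradient only once every T iterates (the anchor);
    in between, reuse the anchor and add a cheap unbiased difference estimate,
    which is efficient when few coordinates changed since the anchor."""
    estimates = []
    anchor, g_anchor = None, None
    for t, x in enumerate(iterates):
        if t % T == 0:                   # start of a new epoch: pay for an exact subgradient
            anchor, g_anchor = x, full_subgradient(x)
            estimates.append(g_anchor)
        else:                            # cheap step: anchor subgradient + difference estimate
            estimates.append(g_anchor + sparse_diff_estimate(anchor, x))
    return estimates
```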
2 Quantum speed-up for Importance Sampling
Problem

Input: a discrete probability distribution $D = (p_1, \dots, p_n)$ on $[n]$.
Output: $T$ independent samples $i_1, \dots, i_T \sim D$.

Evaluation oracle access:
• Classical: $i \mapsto p_i$
• Quantum: $U(|i\rangle |0\rangle) = |i\rangle |p_i\rangle$

Cost = number of queries to the evaluation oracle.
Importance Sampling with a Classical Oracle

Binary Tree: store the $p_i$ at the leaves and, at each internal node, the sum of the probabilities below it (e.g. $p_1 + p_2$, $p_1 + p_2 + p_3$, $p_4 + p_5$); a sample is drawn by walking down from the root.
• Preprocessing time: $O(n)$
• Cost per sample: $O(\log n)$
• Cost for $T$ samples: $O(n + T \log n)$

Alias Method (Walker 1974, Vose 1991):
• Preprocessing time: $O(n)$
• Cost per sample: $O(1)$
• Cost for $T$ samples: $O(n + T)$
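For reference, a minimal sketch of the alias method (Vose's variant); this is a standard implementation of the cited technique, not code from the talk.

```python
import random

def build_alias_table(p):
    """Vose's alias method: O(n) preprocessing for a discrete distribution p."""
    n = len(p)
    prob, alias = [0.0] * n, [0] * n
    scaled = [pi * n for pi in p]                  # rescale so the average mass is 1
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l           # fill column s with excess mass from l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:                        # leftovers hold (numerically) mass 1
        prob[i] = 1.0
    return prob, alias

def alias_sample(prob, alias):
    """O(1) per sample: pick a column uniformly, then flip a biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]
```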
Importance Sampling with Quantum State Preparation (Grover 2000)

Preprocessing:
1. Compute $p_{\max} = \max\{p_1, \dots, p_n\}$ with quantum Maximum Finding.
2. Construct the unitary
$V(|0\rangle|0\rangle) \;\longmapsto\; \frac{1}{\sqrt{n}} \sum_{i \in [n]} |i\rangle\,|0\rangle \;\longmapsto\; \frac{1}{\sqrt{n}} \sum_{i \in [n]} |i\rangle \left( \sqrt{\tfrac{p_i}{p_{\max}}}\,|0\rangle + \sqrt{1 - \tfrac{p_i}{p_{\max}}}\,|1\rangle \right) = \frac{1}{\sqrt{n\, p_{\max}}} \left( \sum_i \sqrt{p_i}\,|i\rangle \right) |0\rangle + \dots |1\rangle$

Sampling (repeat $T$ times):
1. Prepare $\sum_i \sqrt{p_i}\,|i\rangle$ with Amplitude Amplification on $V$, and measure it.

Preprocessing time: $O(\sqrt{n})$
Cost per sample: $O(\sqrt{n\, p_{\max}})$
Cost for $T$ samples: $O(\sqrt{n} + T\sqrt{n\, p_{\max}}) = O(T\sqrt{n})$

Our result: $O(\sqrt{Tn})$
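One way to read the routine above: it is rejection sampling from a uniform proposal, accepting index $i$ with probability $p_i / p_{\max}$, and amplitude amplification turns the $\approx n\, p_{\max}$ expected classical trials per sample into $\approx \sqrt{n\, p_{\max}}$ queries. The sketch below shows only the classical analogue of that process (the quantum circuit itself is not reproduced).

```python
import random

def rejection_sample(p, p_max, rng=random):
    """Classical analogue of the state-preparation sampler: propose i uniformly
    (the uniform superposition over [n]) and accept with probability p_i / p_max
    (the |0> branch of the controlled rotation).  Expected queries per sample:
    n * p_max classically; ~ sqrt(n * p_max) with amplitude amplification."""
    n = len(p)
    while True:
        i = rng.randrange(n)
        if rng.random() < p[i] / p_max:
            return i
```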
Importance Sampling with a Quantum Oracle

Our result: $O(\sqrt{Tn})$ for obtaining $T$ independent samples from $D = (p_1, \dots, p_n)$.

Split $D$ (example with $n = 7$: elements $1, \dots, 7$ with probabilities $p_1, \dots, p_7$) into two conditional distributions:

• Heavy part: indices $i$ with $p_i \geq 1/T$, total mass $P_{\mathrm{Heavy}} = \sum_{i \in \mathrm{Heavy}} p_i$. The distribution $D_{\mathrm{Heavy}}$ gives element $i \in \mathrm{Heavy}$ probability $p_i / P_{\mathrm{Heavy}}$ (elements 1, 3, 4 in the example). → Use the Alias Method.

• Light part: indices $i$ with $p_i < 1/T$, total mass $P_{\mathrm{Light}} = \sum_{i \in \mathrm{Light}} p_i$. The distribution $D_{\mathrm{Light}}$ gives element $i \in \mathrm{Light}$ probability $p_i / P_{\mathrm{Light}}$ (elements 2, 5, 6, 7 in the example). → Use Quantum State Preparation.
Importance Sampling with a Quantum Oracle

Preprocessing:
1. Compute the set $\mathrm{Heavy} \subset [n]$ of indices $i$ such that $p_i \geq 1/T$, using Grover Search.
2. Compute $P_{\mathrm{Heavy}} = \sum_{i \in \mathrm{Heavy}} p_i$.
3. Apply the preprocessing step of the Alias Method on $D_{\mathrm{Heavy}}$.
4. Apply the preprocessing step of the Quantum State Preparation method on $D_{\mathrm{Light}}$.
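A classical simulation sketch of these four steps, with the quantum subroutines swapped for their classical counterparts (Grover Search becomes a linear scan, and the state-preparation preprocessing only needs $p_{\max}$ of the light part); build_alias_table is the helper sketched earlier.

```python
def hybrid_preprocess(p, T):
    """Classical simulation of the preprocessing for the heavy/light sampler."""
    heavy = [i for i, pi in enumerate(p) if pi >= 1.0 / T]            # step 1
    light = [i for i, pi in enumerate(p) if pi < 1.0 / T]
    P_heavy = sum(p[i] for i in heavy)                                 # step 2
    d_heavy = [p[i] / P_heavy for i in heavy] if heavy else []
    alias_table = build_alias_table(d_heavy) if heavy else ([], [])    # step 3
    p_max_light = max((p[i] for i in light), default=0.0)              # step 4
    return heavy, light, P_heavy, alias_table, p_max_light
```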
Importance Sampling with a Quantum Oracle

Sampling (repeat $T$ times):
Flip a coin that is heads with probability $P_{\mathrm{Heavy}}$:
• Heads: sample $i \sim D_{\mathrm{Heavy}}$ with the Alias Method.
• Tails: sample $i \sim D_{\mathrm{Light}}$ with Quantum State Preparation.
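Continuing the classical simulation from the preprocessing sketch, one draw from $D$ looks as follows, with rejection sampling standing in for quantum state preparation on the light part.

```python
import random

def hybrid_sample(p, prep, rng=random):
    """One sample from D using the structures returned by hybrid_preprocess."""
    heavy, light, P_heavy, (prob, alias), p_max_light = prep
    if rng.random() < P_heavy:                       # heads: heavy part, alias method
        return heavy[alias_sample(prob, alias)]
    while True:                                      # tails: light part, rejection sampling
        i = light[rng.randrange(len(light))]
        if rng.random() < p[i] / p_max_light:
            return i

# Usage: prep = hybrid_preprocess(p, T); samples = [hybrid_sample(p, prep) for _ in range(T)]
```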
Importance Sampling with a Quantum Oracle

Preprocessing costs:
1. Compute the set $\mathrm{Heavy} \subset [n]$ of indices $i$ such that $p_i \geq 1/T$, using Grover Search. Cost: $O(\sqrt{nT})$, since $|\mathrm{Heavy}| \leq T$.
2. Compute $P_{\mathrm{Heavy}} = \sum_{i \in \mathrm{Heavy}} p_i$. Cost: $O(T)$.
3. Apply the preprocessing step of the Alias Method on $D_{\mathrm{Heavy}}$. Cost: $O(T)$.
4. Apply the preprocessing step of the Quantum State Preparation method on $D_{\mathrm{Light}}$. Cost: $O(\sqrt{n})$.