Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing


  1. Reinforcement Learning with Neural Networks for Quantum Multiple Hypothesis Testing
     Sarah Brandsen¹, Kevin D. Stubbs², Henry D. Pfister²,³
     ¹Department of Physics, Duke University; ²Department of Mathematics, Duke University; ³Department of Electrical Engineering, Duke University
     IEEE International Symposium on Information Theory, June 21-26, 2020

  2. Outline
     1 Overview of multiple state discrimination
     2 Reinforcement learning with neural networks (RLNN)
     3 Comparing RLNN performance to known results
       - Binary pure state discrimination
       - RLNN performance as a function of subsystem number
       - Comparison to the "Pretty Good Measurement"
     4 Performance of RLNN in more general cases
       - Trine ensemble
       - Comparison to semidefinite programming upper bounds
     5 Open questions

  3. Quantum State Discrimination
     Given: ρ ∈ {ρ_j}_{j=1}^{m} with priors q⃗ = (q_1, ..., q_m), where q_j = Pr(ρ = ρ_j).
     Objective: find a quantum measurement Π̂ = {Π_j}_{j=1}^{m} that maximizes
         P_success = Σ_{j=1}^{m} q_j Tr[ρ_j Π_j]
     [Figure: four candidate states ρ_1, ρ_2, ρ_3, ρ_4.]
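To make the objective concrete, here is a minimal numpy sketch of evaluating P_success for a fixed measurement (the example states and helper name are ours, not from the talk):

```python
import numpy as np

def success_probability(priors, states, povm):
    """P_success = sum_j q_j Tr[rho_j Pi_j] for a candidate measurement {Pi_j}."""
    return sum(q * np.trace(rho @ Pi).real
               for q, rho, Pi in zip(priors, states, povm))

# Two equiprobable qubit states |0> and |+>, measured with the
# computational-basis projectors (a deliberately suboptimal choice).
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)
P1 = np.array([[0, 0], [0, 1]], dtype=complex)

p = success_probability([0.5, 0.5], [rho0, rho_plus], [P0, P1])
# p = 0.5*1 + 0.5*0.5 = 0.75
```

The optimization in the slide is over all valid POVMs {Π_j}; this snippet only scores one fixed choice.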

  4. Locally Adaptive Strategies
     Locally adaptive protocols measure one subsystem at a time, choosing the next subsystem and measurement based on previous results.
     [Figure: candidate subsystem states ρ_j^(k) across subsystems k = 1, 2, 3.]

  5. Motivation for Locally Adaptive Strategies
     - An analytic solution for the optimal collective measurement is generally not known when m ≥ 3.
     - Approximately optimal solutions found via semidefinite programming [EMV03] may be experimentally impractical for large systems.
     Each candidate state is a tensor product over subsystems: ρ_j = ⊗_{k=1}^{n} ρ_j^(k).
     [Figure: candidate subsystem states ρ_j^(k) across subsystems k = 1, 2, 3.]

  6. Reinforcement Learning
     Main idea: an agent learns to maximize the expected future reward through repeated interactions with the environment.
     [Figure: agent-environment loop with action a_t ∈ A, state s_t ∈ S, reward r_t.]

  7. Advantage Function
     Agent's policy: draw a random action a given state s according to π_θ(a|s) = Pr[A = a | S = s].
     Advantage function: compares the expected reward of choosing action a in state s to the average expected reward for being in state s under policy π:
         A^π(s_t, a_t) = Σ_{ℓ=t}^{N} γ^{ℓ−t} ( E_{π_θ}[r(s_ℓ, a_ℓ) | s_t, a_t] − E_{π_θ}[r(s_ℓ, a_ℓ) | s_t] )
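A toy numerical illustration of the advantage definition above, estimating both expectations from discounted reward streams (the trajectories and probabilities here are invented for illustration, not from the talk):

```python
import numpy as np

def discounted_return(rewards, gamma):
    """Sum_{l=t}^{N} gamma^{l-t} r_l, taking t = 0."""
    return sum(g * r for g, r in zip(gamma ** np.arange(len(rewards)), rewards))

# Toy example: from state s the policy picks action a or action b with
# probability 0.5 each; each choice yields a known reward stream.
gamma = 0.9
r_a = [0.0, 1.0]   # discounted return: 0 + 0.9*1 = 0.9
r_b = [0.0, 0.0]   # discounted return: 0

q_sa = discounted_return(r_a, gamma)                     # E[. | s, a]
v_s = 0.5 * q_sa + 0.5 * discounted_return(r_b, gamma)   # E[. | s] under pi
advantage = q_sa - v_s                                   # 0.9 - 0.45 = 0.45
```

A positive advantage (as here) means action a is better than the policy's average behavior in s, which is exactly the signal PPO-style updates push the policy toward.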

  8. Neural Networks for Function Approximation
     Setup: we use a fully connected neural network whose input layer feeds into two parallel sub-networks, one outputting the policy π*(a_1|s), ..., π*(a_|A||s) and one outputting the value estimate V(s).
     [Figure: input layer followed by two hidden layers, with separate policy and value output heads.]

  9. Set of Allowed Quantum Measurements
     Binary projective measurement set: taken to be {Π̂(ℓ)} with ℓ ∈ {0, 1, ..., Q−1}, where Π̂(ℓ) = {Π_+(ℓ), Π_−(ℓ)} consists of the rank-one projectors

         Π_+(ℓ) = [ (ℓ/Q)²                  (ℓ/Q)√(1−(ℓ/Q)²) ]
                   [ (ℓ/Q)√(1−(ℓ/Q)²)       1−(ℓ/Q)²         ]

         Π_−(ℓ) = [ 1−(ℓ/Q)²               −(ℓ/Q)√(1−(ℓ/Q)²) ]
                   [ −(ℓ/Q)√(1−(ℓ/Q)²)      (ℓ/Q)²           ]

     Q = 20 in our experiments.
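The entries on the slide are consistent with Π̂(ℓ) being the pair of rank-one projectors onto the real unit vector (ℓ/Q, √(1−(ℓ/Q)²)) and its orthogonal complement; under that reading (our reconstruction), the set can be built and sanity-checked as follows:

```python
import numpy as np

def binary_projective_measurement(l, Q):
    """Projectors onto v = (l/Q, sqrt(1-(l/Q)^2)) and its orthogonal complement."""
    c = l / Q
    s = np.sqrt(1.0 - c ** 2)
    v = np.array([c, s])
    w = np.array([s, -c])   # orthogonal direction
    return np.outer(v, v), np.outer(w, w)

Q = 20  # value used in the talk's experiments
for l in range(Q):
    Pp, Pm = binary_projective_measurement(l, Q)
    assert np.allclose(Pp + Pm, np.eye(2))   # completeness
    assert np.allclose(Pp @ Pp, Pp)          # idempotent, i.e. a projector
```

Discretizing the measurement angle into Q values keeps the agent's action space finite, which is what lets a standard discrete-action RL algorithm be applied.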

  10. Applying RLNN to Multiple State Discrimination
      Initialize: randomly generate ρ ∈ {ρ_j}_{j=1}^{m} according to q⃗ = (q_1, ..., q_m).
      Initialize s⃗ = (s_1, ..., s_n) to the all-zeros vector, e.g. s⃗ = [0, 0, 0] for n = 3.

  11. Applying RLNN to Multiple State Discrimination (cont.)
      Step: the agent chooses an action of the form (j, Π̂).
      Implement the action and sample an outcome according to Tr[Π_out ρ].
      Update the prior via Bayes' theorem.
      Set s_j → 1, e.g. for j = 2: s⃗ → [0, 1, 0].
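The Bayes' theorem update in the step above can be sketched in a few lines; the posterior weight of each hypothesis is its prior times the likelihood of the observed outcome on the measured subsystem (example states and the helper name are ours):

```python
import numpy as np

def bayes_update(priors, subsystem_states, Pi_out):
    """Posterior q_j proportional to q_j * Tr[Pi_out rho_j^(k)], where
    rho_j^(k) is hypothesis j's state on the measured subsystem."""
    likelihoods = np.array([np.trace(Pi_out @ rho).real
                            for rho in subsystem_states])
    post = np.asarray(priors) * likelihoods
    return post / post.sum()

# Hypothetical example: the measured subsystem is |0> under hypothesis 1
# and |+> under hypothesis 2; the observed outcome is the projector onto |0>.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)

posterior = bayes_update([0.5, 0.5], [rho0, rho_plus], P0)
# posterior proportional to [0.5*1, 0.5*0.5], i.e. [2/3, 1/3]
```

The updated posterior (together with the measured-subsystem flags s⃗) is what the agent observes as its next state.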

  12. Reward Scheme
      If subsystem j has already been measured in a previous round, return a penalty of r = −0.5.
      When s_j = 1 for all j, return a reward of 1 if ρ_guess = ρ and 0 otherwise.¹
      ¹Results are generated using the default PPO algorithm from Ray version 0.7.6.

  13. Binary Pure State Discrimination
      Setup: in the special case where m = 2, the state set is {ρ_+, ρ_−} with prior q = Pr(ρ = ρ_+).
      Optimal solution: the Helstrom measurement Π_h = {Π_+, Π_−} is optimal, where Π_± are the projectors onto the positive/negative eigenspaces of
          M ≜ q ρ_+ − (1 − q) ρ_−.
      In the special case where ρ_± are both tensor products of pure subsystem states, an adaptive greedy protocol is fully optimal.
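The Helstrom measurement above is straightforward to compute numerically via an eigendecomposition of M (a sketch with our own function name and example states):

```python
import numpy as np

def helstrom_measurement(q, rho_p, rho_m):
    """Projectors onto the nonnegative/negative eigenspaces of
    M = q*rho_+ - (1-q)*rho_-, plus the resulting success probability."""
    M = q * rho_p - (1 - q) * rho_m
    evals, U = np.linalg.eigh(M)
    Pi_p = np.zeros_like(M)
    for w, v in zip(evals, U.T):
        if w >= 0:
            Pi_p += np.outer(v, v.conj())
    Pi_m = np.eye(M.shape[0]) - Pi_p
    p_succ = (q * np.trace(rho_p @ Pi_p)
              + (1 - q) * np.trace(rho_m @ Pi_m)).real
    return Pi_p, Pi_m, p_succ

# Hypothetical binary example: |0> vs |+> with equal priors.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
Pi_p, Pi_m, p_succ = helstrom_measurement(0.5, rho0, rho_plus)
# p_succ = (1 + 1/sqrt(2))/2, roughly 0.854, the Helstrom bound for this pair
```

This collective-optimal value is the baseline the RLNN's locally adaptive policy is compared against on the next slide.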

  14. RLNN Performance in the Binary Case
      Setup: for each trial, we randomly select pure tensor-product quantum states with m = 2, n = 3. Results for the optimal RLNN policy are plotted after 1000 training iterations.
      [Plot: P_succ per trial for the Helstrom measurement and the RLNN policy.]

  15. RLNN Performance as a Function of Training Iterations
      [Plot: P_succ,Helstrom − P_succ,RLNN vs. training iteration, over 1000 iterations.]

  16. Special Known Case
      Given a base set {ρ_0, ρ_1}, consider:
          S^(1) ≜ {ρ_0, ρ_1}
          S^(2) ≜ {ρ_0 ⊗ ρ_0, ρ_0 ⊗ ρ_1, ρ_1 ⊗ ρ_0, ρ_1 ⊗ ρ_1}
          S^(3) ≜ {ρ_0 ⊗ ρ_0 ⊗ ρ_0, ρ_0 ⊗ ρ_0 ⊗ ρ_1, ρ_0 ⊗ ρ_1 ⊗ ρ_0, ρ_0 ⊗ ρ_1 ⊗ ρ_1, ρ_1 ⊗ ρ_0 ⊗ ρ_0, ρ_1 ⊗ ρ_0 ⊗ ρ_1, ρ_1 ⊗ ρ_1 ⊗ ρ_0, ρ_1 ⊗ ρ_1 ⊗ ρ_1}
          ...
      For each state set, assume each candidate state is equally probable.
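The construction of S^(n) above (all n-fold tensor products of the two base states, 2^n candidates in total) can be sketched as follows; the base states chosen here are our own example:

```python
import itertools
import numpy as np

def state_set(rho0, rho1, n):
    """S^(n): every n-fold tensor product of rho0 and rho1 (2^n states)."""
    states = []
    for bits in itertools.product([0, 1], repeat=n):
        rho = np.array([[1.0]], dtype=complex)
        for b in bits:
            rho = np.kron(rho, rho1 if b else rho0)
        states.append(rho)
    return states

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
S3 = state_set(rho0, rho1, 3)   # 8 states, each an 8x8 density matrix
```

The exponential growth of the candidate set (and of the Hilbert-space dimension) with n is what makes this a useful stress test for the RLNN as n increases.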

  17. Special Known Case
      Results: the RLNN performance starts to show a significant gap from the optimal success probability when n ≥ 5.
      [Plot: P_succ vs. n for the optimal measurement and the NN, n = 1, ..., 6.]

  18. The "Pretty Good Measurement" (PGM)
      The "Pretty Good Measurement" defines the POVM
          Π_PGM,k ≜ (Σ_j q_j ρ_j)^(−1/2) q_k ρ_k (Σ_j q_j ρ_j)^(−1/2)   ∀ k ∈ {1, ..., m}
      Motivation: the PGM is known to be optimal in several cases:
      - Symmetric pure states with uniform prior, where ρ_j = |ψ_j⟩⟨ψ_j| and |ψ_j⟩ = U^(j−1)|ψ_1⟩ with U^m = I
      - Linearly independent pure states where the diagonal elements of the square root of the Gram matrix are all equal
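The PGM formula above can be implemented directly; a minimal numpy sketch, taking the inverse square root on the support of S = Σ_j q_j ρ_j (function name and example states are ours):

```python
import numpy as np

def pgm(priors, states):
    """Pretty Good Measurement: Pi_k = S^(-1/2) q_k rho_k S^(-1/2),
    with S = sum_j q_j rho_j and the inverse taken on the support of S."""
    S = sum(q * rho for q, rho in zip(priors, states))
    evals, U = np.linalg.eigh(S)
    inv_sqrt = np.array([1.0 / np.sqrt(w) if w > 1e-12 else 0.0 for w in evals])
    S_inv_half = U @ np.diag(inv_sqrt) @ U.conj().T
    return [S_inv_half @ (q * rho) @ S_inv_half for q, rho in zip(priors, states)]

# Hypothetical example: equal-prior |0> vs |+>.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
povm = pgm([0.5, 0.5], [rho0, rho_plus])
```

By construction the elements sum to the identity on the support of S, so the output is a valid POVM whenever S has full rank.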

  19. RLNN vs. Pretty Good Measurement
      Setup: we generate 10 trials of candidate states with n = 3, m = 5 and plot the difference in RLNN and PGM success probability.
      [Histogram: counts vs. P_succ,NN − P_succ,PGM, ranging from about −5·10⁻² to 0.1.]

  20. Trine Ensemble Candidate States
      The trine ensemble consists of three equally spaced real qubit states, namely
          { R(4π/3)^j |0⟩⟨0| (R(4π/3)^j)† }_{j=0}^{2}
      [Figure: trine states ρ_0^(k), ρ_1^(k), ρ_2^(k) on subsystems k = 0, 1.]

  21. Conjectured Optimal Local Method
      Step 1: an "anti-trine" measurement {Π_0, Π_1, Π_2} is implemented on subsystem 1.
      [Figure: anti-trine POVM elements Π_0, Π_1, Π_2 interleaved with the trine states ρ_0^(0), ρ_1^(0), ρ_2^(0).]

  22. Conjectured Optimal Local Method (cont.)
      Step 2: the Helstrom measurement for the two remaining candidate states is implemented on subsystem 2.
      [Figure: remaining states ρ_0^(1), ρ_1^(1) with Helstrom projectors Π_0, Π_1 after first-round outcome Π_out = Π_2.]
      The success probability of this method is P_succ ≈ 0.933, whereas the success probability of a locally greedy method is P_succ,lg = 0.8.
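The two-step protocol above can be checked numerically. Assumptions in this sketch (ours, not stated on the slides): the candidate states are two copies of each trine state with uniform prior, and the anti-trine POVM has the standard form Π_k = (2/3)(I − ρ_k), so outcome k never occurs under hypothesis k and rules that candidate out.

```python
import numpy as np

# Trine state vectors: real qubit states at angles 0, 2*pi/3, 4*pi/3.
psi = [np.array([np.cos(2 * np.pi * j / 3), np.sin(2 * np.pi * j / 3)])
       for j in range(3)]
rho = [np.outer(p, p) for p in psi]

# Step 1: "anti-trine" POVM on subsystem 1; Pi_k = (2/3)(I - rho_k) has
# Tr[Pi_k rho_k] = 0, so observing k eliminates hypothesis k.
anti = [(2 / 3) * (np.eye(2) - r) for r in rho]
assert np.allclose(sum(anti), np.eye(2))   # elements sum to the identity

def helstrom_correct(a, b, true_is_a):
    """Probability that the equal-prior Helstrom test between a and b
    answers correctly, conditioned on which state is true."""
    evals, U = np.linalg.eigh(0.5 * (a - b))
    Pi_a = np.zeros_like(a)
    for w, v in zip(evals, U.T):
        if w >= 0:
            Pi_a += np.outer(v, v)
    Pi = Pi_a if true_is_a else np.eye(2) - Pi_a
    return np.trace(Pi @ (a if true_is_a else b)).real

# Step 2: average over the true state j (uniform prior) and the
# first-round outcome k, then run Helstrom on subsystem 2 for the pair.
p_succ = 0.0
for j in range(3):
    for k in range(3):
        p_k = np.trace(anti[k] @ rho[j]).real
        if p_k < 1e-12:          # outcome k = j never occurs
            continue
        a, b = [i for i in range(3) if i != k]
        p_succ += (1 / 3) * p_k * helstrom_correct(rho[a], rho[b], j == a)
# p_succ evaluates to (1 + sqrt(3)/2)/2, roughly 0.933, matching the slide
```

Under these assumptions the computed value reproduces the slide's P_succ ≈ 0.933, comfortably above the locally greedy 0.8.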
