Combining Perfect Shuffle and Bitonic Networks for Efficient Quantum Sorting
Naveed Mahmud, Bailey K. Srimoungchanh, Bennett Haase-Divine, Nolan Blankenau, Annika Kuhnke, and Esam El-Araby
University of Kansas (KU)
Fifth International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC'19)
November 17-22, 2019, Denver, Colorado
Outline
◆ Introduction and Motivation
◆ Background and Related Work
◆ Proposed Work
◆ Experimental Results
◆ Conclusions and Future Work
H2RC 2019 – Nov. 17th, 2019
Introduction and Motivation
◆ Why Quantum?
  ▪ Efficient quantum algorithms
  ▪ Solving NP-hard problems
  ▪ Speedup over classical
  ▪ Quantum supremacy
  ▪ Quantum Ready NISQ devices
  Source: https://learning.acm.org/techtalks/qiskit
◆ Need for Quantum Emulation
  ▪ Difficult to control QC experiments
  ▪ Verification and benchmarking
  ▪ High cost of accessing QCs
    ◆ E.g., academic hourly rate of $1,250 for up to 499 annual hours
◆ Emulation using FPGAs
  ▪ Greater speedup vs. SW
  ▪ Dynamic (reconfigurable) vs. fixed architectures
  ▪ Exploiting parallelism
  ▪ Limitation → Scalability
Pictured devices: Google’s 72-qubit “Bristlecone”, Intel’s 49-qubit “Tangle Lake”, IBM-Q 53-qubit computer, Rigetti’s 16-qubit ASPEN-4, IonQ’s 79-qubit computer, D-Wave 2000Q
Background (Quantum Computing)
◆ Qubits
  ▪ Physical implementations
    ◆ Electron (spin)
    ◆ Nucleus (spin through NMR; NMR ≡ Nuclear Magnetic Resonance)
    ◆ Photon (polarization encoding)
    ◆ Josephson junction (superconducting qubits)
    ◆ Trapped ions
    ◆ Anyons
  ▪ Theoretical representation
    ◆ Bloch sphere
      Basis states → $|0\rangle$, $|1\rangle$
      Pure states → $|\psi\rangle$
    ◆ Vector of complex coefficients
◆ Superposition
  ▪ Linear sum of distinct basis states
  ▪ Collapses to a classical state when measured
  ▪ Applies to states with n qubits
◆ Entanglement
  ▪ Strong correlation between qubits
  ▪ Measuring a qubit gives information about other qubits
  ▪ An entangled state cannot be factored into a tensor product

Single-Qubit Superposition:
  $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$
  Born Rule: $p(|0\rangle) = |\alpha|^2$, $p(|1\rangle) = |\beta|^2$, with $|\alpha|^2 + |\beta|^2 = 1$

Multi-Qubit Superposition (with $|q_i\rangle = \alpha_i|0\rangle + \beta_i|1\rangle$):
  $|\psi_3\rangle = |q_2 q_1 q_0\rangle = |q_2\rangle \otimes |q_1\rangle \otimes |q_0\rangle$
  $= \alpha_2\alpha_1\alpha_0|000\rangle + \alpha_2\alpha_1\beta_0|001\rangle + \dots + \beta_2\beta_1\beta_0|111\rangle$
  $= c_0|0\rangle + c_1|1\rangle + \dots + c_7|7\rangle$
  $|\psi_n\rangle = \sum_{q=0}^{2^n-1} c_q|q\rangle$
  Born Rule: $p(|q\rangle) = |c_q|^2$, with $\sum_{q=0}^{2^n-1} |c_q|^2 = 1$

Multi-Qubit Entanglement:
  $|\psi_n\rangle_{\text{entangled}} = |q_{n-1} \dots q_1 q_0\rangle_{\text{entangled}} \neq \left(|q_{n-1}\rangle \otimes \dots \otimes |q_1\rangle \otimes |q_0\rangle\right)_{\text{un-entangled}}$
  For example: $|\psi_2\rangle_{\text{entangled}} = |q_1 q_0\rangle_{\text{entangled}} \neq |q_1\rangle \otimes |q_0\rangle$
  $c_0|00\rangle + c_3|11\rangle \neq \alpha_1\alpha_0|00\rangle + \alpha_1\beta_0|01\rangle + \beta_1\alpha_0|10\rangle + \beta_1\beta_0|11\rangle$
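The state-vector picture on this slide can be tried out numerically. The following is a minimal sketch, assuming NumPy and plain complex arrays as the representation; the helper names `qubit`, `born_probabilities`, and `is_product_state` are hypothetical and are not taken from the authors' emulator. It builds a 3-qubit product state with the tensor (Kronecker) product, applies the Born rule, and checks that a 2-qubit state of the form c_0|00> + c_3|11> cannot be factored.

```python
import numpy as np

# Computational basis states |0> and |1> as complex vectors.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)


def qubit(alpha, beta):
    """Return the normalized single-qubit state alpha|0> + beta|1>."""
    state = alpha * ket0 + beta * ket1
    return state / np.linalg.norm(state)


def born_probabilities(state):
    """Born rule: the probability of basis state |q> is |c_q|^2."""
    return np.abs(state) ** 2


def is_product_state(two_qubit_state, tol=1e-12):
    """A 2-qubit state factors into |q1> (x) |q0> exactly when its
    2x2 coefficient matrix has rank 1 (second singular value is zero)."""
    coeffs = np.asarray(two_qubit_state).reshape(2, 2)
    singular_values = np.linalg.svd(coeffs, compute_uv=False)
    return bool(singular_values[1] < tol)


# Un-entangled 3-qubit state |q2> (x) |q1> (x) |q0> with 2^3 = 8 coefficients.
q2, q1, q0 = qubit(1, 1), qubit(1, 0), qubit(1, 1j)
psi3 = np.kron(np.kron(q2, q1), q0)
probs = born_probabilities(psi3)

# Entangled 2-qubit state (|00> + |11>) / sqrt(2).
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

print(probs.sum())                           # total probability
print(is_product_state(np.kron(q1, q0)))     # product of two qubits
print(is_product_state(bell))                # entangled pair
```

The rank test only covers the 2-qubit case; larger registers would need a check across every bipartition, which is left out of this sketch.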
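Background (Quantum Gates)
◆ X (NOT) Gate
  ▪ 1-qubit gate
  ▪ Inverts the qubit state (swaps the amplitudes of $|0\rangle$ and $|1\rangle$)
  $X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
◆ cX (Controlled NOT) Gate
  ▪ 2-qubit gate
  ▪ Control qubit and a target qubit
  ▪ Inverts target qubit based on value of control
  $cX = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$
◆ SWAP Gate
  ▪ 2-qubit gate
  ▪ Exchanges positions of the two qubits
  $\text{SWAP} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
◆ cSWAP (Controlled SWAP) Gate
  ▪ 3-qubit gate
  ▪ Exchanges positions of the two target qubits based on the control qubit
  $c\text{SWAP} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$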
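The four gates on this slide are permutation matrices, so their action on basis states is easy to check. The sketch below assumes NumPy and the convention that the leftmost bit of a bit string is the control (most significant) qubit; the helpers `basis` and `apply` and the projector names `P0`, `P1` are illustrative choices, not part of the presented work.

```python
import numpy as np

# Single-qubit building blocks.
X = np.array([[0, 1], [1, 0]])
I2 = np.eye(2, dtype=int)
P0 = np.array([[1, 0], [0, 0]])  # projector onto |0> (control off)
P1 = np.array([[0, 0], [0, 1]])  # projector onto |1> (control on)

# SWAP exchanges |01> and |10>.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]])

# Controlled gates: identity when the control is |0>, the gate when it is |1>.
CX = np.kron(P0, I2) + np.kron(P1, X)
CSWAP = np.kron(P0, np.eye(4, dtype=int)) + np.kron(P1, SWAP)


def basis(bits):
    """Return the computational basis vector for a bit string such as '101'."""
    vec = np.zeros(2 ** len(bits), dtype=int)
    vec[int(bits, 2)] = 1
    return vec


def apply(gate, bits):
    """Apply a permutation gate to a basis state and return the new bit string."""
    out = gate @ basis(bits)
    return format(int(np.argmax(out)), f"0{len(bits)}b")


print(apply(X, "0"))        # NOT flips the qubit
print(apply(CX, "10"))      # control set: target flipped
print(apply(SWAP, "01"))    # the two qubits trade places
print(apply(CSWAP, "101"))  # control set: targets '01' become '10'
print(apply(CSWAP, "001"))  # control clear: unchanged
```

Building `CX` and `CSWAP` from projectors reproduces the 4x4 and 8x8 matrices listed on the slide under this bit-ordering assumption.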
Background (Sorting)
◆ Classical Sorting
  ▪ Quicksort
  ▪ Merge sort
  ▪ Insertion sort
  ▪ Bitonic sort with perfect shuffle

  Complexity   Quicksort     Merge sort    Insertion sort   Bitonic sort with perfect shuffle
  Time         O(N log N)    O(N log N)    O(N^2)           O(log^2 N)
  Space        O(log N)      O(N)          O(1)             O(N)

  Source: https://www.bigocheatsheet.com/
◆ Quantum Sorting
  ▪ Relatively new realm of research
  ▪ Based on encoding data as the coefficients of a superposed quantum state (N = 2^n)
  ▪ Parallel architecture
  ▪ Speedup compared to classical sorters
  N ≡ number of states, n ≡ number of qubits

  Complexity   Quantum merge sorting [Chen, et al.]   Quantum bitonic sort with perfect shuffle
  Time         O(log^2 n)                             O(log^2 n)
  Space        O(n)                                   O(n)
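To make the O(log^2 N) time entry for the classical bitonic sorter concrete, here is a minimal sketch of the standard bitonic compare-exchange network together with a perfect shuffle permutation. It is a sequential Python model, not the authors' quantum circuit and not the shuffle-routed hardware variant; the function names are hypothetical. Each pass of the inner `while` loop corresponds to one stage whose compare-exchanges could all run in parallel, so the stage count stands in for parallel time.

```python
def perfect_shuffle(values):
    """Interleave the first and second halves of an even-length list,
    like a perfect riffle of a deck of cards."""
    half = len(values) // 2
    shuffled = []
    for low, high in zip(values[:half], values[half:]):
        shuffled.extend((low, high))
    return shuffled


def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network.

    Returns the sorted list and the number of compare-exchange stages.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    stages = 0
    k = 2                      # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2             # distance between compared elements
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            stages += 1        # all pairs in this pass are independent
            j //= 2
        k *= 2
    return data, stages


sorted_values, stage_count = bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5])
print(sorted_values, stage_count)
print(perfect_shuffle(list(range(8))))
```

For N = 8 the network uses 1 + 2 + 3 = 6 stages, which is log N (log N + 1) / 2 and therefore grows as log^2 N. The shuffle moves the element at index i to the index obtained by rotating the bits of i left by one position.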
Related Work (Quantum Sorting)
◆ Chen, et al., “Quantum switching and quantum merge sorting,” February 2006
  ▪ Bitonic merge sorting with a divide-and-conquer approach
  ▪ $O(\log^2 n)$ time complexity to sort n qubits
  ▪ Not enough details about the ‘quantum comparator’
  ▪ No experimental evaluation
◆ Hoyer, et al., “Quantum complexities of ordered searching, sorting, and element distinctness,” November 2002
  ▪ Proof showing that the lower bound of general quantum sorting is $\Omega(N \log N)$
  ▪ Based on a comparison matrix given as input oracle
  ▪ No circuit realizations or implementations
Related Work (Parallel SW Simulators)
List of quantum SW simulators: https://quantiki.org/wiki/list-qc-simulators
◆ Villalonga, et al., “Establishing the Quantum Supremacy Frontier with a 281 Pflop/s Simulation,” May 2019
  ▪ Simulation of 7x7 and 11x11 random quantum circuits (RQCs) of depth 42 and 26, respectively
  ▪ Summit supercomputer (ORNL, USA) with 4550 nodes
  ▪ 1.6 TB of non-volatile memory per node
  ▪ Power consumption of 7.3 MW
◆ Li, et al., “Quantum Supremacy Circuit Simulation on Sunway TaihuLight,” August 2018
  ▪ Simulation of 49-qubit random quantum circuits of depth 55
  ▪ Sunway supercomputer (NSC, China) with 131,072 nodes (32,768 CPUs)
  ▪ 1 PB total main memory
◆ J. Chen, et al., “Classical Simulation of Intermediate-Size Quantum Circuits,” May 2018
  ▪ Simulation of up to 144-qubit random quantum circuits of depth 27
  ▪ Supercomputing cluster (Alibaba Group, China) with 131,072 nodes
  ▪ 8 GB memory per node
◆ De Raedt, et al., “Massively parallel quantum computer simulator eleven years later,” May 2018
  ▪ Simulation of Shor’s algorithm using 48 qubits
  ▪ Various supercomputing platforms: IBM Blue Gene/Q (decommissioned), JURECA (Germany), K computer (Japan), Sunway TaihuLight (China)
  ▪ 16-128 GB of memory per node utilized
◆ T. Jones, et al., “QuEST and High Performance Simulation of Quantum Computers,” May 2018
  ▪ Simulation of random quantum circuits up to 38 qubits
  ▪ ARCUS supercomputer (ARCHER, UK) with 2048 nodes
  ▪ Up to 256 GB memory per node