2019 IEEE High Performance Extreme Computing Conference (HPEC ‘19) Twenty-third Annual HPEC Conference 24 - 26 September 2019 Westin Hotel, Waltham, MA USA
Thursday, September 26, 2019
Quantum
3:00-4:40 in Eden Vale C3
Chair: Patrick Dreher / NCSU

[Best Paper Finalist] C to D-Wave: A High-level C Compilation Framework for Quantum Annealers
Mohamed W. Hassan (Virginia Tech), Scott Pakin (LANL), and Wu-chun Feng (Virginia Tech)
A quantum annealer solves optimization problems by exploiting quantum effects. Problems are represented as Hamiltonian functions that define an energy landscape. The quantum-annealing hardware relaxes to a solution corresponding to the ground state of the energy landscape. Expressing arbitrary programming problems in terms of real-valued Hamiltonian-function coefficients is unintuitive and challenging. This paper addresses the difficulty of programming quantum annealers by presenting a compilation framework that compiles a subset of C code to a quantum machine instruction (QMI) to be executed on a quantum annealer. Our work is based on a modular software stack that facilitates programming D-Wave quantum annealers by successively lowering code from C to Verilog to a symbolic "quantum macro assembly language" and finally to a device-specific Hamiltonian function. We demonstrate the capabilities of our software stack on a set of problems written in C and executed on a D-Wave 2000Q quantum annealer.

QxSQA: GPGPU-Accelerated Simulated Quantum Annealer within a Non-Linear Optimization and Boltzmann Sampling Framework
Dan Padilha, Serge Weinstock, and Mark Hodson (QxBranch)
We introduce QxSQA, a GPGPU-Accelerated Simulated Quantum Annealer based on Path-Integral Monte Carlo (PIMC). QxSQA is tuned for finding low-energy solutions to integer, non-linear optimization problems of up to 2^14 (16384) binary variables with quadratic interactions on a single GPU instance. Experimental results demonstrate QxSQA can solve Maximum Clique test problems of 8100 binary variables with planted solutions in under one minute, with linear scaling against key optimization parameters on other large-scale problems. Through the PIMC formulation, QxSQA also functions as an accurate sampler of Boltzmann distributions for machine learning applications. Experimental characterization of Boltzmann sampling results for a reinforcement learning problem showed good convergence performance at useful scales. Our implementation integrates as a solver within our QxBranch developer platform, positioning developers to efficiently develop applications using QxSQA, and then test the same application code on a quantum annealer or universal quantum computer hardware platform such as those from D-Wave Systems, IBM, or Rigetti Computing.

[Best Student Paper Finalist] Multistart Methods for Quantum Approximate Optimization
Ruslan Shaydulin, Ilya Safro (Clemson Univ.), Jeffrey Larson (Argonne)
Hybrid quantum-classical algorithms such as the quantum approximate optimization algorithm (QAOA) are considered one of the most promising approaches for leveraging near-term quantum computers for practical applications. Such algorithms are often implemented in a variational form, combining classical optimization methods with a quantum machine to find parameters that maximize performance. The quality of the QAOA solution depends heavily on the quality of the parameters produced by the classical optimizer. Moreover, the presence of multiple local optima makes it difficult for the classical optimizer to identify high-quality parameters. In this paper we study the use of a multistart optimization approach within QAOA to improve the performance of quantum machines on important graph clustering problems. We also demonstrate that reusing the optimal parameters from similar problems can improve the performance of classical optimization methods, expanding on similar results for MAXCUT.
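
The multistart idea in the abstract above (restart a local optimizer from several initial points and keep the best result) can be illustrated with a minimal Python sketch. This is not the authors' method or code: the objective is a toy multimodal function standing in for a variational objective, and the names `objective`, `local_search`, and `multistart` are hypothetical.

```python
import math
import random


def objective(x):
    """Toy multimodal function (an assumption for illustration only; it is
    not a QAOA objective). Global minimum at x = 0 with value -1."""
    return 0.1 * x * x - math.cos(3.0 * x)


def local_search(f, x0, step=0.5, tol=1e-6, max_iter=10_000):
    """Derivative-free descent: try a step left and right, accept the first
    improvement, and halve the step when neither direction improves."""
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            # No improving neighbor at this step size: refine the step.
            step *= 0.5
    return x, fx


def multistart(f, n_starts, lo, hi, seed=0):
    """Run local_search from n_starts uniform random points in [lo, hi]
    and return the best (x, f(x)) pair found."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        result = local_search(f, rng.uniform(lo, hi))
        if best is None or result[1] < best[1]:
            best = result
    return best


if __name__ == "__main__":
    single = local_search(objective, 6.0)            # one start, stuck locally
    best = multistart(objective, 200, -10.0, 10.0)   # many starts
    print("single start:", single)
    print("multistart:  ", best)
```

A single start at x = 6.0 settles in a nearby local minimum, while the multistart run is far more likely to reach the global basin around x = 0. The cost grows linearly with the number of starts, which is the tradeoff any multistart scheme has to manage.
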

Message Scheduling for Performant, Many-Core Belief Propagation
Mark Van der Merwe, Vinu Joseph, Ganesh Gopalakrishnan (University of Utah)
Belief Propagation (BP) is a message-passing algorithm for approximate inference over Probabilistic Graphical Models (PGMs), with applications in areas such as computer vision, error-correcting codes, and protein folding. While general, the convergence and speed of the algorithm have limited its practical use on difficult inference problems. Because the algorithm is highly amenable to parallelization, many-core Graphics Processing Units (GPUs) could significantly improve BP performance. Improving BP through many-core systems is non-trivial: the scheduling of messages in the algorithm strongly affects performance. We present a study of message scheduling for BP on GPUs. We demonstrate that BP exhibits a tradeoff between speed and convergence based on parallelism and show that existing message schedulings are not able to exploit this tradeoff. To this end, we present a novel randomized message scheduling approach, Randomized BP (RnBP), which outperforms existing methods on the GPU.

Prototype Container-Based Platform for Extreme Quantum Computing Algorithm Development
Patrick Dreher, Madhuvanti Ramasami (North Carolina State Univ.)
Recent advances in the development of the first generation of quantum computing devices have provided researchers with computational platforms to explore new ideas and reformulate conventional computational codes suitable for a quantum computer. Developers can now implement these reformulations on both quantum simulators and hardware platforms through a cloud computing software environment. For example, the IBM Q Experience provides direct access to IBM's quantum simulators and quantum computing hardware platforms. However, these current access options may not be an optimal environment for developers needing to download and modify the source codes and libraries. This paper focuses on the construction of a Docker container environment with Qiskit source codes and libraries running on a local cloud computing system that can directly access the IBM Q Experience. This prototype container-based system allows single users and small project groups to do rapid prototype development, testing, and implementation of extreme capability algorithms with more agility and flexibility than can be provided through the IBM Q Experience website. This prototype environment also provides an excellent teaching environment for labs and project assignments within graduate courses in cloud computing and quantum computing. The paper also discusses computer security challenges for expanding this prototype container system to larger groups of quantum computing researchers.