A Quantum Logic Array Microarchitecture: Scalable Quantum Data Movement and Computation
Proceedings of the 38th International Symposium on Microarchitecture (MICRO-38), 2005
Cited by 27 (3 self)
Recent experimental advances have demonstrated technologies capable of supporting scalable quantum computation. A critical next step is how to put those technologies together into a scalable, fault-tolerant system that is also feasible. We propose a Quantum Logic Array (QLA) microarchitecture that forms the foundation of such a system. The QLA focuses on the communication resources necessary to efficiently support fault-tolerant computations. We leverage the extensive groundwork in quantum error correction theory and provide analysis that shows that our system is both asymptotically and empirically fault tolerant. Specifically, we use the QLA to implement a hierarchical, array-based design and a logarithmic-expense quantum-teleportation communication protocol. Our goal is to overcome the primary scalability challenges of reliability, communication, and quantum resource distribution that plague current proposals for large-scale quantum computing. Our work complements recent work by Balensiefer et al. [1], which studies the software tool chain necessary to simplify development of quantum applications; here we focus on modeling a full-scale optimized microarchitecture for scalable computing.
A Fault Tolerant, Area Efficient Architecture for Shor’s Factoring Algorithm
Cited by 14 (3 self)
We optimize the area and latency of Shor’s factoring while simultaneously improving fault tolerance through: (1) balancing the use of ancilla generators, (2) aggressive optimization of error correction, and (3) tuning the core adder circuits. Our custom CAD flow produces detailed layouts of the physical components and utilizes simulation to analyze circuits in terms of area, latency, and success probability. We introduce a metric, called ADCR, which is the probabilistic equivalent of the classic Area-Delay product. Our error correction optimization can reduce ADCR by an order of magnitude or more. Contrary to conventional wisdom, we show that the area of an optimized quantum circuit is not dominated exclusively by error correction. Further, our adder evaluation shows that quantum carry-lookahead adders (QCLA) beat ripple-carry adders in ADCR, despite being larger and more complex. We conclude with what we believe is one of the most accurate estimates of the area and latency required for 1024-bit Shor’s factorization: 7659 mm² for the smallest circuit and 6 × 10⁸ seconds for the fastest circuit.
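The abstract describes ADCR only as a probabilistic equivalent of the Area-Delay product and does not give a formula. A minimal sketch of one plausible reading follows, assuming (this is not stated in the abstract) that the metric is area times the expected time to a correct result when a failed run is simply repeated, i.e. area × latency / success probability. The function name `adcr` and all input values are hypothetical.

```python
def adcr(area_mm2: float, latency_s: float, success_probability: float) -> float:
    """Area times expected latency to a correct result (assumed definition).

    If a single run succeeds with probability p and failed runs are repeated,
    the expected number of runs is 1 / p, so the expected latency is
    latency_s / p.
    """
    if not 0.0 < success_probability <= 1.0:
        raise ValueError("success_probability must be in (0, 1]")
    expected_latency_s = latency_s / success_probability
    return area_mm2 * expected_latency_s


if __name__ == "__main__":
    # Hypothetical circuits: a small, unreliable design versus a larger,
    # more reliable one. These numbers are illustrative only.
    small_unreliable = adcr(area_mm2=10.0, latency_s=2.0, success_probability=0.25)
    large_reliable = adcr(area_mm2=20.0, latency_s=2.0, success_probability=1.0)
    print(small_unreliable, large_reliable)
```

Under this reading, a larger circuit can still win on the metric if its higher success probability outweighs its extra area, which is consistent with the abstract's remark about carry-lookahead adders.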
Quantum Memory Hierarchies: Efficient Designs to Match Available Parallelism in Quantum Computing
Cited by 11 (3 self)
The assumption of maximum parallelism support for the successful realization of scalable quantum computers has led to homogeneous, “sea-of-qubits” architectures. The resulting architectures overcome the primary challenges of reliability and scalability at the cost of physically unacceptable system area. We find that by exploiting the natural serialization at both the application and the physical microarchitecture level of a quantum computer, we can reduce the area requirement while improving performance. In particular, we present a scalable quantum architecture design that employs specialization of the system into memory and computational regions, each individually optimized to match hardware support to the available parallelism. Through careful application and system analysis, we find that our new architecture can yield up to a factor of thirteen savings in area due to specialization. In addition, by providing a memory hierarchy design for quantum computers, we can increase time performance by a factor of eight. This result brings us closer to the realization of a quantum processor that can solve meaningful problems.
Distributed arithmetic on a quantum multicomputer
See ACM, 2006
Cited by 11 (3 self)
We evaluate the performance of quantum arithmetic algorithms run on a distributed quantum computer (a quantum multicomputer). We vary the node capacity and I/O capabilities, and the network topology. The tradeoff of choosing between gates executed remotely, through “teleported gates” on entangled pairs of qubits (telegate), versus exchanging the relevant qubits via quantum teleportation, then executing the algorithm using local gates (teledata), is examined. We show that the teledata approach performs better, and that carry-ripple adders perform well when the teleportation block is decomposed so that the key quantum operations can be parallelized. A node size of only a few logical qubits performs adequately provided that the nodes have two transceiver qubits. A linear network topology performs acceptably for a broad range of system sizes and performance parameters. We therefore recommend pursuing small, high-I/O-bandwidth nodes and a simple network. Such a machine will run Shor’s algorithm for factoring large numbers efficiently.
Automated Generation of Layout and Control for Quantum Circuits
In Proc. of ACM Intl. Conf. on Computing Frontiers, 2007
Cited by 9 (3 self)
We present a computer-aided design flow for quantum circuits, complete with automatic layout and control logic extraction. To motivate automated layout for quantum circuits, we investigate grid-based layouts and show a fourfold variation in performance as we vary grid structure and initial qubit placement. We then propose two polynomial-time design heuristics: a greedy algorithm suitable for small, congestion-free quantum circuits and a dataflow-based analysis approach to placement and routing with implicit initial placement of qubits. Finally, we show that our dataflow-based heuristic generates better layouts than the state-of-the-art automated grid-based layout and scheduling mechanism in terms of latency and potential pipelinability, but at the cost of some area.
Scheduling physical operations in a quantum information processor
Proceedings of SPIE, 6244:62440T, 2006
Cited by 7 (1 self)
Tailoring quantum architectures to implementation style: A quantum computer for mobile and persistent qubits
In ISCA ’07, 2007
Cited by 5 (2 self)
In recent years, quantum computing (QC) research has moved from the realm of theoretical physics and mathematics into real implementations [9]. With many different potential hardware implementations, quantum computer architecture is a rich field with an opportunity to solve interesting new problems and to revisit old ones. This paper presents a QC architecture tailored to physical implementations with highly mobile and persistent quantum bits (qubits). Implementations with qubit coherency times that are much longer than operation times and qubit transportation times that are orders of magnitude faster than operation times lend greater flexibility to the architecture. This is particularly true in the placement and locality of individual qubits. For concreteness, we assume a physical device model based on electron-spin qubits on liquid helium (eSHe) [15]. Like many conventional computer architectures, QCs focus on the efficient exposure of parallelism. We present here a QC microarchitecture that enjoys increasing computational parallelism with size and latency scaling only linearly with the number of operations. Although an efficient and high level of parallelism is admirable, quantum hardware is still expensive and difficult to build, so we demonstrate how the software may be optimized to reduce an application’s hardware requirements by 25% with no performance loss. Because the majority of a QC’s time and resources are devoted to quantum error correction, we also present noise modeling results that evaluate error correction procedures. These results demonstrate that idle qubits in memory need only be refreshed approximately once every one hundred operation cycles.
Minimizing the Latency of Quantum Circuits during Mapping to the Ion-Trap Circuit Fabric
Cited by 3 (3 self)
Quantum computers are exponentially faster than their classical counterparts in terms of solving some specific, but important problems. The biggest challenge in realizing a quantum computing system is the environmental noise. One way to decrease the effect of noise (and hence, reduce the overhead of building fault tolerant quantum circuits) is to reduce the latency of the quantum circuit that runs on a quantum circuit. In this paper, a novel algorithm is presented for scheduling, placement, and routing of a quantum algorithm, which is to be realized on a target quantum circuit fabric technology. This algorithm, and the accompanying software tool, advances the state of the art in quantum CAD methodologies and methods while considering key characteristics and constraints of the ion-trap quantum circuit fabric. Experimental results show that the presented tool improves results of the previous tool by about 41%.
Keywords: quantum computing; scheduling; routing; placement; ion-trap technology; CAD tool
A library-based synthesis methodology for reversible logic
Microelectron. J., 2010
Cited by 3 (2 self)
Synthesis of reversible logic has received significant attention in recent years and many synthesis approaches for reversible circuits have been proposed so far. In this paper, a library-based synthesis methodology for reversible circuits is proposed where a reversible specification is considered as a permutation comprising a set of cycles. To this end, a pre-synthesis optimization step is introduced to construct a reversible specification from an irreversible function. In addition, a cycle-based representation model is presented to be used as an intermediate format in the proposed synthesis methodology. The selected intermediate format serves as a focal point for all potential representation models. In order to synthesize a given function, a library containing seven building blocks is used where each building block is a cycle of length less than 6. To synthesize large cycles, we also propose a decomposition algorithm which produces all possible minimal and inequivalent factorizations for a given cycle of length greater than 5. All decompositions contain the maximum number of disjoint cycles. The generated decompositions are used in conjunction with a novel cycle assignment algorithm which is proposed based on the graph matching problem to select the best possible cycle pairs. Then, each pair is synthesized by using the available components of the library. The decomposition algorithm together with the cycle assignment method are considered as a binding method which selects a building block from the library for each cycle. Finally, a post-synthesis optimization step is introduced to optimize the synthesis results in terms of different costs. To analyze the proposed methodology, various experiments are performed. Our analyses on the available reversible benchmark functions reveal that the proposed library-based synthesis methodology can produce low-cost circuits in some cases compared with the current approaches. The proposed methodology always converges and it typically synthesizes a given function fast. No garbage line is used for even permutations.
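The abstract treats a reversible specification as a permutation made up of cycles and singles out even permutations. A minimal sketch of that representation follows, assuming a reversible function on n bits is given as a list `perm` in which `perm[i]` is the output for input `i`. This is the standard disjoint-cycle decomposition and parity test, not the paper's decomposition or cycle assignment algorithm, and the function names are hypothetical.

```python
def disjoint_cycles(perm):
    """Return the nontrivial disjoint cycles of a permutation.

    perm[i] is the image of i. Fixed points (cycles of length 1) are omitted.
    """
    seen = [False] * len(perm)
    cycles = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        cycle = []
        i = start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = perm[i]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


def is_even(perm):
    """A k-cycle is a product of k - 1 transpositions, so sum those counts."""
    transpositions = sum(len(c) - 1 for c in disjoint_cycles(perm))
    return transpositions % 2 == 0


if __name__ == "__main__":
    # Toffoli gate on 3 bits: only inputs 110 (6) and 111 (7) are exchanged.
    toffoli = [0, 1, 2, 3, 4, 5, 7, 6]
    print(disjoint_cycles(toffoli), is_even(toffoli))

    # A single 3-cycle on four values is an even permutation.
    three_cycle = [1, 2, 0, 3]
    print(disjoint_cycles(three_cycle), is_even(three_cycle))
```

In this sketch the Toffoli truth table is a single transposition and therefore odd, while the 3-cycle is even.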