Reconfigurable Computing: A Survey of Systems and Software
, 2000
Cited by 196 (5 self)
Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey we explore the hardware aspects of reconfigurable computing machines, from single-chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution.
Hardware-assisted simulated annealing with application for fast FPGA placement
 In ACM FPGA ’03
, 2003
Cited by 36 (3 self)
To truly exploit FPGAs for rapid-turnaround development and prototyping, placement times must be reduced to seconds; late-bound, reconfigurable computing applications may demand placement times as short as microseconds. In this paper, we show how a systolic structure can accelerate placement by assigning one processing element to each possible location for an FPGA LUT from a design netlist. We demonstrate that our technique approaches the same quality point as traditional simulated annealing as measured by a simple linear wirelength metric. Experimental results look ahead to compare quality against VPR’s fast placer when considering the minimum channel width required to route as the primary optimization criterion. Preliminary results from an FPGA implementation show the feasibility of accelerating simulated annealing by three orders of magnitude using this approach. This means we can place the largest design in the University of Toronto’s “FPGA
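For orientation, the baseline this abstract compares against (traditional simulated annealing scored by a linear wirelength metric) might look roughly like the sketch below. This is a conventional software annealer, not the paper's systolic hardware structure. The function names, the rectangular grid model, the half-perimeter form of the wirelength metric, and the cooling parameters (`t0`, `alpha`, `steps`) are all assumptions made for illustration.

```python
import math
import random


def wirelength(nets, pos):
    """Linear wirelength: sum of half-perimeter bounding boxes over all nets."""
    total = 0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_placement(cells, nets, width, height,
                     steps=20000, t0=5.0, alpha=0.9995, seed=0):
    """Place cells on a width x height grid by simulated annealing (illustrative)."""
    rng = random.Random(seed)
    n_sites = width * height
    if len(cells) > n_sites:
        raise ValueError("more cells than sites")
    # One slot per grid site; None marks an empty site.
    slots = list(cells) + [None] * (n_sites - len(cells))
    rng.shuffle(slots)

    def positions():
        # Map each placed cell to its (x, y) grid coordinate.
        return {c: (i % width, i // width)
                for i, c in enumerate(slots) if c is not None}

    cost = wirelength(nets, positions())
    best_pos, best_cost = positions(), cost
    temp = t0
    for _ in range(steps):
        # Propose swapping the contents of two distinct sites.
        i, j = rng.sample(range(n_sites), 2)
        slots[i], slots[j] = slots[j], slots[i]
        new_cost = wirelength(nets, positions())
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept uphill moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best_pos, best_cost = positions(), cost
        else:
            slots[i], slots[j] = slots[j], slots[i]  # reject: undo the swap
        temp = max(temp * alpha, 1e-9)  # geometric cooling, clamped above zero
    return best_pos, best_cost
```

Calling `anneal_placement(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")], 2, 2)` returns a placement dictionary together with its wirelength. The sketch recomputes the full cost on every move for clarity; a practical placer would update it incrementally.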
Partitioning of Unstructured Meshes for Load Balancing
, 1995
Cited by 23 (5 self)
Many large-scale engineering and scientific calculations involve repeated updating of variables on an unstructured mesh. To do these types of computations on distributed memory parallel computers, it is necessary to partition the mesh among the processors so that the load balance is maximized and interprocessor communication time is minimized. This can be approximated by the problem of partitioning a graph so as to obtain a minimum cut, a well-studied combinatorial optimization problem. Graph partitioning is NP-complete, so for real-world applications, one resorts to heuristics, i.e., algorithms that give good but not necessarily optimum solutions. These algorithms include recursive spectral bisection, local search methods such as Kernighan-Lin, and more general-purpose methods such as simulated annealing. We show that a general procedure enables us to combine simulated annealing with Kernighan-Lin. The resulting algorithm is both very fast and extremely effective. 1 Introduction Co...
New Faster Kernighan-Lin-Type Graph-Partitioning Algorithms
 In Proc. IEEE Intl. Conf. Computer-Aided Design
, 1993
Cited by 22 (3 self)
In this paper we present a very efficient graph partitioning scheme, Quick Cut, that uses the basic strategy of the Kernighan-Lin (KL) algorithm to swap pairs of nodes to improve an existing partition of a graph G. The main feature of Quick Cut is a "neighborhood search" strategy that is based on the result (obtained in this paper) that it is not necessary to search more than a certain subset of d^2 node pairs to find the node pair with the maximum swap gain. Here d is the maximum node degree of G. We also use an improved data structure, viz., balanced trees, to store the nodes in the two partitions. Due to the new search strategy and data structure, Quick Cut has a worst-case time complexity of Θ(max(ed, e log n)), and an average-case complexity of Θ(e log n), where e is the number of edges of G. The KL algorithm, on the other hand, has a time complexity of Θ(n^2 log n), where n is the number of nodes of G. Another contribution of this paper is the presentation o...
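To make the notion of "swap gain" concrete, the sketch below computes the standard Kernighan-Lin gain for exchanging a node a on one side with a node b on the other: gain(a, b) = D(a) + D(b) - 2·w(a, b), where D(v) is v's external edge weight minus its internal edge weight. The search here is the naive exhaustive scan over all cross pairs; it is not Quick Cut's neighborhood search, and it does not use the balanced-tree data structure the abstract mentions. The adjacency-dictionary representation and the function names are assumptions for illustration.

```python
def d_values(adj, side):
    """D(v) = external cost - internal cost for every node v."""
    d = {}
    for v, nbrs in adj.items():
        external = sum(w for u, w in nbrs.items() if side[u] != side[v])
        internal = sum(w for u, w in nbrs.items() if side[u] == side[v])
        d[v] = external - internal
    return d


def cut_size(adj, side):
    """Total weight of edges crossing the partition (each edge is stored twice)."""
    crossing = sum(w for v, nbrs in adj.items()
                   for u, w in nbrs.items() if side[v] != side[u])
    return crossing / 2


def best_swap(adj, side):
    """Exhaustively find the cross-partition pair with maximum swap gain."""
    d = d_values(adj, side)
    best = None
    for a in adj:
        if side[a] != 0:
            continue
        for b in adj:
            if side[b] != 1:
                continue
            # Swapping a and b reduces the cut by exactly this amount.
            gain = d[a] + d[b] - 2 * adj[a].get(b, 0)
            if best is None or gain > best[0]:
                best = (gain, a, b)
    return best


# Two disjoint edges, a-b and c-d, both initially cut by the partition.
adj = {"a": {"b": 1}, "b": {"a": 1}, "c": {"d": 1}, "d": {"c": 1}}
side = {"a": 0, "d": 0, "b": 1, "c": 1}
gain, a, b = best_swap(adj, side)
```

In this example the cut starts at 2; swapping the returned pair moves both edges inside a single side each, so the cut drops by the reported gain.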
Partition-Based Clustering in Object Bases: From Theory to Practice
, 1993
Cited by 22 (7 self)
We classify clustering algorithms into sequence-based techniques, which transform the object net into a linear sequence, and partition-based clustering algorithms. Tsangaris and Naughton [TN91, TN92] have shown that the partition-based techniques are superior. However, their work is based on a single partitioning algorithm, the Kernighan and Lin heuristic, which is not applicable to realistically large object bases because of its high running-time complexity. The contribution of this paper is twofold: (1) we devise a new class of greedy object graph partitioning algorithms (GGP) whose running-time complexity is moderate while still yielding good-quality results. For large object graphs, GGP is the best known heuristic with an acceptable running time. (2) We carry out an extensive quantitative analysis of all well-known partitioning algorithms for clustering object graphs. Our analysis shows that no single algorithm is superior for all object net characteristics. Therefore, we d...
Configurable Computing: A Survey of Systems and Software
, 1999
Cited by 19 (3 self)
Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey we explore the hardware aspects of reconfigurable computing machines, from single-chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution. Introduction There are two primary methods in traditional computing for the execution of algorithms. The first is to use an Application Specific Integrated Circuit, or ASIC, to perform the ope...
Software Technologies for Reconfigurable Systems
 IEEE Computer
, 1996
Cited by 18 (6 self)
FPGA-based systems are a significant area of computing, providing a high-performance implementation substrate for many different applications. However, the key to harnessing their power for most domains is developing mapping tools for automatically transforming a circuit or algorithm into a configuration for the system. In this paper we review the current state of the art in mapping tools for FPGA-based systems, including single-chip and multi-chip mapping algorithms for FPGAs, software support for reconfigurable computing, and tools for run-time reconfigurability. We also discuss the challenges for the future, pointing out where development is still needed to let reconfigurable systems achieve all of their promise. 1.0 Introduction Reconfigurable computing is becoming a powerful methodology for achieving high-performance implementations of many applications. By mapping applications into FPGA hardware resources, extremely efficient computations can be performed. In [Hauck98] w...
Pin Assignment for Multi-FPGA Systems
, 1997
Cited by 14 (5 self)
Multi-FPGA systems have tremendous potential, providing a high-performance computing substrate for many different applications. One of the keys to achieving this potential is a complete, automatic mapping solution that creates high-quality mappings in the shortest possible time. In this paper we consider one step in this process, the assignment of inter-FPGA signals to specific I/O pins on the FPGAs in a multi-FPGA system. We show that this problem can be handled neither by pin assignment methods developed for other applications nor by standard routing algorithms. Although current mapping systems ignore this issue, we show that an intelligent pin assignment method can achieve both quality and mapping speed improvements over random approaches. Intelligent pin assignment methods already exist for multi-FPGA systems, but are restricted to topologies where logic-bearing FPGAs cannot be directly connected. In this paper we provide three new algorithms for the pin assignment of multi...
Fast Placement Approaches for FPGAs
 ACM Trans. on Design Automation of Electronic Systems
, 2002