Karhunen-Loève Transform: An Exercise in Simple Image-Processing Parallel Pipelines
- In Euro-Par'97, 1997
Abstract - Cited by 5 (4 self)
Practical parallelizations of multi-phased low-level image-processing algorithms may require working in batch mode. The features of a common processing model, employing a pipeline of processor farms, are described. A simple exemplar, the Karhunen-Loève transform, is prototyped on a network of processors running a real-time operating system. The design trade-offs for this and similar algorithms are indicated, when a general solution is sought. Eventual implementation on large- and fine-grained hardware is considered. The chosen exemplar is shown to have some features, such as strict sequencing and unbalanced processing phases, which militate against a comfortable parallelization.
1 Introduction
Many low-level image-processing (IP) algorithms, such as spatial filters, are completely localized in their data references. If adjacent image data are overlapped at boundaries then at a small additional cost a data-farming programming paradigm can be employed, in which the only com...
Scheduling Schemes for Data Farming
- IEE Proceedings Part E (Computers and Digital Techniques), 1999
Abstract - Cited by 1 (1 self)
The use of order statistics to arrive at a scheduling regime is shown to be applicable to data farms running on second-generation parallel processors. Uniform and decreasing task-size scheduling regimes are examined. Experimental timings and a further simulation for large-scale effects were used to exercise the scheduling regimes. The paper also considers a number of other scheduling schemes for data farms. It is shown that a method previously used for loop-scheduling is preferable, particularly as a form of automatic and generalised scheduling for data farming where there is a data-dependent workload.
1 Introduction
A processor or data farm [1] is a programming paradigm involving message-passing in which a single task is repeatedly executed in parallel on a collection of initial data. Data-farming is a commonly-used paradigm in parallel processing [2] and appears in numerous guises: some network-of-workstations (NOW) based [3]; some based on dedicated multicomputers [4...
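To make the contrast between uniform and decreasing task sizes concrete, the following is a minimal sketch and not the paper's method. The abstract does not name the loop-scheduling method it prefers, so the decreasing regime here assumes the well-known guided self-scheduling rule (each request receives the remaining work divided by the number of workers, rounded up). The function names `uniform_chunks` and `decreasing_chunks` are hypothetical.

```python
import math


def uniform_chunks(total_tasks, chunk_size):
    """Split the workload into equal-sized chunks (the last one may be smaller)."""
    sizes = []
    remaining = total_tasks
    while remaining > 0:
        size = min(chunk_size, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


def decreasing_chunks(total_tasks, workers):
    """Assumed rule (guided self-scheduling): hand out ceil(remaining / workers)
    tasks per request, so chunk sizes shrink as the farm drains."""
    sizes = []
    remaining = total_tasks
    while remaining > 0:
        size = math.ceil(remaining / workers)
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    print(uniform_chunks(100, 10))
    print(decreasing_chunks(100, 4))
```

Large early chunks keep message-passing overhead low, while small late chunks limit how long the farm waits on its slowest worker when task times are data-dependent.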
EFFICIENT IMAGE RECONSTRUCTION USING PARTIAL 2D FOURIER TRANSFORM
Abstract
In this paper we present an efficient way of doing image reconstruction using the 2D Discrete Fourier transform (DFT). We exploit the fact that in the frequency domain, information is concentrated in certain regions. Consequently, it is sufficient to compute a partial 2D Fourier transform where only m × m elements of an N × N image are nonzero. Compared with the traditional row-column (RC) decomposition algorithm, the proposed algorithm enables us to reconstruct images with significantly lower computational complexity at the expense of mild degradation in quality. We also describe the implementation of the new reconstruction algorithm on a Xilinx Virtex-II Pro-100 FPGA. For 512 × 512 natural and aerial images, this implementation results in a 68% reduction in the number of memory accesses and a 76% reduction in the total computation time compared to the RC method.
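The premise that a small block of frequency coefficients can carry most of an image can be illustrated with a short sketch. This is not the paper's algorithm or its FPGA design: it uses a naive DFT for clarity, and it assumes that the retained m × m block is the lowest-frequency one and that m is odd, so the kept set is symmetric. All function names are hypothetical.

```python
import cmath
import math


def dft2(img):
    """Naive 2D DFT of a square N x N image (O(N^4), illustration only)."""
    n = len(img)
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / n)
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


def low_indices(n, m):
    """Indices of the m lowest frequencies along one axis (m assumed odd)."""
    half = (m - 1) // 2
    return [k for k in range(n) if min(k, n - k) <= half]


def keep_low_block(spec, m):
    """Zero every coefficient outside the assumed m x m low-frequency block."""
    n = len(spec)
    keep = set(low_indices(n, m))
    return [[spec[u][v] if (u in keep and v in keep) else 0j
             for v in range(n)]
            for u in range(n)]


def idft2_real(spec):
    """Naive inverse 2D DFT, returning the real part of each pixel."""
    n = len(spec)
    return [[sum(spec[u][v] * cmath.exp(2j * cmath.pi * (u * x + v * y) / n)
                 for u in range(n) for v in range(n)).real / (n * n)
             for y in range(n)]
            for x in range(n)]


def rms_error(a, b):
    """Root-mean-square pixel difference between two equal-sized images."""
    n = len(a)
    total = sum((a[x][y] - b[x][y]) ** 2 for x in range(n) for y in range(n))
    return math.sqrt(total / (n * n))


if __name__ == "__main__":
    n = 8
    ramp = [[float(x * y) for y in range(n)] for x in range(n)]
    spectrum = dft2(ramp)
    for m in (3, 5, 7):
        approx = idft2_real(keep_low_block(spectrum, m))
        print(m, rms_error(ramp, approx))
```

Increasing m trades extra computation for lower reconstruction error, which is the quality-versus-cost trade-off the abstract describes.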

