## Improved derivation of process networks (2006)

Venue: | Proceedings of the 4th International Workshop on Optimization for DSP and Embedded Systems (ODES ’06) |

Citations: | 5 - 1 self |

### BibTeX

@INPROCEEDINGS{Verdoolaege06improvedderivation,

author = {Sven Verdoolaege and Hristo Nikolov and Todor Stefanov},

title = {Improved derivation of process networks},

booktitle = {Proceedings of the 4th International Workshop on Optimization for DSP and Embedded Systems (ODES ’06)},

year = {2006}

}

### Abstract

Emerging embedded System-on-Chip platforms are increasingly becoming multiprocessor architectures. System designers experience significant difficulties in programming these platforms. Applications are typically specified as sequential programs that do not reveal the parallelism available in the application, which hinders the efficient mapping of an application onto a parallel multiprocessor platform. In this paper we present our compiler techniques that facilitate the migration from a sequential application specification to a parallel application specification using the Process Network model of computation. Our work is inspired by a previous research project called Compaan. With our techniques we address optimization issues such as the generation of Process Networks with simplified topology and communication without sacrificing the Process Networks' performance. Moreover, we describe a technique for compile-time memory requirement estimation, which we consider an important contribution of this paper. We demonstrate the usefulness of our techniques on several examples.

### Citations

832 | The Semantics of a Simple Language for Parallel Programming
- Kahn
- 1974
Citation Context ...lel model of computation used for parallel application specification. Although many parallel models of computation exist [14], [15], in this paper we consider the Process Network model of computation [12] because its operational semantics are simple, yet general enough, to conveniently specify stream-oriented data processing that fits nicely with the application domain we are interested in: multimedia ... |

261 | A Framework for Comparing Models of Computation
- Lee, Sangiovanni-Vincentelli
- 1998
Citation Context ...rallel application specification. These compiler techniques depend on the parallel model of computation used for parallel application specification. Although many parallel models of computation exist [14], [15], in this paper we consider the Process Network model of computation [12] because its operational semantics are simple, yet general enough, to conveniently specify stream-oriented data processin... |

172 | Parametric integer programming
- Feautrier
- 1988
Citation Context ...ified variables $\alpha \in \mathbb{Z}^{n'}$ as well as some parameters $p \in \mathbb{Z}^{n''}$, i.e., $S = \{\, i \in \mathbb{Z}^{n} \mid \exists \alpha \in \mathbb{Z}^{n'} : A i + B \alpha + C p + c \ge 0 \,\}$, with $A \in \mathbb{Z}^{m \times n}$, $B \in \mathbb{Z}^{m \times n'}$, $C \in \mathbb{Z}^{m \times n''}$ and $c \in \mathbb{Z}^{m}$. Parametric integer programming [9] is a technique for computing the lexicographically smallest element of a parametric integer set. The result is a subdivision of the parameter space with for each cell of this subdivision a descriptio... |
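
As a rough illustration of the lexicographic minimum mentioned in this context, the sketch below enumerates a small parametric set by brute force. The set $S(p) = \{(i, j) \mid 0 \le i, j \le 5,\ i + j - p \ge 0\}$, the bound of 5, and the function name `lexmin` are assumptions made for this example only. Parametric integer programming obtains the result symbolically as a function of the parameter; enumeration is used here just to show how the answer changes from one region of the parameter space to another.

```python
from itertools import product


def lexmin(p, bound=5):
    """Brute-force lexicographic minimum of a hypothetical set S(p).

    S(p) = {(i, j) : 0 <= i, j <= bound and i + j - p >= 0}.
    Returns None when S(p) is empty.
    """
    points = [
        (i, j)
        for i, j in product(range(bound + 1), repeat=2)
        if i + j - p >= 0  # the single parametric constraint of this toy set
    ]
    # Python compares tuples lexicographically, so min() gives the lexmin.
    return min(points) if points else None


if __name__ == "__main__":
    for p in (0, 3, 7, 11):
        print(p, lexmin(p))
```

For this toy set the minimum is (0, 0) when p ≤ 0, (0, p) when 0 < p ≤ 5, (p − 5, 5) when 5 < p ≤ 10, and the set is empty for p > 10. This piecewise answer is the kind of subdivision of the parameter space the context describes.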

118 | Code generation in the polyhedral model is easier than you think
- Bastoul
- 2004
Citation Context ...s and identifying some special cases. We will not discuss these issues any further here. For non-parametric problems, it is usually easier to simulate the communication channel. That is, we use CLooG [1] to generate code that increments a counter for each iteration writing to the channel and decrements the counter for each read iteration. The maximum value attained by this counter is recorded and ref... |
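
The counter-based simulation described in this context can be sketched as follows. This is a minimal illustration, not the code that CLooG generates: the event stream, the function names `max_fifo_occupancy` and `example_schedule`, and the example schedule itself are all hypothetical and chosen only to show the increment/decrement idea.

```python
def max_fifo_occupancy(events):
    """Return the peak number of tokens held in a channel.

    `events` is an iterable of "write" / "read" strings listed in the
    order in which the sequential program would execute them.
    """
    count = 0
    peak = 0
    for event in events:
        if event == "write":
            count += 1          # one more token waiting in the channel
            peak = max(peak, count)
        elif event == "read":
            count -= 1          # one token consumed
            if count < 0:
                raise ValueError("read before matching write")
        else:
            raise ValueError(f"unknown event: {event!r}")
    return peak


def example_schedule(n, delay):
    """Hypothetical schedule: step t writes token t (while t < n),
    then reads token t - delay (once t >= delay)."""
    for t in range(n + delay):
        if t < n:
            yield "write"
        if t >= delay:
            yield "read"


if __name__ == "__main__":
    print(max_fifo_occupancy(example_schedule(10, 3)))
```

In this toy schedule the consumer lags the producer by `delay` steps, so the recorded peak is `min(n, delay + 1)`. A buffer of that size would suffice for this particular execution order.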

96 | YAPI: Application modeling for signal processing systems
- Kock, Essink, et al.
Citation Context ...tream-oriented data processing that fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our... |

70 | A Systematic Approach to Exploring Embedded System Architectures at Multiple Abstraction Levels
- Pimentel, Erbas, et al.
Citation Context ...ing that fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our compiler techniques for der... |

54 | Compaan: Deriving Process Networks from Matlab for Embedded Signal Processing Architectures
- Kienhuis, Rijpkema, et al.
Citation Context ...oned above in a particular way. SANLPs are important in Scientific, Matrix Computation and Multimedia and Adaptive Signal Processing applications. Our work is inspired by previous research on Compaan [13], [20], [23]. The techniques presented in this paper can be seen as a significant improvement of the techniques developed in the Compaan project in the following sense. The Compaan project has identif... |

53 | System Design using Kahn Process Networks: The Compaan/Laura Approach
- Stefanov, Zissulescu, et al.
Citation Context ...at fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our compiler techniques for deriving ... |

48 | Lattice-based memory allocation
- Darte, Schreiber, et al.
- 2005
Citation Context ...lready been highlighted throughout the text. As to memory size requirements, much research has been devoted to optimal reuse of memory for arrays. For an overview and a general technique, we refer to [4]. These techniques are complementary to our research on FIFO sizes and can be used on the reordering channels and optionally on the data communication inside a node. Also related is the concept of reu... |

34 | Multiprocessor Mapping of Process Networks: A JPEG Decoding Case Study
- Kock
Citation Context ...ntee a correct-by-construction generation of Process Networks. Process Networks have been used to model applications and to explore the mapping of these applications onto multiprocessor architectures [7], [16], [19], [24]. The application modeling is performed manually starting from sequential C code and a significant amount of time (a few weeks) is spent by the designers on correctly transforming th... |

32 | System Level Design with SPADE: an M-JPEG Case Study
- Lieverse, Stefanov, et al.
Citation Context ...data processing that fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our compiler techni... |

29 | Generating cache hints for improved program efficiency
- Beyls, D’Hollander
- 2005
Citation Context ...iques are complementary to our research on FIFO sizes and can be used on the reordering channels and optionally on the data communication inside a node. Also related is the concept of reuse distances [2]. In particular, our FIFO sizes are a special case of the “reuse distance per statement” of [26]. For more advanced forms of copy propagation, we refer to [25]. VIII. CONCLUSIONS AND DISCUSSION In thi... |

28 | Analytical computation of Ehrhart polynomials: Enabling more compiler analyses and optimizations
- Verdoolaege, Seghir, et al.
- 2004
Citation Context ...nt the number of write iterations that are lexicographically smaller than this read iteration. Although counting the number of elements in the resulting sets is very easy, we use the barvinok library [29] to perform this counting. This library is designed for more complicated cases but also detects and appropriately handles these easy cases. In the example, the first read operation occurs at iteration... |

19 | Factoring wavelet transforms into lifting schemes
- Daubechies, Sweldens
- 1998
Citation Context ...refore filter-length dependent. The C program realizing one level of a 2D forward DWT is presented in Figure 10. In this example we use a lifting scheme of a reversible transformation with 5/3 filter [5]. In this case the image has to be extended with one pixel at the boundaries. All the boundary conditions are described by the conditions in code lines 8, 11, 17, 20, 26 and 29. ... |

19 | Multi-dimensional incremental loop fusion for data locality
- Verdoolaege, Bruynooghe, et al.
- 2003
Citation Context ... distance vectors are (lexicographically) positive. The end result is a schedule that ensures that every data element is written before it is read. For more information on this algorithm, we refer to [27], where it is applied to perform loop fusion on SANLPs. Note that unlike the case of loop fusion, we can ignore antidependences here, unless we want to use the declared size of an array as an estimate... |

17 | C-HEAP: A Heterogeneous Multi-processor Architecture Template and Scalable and Flexible Protocol for the Design of Embedded Signal Processing Systems
- Nieuwland, Kang, et al.
- 2002
Citation Context ...rocessing that fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our compiler techniques f... |

13 | Advanced copy propagation for arrays
- Vanbroekhoven, Janssens, et al.
- 2003
Citation Context ...related is the concept of reuse distances [2]. In particular, our FIFO sizes are a special case of the “reuse distance per statement” of [26]. For more advanced forms of copy propagation, we refer to [25]. VIII. CONCLUSIONS AND DISCUSSION In this paper we have improved upon the state-of-the-art conversion of sequential programs to Process Networks in several ways. We have shown that we can reduce the ... |

12 | Automatic Synthesis of System on Chip Multiprocessor Architectures for Process networks
- Dwivedi, Kumar, et al.
Citation Context ...m-oriented data processing that fits nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our com... |

11 | Mapping Concurrent Applications onto Architectural Platforms
- Mihal, Keutzer
- 2003
Citation Context ...m very difficult. By contrast, if an application is specified using a parallel model of computation (MoC) then the mapping can be done in a systematic and transparent way using a disciplined approach [17], but specifying an application using a parallel MoC is difficult, not well understood by application developers, and a time-consuming and error-prone process. That is why application developers still... |

4 | Deriving Process Networks from Nested Loop Algorithms
- Rijpkema, Deprettere, et al.
- 2000
Citation Context ...bove in a particular way. SANLPs are important in Scientific, Matrix Computation and Multimedia and Adaptive Signal Processing applications. Our work is inspired by previous research on Compaan [13], [20], [23]. The techniques presented in this paper can be seen as a significant improvement of the techniques developed in the Compaan project in the following sense. The Compaan project has identified th... |

3 | PtolemyII: Heterogeneous Concurrent Modeling and Design in Java
- Lee
- 1999
Citation Context ... application specification. These compiler techniques depend on the parallel model of computation used for parallel application specification. Although many parallel models of computation exist [14], [15], in this paper we consider the Process Network model of computation [12] because its operational semantics are simple, yet general enough, to conveniently specify stream-oriented data processing that... |

2 | A Hierarchical Classification Scheme to Derive Interprocess Communication
- Turjan, Kienhuis, et al.
Citation Context ...tion of the reordering test of [23], where it is formulated as n1 × n2 PIP problems for a channel described by a single integer relation. The simplified computation for specific types of relations of [22] applies to pairs of the same relation and, with some modifications, also to pairs of different relations. D. Detecting Self Reuse The reordering check from the previous section is not sufficient to det... |

2 | A high-level memory energy estimator based on reuse distance
- Aa, Jayapal, et al.
- 2005
Citation Context ...s and optionally on the data communication inside a node. Also related is the concept of reuse distances [2]. In particular, our FIFO sizes are a special case of the “reuse distance per statement” of [26]. For more advanced forms of copy propagation, we refer to [25]. VIII. CONCLUSIONS AND DISCUSSION In this paper we have improved upon the state-of-the-art conversion of sequential programs to Process ... |

2 | An access regularity criterion and regularity improvement heuristics for data transfer optimization by global loop transformations
- Verdoolaege, Danckaert, et al.
- 2003
Citation Context ...o align the iteration domains. Although our heuristics seem to perform relatively well on our examples, it is clear that we need a more general approach such as the linear transformation algorithm of [28]. V. WORKED-OUT EXAMPLES In this section, we show the results of applying our optimization techniques to two image processing algorithms. The generated Process Networks (PN) enjoy a reduction in the a... |

1 | A symbolic approach to Bernstein expansion (in Russian)
- Clauss, Tchoupaeva
Citation Context ...ired buffer size is the maximum of this expression over all read iterations. Computing this parametric maximum remains an obstacle, however. A possible approach is to use symbolic Bernstein expansion [3], but to the best of our knowledge this technique has not been implemented yet. For sufficiently regular problems, we can still compute the above maximum symbolically by performing some simplification... |

1 | An MPEG-2 Decoder Case Study as a Driver for a
- Wolf, Lieverse, et al.
Citation Context ...s nicely with the application domain we are interested in: multimedia and signal processing applications. Moreover, for this application domain, many researchers [6]–[8], [11], [16], [18], [19], [21], [24] have already indicated that Process Networks are very suitable for systematic and efficient mapping onto multiprocessor platforms. In this paper we present our compiler techniques for deriving Proces... |