## SVP - a Model Capturing Sets, Streams, and Parallelism (1992)

Venue: Proceedings of the 18th VLDB Conference

Citations: 22 (0 self)

### BibTeX

@INPROCEEDINGS{Parker92svp-,

author = {D. Stott Parker and Eric Simon and Patrick Valduriez},

title = {SVP - a Model Capturing Sets, Streams, and Parallelism},

booktitle = {Proceedings of the 18th VLDB Conference},

year = {1992},

pages = {115--126},

publisher = {Morgan-Kaufmann}

}

### Abstract

We describe the SVP data model. The goal of SVP is to model both set and stream data, and to model parallelism in bulk data processing. SVP also shows promise for other parallel processing applications. SVP models collections, which include sets and streams as special cases. Collections are represented as ordered tree structures, and divide-and-conquer mappings are easily defined on these structures. We show that many useful database mappings (queries) have a divide-and-conquer format when specified using collections, and that this specification exposes parallelism. We formalize a class of divide-and-conquer mappings on collections called SVP-transducers. SVP-transducers generalize aggregates, set mappings, stream transductions, and scan computations. At the same time, they have a rigorous semantics based on continuity with respect to collection orderings, and permit implicit specification of both independent and pipeline parallelism.

### Citations

723 | in Context
- Carriero, Gelernter
- 1989
Citation Context: ...o low-level and difficult for the programmer. Furthermore, the large variety of parallel architectures results in distinct, architecture-specific extensions to the original language. (Footnote 1: Linda [4] is a notable exception: a 'coordination language' with simple, language-independent parallel constructs, which can mate easily with many non-parallel languages.) In order to achieve efficient program execution...

489 | The chemical abstract machine
- Berry, Boudol
- 1990
Citation Context: ...anguages aimed at supporting both pipeline and independent parallelism. Recently a variety of parallel programming models have taken a larger view of how parallelism should be expressed. For example, [6] combines a set orientation with parallel lambda calculus (rewriting) execution. It is interesting to consider a programming language based on the SVP transducer, or equivalents like divide-and-c...

467 | Comprehending monads
- Wadler
- 1990
Citation Context: ...e other specification techniques, including restricted higher-order mappings like the reduction operator in APL [10] and the pump operator in FAD [6], list comprehensions and elegant variants thereof [20], and series-parallel computation graphs [15]. • Parallelism in the dividing and conquering is specified using both the structure of the data, and the structure of the divide-and-conquer mapping: di...

242 | Data Parallel Algorithms
- Hillis, Steele
- 1986
Citation Context: ...(with the MIMD computation model). The regularity of the data structures available in the language permits exploitation of different forms of parallelism, such as independent and pipeline parallelism [9]. In this paper, we follow the third approach, and propose a model for parallel database programming where the primary sources for parallelism are parallel set and stream expressions. Parallel program...

232 | A Programming Language
- Iverson
- 1962
Citation Context: ...s paper these mappings are specified with recursive functional equations. They generalize other specification techniques, including restricted higher-order mappings like the reduction operator in APL [10] and the pump operator in FAD [6], list comprehensions and elegant variants thereof [20], and series-parallel computation graphs [15]. • Parallelism in the dividing and conquering is specified using...

221 | An introduction to the theory of lists
- Bird
- 1987
Citation Context: ...lues in the set. Streams analogously use the following notation: 1. $[\,]$ is a stream (the empty stream); 2. $[x]$ is a stream, for any value $x$; 3. Finite streams are written with square braces, as with: $[1, 2, 3]$. 4. The concatenation $S_1 \bullet S_2$ is a stream, if $S_1$ and $S_2$ are streams. (We use the symbol '$\bullet$' for stream concatenation ('append') in this paper.) 5. The length $|S|$ of any stream $S$ is the nu...

109 | Recursive Programming Techniques
- Burge
- 1975
Citation Context: ...s definable as an SVP transducer: $\mathit{pump}(h, \theta, id_\theta, \langle\rangle) = id_\theta$; $\mathit{pump}(h, \theta, id_\theta, \langle x \rangle) = h(x)$; $\mathit{pump}(h, \theta, id_\theta, S_1 \diamond S_2) = \mathit{pump}(h, \theta, id_\theta, S_1) \;\theta\; \mathit{pump}(h, \theta, id_\theta, S_2)$. The list1 operator in [3] is similar. The APL reduction operator [10] allows non-associative, non-commutative operators. In particular, if $\theta$ is a binary operator and $S = (x_1, x_2, \ldots, x_n)$ is a vector, the APL reductio...
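
The three pump equations in the excerpt above can be read directly as a recursive function over a tree-shaped collection. The following is only a minimal Python sketch of that reading, not the paper's notation or implementation: the class names `Empty`, `Unit`, and `Join`, and the argument names `op` and `identity` (standing for the binary operator and its identity element), are assumptions introduced for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass(frozen=True)
class Empty:
    """The empty collection (hypothetical name)."""


@dataclass(frozen=True)
class Unit:
    """A singleton collection holding one value (hypothetical name)."""
    value: Any


@dataclass(frozen=True)
class Join:
    """The combination of two sub-collections (hypothetical name)."""
    left: "Collection"
    right: "Collection"


Collection = Union[Empty, Unit, Join]


def pump(h: Callable[[Any], Any], op: Callable[[Any, Any], Any],
         identity: Any, s: Collection) -> Any:
    """Evaluate the three pump equations by structural recursion."""
    if isinstance(s, Empty):
        return identity          # pump of the empty collection is the identity
    if isinstance(s, Unit):
        return h(s.value)        # pump of a singleton applies h
    # The two recursive calls are independent of each other,
    # which is where the parallelism would come from.
    return op(pump(h, op, identity, s.left),
              pump(h, op, identity, s.right))


example = Join(Join(Unit(1), Unit(2)), Join(Unit(3), Empty()))
sum_of_squares = pump(lambda x: x * x, operator.add, 0, example)   # 14
as_list = pump(lambda x: [x], operator.add, [], example)           # [1, 2, 3]
```

With an order-sensitive operator such as list concatenation, the result follows the left-to-right order of the tree, which is how a stream-like reading of the same structure would behave in this sketch.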

108 | Structural recursion as a query language
- Breazu-Tannen, Buneman, et al.
- 1991
Citation Context: ... All programs in this paper (and more) are presented there. Earlier this month we were informed by Shamim Naqvi that very similar work was independently developed by Breazu-Tannen, Buneman, and Naqvi [22] during the early part of 1991. This work takes a somewhat different tack, emphasizing not parallelism or 'divide-and-conquer', but the elegant properties of structural recursion for database programm...

60 | Tree automata: an informal survey
- Thatcher
- 1973
Citation Context: ...ve functional equations. They generalize other specification techniques, including restricted higher-order mappings like the reduction operator in APL [16] and the pump operator in FAD [10], automata [29, 30], and series-parallel computation graphs [24]. • Parallelism in the dividing and conquering is specified using both the structure of the data, and the structure of the divide-and-conquer mapping: di...

58 | Algebraic identities for program calculation
- Bird
- 1989
Citation Context: ...lues in the set. Streams analogously use the following notation: 1. $[\,]$ is a stream (the empty stream); 2. $[x]$ is a stream, for any value $x$; 3. Finite streams are written with square braces, as with: $[1, 2, 3]$. 4. The concatenation $S_1 \bullet S_2$ is a stream, if $S_1$ and $S_2$ are streams. (We use the symbol '$\bullet$' for stream concatenation ('append') in this paper.) 5. The length $|S|$ of any stream $S$ is the nu...

47 | Parallel algorithms for the execution of relational database operations
- Bitton, Boral, et al.
- 1983
Citation Context: ...a low-level form of relational algebra with explicit (low-level) parallel constructs [2]. Data partitioning is used to spread the computation of relational algebra operators among parallel processors [1]. This partitioning is typically defined during the physical database design and then exploited by a compiler. Most of the time, a partitioned computation requires that processors exchange intermediat...

44 | Time Warps, String Edits and Macromolecules: the Theory and Practice of Sequence Comparison, 2nd Edition
- Sankoff, Kruskal (Eds.)
- 1999
Citation Context: ...In array processing, it seems that the restrictions on SVP recursions over collections are not flexible enough for expressing some computations. A good example comes from sequence comparison problems [26]. These problems usually have a natural divide-and-conquer structure. For example, if $A$ and $B$ are sequences, and $\langle\rangle$ is the empty sequence, their minimal string-edit distance $d(A, B)$ is defined by $d(\langle\rangle$...
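
The definition of $d(A, B)$ is cut off in the excerpt above, so the paper's exact formulation is not available here. As a rough illustration of the kind of recursion involved, the sketch below uses the textbook unit-cost edit-distance recurrence (insert, delete, substitute, each costing 1). This is an assumption and may differ from the definition the paper gives. The function name `edit_distance` is hypothetical.

```python
from functools import lru_cache


def edit_distance(a: str, b: str) -> int:
    """Textbook unit-cost edit distance between sequences a and b."""

    @lru_cache(maxsize=None)
    def d(i: int, j: int) -> int:
        # d(i, j) is the distance between the prefixes a[:i] and b[:j].
        if i == 0:
            return j                      # insert the j remaining items of b
        if j == 0:
            return i                      # delete the i remaining items of a
        substitute = d(i - 1, j - 1) + (a[i - 1] != b[j - 1])
        delete = d(i - 1, j) + 1
        insert = d(i, j - 1) + 1
        return min(substitute, delete, insert)

    return d(len(a), len(b))


distance = edit_distance("kitten", "sitting")   # 3
```

Note that each step recurses on prefixes of both sequences at once and the subproblems overlap, which is why memoization is used here.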

34 | Denotational Semantics
- Schmidt
- 1986
Citation Context: ... $\diamond$ be the set of all finite or countably infinite collections over the finite or countably infinite set $D$. If we define $\langle D \rangle = \{\, \langle x \rangle \mid x \in D \,\}$ then $D^\diamond$ satisfies the recursive domain specification [27] $D^\diamond = \langle D \rangle + D^\diamond \times D^\diamond$. Also, we need to introduce $\bot$, the undetermined collection. Extending the definition above for the lifting $D_\bot = D \cup \{\bot\}$, let $\langle D \rangle_\bot = \langle D \rangle \cup \{\bot\}$, and $D^\diamond_\bot$ be the se...

18 | Parallel Evaluation of the Transitive Closure of a Database Relation
- Valduriez, Khoshafian
- 1988

18 | A New Paradigm for Parallel and Distributed Rule-Processing
- Wolfson, Ozeri
- 1990
Citation Context: ...ata partitioning (fan-out parallelism) will be done and how distributed results will be collected (fan-in parallelism). This view is supported by recent results on data reduction for Datalog programs [21], in which rules are replaced by their per-processor specializations. These specialized rules include appropriate hash functions that capture partitioning information. This approach is very interestin...

17 | The Semantics of a Simple Language for Parallel
- Kahn
- 1974
Citation Context: ...nds of structure important in data processing, including sort ordering and physical data partitioning. It is also possible to generalize the classic work of Kahn for continuous functions on sequences [11] to work for continuous functions on collections. The basic idea is that prefix-continuous functions on sequences are exactly those functions that yield pipeline parallelism. Stream-continuity gives...
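
The link between prefix-continuity and pipelining mentioned in the excerpt above can be illustrated informally with Python generators. In this sketch (an illustration only, not the paper's formalism) each stage emits output for every prefix of its input as soon as that prefix is known, so a downstream stage can start before the upstream stream ends, even if it never ends. The stage names `running_sum` and `evens` are hypothetical.

```python
from itertools import count, islice
from typing import Iterable, Iterator


def running_sum(xs: Iterable[int]) -> Iterator[int]:
    """Scan stage: the output for a longer input prefix extends the output for a shorter one."""
    total = 0
    for x in xs:
        total += x
        yield total


def evens(xs: Iterable[int]) -> Iterator[int]:
    """Filter stage: also prefix-monotone, since it only ever appends output."""
    for x in xs:
        if x % 2 == 0:
            yield x


# The source is the infinite stream 1, 2, 3, ...; the pipeline still
# produces results because no stage waits for its input to finish.
pipeline = evens(running_sum(count(1)))
first_five = list(islice(pipeline, 5))   # [6, 10, 28, 36, 66]
```

A function such as "reverse the stream" could not be written this way, because no output can be committed until the whole input has been seen.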

16 | The Semantics of a Simple Language for
- Kahn
- 1974

13 | A FAD for data intensive applications
- Danforth, Valduriez
- 1992
Citation Context: ...en developed before that permit expression of both ordering among tuples and data partitioning. For example, the FAD language has operators that express various forms of fan-out and fan-in parallelism [6]. FAD is a strongly-typed set-oriented database language based on functional programming and relational algebra. It provides a fixed set of higher-order functions to aggregate functions, like the pump ...

13 | The Nested Rectangular Array as a Model of Data
- More
- 1979
Citation Context: ...ctional operators over lists appears in [28]. Third, we can modify SVP to include some features for arrays, and some kinds of nested arrays. More has developed an extensive theory of nested arrays in [20, 21], and has shown how it generalizes and improves the (non-nested) model of arrays used in APL. Others have gotten very interesting results with recursive arrays. The classic paper [36] explores the res...

13 | Matrix algebra and applicative programming
- Wise
- 1987
Citation Context: ...ed arrays in [20, 21], and has shown how it generalizes and improves the (non-nested) model of arrays used in APL. Others have gotten very interesting results with recursive arrays. The classic paper [36] explores the results of storing square arrays as recursive quadtrees (which Wise calls $2^d$-ary trees, or quaternary trees), and developing matrix operators as divide-and-conquer operations over thes...

12 | Axioms and theorems for a theory of arrays
- More
- 1973
Citation Context: ...ctional operators over lists appears in [28]. Third, we can modify SVP to include some features for arrays, and some kinds of nested arrays. More has developed an extensive theory of nested arrays in [20, 21], and has shown how it generalizes and improves the (non-nested) model of arrays used in APL. Others have gotten very interesting results with recursive arrays. The classic paper [36] explores the res...

10 | Stream data analysis in Prolog
- Parker
- 1990
Citation Context: ...among data (e.g., sorted relations, or ordered tuples). Relational languages are therefore inadequate for specifying 'stream processing', in which ordered sequences of data are processed sequentially [13]. Pipeline parallelism is generally used, transparently to the user, in lower-level languages implementing relational algebra (e.g., PLERA [2], or PFAD [8]). However, higher-level relational interface...

10 | Automata Networks in Computer Science: Theory and Applications
- Soulié, Robert, et al.
- 1987
Citation Context: ...ve functional equations. They generalize other specification techniques, including restricted higher-order mappings like the reduction operator in APL [16] and the pump operator in FAD [10], automata [29, 30], and series-parallel computation graphs [24]. • Parallelism in the dividing and conquering is specified using both the structure of the data, and the structure of the divide-and-conquer mapping: di...

9 | Compiling Control into Database Queries for Parallel Execution Management
- Borla-Salamet, Chachaty, et al.
- 1991
Citation Context: ...ich ordered sequences of data are processed sequentially [13]. Pipeline parallelism is generally used, transparently to the user, in lower-level languages implementing relational algebra (e.g., PLERA [2], or PFAD [8]). However, higher-level relational interfaces do not permit streams to be exploited, preventing specification of stream computations and also pipeline parallelism. A second problem is th...

6 | Parallelizing FAD: A Database Programming Language
- Hart, Danforth, et al.
- 1988
Citation Context: ...equences of data are processed sequentially [13]. Pipeline parallelism is generally used, transparently to the user, in lower-level languages implementing relational algebra (e.g., PLERA [2], or PFAD [8]). However, higher-level relational interfaces do not permit streams to be exploited, preventing specification of stream computations and also pipeline parallelism. A second problem is that parallel d...

5 | High Speed Implementations of Rule-Based Systems
- Gupta, Forgy, et al.
- 1989
Citation Context: ...sulting speed-up is quite limited. For instance, experiments conducted with the OPS5 rule-based language revealed that in practice, the true speed-up achievable from parallelism was less than tenfold [7]. A related serious problem with this approach is that, in the final analysis, the serial programming paradigm does not encourage the use of parallel algorithms. The second approach enables the progra...

3 | From modal logic to deductive databases
- Thayse, ed
- 1989
Citation Context: ...to an accepting state, but also accept an infinite word if it drives them into an accepting state an infinite number of times. An excellent review of the work in this area is provided in Chapter 4 of [31]. It would be interesting to relate these results for infinite streams to what we have developed with SVP. In particular, we can perhaps derive a Temporal Relational Calculus, a query logic for stream...

2 | Paragon: a Parallel Programming Environment for Scientific Applications Using Communications Structures
- Chase, et al.
- 1991
Citation Context: ...ing where the primary sources for parallelism are parallel set and stream expressions. Parallel programming environments that follow this approach have recently been proposed. For example, in Paragon [5], the primary source for parallelism is parallel array expressions. Paragon is targeted to scientific programming applications and offers the essential features of parallel Fortran languages. Our mode...

2 | Graphs and Order: the role of graphs in the theory of ordered sets and its applications
- Rival, ed
- 1985
Citation Context: ...r format for specifying mappings which is implicitly also a format for specifying parallelism. • Divide-and-conquer computations can be represented as series-parallel graphs. Series-parallel graphs [15] are defined recursively as graphs having one input and one output that can be constructed using two combination rules: series or parallel composition of the inputs and outputs. A typical series-paral...

2 | Architecture-independent parallel computation
- Skillicorn
- 1990
Citation Context: ...et machine. The third approach can combine the advantages of the other two. It can ease the task of programming while allowing the programmer to express non-sequential computation in a high-level way [16]. Once the programmer has specified the algorithmic aspects of his program using high-level programming constructs, automatic or semi-automatic methods can be used to derive a mapping from the computa...

1 | A New Paradigm for Parallel and
- Wolfson, Ozeri
- 1990
Citation Context: ...ata partitioning (fan-out parallelism) will be done and how distributed results will be collected (fan-in parallelism). This view is supported by recent results on data reduction for Datalog programs [21], in which rules are replaced by their per-processor specializations. These specialized rules include appropriate hash functions that capture partitioning information. This approach is very interestin...

1 | The Handbook of Fixed Income Securities (3rd
- Fabozzi, Pollack (eds.)
- 1991
Citation Context: ...ios. Investment strategies for bonds have become very complex, and decisions often require a great deal of data analysis. A very readable and comprehensive introduction to the subject can be found in [11]. 5.1 Bond Attributes. As an example of what one finds when considering various investments in the United States, the information for a bond abstracted from Moody's Public Utility Manual could look as ...

1 | Using a Relational System on Wall Street
- Rozen, Shasha
- 1989
Citation Context: ...n state$(y, Q_1) \diamond$ union state$(y, Q_2)$ in proper SVP form. 5 Case Study: Bond Investment Analysis. In this section, we describe a realistic application that shows potential for a model like SVP. In [25], Rozen and Shasha describe BondDB, a decision support system developed to support investment banks in the buying and selling of bonds. The system was built using a relational DBMS (Oracle), storing b...

1 | Paragon: a Parallel Programming Environment for Scientific Applications Using Communications Structures
- Chase, et al.
- 1983