SEQ: A Model for Sequence Databases
 University of Wisconsin-Madison
, 1995
Cited by 62 (5 self)
This paper presents the model which is the basis for a system to manage various kinds of sequence data. The model separates the data from the ordering information, and includes operators based on two distinct abstractions of a sequence. The main contributions of the model are: (a) it can deal with different types of sequence data, (b) it supports an expressive range of sequence queries, and (c) it draws from many of the diverse existing approaches to modeling sequence data.
Sequences, Datalog and Transducers
, 1996
Cited by 24 (5 self)
This paper develops a query language for sequence databases, such as genome databases and text databases. The language, called Sequence Datalog, extends classical Datalog with interpreted function symbols for manipulating sequences. It has both a clear operational and declarative semantics, based on a new notion called the extended active domain of a database. The extended domain contains all the sequences in the database and all their subsequences. This idea leads to a clear distinction between safe and unsafe recursion over sequences: safe recursion stays inside the extended active domain, while unsafe recursion does not. By carefully limiting the amount of unsafe recursion, the paper develops a safe and expressive subset of Sequence Datalog. As part of the development, a new type of transducer is introduced, called a generalized sequence transducer. Unsafe recursion is allowed only within these generalized transducers. Generalized transducers extend ordinary transducers by allowing them to invoke other transducers as "subroutines." Generalized transducers can be implemented in Sequence Datalog in a straightforward way. Moreover, their introduction into the language leads to simple conditions that guarantee safety and finiteness. This paper develops two such conditions. The first condition expresses exactly the class of PTIME sequence functions, and the second expresses exactly the class of elementary sequence functions.
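The extended active domain described above can be illustrated with a minimal sketch. It assumes that sequences are Python strings, that "subsequence" means contiguous subsequence, and that the empty sequence belongs to the domain; the function names are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set


def extended_active_domain(database: Iterable[str]) -> Set[str]:
    """Collect every database sequence plus all of its contiguous subsequences."""
    domain = {""}  # assumption: the empty sequence is always included
    for seq in database:
        n = len(seq)
        for start in range(n):
            for end in range(start + 1, n + 1):
                domain.add(seq[start:end])
    return domain


def stays_in_domain(result: str, domain: Set[str]) -> bool:
    """A derived sequence is 'safe' in this sketch if it never leaves the domain."""
    return result in domain


domain = extended_active_domain(["acg"])
# Extraction only yields members of the domain.
extracted = "acg"[1:3]
# Construction (here, concatenation) can leave the domain.
constructed = "acg" + "acg"
```

In this sketch, `extracted` stays inside the domain while `constructed` does not, which mirrors the distinction between safe and unsafe recursion drawn in the abstract.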
A Query Language for NC
 In Proceedings of 13th ACM Symposium on Principles of Database Systems
, 1994
Cited by 16 (9 self)
We show that a form of divide and conquer recursion on sets together with the relational algebra expresses exactly the queries over ordered relational databases which are NC computable. At a finer level, we relate k nested uses of recursion exactly to $AC^k$, $k \geq 1$. We also give corresponding results for complex objects. 1 Introduction NC is the complexity class of functions that are computable in polylogarithmic time with polynomially many processors on a parallel random access machine (PRAM). The query language for NC discussed here is centered around a form of divide and conquer recursion (dcr) on finite sets which has obvious potential for parallel evaluation and can easily express, for example, transitive closure and parity. Divide and conquer with parameters $e$, $f$, $u$ defines the unique function $\varphi$, notation $dcr(e, f, u)$, taking finite sets as arguments, such that: $\varphi(\emptyset) \stackrel{\text{def}}{=} e$, $\varphi(\{y\}) \stackrel{\text{def}}{=} f(y)$, and $\varphi(s_1 \cup s_2) \stackrel{\text{def}}{=} u(\varphi(s_1), \varphi(s_2))$ when $s_1 \cap s_2 = \emptyset$. For parity, we t...
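The three defining equations of dcr can be sketched as a small recursive function. This is an illustration only: the midpoint split is an arbitrary choice of disjoint halves, and the parameters chosen for parity and cardinality below are assumptions made for this example, since the abstract is truncated before it gives its own. The result is independent of the split only when `u` is associative and commutative with `e` as its identity.

```python
from typing import Any, Callable, Iterable


def dcr(e: Any, f: Callable[[Any], Any], u: Callable[[Any, Any], Any]):
    """Build phi with phi({}) = e, phi({y}) = f(y), phi(s1 U s2) = u(phi(s1), phi(s2))."""

    def phi(s: Iterable[Any]) -> Any:
        items = list(s)
        if not items:
            return e
        if len(items) == 1:
            return f(items[0])
        mid = len(items) // 2
        # The two halves are disjoint because the input is a set.
        # Each half could be evaluated in parallel.
        return u(phi(frozenset(items[:mid])), phi(frozenset(items[mid:])))

    return phi


# Hypothetical parameter choices, not taken from the paper:
parity = dcr(False, lambda y: True, lambda a, b: a != b)  # True iff |s| is odd
cardinality = dcr(0, lambda y: 1, lambda a, b: a + b)
```

Because each recursive call works on a disjoint half of the input, the recursion depth is logarithmic in the size of the set, which is what makes this form of recursion attractive for parallel evaluation.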
A Query Language for List-Based Complex Objects
 In Thirteenth ACM SIGMOD Intern. Symposium on Principles of Database Systems (PODS'94)
, 1994
Cited by 15 (5 self)
We present a language for querying list-based complex objects. The language is shown to express precisely the polynomial-time generic list-object functions. The iteration mechanism of the language is based on a new approach wherein, in addition to the list over which the iteration is performed, a second list is used to control the number of iteration steps. During the iteration, the intermediate results can be moved to the output list as well as reinserted into the list being iterated over. A simple syntactic constraint allows the growth rate of the intermediate results to be tightly controlled which, in turn, restricts the expressiveness of the language to PTIME. Author affiliations: Data Parallel Systems Inc., 4617 Morningside Dr., Bloomington, IN, 47408; email: colby@dpsi.com. University of Regina, Dept. of Comp. Science, Regina, Saskatchewan S4S 0A2, Canada; email: saxton@cs.uregina.ca. Indiana University, Comp. Science Dept., Bloomington, IN 47405-4101; email: vgucht@cs.indiana.edu. 1 Intro...
A Uniform Calculus for Collection Types
 OREGON GRADUATE INSTITUTE OF SCIENCE & TECHNOLOGY
, 1994
Cited by 9 (0 self)
We present a new algebra for collection types based on monoids and monoid homomorphisms. The types supported in this algebra can be any nested composition of collection types, including lists, sets, multisets (bags), vectors, and matrices. We also define a new calculus for this algebra, called monoid comprehensions, that captures operations involving multiple collection types in declarative form. This algebra can easily capture the semantics of many object-oriented database query languages that support mixed collection types, such as the OQL language of the ODMG-93 standard. In addition, it is ideal for expressing data parallelism and nested parallelism and can be effectively translated onto many parallel architectures. We present a normalization algorithm that reduces any expression in our algebra to a canonical form which, when evaluated, generates very few intermediate data structures. These canonical forms are amenable to a higher degree of parallelism than the original...
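The idea of treating collections as monoids and queries as monoid homomorphisms can be sketched roughly as follows. The `Monoid` class, the instances `LIST`, `SET`, and `SUM`, and the `hom` function are hypothetical names chosen for this illustration; they are not the paper's notation, and the sketch does not check whether a source collection and a target monoid are compatible (for example, folding a set into a non-idempotent monoid).

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Monoid:
    zero: Any                          # identity element
    unit: Callable[[Any], Any]         # wraps one element
    merge: Callable[[Any, Any], Any]   # associative combination


# Illustrative instances (assumed for this sketch).
LIST = Monoid([], lambda x: [x], lambda a, b: a + b)
SET = Monoid(frozenset(), lambda x: frozenset([x]), lambda a, b: a | b)
SUM = Monoid(0, lambda x: x, lambda a, b: a + b)


def hom(target: Monoid, f: Callable[[Any], Any], collection: Iterable[Any]) -> Any:
    """Map each element into the target monoid with f, then merge the results."""
    result = target.zero
    for x in collection:
        result = target.merge(result, f(x))
    return result


sum_of_squares = hom(SUM, lambda x: x * x, [1, 2, 3])
remainders = hom(SET, lambda x: SET.unit(x % 2), [1, 2, 3])
doubled = hom(LIST, lambda x: [x, x], [1, 2])
```

The same `hom` function expresses an aggregation, a list-to-set conversion, and a flat-map, which is the kind of uniformity across collection types that the abstract describes.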
CoPa: a Parallel Programming Language for Collections
 University of Pennsylvania, Institute for
, 1998
Cited by 8 (1 self)
In this paper we propose a new framework for parallel processing of collections. We define a high-level language called CoPa for processing nested sets, bags, and sequences (a generalization of arrays and lists). CoPa includes most features found in query languages for object-oriented or object-relational databases, and has, in addition, a powerful form of recursion not found in query languages. CoPa has a formal declarative definition of parallel complexity, as part of its specification. We prove the existence of a complexity-preserving compilation for CoPa, i.e. one which offers upper-bound guarantees for the parallel complexity of the compiled code. The majority of the compilation process is architecture-independent, using a parallel vector machine model (BVRAM). The BVRAM instructions form a sequence algebra which is of independent interest, and have been carefully chosen to reconcile two conflicting demands: supporting the complexity-preserving compilation of CoPa's high-level con...
Finite Query Languages for Sequence Databases
, 1995
Cited by 4 (1 self)
This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets, since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language. It is a subset of a recently developed logic called Sequence Datalog [22]. Sequence Datalog distinguishes syntactically between subsequence extraction and sequence construction. Extraction creates sequences of bounded length, and leads to safe recursion; while construction can create sequences of arbitrary length, and leads to unsafe recursion. In this paper, we develop syntactic restrictions for Sequence Datalog that allow sequence construction but preserve finiteness. The main idea is to use safe recursion to control and limit unsafe recursion. The main results are a finite language, called We...
Management Of Sequence Data
, 1996
Cited by 3 (1 self)
One of the challenges facing today's database systems is the need to support complex data types, which are of growing importance in new application areas. The thesis addresses this problem, with a specific focus on supporting sequence data. A large part of the thesis deals with the details of sequences. Issues covered include the model for sequence data, an algebra of operators to query the data, a query language to express the queries, optimization techniques and query processing algorithms. Performance results are presented from an implementation of these ideas, demonstrating the effects of the various optimizations. This detailed exploration of sequence data is one contribution of the thesis. The second contribution is a solution to the problem of integrating different data types, including sequences and relations, in a general-purpose database system. The thesis discusses the drawbacks of existing solutions, and then proposes a solution based on a novel E-ADT paradigm. This parad...
An Object Based Algebra for Parallel Query Processing and Optimization
, 1992
Cited by 3 (2 self)
The Tarski algebra provides an algebraic foundation for object-based query languages. This is demonstrated by showing how queries expressed in a graph-oriented query language (based on the functional data model) can be translated into the Tarski algebra. The graphical representation of queries in combination with the Tarski algebra is a convenient mechanism to study optimization in the context of object-based query languages. We then propose extensions to the Tarski algebra that facilitate parallel query processing and address the issue of parallel query optimization in this algebraic framework. We also show how our framework helps in the study of nonmonotonic query optimization. 1 Introduction Over the last decade, a variety of new database models [10] have been introduced to deal with data applications involving objects with a complex external and/or internal structure. These database models can be classified into three main categories: the complex object models, the function-bas...
Query Processing for Streaming Sensor Data
Cited by 3 (0 self)
This document summarizes my progress to date in building TeleTiny, with a particular focus on these two components and the interfaces between them, and discusses how my Ph.D. will yield a complete architecture for energy-efficient query processing over streaming sensor data. The remainder of this section summarizes the challenges of sensor-query processing, sketches the TeleTiny architecture, and shows how it meets those challenges.