Results 1 - 10 of 231
The design and implementation of hierarchical software systems with reusable components
- ACM Transactions on Software Engineering and Methodology, 1992
"... We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and largescale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software c ..."
Abstract
-
Cited by 412 (72 self)
- Add to MetaCart
(Show Context)
We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and large-scale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software component/building-block technologies and domain modeling. Building-block technologies exploit large-scale reuse, rely on open architecture software, and elevate the granularity of programming to the subsystem level. Domain modeling formalizes the similarities and differences among systems of a domain. We believe our model is a blueprint for achieving software component technologies in many domains.
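The building-block idea lends itself to a brief illustration. The Python sketch below is not Genesis or Avoca code; it only shows, with hypothetical Layer/Compress/Encrypt/Transport classes, how components that share one interface can be stacked and swapped at the subsystem level.

# A minimal sketch (not Genesis/Avoca code) of the building-block idea:
# components share one interface so any stack of them can be composed.

class Layer:
    """Hypothetical common interface for interchangeable components."""
    def __init__(self, lower=None):
        self.lower = lower          # the component this one is stacked on

    def handle(self, request):
        # default: delegate to the layer below
        return self.lower.handle(request) if self.lower else request

class Compress(Layer):
    def handle(self, request):
        return self.lower.handle(f"compressed({request})")

class Encrypt(Layer):
    def handle(self, request):
        return self.lower.handle(f"encrypted({request})")

class Transport(Layer):
    def handle(self, request):
        return f"sent[{request}]"

# Subsystem-level "programming": pick and order building blocks freely.
stack = Compress(Encrypt(Transport()))
print(stack.handle("payload"))   # sent[encrypted(compressed(payload))]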
Retrieving and Integrating Data from Multiple Information Sources
1993
"... With the current explosion of data, retrieving and integrating information from various sources is a critical problem. Work in multidatabase systems has begun to address this problem, but it has primarily focused on methods for communicating between databases and requires significant effort for e ..."
Abstract
-
Cited by 332 (31 self)
- Add to MetaCart
(Show Context)
With the current explosion of data, retrieving and integrating information from various sources is a critical problem. Work in multidatabase systems has begun to address this problem, but it has primarily focused on methods for communicating between databases and requires significant effort for each new database added to the system. This paper describes a more general approach that exploits a semantic model of a problem domain to integrate the information from various information sources. The information sources handled include both databases and knowledge bases, and other information sources (e.g., programs) could potentially be incorporated into the system. This paper describes how both the domain and the information sources are modeled, shows how a query at the domain level is mapped into a set of queries to individual information sources, and presents algorithms for automatically improving the efficiency of queries using knowledge about both the domain and the informat...
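As a rough illustration of the mediator idea described in this abstract, the Python fragment below (hypothetical Source class, attribute names, and data, not the paper's system) maps one domain-level query onto whichever sources can contribute and collects their partial answers. A real mediator would also join the partial answers on shared attributes.

# Hedged sketch: rewrite a domain-level query into one query per source.
class Source:
    def __init__(self, name, attributes, data):
        self.name, self.attributes, self.data = name, attributes, data

    def query(self, wanted, predicate):
        # return only the attributes this source actually stores
        local = [a for a in wanted if a in self.attributes]
        return [{a: row[a] for a in local} for row in self.data if predicate(row)]

def answer_domain_query(wanted, predicate, sources):
    """Send the domain-level query to every relevant source and collect answers."""
    results = []
    for src in sources:
        if any(a in src.attributes for a in wanted):
            results.extend(src.query(wanted, predicate))
    return results

ships_db = Source("ships_db", {"ship", "tonnage"},
                  [{"ship": "Ariel", "tonnage": 12000}])
ports_kb = Source("ports_kb", {"ship", "port"},
                  [{"ship": "Ariel", "port": "Oslo"}])

print(answer_domain_query(["ship", "tonnage", "port"],
                          lambda r: r.get("ship") == "Ariel",
                          [ships_db, ports_kb]))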
The Gamma database machine project
- IEEE Transactions on Knowledge and Data Engineering, 1990
"... This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the arc ..."
Abstract
-
Cited by 272 (29 self)
- Add to MetaCart
This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the architecture to be scaled to hundreds of processors. First, all relations are horizontally partitioned across multiple disk drives, enabling relations to be scanned in parallel. Second, novel parallel algorithms based on hashing are used to implement the complex relational operators such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques it is possible to control the execution of very complex queries with minimal coordination, a necessity for configurations involving a very large number of processors. In addition to describing the design of the Gamma software, a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is also presented. In addition to measuring the effect of relation size and indices on the response time for selection, join, aggregation, and update queries, we also analyze the performance of Gamma relative to the number of processors employed when the sizes of the input relations are kept constant (speedup) and when the sizes of the input relations are increased proportionally to the number of processors (scaleup). The speedup results obtained for both selection and join queries are linear; thus, doubling the number of processors ...
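The first of Gamma's three ideas, horizontal partitioning, is easy to sketch. The Python below is a toy under stated assumptions, not Gamma code: each tuple's partitioning attribute is hashed to one of N nodes, so scans and selections can proceed on all slices in parallel.

# Minimal sketch of horizontal (hash) partitioning across nodes.
N_NODES = 4

def partition(relation, key, n_nodes=N_NODES):
    """Assign each tuple to a node by hashing its partitioning attribute."""
    nodes = [[] for _ in range(n_nodes)]
    for tup in relation:
        nodes[hash(tup[key]) % n_nodes].append(tup)
    return nodes

employees = [{"id": i, "dept": i % 7} for i in range(20)]
slices = partition(employees, "id")

# each node scans only its slice; a selection runs on all slices in parallel
matches = [t for slice_ in slices for t in slice_ if t["dept"] == 3]
print(len(matches))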
A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems
- ACM Transactions on Information Systems, 1994
"... We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression ..."
Abstract
-
Cited by 212 (34 self)
- Add to MetaCart
We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression always conform to the underlying probabilistic model. We also show for which expressions extensional semantics yields the same results. Furthermore, we discuss complexity issues and indicate possibilities for optimization. With regard to databases, the approach allows for representing imprecise attribute values, whereas for information retrieval, probabilistic document indexing and probabilistic search term weighting can be modelled. As an important extension, we introduce the concept of vague predicates which yields a probabilistic weight instead of a Boolean value, thus allowing for queries with vague selection conditions. So PRA implements uncertainty and vagueness in combination with the...
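A small, hedged illustration of the weighted-tuple idea (not the paper's formal PRA, whose intensional semantics tracks event expressions): under an independence assumption, an extensional join simply multiplies the membership probabilities of the tuples it combines. The paper's point is precisely that this extensional shortcut agrees with the probabilistic model only for certain expressions.

# Toy extensional join over weighted relations; relation = list of (weight, tuple).
def prob_join(r, s, attr):
    out = []
    for w1, t1 in r:
        for w2, t2 in s:
            if t1[attr] == t2[attr]:
                out.append((w1 * w2, {**t1, **t2}))   # independence assumed
    return out

docs  = [(0.8, {"doc": "d1", "term": "ir"}),
         (0.5, {"doc": "d2", "term": "ir"})]
terms = [(0.9, {"term": "ir", "topic": "retrieval"})]

for w, t in prob_join(docs, terms, "term"):
    print(round(w, 2), t)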
Dataflow query execution in a parallel main-memory environment
- In Proc. of the International Conference on Parallel and Distributed Information Systems (PDIS), 1991
"... Abstract. In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step into the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among othe ..."
Abstract
-
Cited by 210 (5 self)
- Add to MetaCart
(Show Context)
In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study are a step in the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among other findings, synchronization issues are identified as limiting the performance gain from parallelism. A new hash-join algorithm is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the introduced Pipelining hash-join algorithm yields better performance for multi-join queries. The format of the optimal join-tree appears to depend on the size of the operands of the join: a multi-join between small operands performs best with a bushy schedule; larger operands are better off with a linear schedule. The results from the simulation study are confirmed with an analytic model for dataflow query execution. Keywords: parallel query processing, multi-join queries, simulation, analytical modeling
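The pipelining idea can be suggested with a short sketch. This is not the paper's algorithm, just a generic hash join written as a Python generator so that the output of one join streams directly into the next join's probe input, with no intermediate materialization or extra synchronization point.

def hash_join_stream(build, probe, attr):
    """Build a hash table on one operand, then stream the other through it."""
    table = {}
    for t in build:                         # build phase
        table.setdefault(t[attr], []).append(t)
    for t in probe:                         # probe phase, yields a stream
        for match in table.get(t[attr], []):
            yield {**match, **t}

r1 = [{"a": i, "x": i * 10} for i in range(5)]
r2 = [{"a": i, "y": i + 100} for i in range(5)]
r3 = [{"y": 102, "z": "hit"}]

# two joins pipelined: the first join's output feeds the second as its probe input
result = list(hash_join_stream(r3, hash_join_stream(r1, r2, "a"), "y"))
print(result)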
XIRQL: A Query Language for Information Retrieval in XML Documents
2001
"... Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, which are weighting and ranking, relevance-oriented search, datatypes with vague predicates, and semantic relativism. XIRQL integrates these featur ..."
Abstract
-
Cited by 190 (7 self)
- Add to MetaCart
Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, namely weighting and ranking, relevance-oriented search, datatypes with vague predicates, and semantic relativism. XIRQL integrates these features by using ideas from logic-based probabilistic IR models, in combination with concepts from the database area. For processing XIRQL queries, a path algebra is presented that also serves as a starting point for query optimization.
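The fragment below is not XIRQL syntax (which the paper defines); it is only a generic Python illustration, with assumed indexing weights, of the kind of weighting and ranking of XML elements that the abstract says other proposals lack.

import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<article><sec><p>xml retrieval</p></sec><sec><p>databases</p></sec></article>"
)

TERM_WEIGHTS = {"retrieval": 0.8, "databases": 0.5}   # assumed indexing weights

def score(elem, query_terms):
    """Combine term weights below an element with a noisy-OR (substring match
    stands in for real tokenization in this sketch)."""
    text = "".join(elem.itertext())
    p_miss = 1.0
    for term in query_terms:
        if term in text:
            p_miss *= 1.0 - TERM_WEIGHTS.get(term, 0.0)
    return 1.0 - p_miss

# rank whole elements, not just match them
ranked = sorted(doc.iter("sec"), key=lambda e: -score(e, {"retrieval"}))
print([ET.tostring(e, encoding="unicode") for e in ranked])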
A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment
- Proceedings of the SIGMOD Conference, 1989
"... ABSTRACT -In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a loop ing algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the alg ..."
Abstract
-
Cited by 183 (15 self)
- Add to MetaCart
In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.
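The bit vector filters mentioned above can be sketched compactly. The Python below is a hedged approximation, not the paper's implementation: the inner relation's join keys set bits in a small vector, and outer tuples whose key hashes to an unset bit are dropped before being shipped, at the cost of occasional false positives.

NUM_BITS = 1024

def build_filter(inner, attr, num_bits=NUM_BITS):
    bits = bytearray(num_bits)
    for t in inner:
        bits[hash(t[attr]) % num_bits] = 1
    return bits

def maybe_matches(bits, value, num_bits=NUM_BITS):
    return bits[hash(value) % num_bits] == 1      # false positives possible

inner = [{"k": i} for i in range(0, 100, 10)]
outer = [{"k": i, "v": i * i} for i in range(100)]

bits = build_filter(inner, "k")
shipped = [t for t in outer if maybe_matches(bits, t["k"])]
print(len(outer), "outer tuples,", len(shipped), "survive the filter")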
The EXODUS optimizer generator
- In Proceedings of the 1987 ACM International Conference on Management of Data (SIGMOD '87), 1987
"... This paper presents the design and an mmal perfor-mance evaluation of the query ophrmzer generator designed for the EXODUS extensible database system. Algetic transformaaon rules are translated mto an executable query optmuzer. which transforms query trees and selects methods for executmg operattons ..."
Abstract
-
Cited by 177 (7 self)
- Add to MetaCart
(Show Context)
This paper presents the design and an initial performance evaluation of the query optimizer generator designed for the EXODUS extensible database system. Algebraic transformation rules are translated into an executable query optimizer, which transforms query trees and selects methods for executing operations according to cost functions associated with the methods. The search strategy avoids exhaustive search and it modifies itself to take advantage of past experience. Computational results show that an optimizer generated for a relational system produces access plans almost as good as those produced by exhaustive search, with the search time cut to a small fraction.
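A toy version of the generator's basic scheme, with a made-up rule and cost model rather than anything from EXODUS: rewrite rules produce alternative query trees, and a cost function picks the cheapest one. For brevity the rule is applied only at the root of the tree.

# a plan is a nested tuple: ("join", left, right) or ("scan", table, cardinality)
def commute_join(plan):
    if isinstance(plan, tuple) and plan[0] == "join":
        return ("join", plan[2], plan[1])
    return None

RULES = [commute_join]

def alternatives(plan):
    yield plan
    for rule in RULES:
        alt = rule(plan)
        if alt is not None:
            yield alt

def cost(plan):
    if plan[0] == "scan":
        return plan[2]
    # toy cost model: the left (build) input should be the smaller one
    return cost(plan[1]) * 2 + cost(plan[2])

query = ("join", ("scan", "orders", 1000), ("scan", "customers", 50))
best = min(alternatives(query), key=cost)
print(best, cost(best))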
Query Optimization
1996
"... Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the typ ..."
Abstract
-
Cited by 147 (5 self)
- Add to MetaCart
Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the type of problem that query optimizers are called to solve. Given a query, there are many plans that a database management system (DBMS) can follow to process it and produce its answer. All plans are equivalent in terms of their final output but vary in their cost, i.e., the amount of time that they need to run. What is the plan that needs the least amount of time? Such query optimization is absolutely necessary in a DBMS. The cost difference between two alternatives can be enormous. For example, consider the following database schema, which will be...
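The claim that the cost difference between plans can be enormous is easy to make concrete with a toy example. The cardinalities and selectivities below are invented, not taken from the chapter; the point is only that logically equivalent join orders produce intermediate results of very different sizes, which is what an optimizer's cost model tries to predict and minimize.

from itertools import permutations

CARD = {"emp": 10_000, "dept": 100, "proj": 1_000}
# assumed selectivity of each pairwise join predicate
SEL = {frozenset(("emp", "dept")): 0.01,
       frozenset(("emp", "proj")): 0.001,
       frozenset(("dept", "proj")): 0.05}

def plan_cost(order):
    """Sum of estimated intermediate-result sizes for a left-deep join order."""
    joined, size, total = {order[0]}, CARD[order[0]], 0
    for rel in order[1:]:
        sel = min(SEL[frozenset((rel, j))] for j in joined
                  if frozenset((rel, j)) in SEL)
        size = size * CARD[rel] * sel
        joined.add(rel)
        total += size
    return total

for order in permutations(CARD):
    print(order, int(plan_cost(order)))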
Starburst Mid-Flight: As the Dust Clears
- IEEE Transactions on Knowledge and Data Engineering, 1990
"... ter, is improving the design of relational database management sys-tems and enhancing their performance, while building an extensible system to better support nontraditional applications (such as engineer-ing, geographic, office, etc.), and to serve as a testbed for future im-provements in database ..."
Abstract
-
Cited by 120 (4 self)
- Add to MetaCart
... ter, is improving the design of relational database management systems and enhancing their performance, while building an extensible system to better support nontraditional applications (such as engineering, geographic, office, etc.), and to serve as a testbed for future improvements in database technology. As of November 1989, we have an initial prototype of our system up and running. In this paper, we reflect on the design and implementation of the Starburst system to date. We examine some key design decisions, and how they affect the goal of improved structure and performance. We also examine how well we have met our goal of extensibility: what aspects of the system are extensible, how extensions can be done, and how easy it is to add extensions. We discuss some actual extensions to the system, including the experiences of our first real customizers. Index Terms: Access methods, data structures, extensibility, plan optimization, query processing, relational database system, rule systems.