Results 1 - 10
of
141
The design and implementation of hierarchical software systems with reusable components
- ACM Transactions on Software Engineering and Methodology
, 1992
"... We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and largescale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software c ..."
Abstract
-
Cited by 347 (71 self)
- Add to MetaCart
We present a domain-independent model of hierarchical software system design and construction that is based on interchangeable software components and largescale reuse. The model unifies the conceptualizations of two independent projects, Genesis and Avoca, that are successful examples of software component/building-block technologies and domain modeling. Building-block technologies exploit large-scale reuse, rely on open architecture software, and elevate the granularity of programming to the subsystem level. Domain modeling formalizes the similarities and differences among systems of a domain. We believe our model is a blue-print for achieving software component technologies in many domains.
Retrieving And Integrating Datafrom Multiple Information Sources
, 1993
"... With the current explosion of data, retrieving and integrating information from various sources is a critical problem. Work in multidatabase systems has begun to address this problem, but it has primarily focused on methods for communicating between databases and requires significant effort for e ..."
Abstract
-
Cited by 284 (24 self)
- Add to MetaCart
With the current explosion of data, retrieving and integrating information from various sources is a critical problem. Work in multidatabase systems has begun to address this problem, but it has primarily focused on methods for communicating between databases and requires significant effort for each new database added to the system. This paper describes a more general approach that exploits a semantic model of a problem domain to integrate the information from various information sources. The information sources handled include both databases and knowledge bases, and other information sources (e.g., programs) could potentially be incorporated into the system. This paper describes how both the domain and the information sources are modeled, shows how a query at the domain level is mapped into a set of queries to individual information sources, and presents algorithms for automatically improving the efficiency of queries using knowledge about both the domain and the informat...
The Gamma database machine project
- IEEE Transactions on Knowledge and Data Engineering
, 1990
"... This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the arc ..."
Abstract
-
Cited by 203 (27 self)
- Add to MetaCart
This paper describes the design of the Gamma database machine and the techniques employed in its implementation. Gamma is a relational database machine currently operating on an Intel iPSC/2 hypercube with 32 processors and 32 disk drives. Gamma employs three key technical ideas which enable the architecture to be scaled to 100s of processors. First, all relations are horizontally partitioned across multiple disk drives enabling relations to be scanned in parallel. Second, novel parallel algorithms based on hashing are used to implement the complex relational operators such as join and aggregate functions. Third, dataflow scheduling techniques are used to coordinate multioperator queries. By using these techniques it is possible to control the execution of very complex queries with minimal coordination- a necessity for configurations involving a very large number of processors. In addition to describing the design of the Gamma software, a thorough performance evaluation of the iPSC/2 hypercube version of Gamma is also presented. In addition to measuring the effect of relation size and indices on the response time for selection, join, aggregation, and update queries, we also analyze the performance of Gamma relative to the number of processors employed when the sizes of the input relations are kept constant (speedup) and when the sizes of the input relations are increased proportionally to the number of processors (scaleup). The speedup results obtained for both selection and join queries are linear; thus, doubling the number of processors
Dataflow Query Execution in a Parallel Main-Memory Environment
- Distributed and Parallel Databases
, 1991
"... In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study, are a step into the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synch ..."
Abstract
-
Cited by 159 (4 self)
- Add to MetaCart
In this paper, the performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results of this study, are a step into the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synchronization issues are identified to limit the performance gain from parallelism. A new hashjoin algorithm, called Pipelining hash-join is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the Pipelining hash-join algorithm yields a better performance for multi-join queries. Also, the format of the optimal join-tree appears to depend on the size of the operands of the join: A multi-join between small operands performs best with a bushy schedule; larger operands are better off with a linear schedule. The results from the simulatio...
The EXODUS Optimizer Generator
, 1987
"... This paper presents the design and an initial performance evaluation of the query optimizer generator designed for the EXODUS extensible database system. Algebraic transformation rules are translated into an executable query optimizer, which transforms query trees and selects methods for executing o ..."
Abstract
-
Cited by 153 (7 self)
- Add to MetaCart
This paper presents the design and an initial performance evaluation of the query optimizer generator designed for the EXODUS extensible database system. Algebraic transformation rules are translated into an executable query optimizer, which transforms query trees and selects methods for executing operations according to cost functions associated with the methods. The search strategy avoids exhaustive search and it modifies itself to take advantage of past experience. Computational results show that an optimizer generated for a relational system produces access plans almost as good as those produced by exhaustive search, with the search time cut to a small fraction.
A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems
- ACM Transactions on Information Systems
, 1994
"... We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression ..."
Abstract
-
Cited by 149 (28 self)
- Add to MetaCart
We present a probabilistic relational algebra (PRA) which is a generalization of standard relational algebra. Here tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Based on intensional semantics, the tuple weights of the result of a PRA expression always confirm to the underlying probabilistic model. We also show for which expressions extensional semantics yields the same results. Furthermore, we discuss complexity issues and indicate possibilities for optimization. With regard to databases, the approach allows for representing imprecise attribute values, whereas for information retrieval, probabilistic document indexing and probabilistic search term weighting can be modelled. As an important extension, we introduce the concept of vague predicates which yields a probabilistic weight instead of a Boolean value, thus allowing for queries with vague selection conditions. So PRA implements uncertainty and vagueness in combination with the...
A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment
, 1989
"... The join operator has been a cornerstone of relational database systems since their inception. As such, much time and effort has gone into making joins efficient. With the obvious trend towards multiprocessors, attention has focused on efficiently parallelizing the join operation. In this paper we a ..."
Abstract
-
Cited by 147 (14 self)
- Add to MetaCart
The join operator has been a cornerstone of relational database systems since their inception. As such, much time and effort has gone into making joins efficient. With the obvious trend towards multiprocessors, attention has focused on efficiently parallelizing the join operation. In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The Gamma database machine serves as the host for the performance comparison. Gamma’s shared-nothing architecture with commercially available components is becoming increasingly common, both in research and in industry. 1.
XIRQL: A Query Language for Information Retrieval in XML Documents
, 2001
"... Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, which are weighting and ranking, relevance-oriented search, datatypes with vague predicates, and semantic relativism. XIRQL integrates these featur ..."
Abstract
-
Cited by 140 (6 self)
- Add to MetaCart
Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, which are weighting and ranking, relevance-oriented search, datatypes with vague predicates, and semantic relativism. XIRQL integrates these features by using ideas from logic-based probabilistic IR models, in combination with concepts from the database area. For processing XIRQL queries, a path algebra is presented, that also serves as a starting point for query optimization.
Query Optimization
, 1996
"... Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the typ ..."
Abstract
-
Cited by 102 (2 self)
- Add to MetaCart
Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the type of problem that query optimizers are called to solve. Given a query, there are many plans that a database management system (DBMS) can follow to process it and produce its answer. All plans are equivalent in terms of their final output but vary in their cost, i.e., the amount of time that they need to run. What is the plan that needs the least amount of time? Such query optimization is absolutely necessary in a DBMS. The cost difference between two alternatives can be enormous. For example, consider the following database schema, which will be...
Planning and Reformulating Queries for Semantically-Modeled Multidatabase Systems
- In Proceedings of the First International Conference on Information and Knowledge Management
, 1992
"... With vast amounts of information available from various sources, integrating data from multiple databases is an important problem. The SIMS project attacks this problem using a variety of Artificial Intelligence techniques, including planning, knowledge representation, problem reformulation, and lea ..."
Abstract
-
Cited by 46 (3 self)
- Add to MetaCart
With vast amounts of information available from various sources, integrating data from multiple databases is an important problem. The SIMS project attacks this problem using a variety of Artificial Intelligence techniques, including planning, knowledge representation, problem reformulation, and learning. To integrate multiple databases, the user provides a semantic model of the application domain and then uses this model to describe the contents of the available databases. Given a query, the system uses a planner to decide which databases must be queried and in what order the queries should be executed. This paper focuses on the query planning problem --- the selection of appropriate data sources and ordering the accesses to them, and on the reformulation of queries --- the use of knowledge both about the domain and the databases to modify queries to make the retrieval plans for them more efficient. 1 Introduction Most tasks performed by users of complex information systems involve...

