The Design And Implementation Of Massively Parallel Knowledge Representation And Reasoning Systems: A Connectionist Approach
1996
Cited by 8 (1 self)
Efficient knowledge representation and reasoning is an important component of intelligent activity and a crucial aspect of the design of large-scale intelligent systems. This dissertation explores the design, analysis, and implementation of massively parallel knowledge representation and reasoning systems that can encode very large knowledge bases and respond to a class of queries in real time, with reasoning episodes expected to span a fraction of a second. The dissertation attempts to design efficient, large-scale knowledge base systems by: (i) exploiting massive parallelism; and (ii) constraining representational and inferential capabilities to achieve tractability, while still retaining sufficient expressive power to capture a broad class of reasoning in intelligent systems. To this end, Shruti, a connectionist reasoning system that models reflexive (i.e., effortless and spontaneous) reasoning, serves as the knowledge representation and reasoning framework. Shruti-based mas...
Scalable, parallel, scientific databases
In 10th International Conf. on Scientific and Statistical Database Management, 1998
Cited by 8 (2 self)
Large scientific applications that rely on highly parallel computational analysis require highly parallel data access. We describe an object-oriented scientific database system that achieves nearly linear scale-up over large, million-object data sets. Of primary importance are those features which seem central to the development of this, or any other, parallel database system. These include techniques of object distribution, of multi-operator parallelism, and of indirect object referencing. It also appears to require a query server architecture instead of the more common page server configurations.
Multiway Equijoin Query Acceleration Using HitLists
Ph.D. Dissertation, NDSU, 1992
Cited by 2 (2 self)
This paper presents a new data structure for multiway and general join query acceleration, the hitlist, and an algorithm for its use. The hitlist is a surrogate index providing the mapping between the values of two attributes in a relation participating in an equijoin or a selection. The results of an analytical model, simulation study, and an implementation are presented. The performance advantages of this approach are made clear, as well as the basis for these results in the attainment of full selectivity. Extensions of hitlists are also examined.
1. Introduction
This paper presents a new data structure, the hitlist, which can be used to accelerate complex queries. The hitlist acceleration method is developed and applied to multiway joins in which not all of the joining attributes are the same. An analytical model was developed which compared the response time of a multiway join algorithm using hitlists with the performance of an algorithm not using hitlists. The comparison ...
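The abstract describes the hitlist only as a surrogate index mapping between the values of two attributes of a relation. One possible reading, sketched below in Python, is a dictionary from each value of one join attribute to the set of values of the other that co-occur with it. The function names, the dictionary-per-row layout, and the three-relation chain R(a, b), S(b, c), T(c, d) are all assumptions made for illustration; this is not the data structure or algorithm from the dissertation itself.

```python
from collections import defaultdict


def build_hitlist(relation, attr_from, attr_to):
    """Map each attr_from value to the set of attr_to values that
    co-occur with it in some row (hypothetical hitlist layout)."""
    hitlist = defaultdict(set)
    for row in relation:
        hitlist[row[attr_from]].add(row[attr_to])
    return hitlist


def chain_join_pairs(r_rows, s_hitlist, t_rows):
    """Return the (a, d) pairs of R(a,b) JOIN S(b,c) JOIN T(c,d),
    consulting S only through its precomputed hitlist."""
    d_by_c = defaultdict(list)
    for row in t_rows:
        d_by_c[row["c"]].append(row["d"])

    pairs = set()
    for row in r_rows:
        # .get avoids inserting empty entries into the defaultdicts.
        for c in s_hitlist.get(row["b"], ()):
            for d in d_by_c.get(c, ()):
                pairs.add((row["a"], d))
    return pairs


# Illustrative data only.
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
S = [{"b": 10, "c": 100}, {"b": 10, "c": 101}, {"b": 30, "c": 100}]
T = [{"c": 100, "d": "x"}, {"c": 102, "d": "y"}]

s_hits = build_hitlist(S, "b", "c")
result = chain_join_pairs(R, s_hits, T)
```

In this reading the middle relation is never scanned at join time: the two joining attributes differ (b on one side, c on the other), and the index bridges them directly.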
An Efficient Processing of a Chain Join with the Minimum Communication Cost in Distributed Database Systems
Journal of Distributed and Parallel Databases, 1995
Cited by 2 (1 self)
This paper investigates the optimization problem when executing a join in a distributed database environment. The minimization of the communication cost for sending data through links has been adopted as an optimization criterion. We explore in this paper the approach of judiciously using join operations as reducers in distributed query processing. In general, this problem is computationally intractable. A restriction of the execution of a join in a predefined combinatorial order leads to a possible solution in polynomial time. An algorithm for a chain query computation has been proposed in [21]. The time complexity of the algorithm is $O(m^2 n^2 + m^3 n)$, where $n$ is the number of sites in the network, and $m$ is the number of relations (fragments) involved in the join. In this paper, we first present a proof of the intuitively well understood fact that the "eigenorder" of a "chain" join will be the best predefined combinatorial order to implement the algorithm in [21]. Second, we show a necessary and sufficient condition for a chain query with the eigenordering to be a "simple" query. For the processing of the class of simple queries, we show a significant reduction of the time complexity from $O(m^2 n^2 + m^3 n)$ to $O(mn + m^2)$. It is encouraging that, in practice, the most frequent queries belong to the
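To give a feel for the size of the reduction stated in this abstract, the two bounds can be evaluated for a few values of m (relations) and n (sites). The Python sketch below treats the expressions inside the O-notation as plain operation counts, which ignores constant factors and lower-order terms, so the figures are only indicative; the function names and sample values are assumptions chosen for illustration.

```python
def general_bound(m: int, n: int) -> int:
    """Dominant terms of the general chain-query bound, m^2 n^2 + m^3 n."""
    return m**2 * n**2 + m**3 * n


def simple_bound(m: int, n: int) -> int:
    """Dominant terms of the bound for simple queries, m n + m^2."""
    return m * n + m**2


# Illustrative sizes only; constants hidden by the O-notation are ignored.
for m, n in [(5, 5), (10, 5), (20, 10)]:
    g, s = general_bound(m, n), simple_bound(m, n)
    print(f"m={m:2d} n={n:2d}  general={g:7d}  simple={s:5d}  ratio={g / s:.1f}")
```

For m = 10 and n = 5 the two expressions evaluate to 7500 and 150, a factor of 50 under these simplifying assumptions.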
PMJoin: Optimizing distributed multiway stream joins by stream partitioning
In Proc. Int. Conf. on Database Syst. for Advanced App. (DASFAA), 2006
Cited by 1 (0 self)
In emerging data stream applications, data sources are typically distributed. Evaluating multi-join queries over streams from different sources may incur a large communication cost. As queries run continuously, precious bandwidth would be aggressively consumed without careful optimization of operator ordering and placement. In this paper, we focus on the optimization of continuous multi-join queries over distributed streams. We observe that by partitioning streams into substreams we can significantly reduce the communication cost, and hence propose a novel partition-based join scheme, PMJoin. A few partitioning techniques are studied. To generate the query plan for each substream, a heuristic algorithm is proposed based on a rate-based model. Results from an extensive experimental study show that our techniques can sufficiently reduce the communication cost.
On the Number of Expressions Modulo Commutativity over a Finite Semigroup
A problem, $\epsilon_n$, that is closely related to Catalan numbers, $b_n$, is formulated, analysed, and solved. The result shows that $\epsilon_n$ is exponential, which poses a challenge for finding a polynomial time algorithm if a search space is $\epsilon_n$. Conversely, it inspires attempts to prove that such search problems are NP-complete. This work has immediate applications to the join optimization problem in database systems. Catalan numbers [7] $b_n = \frac{1}{n+1}\binom{2n}{n}$ can be interpreted as, given a finite semigroup $(G, \odot)$ with $|G| = n$, the number of expressions over $(G, \odot)$ which are equivalent under associativity to a specific expression $e$ of length $n$. Catalan numbers have applications in such diverse areas as determining the number of ways that parentheses may be configured in matrix-chain multiplication [1], determining the number of ways that an n-polygon can be triangulated [4], and determining the number of full binary trees [2]. Given the Catalan number $b_n$, it is easy to see t...
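The closed form for the Catalan numbers quoted in this abstract can be checked numerically against the standard recurrence $b_0 = 1$, $b_{n+1} = \sum_{i=0}^{n} b_i\, b_{n-i}$, which counts the ways of splitting a parenthesized product (or a binary tree) into a left and a right part. The Python sketch below covers only the Catalan numbers themselves; it does not compute the paper's new quantity, whose definition is not given in the excerpt. Function names are chosen for illustration.

```python
from math import comb


def catalan_closed_form(n: int) -> int:
    """b_n = C(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_by_recurrence(limit: int) -> list[int]:
    """Return [b_0, ..., b_limit] using b_{n+1} = sum_i b_i * b_{n-i}."""
    b = [1]
    for n in range(limit):
        b.append(sum(b[i] * b[n - i] for i in range(n + 1)))
    return b


# The two definitions agree on the first few terms.
values = catalan_by_recurrence(10)
agree = all(values[n] == catalan_closed_form(n) for n in range(11))
```

The rapid growth of these values is what makes exhaustive enumeration of join orders impractical for more than a handful of relations.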
A Distributed Query Processing Strategy Using Placement Dependency
Proceedings of the 12th International Conference on Data Engineering, 1996
We present an algorithm to make use of placement dependency information to process distributed queries. Our algorithm first partitions the referenced relations of a given query into a number of nonexclusive subsets such that the fragmented relations within a subset have placement dependency and the join operation(s) associated with the relations in the subset can be locally processed without data transfer. Each subset is associated with a set of sites and can be used to generate an execution plan for the given query by keeping the fragmented relations in the subset fragmented at the sites where they are situated while replicating the other referenced relations at each of the processing sites. Among the alternatives, our algorithm picks the plan that gives the minimum response time. Our experimental results show that our algorithm improves response time significantly.
1 Introduction
Query processing in distributed database systems has been an active research area for many years. Many...
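The key property claimed in this abstract is that relations with placement dependency can be joined site by site with no data transfer. The Python sketch below illustrates one situation in which that holds: both relations are fragmented by the same function of the join attribute, so matching tuples always land on the same site. Treating placement dependency as this kind of co-partitioning is an assumption made for illustration, as are the function names and the modulo placement rule; the paper's actual definition and plan-selection algorithm are not reproduced here.

```python
def fragment(rows, key, n_sites):
    """Place each row on site (row[key] mod n_sites); a hypothetical rule."""
    sites = [[] for _ in range(n_sites)]
    for row in rows:
        sites[row[key] % n_sites].append(row)
    return sites


def join(left, right, key):
    """Plain hash equijoin of two lists of dict rows on `key`."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def local_joins(left_frags, right_frags, key):
    """Join fragment i of one relation with fragment i of the other;
    no row ever leaves the site it was placed on."""
    result = []
    for lf, rf in zip(left_frags, right_frags):
        result.extend(join(lf, rf, key))
    return result


# Illustrative data only.
customers = [{"cid": i, "name": f"c{i}"} for i in range(6)]
orders = [{"oid": 100 + i, "cid": i % 4} for i in range(8)]

global_result = join(orders, customers, "cid")
local_result = local_joins(
    fragment(orders, "cid", 3), fragment(customers, "cid", 3), "cid"
)
```

Because both relations use the same placement rule on `cid`, the union of the per-site joins contains exactly the rows of the global join, which is the situation in which keeping the fragments in place pays off.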
An Effective Parallelization of Execution of Multijoins in Multiprocessor Systems
In this paper, we study a synchronous execution strategy for parallel join computation in multiprocessor systems. Through a further comprehensive investigation of the processor allocation problem and the inter-operator parallelization problem, we present a new algorithm for producing an effective parallelization plan for processing multijoins. Besides theoretical analysis, the efficiency and effectiveness of our new algorithm are supported by our experiments.
A Survey on Parallel and Distributed Data Warehouses
Data Warehouses are a crucial technology for current competitive organizations in the globalized world. Size, speed and distributed operation are major challenges concerning those systems. Many data warehouses have huge sizes and the requirement that queries be processed quickly and efficiently, so parallel solutions are deployed to render the necessary efficiency. Distributed operation, on the other hand, concerns global commercial and scientific organizations that need to share their data in a coherent distributed data warehouse. In this paper we review the major concepts, systems and research results behind parallel and distributed data warehouses.
Warehouses
Some businesses generate gigabytes or even terabytes of historical data that can be organized and analyzed for better decision making. This poses issues concerning systems and software for efficient processing over such data. While the traditional solution to this problem involves costly hardware and software, we focus on strategies for running large data warehouses over low-cost, non-dedicated nodes in a local-area network (LAN) and non-proprietary software. Once such a technology is in place, every data warehouse will be able to run in a low-cost environment, but the system must be able to choose its placement and processing for maximum efficiency. We discuss the basic system architecture and the design of the data placement and processing strategy. We compare the shortcomings of a basic horizontal partitioning for the environment with a simple design that produces efficient placements. Our discussion and results provide important insight into how low-cost, efficient data warehouse systems can be obtained.
Copyright © 2007, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.