Results 1-10 of 36
Maximal Vector Computation in Large Data Sets
In VLDB, 2005
Cited by 60 (1 self)
Finding the maximals in a collection of vectors is relevant to many applications. The maximal set is related to the convex hull (and hence, linear optimization) and nearest neighbors. The maximal vector problem has resurfaced with the advent of skyline queries for relational databases and skyline algorithms that are external and relationally well behaved. The initial ...
Efficient Computation of the Skyline Cube
In VLDB, 2005
Cited by 49 (4 self)
Skyline has been proposed as an important operator for multi-criteria decision making, data mining and visualization, and user-preference queries. In this paper, we consider the problem of efficiently computing a Skycube, which consists of skylines of all possible nonempty subsets of a given set of dimensions. While existing skyline computation algorithms can be immediately extended to computing each skyline query independently, such "shared-nothing" algorithms are inefficient. We develop several computation sharing strategies based on effectively identifying the computation dependencies among multiple related skyline queries. Based on these sharing strategies, two novel algorithms, Bottom-Up and Top-Down, are proposed to compute the Skycube efficiently. Finally, our extensive performance evaluations confirm the effectiveness of the sharing strategies. It is ...
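To make the Skycube definition concrete, the following is a minimal Python sketch of the independent, "shared-nothing" baseline that the abstract describes as inefficient. It is not the Bottom-Up or Top-Down algorithm from the paper. The names `dominates_on`, `subspace_skyline`, and `naive_skycube` are hypothetical, and the sketch assumes that larger values are preferred on every dimension.

```python
from itertools import combinations


def dominates_on(a, b, dims):
    """True if a dominates b on the given dimensions.

    Assumption: larger is better. a must be at least as good as b on
    every dimension in dims and strictly better on at least one.
    """
    return (all(a[i] >= b[i] for i in dims)
            and any(a[i] > b[i] for i in dims))


def subspace_skyline(points, dims):
    """Points not dominated by any other point when restricted to dims."""
    return [p for p in points
            if not any(dominates_on(q, p, dims) for q in points)]


def naive_skycube(points, d):
    """Compute one skyline per nonempty subset of the d dimensions.

    Each subspace is computed independently, so no work is shared
    between related skyline queries.
    """
    cube = {}
    for size in range(1, d + 1):
        for dims in combinations(range(d), size):
            cube[dims] = subspace_skyline(points, dims)
    return cube


if __name__ == "__main__":
    pts = [(1, 5), (3, 3), (5, 1)]
    for dims, sky in naive_skycube(pts, 2).items():
        print(dims, sky)
```

With d dimensions this computes 2^d - 1 separate skylines, which is the cost that sharing strategies aim to reduce.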
Refreshing the sky: the compressed skycube with efficient support for frequent updates
In SIGMOD, 2006
Cited by 33 (0 self)
The skyline query is important in many applications such as multi-criteria decision making, data mining, and user-preference queries. Given a set of d-dimensional objects, the skyline query finds the objects that are not dominated by others. In practice, different users may be interested in different dimensions of the data, and issue queries on any subset of the d dimensions. This paper focuses on supporting concurrent and unpredictable subspace skyline queries in frequently updated databases. Simply computing and storing the skyline objects of every subspace in a skycube incurs an expensive update cost. In this paper, we investigate the important issue of updating the skycube in a dynamic environment. To balance the query cost and update cost, we propose a new structure, the compressed skycube, which concisely represents the complete skycube. We thoroughly explore the properties of the compressed skycube and provide an efficient object-aware update scheme. Experimental results show that the compressed skycube is both query and update efficient.
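The dominance-based definition of the skyline query can be illustrated with a short Python sketch. This is a quadratic-time illustration of the definition only, not the compressed skycube or any algorithm from the paper. The function names are hypothetical, and the sketch assumes that larger values are preferred on every dimension.

```python
def dominates(a, b):
    """True if a dominates b.

    Assumption: larger is better. a is at least as good as b on every
    dimension and strictly better on at least one.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    objects = [(1, 5), (3, 3), (5, 1), (2, 2), (3, 1)]
    print(skyline(objects))
```

In this example, (2, 2) and (3, 1) are both dominated by (3, 3), so only the remaining three objects form the skyline.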
Algorithms and Analyses for Maximal Vector Computation
Cited by 24 (0 self)
The maximal vector problem is to identify the maximals over a collection of vectors. This arises in many contexts and, as such, has been well studied. The problem recently gained renewed attention with skyline queries for relational databases and with work to develop skyline algorithms that are external and relationally well behaved. While many algorithms have been proposed, how they perform has been unclear. We study the performance of, and design choices behind, these algorithms. We prove runtime bounds based on the number of vectors n and the dimensionality k. Early algorithms based on divide-and-conquer established seemingly good average and worst-case asymptotic runtimes. In fact, the problem can be solved in O(n) average-case (holding k as fixed). We prove, however, that the performance is quite bad with respect to k. We demonstrate that the more recent skyline algorithms are better behaved, and can also achieve O(kn) average-case. While k matters for these, in practice, its effect vanishes in the asymptotic. We introduce a new external algorithm, LESS, that is more efficient and better behaved. We evaluate LESS’s effectiveness and improvement over the field, and prove that its average-case running time is O(kn).
Towards Multidimensional Subspace Skyline Analysis
Cited by 17 (3 self)
The skyline operator is important for multi-criteria decision-making applications. Although many recent studies developed efficient methods to compute skyline objects in a given space, none of them considers skylines in multiple subspaces simultaneously. More importantly, the fundamental ...
Efficient Top-k Aggregation of Ranked Inputs
Cited by 16 (1 self)
A top-k query combines different rankings of the same set of objects and returns the k objects with the highest combined score according to an aggregate function. We bring to light some key observations, which impose two phases that any top-k algorithm, based on sorted accesses, should go through. Based on them, we propose a new algorithm, which is designed to minimize the number of object accesses, the computational cost, and the memory requirements of top-k search with monotone aggregate functions. We provide an analysis of its cost and show that it is always no worse than the baseline “no random accesses” algorithm in terms of computations, accesses, and memory required. As a side contribution, we perform a space analysis, which indicates the memory requirements of top-k algorithms that only perform sorted accesses. For the case where the required space exceeds the available memory, we propose disk-based variants of our algorithm. We propose and optimize a multi-way top-k join operator, with certain advantages over evaluation trees of binary top-k join operators. Finally, we define and study the computation of top-k cubes and the implementation of roll-up and drill-down operations in such cubes. Extensive experiments with synthetic and real data show that, compared to previous techniques, our method accesses fewer objects, while being orders of magnitude faster.
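The top-k query semantics in the first sentence can be sketched in Python as a full scan that scores every object. This is only an illustration of what a top-k query returns. It is neither the algorithm proposed in the paper nor the "no random accesses" baseline, both of which avoid reading every object. The function name `top_k` and the dictionary-based input format are assumptions made for this example.

```python
import heapq


def top_k(rankings, k, aggregate=sum):
    """Return the k objects with the highest combined score.

    rankings:  list of dicts mapping object id -> score, one per ranking
               (hypothetical input format).
    aggregate: monotone function over an iterable of scores,
               for example sum or min.
    """
    # Only objects present in every ranking can be scored.
    objects = set.intersection(*(set(r) for r in rankings))
    scored = ((aggregate(r[o] for r in rankings), o) for o in objects)
    # nlargest orders by combined score, breaking ties by object id.
    return [(o, s) for s, o in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    by_price = {"a": 9, "b": 5, "c": 1}
    by_rating = {"a": 2, "b": 7, "c": 9}
    print(top_k([by_price, by_rating], 2))
    print(top_k([by_price, by_rating], 1, aggregate=min))
```

Swapping `aggregate` between `sum` and `min` shows how the choice of monotone aggregate function changes which objects rank highest.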
Approaching the Efficient Frontier: Cooperative Database Retrieval Using High-Dimensional Skylines
In Proc. of the Int. Conf. on Database Systems for Advanced Applications (DASFAA), 2005
Cited by 14 (13 self)
Cooperative database retrieval is a challenging problem: top-k retrieval delivers manageable results only when a suitable compensation function (e.g. a weighted mean) is explicitly given. On the other hand, skyline queries offer intuitive querying to users, but result set sizes grow exponentially and hence can easily exceed manageable levels. We show how to combine the advantages of skyline queries and top-k retrieval in an interactive query processing scheme using user feedback on a manageable, representative sample of the skyline set to derive the most adequate weightings for subsequent focused top-k retrieval. Hence, each user’s information needs are conveniently and intuitively obtained, and only a limited set of best matching objects is returned. We will demonstrate our scheme’s efficient performance, manageable result sizes, and representativeness of the skyline. We will also show how to effectively estimate users’ compensation functions using their feedback. Our approach thus paves the way to intuitive and efficient cooperative retrieval with vague query predicates.
Semantic Optimization Techniques for Preference Queries
2006
Cited by 14 (3 self)
Preference queries are relational algebra or SQL queries that contain occurrences of the winnow operator (find the most preferred tuples in a given relation). Such queries are parameterized by specific preference relations. Semantic optimization techniques make use of integrity constraints holding in the database. In the context of semantic optimization of preference queries, we identify two fundamental properties: containment of preference relations relative to integrity constraints and satisfaction of order axioms relative to integrity constraints. We show numerous applications of those notions to preference query evaluation and optimization. As integrity constraints, we consider constraint-generating dependencies, a class generalizing functional dependencies. We demonstrate that the problems of containment and satisfaction of order axioms can be captured as specific instances of constraint-generating dependency entailment. This makes it possible to formulate necessary and sufficient conditions for the applicability of our techniques as constraint validity problems. We characterize the computational complexity of such problems.
Database Querying under Changing Preferences
2006
Cited by 10 (2 self)
We present here a formal foundation for an iterative and incremental approach to constructing and evaluating preference queries. Our main focus is on query modification: a query transformation approach which works by revising the preference relation in the query. We provide a detailed analysis of the cases where the order-theoretic properties of the preference relation are preserved by the revision. We consider a number of different revision operators: union, prioritized and Pareto composition. We also formulate algebraic laws that enable incremental evaluation of preference queries. Finally, we consider two variations of the basic framework: finite restrictions of preference relations and weak-order extensions of strict partial order preference relations.
Exploiting Indifference for Customization of Partial Order Skylines
In Int. Database Engineering and Applications Symp. (IDEAS), 2006
Cited by 10 (4 self)
Unlike numerical preferences, preferences on attribute values do not show an inherent total order, so skyline computation has to rely on partial orderings explicitly stated by the user. In such orders many object values are incomparable, hence skyline sizes become impractical. However, the Pareto semantics can be modified to benefit from indifferences: skyline result sizes can be substantially reduced by allowing the user to declare some incomparable values as equally desirable. A major problem of adding such equivalences is that they may result in intransitivity of the aggregated Pareto order and thus hamper efficient query processing. In this paper we analyze how far the strict Pareto semantics can be relaxed while always retaining transitivity of the induced Pareto aggregation. Extensive practical tests show that skyline sizes can indeed be reduced by about two orders of magnitude when using the maximum possible relaxation that still guarantees consistency with all user preferences.