Multi-Dimensional Top-k Dominating Queries
Cited by 8 (0 self)
The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study on the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply to indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.
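As a rough illustration of the query definition in this abstract, the following minimal sketch scores every point by the number of points it dominates and returns the k best. It is a brute-force O(n²) baseline, not one of the paper's algorithms. The function names and the convention that smaller values are better in every dimension are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption for this sketch: smaller values are better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[int, Point]]:
    """Return the k points with the highest domination counts (naive scan)."""
    scored = [(sum(dominates(p, q) for q in points), p) for p in points]
    # Stable sort: ties keep their input order.
    scored.sort(key=lambda pair: -pair[0])
    return scored[:k]


points = [(1, 1), (2, 2), (3, 1), (1, 3), (3, 3)]
print(top_k_dominating(points, 2))
```

Because the score is a count of dominated objects rather than a weighted sum, rescaling any dimension by a positive factor leaves the result unchanged, which is property (iii) named in the abstract.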
Discovering Relative Importance of Skyline Attributes
Cited by 8 (1 self)
Querying databases with preferences is an important research problem. Among various approaches to querying with preferences, the skyline framework is one of the most popular. A well-known deficiency of that framework is that all attributes are of the same importance in skyline preference relations. Consequently, the size of the results of skyline queries may grow exponentially with the number of skyline attributes. Here we propose the framework called p-skylines which enriches skylines with the notion of attribute importance. It turns out that incorporating relative attribute importance in skylines allows for reduction in the corresponding query result sizes. We propose an approach to discovering importance relationships of attributes, based on user-selected sets of superior and inferior examples. We show that the problem of checking the existence of and the problem of computing an optimal p-skyline preference relation covering a given set of examples are NP-complete and FNP-complete, respectively. However, we also show that a restricted version of the discovery problem – using only superior examples to discover attribute importance – can be solved efficiently in polynomial time. Our experiments show that the proposed importance discovery algorithm has high accuracy and good scalability.
Online interval skyline queries on time series
- In Proceedings of the 25th International Conference on Data Engineering (ICDE’09), 2009
Cited by 7 (1 self)
In many applications, we need to analyze a large number of time series. Segments of time series demonstrating dominating advantages over others are often of particular interest. In this paper, we advocate interval skyline queries, a novel type of time series analysis queries. For a set of time series and a given time interval [i: j], an interval skyline query returns the time series which are not dominated by any other time series in the interval. We illustrate the usefulness of interval skyline queries in applications. Moreover, we develop an on-the-fly method and a view-materialization method to answer interval skyline queries on time series online. The on-the-fly method keeps the minimum and the maximum values of the time series using radix priority search trees and sketches, and computes the skyline at query time. The view-materialization method maintains the skylines over all intervals in a compact data structure. Through theoretical analysis and extensive experiments, we show that both methods only require linear space and are efficient in query answering as well as incremental maintenance.
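To make the interval skyline definition concrete, here is a naive sketch that compares every pair of series inside the interval. It is not the on-the-fly or view-materialization method from the paper. The names, the inclusive 0-based interval bounds, and the convention that larger values are better are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def dominates_in_interval(s: Sequence[float], t: Sequence[float], i: int, j: int) -> bool:
    """True if s is at least as large as t at every timestamp in [i, j]
    and strictly larger at one or more of them (assumed: larger is better)."""
    seg_s, seg_t = s[i:j + 1], t[i:j + 1]
    return (all(a >= b for a, b in zip(seg_s, seg_t))
            and any(a > b for a, b in zip(seg_s, seg_t)))


def interval_skyline(series: Dict[str, List[float]], i: int, j: int) -> List[str]:
    """Names of the series not dominated by any other series in [i, j]."""
    return [
        name
        for name, s in series.items()
        if not any(
            dominates_in_interval(t, s, i, j)
            for other, t in series.items()
            if other != name
        )
    ]


series = {"a": [1, 5, 3, 4], "b": [2, 4, 2, 5], "c": [1, 3, 2, 4]}
print(interval_skyline(series, 1, 2))  # only "a" survives on this interval
print(interval_skyline(series, 0, 3))  # "a" and "b" are incomparable here
```

The example shows that the answer depends on the interval: a series dominated on a short window can re-enter the skyline once the window widens.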
Efficiently Performing Consistency Checks for Multi-Dimensional Preference Trade-Offs
Cited by 2 (2 self)
Skyline Queries have recently received a lot of attention due to their intuitive query capabilities. Following the concept of Pareto optimality all ‘best’ database objects are returned to the user. However, this often results in unmanageably large result set sizes hampering the success of this innovative paradigm. As an effective remedy for this problem, trade-offs provide a natural concept for dealing with incomparable choices. Such trade-offs, however, are not reflected by the Pareto paradigm. Thus, incorporating them into the users’ preference orders and adjusting skyline results accordingly requires special algorithms beyond traditional skylining. For the actual integration of trade-offs into skylines, the problem of ensuring the consistency of arbitrary trade-off sets poses a demanding challenge. Consistency is a crucial aspect when dealing with multi-dimensional trade-offs spanning several attributes. If consistency is violated, cyclic preferences may occur in the result set. But such cyclic preferences cannot be resolved by information systems in a sensible way. Often, this problem is circumvented by restricting the trade-offs’ expressiveness, e.g. by altogether ignoring some classes of possibly inconsistent trade-offs. In this paper, we will present a new algorithm capable of efficiently verifying the consistency of any arbitrary set of trade-offs. After motivating its basic concepts and introducing the algorithm itself, we will also show that it exhibits superior average-case performance. The benefits of our approach promise to pave the way towards personalized and cooperative information systems.
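The abstract ties inconsistency to cyclic preferences. As a simplified illustration of that idea only, the sketch below treats each stated preference as a directed edge "better → worse" and uses a depth-first search to detect a cycle. This is generic graph cycle detection, not the paper's algorithm, and it ignores the multi-dimensional structure of real trade-offs. The function name and the pair-based input format are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def has_preference_cycle(preferences: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """Return True if the (better, worse) pairs contain a directed cycle."""
    graph: Dict[Hashable, List[Hashable]] = {}
    for better, worse in preferences:
        graph.setdefault(better, []).append(worse)

    ON_STACK, DONE = 1, 2
    state: Dict[Hashable, int] = {}

    def visit(node: Hashable) -> bool:
        state[node] = ON_STACK
        for nxt in graph.get(node, ()):
            if state.get(nxt) == ON_STACK:
                return True  # back edge: nxt is an ancestor of node
            if nxt not in state and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(node not in state and visit(node) for node in graph)


print(has_preference_cycle([("a", "b"), ("b", "c")]))
print(has_preference_cycle([("a", "b"), ("b", "c"), ("c", "a")]))
```

The first call reports a consistent set of preferences; adding the pair ("c", "a") closes a loop, so the second call reports a cycle.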
Top-k Dominant Web Services Under Multi-Criteria Matching
Cited by 2 (0 self)
As we move from a Web of data to a Web of services, enhancing the capabilities of the current Web search engines with effective and efficient techniques for Web services retrieval and selection becomes an important issue. Traditionally, the relevance of a Web service advertisement to a service request is determined by computing an overall score that aggregates individual matching scores among the various parameters in their descriptions. Two drawbacks characterize such approaches. First, there is no single matching criterion that is optimal for determining the similarity between parameters. Instead, there are numerous approaches ranging from Information Retrieval similarity metrics to semantic logic-based inference rules. Second, the reduction of individual scores to an overall similarity leads to significant information loss. Since there is no consensus on how to weight these scores, existing methods
Online skyline analysis with dynamic preferences on nominal attributes
- In Technical Report, http://www.cse.cuhk.edu.hk/~kdd, 2007
Cited by 2 (1 self)
The importance of skyline analysis has been well recognized in multicriteria decision-making applications. All of the previous studies assume a fixed order on the attributes in question. However, in some applications, users may be interested in skylines with respect to various total or partial orders on nominal attributes. In this paper, we identify and tackle the problem of online skyline analysis with dynamic preferences on nominal attributes. We investigate how changes of orders in attributes lead to changes of skylines. We address two novel types of interesting queries: a viewpoint query returns with respect to which orders a point is (or is not) in the skylines, and a refined skyline query retrieves the skyline with respect to a specific order. We develop two methods systematically and report an extensive performance study using both synthetic and real data sets to verify the effectiveness and the efficiency of our methods. Index Terms: Skyline, materialization, data warehouses, preferences.
Efficient Skyline Refinement using Trade-Offs
Cited by 1 (1 self)
Skyline Queries have received a lot of attention due to their intuitive query formulation. Following the concept of Pareto optimality all ‘best’ database items satisfying different aspects of the query are returned to the user. However, this often results in huge result set sizes. In everyday life, users face the same problem. But here, when confronted with too large a variety of choices, users tend to focus only on some aspects of the attribute space at a time and try to figure out acceptable compromises between these attributes. Such trade-offs are not reflected by the Pareto paradigm. Incorporating them into user preferences and adjusting skyline results accordingly thus requires special algorithms beyond traditional skylining. In this paper we propose a novel algorithm for efficiently incorporating such typical trade-off information into preference orders. Our experiments on both real-world and synthetic data sets show the impact of our techniques: impractical skyline sizes efficiently become manageable with a minimum amount of user interaction. Additionally, we also design a method to elicit especially interesting trade-offs promising a high reduction of skyline sizes. At any point, the user can choose whether to provide individual trade-offs, or accept those suggested by the system. The benefit of incorporating trade-offs into the strict Pareto semantics is clear: result sets become manageable, while additionally getting more focused on the users’ information needs.
CONSISTENCY CHECK ALGORITHMS FOR MULTI-DIMENSIONAL PREFERENCE TRADE-OFFS
- INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND APPLICATIONS
Cited by 1 (1 self)
Skyline Queries have recently received a lot of attention due to their intuitive query capabilities. Following the concept of Pareto optimality all ‘best’ database objects are returned to the user. However, this often results in unmanageably large result set sizes hampering the success of this innovative paradigm. As an effective remedy for this problem, trade-offs provide a natural concept for dealing with incomparable choices. But such trade-offs are not reflected by the Pareto paradigm. Thus, incorporating them into the users’ preference orders and adjusting skyline results accordingly requires special algorithms beyond traditional skylining. For the actual integration of trade-offs into skylines, the problem of ensuring the consistency of arbitrary trade-off sets poses a demanding challenge. Consistency is a crucial aspect when dealing with multi-dimensional trade-offs spanning several attributes. If consistency is violated, cyclic preferences may occur in the result set. But such cyclic preferences cannot be handled by information systems in a sensible way. Often, this problem is circumvented by restricting the trade-offs’ expressiveness, e.g. by altogether ignoring some classes of possibly inconsistent trade-offs. In this paper, we will present a new algorithm capable of efficiently verifying the consistency of any
Computing Closed Skycubes
Cited by 1 (0 self)
In this paper, we tackle the problem of efficient skycube computation. We introduce a novel approach significantly reducing domination tests for a given subspace and the number of subspaces searched. Technically, we identify two types of skyline points that can be directly derived without using any domination tests. Moreover, based on formal concept analysis, we introduce two closure operators that enable a concise representation of skyline cubes. We show that this concise representation is easy to compute and develop an efficient algorithm, which only needs to search a small portion of the huge search space. We show with empirical results the merits of our approach.
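For readers unfamiliar with the term, a skycube is the collection of skylines over every non-empty subspace of the dimensions. The sketch below computes it naively, one subspace at a time, which is the exhaustive search the abstract aims to avoid. It does not implement the closure operators or the concise representation. The function names and the smaller-is-better convention are assumptions for this example.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def skyline(points: Sequence[Point], dims: Sequence[int]) -> List[Point]:
    """Points not dominated by any other point when restricted to dims.

    Assumption for this sketch: smaller values are better.
    """
    def dominates(p: Point, q: Point) -> bool:
        return all(p[d] <= q[d] for d in dims) and any(p[d] < q[d] for d in dims)

    return [p for p in points if not any(dominates(q, p) for q in points)]


def skycube(points: Sequence[Point]) -> Dict[Tuple[int, ...], List[Point]]:
    """Skyline of every non-empty subspace (2^d - 1 of them for d dimensions)."""
    n_dims = len(points[0])
    return {
        dims: skyline(points, dims)
        for size in range(1, n_dims + 1)
        for dims in combinations(range(n_dims), size)
    }


cube = skycube([(1, 3), (2, 2), (3, 1), (3, 3)])
for dims, result in cube.items():
    print(dims, result)
```

The number of subspaces doubles with each added dimension, which is why reducing both the domination tests per subspace and the number of subspaces visited matters.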
Stochastic Skyline Operator
Cited by 1 (1 self)
In many applications involving multi-criteria optimal decision making, users may often want to make a personal trade-off among all optimal solutions. As a key feature, the skyline in a multi-dimensional space provides the minimum set of candidates for such purposes by removing all points not preferred by any (monotonic) utility/scoring functions; that is, the skyline removes all objects not preferred by any user no matter how their preferences vary. Driven by many applications with uncertain data, the probabilistic skyline model is proposed to retrieve uncertain objects based on skyline probabilities. Nevertheless, skyline probabilities cannot capture the preferences of monotonic utility functions. Motivated by this, in this paper we propose a novel skyline operator, namely stochastic skyline. In light of the expected utility principle, stochastic skyline is guaranteed to provide the minimum set of candidates for the optimal solutions over all possible monotonic multiplicative utility functions. In contrast to the conventional skyline or the probabilistic skyline computation, we show that the problem of stochastic skyline is NP-complete with respect to the dimensionality. Novel and efficient algorithms are developed to compute stochastic skyline over multidimensional uncertain data, which run in polynomial time if the dimensionality is fixed. We also show, by theoretical analysis and experiments, that the size of stochastic skyline is quite similar to that of conventional skyline over certain data. Comprehensive experiments demonstrate that our techniques are efficient and scalable regarding both CPU and IO costs.

