Refreshing the sky: The compressing skycube with efficient support for frequent updates (2006)

by T Xia, D Zhang
Venue: In SIGMOD

Results 1 - 10 of 26

Probabilistic skylines on uncertain data

by Jian Pei, Bin Jiang, Xuemin Lin, Yidong Yuan - In Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB’07), Vienna, 2007
Cited by 39 (10 self)
Uncertain data are inherent in some important applications. Although a considerable amount of research has been dedicated to modeling uncertain data and answering some types of queries on uncertain data, how to conduct advanced analysis on uncertain data remains an open problem at large. In this paper, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic skyline model where an uncertain object may take a probability to be in the skyline, and a p-skyline contains all the objects whose skyline probabilities are at least p. Computing probabilistic skylines on large uncertain data sets is challenging. We develop two efficient algorithms. The bottom-up algorithm computes the skyline probabilities of some selected instances of uncertain objects, and uses those instances to prune other instances and uncertain objects effectively. The top-down algorithm recursively partitions the instances of uncertain objects into subsets, and prunes subsets and objects aggressively. Our experimental results on both the real NBA player data set and the benchmark synthetic data sets show that probabilistic skylines are interesting and useful, and our two algorithms are efficient on large data sets, and complementary to each other in performance.
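
The p-skyline definition in the abstract above can be illustrated with a brute-force sketch. It is not the paper's bottom-up or top-down algorithm. It assumes that each uncertain object is a list of equally likely instances, that objects are independent, and that smaller values are better in every dimension. The names dominates, skyline_probability and p_skyline are hypothetical.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name: str, objects: Dict[str, List[Point]]) -> float:
    """Probability that the named object is in the skyline.

    Assumption: instances of an object are equally likely and objects are
    independent, so an instance survives each other object with probability
    1 - (fraction of that object's instances dominating it).
    """
    total = 0.0
    for u in objects[name]:
        prob_u = 1.0
        for other, instances in objects.items():
            if other == name:
                continue
            dominating = sum(1 for v in instances if dominates(v, u))
            prob_u *= 1.0 - dominating / len(instances)
        total += prob_u
    return total / len(objects[name])


def p_skyline(objects: Dict[str, List[Point]], p: float) -> List[str]:
    """Names of all objects whose skyline probability is at least p."""
    return sorted(n for n in objects if skyline_probability(n, objects) >= p)


# Hypothetical toy data: three uncertain objects with two instances each.
objects = {
    "A": [(1, 4), (2, 2)],
    "B": [(3, 3), (1, 5)],
    "C": [(4, 4), (5, 5)],
}
print(p_skyline(objects, 0.5))  # objects with skyline probability >= 0.5
```

This sketch compares every instance against every other instance, which is the cost the pruning strategies described in the abstract aim to avoid.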

Selecting Stars: The k Most Representative Skyline Operator

by Xuemin Lin, Yidong Yuan, Qing Zhang, Ying Zhang - In Proc. of the Int. IEEE Conf. on Data Engineering (ICDE), 2007
Cited by 39 (1 self)
Skyline computation has many applications including multi-criteria decision making. In this paper, we study the problem of selecting k skyline points so that the number of points dominated by at least one of these k skyline points is maximized. We first present an efficient dynamic-programming-based exact algorithm in a 2d-space. Then, we show that the problem is NP-hard when the dimensionality is 3 or more and that it can be approximately solved by a polynomial time algorithm with the guaranteed approximation ratio 1 − 1/e. To speed up the computation, an efficient, scalable, index-based randomized algorithm is developed by applying the FM probabilistic counting technique. A comprehensive performance evaluation demonstrates that our randomized technique is very efficient, highly accurate, and scalable.
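
The selection problem in the abstract above is an instance of maximum coverage, for which the standard greedy heuristic is known to reach the ratio 1 − 1/e. The following is a minimal sketch of that greedy heuristic. Whether it matches the paper's approximation algorithm in detail is an assumption, and it does not use the FM-based randomized technique. It assumes smaller values are better, and greedy_representatives is a hypothetical name.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Brute-force skyline: points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def greedy_representatives(points: List[Point], k: int) -> Tuple[List[Point], int]:
    """Pick up to k skyline points greedily by marginal coverage.

    Returns the chosen skyline points and the number of points that are
    dominated by at least one of them.
    """
    sky = list(dict.fromkeys(skyline(points)))  # drop duplicate skyline points
    covered_by = {
        s: {i for i, p in enumerate(points) if dominates(s, p)} for s in sky
    }
    chosen: List[Point] = []
    covered = set()
    for _ in range(min(k, len(sky))):
        # Take the skyline point that adds the most not-yet-covered points.
        best = max(
            (s for s in sky if s not in chosen),
            key=lambda s: len(covered_by[s] - covered),
        )
        chosen.append(best)
        covered |= covered_by[best]
    return chosen, len(covered)


# Hypothetical toy data.
points = [(1, 6), (2, 3), (5, 1), (3, 5), (4, 4), (6, 2), (7, 3), (3, 7)]
print(greedy_representatives(points, 2))
```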

Deltasky: Optimal maintenance of skyline deletions without exclusive dominance region generation

by Ping Wu, Divyakant Agrawal, Amr El Abbadi - In UCSB Tech Report, 2006. http://www.cs.ucsb.edu/~pingwu/deltasky.pdf, 2007
Cited by 15 (1 self)
This paper addresses the problem of efficient maintenance of a materialized skyline view in response to skyline removals. While there has been significant progress on skyline query computation, an equally important but largely unanswered issue is the incremental maintenance for skyline deletions. Previous work suggested the use of the so-called exclusive dominance region (EDR) to achieve optimal I/O performance for deletion maintenance. However, the shape of an EDR becomes extremely complex in higher dimensions, and algorithms for its computation have not been developed. We derive a systematic way to decompose a d-dimensional EDR into a collection of hyper-rectangles. We show that the number of such hyper-rectangles is O(m^d), where m is the current skyline result size. We then propose a novel algorithm DeltaSky which determines whether an intermediate R-tree MBR intersects with the EDR without explicitly calculating the EDR itself. This reduces the worst-case complexity of the EDR intersection check from O(m^d) to O(md). Thus DeltaSky helps the branch and bound skyline algorithm achieve I/O optimality for deletion maintenance by finding only the newly appeared skyline points after the deletion. We discuss implementation issues and show that DeltaSky can be efficiently implemented using one extra B-Tree. Moreover, we propose two optimization techniques which further reduce the average cost in practice. Extensive experiments demonstrate that DeltaSky achieves orders of magnitude performance gain over alternative solutions.

Towards Multidimensional Subspace Skyline Analysis

by Jian Pei, Yidong Yuan, Xuemin Lin, Wen Jin, Martin Ester, Qing Liu, Wei Wang, Yufei Tao, Jeffrey Xu Yu, Qing Zhang
Cited by 14 (3 self)
The skyline operator is important for multicriteria decision-making applications. Although many recent studies developed efficient methods to compute skyline objects in a given space, none of them considers skylines in multiple subspaces simultaneously. More importantly, the fundamental

Probabilistic Skyline Operators over Sliding Windows

by Wenjie Zhang, Ying Zhang, Jeffrey Xu Yu - ICDE 2009
Cited by 12 (4 self)
Skyline computation has many applications including multi-criteria decision making. In this paper, we study the problem of efficient processing of continuous skyline queries over sliding windows on uncertain data elements regarding given probability thresholds. We first characterize what kind of elements we need to keep in our query computation. Then we show the size of the dynamically maintained candidate set and the size of the skyline. We develop novel, efficient techniques to process a continuous, probabilistic skyline query. Finally, we extend our techniques to the applications where multiple probability thresholds are given or we want to retrieve “top-k” skyline data objects. Our extensive experiments demonstrate that the proposed techniques are very efficient and handle a high-speed data stream in real time.

Efficient Skyline and Top-k Retrieval in Subspaces

by Yufei Tao, Xiaokui Xiao, Jian Pei
Cited by 12 (0 self)
Skyline and top-k queries are two popular operations for preference retrieval. In practice, applications that require these operations usually provide numerous candidate attributes, whereas, depending on their interests, users may issue queries regarding different subsets of the dimensions. The existing algorithms are inadequate for subspace skyline/top-k search because they have at least one of the following defects: they (i) require scanning the entire database at least once; (ii) are optimized for one subspace but incur significant overhead for other subspaces; (iii) demand expensive maintenance cost or space consumption. In this paper, we propose a technique, SUBSKY, which settles both types of queries using purely relational technologies. The core of SUBSKY is a transformation that converts multidimensional data to 1D values. These values are indexed by a simple B-tree, which allows us to answer subspace queries by accessing a fraction of the database. SUBSKY entails low maintenance overhead, which equals the cost of updating a traditional B-tree. Extensive experiments with real data confirm that our technique outperforms alternative solutions significantly in both efficiency and scalability.

Eliciting Matters -- Controlling Skyline Sizes by Incremental Integration of User Preferences

by Wolf-Tilo Balke, Ulrich Güntzer, Christoph Lofi - Int. Conf. on Database Systems for Advanced Applications (DASFAA), 2007
Cited by 9 (4 self)
Today, result sets of skyline queries are unmanageable due to their exponential growth with the number of query predicates. In this paper we discuss the incremental re-computation of skylines based on additional information elicited from the user. Extending the traditional case of totally ordered domains, we consider preferences in their most general form as strict partial orders of attribute values. After getting an initial skyline set, our basic approach aims at interactively increasing the system’s information about the user’s wishes, explicitly including indifferences. The additional knowledge is then incorporated into the preference information and constantly reduces skyline sizes. In fact, our approach even allows users to specify trade-offs between different query predicates, thus effectively decreasing the query dimensionality. We give a theoretical proof for the soundness and consistency of the extended preference information and an extensive experimental evaluation of the efficiency of our approach. On average, skyline sizes can be considerably decreased in each elicitation step.

Incremental Trade-Off Management for Preference Based Queries

by Wolf-Tilo Balke, Christoph Lofi, Ulrich Güntzer
Cited by 7 (7 self)
Preference-based queries, often referred to as skyline queries, play an important role in cooperative query processing. However, their prohibitive result sizes pose a severe challenge to the paradigm’s practical applicability. In this paper we discuss the incremental re-computation of skylines based on additional information elicited from the user. Extending the traditional case of totally ordered domains, we consider preferences in their most general form as strict partial orders of attribute values. After getting an initial skyline set, our approach aims at incrementally increasing the system’s information about the user’s wishes. This additional knowledge is then incorporated into the preference information and constantly reduces skyline sizes. In particular, our approach also allows users to specify trade-offs between different query attributes, thus effectively decreasing the query dimensionality. We provide the required theoretical foundations for modeling preferences and equivalences, show how to compute incremented skylines, and prove the correctness of the algorithm. Moreover, we show that incremented skyline computation can take advantage of locality and database indices and thus the performance of the algorithm can be additionally increased.

Computing compressed multidimensional skyline cubes efficiently

by Jian Pei, Ada Wai-Chee Fu, Xuemin Lin, Haixun Wang - In ICDE’07, 2007
Cited by 7 (1 self)
Recently, the skyline computation and analysis have been extended from one single full space to multidimensional subspaces, which can lead to valuable insights in some applications. Particularly, compressed skyline cubes in the form of skyline groups and their decisive subspaces provide a succinct summarization and compression of multidimensional subspace skylines. However, computing skyline cubes remains a challenging task since the existing methods have to search an exponential number of nonempty subspaces for subspace skylines. In this paper, we propose a novel and efficient method, Stellar, which exploits an interesting skyline group lattice on a small subset of objects which are in the skyline of the full space. We show that this skyline group lattice is easy to compute and can be extended to the skyline group lattice on all objects. After computing the skyline in the full space, Stellar only needs to enumerate skyline groups and their decisive subspaces using the full space skyline objects. Avoiding searching for skylines in an exponential number of subspaces improves the efficiency and the scalability of subspace skyline computation substantially in practice. An extensive performance study verifies the merits of our new method.

Online interval skyline queries on time series

by Bin Jiang, Jian Pei - In Proceedings of the 25th International Conference on Data Engineering (ICDE’09), 2009
Cited by 7 (1 self)
In many applications, we need to analyze a large number of time series. Segments of time series demonstrating dominating advantages over others are often of particular interest. In this paper, we advocate interval skyline queries, a novel type of time series analysis query. For a set of time series and a given time interval [i : j], an interval skyline query returns the time series which are not dominated by any other time series in the interval. We illustrate the usefulness of interval skyline queries in applications. Moreover, we develop an on-the-fly method and a view-materialization method to answer interval skyline queries on time series online. The on-the-fly method keeps the minimum and the maximum values of the time series using radix priority search trees and sketches, and computes the skyline at the query time. The view-materialization method maintains the skylines over all intervals in a compact data structure. Through theoretical analysis and extensive experiments, we show that both methods only require linear space and are efficient in query answering as well as incremental maintenance.
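
The interval skyline definition in the abstract above can be sketched with a naive scan. This is not the paper's on-the-fly or view-materialization method and uses no radix priority search trees or sketches. It assumes that larger values are better, that the interval [i : j] is inclusive at both ends, and that a series dominates another if it is at least as large at every timestamp in the interval and strictly larger at one. The min/max comparison is only a cheap shortcut in the spirit of keeping per-series minima and maxima. All names are hypothetical.

```python
from typing import Dict, List


def dominates_in_interval(s: List[float], t: List[float], i: int, j: int) -> bool:
    """True if series s dominates series t on timestamps i..j (inclusive).

    Assumption: larger values are better.
    """
    seg_s, seg_t = s[i : j + 1], t[i : j + 1]
    # Shortcut: if the worst value of s beats the best value of t, s dominates.
    if min(seg_s) > max(seg_t):
        return True
    return all(a >= b for a, b in zip(seg_s, seg_t)) and any(
        a > b for a, b in zip(seg_s, seg_t)
    )


def interval_skyline(series: Dict[str, List[float]], i: int, j: int) -> List[str]:
    """Names of the series not dominated by any other series in [i : j]."""
    return sorted(
        name
        for name, t in series.items()
        if not any(
            dominates_in_interval(s, t, i, j)
            for other, s in series.items()
            if other != name
        )
    )


# Hypothetical toy data: four time series of length five.
series = {
    "s1": [4, 5, 6, 3, 2],
    "s2": [3, 4, 5, 4, 4],
    "s3": [1, 2, 3, 1, 1],
    "s4": [2, 2, 2, 2, 2],
}
print(interval_skyline(series, 2, 4))
```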