Results 11 - 20 of 1,797

Bifocal Sampling for Skew-Resistant Join Size Estimation

by Sumit Ganguly, Phillip B. Gibbons
"... This paper introduces bifocal sampling, a new technique for estimating the size of an equi-join of two relations. Bifocal sampling classifies tuples in each relation into two groups, sparse and dense, based on the number of tuples with the same join value. Distinct estimation procedures are employed ..."

Self-Join Size Estimation in Large-scale Distributed Data Systems

by Theoni Pitoura, Peter Triantafillou
"... In this work we tackle the open problem of self-join size (SJS) estimation in a large-scale Distributed Data System, where tuples of a relation are distributed over data nodes which comprise an overlay network. Our contributions include adaptations of five well-known SJS estimation centra ..."
Cited by 5 (0 self)

Join Size Estimation Over Data Streams Using Cosine Series

by Zhewei Jiang, Cheng Luo, Wen-chi Hou, Feng Yan, Qiang Zhu, Chih-fang Wang
"... In many applications, data takes the form of a continuous stream rather than a persistent data set. Data stream processing is generally an on-line, one-pass process and is required to be time and space efficient too. In this paper, we develop a framework for estimating join size over the data stream ..."
Cited by 2 (0 self)

Holistic twig joins: optimal XML pattern matching

by Nicolas Bruno, Nick Koudas, Divesh Srivastava - in: Proceedings of ACM SIGMOD International Conference on Management of Data, 2002
"... XML employs a tree-structured data model, and, naturally, XML queries specify patterns of selection predicates on multiple elements related by a tree structure. Finding all occurrences of such a twig pattern in an XML database is a core operation for XML query processing. Prior work has typically de ..."
Cited by 344 (8 self)
"... of this approach for matching twig patterns is that intermediate result sizes can get large, even when the input and output sizes are more manageable. In this paper, we propose a novel holistic twig join algorithm, TwigStack, for matching an XML query twig pattern. Our technique uses a chain of linked stacks ..."

Power-Law Based Estimation of Set Similarity Join Size

by Hongrae Lee, Raymond T. Ng, Kyuseok Shim
"... We propose a novel technique for estimating the size of set similarity join. The proposed technique relies on a succinct representation of sets using Min-Hash signatures. We exploit frequent patterns in the signatures for the Set Similarity Join (SSJoin) size estimation by counting their support. Ho ..."
Cited by 12 (1 self)

Competition in two-sided markets

by Mark Armstrong - The MITRE Corporation, 1989
"... Many markets involve two groups of agents who interact via “platforms,” where one group’s benefit from joining a platform depends on the size of the other group that joins the platform. I present three models of such markets: a monopoly platform; a model of competing platforms where agents join a s ..."
Cited by 374 (2 self)

A Multiresolution Spline with Application to Image Mosaics

by Peter J. Burt, Edward H. Adelson - ACM Transactions on Graphics, 1983
"... We define a multiresolution spline technique for combining two or more images into a larger image mosaic. In this procedure, the images to be splined are first decomposed into a set of band-pass filtered component images. Next, the component images in each spatial frequency band are assembled into ..."
Cited by 362 (4 self)
"... into a corresponding band-pass mosaic. In this step, component images are joined using a weighted average within a transition zone which is proportional in size to the wavelengths represented in the band. Finally, these band-pass mosaic images are summed to obtain the desired image mosaic. In this way ..."

Structural Joins: A Primitive for Efficient XML Query Pattern Matching

by Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, Yuqing Wu - In ICDE, 2002
"... XML queries typically specify patterns of selection predicates on multiple elements that have some specified tree structured relationships. The primitive tree structured relationships are parent-child and ancestor-descendant, and finding all occurrences of these structural relationships in an XML da ..."
Cited by 279 (27 self)
"... database is a core operation for XML query processing. In this paper, we develop two families of structural join algorithms for this task: tree-merge and stack-tree. The tree-merge algorithms are a natural extension of traditional merge joins and the recently proposed multi-predicate merge joins, while ..."

On the propagation of errors in the size of join results

by Yannis Ioannidis, Stavros Christodoulakis - Proc. of ACM SIGMOD, 1991
"Query optimizers of current relational database systems use several statistics maintained by the system on the contents of the database to decide on the most efficient access plan for a given query. These statistics contain errors that transitively affect many estimates derived by the optimizer. We present a formal framework based on which the principles of this error propagation can be studied. Within this framework, we obtain several analytic results on how the error propagates in general, as well as in the extreme and average cases. We also provide results on guarantees that the database system can make based on the statistics that it maintains. Finally, we discuss some promising approaches to controlling the error propagation and derive several interesting properties of them."
Cited by 142 (5 self)

Efficient Algorithms for Mining Outliers from Large Data Sets

by Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim, 2000
"... In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor. We rank each point on the basis of its distance to its kth nearest neighbor and declare the top n points in this ranking to be outliers. In addition ..."
Cited by 322 (0 self)
"... In addition to developing relatively straightforward solutions to finding such outliers based on the classical nested-loop join and index join algorithms, we develop a highly efficient partition-based algorithm for mining outliers. This algorithm first partitions the input data set into disjoint subsets ..."