Results 1 - 10 of 110,079

Efficient Algorithms for Mining Outliers from Large Data Sets

by Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim, 2000
"... In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor. We rank each point on the basis of its distance to its kth nearest neighbor and declare the top n points in this ranking to be outliers. In addition ..."
Cited by 322 (0 self)
... In addition to developing relatively straightforward solutions to finding such outliers based on the classical nested-loop join and index join algorithms, we develop a highly efficient partition-based algorithm for mining outliers. This algorithm first partitions the input data set into disjoint subsets
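
The outlier definition quoted in this snippet (rank points by distance to the kth nearest neighbor, report the top n) can be illustrated with a brute-force sketch. This is only a nested-loop style illustration under stated assumptions, not the partition-based algorithm the paper develops; the function names, the Euclidean metric, and the sample points are hypothetical choices made for the example.

```python
import heapq
import math


def kth_nn_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor, excluding itself.

    Assumes Euclidean distance; the snippet does not fix a metric.
    """
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]


def top_n_outliers(points, k, n):
    """Indices of the n points with the largest k-th nearest neighbor distance."""
    scores = [(kth_nn_distance(points, i, k), i) for i in range(len(points))]
    # nlargest returns the highest-scoring (distance, index) pairs first.
    return [i for _, i in heapq.nlargest(n, scores)]


# Hypothetical data: four points on a unit square plus one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(top_n_outliers(points, k=2, n=1))
```

This sketch compares every pair of points, so its cost grows quadratically with the number of points, which is the kind of expense the snippet's partition-based approach is meant to reduce.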

Efficient Variants of the ICP Algorithm

by Szymon Rusinkiewicz, Marc Levoy - INTERNATIONAL CONFERENCE ON 3-D DIGITAL IMAGING AND MODELING, 2001
"... The ICP (Iterative Closest Point) algorithm is widely used for geometric alignment of three-dimensional models when an initial estimate of the relative pose is known. Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minim ..."
Cited by 718 (5 self)

An Efficient Context-Free Parsing Algorithm

by Jay Earley, 1970
"... A parsing algorithm which seems to be the most efficient general context-free algorithm known is described. It is similar to both Knuth's LR(k) algorithm and the familiar top-down algorithm. It has a time bound proportional to n^3 (where n is the length of the string being parsed) in general; i ..."
Cited by 798 (0 self)

An Efficient Boosting Algorithm for Combining Preferences

by Raj Dharmarajan Iyer, Jr., 1999
"... The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting ..."
Cited by 727 (18 self)

CURE: An Efficient Clustering Algorithm for Large Data sets

by Sudipto Guha, Rajeev Rastogi, Kyuseok Shim - Published in the Proceedings of the ACM SIGMOD Conference, 1998
"... Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering ..."
Cited by 722 (5 self)
... is much better than those found by existing algorithms. Furthermore, they demonstrate that random sampling and partitioning enable CURE to not only outperform existing algorithms but also to scale well for large databases without sacrificing clustering quality.

Theoretical improvements in algorithmic efficiency for network flow problems

by Jack Edmonds, Richard M. Karp, 1972
"... This paper presents new algorithms for the maximum flow problem, the Hitchcock transportation problem, and the general minimum-cost flow problem. Upper bounds on ... the numbers of steps in these algorithms are derived, and are shown to compare favorably with upper bounds on the numbers of steps req ..."
Cited by 560 (0 self)

Implementing data cubes efficiently

by Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman - In SIGMOD, 1996
"... Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like total ..."
Cited by 548 (1 self)
... to materializing the data cube. In this paper, we investigate the issue of which cells (views) to materialize when it is too expensive to materialize all views. A lattice framework is used to express dependencies among views. We present greedy algorithms that work off this lattice and determine a good set of views

Efficient semantic matching

by Fausto Giunchiglia, Mikalai Yatskevich, Enrico Giunchiglia, 2004
"... We think of Match as an operator which takes two graph-like structures and produces a mapping between semantically related nodes. We concentrate on classifications with tree structures. In semantic matching, correspondences are discovered by translating the natural language labels of nodes into prop ..."
Cited by 855 (68 self)
... into propositional formulas, and by codifying matching into a propositional unsatisfiability problem. We distinguish between problems with conjunctive formulas and problems with disjunctive formulas, and present various optimizations. For instance, we propose a linear time algorithm which solves the first class

The CN2 Induction Algorithm

by Peter Clark, Tim Niblett - MACHINE LEARNING, 1989
"... Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensib ..."
Cited by 890 (6 self)

Efficient Graph-Based Image Segmentation

by Pedro F. Felzenszwalb, Daniel P. Huttenlocher - International Journal of Computer Vision, 2004
"... This paper addresses the problem of segmenting an image into regions. We define a predicate for measuring the evidence for a boundary between two regions using a graph-based representation of the image. We then develop an efficient segmentation algorithm based on this predicate, and show ..."
Cited by 940 (1 self)
Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University