Results 1 - 10 of 75,080

Analysis of Distance Based Indexing Methods for Similarity Search

by Mahdi Mirzazadeh, Bashir S. Sadjad
"... In this paper we investigate data structures for performing similarity search queries in metric spaces. We have selected six methods proposed in three different communities: two theoretically analyzed structures of Clarkson, GNAT and SA-tree from databases, and AESA and LAESA which are used for patter ..."

DSIM: A Distance-based Indexing Method for Genomic Sequences

by Xia Cao, Beng Chin Ooi, Hwee Hwa Pang, Kian-Lee Tan, Anthony K. H. Tung - In Proceedings of the IEEE International Conference on Bioinformatics and Bioengineering (BIBE), 2005
"... In this paper, we propose a Distance-based Sequence Indexing Method (DSIM) for indexing and searching genome databases. Borrowing the idea of video compression, we compress the genomic sequence database around a set of automatically selected reference words, formed from high-frequency data substrin ..."
Cited by 3 (0 self)

Algorithms for Mining Distance-Based Outliers in Large Datasets

by Edwin M. Knorr, Raymond T. Ng , 1998
"... This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers in large datasets can only deal efficiently with two dimensions/attributes of a dataset. Here, we study the notion of DB- (Distance-Based) outliers. While we provide formal and empirical evidence showing the usefulness of DB-outliers, we ..."
Cited by 360 (5 self)

Indexing by latent semantic analysis

by Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman - JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE , 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Cited by 3779 (35 self)

The X-tree: An index structure for high-dimensional data

by Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel - In Proceedings of the Int’l Conference on Very Large Data Bases , 1996
"... In this paper, we propose a new method for indexing large amounts of point and spatial data in high-dimensional space. An analysis shows that index structures such as the R*-tree are not adequate for indexing high-dimensional data sets. The major problem of R-tree-based index structures is the over ..."
Cited by 592 (17 self)

A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood

by Stéphane Guindon, Olivier Gascuel , 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximum-likelihood principle, which clearly satisfies these requirements. The core of this method is a simple hill-climbing algorithm that adjusts tree topology and branch lengths simultaneously. This algorithm starts from an initial tree built by a fast distance-based method and modifies this tree to improve its likelihood at each iteration. Due to this simultaneous adjustment ..."
Cited by 2182 (27 self)

Managing Gigabytes: Compressing and Indexing Documents and Images - Errata

by I. H. Witten, A. Moffat, T. C. Bell , 1996
"... → "GZip"; page 64, Table 2.5, line "progp": "43,379" → "49,379"; page 68, Table 2.6: "Mbyte/sec" → "Mbyte/min" twice in the body of the table, and in the caption "Mbyte/second" → "Mbyte/minute"; page 70, para 4, line 5: "Santos" → "Santis"; page 71, line 11: "Fiala and Greene (1989)" → "Fiala and Green (1989)"; Chapter Three: page 89, para starting "Using this method", line 2: "hapax legomena" → "hapax legomenon"; page 96, line 5: ..."
Cited by 978 (48 self)

Quantization Index Modulation: A Class of Provably Good Methods for Digital Watermarking and Information Embedding

by Brian Chen, Gregory W. Wornell - IEEE TRANS. ON INFORMATION THEORY , 1999
"... We consider the problem of embedding one signal (e.g., a digital watermark), within another "host" signal to form a third, "composite" signal. The embedding is designed to achieve efficient tradeoffs among the three conflicting goals of maximizing information-embedding rate, minimizing distortion between the host signal and composite signal, and maximizing the robustness of the embedding. We introduce new classes of embedding methods, termed quantization index modulation (QIM) and distortion-compensated QIM (DC-QIM), and develop convenient realizations in the form of what we ..."
Cited by 496 (14 self)

R-trees: A Dynamic Index Structure for Spatial Searching

by Antonin Guttman - INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA , 1984
"... In order to handle spatial data efficiently, as required in computer aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations. However, traditional indexing methods are not well suited to data ..."
Cited by 2750 (0 self)

Index-driven similarity search in metric spaces

by Gisli R. Hjaltason, Hanan Samet - ACM Transactions on Database Systems , 2003
"... Similarity search is a very important operation in multimedia databases and other database applications involving complex objects, and involves finding objects in a data set S similar to a query object q, based on some similarity measure. In this article, we focus on methods for similarity search that make the general assumption that similarity is represented with a distance metric d. Existing methods for handling similarity search in this setting typically fall into one of two classes. The first directly indexes the objects based on distances (distance-based indexing), while the second is based ..."
Cited by 192 (8 self)