Results 1  10
of
3,924
A comparison of string distance metrics for namematching tasks
, 2003
"... Using an opensource, Java toolkit of namematching methods, we experimentally compare string distance metrics on the task of matching entity names. We investigate a number of different metrics proposed by different communities, including editdistance metrics, fast heuristic string comparators, tok ..."
Abstract

Cited by 446 (11 self)
 Add to MetaCart
, tokenbased distance metrics, and hybrid methods. Overall, the bestperforming method is a hybrid scheme combining a TFIDF weighting scheme, which is widely used in information retrieval, with the JaroWinkler stringdistance scheme, which was developed in the probabilistic record linkage community.
Geodesic Active Contours
, 1997
"... A novel scheme for the detection of object boundaries is presented. The technique is based on active contours evolving in time according to intrinsic geometric measures of the image. The evolving contours naturally split and merge, allowing the simultaneous detection of several objects and both in ..."
Abstract

Cited by 1425 (47 self)
 Add to MetaCart
A novel scheme for the detection of object boundaries is presented. The technique is based on active contours evolving in time according to intrinsic geometric measures of the image. The evolving contours naturally split and merge, allowing the simultaneous detection of several objects and both
A comparison of mechanisms for improving TCP performance over wireless links
 IEEE/ACM TRANSACTIONS ON NETWORKING
, 1997
"... Reliable transport protocols such as TCP are tuned to perform well in traditional networks where packet losses occur mostly because of congestion. However, networks with wireless and other lossy links also suffer from significant losses due to bit errors and handoffs. TCP responds to all losses by i ..."
Abstract

Cited by 927 (11 self)
 Add to MetaCart
by invoking congestion control and avoidance algorithms, resulting in degraded endtoend performance in wireless and lossy systems. In this paper, we compare several schemes designed to improve the performance of TCP in such networks. We classify these schemes into three broad categories: end
Proof verification and hardness of approximation problems
 IN PROC. 33RD ANN. IEEE SYMP. ON FOUND. OF COMP. SCI
, 1992
"... We show that every language in NP has a probablistic verifier that checks membership proofs for it using logarithmic number of random bits and by examining a constant number of bits in the proof. If a string is in the language, then there exists a proof such that the verifier accepts with probabilit ..."
Abstract

Cited by 797 (39 self)
 Add to MetaCart
in the proof (though this number is a very slowly growing function of the input length). As a consequence we prove that no MAX SNPhard problem has a polynomial time approximation scheme, unless NP=P. The class MAX SNP was defined by Papadimitriou and Yannakakis [82] and hard problems for this class include
Similarity search in high dimensions via hashing
, 1999
"... The nearest or nearneighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over highdimensional data, e.g., image dat ..."
Abstract

Cited by 641 (10 self)
 Add to MetaCart
to 20, searching in kd trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than bruteforce linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
, 1997
"... The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorithm is presented in a text categorization framework. The analysis gives theoretical insight into the heuristics used in the ..."
Abstract

Cited by 456 (1 self)
 Add to MetaCart
in the Rocchio algorithm, particularly the word weighting scheme and the similarity metric. It also suggests improvements which lead to a probabilistic variant of the Rocchio classifier. The Rocchio classifier, its probabilistic variant, and a naive Bayes classifier are compared on six text categorization tasks
Similarity estimation techniques from rounding algorithms
 In Proc. of 34th STOC
, 2002
"... A locality sensitive hashing scheme is a distribution on a family F of hash functions operating on a collection of objects, such that for two objects x, y, Prh∈F[h(x) = h(y)] = sim(x,y), where sim(x,y) ∈ [0, 1] is some similarity function defined on the collection of objects. Such a scheme leads ..."
Abstract

Cited by 449 (6 self)
 Add to MetaCart
A locality sensitive hashing scheme is a distribution on a family F of hash functions operating on a collection of objects, such that for two objects x, y, Prh∈F[h(x) = h(y)] = sim(x,y), where sim(x,y) ∈ [0, 1] is some similarity function defined on the collection of objects. Such a scheme leads
A Fuzzy Commitment Scheme
 ACM CCS'99
, 1999
"... We combine wellknown techniques from the areas of errorcorrecting codes and cryptography to achieve a new type of cryptographic primitive that we refer to as a fuzzy commitment scheme. Like a conventional cryptographic commitment scheme, our fuzzy commitment scheme is both concealing and binding: i ..."
Abstract

Cited by 344 (1 self)
 Add to MetaCart
that it accepts a witness that is close to the original encrypting witness in a suitable metric, but not necessarily identical. This characteristic of our fuzzy commitment scheme makes it useful for applications such as biometric authentication systems, in which data is subject to random noise. Because the scheme
Efficient Crawling Through URL Ordering
 COMPUTER NETWORKS AND ISDN SYSTEMS
, 1998
"... In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more “important” pages first. Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. We define several importance metrics, ord ..."
Abstract

Cited by 384 (9 self)
 Add to MetaCart
In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more “important” pages first. Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. We define several importance metrics
Fuzzy identitybased encryption
 In EUROCRYPT
, 2005
"... We introduce a new type of IdentityBased Encryption (IBE) scheme that we call Fuzzy IdentityBased Encryption. In Fuzzy IBE we view an identity as set of descriptive attributes. A Fuzzy IBE scheme allows for a private key for an identity, ω, to decrypt a ciphertext encrypted with an identity, ω ′ , ..."
Abstract

Cited by 377 (20 self)
 Add to MetaCart
′ , if and only if the identities ω and ω ′ are close to each other as measured by the “set overlap ” distance metric. A Fuzzy IBE scheme can be applied to enable encryption using biometric inputs as identities; the errortolerance property of a Fuzzy IBE scheme is precisely what allows for the use of biometric
Results 1  10
of
3,924