Results 1–10 of 11
Bayesian graph edit distance
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
Abstract

Cited by 53 (5 self)
This paper describes a novel framework for comparing and matching corrupted relational graphs. The paper develops the idea of edit distance originally introduced for graph matching by Sanfeliu and Fu [1]. We show how the Levenshtein distance can be used to model the probability distribution for structural errors in the graph-matching problem. This probability distribution is used to locate matches using MAP label updates. We compare the resulting graph-matching algorithm with that recently reported by Wilson and Hancock. The use of edit distance offers an elegant alternative to the exhaustive compilation of label dictionaries. Moreover, the method is polynomial rather than exponential in its worst-case complexity. We support our approach with an experimental study on synthetic data and illustrate its effectiveness on an uncalibrated stereo correspondence problem. This demonstrates experimentally that the gain in efficiency is not at the expense of quality of match.
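The Levenshtein distance underpinning this framework can be computed with the classic dynamic program. Below is a minimal Python sketch on plain strings; the paper applies the idea to structural errors in relational graphs, which this sketch does not attempt to model:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance: the minimum
    number of insertions, deletions, and substitutions turning a into b."""
    m, n = len(a), len(b)
    # prev[j] holds the distance between a[:i-1] and b[:j]; rolled row by row.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,                          # delete a[i-1]
                         cur[j - 1] + 1,                       # insert b[j-1]
                         prev[j - 1] + (a[i - 1] != b[j - 1])) # substitute/match
        prev = cur
    return prev[n]

print(levenshtein("kitten", "sitting"))  # 3
```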
Edit distance from graph spectra
In Proc. 9th IEEE Int. Conf. Comp. Vis., 2003
Abstract

Cited by 10 (2 self)
This paper is concerned with computing graph edit distance. One criticism that can be leveled at existing methods for computing graph edit distance is that they lack the formality and rigour of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that standard string edit distance techniques can be used. To do this we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We pose the problem of graph matching as maximum a posteriori probability alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression for the edit costs. We compute the edit distance by finding the sequence of string edit operations which minimises the cost of the path traversing the edit lattice. The edit costs are defined in terms of the a posteriori probability of visiting a site on the lattice. We demonstrate the method with results on a dataset of Delaunay graphs.
Efficient algorithms for normalized edit distance
Journal of Discrete Algorithms, 2000
Abstract

Cited by 8 (0 self)
A common model for computing the similarity of two strings X and Y, of lengths m and n respectively with m ≥ n, is to transform X into Y through a sequence of edit operations, called an edit sequence. The edit operations are of three types: insertion, deletion, and substitution. A given cost function assigns a weight to each edit operation. The amortized weight for a given edit sequence is the ratio of its weight to its length, and the minimum of this ratio over all edit sequences is the normalized edit distance. Existing algorithms for normalized edit distance computation with proven complexity bounds require O(mn²) time in the worst case. We give provably better algorithms: an O(mn log n)-time algorithm when the cost function is uniform, i.e., the weights of edit operations depend only on the type but not on the individual symbols involved, and an O(mn log m)-time algorithm when the weights are rational.
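The normalized edit distance defined here (the minimum ratio of total weight to edit-sequence length) can be made concrete with a naive dynamic program indexed by path length, in which matches also count as operations. This is an O(mn(m+n)) sketch of the definition, not the paper's faster algorithms, and the uniform default weights are illustrative:

```python
from math import inf

def normalized_edit_distance(x, y, w_ins=1.0, w_del=1.0, w_sub=1.0, w_match=0.0):
    """Normalized edit distance: minimum over all edit sequences of
    (total weight) / (number of operations). Direct O(mn(m+n)) dynamic
    program indexed by path length -- the definition made executable."""
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return 0.0
    L = m + n  # an edit sequence uses between max(m, n) and m + n operations
    # dp[k][i][j]: min weight turning x[:i] into y[:j] with exactly k operations
    dp = [[[inf] * (n + 1) for _ in range(m + 1)] for _ in range(L + 1)]
    dp[0][0][0] = 0.0
    for k in range(1, L + 1):
        for i in range(m + 1):
            for j in range(n + 1):
                best = inf
                if i > 0:                 # delete x[i-1]
                    best = min(best, dp[k - 1][i - 1][j] + w_del)
                if j > 0:                 # insert y[j-1]
                    best = min(best, dp[k - 1][i][j - 1] + w_ins)
                if i > 0 and j > 0:       # substitute (or match) x[i-1] -> y[j-1]
                    w = w_match if x[i - 1] == y[j - 1] else w_sub
                    best = min(best, dp[k - 1][i - 1][j - 1] + w)
                dp[k][i][j] = best
    return min(dp[k][m][n] / k for k in range(1, L + 1) if dp[k][m][n] < inf)
```

For example, turning "ab" into "a" takes one zero-weight match plus one unit-weight deletion, giving a normalized distance of 1/2 rather than the plain edit distance of 1.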
A Formal Theory for Optimal and Information Theoretic Syntactic Pattern Recognition
Abstract

Cited by 4 (2 self)
In this paper we present a foundational basis for optimal and information theoretic syntactic pattern recognition. We do this by developing a rigorous model, M*, for channels which permit arbitrarily distributed substitution, deletion and insertion syntactic errors. More explicitly, if A is any finite alphabet and A* the set of words over A, we specify a stochastically consistent scheme by which a string U ∈ A* can be transformed into any Y ∈ A* by means of arbitrarily distributed substitution, deletion and insertion operations. The scheme is shown to be Functionally Complete and stochastically consistent. Apart from the synthesis aspects, we also deal with the analysis of such a model and derive a technique by which Pr[Y|U], the probability of receiving Y given that U was transmitted, can be computed in cubic time using dynamic programming. One of the salient features of this scheme is that it demonstrates how dynamic programming can be applied to evaluate quantities involv...
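A drastically simplified channel can illustrate how a likelihood Pr[Y|U] is accumulated by dynamic programming. The sketch below assumes a memoryless channel with fixed deletion, insertion, and transmission probabilities; the parameter names (p_del, p_ins, p_sub, p_insym) are illustrative and this is not the paper's model M*, whose greater generality is what leads to the cubic-time analysis. This restricted channel admits a quadratic DP:

```python
def channel_likelihood(u: str, y: str, p_del: float, p_ins: float,
                       p_sub: dict, p_insym: dict) -> float:
    """dp[i][j] = probability that the channel turns u[:i] into y[:j].
    Each step either deletes the next input symbol (prob p_del), inserts
    a symbol c (prob p_ins * p_insym[c]), or transmits the next input
    symbol a as c (prob (1 - p_del - p_ins) * p_sub[(a, c)]).
    Simplified memoryless channel for illustration only."""
    m, n = len(u), len(y)
    p_tx = 1.0 - p_del - p_ins
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    dp[0][0] = 1.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            p = 0.0
            if i > 0:                       # u[i-1] was deleted
                p += p_del * dp[i - 1][j]
            if j > 0:                       # y[j-1] was inserted
                p += p_ins * p_insym.get(y[j - 1], 0.0) * dp[i][j - 1]
            if i > 0 and j > 0:             # u[i-1] transmitted as y[j-1]
                p += p_tx * p_sub.get((u[i - 1], y[j - 1]), 0.0) * dp[i - 1][j - 1]
            dp[i][j] = p
    return dp[m][n]
```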
An Efficient UniformCost Normalized Edit Distance Algorithm
6th Symp. on String Processing and Info. Retrieval, 1999
Abstract

Cited by 4 (1 self)
A common model for computing the similarity of two strings X and Y, of lengths m and n respectively with m ≥ n, is to transform X into Y through a sequence of three types of edit operations: insertion, deletion, and substitution. The model assumes a given cost function which assigns a non-negative real weight to each edit operation. The amortized weight for a given edit sequence is the ratio of its weight to its length, and the minimum of this ratio over all edit sequences is the normalized edit distance. Existing algorithms for normalized edit distance computation with proven complexity bounds require O(mn ...
Learning Significant Alignments: An Alternative to Normalized Local Alignment
Abstract

Cited by 1 (0 self)
We describe a supervised learning approach to resolve difficulties in finding biologically significant local alignments. It was noticed that the O(n²) algorithm by Smith-Waterman, the prevalent tool for computing local sequence alignment, often outputs long, meaningless alignments while ignoring shorter, biologically significant ones. Arslan et al. proposed an O(n² log n) algorithm which outputs a normalized local alignment that maximizes the degree of similarity rather than the total similarity score. Given a properly selected normalization parameter, the algorithm can discover significant alignments that would be missed by the Smith-Waterman algorithm. Unfortunately, determining a proper normalization parameter requires repeated executions with different parameter values and expert feedback to determine the usefulness of the alignments. We propose a learning approach that uses existing biologically significant alignments to learn parameters for intelligently processing suboptimal Smith-Waterman alignments. Our algorithm runs in O(n²) time and can discover biologically significant alignments without requiring expert feedback to produce meaningful results.
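For reference, the O(n²) Smith-Waterman local alignment score discussed in this abstract can be sketched as follows; the scoring parameters (match=2, mismatch=-1, gap=-1) are illustrative defaults, not values from the paper:

```python
def smith_waterman(a: str, b: str, match=2, mismatch=-1, gap=-1) -> int:
    """Smith-Waterman local alignment score: the best score over all
    pairs of substrings of a and b. Cells are clamped at zero, so an
    alignment can start and end anywhere. O(n^2) time and space."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            best = max(best, H[i][j])
    return best
```

The zero-clamping is exactly what lets long, weak alignments dominate the score, which is the behaviour the normalized and learned variants above aim to correct.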
Normalized Sum-over-Paths Edit Distances
2010 International Conference on Pattern Recognition
Abstract
 Add to MetaCart
In this paper, normalized SoP string edit distances, taking into account all possible alignments between two sequences, are investigated. These normalized distances are variants of the Sum-over-Paths (SoP) distances, which compute the expected cost over all sequence alignments while favoring low-cost ones, and therefore favoring good alignments. Such distances consider two sequences tied by many optimal or nearly-optimal alignments as more similar than two sequences sharing only one, optimal, alignment. They depend on a parameter, θ, and reduce to the standard distances (the edit distance or the longest common subsequence) when θ → 0, while having the same time complexity. This paper puts the emphasis on applying some type of normalization to the expectation of the cost. Experimental results for clustering and classification tasks performed on four OCR data sets show that (i) the applied normalization generally improves the existing results, and (ii) as for the SoP edit distances, the normalized SoP edit distances clearly outperform the non-randomized measures, i.e. the standard edit distance and longest common subsequence.
Keywords: edit distance; longest common subsequence; randomized shortest paths; normalization.
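The θ → 0 behaviour described in this abstract can be illustrated with a closely related "softmin" edit distance, in which the min of the standard recursion is replaced by a smoothed log-sum-exp so that all low-cost alignments contribute. This is only a sketch inspired by the Sum-over-Paths idea, not the paper's exact normalized SoP distance (which computes a Gibbs expectation of the alignment cost):

```python
from math import exp, log

def softmin(values, theta):
    """Smoothed minimum: -theta * log(sum(exp(-v / theta))).
    Tends to min(values) as theta -> 0."""
    m = min(values)
    return m - theta * log(sum(exp(-(v - m) / theta) for v in values))

def soft_edit_distance(x: str, y: str, theta: float = 0.1, cost: float = 1.0):
    """Standard edit-distance recursion with min replaced by softmin, so
    sequences tied by many good alignments score lower (more similar)
    than sequences with a single good alignment."""
    m, n = len(x), len(y)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + cost
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0.0 if x[i - 1] == y[j - 1] else cost
            dp[i][j] = softmin([dp[i - 1][j] + cost,       # deletion
                                dp[i][j - 1] + cost,       # insertion
                                dp[i - 1][j - 1] + sub],   # substitution/match
                               theta)
    return dp[m][n]
```

With a small θ the value approaches the plain edit distance, matching the limiting behaviour the abstract states for the SoP family.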
An Efficient UniformCost Normalized Edit Distance Algorithm
Abstract
A common model for computing the similarity of two strings X and Y, of lengths m and n respectively with m ≥ n, is to transform X into Y through a sequence of three types of edit operations: insertion, deletion, and substitution. The model assumes a given cost function which assigns a non-negative real weight to each edit operation. The amortized weight for a given edit sequence is the ratio of its weight to its length, and the minimum of this ratio over all edit sequences is the normalized edit distance. Existing algorithms for normalized edit distance computation with proven complexity bounds require O(mn²) time in the worst case. We give an O(mn log n)-time algorithm for the problem when the cost function is uniform, i.e., the weight of each edit operation is constant within the same type, except substitutions can have different weights depending on whether they are matching or non-matching.
unknown title
Abstract
 Add to MetaCart
Since its inception by Agrawal and Srikant in 1995 the field of Sequence Mining has grown both in algorithmic maturity and in the breadth of application areas under consideration. With the amount of available data increasing at an exponential rate this trend, especially algorithmic development, must continue and indeed be enhanced with
String Edit Distance, Random Walks and Graph Matching
International Journal of Pattern Recognition and Artificial Intelligence, World Scientific Publishing Company
Abstract
 Add to MetaCart
This paper shows how the eigenstructure of the adjacency matrix can be used for the purposes of robust graph matching. We commence from the observation that the leading eigenvector of a transition probability matrix is the steady state of the associated Markov chain. When the transition matrix is the normalised adjacency matrix of a graph, the leading eigenvector gives the sequence of nodes of the steady state random walk on the graph. We use this property to convert the nodes in a graph into a string where the node order is given by the sequence of nodes visited in the random walk. We match graphs represented in this way by finding the sequence of string edit operations which minimise edit distance.
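The node-ordering step described in this abstract can be sketched as follows: take the leading left eigenvector of the row-normalized adjacency matrix as the stationary distribution of the random walk, order nodes by how often the walk visits them, and compare the resulting label strings by edit distance. The helper names are illustrative, and ties between equally visited nodes are broken arbitrarily here:

```python
import numpy as np

def walk_string(adj: np.ndarray, labels: str) -> str:
    """Order nodes by the stationary distribution of the random walk on
    the graph (the leading left eigenvector of the transition matrix)
    and read off node labels in that order, most-visited first."""
    P = adj / adj.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    vals, vecs = np.linalg.eig(P.T)            # left eigenvectors of P
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = np.abs(pi)                            # eigenvector sign is arbitrary
    order = np.argsort(-pi)
    return "".join(labels[i] for i in order)

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance between the two node strings."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (a[i - 1] != b[j - 1]))
    return dp[n]
```

For a path graph a–b–c the stationary distribution is proportional to node degree, so the centre node b heads the string; two graphs are then compared simply via `edit_distance(walk_string(A1, l1), walk_string(A2, l2))`.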