## Distance Measures for Point Sets and Their Computation (1997)

Venue: Acta Informatica

Citations: 50 - 2 self

### BibTeX

@ARTICLE{Eiter97distancemeasures,
    author = {Thomas Eiter and Heikki Mannila},
    title = {Distance Measures for Point Sets and Their Computation},
    journal = {Acta Informatica},
    year = {1997},
    volume = {34},
    pages = {103--133}
}

### Abstract

We consider the problem of measuring the similarity or distance between two finite sets of points in a metric space, and computing the measure. This problem has applications in, e.g., computational geometry, philosophy of science, updating or changing theories, and machine learning. We review some of the distance functions proposed in the literature, among them the minimum distance link measure, the surjection measure, and the fair surjection measure, and supply polynomial time algorithms for the computation of these measures. Furthermore, we introduce the minimum link measure, a new distance function which is more appealing than the other distance functions mentioned. We also present a polynomial time algorithm for computing this new measure. We further address the issue of defining a metric on point sets. We present the metric infimum method that constructs a metric from any distance function on point sets. In particular, the metric infimum of the minimum link measure is a quite int...

### Citations

10887 |
Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, Johnson
- 1979
Citation Context ...ng $d^{\omega}_l$ is NP-hard. On the other hand, our results imply that for a broad class of instances, computing $d^{\omega}_l$ is NP-easy, i.e., possible in polynomial time with an oracle for some problem in NP (cf. **[8]**), and hence not much harder than the NP-complete problems. The rest of the paper is organized as follows. After introducing some notation at the end of the introduction, we review in Section 2 me...

707 |
Cluster analysis for applications
- Anderberg
- 1973
Citation Context ... = $\Delta(x, y)$ for all $x$ and $y$ from $B$. (2) Algorithms for computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis **[2, 20]**, computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metri...

496 |
Graphs and Hypergraphs
- Berge
- 1973
Citation Context ... all $x$ and nonempty $S$ yields a closest point of $S$ for $x$, i.e. $(x, S) = y$, such that $y \in S$ and $\Delta(x, y) = \Delta_m(x, S)$. We assume that the reader knows about basic concepts of graph theory (cf. **[3, 24]**) and NP-completeness (cf. [8]). Let the union of a family of graphs $\{G_i = (V_i, E_i) : 1 \le i \le n\}$ be the graph $(\bigcup_i V_i, \bigcup_i E_i)$. The union is disjoint if $V_i \cap V_j = \emptyset$, $1 \le i < j \le n$. In case that th...

304 |
Knowledge in Flux
- Gärdenfors
- 1988
Citation Context ...l time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories **[6, 25]**, arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min$...

120 |
Machine Learning: A Theoretical Approach
- Natarajan
- 1992
Citation Context ..., such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning **[12, 11, 26]**. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min_{f \in S_2} \Delta(e, f),\ \max_{e \in S_2} \min_{f \in S_1} \Delta(e, f)\}$. This metric is...

115 |
Updating Logical Databases
- Winslett
- 1990
Citation Context ...l time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories **[6, 25]**, arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min$...

93 |
Congruence, similarity and symmetries of geometric objects. Discrete Comput. Geom
- Alt, Mehlhorn, et al.
- 1988
Citation Context ...from $B$. (2) Algorithms for computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry **[1]**, philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metri...

83 | Some NP-complete geometric problems
- Garey, Graham, et al.
- 1976
Citation Context ...st flows states the following. Footnote 5: This assumption is stricter than necessary, but avoids problems arising if the values of $\Delta$ are possibly difficult to compare to a number or among each other (cf. **[7]**). [Figure not recoverable; surviving labels: $x_1, x_2, x_3, u_1, u_2, v_1, s'_1, s'_2$, and $\Delta(s, s')$] ...

83 | Pruning and grouping discovered association rules
- Toivonen, Klemettinen, et al.
Citation Context ...rs from different rules discovered from the data. The rules can be identified with the data points to which they apply, and then rule distance can be computed by using a distance function for subsets **[21]**. Particularly appealing among the distance functions between point sets that have been proposed in philosophy of science are the minimum distance measure, $d_{md}$, the surjection measure, $d_s$, and the ...

54 | On the semantics of theory change: arbitration between old and new information
- Revesz
- 1993
Citation Context ...type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories **[19]**, and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min$ ...

50 | Graph algorithms
- van Leeuwen
- 1990
Citation Context ... all $x$ and nonempty $S$ yields a closest point of $S$ for $x$, i.e. $(x, S) = y$, such that $y \in S$ and $\Delta(x, y) = \Delta_m(x, S)$. We assume that the reader knows about basic concepts of graph theory (cf. **[3, 24]**) and NP-completeness (cf. [8]). Let the union of a family of graphs $\{G_i = (V_i, E_i) : 1 \le i \le n\}$ be the graph $(\bigcup_i V_i, \bigcup_i E_i)$. The union is disjoint if $V_i \cap V_j = \emptyset$, $1 \le i < j \le n$. In case that th...

35 |
Inductive Acquisition of Expert Knowledge
- Muggleton
- 1990
Citation Context ..., such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning **[12, 11, 26]**. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min_{f \in S_2} \Delta(e, f),\ \max_{e \in S_2} \min_{f \in S_1} \Delta(e, f)\}$. This metric is...

27 |
Knowledge Discovery in Databases
- Piatetsky-Shapiro, Frawley (editors)
- 1991
Citation Context ...e third testimony will naturally decrease with growing distance to the other testimonies. Another use of such distance functions between subsets is in the new area of knowledge discovery in databases **[5, 16]**, where one often has to choose between or form clusters from different rules discovered from the data. The rules can be identified with the data points to which they apply, and then rule distance can...

19 | On the proper definition of minimality in specialization and theory revision
- Wrobel
- 1993
Citation Context ..., such as cluster analysis [2, 20], computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning **[12, 11, 26]**. The best-known metric between subsets of a metric space is the Hausdorff metric, defined as $d_h(S_1, S_2) = \max\{\max_{e \in S_1} \min_{f \in S_2} \Delta(e, f),\ \max_{e \in S_2} \min_{f \in S_1} \Delta(e, f)\}$. This metric is...

15 |
Efficiently computing the Hausdorff distance for point sets under translation
- Huttenlocher, Kedem
- 1990
Citation Context ... ; (B), which is a metric if $\Delta$ is a metric ([4]). The problem of computing the Hausdorff distance between geometric entities has been considered in the area of computational geometry (see, e.g., **[9]**). The problem with $d_h$ is that it is very sensitive to extreme points in the sets $S_1$ and $S_2$. (See Figure 1.) Figure 1: Sets $\{a, b, c, d, e\}$ and $\{a\}$ are equally distant fr...

15 |
Graph algorithms and NP-completeness, volume 2 of Data structures and algorithms
- Mehlhorn
- 1984
Citation Context ...f $\bigcup M = V$. The weight $w(M)$ of $M$ is $w(M) = \sum_{e \in M} w(e)$. It is well known that a perfect matching of minimum weight (hence, by our assumptions, also its weight) in $G$ can be computed in polynomial time (cf. **[10]**). Specialized algorithms have been developed for bipartite graphs and for other graph classes. Proposition 4.1 (cf. [10, p. 93, Theorem 14]) Let $G = (V_1 \cup V_2, E, w)$ be a bipartite weighted graph with...

14 |
Likeness to Truth
- Oddie
- 1986
Citation Context ...computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science **[17, 18, 13, 15]**, updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined...

5 |
A note on verisimilitude
- Popper
- 1976
Citation Context ...computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science **[17, 18, 13, 15]**, updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined...

2 |
Verisimilitude and distance in logical space
- Oddie
- 1978
Citation Context ... $(a, c), (b, d), (e, c), (c, d)\}$ is optimal and $c(j') = 2 + 2 + 1 + 1 = 6$. $d_l(S_1, S_2) = 5$; the linking $L = \{(a, c), (b, c), (e, c), (c, d)\}$ is optimal and $c(L) = 1 + 1 + 1 + 2 = 5$. Footnote 2: Oddie **[14]** actually considered only the version normalized by the size of the larger set. Figure 2: Examples of the distance measures; the distance from $a$ to $b$ is the unit distance. The fo...

2 |
Some Comments on Truth and the Growth of Knowledge
- Popper
- 1962
Citation Context ...computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis [2, 20], computational geometry [1], philosophy of science **[17, 18, 13, 15]**, updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metric between subsets of a metric space is the Hausdorff metric, defined...

1 |
Cluster Analysis for Researchers
- Romesburg
- 1984
Citation Context ... = $\Delta(x, y)$ for all $x$ and $y$ from $B$. (2) Algorithms for computing such extensions, which run in polynomial time if possible. Questions of this type arise in several areas, such as cluster analysis **[2, 20]**, computational geometry [1], philosophy of science [17, 18, 13, 15], updating and revising theories [6, 25], arbitration between theories [19], and machine learning [12, 11, 26]. The best-known metri...

1 |
Verisimilitude and Theory-Distance
- Tuomela
- 1978
Citation Context ... organized as follows. After introducing some notation at the end of the introduction, we review in Section 2 measures for the distance of point sets derived from measures of theory distance from **[13, 22, 14]**. Furthermore, in this section we define the new minimum link measure $d_l$ and compare it to the distance measures $d_{md}$, $d_s$, and $d_{fs}$. In Section 3 we analyze the structural properties of $d_s$, $d_{fs}$...

1 |
Geometry Helps in Matching
- Vaidya
- 1988
Citation Context ... $|E| = 2nm + n + m$). Let us remark at the end of this section that in case $\Delta$ is a metric on $B$, the efficiency of the proposed algorithms might be improved by using metric matching techniques (cf. **[23]**). The network flow and matching problems constructed in this section involve additional vertices to which the underlying metric space $B$ does not extend; thus, the constructions would have to be suita...
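
The Hausdorff metric quoted in several of the citation contexts above (the maximum, over both directions, of the largest distance from a point of one set to its closest point in the other) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the names `hausdorff`, `directed`, and `abs_diff`, and the sample coordinates on the real line, are hypothetical and chosen here for the example.

```python
from typing import Callable, Iterable, TypeVar

P = TypeVar("P")


def hausdorff(s1: Iterable[P], s2: Iterable[P],
              delta: Callable[[P, P], float]) -> float:
    """Hausdorff distance between two finite nonempty point sets.

    Hypothetical helper: `delta` is the underlying metric on single points.
    """
    a, b = list(s1), list(s2)
    if not a or not b:
        raise ValueError("both point sets must be nonempty")

    def directed(xs, ys):
        # Largest distance from a point of xs to its closest point in ys.
        return max(min(delta(x, y) for y in ys) for x in xs)

    return max(directed(a, b), directed(b, a))


def abs_diff(x: float, y: float) -> float:
    # Assumed point metric for the example: distance on the real line.
    return abs(x - y)


# Assumed coordinates, chosen only for illustration.
far_point = [10.0]
print(hausdorff([0.0, 1.0, 2.0, 3.0, 4.0], far_point, abs_diff))  # 10.0
print(hausdorff([0.0], far_point, abs_diff))                      # 10.0
```

Both calls return the same value because the single point at 0 determines the result in each case, which is one way to see the sensitivity to extreme points that the citation context for [9] attributes to the Hausdorff metric.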