## On Computing Geometric Estimators of Location (2001)

### BibTeX

@MISC{Aloupis01oncomputing,
    author = {Greg Aloupis},
    title = {On Computing Geometric Estimators of Location},
    year = {2001}
}

### Abstract

Let S be a data set of n points in R^d, and consider a point in R^d which "best" describes S. Since the term "best" is subjective, there exist several definitions for finding such a point. However, it is generally agreed that such a definition, or estimator of location, should have certain statistical properties which make it robust. Most estimators of location assign a depth value to any point in R^d and define the estimate to be a point with maximum depth. Here, new results are presented concerning the computational complexity of estimators of location. We prove that in R^2 the computation of simplicial and halfspace depth of a point requires Ω(n log n) time, which matches the upper bound complexities of algorithms by Rousseeuw and Ruts. Our lower bounds also apply to two sign tests, that of Hodges and that of Oja and Nyblom. In addition, we propose algorithms which reduce the time complexity of calculating the points with greatest Oja and simplicial depth. Our fastest algorithms use O(n^3 log n) and O(n^4) time respectively, compared to the algorithms of Rousseeuw and Ruts which use O(n^5 log n) time. One of our algorithms may also be used to find a point with minimum weighted sum of distances to a set of n lines in O(n^2) time. This point is called the Fermat-Torricelli point of n lines by Roy Barbara, whose algorithm uses O(n^3) time. Finally, we propose a new estimator which arises from the notion of hyperplane depth recently defined by Rousseeuw and Hubert.
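To make the two depth notions in the abstract concrete, here is a minimal brute-force sketch in Python for points in the plane. It is not one of the O(n log n) algorithms discussed in the thesis or by Rousseeuw and Ruts. It assumes the usual definitions: simplicial depth counts the closed triangles on data points that contain the query point, and halfspace depth is the minimum number of data points in any closed halfplane whose boundary passes through the query point. The function names are illustrative, and the code assumes data in general position and ordinary floating-point arithmetic.

```python
import math
from itertools import combinations


def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive if counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def simplicial_depth(q, points):
    """Count closed triangles on data points that contain q. Brute force, O(n^3)."""
    depth = 0
    for a, b, c in combinations(points, 3):
        d1, d2, d3 = cross(a, b, q), cross(b, c, q), cross(c, a, q)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        # q is inside or on the boundary when the three signs do not conflict.
        if not (has_neg and has_pos):
            depth += 1
    return depth


def halfspace_depth(q, points):
    """Minimum number of points in a closed halfplane through q. Brute force, O(n^2)."""
    vecs = [(p[0] - q[0], p[1] - q[1]) for p in points]
    at_q = sum(1 for v in vecs if v == (0, 0))  # points equal to q lie in every halfplane
    vecs = [v for v in vecs if v != (0, 0)]
    if not vecs:
        return at_q
    # The count only changes when the halfplane normal is perpendicular to some p - q.
    two_pi = 2 * math.pi
    crit = sorted({(math.atan2(vy, vx) + s * math.pi / 2) % two_pi
                   for vx, vy in vecs for s in (1, -1)})
    best = len(vecs)
    for i, start in enumerate(crit):
        end = crit[i + 1] if i + 1 < len(crit) else crit[0] + two_pi
        mid = (start + end) / 2  # one normal direction strictly between two critical ones
        ux, uy = math.cos(mid), math.sin(mid)
        count = sum(1 for vx, vy in vecs if vx * ux + vy * uy >= 0)
        best = min(best, count)
    return best + at_q


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(simplicial_depth((1, 2), square), halfspace_depth((1, 2), square))
```

For the four corners of a square and the query point (1, 2), two of the four triangles contain the point, while the line x + y = 3 through the point cuts off a single corner, so the sketch reports depths 2 and 1. A point outside the convex hull gets depth 0 under both definitions.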

### Citations

8530 |
Introduction to Algorithms
- Cormen, Leiserson, et al.
- 1990
Citation Context ...lly involves determining the path length size that an algebraic decision tree must have for the tree to be able to produce the desired output for any possible input. For example, it can be shown (see [CLR90]) that any algebraic decision tree which is capable of sorting n real numbers based on comparisons alone must have a path with length Ω(n log n), and therefore any RAM algori... |

1762 |
Computational Geometry: An Introduction
- Preparata, Shamos
- 1985
Citation Context ... f(n), Ω(f(n)) is at least a constant factor of f(n), and Θ(f(n)) is both O(f(n)) and Ω(f(n)). Upper bounds for the time and space used by algorithms will be in the real RAM model of computation (see [PS85]). According to this model, only arithmetic operations (+, −, ×, /) and comparisons are allowed, for real numbers of infinite precision (occasionally this is extended to include certain functions such... |

1130 |
Graph Theory
- Harary
- 1969
Citation Context ...ortest path from the point to the convex hull. In general, the center of a graph has also been considered to be the set of points for which the maximum path length to reach another point is minimized [Har69]. Several other estimators which are based more on statistical methods exist. For example, M-Estimators, L-Estimators and R-Estimators were discussed by Huber [Hub72]. Robust estimators of location ha... |

688 |
Algorithms in Combinatorial Geometry
- Edelsbrunner
- 1987
Citation Context ... a point with depth at least ⌈n/(d+1)⌉. Gil, Steiger and Wigderson [GSW92] stated that the centerpoint could be used as a multivariate median since it coincides with the median in R^1. Edelsbrunner [Ede87] showed that a centerpoint can be found for any data set. The number ⌈n/(d+1)⌉ arises from Helly’s theorem (see [RH99a] for a more detailed account). Cole, Sharir and Yap [CSY87] were able to compute ... |

503 | Computational Geometry in C - O’Rourke - 1998 |

369 | Time bounds for selection - Blum, Floyd, et al. - 1973 |

351 |
Stability in Competition
- Hotelling
- 1929
Citation Context ... the literature under a variety of other names, such as the L1 median [Sma90], or the mediancentre [Gow74]. Some properties of this multivariate median are discussed in section 2.1. In 1929 Hotelling [Hot29] described the univariate median as the point which minimizes the maximum number of data points on one of its sides. This notion was generalized to higher dimensions many years later by Tukey [Tuk75].... |

251 | Geometric range searching and its relatives
- Agarwal, Erickson
- 1999
Citation Context ... and then takes O(k + n^0.695) time for each query. However, in this algorithm if points are only counted the time complexity reduces to O(n^0.695). The total space used is O(n). Agarwal and Erickson [AE98] provide an extensive survey on this problem and similar topics, including a discussion of lower bounds and time-space tradeoffs. Since step 2 of algorithm Simp–Med performs O(n^2) queries, the algor... |

240 |
An efficient algorithm for determining the convex hull of a finite point set
- Graham
- 1972
Citation Context ...ltivariate median (m) via convex hull peeling. Even a naive algorithm for convex hull peeling should not take longer than O(n^2 log n) time in R^2. The convex hull may be computed in O(n log n) time [Gra72]. Matching lower bounds have been found in several models of computation (see [Avi82, Yao81]). In the worst case it is possible that only three data points will be removed in each iteration, which lea... |

216 |
Lower bounds for algebraic computation trees
- Ben-Or
- 1983
Citation Context ...fspace Depth Lower Bound. We show that finding halfspace depth allows us to answer the question of Set Equality, which has an Ω(n log n) lower bound in the algebraic decision tree model of computation [BO83]: • Set Equality: Given two sets A = {a_1, a_2, ..., a_n} and B = {b_1, b_2, ..., b_n}, is A = B? Lemma 3.1 Let S = {s_1, s_2, ..., s_2n} ... |

159 |
Maintenance of Configurations in the Plane
- Overmars, van Leeuwen
- 1981
Citation Context ...hull is computed, the modified algorithm continues with the remaining points. This modification was first proposed by Shamos [Sha76]. Later Overmars and van Leeuwen [OvL81] designed a data structure which maintains the convex hull of a set of points after the insertion/deletion of arbitrary points, with a cost of O(log^2 n) time per insertion/deletion. This provides an ... |

116 |
Multivariate estimation with high breakdown point
- Rousseeuw
- 1985
Citation Context ... Unfortunately, as Hayford was aware, the vector-of-medians depends on the choice of orthogonal directions. You may verify this easily using a set of three non-collinear points. In fact, as Rousseeuw [Rou85] pointed out, this method may yield a median which is outside the convex hull of the data. Mood [Moo41] also proposed a joint distribution of unidimensional medians and used integration to find the ... |

104 |
Constructing arrangements of lines and hyperplanes with applications
- Edelsbrunner, O’Rourke, et al.
- 1986
Citation Context ...secting the point. An arrangement of n lines may be constructed in Θ(n^2) time and space. This result was first obtained by Chazelle, Guibas and Lee [CGL85], and by Edelsbrunner, O’Rourke and Seidel [EOS86]. The proof of this result is described well in [O’R95]. The same algorithm may be used to construct an arrangement of line segments. A nice application of arrangements is for sorting all points about... |

102 |
Mathematics and the Picturing of data
- Tukey
Citation Context ... [Hot29] described the univariate median as the point which minimizes the maximum number of data points on one of its sides. This notion was generalized to higher dimensions many years later by Tukey [Tuk75]. The Tukey median, or halfspace median, is perhaps the most widely studied and used multivariate median in recent years, and is discussed in section 2.2. A very intuitive definition for the univariat... |

93 |
The power of geometric duality
- Chazelle, Guibas, et al.
- 1985
Citation Context ...f interest only when several queries are made. Typically some form of preprocessing takes place which allows each query to be answered in less than the brute-force O(n) time. Chazelle, Guibas and Lee [CGL85] compute the convex layers of the given data set as a preprocessing step (in O(n log n) time using Chazelle’s algorithm mentioned in chapter 2). Then they report every point on one side of a query lin... |

92 |
The Ordering of Multivariate Data
- Barnett
- 1974
Citation Context ...in section 2.2. A very intuitive definition for the univariate median is to continuously remove pairs of extreme data points. A generalization of this notion appeared by Shamos [Sha76] and by Barnett [Bar76], although Shamos stated that the idea originally belongs to Tukey. Convex hull peeling iteratively removes convex hull layers of points until a convex set remains. Convex hull peeling and the related... |

86 |
Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by
- Liu, Parelius, et al.
- 1999
Citation Context ...ical display [MRR+01] and even voting theory [RR99]. Halfspace, hyperplane and simplicial depth are also closely related to regression [RR99]. For a recent account of statistical uses of depth, see [LPS99]. (Footnote 3: A data set is in general position if no three points are collinear. Occasionally this is extended to denote sets where no four points are co-circular.) For a more detailed... |

71 |
The notion of breakdown point
- Donoho, Huber
- 1983
Citation Context ...uhaa [RL91] discussed the breakdown point of various estimators. They gave credit to Hodges [Hod67] and Hampel [Ham68] for introducing the concept. Lopuhaa [Lop92] stated that it was Donoho and Huber [DH83] who suggested the definition given above, which is intended for finite data sets. Donoho and Gasko [DG92] also provided many results concerning the breakdown point, some of which are mentioned in cha... |

67 |
A generalization of Carathéodory's theorem
- Bárány
- 1982
Citation Context ...al position in R^2 there always exists a point contained in at least n^3/27 + O(n^2) open triangles formed by the points. This implies that the depth of the simplicial median in R^2 is Θ(n^3). Bárány [Bár82] showed that in R^d there always exists a point contained in (1/(d+1)^(d+1)) · C(n, d+1) + O(n^d) simplices, where C(n, d+1) is the binomial coefficient. A straightforward method of finding the simplicial... |

67 | Contributions to the theory of robust estimation - Hampel - 1968 |

66 | Big omicron and big omega and big theta - Knuth - 1976 |

65 |
An optimal algorithm for finding segments intersections
- Balaban
- 1995
Citation Context ...tion point we can find the simplicial median. If a set of n line segments has k intersection points, they can be reported in O(n log n + k) time and O(n) space with a line sweeping technique of Balaban [Bal95]. In the case of simplicial depth, we have O(n^2) line segments formed between pairs of data points, and unfortunately k is Θ(n^4) [SW94]. Thus the algorithm of Balaban... |

63 |
Breakdown properties of location estimates based on halfspace depth and projected outlyingness
- Donoho, Gasko
- 1992
Citation Context ...mpel [Ham68] for introducing the concept. Lopuhaa [Lop92] stated that it was Donoho and Huber [DH83] who suggested the definition given above, which is intended for finite data sets. Donoho and Gasko [DG92] also provided many results concerning the breakdown point, some of which are mentioned in chapter 2. In R^1 (in a univariate data set) the median has a breakdown point of 1/2 and the mean has a break... |

62 |
On k-hulls and related problems
- Cole, Sharir, et al.
- 1987
Citation Context ...in R^1. Edelsbrunner [Ede87] showed that a centerpoint can be found for any data set. The number ⌈n/(d+1)⌉ arises from Helly’s theorem (see [RH99a] for a more detailed account). Cole, Sharir and Yap [CSY87] were able to compute the centerpoint in O(n log^5 n) time. Matoušek improved this result by setting k equal to ⌈n/(d+1)⌉ in his algorithm, thus obtaining the centerpoint in O(n log^4 n) time. Finall... |

61 |
On the identification of the convex hull of a finite set of points in the plane
- Jarvis
- 1973
Citation Context ...nts will be removed in each iteration, which leads to O(n) convex hull calculations. The task can be done easily in O(n^2) time by slightly modifying the Jarvis “gift-wrapping” convex hull algorithm [Jar73]. This algorithm takes O(hn) time to compute the convex hull, where h is the number of vertices on the hull. Once the hull is computed, the modified algorithm continues with the remaining points. This... |

59 |
Breakdown Properties of Multivariate Location Estimators
- Donoho
- 1982
Citation Context ...mations and has a 50% breakdown point for a data set in general position. Rousseeuw also described another estimator with the same properties, introduced independently by Stahel [Sta81] and Donoho [Don82]. Their method involves finding a projection for each data point x in which x is most outlying. They then compute a weighted mean based on the results. Toussaint and Poulsen [TP79] proposed successive... |

56 |
A Survey of Multidimensional Medians
- Small
- 1990
Citation Context ...inity in order to force the median to do the same. [Figure 1.1: The mean (m) is not a robust estimator.] This suggests a measure of robustness for estimators of location [Sma90]: • The breakdown point is the proportion of data which must be moved to infinity so that the estimator will do the same. Rousseeuw and Lopuhaa [RL91] discussed the breakdown point of various estimato... |

55 | On the convex layers of a planar set
- Chazelle
- 1985
Citation Context ...of points after the insertion/deletion of arbitrary points, with a cost of O(log^2 n) time per insertion/deletion. This provides an O(n log^2 n) time method for convex hull peeling. Finally, Chazelle [Cha85] improved this result by ignoring insertions and taking advantage of the structure in the sequence of deletions in convex hull peeling. Chazelle’s algorithm uses O(n log n) time to compute all convex ... |

54 |
Descriptive statistics for multivariate distributions
- Oja
- 1983
Citation Context ...ex hull peeling iteratively removes convex hull layers of points until a convex set remains. Convex hull peeling and the related method of ellipsoid peeling are discussed in section 2.3. In 1983, Oja [Oja83] introduced a definition for the multivariate median which generalizes the notion that the univariate median is the point with minimum sum of distances to all data points. However, Oja measured distan... |

43 | Regression depth
- Rousseeuw, Hubert
- 1999
Citation Context ...iger do not count this plane as crossing a ray from the query point. They also provided a matching lower bound and mentioned previous results such as an O(n^3) time algorithm by Rousseeuw and Hubert [RH99b] and an O(n log^2 n) time algorithm by van Kreveld et al. [vKMR+99]. Amenta et al. [ABET00] proposed an O(n^d) time algorithm which constructs the arrangement of the n hyperplanes and ... |

35 |
Robust statistics: a review
- Huber
- 1972
Citation Context ... reach another point is minimized [Har69]. Several other estimators which are based more on statistical methods exist. For example, M-Estimators, L-Estimators and R-Estimators were discussed by Huber [Hub72]. Robust estimators of location have been used for data description, multivariate confidence regions, p-values, quality indices, and control charts (see [RR96]). Applications of depth include hypothes... |

32 |
Estimation of correlation coefficients by ellipsoidal trimming
- Titterington
- 1978
Citation Context ...ions in convex hull peeling. Chazelle’s algorithm uses O(n log n) time to compute all convex layers and the depth of any point. A technique similar to convex hull peeling was proposed by Titterington [Tit78]. He proposed iteratively peeling minimum volume ellipsoids containing the data set. Both methods of peeling data can have very low breakdown points. Donoho and Gasko [DG92] proved that the breakdown ... |

31 |
Decision trees and random access machines
- Paul, Simon
- 1980
Citation Context ... RAM algorithm restricted to comparisons must take Ω(n log n) time in the worst case to sort n real numbers. For a discussion on the connection between the RAM and algebraic decision tree models, see [PS80]. The univariate median of n points is easily computed in O(n log n) time by sorting the points. An optimal O(n) time method was proposed by Blum et al. [BFP+73]. Many undergraduate introductory text... |

30 |
Computing a centerpoint of a finite planar set of points in linear time
- Jadhav, Mukhopadhyay
- 1994
Citation Context ...enterpoint in O(n log^5 n) time. Matoušek improved this result by setting k equal to ⌈n/(d+1)⌉ in his algorithm, thus obtaining the centerpoint in O(n log^4 n) time. Finally, Jadhav and Mukhopadhyay [JM94] gave an O(n) algorithm to compute the centerpoint. 2.3 Convex Hull Peeling and Related Methods. Perhaps the most intuitive visual interpretation of the univariate me... |

30 | A lower bound to finding convex hulls - Yao - 1981 |

24 |
A Bivariate Sign Test
- Hodges
- 1955
Citation Context ...son [GSW92]. We match the upper bounds of the simplicial and halfspace depth algorithms by proving that these problems require Ω(n log n) time. Our lower bounds also apply to the sign tests of Hodges [Hod55] and of Oja and Nyblom [ON89]. These tests are used to determine if there is a statistically significant difference between two distributions of n points. In chapter 4 we present our algorithms for co... |

22 |
Halfplanar range search in linear space and O(n^0.695) query time
- Edelsbrunner, Welzl
- 1986
Citation Context ...istribution of data points in a convex area is Ω(n^(2/3)) and O(n^(2/3) log^(1/3) n) [Dev01]. In the case that the lower bound is achieved, T(n) is O(n^(2/3) log n). The algorithm of Edelsbrunner and Welzl [EW86] uses O(n log n) preprocessing time and then takes O(k + n^0.695) time for each query. However, in this algorithm if points are only counted the time complexity reduces to O(n^0.695). The total space ... |

21 | Constructing the bivariate Tukey median - Rousseeuw, Ruts - 1998 |

21 | The rectilinear crossing number of a complete graph and Sylvester’s “Four Point” problem of geometric probability
- Scheinerman, Wilf
- 1994
Citation Context ...me and O(n) space with a line sweeping technique of Balaban [Bal95]. In the case of simplicial depth, we have O(n^2) line segments formed between pairs of data points, and unfortunately k is Θ(n^4) [SW94]. Thus the algorithm of Balaban takes O(n^4) time and O(n^2) space for our purposes, so it is better to use brute-force to compute each intersection point. The tot... |

18 |
The depth function of a population distribution
- Rousseeuw, Ruts
- 1999
Citation Context ...ion, multivariate confidence regions, p-values, quality indices, and control charts (see [RR96]). Applications of depth include hypothesis testing, graphical display [MRR+01] and even voting theory [RR99]. Halfspace, hyperplane and simplicial depth are also closely related to regression [RR99]. For a recent account of statistical uses of depth, see [LPS99]. (Footnote 3: A data set is in general position if no th... |

17 |
Computing the center of planar point sets
- Matoušek
- 1991
Citation Context ...O(n^5 log n) time to compute the deepest point. Later they also gave a more complicated O(n^2 log n) version and provided implementations [RR98]. They seemed to be unaware that before this, Matoušek [Mat91] had presented an O(n log^5 n) algorithm for computing the halfspace median. Matoušek showed how to compute any point with depth greater than some constant k in O(n log^4 n) time and then used a bina... |

16 | Fast implementation of depth contours using topological sweep - Miller, Ramaswami, et al. |

16 |
Bivariate location depth
- Rousseeuw, Ruts
- 1996
Citation Context ...nd R-Estimators were discussed by Huber [Hub72]. Robust estimators of location have been used for data description, multivariate confidence regions, p-values, quality indices, and control charts (see [RR96]). Applications of depth include hypothesis testing, graphical display [MRR+01] and even voting theory [RR99]. Halfspace, hyperplane and simplicial depth are also closely related to regression [RR99... |

15 |
Geometric medians
- Gil, Steiger, et al.
- 1992
Citation Context ...contained in the most intervals between pairs of data points. In higher dimensions, intervals (segments) are replaced by simplices. This method is discussed in section 2.5. Gil, Steiger and Wigderson [GSW92] compared robustness and computational aspects of certain medians, although they imposed the restriction that the median must be one of the data points. They proposed a new definition for the multivar... |

15 | Depth in an arrangement of hyperplanes
- Rousseeuw, Hubert
- 1999
Citation Context ...certain depth function, and any point in R^d can be assigned a depth according to the particular function. Recently a new significant notion of multivariate depth was proposed by Rousseeuw and Hubert [RH99a]. The hyperplane depth of a point with respect to a set of hyperplanes is the minimum number of hyperplanes that a ray extending from that point must cross. In R^1 this defines the median of n points,... |

14 |
Control charts for multivariate processes
- Liu
- 1995
Citation Context ... 2 with n = 4 is shown in figure 2.6. The simplicial depth of a point in R^d is the number of simplices which contain the point. Liu’s original definition involves closed simplices, although later in [Liu95] she repeats the definition using open simplices. Unless mentioned otherwise, when we refer to simplicial depth or the simplicial median, we will use the original definition (a point on the boundary o... |

14 | Geometry and statistics: Problems at the interface
- Shamos
- 1976
Citation Context ...ears, and is discussed in section 2.2. A very intuitive definition for the univariate median is to continuously remove pairs of extreme data points. A generalization of this notion appeared by Shamos [Sha76] and by Barnett [Bar76], although Shamos stated that the idea originally belongs to Tukey. Convex hull peeling iteratively removes convex hull layers of points until a convex set remains. Convex hull ... |

13 | Regression depth and center points
- Amenta, Bern, et al.
Citation Context ...matching lower bound and mentioned previous results such as an O(n^3) time algorithm by Rousseeuw and Hubert [RH99b] and an O(n log^2 n) time algorithm by van Kreveld et al. [vKMR+99]. Amenta et al. [ABET00] proposed an O(n^d) time algorithm which constructs the arrangement of the n hyperplanes and ... (Footnote 1: For details concerning the arrangement of n lines, see appendix A.) |

13 |
A notion of data depth based upon random simplices
- Liu
- 1990
Citation Context ...e Oja median is a point for which the total volume of simplices formed by the point and appropriate subsets of the data set is minimum. (Footnote 2: A simplex is a line segment in R^1, a triangle in R^2, a tetrahedron in R^3, etc.) Section 2.4 contains a more detailed discussion. In 1990, Liu [Liu90] proposed yet another multivariate definition, generalizing the fact that the univariat... |

12 | Lower bounds for computing statistical depth - Aloupis, Cortés, et al. |