## Intrinsic Dimension Estimation Using Packing Numbers (2003)

### Download Links

- [www.iro.umontreal.ca]
- [books.nips.cc]
- DBLP

### Other Repositories/Bibliography

Citations: 53 (0 self)

### BibTeX

@MISC{Kégl03intrinsicdimension,
  author = {Balázs Kégl},
  title = {Intrinsic Dimension Estimation Using Packing Numbers},
  year = {2003}
}


### Abstract

We propose a new algorithm to estimate the intrinsic dimension of data sets. The method is based on geometric properties of the data and requires neither parametric assumptions on the data generating model nor input parameters to set. The method is compared to a similar, widely-used algorithm from the same family of geometric techniques. Experiments show that our method is more robust in terms of the data generating distribution and more reliable in the presence of noise.

### Citations

3259 |
The self-organizing map
- Kohonen
- 1990
Citation Context ...low intrinsic dimension. Several methods have been developed to find low-dimensional representations of high-dimensional data, including Principal Component Analysis (PCA), Self-Organizing Maps (SOM) [1], Multidimensional Scaling (MDS) [2], and, more recently, Local Linear Embedding (LLE) [3] and the ISOMAP algorithm [4]. Although most of these algorithms require that the intrinsic dimension of the m... |

1696 |
A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290 (5500) (2000), 2319–2323.
- Tenenbaum, de Silva, et al.
Citation Context ...l data, including Principal Component Analysis (PCA), Self-Organizing Maps (SOM) [1], Multidimensional Scaling (MDS) [2], and, more recently, Local Linear Embedding (LLE) [3] and the ISOMAP algorithm [4]. Although most of these algorithms require that the intrinsic dimension of the manifold be explicitly set, there has been little effort devoted to design and analyze techniques that estimate the intr... |

1624 | Nonlinear Dimensionality Reduction by Locally Linear Embedding
- Roweis, Saul
Citation Context ...sentations of high-dimensional data, including Principal Component Analysis (PCA), Self-Organizing Maps (SOM) [1], Multidimensional Scaling (MDS) [2], and, more recently, Local Linear Embedding (LLE) [3] and the ISOMAP algorithm [4]. Although most of these algorithms require that the intrinsic dimension of the manifold be explicitly set, there has been little effort devoted to design and analyze tech... |

323 | Searching in metric spaces
- Navarro, Baeza-Yates
- 2001
Citation Context ...ctures (e.g., kd-trees and R-trees) increase exponentially with the dimension, and these methods become inefficient if the dimension is more than about 20. Nevertheless, it was shown by Chávez et al. [5] that the complexity increases with the intrinsic dimension of the data rather than with the dimension of the embedding space. ... |

280 | GTM: The generative topographic mapping
- Bishop, Svensén, et al.
- 1998
Citation Context ...hen computing D̂pca. Another general scheme in the family of projection techniques is to turn the dimensionality reduction algorithm from an embedding technique into a probabilistic, generative model [8], and optimize the dimension as any other parameter by using cross-validation in a maximum likelihood setting. The main disadvantage of this approach is that the dimension estimate depends on the gene... |
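The cross-validation scheme described in this snippet can be illustrated with scikit-learn, whose `PCA.score` returns the average log-likelihood under the probabilistic PCA model, so the number of components can be cross-validated like any other parameter. A minimal sketch; the synthetic data, sizes, and variable names below are ours, chosen only for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic data: a 5-dimensional latent signal embedded in 20 dimensions
# with a little isotropic noise (all sizes are made up for illustration).
latent = rng.normal(size=(500, 5))
mixing = rng.normal(size=(5, 20))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 20))

# PCA.score is the average held-out log-likelihood of the probabilistic
# PCA model, so cross_val_score can compare candidate dimensions directly.
scores = [cross_val_score(PCA(n_components=k), X).mean() for k in range(1, 11)]
best = 1 + int(np.argmax(scores))
print(best)  # cross-validated dimension estimate
```

As the snippet notes, the estimate produced this way is tied to the chosen generative model: change the model and the cross-validated dimension can change with it.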

256 |
Measuring the strangeness of strange attractors
- Grassberger, Procaccia
- 1983
Citation Context ... a finite sample, the zero limit cannot be achieved, so the estimation procedure usually consists of plotting log C(r) versus log r and measuring the slope ∂log C(r)/∂log r of the linear part of the curve [9, 10, 11]. To formalize this intuitive procedure, we present the following definition. Definition 3 The scale-dependent correlation dimension of a finite set Sn = {x1,...,xn} is D̂corr(r1,r2) = log C(r2) − log C(... |
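The slope-measuring procedure quoted in this snippet is straightforward to sketch. A minimal illustration (function names are ours, not the paper's), using the standard two-scale form of the estimator, D̂corr(r1, r2) = (log C(r2) − log C(r1)) / (log r2 − log r1):

```python
import numpy as np

def correlation_integral(X, r):
    """Empirical C(r): fraction of point pairs closer than r."""
    n = len(X)
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices(n, k=1)  # each pair counted once
    return np.mean(d[iu] < r)

def corr_dimension(X, r1, r2):
    """Scale-dependent correlation dimension between radii r1 < r2."""
    c1, c2 = correlation_integral(X, r1), correlation_integral(X, r2)
    return (np.log(c2) - np.log(c1)) / (np.log(r2) - np.log(r1))

# Points on a circle embedded in R^3: intrinsic dimension is 1.
t = np.linspace(0, 2 * np.pi, 500, endpoint=False)
X = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
print(corr_dimension(X, 0.1, 0.5))  # close to 1
```

Picking r1 and r2 inside the linear part of the log C(r) versus log r curve is exactly the judgment step the plotting procedure above formalizes.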

99 | EM algorithms for PCA and SPCA
- Roweis
- 1998
Citation Context ...[3]. The algorithm runs in O(n^2 d) time (where n is the number of points and d is the embedding dimension), which is slightly worse than the O(nd D̂pca) complexity of the fast PCA algorithm of Roweis [7] when computing D̂pca. Another general scheme in the family of projection techniques is to turn the dimensionality reduction algorithm from an embedding technique into a probabilistic, generative mode... |

77 | Global coordination of local linear models - Roweis, Saul, et al. |

73 |
Clique is hard to approximate within n^(1−ε)
- Håstad
- 1999
Citation Context ...{(xi,xj) | d(xi,xj) < r}. This problem is known to be NP-hard. There are results that show that for a general graph, even the approximation of MI(G) within a factor of n^(1−ε), for any ε > 0, is NP-hard [12]. On the positive side, it was shown that for such geometric graphs as Gr, MI(G) can be approximated arbitrarily well by polynomial time algorithms [13]. However, approximating algorithms of this kind... |

71 | Polynomial-time approximation schemes for geometric graphs
- Erlebach, Jansen, et al.
Citation Context ...n a factor of n^(1−ε), for any ε > 0, is NP-hard [12]. On the positive side, it was shown that for such geometric graphs as Gr, MI(G) can be approximated arbitrarily well by polynomial time algorithms [13]. However, approximation algorithms of this kind scale exponentially with the data dimension, both in the quality of the approximation and in the running time, so they are of little practical us... |
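Since the exact packing number (the maximum independent set of Gr) is NP-hard and the polynomial approximation schemes are impractical, a greedy maximal r-packing is the natural practical surrogate. A minimal sketch assuming Euclidean data; the function names are illustrative and this is not the paper's exact procedure, only the basic greedy idea:

```python
import numpy as np

def greedy_packing_number(X, r):
    """Greedy maximal r-packing: keep a point only if it lies at least r
    from every center kept so far. The result is a lower bound on the
    true (NP-hard) maximum independent set of the graph G_r."""
    centers = []
    for x in X:
        if all(np.linalg.norm(x - c) >= r for c in centers):
            centers.append(x)
    return len(centers)

def packing_dimension(X, r1, r2):
    """Scale-dependent packing dimension estimate between radii r1 < r2:
    for a D-dimensional set the packing number scales as M(r) ~ r^(-D)."""
    m1, m2 = greedy_packing_number(X, r1), greedy_packing_number(X, r2)
    return -(np.log(m2) - np.log(m1)) / (np.log(r2) - np.log(r1))

# Points on the unit circle: a 1-manifold embedded in R^2.
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
print(packing_dimension(circle, 0.05, 0.2))  # near 1
```

The greedy result depends on the order in which the points are visited, which is why a practical estimator would average over several random orderings rather than trust a single pass.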

41 | Estimating the intrinsic dimension of data with a fractal-based method
- Camastra, Vinciarelli
Citation Context ... a finite sample, the zero limit cannot be achieved, so the estimation procedure usually consists of plotting log C(r) versus log r and measuring the slope ∂log C(r)/∂log r of the linear part of the curve [9, 10, 11]. To formalize this intuitive procedure, we present the following definition. Definition 3 The scale-dependent correlation dimension of a finite set Sn = {x1,...,xn} is D̂corr(r1,r2) = log C(r2) − log C(... |

38 | Intrinsic Dimensionality Estimation with Optimally Topology Preserving Maps
- Bruske, Sommer
- 1995
Citation Context ... consider Euclidean data sets, there are certain applications where only a distance metric d : X × X → R+ ∪ {0} and the matrix of pairwise distances D = [dij] = [d(xi,xj)] are given. Bruske and Sommer [6] present an approach to circumvent the second problem. Instead of doing PCA on the original data, they first cluster the data, then construct an optimally topology preserving map (OPTM) on the cluster... |

12 |
Self-spacial join selectivity estimation using fractal concepts
- Belussi, Faloutsos
- 1998
Citation Context ... a finite sample, the zero limit cannot be achieved, so the estimation procedure usually consists of plotting log C(r) versus log r and measuring the slope ∂log C(r)/∂log r of the linear part of the curve [9, 10, 11]. To formalize this intuitive procedure, we present the following definition. Definition 3 The scale-dependent correlation dimension of a finite set Sn = {x1,...,xn} is D̂corr(r1,r2) = log C(r2) − log C(... |
