Results 1 - 10 of 35
Exploratory projection pursuit
- Journal of the American Statistical Association
, 1987
Cited by 206 (0 self)
Exploratory projection pursuit is concerned with finding relatively highly revealing lower dimensional projections of high dimensional data. The intent is to discover views of the multivariate data set that exhibit nonlinear effects (clustering, concentrations near nonlinear manifolds) that are not captured by the linear correlation structure. This paper presents a new algorithm for this purpose that has both statistical and computational advantages over previous methods. A connection to density estimation is established. Examples are presented and issues related to practical application are discussed.
Automated image annotation using global features and robust nonparametric density estimation
- In International ACM Conference on Image and Video Retrieval (CIVR)
, 2005
Cited by 48 (21 self)
This paper describes a simple framework for automatically annotating images using non-parametric models of distributions of image features. We show that under this framework quite simple image properties such as global colour and texture distributions provide a strong basis for reliably annotating images. We report results on subsets of two photographic libraries, the Corel Photo Archive and the Getty Image Archive. We also show how the popular Earth Mover’s Distance measure can be effectively incorporated within this framework.
Nonparametric Multivariate Density Estimation: A Comparative Study
- IEEE Trans. Signal Processing
, 1994
Cited by 34 (1 self)
This paper algorithmically and empirically studies two major types of nonparametric multivariate density estimation techniques, where no assumption is made about the data being drawn from any known parametric family of distributions. The first type is the popular kernel method (and several of its variants), which uses locally tuned radial basis (e.g., Gaussian) functions to interpolate the multi-dimensional density; the second type is based on an exploratory projection pursuit technique, which interprets the multi-dimensional density through the construction of several one-dimensional densities along highly "interesting" projections of multidimensional data. Performance evaluations using training data from mixture Gaussian and mixture Cauchy densities are presented. The results show that the curse of dimensionality and the sensitivity of control parameters have a much more adverse impact on the kernel density estimators than on the projection pursuit density estimators.
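The kernel method mentioned in this abstract can be illustrated with a minimal sketch. It assumes an isotropic Gaussian kernel with a single fixed bandwidth, which is simpler than the locally tuned variants the paper studies; the function name `gaussian_kde` and the sample points are hypothetical and not taken from the paper.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a density estimator built from d-dimensional sample points.

    Assumption: one isotropic Gaussian kernel per sample point, all sharing
    the same bandwidth (no local tuning).
    """
    n = len(points)
    d = len(points[0])
    # Normalising constant: n kernels, each a d-dimensional Gaussian.
    norm = n * (bandwidth * math.sqrt(2.0 * math.pi)) ** d

    def density(x):
        total = 0.0
        for p in points:
            sq_dist = sum((xj - pj) ** 2 for xj, pj in zip(x, p))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density

# Hypothetical two-cluster sample in two dimensions.
sample = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.3), (3.0, 3.1), (2.9, 2.8)]
kde = gaussian_kde(sample, bandwidth=0.5)
print(kde((0.0, 0.0)), kde((1.5, 1.5)))
```

The estimate is high near either cluster and low between them. The bandwidth is the control parameter whose sensitivity the abstract refers to, and the normalising constant's dependence on the dimension `d` hints at why such estimators degrade in high dimensions.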
A review of dimension reduction techniques
, 1997
Cited by 29 (4 self)
The problem of dimension reduction is introduced as a way to overcome the curse of the dimensionality when dealing with vector data in high-dimensional spaces and as a modelling tool for such data. It is defined as the search for a low-dimensional manifold that embeds the high-dimensional data. A classification of dimension reduction problems is proposed. A survey of several techniques for dimension reduction is given, including principal component analysis, projection pursuit and projection pursuit regression, principal curves and methods based on topologically continuous maps, such as Kohonen’s maps or the generalised topographic mapping. Neural network implementations for several of these techniques are also reviewed, such as the projection pursuit learning network and the BCM neuron with an objective function. Several appendices complement the mathematical treatment of the main text.
Geometric Methods for Feature Extraction and Dimensional Reduction
- In L. Rokach and O. Maimon (Eds.), Data
, 2005
Cited by 24 (1 self)
We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, and oriented PCA; and for the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps and spectral clustering. The Nyström method, which links several of the algorithms, is also reviewed. The goal is to provide a self-contained review of the concepts and mathematics underlying these algorithms.
Cross-Validation and the Estimation of Conditional Probability Densities
- Journal of the American Statistical Association
, 2004
Cited by 21 (2 self)
Many practical problems, especially some connected with forecasting, require nonparametric estimation of conditional densities from mixed data. For example, given an explanatory data vector X for a prospective customer, with components that could include the customer’s salary, occupation, age, sex, marital status and address, a company might wish to estimate the density of the expenditure, Y, that could be made by that person, basing the inference on observations of (X, Y) for previous clients. Choosing appropriate smoothing parameters for this problem can be tricky, not least because plug-in rules take a particularly complex form in the case of mixed data. An obvious difficulty is that there exists no general formula for the optimal smoothing parameters. More insidiously, and more seriously, it can be difficult to determine which components of X are relevant to the problem of conditional inference. For example, if the jth component of X is independent of Y then that component is irrelevant to estimating the density of Y given X, and ideally should be dropped before conducting inference. In this paper we show that cross-validation overcomes these difficulties. It automatically determines which components are relevant and which are not, through assigning large smoothing parameters to the latter and consequently shrinking them towards the uniform distribution on the respective marginals. This effectively removes irrelevant components from contention, by suppressing their contribution to estimator variance; they already have very small bias, a consequence of their independence of Y. Cross-validation also gives us important information about which components are relevant: the relevant components are precisely those which cross-validation has chosen to smooth in a traditional way, by assigning them smoothing parameters of conventional size.
Indeed, cross-validation produces asymptotically optimal smoothing for relevant components, while eliminating irrelevant components by oversmoothing. In the problem of nonparametric estimation of a conditional density, cross-validation comes into its own as a method with no obvious peers.
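The mechanism described in this abstract, in which a very large smoothing parameter removes a component from the estimate, can be sketched as follows. This sketch makes several assumptions: all components are continuous, Gaussian product kernels are used, and the bandwidths are set by hand rather than chosen by cross-validation. The function names and the data are hypothetical.

```python
import math

def gauss(u, h):
    """Gaussian kernel with bandwidth h evaluated at u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def conditional_density(y, x, data, h_y, h_x):
    """Kernel estimate of f(y | x).

    data is a list of (x_vector, y) pairs; h_x holds one bandwidth per
    component of x. Assumption: continuous components, product kernel.
    """
    num = 0.0
    den = 0.0
    for xs, yi in data:
        w = 1.0
        for xj, xij, hj in zip(x, xs, h_x):
            w *= gauss(xj - xij, hj)
        num += w * gauss(y - yi, h_y)
        den += w
    return num / den

# Hypothetical data: Y tracks the first component of X, not the second.
data = [((0.0, 5.0), 1.0), ((0.1, -3.0), 1.2),
        ((1.0, 4.0), 3.0), ((1.1, -2.0), 3.1)]

# With a huge bandwidth on the second component its kernel is nearly flat,
# so changing that component of the query barely changes the estimate.
a = conditional_density(1.0, (0.0, 5.0), data, 0.3, (0.3, 1e6))
b = conditional_density(1.0, (0.0, -3.0), data, 0.3, (0.3, 1e6))

# With a conventional bandwidth the same change matters.
c = conditional_density(1.0, (0.0, 5.0), data, 0.3, (0.3, 0.3))
d = conditional_density(1.0, (0.0, -3.0), data, 0.3, (0.3, 0.3))
print(abs(a - b), abs(c - d))
```

In the oversmoothed case the kernel weight for the second component is almost the same for every observation, so it cancels between numerator and denominator. That cancellation is the sense in which an irrelevant component is effectively dropped.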
Data Filtering and Distribution Modeling Algorithms for Machine Learning
, 1993
Cited by 18 (4 self)
Acknowledgments
1. Introduction
1.1 Boosting by majority
1.2 Query By Committee
1.3 Learning distributions of binary vectors
2. Boosting a weak learning algorithm by majority
2.1 Introduction
2.2 The majority-vote game
2.2.1 Optimality of the weighting scheme
2.2.2 The representational power of majority gates
2.3 Boosting a weak learner using a majority vote
2.3.1 Preliminaries
...
Conjoint Probabilistic Subband Modeling
- Massachusetts Institute of Technology
, 1997
Cited by 13 (0 self)
A new approach to high-order-conditional probability density estimation is developed, based on a partitioning of conditioning space via decision trees. The technique is applied to image compression, image restoration, and texture synthesis, and the results compared with those obtained by standard mixture density and linear regression models. By applying the technique to subband-domain processing, some evidence is provided to support the following statement: the appropriate tradeoff between spatial and spectral localization in linear preprocessing shifts towards greater spatial localization when subbands are processed in a way that exploits interdependence.
High Dimensional Feature Reduction Via Projection Pursuit
, 1995
Cited by 13 (3 self)
Table of Contents
ABSTRACT
1. INTRODUCTION
1.1 Background
Feature extraction for non-parametric discriminant analysis
- Journal of Computational and Graphical Statistics
, 2003
Cited by 13 (0 self)
In high-dimensional classification problems, one is often interested in finding a few important discriminant directions in order to reduce the dimensionality. Fisher’s linear discriminant analysis (LDA) is a commonly used method. Although LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. Using a likelihood-based interpretation of Fisher’s LDA criterion, we develop a general method for finding important discriminant directions without assuming the class densities belong to any particular parametric family. We also show that our method can be easily integrated with projection pursuit density estimation to produce a powerful procedure for (reduced-rank) nonparametric discriminant analysis.
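As a point of reference for the baseline this abstract starts from, the sketch below computes the classical Fisher discriminant direction for two classes in two dimensions, w = S_w^{-1}(m_a - m_b). It is not the nonparametric method the paper develops. The helper names and the toy data are hypothetical.

```python
def mean(rows):
    """Component-wise mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 scatter matrix of the rows around the mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        dev = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += dev[i] * dev[j]
    return s

def fisher_direction(class_a, class_b):
    """Fisher's discriminant direction w = Sw^{-1} (mean_a - mean_b)."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Pooled within-class scatter.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    # Explicit inverse of the 2x2 matrix applied to the mean difference.
    return [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
            (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]

# Hypothetical classes: same spread, shifted means, more variance along axis 2.
class_a = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (1.0, -2.0)]
class_b = [(3.0, 1.0), (4.0, 3.0), (5.0, 1.0), (4.0, -1.0)]
w = fisher_direction(class_a, class_b)
print(w)
```

The direction down-weights the second axis because the within-class spread is larger there. That weighting is tuned to the common-covariance Gaussian case named in the abstract, which is the setting where LDA is guaranteed to find the best directions.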

