Results 1-10 of 70
Cross-Validation and the Estimation of Conditional Probability Densities
Journal of the American Statistical Association, 2004
Cited by 73 (9 self)
ABSTRACT. Many practical problems, especially some connected with forecasting, require nonparametric estimation of conditional densities from mixed data. For example, given an explanatory data vector X for a prospective customer, with components that could include the customer’s salary, occupation, age, sex, marital status and address, a company might wish to estimate the density of the expenditure, Y, that could be made by that person, basing the inference on observations of (X, Y) for previous clients. Choosing appropriate smoothing parameters for this problem can be tricky, not least because plug-in rules take a particularly complex form in the case of mixed data. An obvious difficulty is that there exists no general formula for the optimal smoothing parameters. More insidiously, and more seriously, it can be difficult to determine which components of X are relevant to the problem of conditional inference. For example, if the jth component of X is independent of Y then that component is irrelevant to estimating the density of Y given X, and ideally should be dropped before conducting inference. In this paper we show that cross-validation overcomes these difficulties. It automatically determines which components are relevant and which are not, through assigning large smoothing parameters to the latter and consequently shrinking them towards the uniform distribution on the respective marginals. This effectively removes irrelevant components from contention, by suppressing their contribution to estimator variance; they already have very small bias, a consequence of their independence of Y. Cross-validation also gives us important information about which components are relevant: the relevant components are precisely those which cross-validation has chosen to smooth in a traditional way, by assigning them smoothing parameters of conventional size.
Indeed, cross-validation produces asymptotically optimal smoothing for relevant components, while eliminating irrelevant components by oversmoothing. In the problem of nonparametric estimation of a conditional density, cross-validation comes into its own as a method with no obvious peers.
Automated image annotation using global features and robust nonparametric density estimation
In International ACM Conference on Image and Video Retrieval (CIVR), 2005
Cited by 64 (21 self)
Abstract. This paper describes a simple framework for automatically annotating images using nonparametric models of distributions of image features. We show that under this framework quite simple image properties such as global colour and texture distributions provide a strong basis for reliably annotating images. We report results on subsets of two photographic libraries, the Corel Photo Archive and the Getty Image Archive. We also show how the popular Earth Mover’s Distance measure can be effectively incorporated within this framework.
Nonparametric Multivariate Density Estimation: A Comparative Study
IEEE Trans. Signal Processing, 1994
Cited by 52 (1 self)
This paper algorithmically and empirically studies two major types of nonparametric multivariate density estimation techniques, where no assumption is made about the data being drawn from any known parametric family of distributions. The first type is the popular kernel method (and several of its variants), which uses locally tuned radial basis (e.g., Gaussian) functions to interpolate the multidimensional density; the second type is based on an exploratory projection pursuit technique, which interprets the multidimensional density through the construction of several one-dimensional densities along highly "interesting" projections of multidimensional data. Performance evaluations using training data from mixture Gaussian and mixture Cauchy densities are presented. The results show that the curse of dimensionality and the sensitivity of control parameters have a much more adverse impact on the kernel density estimators than on the projection pursuit density estimators.
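The kernel method named in this abstract can be illustrated with a minimal sketch. It assumes a single fixed bandwidth shared by all dimensions, which is simpler than the locally tuned variants the paper studies; the function name `gaussian_kde` and the parameter `bandwidth` are illustrative choices, not taken from the paper.

```python
import numpy as np

def gaussian_kde(points, data, bandwidth):
    """Evaluate a fixed-bandwidth Gaussian kernel density estimate.

    points: (m, d) array of query locations.
    data: (n, d) array of training samples.
    bandwidth: scalar h > 0 shared by all dimensions (a simplifying assumption).
    """
    points = np.asarray(points, dtype=float)
    data = np.asarray(data, dtype=float)
    d = data.shape[1]
    # Squared Euclidean distance between every query point and every sample.
    diff = points[:, None, :] - data[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Isotropic Gaussian kernel with covariance h^2 * I, normalized in d dimensions.
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    kernels = np.exp(-0.5 * sq_dist / bandwidth ** 2) / norm
    # Average the n kernel bumps to obtain the density estimate at each query point.
    return kernels.mean(axis=1)

# Usage: samples from a 2-D standard normal, queried at the mode and in the tail.
rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))
density = gaussian_kde(np.array([[0.0, 0.0], [3.0, 3.0]]), samples, bandwidth=0.5)
```

The pairwise distance array has m x n x d entries, so memory grows quickly with sample size and dimension; this is one practical face of the dimensionality problem the abstract refers to.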
Geometric Methods for Feature Extraction and Dimensional Reduction
In L. Rokach and O. Maimon (Eds.), Data, 2005
Cited by 42 (1 self)
Abstract. We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, and oriented PCA; and for the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps and spectral clustering. The Nyström method, which links several of the algorithms, is also reviewed. The goal is to provide a self-contained review of the concepts and mathematics underlying these algorithms.
A review of dimension reduction techniques
1997
Cited by 39 (4 self)
The problem of dimension reduction is introduced as a way to overcome the curse of dimensionality when dealing with vector data in high-dimensional spaces and as a modelling tool for such data. It is defined as the search for a low-dimensional manifold that embeds the high-dimensional data. A classification of dimension reduction problems is proposed. A survey of several techniques for dimension reduction is given, including principal component analysis, projection pursuit and projection pursuit regression, principal curves and methods based on topologically continuous maps, such as Kohonen’s maps or the generalised topographic mapping. Neural network implementations for several of these techniques are also reviewed, such as the projection pursuit learning network and the BCM neuron with an objective function. Several appendices complement the mathematical treatment of the main text.
Ridge polynomial networks
IEEE Transactions on Neural Networks, 1995
Cited by 27 (3 self)
Abstract. This paper presents a polynomial connectionist network called ridge polynomial network (RPN) that can uniformly approximate any continuous function on a compact set in multidimensional input space R^d, with arbitrary degree of accuracy. This network provides a more efficient and regular architecture compared to ordinary higher-order feedforward networks while maintaining their fast learning property. The ridge polynomial network is a generalization of the pi-sigma network and uses a special form of ridge polynomials. It provides a natural mechanism for incremental network growth. Simulation results on a surface fitting problem, the classification of high-dimensional data and the realization of a multivariate polynomial function are given to highlight the capability of the network. In particular, a constructive learning algorithm developed for the network is shown to yield smooth generalization and steady learning.
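The abstract describes the ridge polynomial network as a generalization of the pi-sigma network. A minimal sketch of a forward pass follows, under the assumption that the network output is a sum of pi-sigma units of increasing degree, where the unit of degree j is a product of j affine terms of the input. The function names and the omission of any output nonlinearity are illustrative assumptions, not details confirmed by the abstract.

```python
import numpy as np

def pi_sigma_unit(x, weights, biases):
    """Product of k affine ridge terms: prod_i (<w_i, x> + b_i).

    x: (d,) input vector; weights: (k, d) array; biases: (k,) array.
    """
    return np.prod(weights @ x + biases)

def ridge_polynomial_forward(x, units):
    """Sum the outputs of pi-sigma units of increasing degree.

    units: list of (weights, biases) pairs; the j-th pair (1-based) is
    assumed to hold j ridge terms. No output nonlinearity is applied.
    """
    return sum(pi_sigma_unit(x, w, b) for w, b in units)

# Usage: a degree-1 unit computing x1 plus a degree-2 unit computing x1 * x2.
units = [
    (np.array([[1.0, 0.0]]), np.array([0.0])),
    (np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([0.0, 0.0])),
]
value = ridge_polynomial_forward(np.array([2.0, 3.0]), units)
```

Appending another `(weights, biases)` pair of the next degree extends the sum without touching the existing units, which is one way to read the incremental growth the abstract mentions.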
Efficient Belief Propagation for Higher-Order Cliques Using Linear Constraint Nodes
Computer Vision and Image Understanding, 2008
Feature extraction for nonparametric discriminant analysis
Journal of Computational and Graphical Statistics, 2003
Cited by 19 (1 self)
In high-dimensional classification problems, one is often interested in finding a few important discriminant directions in order to reduce the dimensionality. Fisher’s linear discriminant analysis (LDA) is a commonly used method. Although LDA is guaranteed to find the best directions when each class has a Gaussian density with a common covariance matrix, it can fail if the class densities are more general. Using a likelihood-based interpretation of Fisher’s LDA criterion, we develop a general method for finding important discriminant directions without assuming the class densities belong to any particular parametric family. We also show that our method can be easily integrated with projection pursuit density estimation to produce a powerful procedure for (reduced-rank) nonparametric discriminant analysis.
Data Filtering and Distribution Modeling Algorithms for Machine Learning
1993
Cited by 18 (4 self)
Acknowledgments
1. Introduction
1.1 Boosting by majority
1.2 Query By Committee
1.3 Learning distributions of binary vectors
2. Boosting a weak learning algorithm by majority
2.1 Introduction
2.2 The majority-vote game
2.2.1 Optimality of the weighting scheme
2.2.2 The representational power of majority gates
2.3 Boosting a weak learner using a majority vote
2.3.1 Preliminaries ...