Sparse Bayesian Learning and the Relevance Vector Machine
2001
Cited by 623 (5 self)
This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels).
A Survey of Dimension Reduction Techniques
2002
Cited by 93 (0 self)
... this paper, we assume that we have $n$ observations, each being a realization of the $p$-dimensional random variable $x = (x_1, \ldots, x_p)^T$ with mean $E(x) = \mu = (\mu_1, \ldots, \mu_p)^T$ and covariance matrix $E\{(x - \mu)(x - \mu)^T\} = \Sigma_{p \times p}$. We denote such an observation matrix by $X = \{x_{i,j} : 1 \le i \le p,\ 1 \le j \le n\}$. If $\mu_i$ and $\sigma_i = \sqrt{\Sigma_{(i,i)}}$ denote the mean and the standard deviation of the $i$th random variable, respectively, then we will often standardize the observations $x_{i,j}$ by $(x_{i,j} - \hat{\mu}_i)/\hat{\sigma}_i$, where $\hat{\mu}_i = \bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{i,j}$ and $\hat{\sigma}_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n} (x_{i,j} - \bar{x}_i)^2}$.
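As a rough illustration of the standardization described in this abstract, the following Python sketch centers and scales each variable (row) of a p x n observation matrix using the 1/n form of the standard deviation. The function name `standardize`, the list-of-rows layout, the sample values, and the error raised for a constant variable are assumptions made for this example and are not taken from the surveyed paper.

```python
import math


def standardize(X):
    """Standardize each row of a p x n observation matrix.

    X[i][j] is the j-th observation of the i-th variable (assumed layout).
    Each row is centered by its sample mean and divided by its standard
    deviation, computed with the 1/n normalization.
    """
    Z = []
    for row in X:
        n = len(row)
        mean = sum(row) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in row) / n)
        if std == 0.0:
            # Assumption: a constant variable is treated as an error here.
            raise ValueError("constant variable cannot be standardized")
        Z.append([(x - mean) / std for x in row])
    return Z


# Hypothetical data: p = 2 variables, n = 4 observations each.
X = [[1.0, 2.0, 3.0, 4.0],
     [10.0, 10.0, 30.0, 30.0]]
Z = standardize(X)
```

After this transformation every row has zero mean and unit variance (under the same 1/n convention), which puts variables measured on different scales on a comparable footing.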
Fast Marginal Likelihood Maximisation for Sparse Bayesian Models
Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, 2003
Cited by 74 (0 self)
The 'sparse Bayesian' modelling approach, as exemplified by the 'relevance vector machine', enables sparse classification and regression functions to be obtained by linearly weighting a small number of fixed basis functions from a large dictionary of potential candidates. Such a model conveys a number of advantages over the related and very popular 'support vector machine', but the necessary 'training' procedure (optimisation of the marginal likelihood function) is typically much slower. We describe a new and highly accelerated algorithm which exploits recently elucidated properties of the marginal likelihood function to enable maximisation via a principled and efficient sequential addition and deletion of candidate basis functions.
Feed-Forward Neural Networks and Topographic Mappings for Exploratory Data Analysis
Neural Computing and Applications, 1996
Cited by 43 (2 self)
A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted to established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling.
1 INTRODUCTION
The visualisation and analysis of high-dimensional data is a difficult problem and one that may be helpfully viewed in the context of feature extraction, which provides a useful commo...
Predicting Performance via Automated Feature-Interaction Detection
Cited by 17 (12 self)
Customizable programs and program families provide user-selectable features to allow users to tailor a program to an application scenario. Knowing in advance which feature selection yields the best performance is difficult because a direct measurement of all possible feature combinations is infeasible. Our work aims at predicting program performance based on selected features. However, when features interact, accurate predictions are challenging. An interaction occurs when a particular feature combination has an unexpected influence on performance. We present a method that automatically detects performance-relevant feature interactions to improve prediction accuracy. To this end, we propose three heuristics to reduce the number of measurements required to detect interactions. Our evaluation consists of six real-world case studies from varying domains (e.g., databases, encoding libraries, and web servers) using different configuration techniques (e.g., configuration files and preprocessor flags). Results show an average prediction accuracy of 95%.
Least square projection: A fast high-precision multidimensional projection technique and its application to document mapping
IEEE Transactions on Visualization and Computer Graphics, 2008
Cited by 7 (4 self)
The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analysis of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.
"Array signal processing for radio astronomy," in The Square Kilometre Array: An Engineering Perspective
2005
Cited by 5 (3 self)
Radio astronomy forms an interesting application area for array signal processing techniques. Current synthesis imaging telescopes consist of a small number of identical dishes, which track a fixed patch in the sky and produce estimates of the time-varying spatial covariance matrix. The observations are sometimes distorted by interference, e.g., from radio, TV, radar or satellite transmissions. We describe some of the tools that array signal processing offers to filter out the interference, based on eigenvalue decompositions and factor analysis, a more general technique applicable to partially calibrated arrays. We consider spatial filtering techniques using projections and interference subtraction, and discuss how a reference antenna pointed at the interferer can improve the performance. We also consider image formation and its relation to beamforming. Finally, we briefly discuss some future large-scale radio telescopes.
Bayesian inference and the parametric bootstrap
Cited by 2 (1 self)
The parametric bootstrap can be used for the efficient computation of Bayes posterior distributions. Importance sampling formulas take on an easy form relating to the deviance in exponential families, and are particularly simple starting from Jeffreys invariant prior. Because of the i.i.d. nature of bootstrap sampling, familiar formulas describe the computational accuracy of the Bayes estimates. Besides computational methods, the theory provides a connection between Bayesian and frequentist analysis. Efficient algorithms for the frequentist accuracy of Bayesian inferences are developed and demonstrated in a model selection example.
Keywords: Jeffreys prior, exponential families, deviance, generalized linear models
Developments and Applications of Nonlinear Principal Component Analysis - a Review
Cited by 2 (0 self)
Although linear principal component analysis (PCA) originates from the work of Sylvester [67] and Pearson [51], the development of nonlinear counterparts has received attention only since the 1980s. Work on nonlinear PCA, or NLPCA, can be divided into the utilization of autoassociative neural networks, principal curves and manifolds, kernel approaches, or a combination of these approaches. This article reviews existing algorithmic work, shows how a given data set can be examined to determine whether a conceptually more demanding NLPCA model is required, and lists developments of NLPCA algorithms. Finally, the paper outlines problem areas and challenges that require future work to mature the NLPCA research field.