Prediction With Gaussian Processes: From Linear Regression To Linear Prediction And Beyond
- Learning and Inference in Graphical Models
, 1997
Cited by 160 (4 self)
The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads into a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.
1 Introduction
In the last decade neural networks have been used to tackle regression and classification problems, with some notable successes. It has also been widely recognized that they form a part of a wide variety of non-linear statistical techniques that can be used for...
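The change of viewpoint described in this abstract can be checked numerically. The following is a minimal sketch, not the paper's own code: the quadratic basis `features`, the prior variance, the noise variance, and the toy data are all assumptions made for illustration. It computes the predictive mean once from a Gaussian prior over weights and once from the induced covariance function over function values, and the two should agree.

```python
import numpy as np

def features(x):
    """Hypothetical basis for illustration: phi(x) = [1, x, x^2]."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2], axis=1)

def weight_space_mean(x_train, y_train, x_test, prior_var=1.0, noise_var=0.1):
    """Predictive mean from a prior over parameters, w ~ N(0, prior_var * I)."""
    Phi = features(x_train)
    A = Phi.T @ Phi / noise_var + np.eye(Phi.shape[1]) / prior_var
    w_mean = np.linalg.solve(A, Phi.T @ y_train / noise_var)
    return features(x_test) @ w_mean

def function_space_mean(x_train, y_train, x_test, prior_var=1.0, noise_var=0.1):
    """Same prediction from a prior over functions with covariance
    k(x, x') = prior_var * phi(x) . phi(x')."""
    Phi, Phi_s = features(x_train), features(x_test)
    K = prior_var * Phi @ Phi.T        # covariance among training inputs
    K_s = prior_var * Phi_s @ Phi.T    # covariance between test and training inputs
    alpha = np.linalg.solve(K + noise_var * np.eye(len(x_train)), y_train)
    return K_s @ alpha

# Toy data, made up for this sketch.
x_train = np.array([-1.0, -0.5, 0.0, 0.4, 1.0])
y_train = np.array([0.9, 0.3, 0.1, 0.2, 1.1])
x_test = np.array([-0.75, 0.25, 0.8])

print(weight_space_mean(x_train, y_train, x_test))
print(function_space_mean(x_train, y_train, x_test))
```

The function-space form only touches the data through the covariance function, which is what allows the linear basis to be swapped for a more general kernel.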
Support Vector Machines, Reproducing Kernel Hilbert Spaces and the Randomized GACV
, 1998
Cited by 122 (9 self)
In this paper we very briefly review some of these results. RKHS can be chosen tailored to the problem at hand in many ways, and we review a few of them, including radial basis function and smoothing spline ANOVA spaces. Girosi (1997), Smola and Schölkopf (1997), Schölkopf et al. (1997) and others have noted the relationship between SVM's and penalty methods as used in the statistical theory of nonparametric regression. In Section 1.2 we elaborate on this, and show how replacing the likelihood functional of the logit (log odds ratio) in penalized likelihood methods for Bernoulli [yes-no] data, with certain other functionals of the logit (to be called SVM functionals) results in several of the SVM's that are of modern research interest. The SVM functionals we consider more closely resemble a "goodness-of-fit" measured by classification error than a "goodness-of-fit" measured by the comparative Kullback-Leibler distance, which is frequently associated with likelihood functionals. This observation is not new or profound, but it is hoped that the discussion here will help to bridge the conceptual gap between classical nonparametric regression via penalized likelihood methods, and SVM's in RKHS. Furthermore, since SVM's can be expected to provide more compact representations of the desired classification boundaries than boundaries based on estimating the logit by penalized likelihood methods, they have potential as a prescreening or model selection tool in sifting through many variables or regions of attribute space to find influential quantities, even when the ultimate goal is not classification, but to understand how the logit varies as the important variables change throughout their range. This is potentially applicable to the variable/model selection problem in demographic m...
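The contrast between likelihood functionals and SVM functionals of the logit can be illustrated with a small sketch. It assumes labels coded as y in {-1, +1} and takes the hinge function (1 - y f)_+ as the SVM functional; the abstract does not name which functionals the paper studies, so that choice and all function names here are assumptions.

```python
import math

def bernoulli_nll(y, f):
    """Negative log likelihood of label y in {-1, +1} given logit f."""
    return math.log1p(math.exp(-y * f))

def hinge(y, f):
    """Assumed SVM functional: (1 - y f)_+, zero once the margin y*f reaches 1."""
    return max(0.0, 1.0 - y * f)

def misclassification(y, f):
    """Classification error when predicting the sign of the logit."""
    return 1.0 if y * f <= 0 else 0.0

# Compare the three measures of fit over a few margins y*f.
for f in (-2.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    print(f, bernoulli_nll(1, f), hinge(1, f), misclassification(1, f))
```

The hinge function is exactly zero for confidently correct points, which is one way to see why SVM solutions can be more compact than penalized likelihood estimates of the logit.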
Smoothing Spline ANOVA for Exponential Families, with Application to the Wisconsin Epidemiological Study of Diabetic Retinopathy
- ANN. STATIST
, 1995
Cited by 64 (34 self)
Let $y_i$, $i = 1, \ldots, n$ be independent observations with the density of $y_i$ of the form $h(y_i, f_i) = \exp[y_i f_i - b(f_i) + c(y_i)]$, where $b$ and $c$ are given functions and $b$ is twice continuously differentiable and bounded away from 0. Let $f_i = f(t(i))$, where $t = (t_1, \ldots, t_d) \in \mathcal{T}^{(1)} \otimes \cdots \otimes \mathcal{T}^{(d)} = \mathcal{T}$, the $\mathcal{T}^{(\alpha)}$ are measurable spaces of rather general form, and $f$ is an unknown function on $\mathcal{T}$ with some assumed `smoothness' properties. Given $\{y_i, t(i), i = 1, \ldots, n\}$, it is desired to estimate $f(t)$ for $t$ in some region of interest contained in $\mathcal{T}$. We develop the fitting of smoothing spline ANOVA models to this data of the form $f(t) = C + \sum_{\alpha} f_{\alpha}(t_{\alpha}) + \sum_{\alpha < \beta} f_{\alpha\beta}(t_{\alpha}, t_{\beta}) + \cdots$. The components of the decomposition satisfy side conditions which generalize the usual side conditions for parametric ANOVA. The estimate of f is obtained as the minimizer...
A Computationally Efficient Superresolution Image Reconstruction Algorithm
, 2000
Cited by 36 (4 self)
Superresolution reconstruction produces a high-resolution image from a set of low-resolution images. Previous iterative methods for superresolution had not adequately addressed the computational and numerical issues for this ill-conditioned and typically underdetermined large scale problem. We propose efficient block circulant preconditioners for solving the Tikhonov-regularized superresolution problem by the conjugate gradient method. We also extend to underdetermined systems the derivation of the generalized cross-validation method for automatic calculation of regularization parameters. Effectiveness of our preconditioners and regularization techniques is demonstrated with superresolution results for a simulated sequence and a forward looking infrared (FLIR) camera image sequence.
Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV
- Ann. Statist
Cited by 31 (15 self)
We propose the randomized generalized approximate cross validation (ranGACV) method for choosing multiple smoothing parameters in penalized likelihood estimates for Bernoulli data. The method is intended for application with penalized likelihood smoothing spline ANOVA models. In addition we propose a class of approximate numerical methods for solving the penalized likelihood variational problem which, in conjunction with the ranGACV method, allows the application of smoothing spline ANOVA models with Bernoulli data to much larger data sets than previously possible. These methods are based on choosing an approximating subset of the natural (representer) basis functions for the variational problem. Simulation studies with synthetic data, including synthetic data mimicking demographic risk factor data sets, are used to examine the properties of the method and to compare the approach with the GRKPACK code of Wang (1997c). Bayesian "confidence intervals" are obtained for the fits and are shown in the simulation studies to have the "across the function" property usually claimed for these confidence intervals. Finally the method is applied...
Efficient generalized cross-validation with applications to parametric image restoration and resolution enhancement
- IEEE Trans. Image Processing
, 2001
Cited by 27 (6 self)
In many image restoration/resolution enhancement applications, the blurring process, i.e., point spread function (PSF) of the imaging system, is not known or is known only to within a set of parameters. We estimate these PSF parameters for this ill-posed class of inverse problems from raw data, along with the regularization parameters required to stabilize the solution, using the generalized cross-validation method (GCV). We propose efficient approximation techniques based on the Lanczos algorithm and Gauss quadrature theory, reducing the computational complexity of the GCV. Data-driven PSF and regularization parameter estimation experiments with synthetic and real image sequences are presented to demonstrate the effectiveness and robustness of our method. Index Terms: Blind restoration, blur identification, generalized cross-validation, quadrature rules, superresolution.
Adaptive Tuning of Numerical Weather Prediction Models: Simultaneous Estimation of Weighting, Smoothing and Physical Parameters
, 1996
Cited by 21 (7 self)
In Wahba et al. (1995) it was shown how the randomized trace method could be used to adaptively tune numerical weather prediction models via generalized cross validation (GCV) and related methods. In this paper a "toy" four-dimensional data assimilation model is developed (actually one space and one time variable), consisting of an equivalent barotropic vorticity equation on a latitude circle, and used to demonstrate how this technique may be used to simultaneously tune weighting, smoothing and physical parameters. Analyses both with the model as a strong constraint (corresponding to the usual 4D-Var approach) and as a weak constraint (corresponding theoretically to a fixed-interval Kalman smoother) are carried out. The conclusions are limited to the particular toy problem considered, but it can be seen how more elaborate experiments could be carried out, as well as how the method might be applied in practice. We have considered five adjustable parameters, two related to a distributed c...
Variable Selection and Model Building via Likelihood Basis Pursuit
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
Cited by 16 (8 self)
This paper presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. The basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest $l_1$ norm on the coefficients. We use the functional $L_1$ norm to measure the importance of each component and determine the "threshold" value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process, and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large data sets, its randomized version is proposed as well. A technique called "slice modeling" is used to solve the optimization problem and makes the computation more efficient. LBP has great potential for a wide range of research and application areas such as medical studies, and in this paper we apply it to two large on-going epidemiological studies: the Wisconsin Epidemiological Study of Diabetic Retinopathy (WESDR) and the Beaver Dam Eye Study (BDES).
Generalized Cross-Validation for Large Scale Problems
- J. Comput. Graph. Stat
, 1995
Cited by 13 (6 self)
Although generalized cross-validation is a popular tool for calculating a regularization parameter it has been rarely applied to large scale problems until recently. A major difficulty lies in the evaluation of the cross-validation function which requires the calculation of the trace of an inverse matrix. In the last few years stochastic trace estimators have been proposed to alleviate this problem. In this paper numerical approximation techniques are used to further reduce the computational complexity. The new approach employs Gauss quadrature to compute lower and upper bounds on the cross-validation function. It only requires the operator form of the system matrix, i.e., a subroutine to evaluate matrix-vector products. Thus the factorization of large matrices can be avoided. The new approach has been implemented in MATLAB. Numerical experiments confirm the remarkable accuracy of the stochastic trace estimator. Regularization parameters are computed for ill-posed problems with 100, ...
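The role of the stochastic trace estimator in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's MATLAB implementation: it assumes a Tikhonov problem with influence matrix A(lam) = K (K^T K + lam I)^{-1} K^T, uses random +/-1 probe vectors (a Hutchinson-style estimator), and for brevity applies the operator with a dense solve where a large-scale code would use only matrix-vector products. The function names and the toy data are hypothetical.

```python
import numpy as np

def residual_operator(K, lam, v):
    """Apply (I - A(lam)) to v, with A(lam) = K (K^T K + lam I)^{-1} K^T."""
    p = K.shape[1]
    x = np.linalg.solve(K.T @ K + lam * np.eye(p), K.T @ v)
    return v - K @ x

def gcv_exact(K, y, lam):
    """GCV function n * ||(I - A) y||^2 / trace(I - A)^2 with the exact trace."""
    n = K.shape[0]
    I_minus_A = np.column_stack([residual_operator(K, lam, e) for e in np.eye(n)])
    return n * np.sum(residual_operator(K, lam, y) ** 2) / np.trace(I_minus_A) ** 2

def gcv_randomized(K, y, lam, n_probes=200, seed=0):
    """Same function, with trace(I - A) replaced by an average of z^T (I - A) z
    over random probe vectors z with independent +/-1 entries."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    probes = rng.choice([-1.0, 1.0], size=(n_probes, n))
    trace_est = np.mean([z @ residual_operator(K, lam, z) for z in probes])
    return n * np.sum(residual_operator(K, lam, y) ** 2) / trace_est ** 2

# Toy problem, made up for this sketch.
rng = np.random.default_rng(1)
K = rng.standard_normal((40, 10))
y = K @ rng.standard_normal(10) + 0.1 * rng.standard_normal(40)

for lam in (0.01, 0.5, 5.0):
    print(lam, gcv_exact(K, y, lam), gcv_randomized(K, y, lam))
```

The randomized version never forms I - A explicitly, so each probe costs one operator application.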
Some Large Scale Matrix Computation Problems
- J. Comput. Appl. Math
Cited by 13 (6 self)
The central mathematical problem of this report is to bound the quantity $u^T f(A) v$, where $A$ is a given $n \times n$ real matrix, $u$ and $v$ are given $n$-vectors, and $f$ is a given smooth function. Estimating the entries and the trace of the inverse of a matrix and the determinant of a matrix can be classified as such problems. There are a number of interesting applications for such matrix computation problems. The applications in fractal and lattice Quantum Chromodynamics (QCD) are our new motivation for studying such problems. In these applications, the matrices involved are sparse and could be up to the order of millions. It is still a challenging problem to efficiently solve such large matrix computation problems on today's supercomputers.
1 Introduction
The central problem studied in this chapter is to estimate a lower bound $L$ and/or an upper bound $U$, such that $L \le u^T f(A) v \le U$, (1) where $A$ is an $n \times n$ given real matrix, $u$ and $v$ are given $n$-vectors, and $f$ is a given smooth fun...
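One standard way to obtain such bounds is Gauss quadrature driven by the Lanczos algorithm. The sketch below covers only a special case, and every restriction in it is an assumption rather than something stated in this abstract: A is symmetric positive definite, u = v, and f(x) = 1/x, for which the k-point Gauss rule gives a lower bound on u^T A^{-1} u. The function names and the test matrix are hypothetical, and full reorthogonalization is used for simplicity.

```python
import numpy as np

def lanczos(A_mul, u, k):
    """Run up to k Lanczos steps from u and return the tridiagonal matrix T."""
    n = u.shape[0]
    Q = np.zeros((n, k))
    alphas, betas = [], []
    q = u / np.linalg.norm(u)
    for j in range(k):
        Q[:, j] = q
        w = A_mul(q)                      # only matrix-vector products are needed
        alphas.append(q @ w)
        # Full reorthogonalization against all Lanczos vectors so far.
        w = w - Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        beta = np.linalg.norm(w)
        if j == k - 1 or beta < 1e-12:
            break
        betas.append(beta)
        q = w / beta
    m = len(alphas)
    off = betas[: m - 1]
    return np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)

def gauss_quadrature_estimate(A_mul, u, k, f):
    """Gauss rule for u^T f(A) u: ||u||^2 * e_1^T f(T_k) e_1."""
    T = lanczos(A_mul, u, k)
    theta, S = np.linalg.eigh(T)          # quadrature nodes and eigenvectors
    weights = S[0, :] ** 2                # squared first components
    return (u @ u) * np.sum(weights * f(theta))

def inverse(x):
    return 1.0 / x

# Toy symmetric positive definite matrix, made up for this sketch.
rng = np.random.default_rng(0)
B = rng.standard_normal((30, 30))
A = B.T @ B + np.eye(30)
u = rng.standard_normal(30)
exact = u @ np.linalg.solve(A, u)

for k in (2, 5, 10, 30):
    print(k, gauss_quadrature_estimate(lambda v: A @ v, u, k, inverse), exact)
```

Upper bounds are typically obtained from modified rules such as Gauss-Radau, which need an estimate of an extreme eigenvalue of A and are omitted here.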

