Results 1–10 of 198
Computer Experiments
1996
Cited by 119 (6 self)
Abstract:
Deterministic computer simulations of physical phenomena are becoming widely used in science and engineering. Computers are used to describe the flow of air over an airplane wing, combustion of gases in a flame, behavior of a metal structure under stress, safety of a nuclear reactor, and so on. Some of the most widely used computer models, and the ones that led us to work in this area, arise in the design of the semiconductors used in the computers themselves. A process simulator starts with a data structure representing an unprocessed piece of silicon and simulates steps such as oxidation, etching and ion injection that produce a semiconductor device such as a transistor. A device simulator takes a description of such a device and simulates the flow of current through it under varying conditions to determine properties of the device such as its switching speed and the critical voltage at which it switches. A circuit simulator takes a list of devices and the ...
Covariance tapering for interpolation of large spatial datasets
Journal of Computational and Graphical Statistics, 2006
Cited by 97 (9 self)
Abstract:
Interpolation of a spatially correlated random process is used in many areas. The best unbiased linear predictor, often called the kriging predictor in geostatistical science, requires the solution of a large linear system based on the covariance matrix of the observations. In this article, we show that tapering the correct covariance matrix with an appropriate compactly supported covariance function reduces the computational burden significantly and still has an asymptotically optimal mean squared error. The effect of tapering is to create a sparse approximate linear system that can then be solved using sparse matrix algorithms. Extensive Monte Carlo simulations support the theoretical results. An application to a large climatological precipitation dataset is presented as a concrete practical illustration.
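The tapering idea in this abstract can be sketched in a few lines. The exponential covariance, Wendland-type taper, and range parameters below are illustrative choices, not the paper's exact setup:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(0)
n = 400
x = np.sort(rng.uniform(0.0, 50.0, n))            # 1-D observation locations
d = np.abs(x[:, None] - x[None, :])               # pairwise distance matrix

def expcov(h):
    return np.exp(-h)                             # exponential covariance, range 1

def wendland(h, theta=3.0):
    t = np.minimum(h / theta, 1.0)                # compact support: zero beyond theta
    return (1.0 - t) ** 4 * (4.0 * t + 1.0)

K = expcov(d)
K_tap = sparse.csc_matrix(K * wendland(d))        # elementwise taper -> sparse matrix

# Simulate data from the true covariance and krige at a new location s0
y = np.linalg.cholesky(K + 1e-8 * np.eye(n)) @ rng.standard_normal(n)
s0 = 25.0
k0 = expcov(np.abs(x - s0)) * wendland(np.abs(x - s0))
w = spsolve(K_tap, k0)                            # sparse solve of the kriging system
pred = w @ y

print(f"nonzeros: {K_tap.nnz / n**2:.1%} of the full matrix")
```

Because the taper is compactly supported, the tapered system is sparse and can be factored with sparse solvers, which is the source of the computational savings the abstract describes.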
Objective Bayesian analysis of spatially correlated data
Journal of the American Statistical Association, 2001
Cited by 96 (10 self)
Gaussian processes for machine learning
International Journal of Neural Systems, 2004
Cited by 93 (14 self)
Abstract:
Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countably infinite or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible nonparametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback, heavy computational scaling, has recently been alleviated by the introduction of generic sparse approximations [13, 78, 31]. The mathematical literature on GPs is large and often uses deep ...
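As a minimal illustration of the model class this review surveys (a generic sketch, not code from the paper), exact GP regression with a squared-exponential covariance computes a predictive mean and variance from one Cholesky factorization; the fixed hyperparameters here are illustrative:

```python
import numpy as np

def gp_predict(X, y, Xs, ell=1.0, sf2=1.0, noise_var=1e-2):
    """Exact GP regression with a squared-exponential kernel. Hyperparameters
    are fixed for illustration; in practice they would be learned, e.g. by
    maximising the marginal likelihood."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return sf2 * np.exp(-0.5 * d2 / ell ** 2)

    L = np.linalg.cholesky(k(X, X) + noise_var * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = k(Xs, X)
    mean = Ks @ alpha                      # predictive mean
    v = np.linalg.solve(L, Ks.T)
    var = sf2 - np.sum(v * v, axis=0)      # predictive variance (noise-free)
    return mean, var

X = np.linspace(0.0, 5.0, 20)[:, None]
y = np.sin(X[:, 0])
mean, var = gp_predict(X, y, X)
```

The cubic cost of the Cholesky step is exactly the "heavy computational scaling" the abstract mentions, and what the cited sparse approximations attack.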
Fast and Exact Simulation of Stationary Gaussian Processes through Circulant Embedding of the Covariance Matrix
SIAM Journal on Scientific Computing, 1993
"... ..."
Space and Space-Time Modeling Using Process Convolutions
"... . A continuous spatial model can be constructed by convolving a very simple, perhaps independent, process with a kernel or point spread function. This approach for constructing a spatial process o#ers a number of advantages over specification through a spatial covariogram. In particular, this proces ..."
Abstract

Cited by 77 (4 self)
 Add to MetaCart
A continuous spatial model can be constructed by convolving a very simple, perhaps independent, process with a kernel or point-spread function. This approach to constructing a spatial process offers a number of advantages over specification through a spatial covariogram. In particular, the process-convolution specification leads to computational simplifications and easily extends beyond simple stationary models. This paper uses process convolution models to build space and space-time models that are flexible and able to accommodate large amounts of data. Data from environmental monitoring are considered.

1 Introduction

Modeling spatial data with Gaussian processes is the common thread of all geostatistical analyses. Some notable references in this area include Matheron (1963), Journel and Huijbregts (1978), Ripley (1981), Cressie (1991), Wackernagel (1995), and Stein (1999). A common approach is to model spatial dependence through the covariogram c(·), so that the covariance between any t...
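A one-dimensional sketch of the construction described above: independent Gaussian variates on a coarse knot lattice are smoothed with a Gaussian point-spread kernel. The knot spacing and bandwidth here are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
knots = np.linspace(0.0, 10.0, 30)        # coarse lattice carrying the latent process
z = rng.standard_normal(knots.size)       # independent (white) process at the knots

def convolved_process(s, bandwidth=0.8):
    # Gaussian point-spread kernel: the spatial process is the kernel-weighted
    # sum of the latent variates, so it is smooth even though z is white noise
    w = np.exp(-0.5 * ((s[:, None] - knots[None, :]) / bandwidth) ** 2)
    return w @ z

s = np.linspace(0.0, 10.0, 200)
y = convolved_process(s)
```

The implied covariogram is the kernel convolved with itself (here a Gaussian with bandwidth sqrt(2) times larger), and letting the kernel vary with location gives the nonstationary extensions the abstract alludes to.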
Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations
, 2003
"... ii Nonparametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models ..."
Abstract

Cited by 73 (14 self)
 Add to MetaCart
(Show Context)
Nonparametric models and techniques enjoy a growing popularity in the field of machine learning, and among these Bayesian inference for Gaussian process (GP) models has recently received significant attention. We feel that GP priors should be part of the standard toolbox for constructing models relevant to machine learning, in the same way as parametric linear models are, and the results in this thesis help to remove some obstacles on the way towards this goal. In the first main chapter, we provide a distribution-free finite sample bound on the difference between generalisation and empirical (training) error for GP classification methods. While the general theorem (the PAC-Bayesian bound) is not new, we give a much simplified and somewhat generalised derivation and point out the underlying core technique (convex duality) explicitly. Furthermore, the application to GP models is novel (to our knowledge). A central feature of this bound is that its quality depends crucially on task knowledge being encoded ...
On a Connection between Kernel PCA and Metric Multidimensional Scaling
Advances in Neural Information Processing Systems 13, 2001
Cited by 70 (0 self)
Abstract:
In this paper we show that the kernel PCA algorithm of Schölkopf et al. (1998) can be interpreted as a form of metric multidimensional scaling (MDS) when the kernel function k(x, y) is isotropic, i.e. it depends only on ||x − y||. This leads to a metric MDS algorithm where the desired configuration of points is found via the solution of an eigenproblem rather than through the iterative optimization of the stress objective function. The question of kernel choice is also discussed.
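The equivalence is easy to demonstrate numerically. The sketch below (illustrative random data and an RBF kernel, which is isotropic) recovers a 2-D configuration from an eigenproblem on the double-centred kernel matrix, exactly the computation classical metric MDS performs on a dissimilarity matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 3))                     # illustrative data

d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances ||x - y||^2
K = np.exp(-0.5 * d2)                                # isotropic (RBF) kernel

n = K.shape[0]
H = np.eye(n) - np.ones((n, n)) / n                  # centering matrix
B = H @ K @ H                                        # double-centred kernel matrix

# Kernel PCA embedding = classical-MDS-style eigendecomposition of B:
evals, evecs = np.linalg.eigh(B)
order = np.argsort(evals)[::-1]
coords = evecs[:, order[:2]] * np.sqrt(np.maximum(evals[order[:2]], 0.0))
```

The configuration comes from a single eigendecomposition rather than iterative minimization of a stress function, which is the point the abstract makes.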
An Adaptive Ensemble Kalman Filter
, 2000
"... To the extent that model error is nonnegligible in numerical models of the atmosphere, it must be accounted for in 4D atmospheric data assimilation systems. In this study, a method of estimating and accounting for model error in the context of an ensemble Kalman filter technique is developed. The ..."
Abstract

Cited by 53 (0 self)
 Add to MetaCart
To the extent that model error is nonnegligible in numerical models of the atmosphere, it must be accounted for in 4D atmospheric data assimilation systems. In this study, a method of estimating and accounting for model error in the context of an ensemble Kalman filter technique is developed. The method involves parameterizing the model error and using innovations to estimate the model-error parameters. The estimation algorithm is based on a maximum likelihood approach and the study is performed in an idealized environment using a three-level, quasi-geostrophic, T21 model and simulated observations and model error. The use of a ...
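For context, one analysis step of a generic stochastic ensemble Kalman filter looks as follows. This is a textbook sketch; the paper's contribution, estimating model-error parameters from the innovations, is not shown:

```python
import numpy as np

def enkf_analysis(ensemble, H, R, y, rng):
    """One stochastic EnKF analysis step. `ensemble` is (n_state, n_members);
    the gain is built from sample covariances, with perturbed observations."""
    n = ensemble.shape[1]
    A = ensemble - ensemble.mean(axis=1, keepdims=True)     # forecast anomalies
    HX = H @ ensemble
    HA = HX - HX.mean(axis=1, keepdims=True)
    P_hh = HA @ HA.T / (n - 1) + R                          # innovation covariance
    P_xh = A @ HA.T / (n - 1)                               # state-obs covariance
    K = P_xh @ np.linalg.inv(P_hh)                          # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(y.size), R, size=n).T
    return ensemble + K @ (Y - HX)                          # updated ensemble
```

Model error enters this scheme through the forecast covariance; the paper's method inflates or parameterizes that term and fits the parameters by maximum likelihood on the innovations y − HX.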
Computation With Infinite Neural Networks
, 1997
"... For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networ ..."
Abstract

Cited by 41 (1 self)
 Add to MetaCart
For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to carry out Bayesian prediction with infinite networks rather than finite ones.

1 Introduction

To someone training a neural network by maximizing the likelihood of a finite amount of data it makes no sense to use a network with an infinite number of hidden units; the network will "overfit" the data and so will be expected to generalize poorly. However, the idea of selecting the network size depending on the amount of training data makes little sense to a Bayesian; a model should be chosen ...
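For the sigmoidal case the closed form is an arcsine-type covariance. A sketch under a simple isotropic Gaussian weight prior (the prior covariance `sigma2 * I` over the augmented input is an illustrative assumption, not the paper's general formulation):

```python
import numpy as np

def nn_covariance(X1, X2, sigma2=1.0):
    """Arcsine-type covariance of an infinite network of erf (sigmoidal)
    hidden units, assuming a zero-mean Gaussian prior sigma2 * I over the
    input-to-hidden weights and bias, via the augmented input (1, x)."""
    A1 = np.hstack([np.ones((X1.shape[0], 1)), X1])   # augment with bias input
    A2 = np.hstack([np.ones((X2.shape[0], 1)), X2])
    S = 2.0 * sigma2 * (A1 @ A2.T)
    d1 = 1.0 + 2.0 * sigma2 * np.sum(A1 * A1, axis=1)
    d2 = 1.0 + 2.0 * sigma2 * np.sum(A2 * A2, axis=1)
    return (2.0 / np.pi) * np.arcsin(S / np.sqrt(np.outer(d1, d2)))
```

Plugging this kernel into standard GP regression gives predictions from the infinite network directly, with no hidden units ever instantiated, which is the efficiency claim of the abstract.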