Results 1-10 of 203
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
Cited by 259 (24 self)
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
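One way to make the question "How many clusters are there?" concrete is to fit Gaussian mixtures with different numbers of components and compare them with a penalized likelihood. The abstract above does not name a criterion, so the sketch below assumes BIC (in the "larger is better" form), a one-dimensional mixture fitted by EM, and synthetic data; the function names `log_joint`, `fit_gmm_1d`, and `bic` are hypothetical.

```python
import numpy as np

def log_joint(x, weights, means, variances):
    """log(weight_k * N(x_i | mean_k, var_k)) for every point i and component k."""
    return (np.log(weights)
            - 0.5 * np.log(2.0 * np.pi * variances)
            - 0.5 * (x[:, None] - means) ** 2 / variances)

def fit_gmm_1d(x, k, n_iter=200):
    """Run EM for a k-component 1-D Gaussian mixture; return the final log-likelihood."""
    means = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread initial means over the data
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        lj = log_joint(x, weights, means, variances)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1)[:, None])
        # M-step: update weights, means and variances (small floors avoid division by zero).
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    lj = log_joint(x, weights, means, variances)
    return np.logaddexp.reduce(lj, axis=1).sum()

def bic(loglik, k, n):
    """BIC with 3k - 1 free parameters (k means, k variances, k - 1 weights)."""
    return 2.0 * loglik - (3 * k - 1) * np.log(n)

# Synthetic data (illustrative assumption): two well-separated groups.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])

bic_by_k = {k: bic(fit_gmm_1d(x, k), k, x.size) for k in (1, 2, 3)}
best_k = max(bic_by_k, key=bic_by_k.get)
print(bic_by_k, best_k)
```

The same comparison extends to different covariance structures in higher dimensions, which is where the choice of "clustering method" becomes a choice of model.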
Prediction With Gaussian Processes: From Linear Regression To Linear Prediction And Beyond
 Learning and Inference in Graphical Models
, 1997
Cited by 195 (4 self)
The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads into a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems. 1 Introduction In the last decade neural networks have been used to tackle regression and classification problems, with some notable successes. It has also been widely recognized that they form a part of a wide variety of nonlinear statistical techniques that can be used for...
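The change of viewpoint described in this abstract can be checked numerically. The sketch below is not taken from the paper: it assumes a zero-mean Gaussian prior on the weights with covariance `Sigma_p`, a known noise variance `noise_var`, and toy data, all chosen for illustration. It computes the predictive mean and variance of Bayesian linear regression in weight space, then again as a Gaussian process with kernel k(x, x') = x^T Sigma_p x', and the two agree.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative assumption): n training points in d dimensions.
n, d = 20, 3
X = rng.normal(size=(n, d))
noise_var = 0.1
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=np.sqrt(noise_var), size=n)
X_star = rng.normal(size=(5, d))       # test inputs

Sigma_p = np.eye(d)                    # prior covariance of the weights, w ~ N(0, Sigma_p)

# Weight-space view: Gaussian posterior over w, then predict f* = x*^T w.
A = X.T @ X / noise_var + np.linalg.inv(Sigma_p)    # posterior precision of w
w_mean = np.linalg.solve(A, X.T @ y) / noise_var    # posterior mean of w
mean_ws = X_star @ w_mean
var_ws = np.einsum("ij,ij->i", X_star, np.linalg.solve(A, X_star.T).T)

# Function-space view: GP prior over f with the linear kernel x^T Sigma_p x'.
K = X @ Sigma_p @ X.T
K_star = X_star @ Sigma_p @ X.T
K_ss = X_star @ Sigma_p @ X_star.T
G = K + noise_var * np.eye(n)
mean_fs = K_star @ np.linalg.solve(G, y)
var_fs = np.diag(K_ss - K_star @ np.linalg.solve(G, K_star.T))

print(np.allclose(mean_ws, mean_fs), np.allclose(var_ws, var_fs))
```

Replacing the linear kernel with another positive definite covariance function changes only the three lines that build `K`, `K_star`, and `K_ss`.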
Model-based Geostatistics
 Applied Statistics
, 1998
Cited by 96 (4 self)
Conventional geostatistical methodology solves the problem of predicting the realised value of a linear functional of a Gaussian spatial stochastic process, S(x), based on observations Y_i = S(x_i) + Z_i at sampling locations x_i, where the Z_i are mutually independent, zero-mean Gaussian random variables. We describe two spatial applications for which Gaussian distributional assumptions are clearly inappropriate. The first concerns the assessment of residual contamination from nuclear weapons testing on a South Pacific island, in which the sampling method generates spatially indexed Poisson counts conditional on an unobserved spatially varying intensity of radioactivity; we conclude that a conventional geostatistical analysis oversmooths the data and underestimates the spatial extremes of the intensity. The second application provides a description of spatial variation in the risk of campylobacter infections relative to other enteric infections in part of North Lancashire and South C...
Computer Experiments
, 1996
Cited by 67 (5 self)
Introduction Deterministic computer simulations of physical phenomena are becoming widely used in science and engineering. Computers are used to describe the flow of air over an airplane wing, combustion of gases in a flame, behavior of a metal structure under stress, safety of a nuclear reactor, and so on. Some of the most widely used computer models, and the ones that led us to work in this area, arise in the design of the semiconductors used in the computers themselves. A process simulator starts with a data structure representing an unprocessed piece of silicon and simulates the steps such as oxidation, etching and ion injection that produce a semiconductor device such as a transistor. A device simulator takes a description of such a device and simulates the flow of current through it under varying conditions to determine properties of the device such as its switching speed and the critical voltage at which it switches. A circuit simulator takes a list of devices and the
Geostatistical Motion Interpolation
 ACM Transactions on Graphics
, 2005
Cited by 43 (4 self)
Figure 1: Animations synthesized by our motion interpolation in a 5D parametric space. One parameter changes the style of motion from rough to delicate as shown by the bar indicator. The other four parameters are the heights and widths of two successive steps of stairs for gait motions, and the 2D start and end locations of the box for lifting motions. None of the motions required post-cleaning of foot- or hand-sliding. A common motion interpolation technique for realistic human animation is to blend similar motion samples with weighting functions whose parameters are embedded in an abstract space. Existing methods, however, are insensitive to statistical properties, such as correlations between motions. In addition, they lack the capability to quantitatively evaluate the reliability of synthesized motions. This paper proposes a method that treats motion interpolations as statistical predictions of missing data in an arbitrarily definable parametric space. A practical technique of geostatistics, called universal kriging, is then introduced for statistically estimating the correlations between the dissimilarity of motions and the distance
Fast And Exact Simulation Of Stationary Gaussian Processes Through Circulant Embedding Of The Covariance Matrix
, 1997
Cited by 39 (1 self)
Geostatistical simulations often require the generation of numerous realizations of a stationary Gaussian process over a regularly meshed sample grid. This paper shows that for many important correlation functions in geostatistics, realizations of the associated process over m+1 equispaced points on a line can be produced at the cost of an initial FFT of length 2m with each new realization requiring an additional FFT of the same length. In particular, the paper first notes that if an (m+1)×(m+1) Toeplitz correlation matrix R can be embedded in a nonnegative definite 2M×2M circulant matrix S, exact realizations of the normal multivariate y ~ N(0,R) can be generated via FFTs of length 2M. Theoretical results are then presented to demonstrate that for many commonly used correlation structures the minimal embedding in which M = m is nonnegative definite. Extensions to simulations of stationary fields in higher dimensions are also provided and illustrated. Key words. geostatistics, ...
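A minimal sketch of the minimal embedding (M = m) described above, assuming an exponential correlation function with an arbitrary length scale; the function names `circulant_eigenvalues` and `simulate_pair` are hypothetical and not taken from the paper. One FFT gives the eigenvalues of the 2m×2m circulant matrix, and each further FFT of a complex Gaussian vector yields two independent realizations (the real and imaginary parts).

```python
import numpy as np

def circulant_eigenvalues(r):
    """Eigenvalues of the minimal circulant embedding.

    r holds the correlations at lags 0..m (the first row of the Toeplitz matrix R).
    The first row of the circulant matrix is r[0..m] followed by r[m-1..1].
    """
    c = np.concatenate([r, r[-2:0:-1]])      # length 2m
    return np.fft.fft(c).real                # real because c is symmetric

def simulate_pair(r, rng):
    """Return two independent exact draws from N(0, R) using one FFT of length 2m."""
    lam = circulant_eigenvalues(r)
    if lam.min() < -1e-10:
        raise ValueError("minimal embedding is not nonnegative definite")
    lam = np.clip(lam, 0.0, None)            # remove tiny negative rounding errors
    n = lam.size
    z = rng.normal(size=n) + 1j * rng.normal(size=n)
    y = np.fft.fft(np.sqrt(lam / n) * z)
    m1 = len(r)
    return y.real[:m1], y.imag[:m1]

# Illustrative assumption: exponential correlation on m + 1 = 9 equispaced points.
m = 8
lags = np.arange(m + 1)
r = np.exp(-lags / 3.0)
rng = np.random.default_rng(1)
y1, y2 = simulate_pair(r, rng)
print(y1.shape, y2.shape)
```

In repeated use the eigenvalues would be computed once and cached, which matches the cost structure stated in the abstract: one initial FFT plus one FFT per new call.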
Computation With Infinite Neural Networks
, 1997
Cited by 33 (1 self)
For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to carry out Bayesian prediction with infinite networks rather than finite ones. 1 Introduction To someone training a neural network by maximizing the likelihood of a finite amount of data it makes no sense to use a network with an infinite number of hidden units; the network will "overfit" the data and so will be expected to generalize poorly. However, the idea of selecting the network size depending on the amount of training data makes little sense to a Bayesian; a model should be chosen...
Bayesian Variogram Modeling for an Isotropic Spatial Process
 Journal of Agricultural, Biological and Environmental Statistics
, 1997
Cited by 26 (5 self)
The variogram is a basic tool in geostatistics. In the case of an assumed isotropic process, it is used to compare variability of the difference between pairs of observations as a function of their distance. Customary approaches to variogram modeling create an empirical variogram and then fit a valid parametric or nonparametric variogram model to it. Here we adopt a Bayesian approach to variogram modeling. In particular, we seek to analyze a recent data set of scallop catches. We have the results of the analysis of an earlier data set from the region to supply useful prior information. In addition, the Bayesian approach enables inference about any aspect of spatial dependence of interest rather than merely providing a fitted variogram. We utilize discrete mixtures of Bessel functions which allow a rich and flexible class of variogram models. To differentiate between models, we introduce a utility-based model choice criterion that encourages parsimony. We conclude with a fully Bayesian ...
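For reference, the "customary" first step mentioned in this abstract, the empirical variogram, can be sketched as follows. This is the classical method-of-moments estimator (half the mean squared difference of all pairs whose separation falls in a distance bin), not the Bayesian approach of the paper; the function name `empirical_variogram`, the bin edges, and the three-point data set are illustrative assumptions.

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Isotropic empirical semivariogram: 0.5 * mean((z_i - z_j)^2) per distance bin."""
    i, j = np.triu_indices(len(values), k=1)             # every unordered pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=-1)
    sqdiff = (values[i] - values[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)                       # NaN marks empty bins
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sqdiff[mask].mean()
    return gamma, counts

# Tiny illustrative data set: three points on a line.
coords = np.array([[0.0], [1.0], [2.0]])
values = np.array([0.0, 1.0, 3.0])
gamma, counts = empirical_variogram(coords, values, np.array([0.5, 1.5, 2.5]))
print(gamma, counts)   # gamma = [1.25, 4.5], counts = [2, 1]
```

A parametric model would then be fitted to `gamma` against the bin midpoints, usually weighting each bin by its pair count.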
Computing With Infinite Networks
 Advances in Neural Information Processing Systems 9
, 1996
Cited by 22 (2 self)
For neural networks with a wide class of weight priors, it can be shown that in the limit of an infinite number of hidden units the prior over functions tends to a Gaussian process. In this paper analytic forms are derived for the covariance function of the Gaussian processes corresponding to networks with sigmoidal and Gaussian hidden units. This allows predictions to be made efficiently using networks with an infinite number of hidden units, and shows that, somewhat paradoxically, it may be easier to compute with infinite networks than finite ones.
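The abstract does not reproduce the covariance functions themselves. As a hedged illustration, the sketch below assumes the sigmoid is the error function erf and that the input-to-hidden weights (including a bias) are drawn from N(0, Sigma); under those assumptions the expected product of two hidden-unit outputs has the arcsine form coded here. The function name `erf_network_kernel` and all numerical values are hypothetical, and a Monte Carlo average over random hidden units serves as a sanity check.

```python
import math
import numpy as np

def erf_network_kernel(x, x2, Sigma):
    """E[erf(u . xt) * erf(u . xt2)] for u ~ N(0, Sigma), with xt = (1, x)."""
    xt = np.concatenate([[1.0], x])      # prepend the bias input
    xt2 = np.concatenate([[1.0], x2])
    num = 2.0 * xt @ Sigma @ xt2
    den = math.sqrt((1.0 + 2.0 * xt @ Sigma @ xt) * (1.0 + 2.0 * xt2 @ Sigma @ xt2))
    return (2.0 / math.pi) * math.asin(num / den)

# Illustrative assumptions: 2-D inputs and a diagonal weight prior (bias, w1, w2).
Sigma = np.diag([0.5, 1.0, 2.0])
x, x2 = np.array([0.3, -1.0]), np.array([1.2, 0.4])
k_exact = erf_network_kernel(x, x2, Sigma)

# Monte Carlo check: average over many randomly drawn hidden units.
rng = np.random.default_rng(0)
U = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
erf = np.vectorize(math.erf)
h1 = erf(U @ np.concatenate([[1.0], x]))
h2 = erf(U @ np.concatenate([[1.0], x2]))
k_mc = float(np.mean(h1 * h2))
print(k_exact, k_mc)
```

With such a closed form available, prediction reduces to ordinary Gaussian process regression with this kernel, so no hidden units need to be instantiated.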
Geostatistics in soil science: state-of-the-art and perspectives
 Geoderma
, 1999
Cited by 21 (4 self)
This paper presents an overview of the most recent developments in the field of geostatistics and describes their application to soil science. Geostatistics provides descriptive tools such as semivariograms to characterize the spatial pattern of continuous and categorical soil attributes. Various interpolation (kriging) techniques capitalize on the spatial correlation between observations to predict attribute values at unsampled locations using information related to one or several attributes. An important contribution of geostatistics is the assessment of the uncertainty about unsampled values, which usually takes the form of a map of the probability of exceeding critical values, such as regulatory thresholds in soil pollution or criteria for soil quality. This uncertainty assessment can be combined with expert knowledge for decision making such as delineation of contaminated areas where remedial measures should be taken or areas of good soil quality where specific management plans can be developed. Last, stochastic simulation allows one to generate several models (images) of the spatial distribution of soil attribute values, all of which are consistent with the information available. A given scenario (remediation process, land use policy) can be applied to the set of realizations, allowing the uncertainty of the response (remediation efficiency, soil productivity
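As a small illustration of the kriging interpolation mentioned in this abstract, the sketch below implements ordinary kriging at a single target location. It assumes an exponential covariance model with made-up sill and range, no nugget, and a handful of invented sample values; the names `exp_cov` and `ordinary_kriging` are hypothetical and the paper's own variants (for example, those using several attributes) are not covered.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Exponential covariance model C(h) = sill * exp(-h / corr_range) (assumed)."""
    return sill * np.exp(-h / corr_range)

def ordinary_kriging(coords, values, target):
    """Ordinary kriging prediction, kriging variance and weights at one location."""
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - target, axis=-1)
    # Kriging system [[C, 1], [1^T, 0]] [w, mu] = [c0, 1]; the last row forces sum(w) = 1.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(d)
    A[n, n] = 0.0
    b = np.append(exp_cov(d0), 1.0)
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    pred = w @ values
    var = exp_cov(0.0) - w @ exp_cov(d0) - mu
    return pred, var, w

# Invented sample locations and attribute values, for illustration only.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]])
values = np.array([1.2, 0.8, 1.5, 2.1, 1.9])
pred, var, w = ordinary_kriging(coords, values, np.array([1.0, 1.0]))
print(pred, var, w.sum())
```

The kriging variance returned here depends only on the sample geometry and the covariance model, which is one reason simulation-based or probability-based uncertainty measures are often preferred for decision making.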