An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models
Neural Computation, 2001
Cited by 87 (32 self)
A Bayesian ensemble learning method is introduced for unsupervised extraction of dynamic processes from noisy data. The data are assumed to be generated by an unknown nonlinear mapping from unknown factors. The dynamics of the factors are modeled using a nonlinear state-space model. The nonlinear mappings in the model are represented using multilayer perceptron networks. The proposed method is computationally demanding, but it allows the use of higher dimensional nonlinear latent variable models than other existing approaches. Experiments with chaotic data show that the new method is able to blindly estimate the factors and the dynamic process which have generated the data. It clearly outperforms currently available nonlinear prediction techniques in this very difficult test problem.
Bayesian Model Assessment and Comparison Using Cross-Validation Predictive Densities
Neural Computation, 2002
Cited by 26 (10 self)
In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate, as it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use MLP neural networks and Gaussian Processes (GP) with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
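The model comparison described in this abstract can be illustrated with a minimal sketch. It assumes that per-observation utilities (for example, cross-validation log predictive densities) are already available for two models. The function names, the default number of draws, and the use of Dirichlet(1, ..., 1) weights built from exponential variates are illustrative choices, not taken from the paper.

```python
import random


def bayesian_bootstrap_means(values, n_draws=2000, seed=0):
    """Draw samples of the mean utility using Bayesian bootstrap weights.

    Each draw uses Dirichlet(1, ..., 1) weights, generated here by
    normalizing independent Exponential(1) variates.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_draws):
        gammas = [rng.expovariate(1.0) for _ in values]
        total = sum(gammas)
        means.append(sum(g * v for g, v in zip(gammas, values)) / total)
    return means


def prob_first_better(util_a, util_b, n_draws=2000, seed=0):
    """Estimate the probability that model A has higher expected utility than B.

    util_a and util_b are hypothetical per-observation utilities computed
    on the same observations, so the comparison is paired.
    """
    diffs = [a - b for a, b in zip(util_a, util_b)]
    draws = bayesian_bootstrap_means(diffs, n_draws=n_draws, seed=seed)
    return sum(1 for d in draws if d > 0.0) / len(draws)
```

Working on the paired differences rather than on each model separately keeps the comparison focused on the observations where the two models actually disagree.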
Bayesian support vector regression using a unified loss function
IEEE Transactions on Neural Networks, 2004
Cited by 20 (2 self)
In this paper, we use a unified loss function, called the soft insensitive loss function, for Bayesian support vector regression. We follow standard Gaussian processes for regression to set up the Bayesian framework, in which the unified loss function is used in the likelihood evaluation. Under this framework, the maximum a posteriori estimate of the function values corresponds to the solution of an extended support vector regression problem. The overall approach has the merits of support vector regression such as convex quadratic programming and sparsity in solution representation. It also has the advantages of Bayesian methods for model adaptation and error bars of its predictions. Experimental results on simulated and real-world data sets indicate that the approach works well even on large data sets.
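The abstract does not give the formula for the soft insensitive loss function. The sketch below assumes the commonly cited piecewise form: zero inside a narrowed insensitive zone, quadratic in a transition band, and linear outside. The parameter names `epsilon` and `beta` and their default values are assumptions for illustration and should be checked against the paper.

```python
def soft_insensitive_loss(delta, epsilon=0.1, beta=0.3):
    """Soft insensitive loss for a residual `delta` (assumed piecewise form).

    Assumed definition, with epsilon > 0 and 0 < beta <= 1:
      0                                      if |delta| <  (1 - beta) * epsilon
      (|delta| - (1 - beta) * epsilon) ** 2
          / (4 * beta * epsilon)             if |delta| <= (1 + beta) * epsilon
      |delta| - epsilon                      otherwise
    """
    if epsilon <= 0 or not (0 < beta <= 1):
        raise ValueError("requires epsilon > 0 and 0 < beta <= 1")
    a = abs(delta)
    lower = (1 - beta) * epsilon
    upper = (1 + beta) * epsilon
    if a < lower:
        return 0.0
    if a <= upper:
        return (a - lower) ** 2 / (4 * beta * epsilon)
    return a - epsilon
```

Under this form the quadratic and linear pieces both equal `beta * epsilon` at `|delta| = (1 + beta) * epsilon`, so the loss is continuous, and the quadratic band smooths the corner of the epsilon-insensitive loss.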
Generative Probability Density Model in the Self-Organizing Map
Neurocomputing, 2002
Cited by 11 (0 self)
The Self-Organizing Map, SOM, is a widely used tool in exploratory data analysis. A theoretical and practical challenge in the SOM has been the difficulty of treating the method as a statistical model fitting procedure. In this chapter we give a short review of statistical approaches for the SOM. Then we present the probability density model for which the SOM training gives the maximum likelihood estimate. The density model can be used to choose the neighborhood width of the SOM so as to avoid overfitting and to improve the reliability of the results. The density model also gives tools for systematic analysis of the SOM. A major application of the SOM is the analysis of dependencies between variables. We discuss some difficulties in the visual analysis of the SOM and demonstrate how quantitative analysis of the dependencies can be carried out by calculating conditional distributions from the density model.
On Bayesian model assessment and choice using cross-validation predictive densities
2001
Cited by 7 (7 self)
We consider the problem of estimating the distribution of the expected utility of the Bayesian model (expected utility is also known as generalization error). We use the cross-validation predictive densities to compute the expected utilities. We demonstrate that in flexible nonlinear models having many parameters, the importance sampling approximated leave-one-out cross-validation (IS-LOO-CV) proposed in (Gelfand et al., 1992) may not work. We discuss how the reliability of the importance sampling can be evaluated, and where there is reason to doubt that reliability, we suggest using predictive densities from the k-fold cross-validation (k-fold-CV). We also note that the k-fold-CV has to be used if data points have certain dependencies. As the k-fold-CV predictive densities are based on slightly smaller data sets than the full data set, we use a bias correction proposed in (Burman, 1989) when computing the expected utilities. In order to assess the reliability of the estimated expected utilities, we suggest a quick and generic approach based on the Bayesian bootstrap for obtaining samples from the distributions of the expected utilities. Our main goal is to estimate how good (in terms of application field) the predictive ability of the model is, but the distributions of the expected utilities can also be used for comparing different models. With the proposed method, it is easy to compute the probability that one method has better expected utility than some other method. If the predictive likelihood is used as a utility (instead
Bayesian Input Variable Selection Using Posterior Probabilities and Expected Utilities
2002
Cited by 6 (1 self)
We consider the input variable selection in complex Bayesian hierarchical models. Our goal is to find a model with the smallest number of input variables having statistically or practically at least the same expected utility as the full model with all the available inputs. A good estimate for the expected utility can be computed using cross-validation predictive densities. In the case of input selection and a large number of input combinations, the computation of the cross-validation predictive densities for each model easily becomes computationally prohibitive. We propose to use the posterior probabilities obtained via variable dimension MCMC methods to find out potentially useful input combinations, for which the final model choice and assessment is done using the expected utilities.
Mobile User Movement Prediction Using Bayesian Learning for Neural Networks
Cited by 3 (0 self)
Nowadays, path prediction is being extensively examined for use in mobile and wireless computing, with the aim of more efficient network resource management schemes. Path prediction allows the network and services to further enhance the quality of service levels that the user enjoys. In this paper, we present a path prediction algorithm that exploits human habits: a novel hybrid Bayesian neural network model for predicting locations on cellular networks (it can also be extended to other wireless networks such as Wi-Fi and WiMAX). We investigate different parallel implementation techniques of the proposed approach on mobile devices and compare it to standard neural network techniques such as Backpropagation, Elman, Resilient, Levenberg-Marquardt, and One-Step Secant models. In our experiments, we compare results of the proposed Bayesian neural network with 5 standard neural network techniques in predicting both the next location and the next service to request. Bayesian learning for neural networks predicts both location and service better than standard neural network techniques since it uses a well-founded probability model to represent uncertainty about the relationships being learned. The result of Bayesian training is a posterior distribution over network weights. We use Markov chain Monte Carlo (MCMC) methods to sample N values from the posterior weight distribution. These N samples vote for the best prediction. Simulations of the algorithm, performed using realistic mobility patterns, show increased prediction accuracy.
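The voting step mentioned in this abstract can be sketched as a simple majority vote over the predictions made by each posterior weight sample. The abstract does not specify the voting rule, so plurality voting, the function name, and the cell labels below are all assumptions for illustration.

```python
from collections import Counter


def vote(sample_predictions):
    """Return the most frequent prediction and its share of the votes.

    sample_predictions holds one predicted label per posterior weight
    sample, e.g. the next cell predicted by the network under each of the
    N MCMC samples (hypothetical labels, not from the paper).
    """
    if not sample_predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(sample_predictions)
    label, n_votes = counts.most_common(1)[0]
    return label, n_votes / len(sample_predictions)
```

The vote share gives a rough indication of how strongly the posterior samples agree, which a point-estimate network cannot provide.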
Bayesian Input Variable Selection Using Cross-Validation Predictive Densities and Reversible Jump MCMC
2001
Cited by 2 (2 self)
We consider the problem of input variable selection of a Bayesian model. With suitable priors it is possible to have a large number of input variables in Bayesian models, as less relevant inputs can have a smaller effect in the model. To make the model more explainable and easier to analyse, or to reduce the cost of making measurements or the cost of computation, it may be useful to select a smaller set of input variables. Our goal is to find a model with the smallest number of input variables having statistically or practically the same expected utility as the full model. A good estimate for the expected utility, with any desired utility, can be computed using cross-validation predictive densities (Vehtari and Lampinen, 2001). In the case of input selection, there are 2^K input combinations and computing the cross-validation predictive densities for each model easily becomes computationally prohibitive. We propose to use the reversible jump Markov chain Monte Carlo (RJMCMC) method to find out potentially useful input combinations, for which the final model choice and assessment is done using the cross-validation predictive densities. The RJMCMC visits the models according to their posterior probabilities. As models with negligible probability are probably not visited in finite time, the computational savings can be considerable compared to going through all possible models. The posterior probabilities of the models, given by the RJMCMC, are proportional to the product of the prior probabilities of the models and the prior predictive likelihoods of the models. The prior predictive likelihood measures the goodness of the model if no training data were used, and thus can be used to estimate the lower limit of the expected predictive likelihood. These estimates indicate ...
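Two quantities in this abstract lend themselves to a small sketch: the count of input combinations for K candidate inputs, and model posterior probabilities formed from prior probabilities and prior predictive likelihoods. The sketch enumerates every model, which is only feasible for small K and is exactly what RJMCMC is meant to avoid; the function names and the log-space inputs are illustrative assumptions.

```python
import math
from itertools import product


def input_combinations(k):
    """List all 2**k inclusion patterns for k candidate inputs."""
    return list(product((False, True), repeat=k))


def model_posteriors(log_priors, log_prior_predictive_liks):
    """Normalize prior * prior predictive likelihood into probabilities.

    Both arguments are hypothetical per-model log values; subtracting the
    maximum before exponentiating avoids numerical underflow.
    """
    log_post = [p + l for p, l in zip(log_priors, log_prior_predictive_liks)]
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

With K = 20 inputs there are already 1,048,576 combinations, which shows why running cross-validation for every model quickly becomes impractical.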
Model Selection via Predictive Explanatory Power
Helsinki University of Technology, Laboratory of Computational Engineering, 1998
Cited by 2 (0 self)
We consider model selection as a decision problem from a predictive perspective. The optimal Bayesian way of handling model uncertainty is to integrate over model space. Model selection can then be seen as point estimation in the model space. We propose a model selection method based on Kullback-Leibler divergence from the predictive distribution of the full model to the predictive distributions of the submodels. The loss of predictive explanatory power is defined as the expectation of this predictive discrepancy. The goal is to find the simplest submodel whose predictive distribution is similar to that of the full model, that is, the simplest submodel whose loss of explanatory power is acceptable. To compute the expected predictive discrepancy between complex models, for which analytical solutions do not exist, we propose to use predictive distributions obtained via k-fold cross-validation. We compare the performance of the method to posterior probabilities (Bayes factors), the deviance information criterion (DIC) and direct maximization of the expected utility via cross-validation.
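The predictive discrepancy described in this abstract can be illustrated for the special case where both predictive distributions are Gaussian, for which the Kullback-Leibler divergence has a closed form. The Gaussian assumption, the function names, and the simple average over observations are illustrative simplifications; the paper targets complex models where no analytical solution exists.

```python
import math


def gaussian_kl(mean_full, sd_full, mean_sub, sd_sub):
    """KL divergence from N(mean_full, sd_full^2) to N(mean_sub, sd_sub^2)."""
    return (
        math.log(sd_sub / sd_full)
        + (sd_full ** 2 + (mean_full - mean_sub) ** 2) / (2 * sd_sub ** 2)
        - 0.5
    )


def mean_predictive_discrepancy(full_preds, sub_preds):
    """Average the per-observation KL divergence between two models.

    Each argument is a list of hypothetical (mean, sd) pairs, one per
    observation, e.g. from cross-validation predictive distributions.
    """
    pairs = list(zip(full_preds, sub_preds))
    return sum(gaussian_kl(mf, sf, ms, ss) for (mf, sf), (ms, ss) in pairs) / len(pairs)
```

A submodel with a discrepancy near zero predicts almost the same as the full model, which is the criterion the abstract uses for accepting a simpler model.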
MCMC Methods for MLP-network and Gaussian Process and Stuff – A documentation for Matlab
2006
Cited by 2 (0 self)
Version 2.1. MCMCstuff toolbox is a collection of Matlab functions for Bayesian inference with Markov chain Monte Carlo (MCMC) methods. This documentation introduces some of the features available in the toolbox. Introduction includes demonstrations of using Bayesian Multilayer Perceptron (MLP) network and Gaussian process in simple regression and classification problems with a hierarchical automatic relevance determination (ARD) prior for covariate related parameters. The regression problems demonstrate the use of Gaussian and Student’s t-distribution residual models and classification