A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Statistical Inferences of Gene Changes
Bioinformatics, 2001
Cited by 294 (2 self)
Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate the noise, variability, and low replication often typical of microarray data. Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensates for the lack of replication. Availability: The approach is implemented in software called Cyber-T, accessible through a Web interface at www.genomics.uci.edu/software.html. The code is available as Open Source and is written in the freely available statistical language R. Author affiliations include the Department of Biological Chemistry, College of Medicine, University of California, Irvine. Contact: pfbaldi@ics.uci.edu, tdlong@uci.edu.
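The regularization idea in this abstract can be illustrated with a minimal Python sketch. It assumes a simple pseudo-count weighted average of the empirical and background variances; the paper's exact estimator may differ, and the function names, the `nu0` parameter, and the toy data are hypothetical.

```python
import numpy as np

def regularized_variance(sample_var, background_var, n, nu0):
    """Blend empirical and background variance (assumed pseudo-count form).

    nu0 acts as the number of 'virtual' observations backing the
    background variance; nu0 = 0 returns the plain empirical variance.
    """
    return (nu0 * background_var + (n - 1) * sample_var) / (nu0 + n - 1)

def regularized_t(x, y, bg_var_x, bg_var_y, nu0=10.0):
    """Welch-style t statistic with regularized variances."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    vx = regularized_variance(x.var(ddof=1), bg_var_x, len(x), nu0)
    vy = regularized_variance(y.var(ddof=1), bg_var_y, len(y), nu0)
    return (x.mean() - y.mean()) / np.sqrt(vx / len(x) + vy / len(y))

# Toy log-expression values for one gene with only two replicates each.
control = [7.9, 8.1]
treated = [9.0, 9.2]

# With nu0 = 0 the statistic reduces to an ordinary unequal-variance t.
t_plain = regularized_t(control, treated, 0.0, 0.0, nu0=0.0)
# A larger background variance tempers the optimistic two-replicate estimate.
t_reg = regularized_t(control, treated, 0.25, 0.25, nu0=10.0)
```

With so few replicates the empirical variance can be small by chance, so pulling it toward the background variance shrinks the magnitude of the statistic in this toy case.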
Recovering 3D Human Pose from Monocular Images
Cited by 165 (0 self)
We describe a learning-based method for recovering 3D human body pose from single images and monocular image sequences. Our approach requires neither an explicit body model nor prior labelling of body parts in the image. Instead, it recovers pose by direct nonlinear regression against shape descriptor vectors extracted automatically from image silhouettes. For robustness against local silhouette segmentation errors, silhouette shape is encoded by histogram-of-shape-contexts descriptors. We evaluate several different regression methods: ridge regression, Relevance Vector Machine (RVM) regression and Support Vector Machine (SVM) regression over both linear and kernel bases. The RVMs provide much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. Loss of depth and limb labelling information often makes the recovery of 3D pose from single silhouettes ambiguous. We propose two solutions to this: the first embeds the method in a tracking framework, using dynamics from the previous state estimate to disambiguate the pose; the second uses a mixture of regressors framework to return multiple solutions for each silhouette. We show that the resulting system tracks long sequences stably, and is also capable of accurately reconstructing 3D human pose from single images, giving multiple possible solutions in ambiguous cases. For realism and good generalization over a wide range of viewpoints, we train the regressors on images resynthesized from real human motion capture data. The method is demonstrated on a 54-parameter full body pose model, both quantitatively on independent but similar test data, and qualitatively on real image sequences. Mean angular errors of 4–5 degrees are obtained, a factor of 3 better than the current state of the art for the much simpler upper body problem.
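Of the regressors listed in this abstract, ridge regression over a linear basis is the simplest to sketch. The Python example below is illustrative only: random vectors stand in for the shape descriptors and pose parameters, and the dimensions, `fit_ridge`, and `lam` are assumptions rather than details from the paper.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-3):
    """Closed-form ridge regression: W = (X^T X + lam I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)

# Stand-in data: 200 'silhouette descriptors' of dimension 20 mapped
# linearly (plus small noise) to 3 'pose parameters'.
X = rng.normal(size=(200, 20))
W_true = rng.normal(size=(20, 3))
Y = X @ W_true + 0.01 * rng.normal(size=(200, 3))

W = fit_ridge(X, Y)
pred = X @ W  # predicted pose vectors, one row per descriptor
```

A kernel variant would replace the rows of `X` with kernel evaluations against a set of basis examples before solving the same system.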
Ensemble learning for independent component analysis
in Advances in Independent Component Analysis, 2000
Cited by 49 (2 self)
This thesis is concerned with the problem of Blind Source Separation. Specifically, we consider the Independent Component Analysis (ICA) model in which a set of observations is modelled by $x_t = A s_t$ (1), where $A$ is an unknown mixing matrix and $s_t$ is a vector of hidden source components at time $t$. The ICA problem is to find the sources given only a set of observations. In chapter 1, the blind source separation problem is introduced. In chapter 2 the method of Ensemble Learning is explained. Chapter 3 applies Ensemble Learning to the ICA model and chapter 4 assesses the use of Ensemble Learning for model selection. Chapters 5–7 apply the Ensemble Learning ICA algorithm to data sets from physics (a medical imaging data set consisting of images of a tooth), biology (data sets from cDNA microarrays) and astrophysics (Planck image separation and galaxy spectra separation).
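The generative model in equation (1) can be made concrete with a small Python sketch. It only simulates the mixing and shows that the sources are recoverable when the mixing matrix is known; the hard part of ICA, which the thesis addresses with Ensemble Learning, is estimating the matrix from the observations alone. The matrix values, source distribution, and sample count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000

# Two hidden non-Gaussian sources, one column per time step t.
S = rng.laplace(size=(2, T))

# Mixing matrix (unknown to an ICA algorithm, fixed here for illustration).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

# Observations x_t = A s_t for every t.
X = A @ S

# If A were known, the sources would follow by solving the linear system.
S_rec = np.linalg.solve(A, X)
```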
Assessing approximate inference for binary Gaussian process classification
Journal of Machine Learning Research, 2005
Cited by 40 (3 self)
Gaussian process priors can be used to define flexible, probabilistic classification models. Unfortunately exact Bayesian inference is analytically intractable and various approximation techniques have been proposed. In this work we review and compare Laplace’s method and Expectation Propagation for approximate Bayesian inference in the binary Gaussian process classification model. We present a comprehensive comparison of the approximations, their predictive performance and marginal likelihood estimates to results obtained by MCMC sampling. We explain theoretically and corroborate empirically the advantages of Expectation Propagation compared to Laplace’s method. Keywords: Gaussian process priors, probabilistic classification, Laplace’s approximation, expectation propagation, marginal likelihood, evidence, MCMC
A new view of automatic relevance determination
In NIPS 20, 2008
Cited by 38 (8 self)
Automatic relevance determination (ARD) and the closely related sparse Bayesian learning (SBL) framework are effective tools for pruning large numbers of irrelevant features, leading to a sparse explanatory subset. However, popular update rules used for ARD are either difficult to extend to more general problems of interest or are characterized by non-ideal convergence properties. Moreover, it remains unclear exactly how ARD relates to more traditional MAP estimation-based methods for learning sparse representations (e.g., the Lasso). This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues. First, the proposed reformulation of ARD can naturally be optimized by solving a series of reweighted ℓ1 problems. The result is an efficient, extensible algorithm that can be implemented using standard convex programming toolboxes and is guaranteed to converge to a local minimum (or saddle point). Secondly, the analysis reveals that ARD is exactly equivalent to performing standard MAP estimation in weight space using a particular feature- and noise-dependent, non-factorial weight prior. We then demonstrate that this implicit prior maintains several desirable advantages over conventional priors with respect to feature selection. Overall these results suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations. In particular, the methodology readily extends to handle problems such as non-negative sparse coding and covariance component estimation.
Variational EM algorithms for non-Gaussian latent variable models
Advances in Neural Information Processing Systems 18, 2006
Cited by 37 (13 self)
We consider criteria for variational representations of non-Gaussian latent variables, and derive variational EM algorithms in general form. We establish a general equivalence among convex bounding methods, evidence-based methods, and ensemble learning/Variational Bayes methods, which has previously been demonstrated only for particular cases.
Speaker and session variability in GMM-based speaker verification
IEEE Trans. Audio, Speech, Lang. Process., 2007
Cited by 29 (7 self)
We present a corpus-based approach to speaker verification in which maximum likelihood II criteria are used to train a large-scale generative model of speaker and session variability which we call joint factor analysis. Enrolling a target speaker consists in calculating the posterior distribution of the hidden variables in the factor analysis model, and verification tests are conducted using a new type of likelihood II ratio statistic. Using the NIST 1999 and 2000 speaker recognition evaluation data sets, we show that the effectiveness of this approach depends on the availability of a training corpus which is well matched with the evaluation set used for testing. Experiments on the NIST 1999 evaluation set using a mismatched corpus to train factor analysis models did not result in any improvement over standard methods, but we found that, even with this type of mismatch, feature warping performs extremely well in conjunction with the factor analysis model and this enabled us to obtain very good results (equal error rates of about 6.2%). Index terms: speaker verification, Gaussian mixture, factor analysis
Bayesian framework for least squares support vector machine classifiers, Gaussian processes and kernel Fisher discriminant analysis
Neural Computation, 2002
Cited by 19 (7 self)
The Bayesian evidence framework has been successfully applied to the design of multilayer perceptrons (MLPs) in the work of MacKay. Nevertheless, the training of MLPs suffers from drawbacks like the nonconvex optimization problem and the choice of the number of hidden units. In Support Vector Machines (SVMs) for classification, as introduced by Vapnik, a nonlinear decision boundary is obtained by mapping the input vector first in a nonlinear way to a high-dimensional kernel-induced feature space in which a linear large margin classifier is constructed. Practical expressions are formulated in the dual space in terms of the related kernel function and the solution follows from a (convex) quadratic programming (QP) problem. In Least Squares SVMs (LS-SVMs), the SVM problem formulation is modified by introducing a least squares cost function and equality instead of inequality constraints and the solution follows from a linear system in the dual space. Implicitly, the least squares formulation corresponds to a regression formulation and is also related to kernel
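The statement that the LS-SVM solution follows from a linear system in the dual space can be sketched in Python. The example below uses the commonly cited LS-SVM classifier system with an RBF kernel and omits the Bayesian evidence framework entirely; the helper names, the regularization constant `gamma`, the kernel width `sigma`, and the toy data are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]."""
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]  # bias b, dual variables alpha

def lssvm_predict(X_train, y, b, alpha, X_new, sigma=1.0):
    """Classify with sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ (alpha * y) + b)

# Toy two-class problem with well-separated clusters and labels in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
```

Unlike the QP of a standard SVM, a single call to a linear solver suffices here, at the cost of losing sparsity in `alpha`.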
The Bayesian Backfitting Relevance Vector Machine
In Proceedings of the 21st International Conference on Machine Learning, 2004
Cited by 18 (7 self)
Traditional nonparametric statistical learning techniques are often computationally attractive, but lack the same generalization and model selection abilities as state-of-the-art Bayesian algorithms which, however, are usually computationally prohibitive. This paper makes several important contributions that allow Bayesian learning to scale to more complex, real-world learning scenarios. Firstly, we show that backfitting (a traditional nonparametric, yet highly efficient regression tool) can be derived in a novel formulation within an expectation maximization (EM) framework and thus can finally be given a probabilistic interpretation. Secondly, we show that the general framework of sparse Bayesian learning, and in particular the relevance vector machine (RVM), can be derived as a highly efficient algorithm using a Bayesian version of backfitting at its core. As we demonstrate on several regression and classification benchmarks, Bayesian backfitting offers a compelling alternative to current regression methods, especially when the size and dimensionality of the data challenge computational resources.