Results 1  10
of
82
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
An interiorpoint method for largescale l1regularized logistic regression
 Journal of Machine Learning Research
, 2007
"... Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand ..."
Abstract

Cited by 153 (6 self)
 Add to MetaCart
Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand or so features and examples can be solved in seconds on a PC; medium sized problems, with tens of thousands of features and examples, can be solved in tens of seconds (assuming some sparsity in the data). A variation on the basic method, that uses a preconditioned conjugate gradient method to compute the search step, can solve very large problems, with a million features and examples (e.g., the 20 Newsgroups data set), in a few minutes, on a PC. Using warmstart techniques, a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
Tutorial on Variational Approximation Methods
 In Advanced Mean Field Methods: Theory and Practice
, 2000
"... We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as ecient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. T ..."
Abstract

Cited by 73 (1 self)
 Add to MetaCart
We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as ecient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. The emphasis of this tutorial is on illustrating how inference and estimation problems can be transformed into variational form along with describing the resulting approximation algorithms and their properties insofar as these are currently known. 1 Introduction The term variational methods refers to a large collection of optimization techniques. The classical context for these methods involves nding the extremum of an integral depending on an unknown function and its derivatives. This classical de nition, however, and the accompanying calculus of variation no longer adequately characterizes modern variational methods. Modern variational approaches have become indispensable tools in...
Bayesian indoor positioning systems
 In Infocom
, 2005
"... Abstract — In this paper, we introduce a new approach to location estimation where, instead of locating a single client, we simultaneously locate a set of wireless clients. We present a Bayesian hierarchical model for indoor location estimation in wireless networks. We demonstrate that our model ach ..."
Abstract

Cited by 67 (13 self)
 Add to MetaCart
Abstract — In this paper, we introduce a new approach to location estimation where, instead of locating a single client, we simultaneously locate a set of wireless clients. We present a Bayesian hierarchical model for indoor location estimation in wireless networks. We demonstrate that our model achieves accuracy that is similar to other published models and algorithms. By harnessing prior knowledge, our model eliminates the requirement for training data as compared with existing approaches, thereby introducing the notion of a fully adaptive zero profiling approach to location estimation. Index Terms — Experimentation with real networks/Testbed, Statistics, WLAN, localization,
Parameter estimation in TV image restoration using variational distribution approximation
 IEEE TRANS. IMAGE PROCESSING
, 2008
"... In this paper, we propose novel algorithms for total variation (TV) based image restoration and parameter estimation utilizing variational distribution approximations. Within the hierarchical Bayesian formulation, the reconstructed image and the unknown hyperparameters for the image prior and the no ..."
Abstract

Cited by 45 (27 self)
 Add to MetaCart
In this paper, we propose novel algorithms for total variation (TV) based image restoration and parameter estimation utilizing variational distribution approximations. Within the hierarchical Bayesian formulation, the reconstructed image and the unknown hyperparameters for the image prior and the noise are simultaneously estimated. The proposed algorithms provide approximations to the posterior distributions of the latent variables using variational methods. We show that some of the current approaches to TVbased image restoration are special cases of our framework. Experimental results show that the proposed approaches provide competitive performance without any assumptions about unknown hyperparameters and clearly outperform existing methods when additional information is included.
Learning with Matrix Factorization
, 2004
"... Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or highdimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning ..."
Abstract

Cited by 38 (4 self)
 Add to MetaCart
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or highdimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent
Discriminative, Generative and Imitative Learning
, 2002
"... I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain specif ..."
Abstract

Cited by 34 (1 self)
 Add to MetaCart
I propose a common framework that combines three different paradigms in machine learning: generative, discriminative and imitative learning. A generative probabilistic distribution is a principled way to model many machine learning and machine perception problems. Therein, one provides domain specific knowledge in terms of structure and parameter priors over the joint space of variables. Bayesian networks and Bayesian statistics provide a rich and flexible language for specifying this knowledge and subsequently refining it with data and observations. The final result is a distribution that is a good generator of novel exemplars.
Variational Mixture of Bayesian Independent Component Analysers
 Neural Computation
, 2002
"... There has been growing interest in subspace data modelling over the past few years. Methods such as Principal Component Analysis, Factor Analysis and Independent Component Analysis have gained in popularity and have found many applications in image modelling, signal processing and data compression t ..."
Abstract

Cited by 22 (5 self)
 Add to MetaCart
There has been growing interest in subspace data modelling over the past few years. Methods such as Principal Component Analysis, Factor Analysis and Independent Component Analysis have gained in popularity and have found many applications in image modelling, signal processing and data compression to name just a few. As applications and computing power grow, more and more sophisticated analyses and meaningful representations are sought. Mixture modelling methods have been proposed for principal and factor analysers which exploit local Gaussian features in the subspace manifolds. Meaningful representations may be lost, however, if these local features are nonGaussian and/or discontinuous. In this paper we propose extending the Gaussian analysers mixture model to an Independent Component Analysers mixture model. We employ recent developments in variational Bayesian inference and structure determination to construct a novel approach for modelling nonGaussian, discontinuous manifolds. We automaticaly determine the local dimensionality of each manifold and use variational inference to calculate the optimum number of ICA components needed in our mixture model. We demonstrate our framework on complex synthetic data and illustrate its application to real data by decomposing functional Magnetic Resonance Images into meaningful  and medically useful  features.
The Bayesian Backfitting Relevance Vector Machine
 IN PROCEEDINGS OF THE 21ST INTERNATIONAL CONFERENCE ON MACHINE LEARNING
, 2004
"... Traditional nonparametric statistical learning techniques are often computationally attractive, but lack the same generalization and model selection abilities as stateoftheart Bayesian algorithms which, however, are usually computationally prohibitive. This paper makes several important co ..."
Abstract

Cited by 18 (7 self)
 Add to MetaCart
Traditional nonparametric statistical learning techniques are often computationally attractive, but lack the same generalization and model selection abilities as stateoftheart Bayesian algorithms which, however, are usually computationally prohibitive. This paper makes several important contributions that allow Bayesian learning to scale to more complex, realworld learning scenarios. Firstly, we show that backfitting  a traditional nonparametric, yet highly e#cient regression tool  can be derived in a novel formulation within an expectation maximization (EM) framework and thus can finally be given a probabilistic interpretation. Secondly, we show that the general framework of sparse Bayesian learning and in particular the relevance vector machine (RVM), can be derived as a highly e#cient algorithm using a Bayesian version of backfitting at its core. As we demonstrate on several regression and classification benchmarks, Bayesian backfitting o#ers a compelling alternative to current regression methods, especially when the size and dimensionality of the data challenge computational resources.