Learning with mixtures of trees
Journal of Machine Learning Research, 2000
Cited by 146 (2 self)
Abstract: This paper describes the mixtures-of-trees model, a probabilistic model for discrete multidimensional domains. Mixtures-of-trees generalize the probabilistic trees of Chow and Liu [6] in a different and complementary direction to that of Bayesian networks. We present efficient algorithms for learning mixtures-of-trees models in maximum likelihood and Bayesian frameworks. We also discuss additional efficiencies that can be obtained when data are "sparse," and we present data structures and algorithms that exploit such sparseness. Experimental results demonstrate the performance of the model for both density estimation and classification. We also discuss the sense in which tree-based classifiers perform an implicit form of feature selection, and demonstrate a resulting insensitivity to irrelevant attributes.

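The Chow-Liu procedure that mixtures-of-trees build on is concrete enough to sketch: fit a tree over the variables by taking the maximum-weight spanning tree of pairwise empirical mutual information. A minimal Python sketch, assuming binary data (function and variable names are illustrative, not from the paper):

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    """Empirical mutual information between two binary columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(data):
    """Edges of the maximum-weight spanning tree over pairwise
    mutual information (Kruskal with union-find)."""
    d = data.shape[1]
    edges = sorted(
        ((mutual_information(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True)
    parent = list(range(d))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

rng = np.random.default_rng(0)
x0 = rng.integers(0, 2, 500)
x1 = (x0 ^ (rng.random(500) < 0.1)).astype(int)  # strongly coupled to x0
x2 = rng.integers(0, 2, 500)                     # independent noise
data = np.column_stack([x0, x1, x2])
print(chow_liu_tree(data))  # the coupled pair (0, 1) appears in the tree
```

A mixture of trees would fit several such trees with EM, weighting each data point by its responsibility under each component tree.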
An Expectation Maximization approach to the synergy between image segmentation and object categorization
In: ICCV, 2005
Cited by 13 (7 self)
Abstract: In this work we deal with the problem of modelling and exploiting the interaction between the processes of image segmentation and object categorization. We propose a novel framework to address this problem that is based on the combination of the Expectation Maximization (EM) algorithm and generative models for object categories. Using a concise formulation of the interaction between these two processes, segmentation is interpreted as the E step, assigning observations to models, whereas object detection/analysis is modelled as the M step, fitting models to observations. We present in detail the segmentation and detection processes comprising the E and M steps and demonstrate results on the joint detection and segmentation of the object categories of faces and cars.

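The E-step/M-step division the abstract describes can be illustrated with the simplest possible instance, a 1-D two-component Gaussian mixture. This is a generic EM sketch, not the paper's segmentation model; all names are illustrative:

```python
import numpy as np

def em_gmm(x, iters=50):
    mu = np.array([x.min(), x.max()])   # crude initialisation
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E step: softly assign each observation to a component
        lik = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
        resp = pi * lik
        resp /= resp.sum(axis=1, keepdims=True)
        # M step: refit each component to its weighted observations
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(x)
    return mu, sigma, pi

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 0.5, 300)])
mu, sigma, pi = em_gmm(x)
print(np.sort(mu))  # close to the true means -2 and 3
```

In the paper's setting the "observations" are image regions and the "models" are generative object-category models, but the alternation has exactly this shape.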
Building Blocks for Variational Bayesian Learning of Latent Variable Models
Journal of Machine Learning Research, 2006
Cited by 12 (8 self)
Abstract: We introduce standardised building blocks designed to be used with variational Bayesian learning. The blocks include Gaussian variables, summation, multiplication, nonlinearity, and delay. A large variety of latent variable models can be constructed from these blocks, including variance models and nonlinear modelling, which are lacking from most existing variational systems. The introduced blocks are designed to fit together and to yield efficient update rules. Practical implementation of various models is easy thanks to an associated software package which derives the learning formulas automatically once a specific model structure has been fixed. Variational Bayesian learning provides a cost function which is used both for updating the variables of the model and for optimising the model structure. All the computations can be carried out locally, resulting in linear computational complexity. We present ...

Noisy-OR Component Analysis and its Application to Link Analysis
Journal of Machine Learning Research, 2006
Cited by 12 (0 self)
Abstract: We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model that assumes the expression of observed high-dimensional binary data is driven by a small number of hidden binary sources combined via noisy-or units. The component analysis procedure is equivalent to learning of NOCA parameters. Since the classical EM formulation of the NOCA learning problem is intractable, we develop its variational approximation. We test the NOCA framework on two problems: (1) a synthetic image-decomposition problem and (2) a co-citation data analysis problem for thousands of CiteSeer documents. We demonstrate good performance of the new model on both problems. In addition, we contrast the model to two mixture-based latent-factor models: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA).

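The noisy-or unit at the heart of NOCA is easy to state: an observed bit turns on unless every active hidden source independently fails to activate it. A minimal sketch (the parameter names and the leak term are illustrative assumptions, not the paper's notation):

```python
import numpy as np

def noisy_or(s, p, leak=0.01):
    """P(x = 1 | hidden sources s), where p[i] is the probability that
    active source i activates the observation, and `leak` allows
    spontaneous activation with no source active."""
    return 1.0 - (1.0 - leak) * np.prod((1.0 - p) ** s)

p = np.array([0.8, 0.6])               # source-to-observation strengths
print(noisy_or(np.array([1, 0]), p))   # only source 0 active
print(noisy_or(np.array([1, 1]), p))   # both active: probability rises
```

Learning NOCA means inferring the binary sources s and the weights p from data; it is the product over failure terms above that makes exact EM intractable and motivates the paper's variational approximation.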
A Gaussian process latent variable model formulation of canonical correlation analysis
Proceedings of the 14th European Symposium on Artificial Neural Networks, 2006
Cited by 7 (1 self)
Abstract: We investigate a nonparametric model with which to visualize the relationship between two datasets. We base our model on Gaussian Process Latent Variable Models (GPLVM) [1], [2], a probabilistically defined latent variable model which takes the alternative approach of marginalizing the parameters and optimizing the latent variables; we optimize a latent variable set for each dataset, which preserves the correlations between the datasets, resulting in a GPLVM formulation of canonical correlation analysis which can be nonlinearised by choice of covariance function.

Evolutionary Continuous Optimization by Distribution Estimation with Variational Bayesian Independent Component Analyzers Mixture Model
In: Proceedings of Parallel Problem Solving from Nature VIII, 2004
Cited by 5 (0 self)
Abstract: In evolutionary continuous optimization by building and using probabilistic models, the multivariate Gaussian distribution and its variants or extensions, such as the mixture of Gaussians, have been widely used. However, this Gaussian assumption is often violated in many real problems. In this paper, we propose new continuous estimation of distribution algorithms (EDAs) with the variational Bayesian independent component analyzers mixture model (vbICA-MM), allowing any distribution to be modeled. We examine how this sophisticated density estimation technique influences the performance of the optimization by employing the same selection and population alternation schemes used in previous EDAs. Our experimental results support that the presented EDAs achieve better performance than previous EDAs with ICA and Gaussian mixture or kernel-based approaches.

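The EDA loop the abstract refers to is simple to sketch with the baseline single-Gaussian model in place of the paper's vbICA-MM density estimator: select the fittest individuals, fit a distribution to them, and sample the next population from it. A toy sketch (function and parameter names are illustrative):

```python
import numpy as np

def gaussian_eda(objective, dim=5, pop=100, elite=30, gens=60, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, size=(pop, dim))       # initial population
    for _ in range(gens):
        fitness = np.apply_along_axis(objective, 1, x)
        best = x[np.argsort(fitness)[:elite]]     # truncation selection
        # model-building step: fit an axis-aligned Gaussian to the elite
        mu, sigma = best.mean(axis=0), best.std(axis=0) + 1e-6
        x = rng.normal(mu, sigma, size=(pop, dim))  # sample new population
    return x[np.argmin(np.apply_along_axis(objective, 1, x))]

sphere = lambda v: float(np.sum(v ** 2))
best = gaussian_eda(sphere)
print(sphere(best))  # near 0 on this unimodal test function
```

The paper's contribution is to replace the model-building step with a far more flexible density estimator, which matters when the selected individuals are not Gaussian-distributed.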
Reasoning about Independence in Probabilistic Models of Relational Data
2013
Cited by 5 (3 self)
Abstract: The rules of d-separation provide a theoretical and algorithmic framework for deriving conditional independence facts from model structure. However, this theory only applies to Bayesian networks. Many real-world systems are characterized by interacting heterogeneous entities and probabilistic dependencies that cross the boundaries of entities. Consequently, researchers have developed extensions to Bayesian networks that can represent these relational dependencies. We show that the theory of d-separation inaccurately infers conditional independence when applied directly to the structure of probabilistic models of relational data. We introduce relational d-separation, a theory for deriving conditional independence facts from relational models, and we provide a new representation, the abstract ground graph, that enables a sound, complete, and computationally efficient method for answering d-separation queries about relational models.

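Classical d-separation on an ordinary Bayesian network, the theory this paper extends, can be sketched via the standard reduction: X and Y are d-separated given Z iff they are disconnected in the moralised graph of the ancestors of X, Y, and Z with Z removed. The graph and names below are illustrative:

```python
from itertools import combinations

def ancestors(nodes, parents):
    """All nodes in `nodes` plus their ancestors under `parents`."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(x, y, z, parents):
    anc = ancestors({x, y} | set(z), parents)
    # moralise: connect co-parents, drop edge directions
    adj = {n: set() for n in anc}
    for child in anc:
        ps = [p for p in parents.get(child, ()) if p in anc]
        for p in ps:
            adj[p].add(child); adj[child].add(p)
        for a, b in combinations(ps, 2):
            adj[a].add(b); adj[b].add(a)
    # remove Z, then test whether y is reachable from x
    blocked = set(z)
    stack, seen = [x], {x}
    while stack:
        n = stack.pop()
        if n == y:
            return False
        for m in adj[n] - blocked:
            if m not in seen:
                seen.add(m); stack.append(m)
    return True

# classic collider: A -> C <- B
parents = {"C": ["A", "B"]}
print(d_separated("A", "B", [], parents))     # marginally independent
print(d_separated("A", "B", ["C"], parents))  # conditioning opens the path
```

The paper's point is that running this procedure directly on the structure of a relational model gives wrong answers; the abstract ground graph is the representation on which such queries become sound and complete again.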
High-D Data Visualization Methods via Probabilistic Principal Surfaces for Data Mining Applications
In: Chang, S.K. (Ed.), Multimedia Databases and Image Communication, Series on Software Engineering and Knowledge Engineering, World Scientific, Salerno, 2004
Cited by 4 (3 self)
Abstract: One of the central problems in pattern recognition is that of input data probability density function (pdf) estimation, i.e., the construction of a model of a probability distribution given a finite sample of data drawn from that distribution. Probabilistic Principal Surfaces (hereinafter PPS) is a nonlinear latent variable model providing a way to accomplish pdf estimation, and it possesses two attractive aspects useful for a wide range of data mining applications: (1) visualization of high-dimensional data and (2) their classification. PPS generates a nonlinear manifold passing through the data points, defined in terms of a number of latent variables and a nonlinear mapping from latent space to data space. Depending upon the dimensionality of the latent space (usually at most 3-dimensional) one has 1-D, 2-D, or 3-D manifolds. Among the 3-D manifolds, PPS permits building a spherical manifold where the latent variables are uniformly arranged on a unit sphere. This particular form of the manifold provides a very effective tool to reduce the problems deriving from the curse of dimensionality as data dimension increases. In this paper we concentrate on PPS used as a visualization tool, proposing a number of plot options and showing its effectiveness on two complex astronomical data sets.