Results 1–10 of 16
Empirical Analysis of Predictive Algorithms for Collaborative Filtering
 Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence
, 1998
Modelling gene expression data using dynamic Bayesian networks
, 1999
Abstract

Cited by 157 (1 self)
Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs. Keywords: Genetic networks, boolean networks, Bayesian networks, neural networks, reverse engineering, machine learning.
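The abstract's central claim, that a Boolean network is a DBN whose conditional distributions are deterministic, can be sketched with a hypothetical three-gene example; the genes and update rules below are invented for illustration, not taken from any of the cited models:

```python
import itertools

# Toy three-gene Boolean network viewed as a dynamic Bayesian network
# whose conditional distributions P(X_i[t+1] | parents[t]) are
# deterministic (all probability mass on one successor state).
# The update rules are hypothetical.
def step(state):
    a, b, c = state
    return (
        b and not c,   # A[t+1] = B[t] AND NOT C[t]
        a or c,        # B[t+1] = A[t] OR C[t]
        not a,         # C[t+1] = NOT A[t]
    )

def trajectory(state, n):
    """Roll the network forward n steps from an initial state."""
    states = [state]
    for _ in range(n):
        state = step(state)
        states.append(state)
    return states

# The deterministic transition acts on a finite (2^3) state space,
# so every trajectory eventually enters an attractor.
for init in itertools.product((False, True), repeat=3):
    print(init, '->', trajectory(init, 4)[-1])
```

A general DBN would replace each deterministic rule with a conditional probability table over the parents in the previous time slice, which is what gives the model class its ability to represent stochasticity and hidden variables.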
An experimental comparison of several clustering and initialization methods
, 1998
Abstract

Cited by 78 (0 self)
We examine methods for clustering in high dimensions. In the first part of the paper, we perform an experimental comparison between three batch clustering algorithms: the Expectation–Maximization (EM) algorithm, a “winner take all” version of the EM algorithm reminiscent of the K-means algorithm, and model-based hierarchical agglomerative clustering. We learn naive-Bayes models with a hidden root node, using high-dimensional discrete-variable data sets (both real and synthetic). We find that the EM algorithm significantly outperforms the other methods, and proceed to investigate the effect of various initialization schemes on the final solution produced by the EM algorithm. The initializations that we consider are (1) parameters sampled from an uninformative prior, (2) random perturbations of the marginal distribution of the data, and (3) the output of hierarchical agglomerative clustering. Although the methods are substantially different, they lead to learned models that are strikingly similar in quality.
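As a rough illustration of the kind of batch EM being compared, here is a minimal pure-Python EM for a two-component naive-Bayes model with a hidden root (class) node over binary features. The data and initialization are synthetic toys, not the paper's experimental setup:

```python
# Minimal EM for a two-component naive-Bayes mixture over binary
# features. theta[k][j] = P(x_j = 1 | class k), pi = P(class 0).
# A sketch of the general technique, not the authors' implementation.
def em_bernoulli_mixture(data, n_iter=50):
    d = len(data[0])
    pi = 0.5
    theta = [[0.3] * d, [0.7] * d]
    for _ in range(n_iter):
        # E-step: responsibility of class 0 for each case
        resp = []
        for x in data:
            like = []
            for k in (0, 1):
                p = pi if k == 0 else 1 - pi
                for j, xj in enumerate(x):
                    p *= theta[k][j] if xj else 1 - theta[k][j]
                like.append(p)
            resp.append(like[0] / (like[0] + like[1]))
        # M-step: re-estimate mixing weight and per-feature parameters
        pi = sum(resp) / len(data)
        for k in (0, 1):
            w = [r if k == 0 else 1 - r for r in resp]
            tot = sum(w)
            for j in range(d):
                theta[k][j] = sum(wi for wi, x in zip(w, data) if x[j]) / tot
    return pi, theta

# Two well-separated binary clusters of equal size
data = [(1, 1, 0)] * 20 + [(0, 0, 1)] * 20
pi, theta = em_bernoulli_mixture(data)
print(round(pi, 2), [[round(t, 2) for t in row] for row in theta])
```

On this toy data EM recovers the two clusters, with each `theta` row matching one cluster's feature profile and `pi` near 0.5; the paper's initialization study concerns how schemes like those in (1)–(3) affect which local optimum such a run reaches.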
Learning recursive Bayesian multinets for data clustering by means of constructive induction
, 2001
Abstract

Cited by 18 (7 self)
This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, an RBMN is a decision tree with component BNs at the leaves. An RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at the leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
Dimensionality Reduction in Unsupervised Learning of Conditional Gaussian Networks
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Abstract

Cited by 16 (2 self)
This paper introduces a novel enhancement for unsupervised learning of conditional Gaussian networks that benefits from feature selection. Our proposal is based on the assumption that, in the absence of labels reflecting the cluster membership of each case of the database, those features that exhibit low correlation with the rest of the features can be considered irrelevant for the learning process. Thus, we suggest performing this process using only the relevant features. Then, every irrelevant feature is added to the learnt model to obtain an explanatory model for the original database, which is our primary goal. A simple and, thus, efficient measure to assess the relevance of the features for the learning process is presented. Additionally, the form of this measure allows us to calculate a relevance threshold to automatically identify the relevant features. The experimental results reported for synthetic and real-world databases show the ability of our proposal to distinguish between relevant and irrelevant features and to accelerate learning, while still obtaining good explanatory models for the original database.
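A hypothetical version of such a relevance measure can be sketched as the mean absolute Pearson correlation of each feature with the remaining features; the paper defines its own score and threshold, and this sketch only conveys the idea that features weakly correlated with the rest are flagged as irrelevant:

```python
import math

# Illustrative relevance score: mean absolute correlation of each
# feature with the remaining features. Features and data below are
# invented for demonstration.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

def relevance(columns):
    scores = []
    for j, col in enumerate(columns):
        others = [abs(pearson(col, c)) for i, c in enumerate(columns) if i != j]
        scores.append(sum(others) / len(others))
    return scores

# Features 0 and 1 move together; feature 2 alternates independently
# of the trend and should get the lowest relevance score.
f0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
f1 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
f2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print([round(s, 2) for s in relevance([f0, f1, f2])])
```

A threshold on these scores would then separate the features used for model learning from those appended afterwards to complete the explanatory model.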
Inferring mixtures of Markov chains
, 2002
Abstract

Cited by 12 (0 self)
We define the problem of inferring a “mixture of Markov chains” based on observing a stream of interleaved outputs from these chains. We give a sharp characterization of the inference process. The problems we consider also have applications in areas such as gene finding and intrusion detection, and more generally in analyzing interleaved sequences.
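The generative side of this setting can be sketched as follows: two hypothetical chains over disjoint alphabets, interleaved by picking a chain at random at every step and emitting its next symbol. Recovering the chains from the merged stream is the inference problem the abstract describes:

```python
import random

random.seed(1)

# Interleave outputs from several Markov chains: at each step a chain
# is chosen by its mixture weight, advanced one transition, and its new
# state is emitted. The chains and weights are invented for illustration.
def interleave(chains, states, weights, n):
    stream = []
    for _ in range(n):
        k = random.choices(range(len(chains)), weights=weights)[0]
        transitions = chains[k][states[k]]
        states[k] = random.choices(list(transitions),
                                   weights=list(transitions.values()))[0]
        stream.append(states[k])
    return stream

# Two chains over disjoint alphabets {a, b} and {x, y}
chain_a = {'a': {'a': 0.1, 'b': 0.9}, 'b': {'a': 0.9, 'b': 0.1}}
chain_b = {'x': {'x': 0.8, 'y': 0.2}, 'y': {'x': 0.2, 'y': 0.8}}
stream = interleave([chain_a, chain_b], ['a', 'x'], [0.5, 0.5], 20)
print(''.join(stream))
```

With disjoint alphabets the assignment of symbols to chains is trivial; the hard case studied in such work is when the chains share symbols and the interleaving must be disentangled statistically.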
An improved Bayesian Structural EM algorithm for learning Bayesian networks for clustering
 Pattern Recognition Letters
Abstract

Cited by 11 (6 self)
The application of the Bayesian Structural EM algorithm to learn Bayesian networks for clustering implies a search over the space of Bayesian network structures, alternating between two steps: an optimization of the Bayesian network parameters (usually by means of the EM algorithm) and a structural search for model selection. In this paper, we propose to perform the optimization of the Bayesian network parameters using an alternative approach to the EM algorithm: the BC+EM method. We provide experimental results to show that our proposal results in a more effective and efficient version of the Bayesian Structural EM algorithm for learning Bayesian networks for clustering. Key words: clustering, Bayesian networks, EM algorithm, Bayesian Structural EM algorithm, Bound and Collapse method. 1 Introduction One of the basic problems that arises in a great variety of fields, including pattern recognition, machine learning and statistics, is the so-called data clustering problem [1,5,6,10,1...
Robust bayesian linear classifier ensembles
 Proc. 16th European Conf. Machine Learning, Lecture Notes in Computer Science
, 2005
Abstract

Cited by 10 (0 self)
Ensemble classifiers combine the classification results of several classifiers. Simple ensemble methods such as uniform averaging over a set of models usually provide an improvement over selecting the single best model. Probabilistic classifiers usually restrict the set of possible models that can be learnt in order to lower computational complexity costs. In these restricted spaces, where incorrect modelling assumptions are possibly made, uniform averaging sometimes performs even better than Bayesian model averaging. Linear mixtures over sets of models provide a space that includes uniform averaging as a particular case. We develop two algorithms for learning maximum a posteriori weights for linear mixtures, based on expectation maximization and on constrained optimization. We provide a nontrivial example of the utility of these two algorithms by applying them to one-dependence estimators. We develop the conjugate distribution for one-dependence estimators and empirically show that uniform averaging is clearly superior to BMA for this family of models. After that, we empirically show that the maximum a posteriori linear mixture weights improve accuracy significantly over uniform aggregation.
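One plausible form of the EM update for the mixture weights, assuming the component classifiers are already trained and fixed, and using a symmetric Dirichlet prior to get a MAP rather than maximum-likelihood estimate; the numbers and prior are illustrative, not the paper's exact algorithm:

```python
# EM sketch for MAP mixture weights over fixed probabilistic
# classifiers. p[i][k] is the probability component k assigns to the
# true label of case i. Each iteration computes responsibilities and
# renormalizes them together with Dirichlet(alpha) pseudo-counts.
def em_mixture_weights(p, n_iter=100, alpha=2.0):
    n, m = len(p), len(p[0])
    w = [1.0 / m] * m                       # start from uniform averaging
    for _ in range(n_iter):
        counts = [alpha - 1.0] * m          # pseudo-counts from the prior
        for row in p:
            z = sum(wk * pk for wk, pk in zip(w, row))
            for k in range(m):
                counts[k] += w[k] * row[k] / z
        total = sum(counts)
        w = [c / total for c in counts]
    return w

# Component 0 consistently assigns more probability to the true labels,
# so it should receive the larger mixture weight.
p = [[0.9, 0.6], [0.8, 0.5], [0.85, 0.55], [0.9, 0.4]]
w = em_mixture_weights(p)
print([round(x, 3) for x in w])
```

Note how the prior keeps every weight strictly positive, so the learnt mixture never collapses onto a single component; uniform averaging is the special case where the weights are simply left at their initial value.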
Learning Bayesian networks for clustering by means of constructive induction
 Pattern Recognition Letters
, 1999
Abstract

Cited by 7 (4 self)
The purpose of this paper is to present and evaluate a heuristic algorithm for learning Bayesian networks for clustering. Our approach is based upon improving the Naive-Bayes model by means of constructive induction. A key idea in this approach is to treat expected data as real data. This allows us to complete the database and to take advantage of factorable closed forms for the marginal likelihood. In order to get such an advantage, we search for parameter values using the EM algorithm or another alternative approach that we have developed: a hybridization of the Bound and Collapse method and the EM algorithm, which results in a method that exhibits a faster convergence rate and more effective behaviour than the EM algorithm. Also, we consider the possibility of interleaving runs of these two methods after each structural change. We evaluate our approach on synthetic and real-world databases. Key words: clustering, Bayesian networks, learning from incomplete data, constructive ind...
Performance Evaluation of Compromise Conditional Gaussian Networks for Data Clustering
, 2001
Abstract

Cited by 5 (3 self)
This paper is devoted to the proposal of two classes of compromise conditional Gaussian networks for data clustering as well as to their experimental evaluation and comparison on synthetic and real-world databases. According to the reported results, the models show an ideal tradeoff between efficiency and effectiveness, i.e., a balance between the cost of the unsupervised model learning process and the quality of the learnt models. Moreover, the proposed models are very appealing due to their closeness to human intuition and computational advantages for the unsupervised model induction process, while preserving a rich enough modelling power.