Results 1 - 5 of 5
On Discriminative Bayesian Network Classifiers and Logistic Regression
Machine Learning, 2005
Abstract

Cited by 15 (1 self)
Discriminative learning of the parameters in the naive Bayes model is known to be equivalent to a logistic regression problem. Here we show that the same fact holds for much more general Bayesian network models, as long as the corresponding network structure satisfies a certain graph-theoretic property. The property holds for naive Bayes but also for more complex structures such as tree-augmented naive Bayes (TAN) as well as for mixed diagnostic-discriminative structures. Our results imply that for networks satisfying our property, the conditional likelihood cannot have local maxima, so that the global maximum can be found by simple local optimization methods. We also show that if this property does not hold, then in general the conditional likelihood can have local, non-global maxima. We illustrate our theoretical results by empirical experiments with local optimization in a conditional naive Bayes model. Furthermore, we provide a heuristic strategy for pruning the number of parameters and relevant features in such models. For many data sets, we obtain good results with heavily pruned submodels containing many fewer parameters than the original naive Bayes model.
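The naive Bayes/logistic regression equivalence stated in this abstract can be checked numerically. The sketch below uses made-up parameters (`prior` and `theta` are illustrative, not taken from the paper): for binary attributes, the naive Bayes posterior P(C=1 | x) equals a sigmoid of a linear score, which is exactly the logistic regression form.

```python
import numpy as np

# Illustrative naive Bayes parameters (made up, not from the paper).
prior = np.array([0.6, 0.4])              # P(C=0), P(C=1)
theta = np.array([[0.8, 0.1, 0.3],        # P(x_i = 1 | C = 0)
                  [0.2, 0.7, 0.9]])       # P(x_i = 1 | C = 1)

def nb_posterior(x):
    """P(C=1 | x) computed directly from the naive Bayes factorization."""
    x = np.asarray(x)
    lik = np.prod(theta ** x * (1 - theta) ** (1 - x), axis=1)
    joint = prior * lik
    return joint[1] / joint.sum()

# The same posterior written as logistic regression: sigmoid(b + w . x),
# with weights derived from the log-odds of the conditional probabilities.
w = np.log(theta[1] / theta[0]) - np.log((1 - theta[1]) / (1 - theta[0]))
b = np.log(prior[1] / prior[0]) + np.log((1 - theta[1]) / (1 - theta[0])).sum()

def lr_posterior(x):
    return 1.0 / (1.0 + np.exp(-(b + w @ np.asarray(x))))

# The two computations agree for every input pattern.
for x in [(0, 0, 0), (1, 0, 1), (1, 1, 1)]:
    assert np.isclose(nb_posterior(x), lr_posterior(x))
```

Discriminatively maximizing the conditional likelihood of such a model is therefore a logistic regression fit in the (b, w) parameterization, which is the starting point the abstract generalizes.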
Classification using Hierarchical Naïve Bayes models
Machine Learning, 2006
Abstract

Cited by 11 (1 self)
Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing, classifiers is the Naïve Bayes model. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to "information double-counting" and interaction omission.
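The "information double-counting" effect described in this abstract can be shown with a toy example (all parameter values below are made up): duplicating an attribute adds no new information, yet a naive Bayes model that treats the copies as conditionally independent counts the evidence twice and becomes overconfident.

```python
import numpy as np

# One binary attribute with illustrative class-conditional probabilities.
prior = np.array([0.5, 0.5])
p1 = np.array([0.3, 0.8])                 # P(x = 1 | c) for classes 0 and 1

def posterior_with_copies(x, k):
    """P(c | x observed k times), treating the k copies as independent."""
    lik = (p1 ** x * (1 - p1) ** (1 - x)) ** k
    joint = prior * lik
    return joint / joint.sum()

# One genuine observation of x = 1 versus the same evidence duplicated:
# the duplicated version is more extreme despite carrying no extra information.
print(posterior_with_copies(1, 1))
print(posterior_with_copies(1, 2))
```

With these numbers, a single observation gives P(c=1 | x) ≈ 0.73, while the duplicated attribute pushes it to ≈ 0.88, illustrating why violated independence hurts calibration and accuracy.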
Latent Classification Models for Binary Data
, 2009
Abstract

Cited by 2 (0 self)
One of the simplest, and yet most consistently well-performing, classifiers is the naïve Bayes model (a special class of Bayesian network models). However, these models rely on the (naïve) assumption that all the attributes used to describe an instance are conditionally independent given the class of that instance. To relax this independence assumption, we have in previous work proposed a family of models, called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naïve Bayes model by using latent variables to model class-conditional dependencies between the attributes. In addition to providing good classification accuracy, the LCM model has several appealing properties, including a relatively small parameter space making it less susceptible to overfitting. In this paper we take a first step towards generalizing LCMs to hybrid domains, by proposing an LCM model for domains with binary attributes. We present algorithms for learning the proposed model, and we describe a variational approximation-based inference procedure. Finally, we empirically compare the accuracy of the proposed model to the accuracy of other classifiers for a number of different domains, including the problem of recognizing symbols in black-and-white images.
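The core idea of using latent variables to model class-conditional dependencies can be sketched generically. This is not the paper's exact LCM (which is defined for continuous domains); the mixture below is a hypothetical binary-attribute analogue with made-up parameters, showing that summing out a latent variable Z makes P(x | c) non-factorizing, i.e. the attributes become dependent given the class.

```python
import numpy as np

# For one fixed class c: a discrete latent z with illustrative parameters.
prior_z = np.array([0.5, 0.5])            # P(z | c)
theta_z = np.array([[0.9, 0.9],           # P(x_i = 1 | z = 0, c)
                    [0.1, 0.1]])          # P(x_i = 1 | z = 1, c)

def p_x(x):
    """Marginal P(x | c) after summing out the latent z."""
    x = np.asarray(x)
    per_z = np.prod(theta_z ** x * (1 - theta_z) ** (1 - x), axis=1)
    return float(prior_z @ per_z)

# Marginally each attribute is Bernoulli(0.5), so independence would give
# P(x1=1, x2=1 | c) = 0.25 — but the latent mixture makes agreement much
# more likely than disagreement:
print(p_x([1, 1]))   # 0.41, attributes agree
print(p_x([1, 0]))   # 0.09, attributes disagree
```

This is the mechanism the abstract describes: the latent variable captures within-class correlation that a plain naïve Bayes model cannot represent.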
The Most Generative Maximum Margin Bayesian Networks
Abstract
*These authors contributed equally to this paper. Although discriminative learning in graphical models generally improves classification results, the generative semantics of the model are compromised. In this paper, we introduce a novel approach of hybrid generative-discriminative learning for Bayesian networks. We use an SVM-type large margin formulation for discriminative training, introducing a likelihood-weighted ℓ1-norm for the SVM-norm penalization. This simultaneously optimizes the data likelihood and therefore partly maintains the generative character of the model. For many network structures, our method can be formulated as a convex problem, guaranteeing a globally optimal solution. In terms of classification, the resulting models outperform state-of-the-art generative and discriminative learning methods for Bayesian networks, and are comparable with linear and kernelized SVMs. Furthermore, the models achieve likelihoods close to the maximum likelihood solution and show robust behavior in classification experiments with missing features.
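The general flavor of such an SVM-type formulation can be sketched as follows; the symbols here ($\nu_j$ for the likelihood-derived weights, $d_n$ for the margin) are illustrative placeholders, not the paper's exact notation:

```latex
\min_{\mathbf{w}} \;\; \lambda \sum_{j} \nu_j \, |w_j|
\;+\; \sum_{n=1}^{N} \max\!\bigl(0,\; 1 - d_n(\mathbf{w})\bigr)
```

Here $d_n(\mathbf{w})$ is a log-likelihood-ratio margin for sample $n$ (large when the true class is favored), and the hinge term enforces the SVM-style large margin. Weighting the $\ell_1$ penalty by likelihood-derived factors $\nu_j$ is what ties the regularizer to the data likelihood, so minimizing the objective partly preserves the generative fit, as the abstract describes.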