Learning Bayesian network classifiers by maximizing conditional likelihood
- In ICML2004, 2004
Cited by 44 (0 self)
Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function thereof) and the goal of classification (maximizing accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing structure and parameters for conditional likelihood is prohibitive. In this paper we show that a simple approximation (choosing structures by maximizing conditional likelihood while setting parameters by maximum likelihood) yields good results. On a large suite of benchmark datasets, this approach produces better class probability estimates than naive Bayes, TAN, and generatively-trained Bayesian networks.
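To make the two objectives in this abstract concrete, here is a minimal sketch that scores one fixed structure (naive Bayes) under both the joint log-likelihood and the conditional log-likelihood. It is not the paper's algorithm: the function names, the toy data layout, and the add-`alpha` smoothing of the counted parameters are assumptions made for illustration.

```python
import math
from collections import Counter

def fit_naive_bayes(rows, labels, alpha=1.0):
    """Set parameters by (smoothed) counting; alpha is an assumed smoothing constant."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    values = [sorted({r[j] for r in rows}) for j in range(n_attrs)]
    class_counts = Counter(labels)
    joint_counts = Counter()
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            joint_counts[(j, v, y)] += 1
    prior = {y: (class_counts[y] + alpha) / (len(labels) + alpha * len(classes))
             for y in classes}
    cond = {(j, v, y): (joint_counts[(j, v, y)] + alpha)
                       / (class_counts[y] + alpha * len(values[j]))
            for j in range(n_attrs) for v in values[j] for y in classes}
    return classes, prior, cond

def log_joint(model, row, y):
    """log P(y, x) under the naive Bayes factorization."""
    _, prior, cond = model
    return math.log(prior[y]) + sum(math.log(cond[(j, v, y)]) for j, v in enumerate(row))

def log_likelihood(model, rows, labels):
    """Generative objective: sum of log P(y, x)."""
    return sum(log_joint(model, r, y) for r, y in zip(rows, labels))

def conditional_log_likelihood(model, rows, labels):
    """Discriminative objective: sum of log P(y | x)."""
    classes = model[0]
    total = 0.0
    for r, y in zip(rows, labels):
        scores = [log_joint(model, r, c) for c in classes]
        m = max(scores)
        log_evidence = m + math.log(sum(math.exp(s - m) for s in scores))  # log P(x)
        total += log_joint(model, r, y) - log_evidence
    return total
```

A structure search in the spirit of the abstract would rank candidate structures by `conditional_log_likelihood` while still fitting each candidate's parameters by counting, as `fit_naive_bayes` does for the single structure shown here.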
Recognizing Planned, Multiperson Action
- Computer Vision and Image Understanding, 2001
Cited by 41 (2 self)
This paper demonstrates how highly structured, multiperson action can be recognized from noisy perceptual data using visually grounded goal-based primitives and low-order temporal relationships that are integrated in a probabilistic framework. The representation, which is motivated by work in model-based object recognition and probabilistic plan recognition, makes four principal assumptions: (1) the goals of individual agents are natural atomic representational units for specifying the temporal relationships between agents engaged in group activities, (2) a high-level description of temporal structure of the action using a small set of low-order temporal and logical constraints is adequate for representing the relationships between the agent goals for highly structured, multiagent action recognition, (3) Bayesian networks provide a suitable mechanism for integrating multiple sources of uncertain visual perceptual feature evidence, and (4) an automatically generated Bayesian
Combining Naive Bayes and n-Gram Language Models for Text Classification
- In 25th European Conference on Information Retrieval Research (ECIR), 2003
Cited by 26 (2 self)
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
BNT structure learning package: documentation and experiments
- Technical Report (FRE CNRS 2645), Laboratoire PSI, Université et INSA de Rouen, 2004
Cited by 16 (1 self)
Bayesian networks are a formalism for probabilistic reasoning that is increasingly used for classification tasks in data mining. In some situations the network structure is given by an expert; otherwise, retrieving it from a database is an NP-hard problem, notably because of the complexity of the search space. In the last decade, many methods have been introduced to learn the network structure automatically, either by simplifying the search space (augmented naive Bayes, K2) or by using a heuristic in this search space (greedy search). Most of these methods deal with completely observed data, but some can deal with incomplete data (SEM, MWST-EM). The Bayes Net Toolbox introduced by [Murphy, 2001a] for Matlab lets us use Bayesian networks or learn them. But this toolbox is not 'state of the art' for structure learning, which is why we propose this package.
On Discriminative Bayesian Network Classifiers and Logistic Regression
- Machine Learning, 2005
Cited by 11 (1 self)
Discriminative learning of the parameters in the naive Bayes model is known to be equivalent to a logistic regression problem. Here we show that the same fact holds for much more general Bayesian network models, as long as the corresponding network structure satisfies a certain graph-theoretic property. The property holds for naive Bayes but also for more complex structures such as tree-augmented naive Bayes (TAN) as well as for mixed diagnostic-discriminative structures. Our results imply that for networks satisfying our property, the conditional likelihood cannot have local maxima so that the global maximum can be found by simple local optimization methods. We also show that if this property does not hold, then in general the conditional likelihood can have local, non-global maxima. We illustrate our theoretical results by empirical experiments with local optimization in a conditional naive Bayes model. Furthermore, we provide a heuristic strategy for pruning the number of parameters and relevant features in such models. For many data sets, we obtain good results with heavily pruned submodels containing many fewer parameters than the original naive Bayes model.
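The naive Bayes case of this equivalence can be written out in two lines. The following is a sketch for a binary class $y \in \{0, 1\}$ and discrete attributes $x_1, \dots, x_n$; the weight symbols $w_0$ and $w_{j,v}$ are notation introduced here for illustration and are not taken from the paper.

```latex
\begin{align*}
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
  &= \log\frac{P(y=1)}{P(y=0)}
     + \sum_{j=1}^{n}\log\frac{P(x_j\mid y=1)}{P(x_j\mid y=0)} \\
  &= w_0 + \sum_{j=1}^{n}\sum_{v} w_{j,v}\,\mathbb{1}[x_j = v],
\end{align*}
where $w_0 = \log\frac{P(y=1)}{P(y=0)}$ and
$w_{j,v} = \log\frac{P(x_j=v\mid y=1)}{P(x_j=v\mid y=0)}$.
```

The first line uses the naive Bayes factorization (the evidence $P(x)$ cancels in the ratio). The second shows that the log-odds are linear in indicator features of the attribute values, which is the functional form of a logistic regression model.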
An improved Bayesian Structural EM algorithm for learning Bayesian networks for clustering
- Pattern Recognition Letters
Cited by 10 (6 self)
The application of the Bayesian Structural EM algorithm to learn Bayesian networks for clustering implies a search over the space of Bayesian network structures alternating between two steps: an optimization of the Bayesian network parameters (usually by means of the EM algorithm) and a structural search for model selection. In this paper, we propose to perform the optimization of the Bayesian network parameters using an alternative approach to the EM algorithm: the BC+EM method. We provide experimental results to show that our proposal results in a more effective and efficient version of the Bayesian Structural EM algorithm for learning Bayesian networks for clustering.
Key words: clustering, Bayesian networks, EM algorithm, Bayesian Structural EM algorithm, Bound and Collapse method.
1 Introduction
One of the basic problems that arises in a great variety of fields, including pattern recognition, machine learning and statistics, is the so-called data clustering problem [1,5,6,10,1...
METIORE: A Personalized Information Retrieval System
- 8th International Conference on User Modeling (UM'2001), 2001
Cited by 9 (6 self)
The idea of personalizing the interactions of a system is not new.
Discretization for naive-Bayes learning: managing discretization bias and variance
- 2003
Cited by 9 (5 self)
Quantitative attributes are usually discretized in naive-Bayes learning. We prove a theorem that explains why discretization can be effective for naive-Bayes learning. The use of different discretization techniques can be expected to affect the classification bias and variance of generated naive-Bayes classifiers, effects we name discretization bias and variance. We argue that by properly managing discretization bias and variance, we can effectively reduce naive-Bayes classification error. In particular, we propose proportional k-interval discretization and equal size discretization, two efficient heuristic discretization methods that are able to effectively manage discretization bias and variance by tuning discretized interval size and interval number. We empirically evaluate our new techniques against five key discretization methods for naive-Bayes classifiers. The experimental results support our theoretical arguments by showing that naive-Bayes classifiers trained on data discretized by our new methods are able to achieve lower classification error than those trained on data discretized by alternative discretization methods.
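As an illustration of tuning interval size and interval number together, the sketch below implements an equal-frequency discretizer whose interval count grows with the square root of the training set size. The abstract does not state the exact rule for proportional k-interval discretization, so the square-root choice, the function names, and the midpoint placement of cut points are assumptions made here, not a definitive rendering of the authors' method.

```python
import math

def proportional_cut_points(values):
    """Equal-frequency cut points with about sqrt(n) intervals of about sqrt(n) values each.

    The sqrt(n) rule is an assumption for illustration.
    """
    ordered = sorted(values)
    n = len(ordered)
    t = max(1, math.isqrt(n))  # assumed number of intervals
    cuts = []
    for k in range(1, t):
        idx = k * n // t
        left, right = ordered[idx - 1], ordered[idx]
        if left != right:  # never place a cut between identical values
            cuts.append((left + right) / 2)
    return cuts

def discretize(value, cuts):
    """Map a numeric value to the index of the interval it falls in."""
    return sum(value > c for c in cuts)
```

With more training data, both the number of intervals and the number of values per interval increase, which is one way to trade off the two effects the abstract names discretization bias and discretization variance.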
Learning Bayesian networks for clustering by means of constructive induction
- Pattern Recognition Letters, 1999
Cited by 7 (4 self)
The purpose of this paper is to present and evaluate a heuristic algorithm for learning Bayesian networks for clustering. Our approach is based upon improving the Naive-Bayes model by means of constructive induction. A key idea in this approach is to treat expected data as real data. This allows us to complete the database and to take advantage of factorable closed forms for the marginal likelihood. In order to get such an advantage, we search for parameter values using the EM algorithm or another alternative approach that we have developed: a hybridization of the Bound and Collapse method and the EM algorithm, which results in a method that exhibits a faster convergence rate and more effective behaviour than the EM algorithm. Also, we consider the possibility of interleaving runs of these two methods after each structural change. We evaluate our approach on synthetic and real-world databases.
Key words: clustering, Bayesian networks, learning from incomplete data, constructive ind...
Averaged One-Dependence Estimators: Preliminary Results
- University of Technology Sydney, 2002
Cited by 7 (2 self)
Naive Bayes is a simple, computationally efficient and remarkably accurate approach to classification learning. These properties have led to its wide deployment in many online applications. However, it is based on an assumption that all attributes are conditionally independent given the class. This assumption leads to decreased accuracy in some applications. AODE overcomes the attribute independence assumption of naive Bayes by averaging over all models in which all attributes depend upon the class and a single other attribute. The resulting classification learning algorithm for nominal data is computationally efficient and achieves very low error rates.
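The averaging step described in this abstract can be sketched as follows: for each attribute taken in turn as the extra parent, multiply an estimate of P(y, x_i) by the estimates of P(x_j | y, x_i) for the other attributes, then sum over the parents and normalize over classes. This is a minimal sketch for nominal data; the function names, the dictionary layout, and the add-one smoothing are assumptions for illustration, and it omits any refinements the paper may apply.

```python
from collections import Counter

def fit_aode(rows, labels):
    """Collect the pair and triple counts needed by one-dependence estimators."""
    n_attrs = len(rows[0])
    pair = Counter()    # (i, x_i, y) -> count
    triple = Counter()  # (i, x_i, j, x_j, y) -> count
    for r, y in zip(rows, labels):
        for i, xi in enumerate(r):
            pair[(i, xi, y)] += 1
            for j, xj in enumerate(r):
                if j != i:
                    triple[(i, xi, j, xj, y)] += 1
    return {
        "n": len(rows),
        "classes": sorted(set(labels)),
        "n_values": [len({r[j] for r in rows}) for j in range(n_attrs)],
        "pair": pair,
        "triple": triple,
    }

def aode_posterior(model, row):
    """Average over models in which every attribute depends on y and one parent x_i."""
    k = len(model["classes"])
    scores = {}
    for y in model["classes"]:
        total = 0.0
        for i, xi in enumerate(row):
            c_pair = model["pair"][(i, xi, y)]
            # Smoothed estimate of P(y, x_i); add-one smoothing is an assumption.
            term = (c_pair + 1) / (model["n"] + k * model["n_values"][i])
            for j, xj in enumerate(row):
                if j != i:
                    c_triple = model["triple"][(i, xi, j, xj, y)]
                    # Smoothed estimate of P(x_j | y, x_i).
                    term *= (c_triple + 1) / (c_pair + model["n_values"][j])
            total += term
        scores[y] = total
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}
```

Training only counts co-occurrences and involves no structure search, which is consistent with the computational efficiency the abstract claims.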

