Results 1 - 10
of
25
Not so naive Bayes: Aggregating one-dependence estimators
- Machine Learning
, 2005
"... Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach ..."
Abstract
-
Cited by 44 (8 self)
- Add to MetaCart
Of numerous proposals to improve the accuracy of naive Bayes by weakening its attribute independence assumption, both LBR and super-parent TAN have demonstrated remarkable error performance. However, both techniques obtain this outcome at a considerable computational cost. We present a new approach to weakening the attribute independence assumption by averaging all of a constrained class of classifiers. In extensive experiments this technique delivers comparable prediction accuracy to LBR and super-parent TAN with substantially improved computational e#ciency at test time relative to the former and at training time relative to the latter. The new algorithm is shown to have low variance and is suited to incremental learning.
Learning recursive Bayesian multinets for data clustering by means of constructive induction
, 2001
"... This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, a RBMN is a decision tree with co ..."
Abstract
-
Cited by 18 (7 self)
- Add to MetaCart
This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, a RBMN is a decision tree with component BNs at the leaves. A RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
SNNB: A Selective Neighborhood based Naive Bayes for Lazy Learning
- IN: PROCEEDINGS OF THE 6TH PAKDD
, 2002
"... Naive Bayes is a probability-based classification method which is based on the assumption that attributes are conditionally mutually independent given the class label. Much research has been focused on improving the accuracy of Nave Bayes via eager learning. In this paper, we propose a novel laz ..."
Abstract
-
Cited by 11 (0 self)
- Add to MetaCart
Naive Bayes is a probability-based classification method which is based on the assumption that attributes are conditionally mutually independent given the class label. Much research has been focused on improving the accuracy of Nave Bayes via eager learning. In this paper, we propose a novel lazy learning algorithm, Selective Neighbourhood based Nave Bayes (SNNB).
On Why Discretization Works for Naive-Bayes Classifiers
- In Proceedings of the 16th Australian Joint Conference on Artificial Intelligence (AI
, 2003
"... We investigate why discretization is effective in naive-Bayes learning. We prove a theorem that identifies particular conditions under which discretization will result in naive-Bayes classifiers delivering the same probability estimates as would be obtained if the correct probability density functio ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
We investigate why discretization is effective in naive-Bayes learning. We prove a theorem that identifies particular conditions under which discretization will result in naive-Bayes classifiers delivering the same probability estimates as would be obtained if the correct probability density functions were employed.
A comparative study of semi-naive Bayes methods in classification learning
- Proc. 4th Australasian Data Mining conference (AusDM05
, 2005
"... Abstract. Numerous techniques have sought to improve the accuracy of Naive Bayes (NB) by alleviating the attribute interdependence problem. This paper summarizes these semi-naive Bayesian methods into two groups: those that apply conventional NB with a new attribute set, and those that alter NB by a ..."
Abstract
-
Cited by 8 (5 self)
- Add to MetaCart
Abstract. Numerous techniques have sought to improve the accuracy of Naive Bayes (NB) by alleviating the attribute interdependence problem. This paper summarizes these semi-naive Bayesian methods into two groups: those that apply conventional NB with a new attribute set, and those that alter NB by allowing inter-dependencies between attributes. We review eight typical semi-naive Bayesian learning algorithms and perform error analysis using the bias-variance decomposition on thirty-six natural domains from the UCI Machine Learning Repository. In analysing the results of these experiments we provide general recommendations for selection between methods.
To Select or To Weigh: A Comparative Study of Linear Combination Schemes for SuperParent-One-Dependence Estimators
"... We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of semi-naive Bayesian classifiers. Altogether 16 model selection and weighing schemes, 58 benchmark data sets, as well as various statistical tests are employed. This p ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of semi-naive Bayesian classifiers. Altogether 16 model selection and weighing schemes, 58 benchmark data sets, as well as various statistical tests are employed. This paper’s main contributions are three-fold. First, it formally presents each scheme’s definition, rationale and time complexity; and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme’s classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms with immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
Robust bayesian linear classifier ensembles
- Proc. 16th European Conf. Machine Learning, Lecture Notes in Computer Science
, 2005
"... Abstract. Ensemble classifiers combine the classification results of several classifiers. Simple ensemble methods such as uniform averaging over a set of models usually provide an improvement over selecting the single best model. Usually probabilistic classifiers restrict the set of possible models ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Abstract. Ensemble classifiers combine the classification results of several classifiers. Simple ensemble methods such as uniform averaging over a set of models usually provide an improvement over selecting the single best model. Usually probabilistic classifiers restrict the set of possible models that can be learnt in order to lower computational complexity costs. In these restricted spaces, where incorrect modelling assumptions are possibly made, uniform averaging sometimes performs even better than bayesian model averaging. Linear mixtures over sets of models provide an space that includes uniform averaging as a particular case. We develop two algorithms for learning maximum a posteriori weights for linear mixtures, based on expectation maximization and on constrained optimizition. We provide a nontrivial example of the utility of these two algorithms by applying them for one dependence estimators. We develop the conjugate distribution for one dependence estimators and empirically show that uniform averaging is clearly superior to BMA for this family of models. After that we empirically show that the maximum a posteriori linear mixture weights improve accuracy significantly over uniform aggregation.
Efficiently mining interesting emerging patterns
- in Proc. 4th Int’l. Conf. on Web-Age Information Management (WAIM2003
, 2003
"... Knowledge Discovery in Databases (KDD), or Data Mining is used to discover interesting or useful patterns and relationships in data, with an emphasis on large volume of observational databases. Among many other types of information (knowledge) that can be discovered in data, patterns that are expres ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Knowledge Discovery in Databases (KDD), or Data Mining is used to discover interesting or useful patterns and relationships in data, with an emphasis on large volume of observational databases. Among many other types of information (knowledge) that can be discovered in data, patterns that are expressed in terms of features are popular because they can be understood and used directly by people. The recently proposed Emerging Pattern (EP) is one type of such knowledge patterns. Emerging Patterns are sets of items (conjunctions of attribute values) whose frequency changes significantly from one dataset to another. They are useful as a means of discovering distinctions inherently present amongst a collection of datasets and have been shown to be a powerful method for constructing accurate classifiers. In this doctoral dissertation, we study the following three major problems involved in the discovery of Emerging Patterns and the construction of classification systems based on Emerging Patterns: 1. How to efficiently discover the complete set of Emerging Patterns between two classes
To select or to weigh: A comparative study of model selection and model weighing for SPODE ensembles
- IN: PROC. 18TH EUROPEAN CONF. MACHINE LEARNING (ECML), 2006
, 2006
"... An ensemble of Super-parent-one-dependence Estimators (SPODEs) offers a powerful yet simple alternative to naive Bayes classifiers, achieving significantly higher classification accuracy at a moderate cost in classification efficiency. Currently there exist two families of methodologies that ensembl ..."
Abstract
-
Cited by 3 (3 self)
- Add to MetaCart
An ensemble of Super-parent-one-dependence Estimators (SPODEs) offers a powerful yet simple alternative to naive Bayes classifiers, achieving significantly higher classification accuracy at a moderate cost in classification efficiency. Currently there exist two families of methodologies that ensemble candidate SPODEs for classification. One is to select only helpful SPODEs and uniformly average their probability estimates, a type of model selection. Another is to assign a weight to each SPODE and linearly combine their probability estimates, a methodology named model weighing. This paper presents a theoretical and empirical study comparing model selection and model weighing for ensembling SPODEs. The focus is on maximizing the ensemble’s classification accuracy while minimizing its computational time. A number of representative selection and weighing schemes are studied, providing a comprehensive research on this topic and identifying effective schemes that provide alternative trades-off between speed and expected error.
Local Strategy Learning in Networked Multi-Agent Team Formation (Final Draft)
"... Abstract. Networked multi-agent systems are comprised of many autonomous yet interdependent agents situated in a virtual social network. Two examples of such systems are supply chain networks and sensor networks. A common challenge in many networked multiagent systems is decentralized team formation ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Abstract. Networked multi-agent systems are comprised of many autonomous yet interdependent agents situated in a virtual social network. Two examples of such systems are supply chain networks and sensor networks. A common challenge in many networked multiagent systems is decentralized team formation among the spatially and logically extended agents. Even in cooperative multi-agent systems, efficient team formation is made difficult by the limited local information available to the individual agents. We present a model of distributed multi-agent team formation in networked multi-agent systems, describe a policy learning framework for joining teams based on local information, and give empirical results on improving team formation performance. In particular, we show that local policy learning from limited information leads to a significant increase in organizational team formation performance compared to a random policy.

