Results 1  10
of
51
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Abstract

Cited by 312 (2 self)
 Add to MetaCart
When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers, San Mateo, 1995 1 Introduction In rec...
Scaling Up the Accuracy of NaiveBayes Classifiers: a DecisionTree Hybrid
 PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING
, 1996
"... NaiveBayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accura ..."
Abstract

Cited by 177 (4 self)
 Add to MetaCart
NaiveBayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of NaiveBayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decisiontree classifiers and NaiveBayes classifiers: the decisiontree nodes contain univariate splits as regular decisiontrees, but the leaves contain NaiveBayesian classifiers. The approach retains the interpretability of NaiveBayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
Wrappers For Performance Enhancement And Oblivious Decision Graphs
, 1995
"... In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are stu ..."
Abstract

Cited by 107 (8 self)
 Add to MetaCart
In this doctoral dissertation, we study three basic problems in machine learning and two new hypothesis spaces with corresponding learning algorithms. The problems we investigate are: accuracy estimation, feature subset selection, and parameter tuning. The latter two problems are related and are studied under the wrapper approach. The hypothesis spaces we investigate are: decision tables with a default majority rule (DTMs) and oblivious readonce decision graphs (OODGs).
Improving Simple Bayes
, 1997
"... The simple Bayesian classifier (SBC), sometimes called NaiveBayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classificat ..."
Abstract

Cited by 59 (1 self)
 Add to MetaCart
The simple Bayesian classifier (SBC), sometimes called NaiveBayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. We examine different approaches for handling unknowns and zero counts when estimating probabilities. Large scale experiments on 37 datasets were conducted to determine the effects of these approaches and several interesting insights are given, including a new variant of the Laplace estimator that outperforms other methods for dealing with zero counts. Using the biasvariance decomposition [15, 10], we show that while the SBC has performed well on common benchmark datasets, its accuracy will not scale up as the dataset sizes grow. Even with these limitations in mind, the SBC can serve as an excellenttool for initial exp...
Lazy Learning of Bayesian Rules
 Machine Learning
, 2000
"... The naive Bayesian classifier provides a simple and e#ective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and ge ..."
Abstract

Cited by 39 (8 self)
 Add to MetaCart
The naive Bayesian classifier provides a simple and e#ective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute interdependencies for the local naive Bayesian classifier. However, Bayesian tree learning still su#ers from the small disjunct problem of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. This paper proposes the application of lazy learning techniques to Bayesian tree induction and presents the resulting lazy Bayesian rule learning algorithm, called Lbr. This algorithm can be justified by a variant of Bayes theorem which supports a weaker conditional attribute independence assumption than is required by naive Bayes. For each test example, it builds a most appropriate rule with a local naive Bayesian classifier as its consequent. It is demonstrated that the computational requirements of Lbr are reasonable in a wide crosssection of natural domains. Experiments with these domains show that, on average, this new algorithm obtains lower error rates significantly more often than the reverse in comparison to a naive Bayesian classifier, C4.5, a Bayesian tree learning algorithm, a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes, and a lazy decision tree learning algorithm. It also outperforms, although the result is not statisticall...
Overcoming the myopia of inductive learning algorithms with RELIEFF
 Applied Intelligence
, 1997
"... . Current inductive machine learning algorithms typically use greedy search with limited lookahead. This prevents them to detect significant conditional dependencies between the attributes that describe training objects. Instead of myopic impurity functions and lookahead, we propose to use RELIEFF, ..."
Abstract

Cited by 38 (12 self)
 Add to MetaCart
. Current inductive machine learning algorithms typically use greedy search with limited lookahead. This prevents them to detect significant conditional dependencies between the attributes that describe training objects. Instead of myopic impurity functions and lookahead, we propose to use RELIEFF, an extension of RELIEF developed by Kira and Rendell [10], [11], for heuristic guidance of inductive learning algorithms. We have reimplemented Assistant, a system for top down induction of decision trees, using RELIEFF as an estimator of attributes at each selection step. The algorithm is tested on several artificial and several real world problems and the results are compared with some other well known machine learning algorithms. Excellent results on artificial data sets and two real world problems show the advantage of the presented approach to inductive learning. Keywords: learning from examples, estimating attributes, impurity function, RELIEFF, empirical evaluation 1. Introduction ...
Machine learning for medical diagnosis: history, state of the art and perspective
 Artificial Intelligence in Medicine
, 2001
"... The paper provides an overview of the development of intelligent data analysis in medicine from a machine learning perspective: a historical view, a state of the art view and a view on some future trends in this subfield of applied artificial intelligence. The paper is not intended to provide a com ..."
Abstract

Cited by 35 (1 self)
 Add to MetaCart
The paper provides an overview of the development of intelligent data analysis in medicine from a machine learning perspective: a historical view, a state of the art view and a view on some future trends in this subfield of applied artificial intelligence. The paper is not intended to provide a comprehensive overview but rather describes some subeareas and directions which from my personal point of view seem to be important for applying machine learning in medical diagnosis. In the historical overview I emphasize the naive Bayesian classifier, neural networks and decision trees. I present a comparison of some state of the art systems, representatives from each branch of machine learning, when applied to several medical diagnostic tasks. The future trends are illustrated by two case studies. The first describes a recently developed method for dealing with reliability of decisions of classifiers, which seems to be promising for intelligent data analysis in medicine. The second describes an approach to using machine learning in order to verify some unexplained phenomena from complementary medicine, which is not (yet) approved by the orthodox medical community but could in the future play an important role in overall medical diagnosis and treatment. 1
ExpertGuided Subgroup Discovery: Methodology and Application
 Journal of Artificial Intelligence Research
, 2002
"... This paper presents an approach to expertguided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The othe ..."
Abstract

Cited by 32 (7 self)
 Add to MetaCart
This paper presents an approach to expertguided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
A Comparative Study of Discretization Methods for NaiveBayes Classifiers
 In Proceedings of PKAW 2002: The 2002 Pacific Rim Knowledge Acquisition Workshop
, 2002
"... Discretization is a popular approach to handling numeric attributes in machine learning. We argue that the requirements for effective discretization differ between naiveBayes learning and many other learning algorithms. We evaluate the effectiveness with naiveBayes classifiers of nine discretizati ..."
Abstract

Cited by 18 (0 self)
 Add to MetaCart
Discretization is a popular approach to handling numeric attributes in machine learning. We argue that the requirements for effective discretization differ between naiveBayes learning and many other learning algorithms. We evaluate the effectiveness with naiveBayes classifiers of nine discretization methods, equal width discretization (EWD), equal frequency discretization (EFD), fuzzy discretization (FD), entropy minimization discretization (EMD), iterative discretization (ID), proportional kinterval discretization (PKID), lazy discretization (LD), nondisjoint discretization (NDD) and weighted proportional kinterval discretization (WPKID). It is found that in general naiveBayes classifiers trained on data preprocessed by LD, NDD or WPKID achieve lower classification error than those trained on data preprocessed by the other discretization methods. But LD can not scale to large data. This study leads to a new discretization method, weighted nondisjoint discretization (WNDD) that combines WPKID and NDD's advantages. Our experiments show that among all the rival discretization methods, WNDD best helps naiveBayes classifiers reduce average classification error.
Discretization for naiveBayes learning: managing discretization bias and variance
, 2003
"... Quantitative attributes are usually discretized in naiveBayes learning. We prove a theorem that explains why discretization can be effective for naiveBayes learning. The use of different discretization techniques can be expected to affect the classification bias and variance of generated naiveBay ..."
Abstract

Cited by 17 (5 self)
 Add to MetaCart
Quantitative attributes are usually discretized in naiveBayes learning. We prove a theorem that explains why discretization can be effective for naiveBayes learning. The use of different discretization techniques can be expected to affect the classification bias and variance of generated naiveBayes classifiers, effects we name discretization bias and variance. We argue that by properly managing discretization bias and variance, we can effectively reduce naiveBayes classification error. In particular, we propose proportional kinterval discretization and equal size discretization, two efficient heuristic discretization methods that are able to effectively manage discretization bias and variance by tuning discretized interval size and interval number. We empirically evaluate our new techniques against five key discretization methods for naiveBayes classifiers. The experimental results support our theoretical arguments by showing that naiveBayes classifiers trained on data discretized by our new methods are able to achieve lower classification error than those trained on data discretized by alternative discretization methods.