Results 1 - 10 of 20
An analysis of Bayesian classifiers
In Proceedings of the Tenth National Conference on Artificial Intelligence, 1992
Abstract

Cited by 333 (17 self)
In this paper we present an average-case analysis of the Bayesian classifier, a simple induction algorithm that fares remarkably well on many learning tasks. Our analysis assumes a monotone conjunctive target concept and independent, noise-free Boolean attributes. We calculate the probability that the algorithm will induce an arbitrary pair of concept descriptions and then use this to compute the probability of correct classification over the instance space. The analysis takes into account the number of training instances, the number of attributes, the distribution of these attributes, and the level of class noise. We also explore the behavioral implications of the analysis by presenting
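The classifier analyzed above is simple enough to sketch directly. The following is a generic naive Bayes learner over independent Boolean attributes with Laplace smoothing — an illustration of the algorithm being analyzed, not the paper's analytical machinery:

```python
import math
from collections import defaultdict

def train_naive_bayes(X, y):
    """Estimate P(class) and P(attr_j = 1 | class) with Laplace smoothing."""
    class_counts = defaultdict(int)
    one_counts = defaultdict(lambda: defaultdict(int))
    for xi, yi in zip(X, y):
        class_counts[yi] += 1
        for j, v in enumerate(xi):
            if v:
                one_counts[yi][j] += 1
    n, d = len(y), len(X[0])
    priors = {c: class_counts[c] / n for c in class_counts}
    likelihoods = {c: {j: (one_counts[c][j] + 1) / (class_counts[c] + 2)
                       for j in range(d)}
                   for c in class_counts}
    return priors, likelihoods

def nb_predict(priors, likelihoods, x):
    """Pick the class maximizing log P(class) + sum_j log P(x_j | class)."""
    def score(c):
        s = math.log(priors[c])
        for j, v in enumerate(x):
            p = likelihoods[c][j]
            s += math.log(p if v else 1 - p)
        return s
    return max(priors, key=score)
```

On a small sample labeled by the monotone conjunction `x0 AND x1` (the kind of target concept the analysis assumes), the learned classifier recovers the concept despite treating the attributes as independent.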
Knowledge-Based Artificial Neural Networks
1994
Abstract

Cited by 145 (13 self)
Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for accurately classifying examples not seen during training. The challenge of hybrid learning systems is to use the information provided by one source of information to offset information missing from the other source. By so doing, a hybrid learning system should learn more effectively than systems that use only one of the information sources. KBANN (Knowledge-Based Artificial Neural Networks) is a hybrid learning system built on top of connectionist learning techniques. It maps problem-specific "domain theories", represented in propositional logic, into neural networks and then refines this reformulated knowledge using backpropagation. KBANN is evaluated by extensive empirical tests on two problems from molecular biology. Among other results, these tests show that the networks created by KBANN generalize better than a wide variety of learning systems, as well as several t...
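The core of the rule-to-network mapping described above can be sketched for a single propositional AND rule. This is a minimal illustration under common assumptions (weight `omega = 4.0` is a conventional choice, and a hard threshold stands in for the sigmoid units that KBANN actually refines with backpropagation):

```python
def conjunction_unit(n_antecedents, omega=4.0):
    """KBANN-style mapping of a propositional AND rule to a single unit:
    each antecedent gets weight omega, and the bias is set so the unit
    activates only when every antecedent is true."""
    weights = [omega] * n_antecedents
    bias = -(n_antecedents - 0.5) * omega
    return weights, bias

def step_activate(weights, bias, inputs):
    # KBANN proper uses sigmoid units refined by backpropagation;
    # a hard threshold is used here only to show the initial mapping.
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0
```

A unit built for a two-antecedent rule fires on input (1, 1) and on nothing else, reproducing the conjunction before any refinement takes place.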
Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection
1993
Abstract

Cited by 63 (2 self)
The results of empirical comparisons of existing learning algorithms illustrate that each algorithm has a selective superiority; it is best for some but not all tasks. Given a data set, it is often not clear beforehand which algorithm will yield the best performance. In such cases one must search the space of available algorithms to find the one that produces the best classifier. In this paper we present an approach that applies knowledge about the representational biases of a set of learning algorithms to conduct this search automatically. In addition, the approach permits the available algorithms' model classes to be mixed in a recursive tree-structured hybrid. We describe an implementation of the approach, MCS, that performs a heuristic best-first search for the best hybrid classifier for a set of data. An empirical comparison of MCS to each of its primitive learning algorithms, and to the computationally intensive method of cross-validation, illustrates that automatic selection of l...
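The cross-validation baseline that MCS is compared against can be sketched in a few lines. This is a generic cross-validated selector over candidate learners, not MCS's heuristic best-first search; the two toy candidates (a majority-class baseline and one-nearest-neighbor) are illustrative choices:

```python
def k_fold_accuracy(train_fn, X, y, k=3):
    """Cross-validated accuracy: train_fn(X, y) must return a predictor."""
    n, correct = len(y), 0
    for i in range(k):
        test_idx = set(range(i, n, k))  # simple interleaved folds
        Xtr = [x for j, x in enumerate(X) if j not in test_idx]
        ytr = [v for j, v in enumerate(y) if j not in test_idx]
        model = train_fn(Xtr, ytr)
        correct += sum(1 for j in test_idx if model(X[j]) == y[j])
    return correct / n

def select_algorithm(candidates, X, y):
    """Return the (name, train_fn) pair with the best cross-validated accuracy."""
    return max(candidates, key=lambda c: k_fold_accuracy(c[1], X, y))

def majority_baseline(Xtr, ytr):
    label = max(set(ytr), key=ytr.count)
    return lambda x: label

def one_nearest_neighbor(Xtr, ytr):
    def predict(x):
        i = min(range(len(Xtr)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(Xtr[i], x)))
        return ytr[i]
    return predict
```

On XOR-labeled data the selector picks 1-NN over the majority baseline, illustrating the selective-superiority point: which candidate wins depends entirely on the task.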
Understanding the crucial role of attribute interaction in data mining
Artif. Intell. Rev., 2001
Abstract

Cited by 48 (14 self)
This is a review paper whose goal is to significantly improve our understanding of the crucial role of attribute interaction in data mining. The main contributions of this paper are as follows. Firstly, we show that the concept of attribute interaction has a crucial role across different kinds of problems in data mining, such as attribute construction, coping with small disjuncts, induction of first-order logic rules, detection of Simpson's paradox, and finding several types of interesting rules. Hence, a better understanding of attribute interaction can lead to a better understanding of the relationship between these kinds of problems, which are usually studied separately from each other. Secondly, we draw attention to the fact that most rule induction algorithms are based on a greedy search which does not cope well with the problem of attribute interaction, and point out some alternative kinds of rule discovery methods which tend to cope better with this problem. Thirdly, we discuss several algorithms and methods for discovering interesting knowledge that, implicitly or explicitly, are based on the concept of attribute interaction.
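One interaction effect mentioned above, Simpson's paradox, is easy to demonstrate numerically. The counts below are the classic kidney-stone treatment figures often used as a textbook illustration: treatment A has the higher success rate within each subgroup, yet the lower rate in the aggregate, because subgroup membership interacts with the treatment attribute:

```python
def rate(successes, total):
    return successes / total

# (successes, total) per treatment; classic textbook counts
small_stones = {"A": (81, 87),  "B": (234, 270)}
large_stones = {"A": (192, 263), "B": (55, 80)}
overall = {t: (small_stones[t][0] + large_stones[t][0],
               small_stones[t][1] + large_stones[t][1])
           for t in ("A", "B")}

assert rate(*small_stones["A"]) > rate(*small_stones["B"])  # A wins on small
assert rate(*large_stones["A"]) > rate(*large_stones["B"])  # A wins on large
assert rate(*overall["A"]) < rate(*overall["B"])            # yet B wins overall
```

A learner that examines only the treatment attribute in isolation would draw exactly the wrong conclusion here, which is the review's point about greedy, one-attribute-at-a-time search.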
A scheme for feature construction and a comparison of empirical methods
In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence, 1991
Abstract

Cited by 17 (1 self)
A class of concept learning algorithms CL augments standard similarity-based techniques by performing feature construction based on the SBL output. Pagallo and Haussler's FRINGE, Pagallo's extension Symmetric FRINGE (SymFringe), and a refinement we call DCFringe are all instances of this class using decision trees as their underlying representation. These methods use patterns at the fringe of the tree to guide feature construction, but DCFringe performs limited construction of both conjunctions and disjunctions. Experiments with small DNF and CNF concepts show that DCFringe outperforms both the purely conjunctive FRINGE and the less restrictive SymFringe in terms of accuracy, conciseness, and efficiency. Further, the gain of these methods is linked to the size of the training set. We discuss the apparent limitation of current methods to concepts exhibiting a low degree of feature interaction, and suggest ways to alleviate it. This leads to a feature construction approach based on a wider variety of patterns restricted by statistical measures and optional knowledge.
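The construction step itself can be sketched with a toy stand-in, assuming Boolean attributes: enumerate pairwise conjunctions and disjunctions as candidate new features. (The FRINGE family is more selective, proposing only combinations observed at the fringe of an induced tree rather than all pairs.)

```python
from itertools import combinations

def construct_features(X, names):
    """Augment Boolean rows with all pairwise AND / OR features."""
    new_names = list(names)
    rows = [list(r) for r in X]
    for i, j in combinations(range(len(names)), 2):
        new_names.append(f"({names[i]} AND {names[j]})")
        new_names.append(f"({names[i]} OR {names[j]})")
        for row, orig in zip(rows, X):
            row.append(orig[i] & orig[j])
            row.append(orig[i] | orig[j])
    return rows, new_names
```

Feeding the augmented rows back to a selective learner lets a single test on `(a AND b)` express what would otherwise require a two-level subtree, which is where the conciseness gains reported above come from.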
A comprehensive case study: An examination of machine learning and connectionist algorithms
1995
A comparative assessment of classification methods
Decision Support Systems, 2003
Abstract

Cited by 8 (0 self)
Classification systems play an important role in business decision-making tasks by classifying the available information based on some criteria. The objective of this research is to assess the relative performance of some well-known classification methods. We consider classification techniques that are based on statistical and AI techniques. We use synthetic data to perform a controlled experiment in which the data characteristics are systematically altered to introduce imperfections such as nonlinearity, multicollinearity, unequal covariance, etc. Our experiments suggest that data characteristics considerably impact the classification performance of the methods. The results of the study can aid in the design of classification systems in which several classification methods can be employed to increase the reliability and consistency of the classification.
Constructing New Attributes for Decision Tree Learning
1996
Abstract

Cited by 7 (3 self)
A well-known fundamental limitation of selective induction algorithms is that when task-supplied attributes are not adequate for, or directly relevant to, describing hypotheses, their performance in terms of prediction accuracy and/or theory complexity is poor. One solution to this problem is constructive induction. It uses the task-supplied attributes to construct new attributes that are expected to be more appropriate than the task-supplied attributes for describing the target concepts. This thesis focuses on constructive induction with decision trees as the theory description language. It explores: (1) novel approaches to constructing new binary attributes using existing constructive operators, and (2) novel methods of constructing new nominal and new continuous-valued attributes based on a newly proposed constructive operator. The thesis investigates a fixed rule-based approach to constructing new binary attributes for decision tree learning. It generates conjunctions from producti...
Dynamic Automatic Model Selection
1992
Abstract

Cited by 7 (0 self)
The problem of how to learn from examples has been studied throughout the history of machine learning, and many successful learning algorithms have been developed. A problem that has received less attention is how to select which algorithm to use for a given learning task. The ability of a chosen algorithm to induce a good generalization depends on how appropriate the model class underlying the algorithm is for the given task. We define an algorithm's model class to be the representation language it uses to express a generalization of the examples. Supervised learning algorithms differ in their underlying model class and in how they search for a good generalization. Given this characterization, it is not surprising that some algorithms find better generalizations for some, but not all tasks. Therefore, in order to find the best generalization for each task, an automated learning system must search for the appropriate model class in addition to searching for the best generalization wit...
Sources of Success for Boosted Wrapper Induction
Journal of Machine Learning Research, 2004
Abstract

Cited by 4 (0 self)
In this paper, we examine an important recent rule-based information extraction (IE) technique named Boosted Wrapper Induction (BWI) by conducting experiments on a wider variety of tasks than previously studied, including tasks using several collections of natural text documents. We investigate systematically how each algorithmic component of BWI, in particular boosting, contributes to its success. We show that the benefit of boosting arises from the ability to reweight examples to learn specific rules (resulting in high precision) combined with the ability to continue learning rules after all positive examples have been covered (resulting in high recall). As a quantitative indicator of the regularity of an extraction task, we propose a new measure that we call the SWI ratio. We show that this measure is a good predictor of IE success and a useful tool for analyzing IE tasks. Based on these results, we analyze the strengths and limitations of BWI. Specifically, we explain limitations in the information made available, and in the representations used. We also investigate the consequences of the fact that confidence values returned during extraction are not true probabilities. Next, we investigate the benefits of including grammatical and semantic information for natural text documents, as well as parse tree and attribute-value information for XML and
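The reweighting mechanism credited above can be shown with a generic AdaBoost sketch over arbitrary weak hypotheses. This is the standard boosting loop, not BWI's actual boundary-detector learner: after each round, misclassified examples gain weight, forcing later rounds to learn the specific rules that cover them.

```python
import math

def adaboost(X, y, stumps, rounds=3):
    """Minimal AdaBoost: labels in {-1, +1}; stumps are callables x -> {-1, +1}."""
    n = len(y)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # choose the weak hypothesis with the lowest weighted error
        def weighted_error(h):
            return sum(wi for wi, xi, yi in zip(w, X, y) if h(xi) != yi)
        h = min(stumps, key=weighted_error)
        err = min(max(weighted_error(h), 1e-12), 1 - 1e-12)  # clamp for log
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # reweight: misclassified examples gain weight, correct ones lose it
        w = [wi * math.exp(-alpha * yi * h(xi)) for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def boosted_predict(ensemble, x):
    return 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1
```

The per-rule weights `alpha` are confidence scores, not calibrated probabilities, which is exactly the caveat the paper raises about the confidence values BWI returns during extraction.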