Learning With Many Irrelevant Features
In Proceedings of the Ninth National Conference on Artificial Intelligence, 1991
Cited by 187 (3 self)
In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires $\Theta\left(\frac{1}{\epsilon} \ln \frac{1}{\delta} + \frac{1}{\epsilon}\left[2^p + p \ln n\right]\right)$ training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. The paper also presents a quasi-polynomial time algorithm, FOCUS, which implements MIN-FEATURES. Experimental studies are presented that compare FOCUS to the ID3 and FRINGE algorithms. These experiments show that, contrary to expectations, these algorithms do not implement good approximations of MIN-FEATURES. The coverage, sample complexity, and generalization performance of FOCUS are substantially better than those of either ID3 or FRINGE on learning problems where the MIN-FEATURE...
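To make the "only logarithmic in the number of irrelevant features" remark concrete, the expression inside the bound can be evaluated numerically. This is a rough sketch only: a Theta bound hides constant factors, so the function below (the name `min_features_bound` is hypothetical) sets every hidden constant to 1 and should be read as showing the growth trend, not an actual required sample size.

```python
import math

def min_features_bound(epsilon, delta, p, n):
    """Evaluate (1/eps) * ln(1/delta) + (1/eps) * [2^p + p * ln n].

    Assumption: all constants hidden by the Theta notation are taken to be 1,
    so only the relative growth in n and p is meaningful.
    """
    return (math.log(1.0 / delta) + 2 ** p + p * math.log(n)) / epsilon

# Same accuracy/confidence and same number of relevant features (p = 3),
# but 100 times more available features.
small = min_features_bound(0.1, 0.05, p=3, n=100)
large = min_features_bound(0.1, 0.05, p=3, n=10_000)

# The increase is additive: p * ln(100) / epsilon, independent of n itself.
increase = large - small
```

Multiplying n by 100 adds a fixed amount, p ln(100) / epsilon, to the expression, while adding one more relevant feature doubles the 2^p term.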
Learning Boolean Concepts in the Presence of Many Irrelevant Features
Artificial Intelligence, 1994
Cited by 80 (0 self)
In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias in Boolean domains. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires $\Theta\left(\frac{1}{\epsilon} \ln \frac{1}{\delta} + \frac{1}{\epsilon}\left[2^p + p \ln n\right]\right)$ training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. For implementing the MIN-FEATURES bias, the paper presents five algorithms that identify a subset of features sufficient to construct a hypothesis consistent with the training examples. FOCUS-1 is a straightforward algorithm that returns a minimal and sufficient subset of features in quasi-polynomial time. FOCUS-2 does the same task as FOCUS-1 but is empirically shown to be substantially faster than FOCUS-1. Finally, the Simple-Greedy, Mutual-Information-G...
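The idea of returning a minimal and sufficient feature subset can be illustrated with a brute-force search over subsets in order of increasing size. This is a sketch written from the abstract's description alone, not the authors' FOCUS-1 code; the function names and the (features, label) example format are assumptions made for illustration.

```python
from itertools import combinations, product

def is_sufficient(examples, subset):
    """True if no two examples agree on `subset` but carry different labels."""
    seen = {}
    for features, label in examples:
        key = tuple(features[i] for i in subset)
        if seen.setdefault(key, label) != label:
            return False
    return True

def minimal_sufficient_subset(examples, n_features):
    """Try subsets of size 0, 1, 2, ... and return the first sufficient one.

    Returns None only when the examples themselves are contradictory
    (identical feature vectors with different labels).
    """
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if is_sufficient(examples, subset):
                return subset
    return None

# Boolean target x0 XOR x2 over four features; x1 and x3 are irrelevant.
examples = [(bits, bits[0] ^ bits[2]) for bits in product((0, 1), repeat=4)]
relevant = minimal_sufficient_subset(examples, n_features=4)
```

Because subsets are visited smallest first, the first sufficient subset found is also one of minimum size. The cost grows with the number of subsets examined, which is why the search is only practical when few features are relevant.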
Graphical Models for Discovering Knowledge
1995
Cited by 27 (3 self)
There are many different ways of representing knowledge, and for each of these ways there are many different discovery algorithms. How can we compare different representations? How can we mix, match and merge representations and algorithms on new problems with their own unique requirements? This chapter introduces probabilistic modeling as a philosophy for addressing these questions and presents graphical models for representing probabilistic models. Probabilistic graphical models are a unified qualitative and quantitative framework for representing and reasoning with probabilities and independencies.
4.1 Introduction
Perhaps one common element of the discovery systems described in this and previous books on knowledge discovery is that they are all different. Since the class of discovery problems is a challenging one, we cannot write a single program to address all of knowledge discovery. The KEFIR discovery system applied to health care by Matheus, Piatetsky-Shapiro, and McNeill (199...
A comprehensive case study: An examination of machine learning and connectionist algorithms
1995
Iterate: A conceptual clustering method for knowledge discovery in databases
In Braunschweig, B., & Day, R. (Eds.), Innovative Applications of Artificial Intelligence in the Oil and Gas Industry, 1995
Abduction and Explanation-Based Learning: Case Studies in Diverse Domains
1993
Cited by 8 (2 self)
This paper presents a knowledge-based learning method and reports on case studies in different domains. The method integrates abduction and explanation-based learning. Abduction provides an improved method for constructing explanations. The improvement enlarges the set of examples that can be explained so that one can learn from additional examples using traditional explanation-based macro learning. Abduction also provides a form of knowledge level learning. Descriptions of case studies show how to set up abduction engines for tasks in particular domains. The case studies involve over a hundred examples taken from diverse domains requiring logical, physical, and psychological knowledge and reasoning. The case studies are relevant to a wide range of practical tasks including: natural language understanding and plan recognition; qualitative physical reasoning and postdiction; diagnosis and signal interpretation; and decision-making under uncertainty. The descriptions of the ca...
A Bayesian Analysis of Algorithms for Learning Finite Functions
Machine Learning: Proceedings of the Twelfth International Conference (ML95), 1995
Cited by 3 (1 self)
We consider algorithms for learning functions $f : X \to Y$, where X and Y are finite, and there is assumed to be no noise in the data. Learning algorithms, Alg, are connected with $\Gamma(\mathrm{Alg})$, the set of prior probability distributions for which they are optimal. A method for constructing $\Gamma(\mathrm{Alg})$ from Alg is given and the relationship between the various $\Gamma(\mathrm{Alg})$ is discussed. Improper algorithms are identified as those for which $\Gamma(\mathrm{Alg})$ has zero volume. Improper algorithms are investigated using linear algebra and two examples of improper algorithms are given. This framework is then applied to the question of choosing between competing algorithms. "Leave-one-out" cross-validation is hence characterised as a crude method of ML-II prior selection. We conclude by examining how the mathematical results bear on practical problems and by discussing related work, as well as suggesting future work.
1 Introduction
Given the plethora of different learning algorithms produced by the m...
Layered Models of Research Methodologies
1994
Cited by 3 (2 self)
The status of research methodology employed by studies on the application of AI techniques to solving problems in engineering design, analysis, and manufacturing is poor. There may be many reasons for this status including: unfortunate heritage from AI, poor educational system, and researchers' sloppiness. Understanding this status is a prerequisite for improvement. The study of research methodology can promote such understanding, but most importantly, it can assist in improving the situation. This paper introduces concepts from the philosophy of science and builds on them models of worldviews of science. These worldviews are combined with research heuristics or research perspectives and criteria for evaluating research to create a layered model of research methodology. This layered model can serve to organize and facilitate a better understanding of future studies of research methodologies. The paper discusses many of the issues involved in the study of AI and AIEDAM research methodo...
Extending Iterate Conceptual Clustering Scheme In Dealing With Numeric Data
1995
Cited by 2 (0 self)
[Figure 1: The Key Steps in Conceptual Clustering Systems. Recoverable panel labels: "...ion and Interpretation", "Clustering", "Meaningful Clusters with Interpretations".]
... grouping the data objects into clusters or groups based on the similarity of properties among the objects. The goal is to derive more general concepts that describe the problem solving task. The task of interpretation involves determining whether the induced concepts are useful for the problem solving tasks that the user is interested in. This task involves the examination of the intensional description of a class in the context of background knowledge about the domain.
Overview of the Clustering Methods
Traditional approaches to cluster analysis (numerical taxonomy) represent the objects to be clustered as points in a multi-dimensional metric space and adopt distance metrics, such as Euclidean and Mahalanobis measures, to define dissimilarity between objects. Cluster analysis methods take on one of two different forms: 1. parametric methods: they assume t...
DMP3: a dynamic multilayer perceptron construction algorithm
Int. J. Neural Systems, 2001
Cited by 1 (0 self)
This paper presents DMP3 (Dynamic Multilayer Perceptron 3), a multilayer perceptron (MLP) constructive training method that constructs MLPs by incrementally adding network elements of varying complexity to the network. DMP3 differs from other MLP construction techniques in several important ways, and the motivation for these differences is given. Information gain rather than error minimization is used to guide the growth of the network, which increases the utility of newly added network elements and decreases the likelihood that a premature dead end in the growth of the network will occur. The generalization performance of DMP3 is compared with that of several other well-known machine learning and neural network learning algorithms on nine real world data sets. Simulation results show that DMP3 performs better (on average) than any of the other algorithms on the data sets tested. The main reasons for this result are discussed in detail.
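For readers unfamiliar with the criterion mentioned above, information gain is the reduction in label entropy obtained by partitioning a set of examples. The sketch below shows only this generic quantity; how DMP3 applies it to candidate network elements is not described in the abstract, and the function names here are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of `groups`.

    `groups` is a partition of `labels`, for example the labels falling on
    each side of a candidate split.
    """
    total = len(labels)
    remainder = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - remainder

labels = [0, 0, 1, 1]
perfect_split = information_gain(labels, [[0, 0], [1, 1]])   # separates the classes
useless_split = information_gain(labels, [[0, 1], [0, 1]])   # leaves them mixed
```

A partition that separates the classes completely recovers the full entropy of the labels, while one that leaves each group as mixed as the original set yields no gain.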

