Results 1  10
of
16
Selection of relevant features and examples in machine learning
 ARTIFICIAL INTELLIGENCE
, 1997
"... In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been mad ..."
Abstract

Cited by 423 (1 self)
 Add to MetaCart
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.
A System for Induction of Oblique Decision Trees
 Journal of Artificial Intelligence Research
, 1994
"... This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hillclimbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned espe ..."
Abstract

Cited by 251 (13 self)
 Add to MetaCart
This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hillclimbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using both real and artificial data, that analyze OC1's ability to construct oblique trees that are smaller and more accurate than their axisparallel counterparts. We also examine the benefits of randomization for the construction of oblique decision trees. 1. Introduction Current data collection technology provides a unique challenge and opportunity for automated machine learning techniques. The advent of major scientific projects such as the Human Genome Project, the Hubble Space Telescope, and the human brain mappi...
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets
 Journal of Artificial Intelligence Research
, 1997
"... This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to c ..."
Abstract

Cited by 122 (19 self)
 Add to MetaCart
This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of records in the dataset and loglinear in the number of nonzero entries in the contingency table. We provide a very sparse data structure, the ADtree, to minimize memory use. We provide analytical worstcase bounds for this structure for several models of data distribution. We empirically demonstrate that tractablysized data structures can be produced for large realworld datasets by (a) using a sparse tree structure that never allocates memory for counts of zero, (b) never allocating memory for counts that can be deduced from other counts, and (c) not bothering to expand the tree fully near its...
Efficient Locally Weighted Polynomial Regression Predictions
 In Proceedings of the 1997 International Machine Learning Conference
"... Locally weighted polynomial regression (LWPR) is a popular instancebased algorithm for learning continuous nonlinear mappings. For more than two or three inputs and for more than a few thousand datapoints the computational expense of predictions is daunting. We discuss drawbacks with previous appr ..."
Abstract

Cited by 79 (11 self)
 Add to MetaCart
Locally weighted polynomial regression (LWPR) is a popular instancebased algorithm for learning continuous nonlinear mappings. For more than two or three inputs and for more than a few thousand datapoints the computational expense of predictions is daunting. We discuss drawbacks with previous approaches to dealing with this problem, and present a new algorithm based on a multiresolution search of a quicklyconstructible augmented kdtree. Without needing to rebuild the tree, we can make fast predictions with arbitrary local weighting functions, arbitrary kernel widths and arbitrary queries. The paper begins with a new, faster, algorithm for exact LWPR predictions. Next we introduce an approximation that achieves up to a twoordersof magnitude speedup with negligible accuracy losses. Increasing a certain approximation parameter achieves greater speedups still, but with a correspondingly larger accuracy degradation. This is nevertheless useful during operations such as the early stages...
Explanationbased learning and reinforcement learning: A unified view
 In Proceedings Twelfth International Conference on Machsâ€™ne Learning
, 1995
"... Abstract. In speeduplearning problems, where full descriptions of operators are known, both explanationbased learning (EBL) and reinforcement learning (RL) methods can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the ..."
Abstract

Cited by 46 (2 self)
 Add to MetaCart
Abstract. In speeduplearning problems, where full descriptions of operators are known, both explanationbased learning (EBL) and reinforcement learning (RL) methods can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. Most RL methods perform this propagation on a statebystate basis, while EBL methods compute the weakest preconditions of operators, and hence, perform this propagation on a regionbyregion basis. Barto, Bradtke, and Singh (1995) have observed that many algorithms for reinforcement learning can be viewed as asynchronous dynamic programming. Based on this observation, this paper shows how to develop dynamic programming versions of EBL, which we call regionbased dynamic programming or ExplanationBased Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of pointbased dynamic programming and to standard EBL. The results show that regionbased dynamic programming combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of reinforcement learning algorithms (learning of optimal policies). Results are shown in chess endgames and in synthetic maze tasks.
Balanced Cooperative Modeling
 Machine Learning
, 1993
"... Machine learning techniques are often used for supporting a knowledge engineer in constructing a model of part of the world. Different learning algorithms contribute to different tasks within the modeling process. Integrating several learning algorithms into one system allows it to support several m ..."
Abstract

Cited by 32 (3 self)
 Add to MetaCart
Machine learning techniques are often used for supporting a knowledge engineer in constructing a model of part of the world. Different learning algorithms contribute to different tasks within the modeling process. Integrating several learning algorithms into one system allows it to support several modeling tasks within the same framework. In this paper, we focus on the distribution of work between several learning algorithms on the one hand and the user on the other hand. The approach followed by the MOBAL system is that of balanced cooperation, i.e. each modeling task can be done by the user or by a learning tool of the system. The MOBAL system is described in detail. We discuss the principle of multifunctionality of one representation for the balanced use by learning algorithms and users. Key words: multistrategy learning, balanced cooperative modeling, MOBAL 1 Introduction The overall task of knowledge acquisition as well as the one of machine learning has often been described a...
Densityadaptive learning and forgetting
 In Proceedings of the Tenth International Conference on Machine Learning
, 1993
"... We describe a densityadaptive reinforcement learning and a densityadaptive forgetting algorithm. This learning algorithm uses hybrid kD/2ktrees to allow foravariable resolution partitioning and labelling of the input space. The density adaptive forgetting algorithm deletes observations from the ..."
Abstract

Cited by 22 (2 self)
 Add to MetaCart
We describe a densityadaptive reinforcement learning and a densityadaptive forgetting algorithm. This learning algorithm uses hybrid kD/2ktrees to allow foravariable resolution partitioning and labelling of the input space. The density adaptive forgetting algorithm deletes observations from the learning set depending on whether subsequent evidence is available in a local region of the parameter space. The algorithms are demonstrated in a simulation for learning feasible robotic grasp approach directions and orientations and then adapting to subsequent mechanical failures in the gripper. 1
Efficient Incremental Induction of Decision Trees
, 1995
"... This paper proposes a method to improve ID5R, an incremental TDIDT algorithm. The new method evaluates the quality of attributes selected at the nodes of a decision tree and estimates a minimum number of steps for which these attributes are guaranteed such a selection. This results in reducing overh ..."
Abstract

Cited by 16 (0 self)
 Add to MetaCart
This paper proposes a method to improve ID5R, an incremental TDIDT algorithm. The new method evaluates the quality of attributes selected at the nodes of a decision tree and estimates a minimum number of steps for which these attributes are guaranteed such a selection. This results in reducing overheads during incremental learning. The method is supported by theoretical analysis and experimental results. Keywords: Incremental algorithm, decision tree induction 1.
Theory refinement of bayesian networks with hidden variables
 In Machine Learning: Proceedingsof the International Conference
, 1998
"... Copyright by ..."