Results 1 - 10
of
146
A Practical Bayesian Framework for Backprop Networks
- Neural Computation
, 1991
"... A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible: (1) objective comparisons between solutions using alternative network architectures ..."
Abstract
-
Cited by 347 (19 self)
- Add to MetaCart
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible: (1) objective comparisons between solutions using alternative network architectures
Multitask Learning
- MACHINE LEARNING
, 1997
"... Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task ..."
Abstract
-
Cited by 328 (6 self)
- Add to MetaCart
Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. This paper reviews prior work on MTL, presents new evidence that MTL in backprop nets discovers task relatedness without the need of supervisory signals, and presents new results for MTL with k-nearest neighbor and kernel regression. In this paper we demonstrate multitask learning in three domains. We explain how multitask learning works, and show that there are many opportunities for multitask learning in real domains. We present an algorithm and results for multitask learning with case-based methods like k-nearest neighbor and kernel regression, and sketch an algorithm for multitask learning in decision trees. Because multitask learning works, can be applied to many different kinds of domains, and can be used with different learning algorithms, we conjecture there will be many opportunities for its use on real-world problems.
How learning can guide evolution
- Complex Systems
, 1987
"... Reprinted by permission. Abstract. The assumption that acquired characteristics are not inherited is often taken to imply that the adaptations that an organism learns during its lifetime cannot guide the course of evolution. This inference is incorrect (Baldwin, 1896). Learning alters the shape of t ..."
Abstract
-
Cited by 302 (1 self)
- Add to MetaCart
Reprinted by permission. Abstract. The assumption that acquired characteristics are not inherited is often taken to imply that the adaptations that an organism learns during its lifetime cannot guide the course of evolution. This inference is incorrect (Baldwin, 1896). Learning alters the shape of the search space in which evolution operates and thereby provides good evolutionary paths towards sets of co-adapted alleles. We demonstrate that this effect allows learning organisms to evolve much faster than their nonlearning equivalents, even though the characteristics acquired by the phenotype are not communicated to the genotype. 1.
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
- Machine Learning
, 1993
"... In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of t ..."
Abstract
-
Cited by 249 (3 self)
- Add to MetaCart
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other learning algorithms show that our nearest neighbor algorithm is comparable or superior ...
Co-Evolution in the Successful Learning of Backgammon Strategy
- Machine Learning
, 1998
"... Following Tesauro's work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back ..."
Abstract
-
Cited by 100 (24 self)
- Add to MetaCart
Following Tesauro's work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hill-climbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a "meta-game" of self-learning. Keywords: coevolution, backgammon, reinforcement, temporal difference learning, self-learning Running Head: CO-EVOLUTIONARY LEA...
Coevolution of A Backgammon Player
- Proceedings Artificial Life V
"... One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims’s work on artificial robots, most of the work has attacked very simple games of prisoners dilemma or predator and prey. Fol ..."
Abstract
-
Cited by 70 (11 self)
- Add to MetaCart
One of the persistent themes in Artificial Life research is the use of co-evolutionary arms races in the development of specific and complex behaviors. However, other than Sims’s work on artificial robots, most of the work has attacked very simple games of prisoners dilemma or predator and prey. Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and choosing the move with the highest evaluation. However, no back-propagation, reinforcement
Multitask Learning: A Knowledge-Based Source of Inductive Bias
- Proceedings of the Tenth International Conference on Machine Learning
, 1993
"... This paper suggests that it may be easier to learn several hard tasks at one time than to learn these same tasks separately. In effect, the information provided by the training signal for each task serves as a domain-specific inductive bias for the other tasks. Frequentlythe worldgives us clusters o ..."
Abstract
-
Cited by 56 (5 self)
- Add to MetaCart
This paper suggests that it may be easier to learn several hard tasks at one time than to learn these same tasks separately. In effect, the information provided by the training signal for each task serves as a domain-specific inductive bias for the other tasks. Frequentlythe worldgives us clusters of related tasks to learn. When it does not, it is often straightforward to create additional tasks. For many domains, acquiring inductive bias by collecting additional teaching signal may be more practical than the traditional approach of codifying domain-specific biases acquired from human expertise. We call this approach MultitaskLearning (MTL). Since much of the power of an inductive learner follows directly from its inductive bias, multitask learning may yield more powerful learning. An empirical example of multitask connectionist learning is presented where learning improves by training one network on several related tasks at the same time. Multitask decision tree induction is also outl...
Convolutional Networks for Images, Speech, and Time-Series
, 1995
"... INTRODUCTION The ability of multilayer back-propagation networks to learn complex, high-dimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional ..."
Abstract
-
Cited by 56 (2 self)
- Add to MetaCart
INTRODUCTION The ability of multilayer back-propagation networks to learn complex, high-dimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variabilities. A trainable classifier then categorizes the resulting feature vectors (or strings of symbols) into classes. In this scheme, standard, fully-connected multilayer networks can be used as classifiers. A potentially more interesting scheme is to eliminate the feature extractor, feeding the network with "raw" inputs (e.g. normalized images), and to rely on backpropagation to turn the first few layers into an appropriate feature extractor. While this can be done with an ordinary fully connected feed-forward network with some success for tasks
Comparison of Approximate Methods for Handling Hyperparameters
- NEURAL COMPUTATION
"... I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants and noise levels. In the 'evidence framework' the model parameters are integrated over, and the resulting evid ..."
Abstract
-
Cited by 49 (1 self)
- Add to MetaCart
I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models which include unknown hyperparameters such as regularization constants and noise levels. In the 'evidence framework' the model parameters are integrated over, and the resulting evidence is maximized over the hyperparameters. The optimized
Gaussian Processes -- A Replacement for Supervised Neural Networks?
"... These lecture notes are based on the work of Neal (1996), Williams and ..."
Abstract
-
Cited by 43 (0 self)
- Add to MetaCart
These lecture notes are based on the work of Neal (1996), Williams and

