Results 1  10
of
32
Solving multiclass learning problems via errorcorrecting output codes
 Journal of Artificial Intelligence Research
, 1995
"... Multiclass learning problems involve nding a de nition for an unknown function f(x) whose range is a discrete set containing k>2values (i.e., k \classes"). The de nition is acquired by studying collections of training examples of the form hx i;f(x i)i. Existing approaches to multiclass learning ..."
Abstract

Cited by 564 (9 self)
 Add to MetaCart
Multiclass learning problems involve nding a de nition for an unknown function f(x) whose range is a discrete set containing k>2values (i.e., k \classes"). The de nition is acquired by studying collections of training examples of the form hx i;f(x i)i. Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decisiontree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which errorcorrecting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of over tting avoidance techniques such as decisiontree pruning. Finally,we show thatlike the other methodsthe errorcorrecting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that errorcorrecting output codes provide a generalpurpose method for improving the performance of inductive learning programs on multiclass problems. 1.
Evolving Networks: Using the Genetic Algorithm with Connectionist Learning
 In
, 1990
"... It is appealing to consider hybrids of neuralnetwork learning algorithms with evolutionary search procedures, simply because Nature has so successfully done so. In fact, computational models of learning and evolution offer theoretical biology new tools for addressing questions about Nature that hav ..."
Abstract

Cited by 179 (2 self)
 Add to MetaCart
It is appealing to consider hybrids of neuralnetwork learning algorithms with evolutionary search procedures, simply because Nature has so successfully done so. In fact, computational models of learning and evolution offer theoretical biology new tools for addressing questions about Nature that have dogged that field since Darwin [Belew, 1990]. The concern of this paper, however, is strictly artificial: Can hybrids of connectionist learning algorithms and genetic algorithms produce more efficient and effective algorithms than either technique applied in isolation? The paper begins with a survey of recent work (by us and others) that combines Holland's Genetic Algorithm (GA) with connectionist techniques and delineates some of the basic design problems these hybrids share. This analysis suggests the dangers of overly literal representations of the network on the genome (e.g., encoding each weight explicitly). A preliminary set of experiments that use the GA to find unusual but successf...
KnowledgeBased Artificial Neural Networks
, 1994
"... Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for accurately classifying examples not seen during training. The challenge of hybrid learning systems is to use the information provided by one source of information to offset informat ..."
Abstract

Cited by 145 (13 self)
 Add to MetaCart
Hybrid learning methods use theoretical knowledge of a domain and a set of classified examples to develop a method for accurately classifying examples not seen during training. The challenge of hybrid learning systems is to use the information provided by one source of information to offset information missing from the other source. By so doing, a hybrid learning system should learn more effectively than systems that use only one of the information sources. KBANN(KnowledgeBased Artificial Neural Networks) is a hybrid learning system built on top of connectionist learning techniques. It maps problemspecific "domain theories", represented in propositional logic, into neural networks and then refines this reformulated knowledge using backpropagation. KBANN is evaluated by extensive empirical tests on two problems from molecular biology. Among other results, these tests show that the networks created by KBANN generalize better than a wide variety of learning systems, as well as several t...
First and SecondOrder Methods for Learning: between Steepest Descent and Newton's Method
 Neural Computation
, 1992
"... Online first order backpropagation is sufficiently fast and effective for many largescale classification problems but for very high precision mappings, batch processing may be the method of choice. This paper reviews first and secondorder optimization methods for learning in feedforward neura ..."
Abstract

Cited by 126 (6 self)
 Add to MetaCart
Online first order backpropagation is sufficiently fast and effective for many largescale classification problems but for very high precision mappings, batch processing may be the method of choice. This paper reviews first and secondorder optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
Efficient BackProp
, 1998
"... . The convergence of backpropagation learning is analyzed so as to explain common phenomenon observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers expl ..."
Abstract

Cited by 125 (24 self)
 Add to MetaCart
. The convergence of backpropagation learning is analyzed so as to explain common phenomenon observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers explanations of why they work. Many authors have suggested that secondorder optimization methods are advantageous for neural net training. It is shown that most "classical" secondorder methods are impractical for large neural networks. A few methods are proposed that do not have these limitations. 1 Introduction Backpropagation is a very popular neural network learning algorithm because it is conceptually simple, computationally efficient, and because it often works. However, getting it to work well, and sometimes to work at all, can seem more of an art than a science. Designing and training a network using backprop requires making many seemingly arbitrary choices such as the number ...
Symbolic and neural learning algorithms: an experimental comparison
 Machine Learning
, 1991
"... Abstract Despite the fact that many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known regarding their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with ..."
Abstract

Cited by 99 (6 self)
 Add to MetaCart
Abstract Despite the fact that many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known regarding their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with the perception and backpropagation neural learning algorithms have been performed using five large, realworld data sets. Overall, backpropagation performs slightly better than the other two algorithms in terms of classification accuracy on new examples, but takes much longer to train. Experimental results suggest that backpropagation can work significantly better on data sets containing numerical data. Also analyzed empirically are the effects of (1) the amount of training data, (2) imperfect training examples, and (3) the encoding of the desired outputs. Backpropagation occasionally outperforms the other two systems when given relatively small amounts of training data. It is slightly more accurate than ID3 when examples are noisy or incompletely specified. Finally, backpropagation more effectively utilizes a "distributed " output encoding.
ErrorCorrecting Output Codes: A General Method for Improving Multiclass Inductive Learning Programs
 IN PROCEEDINGS OF AAAI91
, 1991
"... Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k ? 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form hx i ; f(x i )i. Existing approaches to this pro ..."
Abstract

Cited by 88 (7 self)
 Add to MetaCart
Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k ? 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form hx i ; f(x i )i. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decisiontree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH errorcorrecting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolatedletter speechrecognition t...
Adaptive Load Migration Systems for PVM
, 1994
"... Adaptive load distribution is necessary for parallel applications to coexist effectively with other jobs in a network of shared heterogeneous workstations. We present three methods that provide such support for PVM applications. Two of these methods, MPVM and UPVM, adapt to changes in the workstati ..."
Abstract

Cited by 50 (2 self)
 Add to MetaCart
Adaptive load distribution is necessary for parallel applications to coexist effectively with other jobs in a network of shared heterogeneous workstations. We present three methods that provide such support for PVM applications. Two of these methods, MPVM and UPVM, adapt to changes in the workstation environment by transparently migrating the virtual processors (VPs) of the parallel application. A VP in MPVM is a Unix process, while UPVM defines lightweight, processlike VPs. The third method, ADM, is a programming methodology for writing programs that perform adaptive load distribution through data movement. These methods are discussed and compared in terms of effectiveness, usability, and performance. Adaptive Load Migration Systems for PVM 2 of 23 1.0 Introduction Messagepassing systems such as PVM [13] allow a heterogeneous network of parallel and serial computers to be programmed as a single computational resource. This resource appears to the application programmer as a d...
Direct Transfer of Learned Information Among Neural Networks
 Proceedings of AAAI91
, 1991
"... A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such ..."
Abstract

Cited by 46 (5 self)
 Add to MetaCart
A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such information into symbolic form (in which it may not be readily expressible), we investigate the direct transfer of information encoded as weights. Here, we focus on how transfer can be used to address the important problem of improving neural network learning speed. First we present an exploratory study of the somewhat surprising effects of presetting network weights on subsequent learning. Guided by hypotheses from this study, we sped up backpropagation learning for two speech recognition tasks. By transferring weights from smaller networks trained on subtasks, we achieved speedups of up to an order of magnitude compared with training starting with random weights, ...
Improving the Performance of Radial Basis Function Networks by Learning Center Locations
 In
, 1992
"... Three methods for improving the performance of (gaussian) radial basis function (RBF) networks were tested on the NETtalk task. In RBF, a new example is classified by computing its Euclidean distance to a set of centers chosen by unsupervised methods. The application of supervised learning to learn ..."
Abstract

Cited by 43 (3 self)
 Add to MetaCart
Three methods for improving the performance of (gaussian) radial basis function (RBF) networks were tested on the NETtalk task. In RBF, a new example is classified by computing its Euclidean distance to a set of centers chosen by unsupervised methods. The application of supervised learning to learn a nonEuclidean distance metric was found to reduce the error rate of RBF networks, while supervised learning of each center's variance resulted in inferior performance. The best improvement in accuracy was achieved by networks called generalized radial basis function (GRBF) networks. In GRBF, the center locations are determined by supervised learning. After training on 1000 words, RBF classifies 56.5% of letters correct, while GRBF scores 73.4% letters correct (on a separate test set). From these and other experiments, we conclude that supervised learning of center locations can be very important for radial basis function learning. 1 Introduction Radial basis function (RBF) networks are ...