Active Learning with Statistical Models, 1995
Cited by 402 (7 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
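For readers unfamiliar with locally weighted regression, one of the two learners named in the abstract above, the following is a minimal sketch of how such a learner predicts at a query point. It is an illustration only: the Gaussian kernel, the `bandwidth` parameter, the 1-D inputs, and the function name `lwr_predict` are assumptions, and the data-selection criterion from the paper is not implemented here.

```python
import math

def lwr_predict(xs, ys, query, bandwidth):
    """Predict y at `query` with a locally weighted linear fit (1-D inputs).

    Each training point is weighted by a Gaussian kernel centred on the
    query (an assumed kernel choice), then a weighted least-squares line
    is fitted and evaluated at the query.
    """
    weights = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    # Weighted means of inputs and outputs.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted variance of x and weighted covariance of x and y.
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    cov_xy = sum(w * (x - mean_x) * (y - mean_y)
                 for w, x, y in zip(weights, xs, ys))
    slope = cov_xy / var_x if var_x > 0 else 0.0
    return mean_y + slope * (query - mean_x)

# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
prediction = lwr_predict(xs, ys, query=1.5, bandwidth=1.0)
```

Because the fit is recomputed around every query, no global training step is needed, which is one reason such learners are cheap to update as new data points are selected.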
A Sequential Algorithm for Training Text Classifiers, 1994
Cited by 365 (9 self)
The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was developed and tested on a newswire text categorization task. This method, which we call uncertainty sampling, reduced by as much as 500-fold the amount of training data that would have to be manually classified to achieve a given level of effectiveness. 1 Introduction Text classification is the automated grouping of textual or partially textual entities. Document retrieval, categorization, routing, filtering, and clustering, as well as natural language processing tasks such as tagging, word sense disambiguation, and some aspects of understanding can be formulated as text classification. As the amount of online text increases, the demand for text classification to aid the analysis and mana...
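The selection step of uncertainty sampling, as described in the abstract above, can be sketched as follows. This is a hedged illustration for the binary case, not the authors' implementation: the function name `uncertainty_sample`, the `batch_size` parameter, and the example probabilities are all hypothetical, and the classifier that would produce the probabilities is assumed to exist elsewhere.

```python
def uncertainty_sample(probs, batch_size):
    """Return indices of the examples whose estimated P(class = 1) is closest to 0.5.

    `probs` holds one posterior estimate per unlabeled example, as produced
    by some probabilistic classifier (assumed, not shown here).
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:batch_size]

# Hypothetical posterior estimates for five unlabeled documents.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.80]
selected = uncertainty_sample(pool_probs, batch_size=2)
```

In a full loop, the selected documents would be labeled by a person, added to the training set, and the classifier retrained before the next round of selection.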
Heterogeneous Uncertainty Sampling for Supervised Learning - In Proceedings of the Eleventh International Conference on Machine Learning, 1994
Cited by 194 (3 self)
Uncertainty sampling methods iteratively request class labels for training instances whose classes are uncertain despite the previous labeled instances. These methods can greatly reduce the number of instances that an expert need label. One problem with this approach is that the classifier best suited for an application may be too expensive to train or use during the selection of instances. We test the use of one classifier (a highly efficient probabilistic one) to select examples for training another (the C4.5 rule induction program). Despite being chosen by this heterogeneous approach, the uncertainty samples yielded classifiers with lower error rates than random samples ten times larger. 1 Introduction Machine learning algorithms have been used to build classification rules from data sets consisting of hundreds of thousands of instances [4]. In some applications unlabeled training instances are abundant but the cost of labeling an instance with its class is high. In the informatio...
Neural network exploration using optimal experiment design - Neural Networks, 1994
Cited by 102 (2 self)
We consider the question "How should one act when the only goal is to learn as much as possible?" Building on the theoretical results of Fedorov [1972] and MacKay [1992], we apply techniques from Optimal Experiment Design (OED) to guide the query/action selection of a neural network learner. We demonstrate that these techniques allow the learner to minimize its generalization error by exploring its domain efficiently and completely. We conclude that, while not a panacea, OED-based query/action has much to offer, especially in domains where its high computational costs can be tolerated.
Committee-Based Sampling For Training Probabilistic Classifiers - In Proceedings of the Twelfth International Conference on Machine Learning, 1995
Cited by 93 (3 self)
In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper proposes a general method for efficiently training probabilistic classifiers, by selecting for training only the more informative examples in a stream of unlabeled examples. The method, committee-based sampling, evaluates the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set selected so far (Monte-Carlo sampling). The method is particularly attractive because it evaluates the expected information gain from a training example implicitly, making the model both easy to implement and generally applicable. We further show how to apply committee-based sampling for training Hidden Markov Model classifiers, which are commonly used for complex classification tasks. The method was implemented and tested for ...
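The idea of scoring an example by committee disagreement can be sketched as below. The abstract does not say which disagreement measure is used, so vote entropy is an assumption here, chosen as one common option. The function names, the labels "N" and "V", and the example votes are hypothetical, and the Monte-Carlo step that draws the committee members is not shown.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy in bits of the labels assigned by the committee to one example."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_most_disputed(committee_votes, k):
    """Return indices of the k examples on which the committee disagrees most."""
    ranked = sorted(range(len(committee_votes)),
                    key=lambda i: vote_entropy(committee_votes[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical votes from a four-member committee on three unlabeled examples.
committee_votes = [
    ["N", "N", "N", "N"],  # unanimous: no disagreement
    ["N", "V", "N", "V"],  # evenly split: maximal disagreement for two labels
    ["N", "N", "N", "V"],  # one dissenting member
]
disputed = select_most_disputed(committee_votes, k=2)
```

Examples with unanimous votes score zero and are skipped, so labeling effort goes to the examples the current model variants cannot agree on.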
A Comprehensive Survey of Fitness Approximation in Evolutionary Computation, 2003
Cited by 65 (6 self)
Evolutionary algorithms (EAs) have received increasing interest in both academia and industry. One main difficulty in applying EAs to real-world applications is that EAs usually need a large number of fitness evaluations before a satisfying result can be obtained. However, fitness evaluations are not always straightforward in many real-world applications. Either an explicit fitness function does not exist, or the evaluation of the fitness is computationally very expensive. In both cases, it is necessary to estimate the fitness function by constructing an approximate model. In this paper, a comprehensive survey of the research on fitness approximation in evolutionary computation is presented. Main issues such as approximation levels, approximate model management schemes, and model construction techniques are reviewed. To conclude, open questions and interesting issues in the field are discussed.
A Framework for Evolutionary Optimization with Approximate Fitness Functions - IEEE Transactions on Evolutionary Computation, 2002
Cited by 56 (12 self)
It is a common engineering practice to use approximate models instead of the original computationally expensive model in optimization. When an approximate model is used for evolutionary optimization, the convergence properties of the evolutionary algorithm are unclear due to the approximation error. In this paper, extensive empirical studies on convergence of an evolution strategy are carried out on two benchmark problems. It is found that incorrect convergence will occur if the approximate model has false optima. To address this problem, individual- and generation-based evolution control is introduced and the resulting effects on the convergence properties are presented. A framework for managing approximate models in generation-based evolution control is proposed. This framework is well suited for parallel evolutionary optimization that is able to guarantee the correct convergence of the evolutionary algorithm and to reduce the computation costs as much as possible. Control o...
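Generation-based evolution control, named in the abstract above, can be pictured as a schedule that decides which generations are evaluated with the original fitness function and which with the approximate model. The sketch below assumes a fixed cycle in which the first few generations of each cycle use the original function. The abstract does not give the scheduling rule, so the fixed cycle, the function name, and the parameters `cycle_len` and `n_controlled` are all assumptions made for illustration.

```python
def evolution_control_schedule(n_generations, cycle_len, n_controlled):
    """Label each generation 'true' (original fitness) or 'approx' (approximate model).

    Within every cycle of `cycle_len` generations, the first `n_controlled`
    generations are evaluated with the original, expensive fitness function.
    """
    if cycle_len <= 0 or not 0 <= n_controlled <= cycle_len:
        raise ValueError("need cycle_len > 0 and 0 <= n_controlled <= cycle_len")
    return ["true" if g % cycle_len < n_controlled else "approx"
            for g in range(n_generations)]

# Hypothetical run: six generations, cycles of three, one controlled generation each.
schedule = evolution_control_schedule(n_generations=6, cycle_len=3, n_controlled=1)
```

Raising `n_controlled` spends more expensive evaluations but limits how long the search can follow a false optimum of the approximate model.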
Learning Routing Queries in a Query Zone, 1997
Cited by 50 (4 self)
Word usage is domain dependent. A common word in one domain can be quite infrequent in another. In this study we exploit this property of word usage to improve document routing. We show that routing queries (profiles) learned only from the documents in a query domain are better than the routing profiles learned when query domains are not used. We approximate a query domain by a query zone. Experiments show that routing profiles learned from a query zone are 8-12% more effective than the profiles generated when no query zoning is used. 1 Background Document routing is an important problem in the field of information retrieval. [12] When a user has marked several articles as relevant to his/her information need, a system should be able to automatically learn the user's "profile" and should be able to route (send) new, potentially interesting, articles to the user. This problem has also been called selective dissemination of information or information filtering. [4] Most current st...
Committee-Based Sample Selection For Probabilistic Classifiers - Journal of Artificial Intelligence Research, 1999
Cited by 35 (0 self)
In many real-world learning tasks it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by sample selection. In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, and extends the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to...
Accelerated Learning By Active Example Selection - International Journal of Neural Systems, 1994
Cited by 31 (10 self)
Much previous work on training multilayer neural networks has attempted to speed up the back-propagation algorithm using more sophisticated weight modification rules, whereby all the given training examples are used in a random or predetermined sequence. In this paper we investigate an alternative approach in which the learning proceeds on an increasing number of selected training examples, starting with a small training set. We derive a measure of criticality of examples and present an incremental learning algorithm that uses this measure to select a critical subset of given examples for solving the particular task. Our experimental results suggest that the method can significantly improve training speed and generalization performance in many real applications of neural networks. This method can be used in conjunction with other variations of gradient descent algorithms. 1 Introduction One of the most widely used methods for training multilayer feedforward neural networks is the erro...

