Results 1 - 10 of 95
A Comparison of Two Learning Algorithms for Text Categorization
- In Third Annual Symposium on Document Analysis and Information Retrieval, 1994
Cited by 239 (1 self)
This paper examines the use of inductive learning to categorize natural language documents into predefined content categories. Categorization of text is of increasing importance in information retrieval and natural language processing systems. Previous research on automated text categorization has mixed machine learning and knowledge engineering methods, making it difficult to draw conclusions about the performance of particular methods. In this paper we present empirical results on the performance of a Bayesian classifier and a decision tree learning algorithm on two text categorization data sets. We find that both algorithms achieve reasonable performance and allow controlled tradeoffs between false positives and false negatives. The stepwise feature selection in the decision tree algorithm is particularly effective in dealing with the large feature sets common in text categorization. However, even this algorithm is aided by an initial prefiltering of features, confirming the results...
Separate-and-conquer rule learning
- Artificial Intelligence Review, 1999
Cited by 118 (29 self)
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
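As a rough illustration of the separate-and-conquer strategy named in this abstract, here is a minimal sketch. It is not taken from the survey: the attribute-value representation, the precision-based greedy heuristic, and the names `covers`, `learn_one_rule`, and `separate_and_conquer` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = Dict[str, str]                  # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example when every test in the rule matches."""
    return all(attrs.get(a) == v for a, v in rule.items())


def learn_one_rule(examples: List[Example]) -> Rule:
    """Conquer step: greedily add tests until no covered example is negative.

    Hypothetical heuristic: prefer the test with the highest precision,
    breaking ties by the number of positives covered.
    """
    rule: Rule = {}
    covered = list(examples)
    while any(not label for _, label in covered):
        candidates = sorted(
            {(a, v) for attrs, _ in covered for a, v in attrs.items() if a not in rule}
        )
        best, best_score, best_subset = None, (-1.0, -1), covered
        for a, v in candidates:
            subset = [(x, y) for x, y in covered if x.get(a) == v]
            pos = sum(1 for _, y in subset if y)
            if pos == 0:
                continue  # a useful test must keep at least one positive
            score = (pos / len(subset), pos)
            if score > best_score:
                best, best_score, best_subset = (a, v), score, subset
        if best is None:
            break  # no further refinement possible (e.g. inconsistent data)
        rule[best[0]] = best[1]
        covered = best_subset
    return rule


def separate_and_conquer(examples: List[Example]) -> List[Rule]:
    """Learn rules one at a time, removing the examples each rule covers."""
    rules: List[Rule] = []
    remaining = list(examples)
    while any(label for _, label in remaining):
        rule = learn_one_rule(remaining)
        rules.append(rule)
        # Separate step: drop everything the new rule covers.
        remaining = [(x, y) for x, y in remaining if not covers(rule, x)]
    return rules
```

The search heuristic, the rule language, and the stopping criterion used here are the simplest possible choices; they correspond loosely to the search, language, and overfitting-avoidance biases along which the survey analyzes such algorithms.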
A Model of Inductive Bias Learning
- Journal of Artificial Intelligence Research, 2000
Cited by 100 (0 self)
A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably-sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded within an environment of related learning tasks. Within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. Under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. Exp...
The Inferential Theory Of Learning: Developing Foundations for . . .
- 1993
Cited by 61 (15 self)
The development of multistrategy learning systems requires a clear understanding of the roles and the applicability conditions of different learning strategies. To this end, this chapter introduces the Inferential Theory of Learning that provides a conceptual framework for explaining logical capabilities of learning strategies, i.e., their competence. Viewing learning as a process of modifying the learner's knowledge by exploring the learner's experience, the theory postulates that any such process can be described as a search in a knowledge space, which involves the learner's experience, prior knowledge and the learning goal. The search operators are instantiations of knowledge transmutations, which are generic patterns of knowledge change. Transmutations may employ any basic type of inference: deduction, induction or analogy. Several fundamental knowledge transmutations are described in a novel and general way, such as generalization, abstraction, explanation and similization, and their counterparts, specialization, concretion, prediction and dissimilization, respectively. Generalization enlarges the reference set of a description (the set of entities that are being described). Abstraction reduces the amount of the detail about the reference set. Explanation generates premises that explain (or imply) the given properties of the reference set. Similization transfers knowledge from one reference set to a similar reference set. Using concepts of the theory, a multistrategy task-adaptive learning (MTL) methodology is outlined, and illustrated by an example. MTL dynamically adapts strategies to the learning task, defined by the input information, learner's background knowledge, and the learning goal. The goal of MTL research is to synergistically integrate a wide range of inferential learning strategies, such as empirical generalization, constructive induction, deductive generalization, explanation, prediction, abstraction, and similization. Keywords: learning theory, inference theory, multi...
Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement
- Machine Learning, 1997
Cited by 58 (27 self)
We study task sequences that allow for speeding up the learner's average reward intake through appropriate shifts of inductive bias (changes of the learner's policy). To evaluate long-term effects of bias shifts setting the stage for later bias shifts we use the "success-story algorithm" (SSA). SSA is occasionally called at times that may depend on the policy itself. It uses backtracking to undo those bias shifts that have not been empirically observed to trigger long-term reward accelerations (measured up until the current SSA call). Bias shifts that survive SSA represent a lifelong success history. Until the next SSA call, they are considered useful and build the basis for additional bias shifts. SSA allows for plugging in a wide variety of learning algorithms. We plug in (1) a novel, adaptive extension of Levin search and (2) a method for embedding the learner's policy modification strategy within the policy itself (incremental self-improvement). Our inductive transfer case studies...
Learning One More Thing
- 1994
Cited by 57 (6 self)
Most research on machine learning has focused on scenarios in which a learner faces a single, isolated learning task. The lifelong learning framework assumes instead that the learner encounters a multitude of related learning tasks over its lifetime, providing the opportunity for the transfer of knowledge. This paper studies lifelong learning in the context of binary classification. It presents the invariance approach, in which knowledge is transferred via a learned model of the invariances of the domain. Results on learning to recognize objects from color images demonstrate superior generalization capabilities if invariances are learned and used to bias subsequent learning. This research is sponsored in part by the National Science Foundation under award IRI-9313367, and by the Wright Laboratory, Aeronautical Systems Center, Air Force Materiel Command, USAF, and the Advanced Research Projects Agency (ARPA) under grant number F33615-93-1-1330. Views and conclusions contained in this doc...
Learning and Problem Solving with Multilayer Connectionist Systems
- 1986
Cited by 49 (1 self)
Learning and Problem Solving with Multilayer Connectionist Systems. September 1986. Charles William Anderson, B.S., University of Nebraska; M.S., University of Massachusetts; Ph.D., University of Massachusetts. Directed by: Professor Andrew G. Barto. The difficulties of learning in multilayered networks of computational units has limited the use of connectionist systems in complex domains. This dissertation elucidates the issues of learning in a network's hidden units, and reviews methods for addressing these issues that have been developed through the years. Issues of learning in hidden units are shown to be analogous to learning issues for multilayer systems employing symbolic representations.
Discovering Neural Nets With Low Kolmogorov Complexity And High Generalization Capability
- Neural Networks, 1997
Cited by 41 (23 self)
Many neural net learning algorithms aim at finding "simple" nets to explain training data. The expectation is: the "simpler" the networks, the better the generalization on test data (→ Occam's razor). Previous implementations, however, use measures for "simplicity" that lack the power, universality and elegance of those based on Kolmogorov complexity and Solomonoff's algorithmic probability. Likewise, most previous approaches (especially those of the "Bayesian" kind) suffer from the problem of choosing appropriate priors. This paper addresses both issues. It first reviews some basic concepts of algorithmic complexity theory relevant to machine learning, and how the Solomonoff-Levin distribution (or universal prior) deals with the prior problem. The universal prior leads to a probabilistic method for finding "algorithmically simple" problem solutions with high generalization capability. The method is based on Levin complexity (a time-bounded generalization of Kolmogorov comple...
For Every Generalization Action, Is There Really An Equal And Opposite Reaction? Analysis of the Conservation Law for Generalization Performance
- Proceedings of the Twelfth International Conference on Machine Learning, 1995
Cited by 38 (0 self)
The "Conservation Law for Generalization Performance" [Schaffer, 1994] states that for any learning algorithm and bias, "generalization is a zero-sum enterprise." In this paper we study the law and show that while the law is true, the manner in which the Conservation Law adds up generalization performance over all target concepts, without regard to the probability with which each concept occurs, is relevant only in a uniformly random universe. We then introduce a more meaningful measure of generalization, expected generalization performance. Unlike the Conservation Law's measure of generalization performance (which is, in essence, defined to be zero), expected generalization performance is conserved only when certain symmetric properties hold in our universe. There is no reason to believe, a priori, that such symmetries exist; learning algorithms may well exhibit non-zero (expected) generalization performance.
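The distinction drawn in this abstract can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's own experiment: the four-point instance space, the split into two training and two test points, `majority_learner`, and `constant_prior` are all hypothetical choices. Generalization performance is taken to be off-training-set accuracy minus 0.5.

```python
from itertools import product
from typing import Callable, Tuple

Labels = Tuple[int, ...]
Learner = Callable[[Labels], Labels]  # training labels -> predicted test labels

N_TRAIN, N_TEST = 2, 2  # assumed toy instance space: 2 training + 2 unseen points


def majority_learner(train_labels: Labels) -> Labels:
    """Hypothetical learner: predict the majority training label everywhere."""
    guess = 1 if 2 * sum(train_labels) >= len(train_labels) else 0
    return (guess,) * N_TEST


def generalization(learner: Learner, target: Labels) -> float:
    """Off-training-set accuracy minus the 0.5 chance level."""
    train, test = target[:N_TRAIN], target[N_TRAIN:]
    preds = learner(train)
    accuracy = sum(p == t for p, t in zip(preds, test)) / N_TEST
    return accuracy - 0.5


def total_generalization(learner: Learner) -> float:
    """Sum over every possible target concept, each counted once."""
    targets = product((0, 1), repeat=N_TRAIN + N_TEST)
    return sum(generalization(learner, t) for t in targets)


def expected_generalization(learner: Learner, prior: Callable[[Labels], float]) -> float:
    """Weight each target concept by its probability of occurring."""
    targets = product((0, 1), repeat=N_TRAIN + N_TEST)
    return sum(prior(t) * generalization(learner, t) for t in targets)


def constant_prior(target: Labels) -> float:
    """Assumed non-uniform universe: only all-0 or all-1 concepts occur."""
    return 0.5 if len(set(target)) == 1 else 0.0


print(total_generalization(majority_learner))                     # 0.0
print(expected_generalization(majority_learner, constant_prior))  # 0.5
```

The unweighted sum is zero for any deterministic learner in this setup, because each test point is labeled 0 and 1 equally often across all targets. The weighted measure is non-zero as soon as the prior favors concepts that the learner's bias matches.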
Inductive Policy: The Pragmatics of Bias Selection
- Machine Learning, 1995
Cited by 37 (9 self)
This paper extends the currently accepted model of inductive bias by identifying six categories of bias and separates inductive bias from the policy for its selection (the inductive policy). We analyze existing "bias selection" systems, examining the similarities and differences in their inductive policies, and identify three techniques useful for building inductive policies. We then present a framework for representing and automatically selecting a wide variety of biases and describe experiments with an instantiation of the framework addressing various pragmatic tradeoffs of time, space, accuracy, and the cost of errors. The experiments show that a common framework can be used to implement policies for a variety of different types of bias selection, such as parameter selection, term selection, and example selection, using similar techniques. The experiments also show that different tradeoffs can be made by the implementation of different policies; for example, from the same data different rule sets can be learned based on different tradeoffs of accuracy versus the cost of erroneous predictions.

