Results 1 - 10 of 14
Robust Logics
Cited by 27 (6 self)
Suppose that we wish to learn from examples and counter-examples a criterion for recognizing whether an assembly of wooden blocks constitutes an arch. Suppose also that we have preprogrammed recognizers for various relationships e.g. on-top-of(x; y), above(x; y), etc. and believe that some possibly complex expression in terms of these base relationships should suffice to approximate the desired notion of an arch. How can we formulate such a relational learning problem so as to exploit the benefits that are demonstrably available in propositional learning, such as attribute-efficient learning by linear separators, and error-resilient learning? We believe that learning in a general setting that allows for multiple objects and relations in this way is a fundamental key to resolving the following dilemma that arises in the design of intelligent systems: Mathematical logic is an attractive language of description because it has clear semantics and sound proof procedures. However, as a basis for large programmed systems it leads to brittleness because, in practice, consistent usage of the various predicate names throughout a system cannot be guaranteed, except in application areas such as mathematics where the viability of the axiomatic method has been demonstrated independently. In this paper we develop the following approach to circumventing this dilemma. We suggest that brittleness can be overcome by using a new kind of logic in which each statement is learnable. By allowing the system to learn rules empirically from the environment, relative to any particular programs it may have for recognizing some base predicates, we enable the system to acquire a set of statements approximately consistent with each other and with the world, without the need for a globally knowledgeable and consistent programmer. We illustrate
Learning to Reason with a Restricted View
1998
Cited by 26 (15 self)
The Learning to Reason framework combines the study of Learning and Reasoning into a single task. Within it, learning is done specifically for the purpose of reasoning with the learned knowledge. Computational considerations show that this is a useful paradigm; in some cases learning and reasoning problems that are intractable when studied separately become tractable when performed as a task of Learning to Reason. In this paper we study Learning to Reason problems where the interaction with the world supplies the learner only partial information in the form of partial assignments. Several natural interpretations of partial assignments are considered and learning and reasoning algorithms using these are developed. The results presented exhibit a tradeoff between learnability, the strength of the oracles used in the interface, and the range of reasoning queries the learner is guaranteed to answer correctly.
Learning From Examples With Unspecified Attribute Values
In Proc. 10th Annu. Conf. on Comput. Learning Theory, 1998
Cited by 19 (1 self)
Note: An earlier version appears in the Tenth Annual ACM Conference on Computational Learning Theory, 1997. Supported in part by NSF NYI Grant CCR-9357707 with matching funds provided by Xerox PARC and WUTA.
We introduce the UAV learning model in which some of the attributes in the examples are unspecified. In our model, an example x is classified positive (resp., negative) if all possible assignments for the unspecified attributes result in a positive (resp., negative) classification. Otherwise the classification given to x is "?" (for unknown). Given an example x in which some attributes are unspecified, the oracle UAV-MQ responds with the classification of x. Given a hypothesis h, the oracle UAV-EQ returns an example x (that could have unspecified attributes) for which h(x) is incorrect. We show that any class learnable in the exact model using the MQ and EQ oracles is also learnable in the UAV model using the MQ and UAV-EQ oracles as long as the counterexamples provided by the UAV-...
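The classification rule stated in this abstract can be illustrated with a minimal brute-force sketch. It is not the paper's algorithm or oracle implementation; the names `uav_classify`, `conj`, and the use of `None` as the "unspecified" marker are assumptions made here for illustration, and the enumeration is exponential in the number of unspecified attributes.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical encoding: each attribute is 0, 1, or None (unspecified).
Example = Sequence[Optional[int]]


def uav_classify(target: Callable[[Tuple[int, ...]], bool], example: Example) -> str:
    """Return "+", "-", or "?" following the rule in the abstract.

    "+" if every completion of the unspecified attributes is positive,
    "-" if every completion is negative, "?" otherwise.
    """
    free = [i for i, v in enumerate(example) if v is None]
    seen = set()
    for bits in product((0, 1), repeat=len(free)):
        full = list(example)
        for i, b in zip(free, bits):
            full[i] = b
        seen.add(bool(target(tuple(full))))
        if len(seen) == 2:  # completions disagree, so the label is unknown
            return "?"
    return "+" if seen == {True} else "-"


def conj(x: Tuple[int, ...]) -> bool:
    """Illustrative target concept (an assumption): x0 AND x1."""
    return x[0] == 1 and x[1] == 1
```

For the illustrative target `conj`, the example `(1, 1, None)` is positive regardless of the third attribute, `(0, None, None)` is negative regardless of the rest, and `(1, None, 0)` depends on the unspecified second attribute, so it receives "?".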
Learning active classifiers
Proceedings of the Thirteenth International Conference on Machine Learning (ICML96), 1996
Cited by 17 (5 self)
Most classification algorithms are "passive", in that they assign a class-label to each instance based only on the description given, even if that description is incomplete. By contrast, an active classifier can -- at some cost -- obtain the values of missing attributes, before deciding upon a class label. This can be useful when considering, for example, whether to extract some information from the web for a critical decision or whether to gather information for a medical test or experiment. The expected utility of using an active classifier depends on both the cost required to obtain the additional attribute values and the penalty incurred if the classifier outputs the wrong classification. This paper analyzes the problem of learning optimal active classifiers, using a variant of the probably-approximately-correct (PAC) model. After defining the framework, we show that this task can be achieved efficiently when the active classifier is allowed to perform only (at most) a constant number of tests. We then show that, in more general environments, the task is often intractable.
Learning to Reason: The Non-Monotonic Case
1995
Cited by 14 (8 self)
We suggest a new approach for the study of the nonmonotonicity of human commonsense reasoning. The two main premises that underlie this work are that commonsense reasoning is an inductive phenomenon, and that missing information in the interaction of the agent with the environment may be as informative for future interactions as observed information. This intuition is formalized and the problem of reasoning from incomplete information is presented as a problem of learning attribute functions over a generalized domain. We consider examples that illustrate various aspects of the non-monotonic reasoning phenomena, which have been used over the years as "bench-marks" for various formalisms, and translate them into Learning to Reason problems. We demonstrate that these have concise representations over the generalized domain and prove that these representations can be learned efficiently. The framework developed suggests an "operational" approach to studying reasoning that is nevertheless ...
Learning to Classify Incomplete Examples
In Computational Learning Theory and Natural Learning Systems: Addressing Real World Tasks, 1993
Cited by 10 (2 self)
Most research on supervised learning assumes the attributes of training and test examples are completely specified. Real-world data, however, is often incomplete. This paper studies the task of learning to classify incomplete test examples, given incomplete (resp., complete) training data. We first show that the performance task of classifying incomplete examples requires the use of default classification functions which demonstrate nonmonotonic classification behavior. We then extend the standard pac-learning model to allow attribute values to be hidden from the classifier, investigate the robustness of various learning strategies, and study the sample complexity of learning classes of default classification functions from examples.
1 Introduction
The central task of most expert systems is classifying objects from some domain of application; i.e., determining whether a particular object belongs to a specified class, given a description of that object (Clancey, 1985). For example, a ...
Logical Analysis of Binary Data with Missing Bits
1999
Cited by 10 (4 self)
We model a given pair of sets of positive and negative examples, each of which may contain missing components, as a partially defined Boolean function with missing bits (pBmb) $(T, F)$, where $T \subseteq \{0, 1, *\}^n$ and $F \subseteq \{0, 1, *\}^n$, and "$*$" stands for a missing bit. Then we consider the problem of establishing a Boolean function (an extension) $f : \{0, 1\}^n \to \{0, 1\}$ belonging to a given function class $C$, such that $f$ is true (resp., false) for every vector in $T$ (resp., in $F$). This is a fundamental problem, encountered in many areas such as learning theory, pattern recognition, example-based knowledge bases, logical analysis of data, knowledge discovery and data mining. In this paper, depending upon how to deal with missing bits, we formulate three types of extensions called robust, consistent and most robust extensions, for various classes of Boolean functions such as general, positive, Horn, threshold, decomposable and k-DNF. The complexity of the associated p...
Rationality
1994
Cited by 8 (2 self)
L.G. Valiant, Harvard University
1 Introduction
Human and certain other biological systems are endowed with remarkable information processing capabilities. They can make observations of the world through their senses, record information in their neural systems, perform various operations internally on this information, and subsequently perform actions that benefit them. The choices they make in selecting their actions may depend subtly on the then prevailing conditions in the world. These conditions are rarely identical to ones previously experienced. Nevertheless, even in a world as complex and uncertain as this one, these systems are able to select their actions with remarkable effectiveness. The information processing capability that enables an entity to abstract information from its experiences and to utilize it effectively in this manner we shall call rationality. An entity is maximally rational if it can make maximal use of the information available to it in order to unders...
Knowing What Doesn't Matter: Exploiting the Omission of Irrelevant Data
Artificial Intelligence, 1997
Cited by 7 (3 self)
Most learning algorithms work most effectively when their training data contain completely specified labeled samples. In many diagnostic tasks, however, the data will include the values of only some of the attributes; we model this as a blocking process that hides the values of those attributes from the learner. While blockers that remove the values of critical attributes can handicap a learner, this paper instead focuses on blockers that remove only conditionally irrelevant attribute values, i.e., values that are not needed to classify an instance, given the values of the other unblocked attributes. We first motivate and formalize this model of "superfluous-value blocking," and then demonstrate that these omissions can be useful, by proving that certain classes that seem hard to learn in the general PAC model --- viz., decision trees and DNF formulae --- are trivial to learn in this setting. We then extend this model to deal with (1) theory revision (i.e., modifying an existing form...
Exploiting the Omission of Irrelevant Data
Artificial Intelligence, 1996
Cited by 6 (2 self)
Most learning algorithms work most effectively when their training data contain completely specified labeled samples. In many diagnostic tasks, however, the data will include the values of only some of the attributes; we model this as a blocking process that hides the values of those attributes from the learner. While blockers that remove the values of critical attributes can handicap a learner, this paper instead focuses on blockers that remove only irrelevant attribute values, i.e., values that are not needed to classify an instance, given the values of the other unblocked attributes. We first motivate and formalize this model of "superfluous-value blocking", and then demonstrate that these omissions can be useful, by proving that certain classes that seem hard to learn in the general PAC model --- viz., decision trees and DNF formulae --- are trivial to learn in this setting. We also show that this model can be extended to deal with (1) theory revision (i.e., modifying an existing ...

