Results 1–10 of 26
Using Unlabeled Data to Improve Text Classification
, 2001
"... One key difficulty with text classification learning algorithms is that they require many handlabeled examples to learn accurately. This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high ..."
Abstract

Cited by 50 (0 self)
One key difficulty with text classification learning algorithms is that they require many hand-labeled examples to learn accurately. This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers. By assuming that documents are created by a parametric generative model, Expectation-Maximization (EM) finds local maximum a posteriori models and classifiers from all the data, labeled and unlabeled. These generative models do not capture all the intricacies of text; however, on some domains this technique substantially improves classification accuracy, especially when labeled data are sparse. Two problems arise from this basic approach. First, unlabeled data can hurt performance in domains where the generative modeling assumptions are too strongly violated. In this case the assumptions can be made more representative in two ways: by modeling sub-topic class structure, and by modeling super-topic hierarchical class relationships. By doing so, model probability and classification accuracy come into correspondence, allowing unlabeled data to improve classification performance. The second problem is that even with a representative model, the improvements given by unlabeled data do not sufficiently compensate for a paucity of labeled data. Here, limited labeled data provide EM initializations that lead to low-probability models. Performance can be significantly improved by using active learning to select high-quality initializations, and by using alternatives to EM that avoid low-probability local maxima.
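The core EM-with-naive-Bayes loop described in this abstract can be sketched minimally as follows. Everything here (the toy vocabulary, documents, Laplace smoothing, and iteration count) is an illustrative assumption, not the dissertation's actual experimental setup:

```python
# Sketch: semi-supervised multinomial naive Bayes trained with EM.
# Labeled documents keep hard class weights; unlabeled documents get
# soft responsibilities that are re-estimated each iteration.
import math
from collections import Counter

def m_step(docs, weights, classes, vocab):
    """Re-estimate class priors and word probabilities from soft labels."""
    prior, word_probs = {}, {}
    for c in classes:
        wc = sum(w[c] for w in weights)
        prior[c] = (1.0 + wc) / (len(classes) + len(docs))  # Laplace smoothing
        counts = Counter()
        for doc, w in zip(docs, weights):
            for word in doc:
                counts[word] += w[c]
        total = sum(counts.values())
        word_probs[c] = {v: (1.0 + counts[v]) / (len(vocab) + total) for v in vocab}
    return prior, word_probs

def e_step(doc, prior, word_probs, classes):
    """Posterior class responsibilities for one document (log-space for stability)."""
    logp = {c: math.log(prior[c]) + sum(math.log(word_probs[c][w]) for w in doc)
            for c in classes}
    m = max(logp.values())
    z = sum(math.exp(v - m) for v in logp.values())
    return {c: math.exp(logp[c] - m) / z for c in classes}

def em_naive_bayes(labeled, labels, unlabeled, classes, vocab, iters=10):
    docs = labeled + unlabeled
    hard = [{c: 1.0 if y == c else 0.0 for c in classes} for y in labels]
    weights = hard + [{c: 1.0 / len(classes) for c in classes} for _ in unlabeled]
    for _ in range(iters):
        prior, wp = m_step(docs, weights, classes, vocab)               # M-step
        weights = hard + [e_step(d, prior, wp, classes) for d in unlabeled]  # E-step
    return prior, wp
```

On a toy corpus with one labeled document per class plus two unlabeled documents, the unlabeled words ("game", "bug") pick up class associations from their labeled co-occurring words, which is exactly the effect the abstract describes.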
Scalability Problems of Simple Genetic Algorithms
 Evolutionary Computation
, 1999
"... Scalable evolutionary computation has become an intensively studied research topic in recent years. The issue of scalability is predominant in any field of algorithmic design, but it became particularly relevant for the design of competent genetic algorithms once the scalability problems of simpl ..."
Abstract

Cited by 38 (4 self)
Scalable evolutionary computation has become an intensively studied research topic in recent years. The issue of scalability is predominant in any field of algorithmic design, but it became particularly relevant for the design of competent genetic algorithms once the scalability problems of simple genetic algorithms were understood. Here we present some of the work that has aided in getting a clear insight into the scalability problems of simple genetic algorithms. In particular, we discuss the important issue of building block mixing. We show how the need for mixing places a boundary in the GA parameter space that, together with the boundary from the schema theorem, delimits the region where the GA converges reliably to the optimum in problems of bounded difficulty. This region shrinks rapidly with increasing problem size unless the building blocks are tightly linked in the problem coding structure. In addition, we look at how straightforward extensions of the simple genetic a...
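For concreteness, a "simple genetic algorithm" of the kind the abstract analyses can be sketched and run on OneMax, where every bit is an order-1 building block and mixing is trivially easy. The operators, parameter values, and problem choice below are illustrative assumptions, not the paper's experiments:

```python
# Sketch: minimal generational GA with binary tournament selection,
# one-point crossover, and per-bit mutation, run on OneMax.
import random

def simple_ga(fitness, n_bits, pop_size=40, generations=60,
              p_cross=0.9, p_mut=None, rng=None):
    rng = rng or random.Random(0)           # seeded for reproducibility
    p_mut = p_mut if p_mut is not None else 1.0 / n_bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_bits)          # one-point crossover
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            else:
                c1, c2 = p1[:], p2[:]
            for c in (c1, c2):                          # bit-flip mutation
                nxt.append([(1 - g) if rng.random() < p_mut else g for g in c])
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

# OneMax: fitness is just the number of 1-bits.
best = simple_ga(sum, 30)
```

On problems of bounded difficulty with longer, loosely linked building blocks, this same algorithm is precisely where the mixing boundary discussed above starts to bite.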
System Identification, Approximation and Complexity
 International Journal of General Systems
, 1977
"... This paper is concerned with establishing broadlybased systemtheoretic foundations and practical techniques for the problem of system identification that are rigorous, intuitively clear and conceptually powerful. A general formulation is first given in which two order relations are postulated on a ..."
Abstract

Cited by 34 (22 self)
This paper is concerned with establishing broadly-based system-theoretic foundations and practical techniques for the problem of system identification that are rigorous, intuitively clear and conceptually powerful. A general formulation is first given in which two order relations are postulated on a class of models: a constant one of complexity; and a variable one of approximation induced by an observed behaviour. An admissible model is such that any less complex model is a worse approximation. The general problem of identification is that of finding the admissible subspace of models induced by a given behaviour. It is proved under very general assumptions that, if deterministic models are required, then nearly all behaviours require models of nearly maximum complexity. A general theory of approximation between models and behaviour is then developed, based on subjective probability concepts and semantic information theory. The roles of structural constraints such as causality, locality, finite memory, etc., are then discussed as rules of the game. These concepts and results are applied to the specific problem of stochastic automaton, or grammar, inference. Computational results are given to demonstrate that the theory is complete and fully operational. Finally, the formulation of identification proposed in this paper is analysed in terms of Klir's epistemological hierarchy, and both are discussed in terms of the rich philosophical literature on the acquisition of knowledge.
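The abstract's notion of admissibility (a model is admissible when any less complex model is a worse approximation) can be made concrete with a toy sketch, under the simplifying assumption that each model reduces to a (complexity, approximation-error) pair:

```python
# Sketch: the admissible subspace of models, assuming each model is
# summarised by a (complexity, error) pair. A model is admissible when
# every strictly less complex model has strictly worse (larger) error.
def admissible(models):
    out = []
    for c, e in models:
        if all(e2 > e for c2, e2 in models if c2 < c):
            out.append((c, e))
    return out
```

Note that a model tied in error with a strictly simpler one is excluded: the simpler model already achieves the same approximation, so the more complex one buys nothing.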
A Bayesian Framework for Concept Learning
 DEPARTMENT OF ARTIFICIAL INTELLIGENCE, EDINBURGH UNIVERSITY
, 1999
"... Human concept learning presents a version of the classic problem of induction, which is made particularly difficult by the combination of two requirements: the need to learn from a rich (i.e. nested and overlapping) vocabulary of possible concepts and the need to be able to generalize concepts reaso ..."
Abstract

Cited by 21 (3 self)
Human concept learning presents a version of the classic problem of induction, which is made particularly difficult by the combination of two requirements: the need to learn from a rich (i.e. nested and overlapping) vocabulary of possible concepts and the need to be able to generalize concepts reasonably from only a few positive examples. I begin this thesis by considering a simple number concept game as a concrete illustration of this ability. On this task, human learners can with reasonable confidence lock in on one out of a billion billion billion logically possible concepts, after seeing only four positive examples of the concept, and can generalize informatively after seeing just a single example. Neither of the two classic approaches to inductive inference (hypothesis testing in a constrained space of possible rules, and computing similarity to the observed examples) can provide a complete picture of how people generalize concepts in even this simple setting. This thesis prop...
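The Bayesian "size principle" behind this style of concept learning can be sketched for a toy version of the number game. The hypothesis space and uniform prior below are illustrative assumptions, far smaller than the space the thesis considers:

```python
# Sketch: Bayesian concept learning in a toy number game over 1..100.
# Likelihood (1/|h|)^n (the size principle) makes small hypotheses that
# still contain all examples dominate rapidly as examples accumulate.
from fractions import Fraction

hypotheses = {
    "even":            [n for n in range(1, 101) if n % 2 == 0],
    "odd":             [n for n in range(1, 101) if n % 2 == 1],
    "multiples of 10": [n for n in range(1, 101) if n % 10 == 0],
    "powers of 2":     [n for n in range(1, 101) if n & (n - 1) == 0],
}

def number_game_posterior(examples, hypotheses):
    """Uniform prior; likelihood is (1/|h|)^n if h contains all examples, else 0."""
    scores = {}
    for name, extension in hypotheses.items():
        if all(x in extension for x in examples):
            scores[name] = Fraction(1, len(extension)) ** len(examples)
        else:
            scores[name] = Fraction(0)
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}
```

After a single example like 16 the small "powers of 2" hypothesis already beats "even"; after four consistent examples it captures essentially all the posterior mass, mirroring the rapid lock-in the abstract describes.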
An algebra of human concept learning
 Journal of Mathematical Psychology
, 2006
"... An important element of learning from examples is the extraction of patterns and regularities from data. This paper investigates the structure of patterns in data defined over discrete features, i.e. features with two or more qualitatively distinct values. Any such pattern can be algebraically decom ..."
Abstract

Cited by 15 (5 self)
An important element of learning from examples is the extraction of patterns and regularities from data. This paper investigates the structure of patterns in data defined over discrete features, i.e. features with two or more qualitatively distinct values. Any such pattern can be algebraically decomposed into a spectrum of component patterns, each of which is a simpler or more atomic "regularity." Each component regularity involves a certain number of features, referred to as its degree. Regularities of lower degree represent simpler or more coarse patterns in the original pattern, while regularities of higher degree represent finer or more idiosyncratic patterns. The full spectral breakdown of a pattern into component regularities of minimal degree, referred to as its power series, expresses the original pattern in terms of the regular rules or patterns it obeys, amounting to a kind of "theory" of the pattern. The number of regularities at various degrees necessary to represent the pattern is tabulated in its power spectrum, which expresses how much of a pattern's structure can be explained by regularities of various levels of complexity. A weighted mean of the pattern's spectral power gives a useful numeric summary of its overall complexity, called its algebraic complexity. The basic theory of algebraic decomposition is extended in several ways, including algebraic accounts of the typicality of individual objects within concepts, and estimation of the power series from noisy data. Finally some relations between these algebraic quantities and empirical data are discussed.
Bridging the gap between distance and generalisation: Symbolic learning in metric spaces
, 2008
"... Distancebased and generalisationbased methods are two families of artificial intelligence techniques that have been successfully used over a wide range of realworld problems. In the first case, general algorithms can be applied to any data representation by just changing the distance. The metric ..."
Abstract

Cited by 7 (4 self)
Distance-based and generalisation-based methods are two families of artificial intelligence techniques that have been successfully used over a wide range of real-world problems. In the first case, general algorithms can be applied to any data representation by just changing the distance. The metric space sets the search and learning space, which is generally instance-oriented. In the second case, models can be obtained for a given pattern language, which can be comprehensible. The generality-ordered space sets the search and learning space, which is generally model-oriented. However, the concepts of distance and generalisation clash in many different ways, especially when knowledge representation is complex (e.g. structured data). This work establishes a framework where these two fields can be integrated in a consistent way. We introduce the concept of distance-based generalisation, which connects all the generalised examples in such a way that all of them are reachable inside the generalisation by using straight paths in the metric space. This makes the metric space and the generality-ordered space coherent (or even dual). Additionally, we also introduce a definition of minimal distance-based generalisation that can be seen as the first formulation of the Minimum Description Length (MDL)/Minimum Message Length (MML) principle in terms of a distance function. We instantiate and develop the framework for the most common data representations and distances, where we show that consistent instances can be found for numerical data, nominal data, sets, lists, tuples, graphs, first-order atoms and clauses. As a result, general learning methods that integrate the best from distance-based and generalisation-based methods can be defined and adapted to any specific problem by appropriately choosing the distance, the pattern language and the generalisation operator.
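The simplest instance of this idea is numerical data under the |x − y| metric, where the minimal generalisation of a set of examples is the tightest enclosing interval, and every covered element is reachable from any example along a straight path that stays inside the generalisation. The sketch below is a toy rendering of that single case, with hypothetical function names, not the paper's general framework:

```python
# Sketch: distance-based generalisation for real numbers under |x - y|.
def minimal_generalisation(examples):
    """Tightest interval containing the examples: the minimal
    distance-based generalisation in this one-dimensional setting."""
    return (min(examples), max(examples))

def covers(gen, x):
    lo, hi = gen
    return lo <= x <= hi

def straight_path_inside(gen, a, b, steps=10):
    """Check that the straight segment between two covered points stays
    covered, i.e. the generalisation is connected by straight paths."""
    return all(covers(gen, a + (b - a) * t / steps) for t in range(steps + 1))
```

Because intervals are convex, the straight-path condition holds automatically here; the paper's contribution is making the analogous condition meaningful for sets, lists, graphs and first-order clauses.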
Basic Elements and Problems of Probability Theory
, 1999
"... After a brief review of ontic and epistemic descriptions, and of subjective, logical and statistical interpretations of probability, we summarize the traditional axiomatization of calculus of probability in terms of Boolean algebras and its settheoretical realization in terms of Kolmogorov probabil ..."
Abstract

Cited by 7 (0 self)
After a brief review of ontic and epistemic descriptions, and of subjective, logical and statistical interpretations of probability, we summarize the traditional axiomatization of the calculus of probability in terms of Boolean algebras and its set-theoretical realization in terms of Kolmogorov probability spaces. Since the axioms of mathematical probability theory say nothing about the conceptual meaning of "randomness", one considers probability as a property of the generating conditions of a process, so that one can relate randomness to predictability (or retrodictability). In the measure-theoretical codification of stochastic processes, genuine chance processes can be defined rigorously as so-called regular processes which do not allow a long-term prediction. We stress that stochastic processes are equivalence classes of individual point functions, so that they do not refer to individual processes but only to an ensemble of statistically equivalent individual processes. Less popular, but conceptually more important than statistical descriptions, are individual descriptions which refer to individual chaotic processes. First, we review the individual description based on the generalized harmonic analysis of Norbert Wiener. It allows the definition of individual purely chaotic processes which can be interpreted as trajectories of regular statistical stochastic processes. Another individual description refers to algorithmic procedures which connect the intrinsic randomness of a finite sequence with the complexity of the shortest program necessary to produce the sequence. Finally, we ask why there can be laws of chance. We argue that random events fulfill the laws of chance if and only if they can be reduced to (possibly hidden) deterministic events. This mathematical result may elucidate the fact that not all non-predictable events can be grasped by the methods of mathematical probability theory.
Concept Acquisition by Autonomous Agents: Cognitive Modeling versus the Engineering Approach
 Lund University, Sweden
, 1992
"... This paper is a treatment of the problem of concept acquisition by autonomous agents, primarily from an AI point of view. However, as this problem is not very well studied in AI and as humans are indeed a kind of autonomous agent, the problem is also studied from a psychological point of view to see ..."
Abstract

Cited by 6 (2 self)
This paper is a treatment of the problem of concept acquisition by autonomous agents, primarily from an AI point of view. However, as this problem is not very well studied in AI, and as humans are indeed a kind of autonomous agent, the problem is also studied from a psychological point of view to see if the research on human concept acquisition can be of any help when designing artificial agents. However, acquisition cannot be studied in isolation, since it is dependent on more fundamental aspects of concepts; consequently, these are studied as well. Thus, this paper gives a review of some of the research done in cognitive psychology and AI (and, to some extent, philosophy) on different aspects of concepts. Some proposals for how central problems should be attacked, and some pointers for further research, are also presented.
Non-Boolean Descriptions for Mind-Matter Problems
"... A framework for the mindmatter problem in a holistic universe which has no parts is outlined. The conceptual structure of modern quantum theory suggests to use complementary Boolean descriptions as elements for a more comprehensive nonBoolean description of a world without an apriorigiven mindmat ..."
Abstract

Cited by 6 (0 self)
A framework for the mind-matter problem in a holistic universe which has no parts is outlined. The conceptual structure of modern quantum theory suggests using complementary Boolean descriptions as elements for a more comprehensive non-Boolean description of a world without an a priori given mind-matter distinction. Such a description in terms of a locally Boolean but globally non-Boolean structure makes allowance for the fact that Boolean descriptions play a privileged role in science. If we accept the insight that there are no ultimate building blocks, the existence of holistic correlations between contextually chosen parts is a natural consequence. The main problem of a genuinely non-Boolean description is to find an appropriate partition of the universe of discourse. If we adopt the idea that all fundamental laws of physics are invariant under time translations, then we can consider a partition of the world into a tenseless and a tensed domain. In the sense of a regulative principle, the material domain is defined as the tenseless domain with its homogeneous time. The tensed domain contains the mental domain with a tensed time characterized by a privileged position, the Now. Since this partition refers to two complementary descriptions which are not given a priori, we have to expect correlations between these two domains. In physics this corresponds to Newton's separation of universal laws of nature and contingent initial conditions. Both descriptions have a non-Boolean structure and can be encompassed into a single non-Boolean description. Tensed and tenseless time can be synchronized by holistic correlations.
Pauli’s ideas on mind and matter in the context of contemporary science
 Journal of Consciousness Studies
, 2006
"... ..."