Generalizing Case Frames Using a Thesaurus and the MDL Principle
- Computational Linguistics
, 1998
Cited by 95 (4 self)
In this paper, we confine ourselves to the former issue, and refer the interested reader to Li and Abe (1996), which deals with the latter issue.
General and Efficient Multisplitting of Numerical Attributes
, 1999
Cited by 31 (7 self)
Often in supervised learning numerical attributes require special treatment and do not fit the learning scheme as well as one could hope. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. We characterize the well-behavedness of an evaluation function, a property that guarantees the optimal multi-partition of an arbitrary numerical domain to be defined on boundary points. Well-behavedness reduces the number of candidate cut points that need to be examined in multisplitting numerical attributes. Many commonly used attribute evaluation functions possess this property; we demonstrate that the cumulative functions Information Gain and Training Set Error as well as the non-cumulative functions Gain Ratio and Normalized Distance Measure are all well-behaved. We also devise a method of finding optimal multisplits efficiently by examining the minimum number of boundary point combinations that is required to produce partitions which are optimal wit...
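The notion of a boundary point can be illustrated with a small sketch. It assumes the usual definition: a cut between two adjacent distinct attribute values is a boundary point when some example at the lower value and some example at the higher value have different classes. The function name `boundary_points` and the choice of the midpoint as the cut threshold are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def boundary_points(values, labels):
    """Return candidate cut thresholds that are boundary points.

    Assumption: a cut between adjacent distinct values is a boundary
    point unless both values carry one and the same single class.
    The midpoint between the two values is used as the threshold.
    """
    # Collect the set of class labels observed at each attribute value.
    classes_at = defaultdict(set)
    for v, y in zip(values, labels):
        classes_at[v].add(y)

    distinct = sorted(classes_at)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        # More than one class across the two neighbours means some pair
        # of examples on either side of the cut differs in class.
        if len(classes_at[lo] | classes_at[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


if __name__ == "__main__":
    print(boundary_points([1, 2, 3, 4, 5, 6], list("aabbaa")))
```

Of the five possible cuts in the usage example, only the two where the class changes survive as candidates, which is the kind of reduction the abstract describes.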
Language as an evolutionary system
, 2005
Cited by 14 (1 self)
John Maynard Smith and Eörs Szathmáry argued that human language signified the eighth major transition in evolution: human language marked a new form of information transmission from one generation to another [Maynard Smith J, Szathmáry E. The major transitions in evolution. Oxford: Oxford Univ. Press; 1995]. According to this view language codes cultural information and as such forms the basis for the evolution of complexity in human culture. In this article we develop the theory that language also codes information in another sense: languages code information on their own structure. As a result, languages themselves provide information that influences their own survival. To understand the consequences of this theory we discuss recent computational models of linguistic evolution. Linguistic evolution is the process by which languages themselves evolve. This article draws together this recent work on linguistic evolution and highlights the significance of this process in understanding the evolution of linguistic complexity. Our conclusions are that (1) the process of linguistic transmission constitutes the basis for an evolutionary system, and (2) this evolutionary system is only superficially comparable to the process of
Probabilistic Models for Bacterial Taxonomy
- INTERNATIONAL STATISTICAL REVIEW
, 2000
Cited by 7 (3 self)
We give a survey of different probabilistic partitioning methods that have been applied to bacterial taxonomy. We introduce a theoretical framework, which makes it possible to treat the various models in a unified way. The key concepts of our approach are prediction and storing of microbiological information in a Bayesian forecasting setting. We show that there is a close connection between classification and probabilistic identification and that, in fact, our approach ties these two concepts together in a coherent way.
The Minimax Strategy for Gaussian Density Estimation
- PROC. 13TH ANNU. CONFERENCE ON COMPUT. LEARNING THEORY
, 2000
Cited by 7 (1 self)
We consider on-line density estimation with a Gaussian of unit variance. In each trial $t$ the learner predicts a mean $\theta_t$. Then it receives an instance $x_t$ chosen by the adversary and incurs loss $\frac{1}{2}(\theta_t - x_t)^2$. The performance of the learner is measured by the regret defined as the total loss of the learner minus the total loss of the best mean parameter chosen off-line. We assume that the horizon $T$ of the protocol is fixed and known to both parties. We give the optimal strategies for both the learner and the adversary. The value of the game is $\frac{1}{2}X^2(\ln T - \ln\ln T + O(\ln\ln T/\ln T))$, where $X$ is an upper bound of the 2-norm of instances. We also consider the standard algorithm that predicts with $\theta_t = \sum_{q=1}^{t-1} x_q/(t-1+a)$ for a fixed $a$. We show that the regret of this algorithm is $\frac{1}{2}X^2(\ln T - O(1))$ regardless of the choice of $a$.
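The protocol and the standard algorithm described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes scalar instances, a non-empty sequence, and a smoothing constant a > 0 (so the first prediction is well defined), and the function name `online_gaussian_regret` is hypothetical. The learner predicts the sum of past instances divided by (t - 1 + a), pays half the squared error, and the regret is its total loss minus the loss of the best fixed mean in hindsight, which is the sample mean.

```python
def online_gaussian_regret(xs, a=1.0):
    """Regret of the smoothed running-mean predictor on the sequence xs.

    Assumptions: xs is a non-empty list of scalars and a > 0.
    Loss per trial is 0.5 * (prediction - x) ** 2.
    """
    running_sum = 0.0
    learner_loss = 0.0
    for t, x in enumerate(xs, start=1):
        # Predict before seeing x: sum of the first t-1 instances / (t-1+a).
        theta = running_sum / (t - 1 + a)
        learner_loss += 0.5 * (theta - x) ** 2
        running_sum += x

    # Best off-line mean parameter is the sample mean of all instances.
    best_mean = running_sum / len(xs)
    best_loss = sum(0.5 * (best_mean - x) ** 2 for x in xs)
    return learner_loss - best_loss


if __name__ == "__main__":
    print(online_gaussian_regret([1.0, -1.0], a=1.0))
```

On the two-instance sequence in the usage example the learner predicts 0 and then 0.5, so its loss is 0.5 + 1.125, while the best fixed mean 0 has loss 1.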
Psychophysical investigations of incomplete forms and forms with background
- University of Minnesota
, 1999
Cited by 4 (1 self)
Incompleteness and background are two important types of variance found in images of objects. It has been proposed that a bidirectional network within the visual cortex allows organisms to cope with this variability. In this thesis, the problems of incompleteness and background are defined in detail and various bidirectional (feed-forward and back-projecting) network solutions are proposed and discussed. Three experiments were performed to investigate how such a network might recognize objects which are incomplete or backgrounded. In the first experiment, spatial and temporal manipulations of illusory contours are used to test the hypothesis that a bidirectional network is responsible for illusory contour formation. In the second experiment, incomplete and backgrounded versions of the same object are studied to test the hypothesis that the real purpose of neural back projections is segmentation rather than object completion. And, in the third experiment, novel camouflage objects are used to study the ability or inability of the brain to learn new object representations, when the brain is without the benefit of active back projections.
Computational Machine Learning in Theory and Praxis
, 1995
Cited by 4 (0 self)
In the last few decades a computational approach to machine learning has emerged based on paradigms from recursion theory and the theory of computation. Such ideas include learning in the limit, learning by enumeration, and probably approximately correct (PAC) learning. These models usually are not suitable in practical situations. In contrast, statistics-based inference methods have enjoyed a long and distinguished career. Currently, Bayesian reasoning in various forms, minimum message length (MML) and minimum description length (MDL), are widely applied approaches. They are the tools to use with particular machine learning praxis such as simulated annealing, genetic algorithms, genetic programming, artificial neural networks, and the like. These statistical inference methods select the hypothesis which minimizes the sum of the length of the description of the hypothesis (also called `model') and the length of the description of the data relative to the hypothesis. It appears to us th...
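The selection rule stated in this abstract, minimizing the description length of the hypothesis plus that of the data given the hypothesis, can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration: the two candidate models (a fair coin with no parameters and a biased coin with one fitted parameter), the common approximation of 0.5 * log2(n) bits for encoding one real parameter, and the function name `two_part_mdl`.

```python
import math


def two_part_mdl(bits):
    """Pick between a fair-coin and a biased-coin model by two-part code length.

    Assumptions: bits is a non-empty list of 0/1 values; the biased model
    pays 0.5 * log2(n) bits for its fitted parameter (a common approximation).
    Returns (name of the chosen model, dict of total code lengths in bits).
    """
    n = len(bits)
    ones = sum(bits)

    # Fair coin: no parameter to describe, one bit per outcome.
    fair_total = 0.0 + n * 1.0

    # Biased coin: describe the fitted probability, then the data under it.
    p = ones / n
    data_bits = 0.0
    for count, prob in ((ones, p), (n - ones, 1.0 - p)):
        if count > 0:  # skip outcomes that never occur (avoids log2(0))
            data_bits -= count * math.log2(prob)
    biased_total = 0.5 * math.log2(n) + data_bits

    totals = {"fair": fair_total, "biased": biased_total}
    return min(totals, key=totals.get), totals


if __name__ == "__main__":
    print(two_part_mdl([1] * 16))
    print(two_part_mdl([0, 1] * 8))
```

A constant sequence is cheaper to describe with the biased model despite the parameter cost, whereas a balanced sequence gains nothing from the extra parameter and the fair coin is selected.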
Multiscale Image Segmentation Using Joint Texture and Shape Analysis
- in Proc. of SPIE on Wavelet Applications in Signal and Image Processing VIII
, 2000
Cited by 3 (1 self)
We develop a general framework to simultaneously exploit texture and shape characterization in multiscale image segmentation. By posing multiscale segmentation as a model selection problem, we invoke the powerful framework offered by minimum description length (MDL). This framework dictates that multiscale segmentation comprises multiscale texture characterization and multiscale shape coding. Analysis of current multiscale maximum a posteriori (MAP) segmentation algorithms reveals that these algorithms implicitly use a shape coder with the aim of estimating the optimal MDL solution, but find only an approximate solution. Towards achieving better segmentation estimates, we first propose a shape coding algorithm based on zero-trees which is well-suited to represent images with large homogeneous regions. For this coder, we design an efficient tree-based algorithm using dynamic programming that attains the optimal MDL segmentation estimate. To incorporate arbitrary shape coding techniques in...
Supervised Ranking in Open-Domain Text Summarization
- Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
, 2002
Cited by 1 (1 self)
The paper proposes and empirically motivates an integration of supervised learning with unsupervised learning to deal with human biases in summarization. In particular, we explore the use of a probabilistic decision tree within the clustering framework to account for the variation as well as the regularity in human-created summaries.
Tree Augmented Classification of Binary Data Minimizing Stochastic Complexity
, 2002
Cited by 1 (1 self)
We establish the algorithms and procedures that augment by trees the classifiers of binary feature vectors in (Gyllenberg et al. 1993, 1997, Gyllenberg et al. 1999 and Gyllenberg and Koski 2002). The notion of augmenting a classifier by a tree is due to (Chow and Liu 1968) and in a more extensive form due to (Friedman et al. 1997). These techniques will in another report be primarily applied to unsupervised classification of bacterial DNA fingerprints (or electrophoretic patterns), cf. (Gyllenberg and Koski 2001 (a), Rademaker et al. 1999). By classification we mean here both the (unsupervised) procedures of finding the classes in (training) data of items as well as the actual outcome of the procedure, i.e., a partitioning of the items. By identification we mean the procedures for finding the assignment of items in classes, pre-established in one way or the other. The distinction should be clear, although the algorithms of classification as given in the sequel will also...

