Unsupervised learning of word segmentation rules with genetic algorithms and inductive logic programming
- Machine Learning
, 2001
Cited by 14 (1 self)
This article presents a combination of unsupervised and supervised learning techniques for the generation of word segmentation rules from a raw list of words. First, a language bias for word segmentation is introduced and a simple genetic algorithm is used in the search for a segmentation that corresponds to the best bias value. In the second phase, the words segmented by the genetic algorithm are used as input for the first-order decision list learner CLOG. The result is a set of first-order rules which can be used for segmentation of unseen words. When applied to either the training data or unseen data, these rules produce segmentations which are linguistically meaningful and largely conform to the annotation provided. Keywords: unsupervised machine learning, inductive logic programming, natural language, word segmentation
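The first phase described above can be illustrated with a small sketch. The abstract does not define the language bias, so the cost used here (total characters across the distinct prefixes and distinct suffixes produced by a segmentation) is an assumption made for illustration, as are the names `lexicon_cost` and `evolve_segmentation` and all parameter values. Each individual holds one split position per word, and lower cost is treated as better.

```python
import random


def lexicon_cost(words, splits):
    """Assumed bias: characters in the distinct prefixes plus the distinct suffixes."""
    prefixes = {w[:k] for w, k in zip(words, splits)}
    suffixes = {w[k:] for w, k in zip(words, splits)}
    return sum(map(len, prefixes)) + sum(map(len, suffixes))


def evolve_segmentation(words, pop_size=30, generations=200,
                        mutation_rate=0.1, seed=0):
    """Search for one split position per word with a simple genetic algorithm."""
    rng = random.Random(seed)

    def random_individual():
        # A split of 0 or len(w) means an empty prefix or an empty suffix.
        return [rng.randint(0, len(w)) for w in words]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=lambda s: lexicon_cost(words, s))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(0, len(words))      # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally re-draw the split position of a word.
            child = [rng.randint(0, len(w)) if rng.random() < mutation_rate else k
                     for w, k in zip(words, child)]
            children.append(child)
        population = survivors + children
    return min(population, key=lambda s: lexicon_cost(words, s))


if __name__ == "__main__":
    vocabulary = ["walked", "talked", "walking", "talking"]
    best = evolve_segmentation(vocabulary)
    for word, k in zip(vocabulary, best):
        print(word[:k], "+", word[k:])
```

The segmented words returned by such a search would then serve as training examples for the supervised second phase; that step depends on CLOG and is not sketched here.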
Partially Supervised Learning of Morphology with Stochastic Transducers
Cited by 8 (2 self)
In this paper I present an algorithm for the unsupervised learning of morphology using stochastic finite state transducers, in particular Pair Hidden Markov Models. The task is viewed as an alignment problem between two sets of words. A supervised model of morphology acquisition is converted to an unsupervised model by treating the alignment as a further hidden variable. The use of the Expectation-Maximisation algorithm for this task is studied, which leads to calculations involving the permanent of a matrix of probabilities.
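To make the mention of the permanent concrete: one way it can arise is when a probability is summed over every one-to-one alignment between two equally sized word sets. If `matrix[i][j]` is the probability of pairing word i of the first set with word j of the second, that sum is the permanent of the matrix. The sketch below is a brute-force illustration under that reading, not the paper's method, and the function name `permanent` is chosen here. It enumerates all n! alignments, so it is only practical for very small n.

```python
from itertools import permutations
from math import prod


def permanent(matrix):
    """Sum over all permutations p of the products matrix[i][p[i]]."""
    n = len(matrix)
    return sum(
        prod(matrix[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


if __name__ == "__main__":
    # Two words on each side: the two possible alignments contribute
    # 0.9 * 0.8 and 0.1 * 0.2 respectively.
    pair_probabilities = [[0.9, 0.1],
                          [0.2, 0.8]]
    print(permanent(pair_probabilities))
```

Unlike the determinant, the permanent carries no alternating signs, which is part of why no comparably cheap general algorithm is known for it.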
Analogical Prediction
, 1999
Cited by 7 (0 self)
Inductive Logic Programming (ILP) involves constructing an hypothesis H on the basis of background knowledge B and training examples E. An independent test set is used to evaluate the accuracy of H. This paper concerns an alternative approach called Analogical Prediction (AP). AP takes B, E and then for each test example ⟨x, y⟩ forms an hypothesis H_x from B, E, x. Evaluation of AP is based on estimating the probability that H_x(x) = y for a randomly chosen ⟨x, y⟩. AP has been implemented within CProgol4.4. Experiments in the paper show that on English past tense data AP has significantly higher predictive accuracy than both previously reported results and CProgol in inductive mode. However, on KRK illegal AP does not outperform CProgol in inductive mode. We conjecture that AP has advantages for domains in which a large proportion of the examples must be treated as exceptions with respect to the hypothesis vocabulary. The relationship of AP to analogy and instance-based lear...
The Segmentation Problem in Morphology Learning
, 1998
Cited by 7 (0 self)
this paper, I briefly discuss some experiments on learning morphological forms in languages with much richer morphological paradigms. Such languages are common throughout much of the globe (from Latin and Greek to Inuit and Cashinahua or Anmajere and Kayardild - to finish with some Australian examples). Attempting to learn morphology in languages with rich morphology raises quite different problems from those discussed in the work above, issues discussed - if rather naively and unsatisfactorily from a computational viewpoint - in earlier work such as Pinker (1984), MacWhinney (1978) and Peters (1983). Foremost among these is the segmentation problem of how one cuts the complex morphological forms into bits with meanings identified. Note that I assume here that the child has already figured out the meanings of words. This is a big assumption, but it is reasonable for a model to focus on one aspect of the learning problem - and at any rate the learning task is still much broader and more realistic than that attempted by the recent English past tense literature. It may not even be unrealistic; see Pinker (1984:29-30) for a general defense of assuming some form of "semantic bootstrapping" and MacWhinney (1978:70-71) for arguments for the learning of word meanings before gaining a productive understanding of them ("it appears that the use of inflections in amalgams is stabilized semantically before these amalgams are analyzed morphologically"). Thus the learning task which I am attempting to address could be stated thus: Given a set of words and a representation of their meanings, determine an internalized representation that will allow heard and (regular) unheard forms to be successfully predicted and parsed
Past Tenses of Verbs and First-Order Learning
- IN
, 1994
Cited by 6 (2 self)
Learning to transform English verbs from present to past tense has been studied extensively in the connectionist literature. A recent paper describes a symbolic approach that outperforms neural networks on this task, but the new system is still constrained by propositional-level attribute-value representations. A first-order learning method that uses a relatively natural representation is found to give slightly better results again. In the course of experiments, several weaknesses in the first-order approach are identified that suggest areas for further research.
Induction in first order logic from noisy training examples and fixed example set size
- In PhD Thesis
, 1999
Cited by 6 (0 self)
This dissertation investigates the field of inductive logic programming (ILP) and in so doing an ILP system, Lime, is designed and developed. Lime addresses the problem of noisy training examples; learning from only positive, only negative, or both positive and negative examples; efficiently biasing and searching the hypothesis space; and handling recursion efficiently and effectively. The Q-heuristic is introduced to address the problem of learning with both noisy training examples and fixed numbers of positive and negative training examples. This heuristic is based on Bayes' rule. Both a justification of its derivation and a description of the context in which it is appropriately applied are given. Because of the general nature of this heuristic its application is not restricted to ILP. Instead of employing a greedy covering approach to constructing clauses, Lime employs the Q-heuristic to evaluate entire logic programs as hypotheses. To tame the inevitable explosion in the search space, the notion of a simple clause is introduced. These sets of literals may be viewed as subparts of clauses that are effectively independent in terms of variables used. Instead of growing a clause one literal at a time, Lime efficiently combines simple clauses to construct a set of gainful candidate clauses. Subsets of these candidate clauses are evaluated using the Q-heuristic to find the final hypothesis. Details of the algorithms and data structures of Lime are discussed. Lime's handling of recursive logic programs is also described. Experimental results are provided to illustrate how Lime achieves its design goals of better noise handling, learning from a fixed set of examples (e.g., from only positive data), and of learning recursive logic programs. These results compare the performance of Lime with other leading ILP systems like Foil and Progol in a variety of domains. Empirical results with a boosted version of Lime are also reported.
STRUCTURES AND DISTRIBUTIONS IN MORPHOLOGY LEARNING
, 2008
Cited by 4 (3 self)
One of the great challenges in linguistics and cognitive science is to understand the nature of the mental representation of language. The precise mechanisms of the mind are unknown, but can be modeled through observation and experimentation. By viewing the mind as a computational device that receives input (primary linguistic data) and produces output (the development of grammatical speech) during language acquisition, one can reason about what representations and algorithms must be internal to the learner. In this thesis, I investigate the acquisition of morphology. The principal challenges are how to learn a theory in the presence of sparse data, and in a manner that can provide explanations for the developmental processes in child language acquisition. The main idea underlying this work is that a consideration of the different aspects of language acquisition places strong constraints on cognitively plausible representations and algorithms that are internal to the learner. To develop a model of morphology acquisition, I pursue three lines of work: First, I formulate a cognitively-oriented computational framework for studying language acquisition that consists of four components: the linguistic representation, the
Learning bias and phonological rule induction
- Computational Linguistics
, 1996
Cited by 3 (0 self)
A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-Based Generalization, Minimum Description Length) to learn linguistic generalizations directly from the data. In this paper we suggest that an alternative to the purely nativist or purely empiricist learning paradigms is to represent the prior knowledge of language as a set of abstract learning biases, which guide an empirical inductive learning algorithm. We test our idea by examining the machine learning of simple Sound Pattern of English (SPE)-style phonological rules. We represent phonological rules as finite state transducers which accept underlying forms as input and generate surface forms as output. We show that OSTIA, a general-purpose transducer induction algorithm, was incapable of learning simple phonological rules like flapping. We then augmented OSTIA with three kinds of learning biases which are specific to natural language phonology, and are assumed explicitly or implicitly by every theory of phonology: Faithfulness (underlying segments tend
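The idea of a phonological rule as a transducer from underlying to surface forms can be sketched with a toy version of flapping. The code below is a hand-written three-state machine, not something learned by OSTIA, and it rests on simplifying assumptions: segments are single letters, a "t" is flapped whenever it stands between two vowels (stress is ignored), and the flap is written "dx". The names `VOWELS` and `flap` are chosen here. Output for a "t" that follows a vowel is delayed until the next segment is seen, which is how a deterministic transducer can handle the right-hand context.

```python
VOWELS = set("aeiou")


def flap(segments):
    """Map underlying segments to surface segments, flapping t between vowels.

    States: 0 = previous segment was not a vowel,
            1 = previous segment was a vowel,
            2 = vowel followed by a 't' whose output is still pending.
    """
    state = 0
    surface = []
    for seg in segments:
        if state == 2:
            # Resolve the pending 't' now that the right context is known.
            surface.append("dx" if seg in VOWELS else "t")
            state = 0
        if seg in VOWELS:
            surface.append(seg)
            state = 1
        elif seg == "t" and state == 1:
            state = 2              # hold the output until the next segment
        else:
            surface.append(seg)
            state = 0
    if state == 2:
        surface.append("t")        # word-final 't' surfaces unchanged
    return surface


if __name__ == "__main__":
    for word in ["data", "cat", "atlas"]:
        print(word, "->", " ".join(flap(word)))
```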
Identification of live news events using Twitter
- In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Location-Based Social Networks (LBSN '11)
, 2011
Cited by 1 (0 self)
Twitter presents a source of information that cannot easily be obtained anywhere else. However, though many posts on Twitter reveal up-to-the-minute information about events in the world or interesting sentiments, far more posts are of no interest to the general audience. A method to determine which Twitter users are posting reliable information and which posts are interesting is presented. Using this information a search through a large, online news corpus is conducted to discover future events before they occur along with information about the location of the event. These events can be identified with a high degree of accuracy by verifying that an event found in one news article is found in other similar news articles, since any event interesting to a general audience will likely have more than one news story written about it. Twitter posts near the time of the event can then be identified as interesting if they match the event in terms of keywords or location. This method enables the discovery of interesting posts about current and future events and helps in the identification of reliable users.
DEVELOPMENT OF GENDER CLASSIFICATIONS: MODELING THE HISTORICAL CHANGE FROM LATIN TO FRENCH
We present and analyze the results of a connectionist simulation which modeled the reanalysis of the Latin gender system in its transition to Old French. The network reanalysis was based solely on formal cues (word endings and analogy with other words) and on frequency. The results are in accordance with the historical data, and certain errors in simulations are also amenable to principled explanations. Simulations improve dramatically when the networks incorporate information about the Celtic substrate which presumably interfered with gender assignment in Gallo-Romance. This finding has a bearing on issues of gender assignment and processing in bilinguals. Simulations also improve with the introduction of more elaborate recurrent networks, which suggests implications for future connectionist modeling. In particular, the results could be applied to the modeling of gender change in other Romance languages and to the modeling of comparative Romance gender systems. The method proposed here would be advantageous for such simulations since it allows the modeler to take into account a rich variety of facts reflecting actual linguistic history.

