TiMBL: Tilburg Memory Based Learner - version 2.0 - Reference Guide
, 1999
Cited by 240 (62 self)
This document is available from http://ilk.kub.nl/~ilk/papers/ilk9901.ps.gz. All rights reserved. Induction of Linguistic Knowledge, Tilburg University. Contents: 1 License terms; 2 Installation; 3 Changes; 4 Learning algorithms.
MBT: A Memory-Based Part of Speech Tagger-Generator
- PROC. OF FOURTH WORKSHOP ON VERY LARGE CORPORA
, 1996
Cited by 168 (47 self)
We introduce a memory-based approach to part of speech tagging. Memory-based learning is a form of supervised learning based on similarity-based reasoning. The part of speech tag of a word in a particular context is extrapolated from the most similar cases held in memory. Supervised learning approaches are useful when a tagged corpus is available as an example of the desired output of the tagger. Based on such a corpus, the tagger-generator automatically builds a tagger which is able to tag new text the same way, diminishing development time for the construction of a tagger considerably. Memory-based tagging shares this advantage with other statistical or machine learning approaches. Additional advantages specific to a memory-based approach include (i) the relatively small tagged corpus size sufficient for training, (ii) incremental learning, (iii) explanation capabilities, (iv) flexible integration of information in case representations, (v) its non-parametric nature, (vi) reasonably good results on unknown words without morphological analysis, and (vii) fast learning and tagging. In this paper we show that a large-scale application of the memory-based approach is feasible: we obtain a tagging accuracy that is on a par with that of known statistical approaches, and with attractive space and time complexity properties when using IGTree, a tree-based formalism for indexing and searching huge case bases. The use of IGTree has the additional advantage that the optimal context size for disambiguation is computed dynamically.
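As a minimal sketch of the similarity-based extrapolation this abstract describes, the Python snippet below keeps tagged cases in memory and tags a new case by majority vote among the most similar stored cases, using a plain overlap metric. The feature layout (previous tag, focus word, next word), the toy cases, and the names `overlap_distance` and `classify` are illustrative assumptions. The sketch does not reproduce MBT or IGTree.

```python
from collections import Counter

def overlap_distance(a, b):
    """Count the feature positions where two cases differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify(memory, case, k=1):
    """Return the majority tag among the k stored cases closest to `case`."""
    # sorted() is stable, so ties keep the order in which cases were stored.
    nearest = sorted(memory, key=lambda item: overlap_distance(item[0], case))[:k]
    votes = Counter(tag for _, tag in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical cases: (previous tag, focus word, next word) -> tag
memory = [
    (("DT", "run", "was"), "NN"),
    (("PRP", "run", "home"), "VB"),
    (("DT", "walk", "was"), "NN"),
    (("PRP", "walk", "home"), "VB"),
]

# "jog" never occurs in memory; the context features still match stored cases.
predicted = classify(memory, ("DT", "jog", "was"))
```

In this toy setup the unseen word is tagged from its context alone, which loosely mirrors the claim about unknown words. A real tagger would use richer features and a much larger case base.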
Forgetting Exceptions is Harmful in Language Learning
- MACHINE LEARNING, SPECIAL ISSUE ON NATURAL LANGUAGE LEARNING
, 1999
Cited by 94 (38 self)
We show that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy. We investigate this phenomenon empirically on a selection of benchmark natural language processing tasks: grapheme-to-phoneme conversion, part-of-speech tagging, prepositional-phrase attachment, and base noun phrase chunking. In a first series of experiments we combine memory-based learning with training set editing techniques, in which instances are edited based on their typicality and class prediction strength. Results show that editing exceptional instances (with low typicality or low class prediction strength) tends to harm generalization accuracy. In a second series of experiments we compare memory-based learning and decision-tree learning methods on the same selection of tasks, and find that decision-tree learning often performs worse than memory-based learning. Moreover, the decrease in performance can be linked to the degree of abstraction from exceptions (i.e., pruning or eagerness). We provide explanations for both results in terms of the properties of the natural language processing tasks and the learning algorithms.
The Interaction of Knowledge Sources for Word Sense Disambiguation
- Computational Linguistics
, 2001
Cited by 58 (2 self)
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.
Memory-Based Lexical Acquisition and Processing
- MACHINE TRANSLATION AND THE LEXICON
, 1995
Cited by 47 (23 self)
Current approaches to computational lexicology in language technology are knowledge-based (competence-oriented) and try to abstract away from specific formalisms, domains, and applications. This results in severe complexity, acquisition and reusability bottlenecks. As an alternative, we propose a particular performance-oriented approach to Natural Language Processing based on automatic memory-based learning of linguistic (lexical) tasks. The consequences of the approach for computational lexicology are discussed, and its application to a number of lexical acquisition and disambiguation tasks in phonology, morphology and syntax is described.
K*: An Instance-based Learner Using an Entropic Distance Measure
- In Proceedings of the 12th International Conference on Machine Learning
, 1995
Cited by 36 (0 self)
The use of entropy as a distance measure has several benefits. Amongst other things it provides a consistent approach to handling of symbolic attributes, real valued attributes and missing values. The approach of taking all possible transformation paths is discussed. We describe K*, an instance-based learner which uses such a measure, and results are presented which compare favourably with several machine learning algorithms.
Introduction
The task of classifying objects is one to which researchers in artificial intelligence have devoted much time and effort. The classification problem is hard because often the data available may be noisy or have irrelevant attributes, there may be few examples to learn from or simply because the domain is inherently difficult. Many different approaches have been tried with varying success. Some well known schemes and their representations include: ID3 which uses decision trees (Quinlan 1986), FOIL which uses rules (Quinlan 1990), PROTOS which is a case...
Fast NP Chunking Using Memory-Based Learning Techniques
- In Proceedings of BENELEARN'98
, 1998
Cited by 26 (1 self)
In this paper we discuss the application of Memory-Based Learning (MBL) to fast NP chunking. We first discuss the application of a fast decision tree variant of MBL (IGTree) on the dataset described in (Ramshaw and Marcus, 1995), which consists of roughly 50,000 test and 200,000 train items. In a second series of experiments we used an architecture of two cascaded IGTrees. In the second level of this cascaded classifier we added context predictions as extra features so that incorrect predictions from the first level can be corrected, yielding a 97.2% generalisation accuracy with training and testing times in the order of seconds to minutes.
Submission type: regular paper. Topic areas: robust parsing, NP chunking, memory-based learning. Author of record: Jorn Veenstra.
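The cascade described in this abstract feeds first-level predictions for the surrounding context into the second level as extra features. A rough sketch of that feature construction in Python follows. The function name `add_context_predictions`, the one-token window on each side, the padding symbol, and the toy chunk labels are all assumptions for illustration; the abstract does not give the actual feature layout.

```python
def add_context_predictions(features, first_level_tags, pad="_"):
    """Append the first-level predictions for the neighbouring tokens to each case."""
    extended = []
    for i, feats in enumerate(features):
        # Use a padding symbol where there is no neighbour at the sequence edge.
        left = first_level_tags[i - 1] if i > 0 else pad
        right = first_level_tags[i + 1] if i + 1 < len(first_level_tags) else pad
        extended.append(tuple(feats) + (left, right))
    return extended

# Hypothetical first-level input: (word, POS tag) per token
features = [("the", "DT"), ("old", "JJ"), ("man", "NN"), ("sleeps", "VBZ")]
# Hypothetical first-level output; the labels are placeholders
first_level_tags = ["I", "I", "O", "O"]

second_level_cases = add_context_predictions(features, first_level_tags)
```

A second classifier trained on cases like `second_level_cases` can see what the first level predicted around each token, which is what gives it a chance to correct isolated first-level errors.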
Similarity and rules: Distinct? Exhaustive? Empirically distinguishable?
- Cognition
, 1998
Cited by 26 (4 self)
The distinction between rule-based and similarity-based processes in cognition is of fundamental importance for cognitive science, and has been the focus of a large body of empirical research. However, intuitive uses of the distinction are subject to theoretical difficulties and their relation to empirical evidence is not clear. We propose a ‘core’ distinction between rule- and similarity-based processes, in terms of the way representations of stored information are ‘matched’ with the representation of a novel item. This explication captures the intuitively clear-cut cases of processes of each type, and resolves apparent problems with the rule/similarity distinction. Moreover, it provides a clear target for assessing the psychological and AI literatures. We show that many lines of psychological evidence are less conclusive than sometimes assumed, but suggest that converging lines of evidence may be persuasive. We then argue that the AI literature suggests that approaches which combine rules and similarity are an important new focus for empirical work. © 1998 Elsevier Science B.V. Keywords: Similarity-based process; Rule-based process
Memory-Based Learning: Using Similarity for Smoothing
, 1997
Cited by 23 (7 self)
This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations.
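To make the idea of feature weighting concrete, the sketch below assumes information gain as the weighting scheme (the abstract does not name one) and uses the weights in an overlap-style distance. The function names and the tiny PP-attachment-like data set are hypothetical. With such weights, a mismatch on an uninformative feature costs little, so the nearest stored cases are those that agree on the most informative features first, which is loosely analogous to backing off from specific to general contexts.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain_weights(cases):
    """Weight each feature by how much knowing its value reduces class entropy."""
    labels = [label for _, label in cases]
    base = entropy(labels)
    weights = []
    for i in range(len(cases[0][0])):
        # Group the class labels by the value of feature i.
        by_value = defaultdict(list)
        for features, label in cases:
            by_value[features[i]].append(label)
        remainder = sum(len(ls) / len(cases) * entropy(ls) for ls in by_value.values())
        weights.append(base - remainder)
    return weights

def weighted_distance(a, b, weights):
    """Sum the weights of the features on which two cases disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)

# Hypothetical cases: (verb, preposition) -> attachment site
cases = [
    (("eat", "with"), "V"),
    (("eat", "of"), "N"),
    (("see", "with"), "V"),
    (("see", "of"), "N"),
]
weights = information_gain_weights(cases)
```

In this toy data the preposition fully determines the class and the verb carries no information, so only a preposition mismatch adds to the distance.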

