Learning to Resolve Natural Language Ambiguities: A Unified Approach
, 1998
Cited by 154 (75 self)
We analyze a few of the commonly used statistics-based and machine learning algorithms for natural language disambiguation tasks and observe that they can be recast as learning linear separators in the feature space. Each of the methods makes a priori assumptions, which it employs, given the data, when searching for its hypothesis. Nevertheless, as we show, it searches a space that is as rich as the space of all linear separators. We use this to build an argument for a data-driven approach which merely searches for a good linear separator in the feature space, without further assumptions on the domain or a specific problem. We present such an approach - a sparse network of linear separators, utilizing the Winnow learning algorithm - and show how to use it in a variety of ambiguity resolution problems. The learning approach presented is attribute-efficient and, therefore, appropriate for domains having a very large number of attributes. In particular, we present an extensive experimental ...
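The abstract above names the Winnow learning algorithm. As a rough illustration only, here is a minimal sketch of the textbook Winnow update for a single linear separator over sparse Boolean features. It is not the sparse network architecture of the paper; the function names, the promotion factor alpha = 2, and the threshold equal to the number of features are assumptions chosen for the example.

```python
def train_winnow(examples, n_features, alpha=2.0, epochs=5):
    """Train one Winnow separator.

    examples: list of (active_feature_indices, label), label in {0, 1}.
    alpha and the threshold choice are illustrative assumptions.
    """
    weights = [1.0] * n_features      # all weights start at 1
    theta = float(n_features)         # common textbook threshold
    for _ in range(epochs):
        for active, label in examples:
            score = sum(weights[i] for i in active)
            predicted = 1 if score >= theta else 0
            if predicted == label:
                continue              # mistake-driven: update only on errors
            # Promote active weights on a missed positive, demote on a false positive.
            factor = alpha if label == 1 else 1.0 / alpha
            for i in active:
                weights[i] *= factor
    return weights, theta


def predict(weights, theta, active):
    """Return 1 if the summed weight of the active features reaches the threshold."""
    return 1 if sum(weights[i] for i in active) >= theta else 0


# Toy data: the target concept is "feature 0 or feature 1 is active".
toy = [([0], 1), ([1], 1), ([2, 3], 0), ([4, 5], 0), ([0, 2], 1), ([1, 4], 1)]
weights, theta = train_winnow(toy, n_features=6)
```

Because updates are multiplicative and touch only the active features, the cost of each step depends on the number of active features rather than on the total number of attributes, which is the property the abstract calls attribute-efficient.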
Improving Accuracy in Wordclass Tagging through Combination of Machine Learning Systems
- Computational Linguistics
, 2000
Cited by 38 (3 self)
In this paper, we combine different systems employing known representations. The observation that suggests this approach is that systems that are designed differently, either because they use a different formalism or because they contain different knowledge, will typically produce different errors. We hope to make use of this fact and reduce the number of errors with very little additional effort by exploiting the disagreement between different language models. Although the approach is applicable to any type of language model, we focus on the case of statistical disambiguators that are trained on annotated corpora. The examples of the task that are present in the corpus and its annotation are fed into a learning algorithm, which induces a model of the desired input-output mapping in the form of a classifier. We use a number of different learning algorithms simultaneously on the same training corpus. Each type of learning method brings its own 'inductive bias' to the task and will produce a classifier with slightly different characteristics, so that different methods will tend to produce different errors.
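The abstract above does not say how the component taggers are combined. One simple way to exploit their disagreement is per-token majority voting, sketched below as an assumption rather than as the paper's method. The function name, the tie-breaking rule (prefer the earliest-listed system among the tied tags), and the toy tag sequences are all hypothetical.

```python
from collections import Counter


def majority_vote(tag_sequences):
    """Combine the outputs of several taggers for one sentence.

    tag_sequences: one tag list per system, all of the same length.
    Ties go to the tag proposed by the earliest-listed system
    (an illustrative choice, not taken from the paper).
    """
    combined = []
    for tags_at_position in zip(*tag_sequences):
        counts = Counter(tags_at_position)
        best = max(counts.values())
        # First tag, in system order, that reaches the top count.
        winner = next(t for t in tags_at_position if counts[t] == best)
        combined.append(winner)
    return combined


# Three hypothetical taggers that err on different tokens.
system_a = ["DT", "NN", "VB"]
system_b = ["DT", "VB", "VB"]
system_c = ["DT", "NN", "NN"]
voted = majority_vote([system_a, system_b, system_c])
```

Each system makes at most one error here, on a different token, so the vote recovers the tag that two of the three agree on at every position.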
Comparing a Linguistic and a Stochastic Tagger
- Proceedings of the Thirty-Fifth Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
, 1997
Cited by 27 (1 self)
Concerning different approaches to automatic PoS tagging: EngCG-2, a constraint-based morphological tagger, is compared in a double-blind test with a state-of-the-art statistical tagger on a common disambiguation task using a common tag set. The experiments show that for the same amount of remaining ambiguity, the error rate of the statistical tagger is one order of magnitude greater than that of the rule-based one. The two related issues of priming effects compromising the results and disagreement between human annotators are also addressed.
Memory-Based Learning: Using Similarity for Smoothing
, 1997
Cited by 23 (7 self)
This paper analyses the relation between the use of similarity in Memory-Based Learning and the notion of backed-off smoothing in statistical language modeling. We show that the two approaches are closely related, and we argue that feature weighting methods in the Memory-Based paradigm can offer the advantage of automatically specifying a suitable domain-specific hierarchy between most specific and most general conditioning information without the need for a large number of parameters. We report two applications of this approach: PP-attachment and POS-tagging. Our method achieves state-of-the-art performance in both domains, and allows the easy integration of diverse information sources, such as rich lexical representations.
Efficient Stochastic Part-of-Speech Tagging for Hungarian
- In Proc. of the Third LREC, pages 710–717, Las Palmas, Spain
, 2002
Cited by 4 (0 self)
Many of the methods developed for Western European languages and widely used to produce annotated language resources cannot readily be applied to Central and Eastern European languages, due to the large number of novel phenomena exhibited in the syntax and morphology of these languages, which these methods have to handle but have not been designed to cope with. The process of morphological tagging, when applied to Hungarian data to produce corpora annotated at least at the morphosyntactic level, is most indicative of this problem: several of the algorithms (either rule-based or statistical) that have been used very successfully in other domains cannot readily be applied to a language exhibiting such a varied morphology and huge number of wordforms as Hungarian. The paper will describe a robust tagging scenario for Hungarian using a relatively simple stochastic system augmented with external morphological processing, which can overcome the two most conspicuous problems: the complexity of morphosyntactic descriptions and, most importantly, the huge number of possible wordforms.
A corpus of Dutch aphasic speech: Sketching the design and performing a pilot study
Cited by 1 (1 self)
In this thesis, a pilot study for the development of a corpus of Dutch aphasic speech (CoDAS) is presented. Given the lack of resources of this kind not only for Dutch but also for other languages, CoDAS will be able to set standards and will contribute to future research in this area. A corpus of Dutch aphasic speech should fulfill at least three requirements. First, it should encode a plausible sample of contemporary Dutch as spoken by aphasic patients. That is, it should include speech representing different types of aphasia as well as various communication settings. Secondly, the speech fragments should be documented with the relevant metadata, which should include information about the speaker and aphasia. Thirdly, the corpus should be enriched with various kinds of linguistic information. Given the special character of the speech contained in CoDAS, we cannot simply carry over the design and the annotation protocols of existing corpora, such as SDC or CHILDES. However, they have been taken as a starting point. In our pilot study, we have established the basic requirements with respect to text types, metadata, and annotation levels that CoDAS should fulfill. In this respect, we have investigated whether and how the procedures and protocols for
Synther -- A New M-Gram Pos Tagger
- In Proc. NLP-KE, 628–633, Beijing
, 2003
In this paper, the Part-Of-Speech (POS) tagger synther based on m-gram statistics is described. After explaining its basic architecture, three smoothing approaches and the strategy for handling unknown words are presented. Subsequently, synther's performance is evaluated in comparison with four state-of-the-art POS taggers. All of them are trained and tested on three corpora of different languages and domains. In the course of this evaluation, synther achieved the lowest or at least below-average error rates. Finally, it is shown that the linear interpolation smoothing strategy with coverage-dependent weights has better properties than the two other approaches.

