Integrating Multiple Knowledge Sources to Disambiguate Word Sense: An Exemplar-Based Approach
- In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, 1996
Cited by 204 (7 self)
In this paper, we present a new approach for word sense disambiguation (WSD) using an exemplar-based learning algorithm. This approach
Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars
- Computational Linguistics, 1993
Word-Sense Disambiguation Using Decomposable Models
- In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994
Cited by 124 (17 self)
Most probabilistic classifiers used for word-sense disambiguation have either been based on only one contextual feature or have used a model that is simply assumed to characterize the interdependencies among multiple contextual features. In this paper, a different approach to formulating a probabilistic model is presented along with a case study of the performance of models produced in this manner for the disambiguation of the noun interest. We describe a method for formulating probabilistic models that use multiple contextual features for word-sense disambiguation, without requiring untested assumptions regarding the form of the model. Using this approach, the joint distribution of all variables is described by only the most systematic variable interactions, thereby limiting the number of parameters to be estimated, supporting computational efficiency, and providing an understanding of the data.
Large-scale dictionary construction for foreign language tutoring and interlingual machine translation
- Machine Translation, 1997
Cited by 71 (9 self)
This paper describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language-independent representation called lexical conceptual structure (LCS). A primary goal of the LCS research is to demonstrate that synonymous verb senses share distributional patterns. In this paper, we show how the syntax-semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. We start by describing the structure of the LCS and showing how this representation is used in FLT and MT. We then focus on the problem of building LCS dictionaries for large-scale FLT and MT. First, we describe authoring tools for manual and semi-automatic construction of LCS dictionaries; we then present a more sophisticated approach that uses linguistic techniques for building word definitions automatically. These techniques have been implemented as part of a set of lexicon-development tools used in the MILT FLT project (Dorr et al., 1995; Sams, 1995; Weinberg et al., 1995) and in the PRINCITRAN MT project (Dorr et al., 1995b).
Subcategorization Acquisition
- 2002
Cited by 64 (13 self)
Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likelihood of different subcategorisation frames (scfs) of a given predicate. Acquisition of subcategorization lexicons from textual corpora has recently become increasingly popular. Although this work has met with some success, resulting lexicons indicate a need for greater accuracy. One significant source of error lies in the statistical filtering used for hypothesis selection, i.e. for removing noise from automatically acquired scfs. This thesis builds on earlier work in verbal subcategorization acquisition, taking as a starting point the problem with statistical filtering. Our investigation shows that statistical filters tend to work poorly because not only is the underlying distribution Zipfian, but there is also very little correlation between conditional distribution of
Distinguishing Word Senses in Untagged Text
- In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing
Cited by 59 (15 self)
This paper describes an experimental comparison of three unsupervised learning algorithms that distinguish the sense of an ambiguous word in untagged text.
The Interaction of Knowledge Sources for Word Sense Disambiguation
- Computational Linguistics, 2001
Cited by 58 (2 self)
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.
Practical Unification-based Parsing of Natural Language
- 1993
Cited by 46 (7 self)
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
Multilingual Lexical Representation
- 1993
Cited by 42 (11 self)
The approach to multilingual lexical representation developed as part of the ACQUILEX Lexical Knowledge Base (LKB) is discussed with specific reference to complex translation equivalence. The treatment described provides a lexicalist account of translation mismatches in terms of translation links which capture cross-linguistic generalizations across sets of semantically related lexical items, and can be readily integrated with several transfer-based MT systems. 1 Introduction The ACQUILEX LKB system was designed to allow the representation of syntactic and semantic information which has been (semi-)automatically extracted from machine readable dictionaries (MRDs). Large scale monolingual lexicon fragments have been constructed semi-automatically for four languages (English, Spanish, Dutch and Italian); descriptions of the monolingual lexicons and the lexical representation language (LRL) are given in, for example, Copestake (1992), Sanfilippo and Poznanski (1992) and papers in Briscoe ...
The ACQUILEX LKB: representation issues in semi-automatic acquisition of large lexicons
- Proceedings of the 3rd Conference on Applied Natural Language Processing (ANLP-92), 1992
Cited by 35 (12 self)
We describe the lexical knowledge base system (LKB) which has been designed and implemented as part of the ACQUILEX project to allow the representation of multilingual syntactic and semantic information extracted from machine readable dictionaries (MRDs), in such a way that it is usable by natural language processing (NLP) systems. The LKB's lexical representation language (LRL) augments typed graph-based unification with default inheritance, formalised in terms of default unification of feature structures. We evaluate how well the LRL meets the practical requirements arising from the semi-automatic construction of a large scale, multilingual lexicon. The system as described is fully implemented and is being used to represent substantial amounts of information automatically extracted from MRDs.

