Results 1-10 of 15
Using Corpus Statistics and WordNet Relations for Sense Identification
, 1998
Abstract

Cited by 199 (0 self)
Introduction An impressive array of statistical methods has been developed for word sense identification. They range from dictionary-based approaches that rely on definitions (Véronis and Ide 1990; Wilks et al. 1993) to corpus-based approaches that use only word co-occurrence frequencies extracted from large textual corpora (Schütze 1995; Dagan and Itai 1994). We have drawn on these two traditions, using corpus-based co-occurrence and the lexical knowledge base that is embodied in the WordNet lexicon. The two traditions complement each other. Corpus-based approaches have the advantage of being generally applicable to new texts, domains, and corpora without needing costly and perhaps error-prone parsing or semantic analysis. They require only training corpora in which the sense distinctions have been marked, but therein lies their weakness. Obtaining training materials for statistical methods is costly and time-consuming; it is a "knowledge acquisition bottleneck" (Gale, Church, and Y ...
The Measure of a Model
, 1996
Abstract

Cited by 30 (15 self)
This paper describes measures for evaluating the three determinants of how well a probabilistic classifier performs on a given test set. These determinants are the appropriateness, for the test set, of the results of (1) feature selection, (2) formulation of the parametric form of the model, and (3) parameter estimation. These are part of any model formulation procedure, even if not broken out as separate steps, so the trade-offs explored in this paper are relevant to a wide variety of methods. The measures are demonstrated in a large experiment, in which they are used to analyze the results of roughly 300 classifiers that perform word-sense disambiguation. Introduction This paper presents techniques that can be used to analyze the formulation of a probabilistic classifier. As part of this presentation, we apply these techniques to the results of a large number of classifiers, developed using the methodology presented in (2), (3), (4), (5), (12) and (16), which tag words according to ...
Sequential Model Selection for Word Sense Disambiguation
, 1997
Abstract

Cited by 29 (14 self)
Statistical models of word-sense disambiguation are often based on a small number of contextual features or on a model that is assumed to characterize the interactions among a set of features. Model selection is presented as an alternative to these approaches, where a sequential search of possible models is conducted in order to find the model that best characterizes the interactions among features. This paper expands existing model selection methodology and presents the first comparative study of model selection search strategies and evaluation criteria when applied to the problem of building probabilistic classifiers for word-sense disambiguation.
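The sequential search this abstract describes can be illustrated with a minimal, hypothetical forward-selection sketch. The feature names and the `evaluate` criterion below are invented for illustration; the paper compares richer search strategies and evaluation criteria than this greedy loop:

```python
def forward_select(features, evaluate):
    """Greedy forward sequential selection.

    Starting from an empty feature set, repeatedly add the feature whose
    inclusion most improves evaluate(subset) (higher is better), and stop
    when no remaining feature improves the score.
    """
    selected = []
    best_score = evaluate(selected)
    remaining = list(features)
    while remaining:
        # Score every candidate extension of the current model.
        score, feat = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best_score:
            break  # no candidate improves the criterion; stop searching
        selected.append(feat)
        remaining.remove(feat)
        best_score = score
    return selected, best_score
```

With a toy criterion where features "a" and "b" help and "c" hurts, the search selects `["a", "b"]` and stops.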
An approach to Robust Partial Parsing and Evaluation Metrics
 In Proceedings of the Eighth European Summer School in Logic, Language and Information
, 1996
Abstract

Cited by 19 (1 self)
In this paper, we present a new technique called Lightweight Dependency Analysis which, in conjunction with Supertag disambiguation, provides a method for Robust Partial Parsing, called Almost Parsing. An overview is given of the XTAG system in which this technique is being developed. In addition, we propose alternate metrics for the evaluation of partial parsers that can also serve to evaluate full parsers.
POS Tagging Using Relaxation Labelling
 In Proceedings of the 16th International Conference on Computational Linguistics (COLING)
, 1996
Abstract

Cited by 12 (5 self)
Relaxation labelling is an optimization technique used in many fields to solve constraint satisfaction problems. The algorithm finds a combination of values for a set of variables that satisfies, to the maximum possible degree, a set of given constraints. This paper describes some experiments performed applying it to POS tagging, and the results obtained. It also considers the possibility of applying it to Word Sense Disambiguation.
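As a rough illustration of the iterative update the abstract describes, here is a toy relaxation-labelling sketch. The tag sets, pairwise compatibility scores, and the exact update rule are illustrative assumptions, not the paper's implementation:

```python
def relax(tag_probs, compat, iterations=20):
    """Toy relaxation labelling for sequence tagging.

    tag_probs: list of dicts, one per word, mapping tag -> probability.
    compat:    dict mapping (tag, neighbour_tag) -> compatibility score.

    Each iteration shifts every word's tag distribution toward tags that
    are well supported by the neighbouring words' current distributions,
    then renormalizes.
    """
    n = len(tag_probs)
    for _ in range(iterations):
        updated = []
        for i, dist in enumerate(tag_probs):
            support = {}
            for tag, p in dist.items():
                s = 0.0
                for j in (i - 1, i + 1):  # adjacent words act as context
                    if 0 <= j < n:
                        for t2, p2 in tag_probs[j].items():
                            s += compat.get((tag, t2), 0.0) * p2
                support[tag] = p * (1.0 + s)  # reward supported tags
            z = sum(support.values()) or 1.0
            updated.append({t: v / z for t, v in support.items()})
        tag_probs = updated  # synchronous update across all words
    return tag_probs
```

For example, if a determiner reading of word 1 is highly compatible with a noun reading of word 2, repeated iterations drive word 2's distribution toward the noun tag.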
Probabilistic Classifiers for Tracking Point of View
 In Working
Abstract

Cited by 4 (1 self)
This paper describes work in developing probabilistic classifiers for a discourse segmentation problem that involves segmentation, reference resolution, and belief. Specifically, the problem is to segment a text into blocks such that all subjective sentences in a block are from the point of view of the same agent, and to identify noun phrases that refer to that agent. In our method for developing classifiers (Bruce & Wiebe 1994ab), rather than making assumptions about which variables to use and how they are related, statistical techniques are used to explore these questions empirically. Further, the types of models used in this work can express complex relationships among diverse sets of variables. This work is part of a large project that is in an early stage of development. The contributions of this paper are an illustration of framing a high-level discourse problem in such a way that it is amenable to statistical processing while still retaining its core, and a des...
Towards the Acquisition and Representation of a Broad-Coverage Lexicon
 In Working Notes of the AAAI Spring Symposium on Representation and Acquisition of Lexical Knowledge
, 1995
Abstract

Cited by 3 (1 self)
Statistical techniques for NLP typically do not take advantage of existing domain knowledge and require large amounts of tagged training data. This paper presents a partial remedy to these shortcomings by introducing a richer class of statistical models, graphical models, along with techniques for: 1) establishing the form of the model in this class that best describes a given set of training data, 2) estimating the parameters of graphical models from untagged data, 3) combining constraints formulated in propositional logic with those derived from training data to produce a graphical model, and 4) simultaneously resolving interdependent ambiguities. The paper also describes how these tools can be used to produce a broad-coverage lexicon represented as a probabilistic model, and presents a method for using such a lexicon to simultaneously disambiguate all words in a sentence. Introduction The specification and acquisition of lexical knowledge is a central problem in n...
Summary Statement
Abstract
This paper uses the mental phenomenon approach, and is based on the concept of "SharedPlans" of Grosz, Sidner, Lochbaum, and Kraus (Grosz & Sidner 1990, Lochbaum, Grosz, & Sidner 1990, and Grosz & Kraus 1993).
Naive Bayes as a Satisficing Model
, 1998
Abstract
We report on an empirical study of supervised learning algorithms that induce models to resolve the meaning of ambiguous words in text. We find that the Naive Bayesian classifier is as accurate as several more sophisticated methods. This is a surprising result, since Naive Bayes makes simplifying assumptions about disambiguation that are not realistic. However, our results are consistent with a growing body of evidence that Naive Bayes acts as a satisficing model in a wide range of domains. We suggest that bias-variance decompositions of classification error can be used to identify and develop satisficing models.
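A minimal sketch of the kind of Naive Bayes sense classifier this abstract studies, assuming bag-of-words context features and add-one smoothing. The class name, training data, and feature representation below are illustrative, not the paper's experimental setup:

```python
from collections import Counter, defaultdict
import math

class NaiveBayesWSD:
    """Toy Naive Bayes word-sense classifier.

    Picks the sense s maximizing log P(s) + sum_w log P(w | s),
    treating each context word w as conditionally independent given
    the sense (the "unrealistic" simplifying assumption).
    """

    def __init__(self):
        self.sense_counts = Counter()                 # P(s) counts
        self.word_counts = defaultdict(Counter)       # P(w | s) counts
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (sense_label, list_of_context_words)
        for sense, words in examples:
            self.sense_counts[sense] += 1
            for w in words:
                self.word_counts[sense][w] += 1
                self.vocab.add(w)

    def classify(self, words):
        total = sum(self.sense_counts.values())
        best, best_lp = None, float("-inf")
        for sense, count in self.sense_counts.items():
            lp = math.log(count / total)              # log prior
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for w in words:
                # add-one smoothing keeps unseen words from zeroing a sense
                lp += math.log((self.word_counts[sense][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = sense, lp
        return best
```

Trained on a handful of labelled contexts for an ambiguous word such as "bank", the classifier resolves a new context to whichever sense its words support most strongly.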