Results 1 - 2 of 2
Distinguishing Word Senses in Untagged Text
 In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing
Abstract

Cited by 80 (17 self)
This paper describes an experimental comparison of three unsupervised learning algorithms that distinguish the sense of an ambiguous word in untagged text.
unknown title
Abstract
Statistical methods for automatically identifying dependent word pairs (i.e. dependent bigrams) in a corpus of natural language text have traditionally been performed using asymptotic tests of significance. This paper suggests that Fisher's exact test is a more appropriate test due to the skewed and sparse data samples typical of this problem. Both theoretical and experimental comparisons between Fisher's exact test and a variety of asymptotic tests (the t-test, Pearson's chi-square test, and the likelihood-ratio chi-square test) are presented. These comparisons show that Fisher's exact test is more reliable in identifying dependent word pairs. The usefulness of Fisher's exact test extends to other problems in statistical natural language processing, as skewed and sparse data appear to be the rule in natural language. The experiment presented in this paper was performed using PROC FREQ of the SAS System.
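To illustrate the contrast drawn in the abstract above, here is a minimal sketch of both kinds of test on a 2x2 bigram contingency table. It is not the paper's implementation (the abstract says the experiments used PROC FREQ of the SAS System). The function names and the example counts are hypothetical, chosen only for illustration. The table cells are assumed to be: n11 = count of the bigram (w1, w2), n12 = w1 followed by another word, n21 = another word followed by w2, n22 = neither.

```python
from math import comb


def fisher_exact_right(n11, n12, n21, n22):
    """Right-tailed Fisher's exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of every table with the same
    margins whose top-left cell is at least as large as the observed n11.
    """
    row1 = n11 + n12          # total occurrences of w1 in first position
    col1 = n11 + n21          # total occurrences of w2 in second position
    n = n11 + n12 + n21 + n22 # total number of bigrams
    denom = comb(n, col1)
    tail = 0
    for k in range(n11, min(row1, col1) + 1):
        tail += comb(row1, k) * comb(n - row1, col1 - k)
    return tail / denom


def pearson_chi_square(n11, n12, n21, n22):
    """Pearson's chi-square statistic (asymptotic test) for a 2x2 table.

    Assumes every row and column total is nonzero.
    """
    n = n11 + n12 + n21 + n22
    rows = (n11 + n12, n21 + n22)
    cols = (n11 + n21, n12 + n22)
    observed = ((n11, n12), (n21, n22))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical small table, used only to exercise the functions.
p_value = fisher_exact_right(3, 1, 1, 3)
chi2 = pearson_chi_square(3, 1, 1, 3)
```

The exact test computes its p-value directly from the hypergeometric distribution, so it does not rely on the large-sample approximation that the chi-square statistic needs. That approximation is the part that becomes questionable when cell counts are small, as they typically are for bigram data.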