Making Sense About Sense (2006)
| Venue: | WORD SENSE DISAMBIGUATION: ALGORITHMS AND APPLICATIONS |
| Citations: | 22 - 3 self |
BibTeX
@MISC{Ide06makingsense,
author = {Nancy Ide and Yorick Wilks},
editor = {Agirre, E. and Edmonds, P.},
booktitle = {Word Sense Disambiguation: Algorithms and Applications},
title = {Making Sense About Sense},
year = {2006}
}
Abstract
We first reconsider the role of lexicographers in word-sense disambiguation as a computational task, as providers of both legacy material (dictionaries) and special test material for competitions like SENSEVAL. We suggest that the standard fine-grained division of senses and (larger) homographs by a lexicographer for use by a human reader may not be an appropriate goal for the computational WSD task. We argue that the level of sense-discrimination that NLP needs corresponds roughly to homographs, though we discuss psycholinguistic evidence that there are broad sense divisions with some etymological derivation (i.e. non-homographic) that are as distinct for humans as homographic ones, and they may be part of the broad class of sense divisions we seek to identify here. Fifteen years or more of WSD research has shown that it is this kind of discrimination that existing WSD programs are able to capture at the ~95% success level, whereas the full lexicographically derived division of senses seems to remain too hard for both programs and human discriminators. We link this discussion to the observation that major NLP tasks like MT and IR seem not to need independent WSD modules of the sort produced in the research field, even though they are undoubtedly doing WSD by other means. Our conclusion is that WSD should continue to focus on these broad discriminations, at which it can do very well, thereby possibly offering the close-to-100% success that IR needs (especially search-engine, rather than classic long-query, IR), and assume that this is what most NLP requires, with the possible exception of very fine questions of target word choice in MT. This proposal can be seen as reorienting WSD to what it can actually perform at the standard success levels, but we argue that this, rather...







