Lexical Acquisition at the Syntax-Semantics Interface: Diathesis Alternations, Subcategorization Frames and Selectional Preferences. (2001)
Citations: 17 (1 self)
BibTeX
@MISC{McCarthy01lexicalacquisition,
author = {Diana McCarthy},
title = {Lexical Acquisition at the Syntax-Semantics Interface: Diathesis Alternations, Subcategorization Frames and Selectional Preferences},
year = {2001}
}
Abstract
[Figure 2.4: LDOCE semantic space. Node labels: concrete, inanimate, animate, liquid, gas, plant, animal, human, solid, moveable, not-moveable]

... space by keeping to a simple hierarchy. However, it seems likely that many specific predicates will not be adequately catered for. For example, given the 16 core categories depicted in figure 2.4, the direct object slot of sail would have to be accounted for by the moveable class, when a more specific classification would be useful to distinguish, for example, cars, stones and ships.

There are now WordNet versions for some European languages other than English (Vossen, 1999). For other languages, producing a new man-made hierarchy is not an easy alternative: the coverage needed for even a restricted domain requires considerable human effort.

The noun hyponym hierarchy of WordNet is used as the representation medium for the preferences within this thesis. This makes our preferences prone to the human error that is inherent in the hierarchy and characteristic of any man-made resource. However, this is to some extent outweighed by the rigorous human effort that has gone into creating this useful taxonomy. WordNet (version 1.5) has in excess of 60,000 classes in the hyponym hierarchy, with over 88,000 word forms. Building a hierarchy of reasonable size with current automatic classification methods would require considerable processing time in the first place, and then considerable post-editing effort to avoid incongruous classes (Resnik, 1993a).

The preferences we obtain are limited to the distinctions made within WordNet. Using corpus data does, to some extent, allow us to obtain preferences for the sublanguage of the corpus, since areas of WordNet that are not relevant to the domain have negligible frequency counts.

2.3 The WordNet Approaches

There is a...







