Results 1 - 4 of 4
Investigating the cross-linguistic potential of VerbNet-style classification
Cited by 1 (0 self)
Verb classes which integrate a wide range of linguistic properties (Levin, 1993) have proved useful for natural language processing (NLP) applications. However, the real-world use of these classes has been limited because for most languages, no resources similar to VerbNet (Kipper-Schuler, 2005) are available. We apply a verb clustering approach developed for English to French – a language for which no such experiment has been conducted yet. Our investigation shows that not only the general methodology but also the best-performing features are transferable between the languages, making it possible to learn useful VerbNet-style classes for French automatically without language-specific tuning.
Hierarchical Verb Clustering Using Graph Factorization
Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. Natural language processing (NLP) applications also benefit from taxonomic classifications because they vary in terms of the granularity they require from a classification. We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it also yields state-of-the-art performance on a flat test set. We demonstrate how the method can be used to acquire novel classifications as well as to extend existing ones on the basis of some prior knowledge about the classification.
Towards Unrestricted, Large-Scale Acquisition of Feature-Based Conceptual Representations from Corpus Data
2010
In recent years a number of methods have been proposed for the automatic acquisition of feature-based conceptual representations from text corpora. Such methods could offer valuable support for theoretical research on conceptual representation. However, existing methods do not target the full range of concept-relation-feature triples occurring in human-generated norms (e.g. flute produce sound) but rather focus on concept-feature pairs (e.g. flute – sound) or triples involving specific relations only (e.g. is-a or part-of relations). In this article we investigate the challenges that need to be met in both methodology and evaluation when moving towards the acquisition of more comprehensive conceptual representations from corpora. In particular, we investigate the usefulness of three types of knowledge in guiding the extraction process: encyclopedic, syntactic and semantic. We first present a semantic analysis of existing, human-generated feature production norms, which reveals information about co-occurring concept and feature classes. We then introduce a novel method for large-scale feature extraction which uses the class-based information to guide the acquisition process. The method involves extracting candidate triples consisting of concepts, relations and features (e.g. deer have antlers, flute produce sound) from corpus data parsed for grammatical dependencies, and re-weighting the triples on the

