Results 1 - 10 of 24
Automatic Verb Classification Based on Statistical Distributions of Argument Structure
Computational Linguistics, 2001
Cited by 79 (15 self)
In this paper, we focus on argument structure--the thematic roles assigned by a verb to its arguments--as the way in which the relational semantics of the verb is represented at the syntactic level.
BalkaNet: Aims, Methods, Results and Perspectives. A General Overview
In: D. Tufiş (ed.): Special Issue on BalkaNet. Romanian Journal on Science and Technology of Information
Cited by 32 (14 self)
BalkaNet is an EC-funded project (IST-2000-29388) that started in September 2001 and will end in August 2004. It aims at developing [109] aligned wordnets for the following Balkan languages: Bulgarian, Greek, Romanian, Serbian and Turkish, and at extending the Czech wordnet previously developed in the EuroWordNet project. The BalkaNet project has so far delivered many useful results in both Computational Lexicography and Natural Language Processing. However, most of these results have been only partially disseminated in different conferences and journals. This is the first attempt to provide an overall description of the findings, methodologies and results of the project, as well as a detailed account of each monolingual wordnet. The paper also presents the freeware multilingual tools designed for the development, maintenance and efficient exploitation of the aligned BalkaNet wordnets. A preliminary approach to applying BalkaNet to the indexing of Web documents and to Information Retrieval is described, following the consideration that semantic networks are valuable in the context of real-world systems and user communities. Last but not least, a rather thorough analysis of wordnet applications over recent years is intended to highlight the hottest themes for further developments based on wordnets. The ultimate objective of this contribution is to spread the knowledge and experience that we have acquired, to the benefit of the research and industrial communities. We also hope that our shared experience will be helpful for other wordnet builders.
Clustering Polysemic Subcategorization Frame Distributions Semantically
In Proc. of the 41st Annual Meeting of the Association for Computational Linguistics, 2003
Cited by 23 (7 self)
Previous research has demonstrated the utility of clustering in inducing semantic verb classes from undisambiguated corpus data. We describe a new approach which involves clustering subcategorization frame (SCF) distributions using the Information Bottleneck and nearest neighbour methods. In contrast to previous work, we particularly focus on clustering polysemic verbs. A novel evaluation scheme is proposed which accounts for the effect of polysemy on the clusters, offering us a good insight into the potential and limitations of semantically classifying undisambiguated SCF data.
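As a rough illustration of what clustering SCF distributions can look like, the sketch below links each verb to its nearest neighbour and treats the resulting connected groups as clusters. It is a minimal sketch under stated assumptions, not the method of the paper: the Jensen-Shannon divergence as the distance, the frame labels, the probabilities in `scf` and both function names are all invented for illustration, and the Information Bottleneck method is not shown.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions stored as dicts."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of frames.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def nearest_neighbour_clusters(dists):
    """Link every verb to its nearest neighbour; connected groups form clusters."""
    verbs = sorted(dists)
    parent = {v: v for v in verbs}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v in verbs:
        others = (u for u in verbs if u != v)
        nn = min(others, key=lambda u: js_divergence(dists[v], dists[u]))
        parent[find(v)] = find(nn)

    clusters = {}
    for v in verbs:
        clusters.setdefault(find(v), []).append(v)
    return sorted(clusters.values())


# Hypothetical SCF distributions (toy numbers, not corpus data).
scf = {
    "give": {"NP-NP": 0.6, "NP-PP": 0.3, "NP": 0.1},
    "send": {"NP-NP": 0.5, "NP-PP": 0.4, "NP": 0.1},
    "run": {"INTRANS": 0.7, "PP": 0.2, "NP": 0.1},
    "walk": {"INTRANS": 0.8, "PP": 0.15, "NP": 0.05},
}
clusters = nearest_neighbour_clusters(scf)
```

On this toy input the two ditransitive-leaning verbs end up in one group and the two intransitive-leaning verbs in another. A polysemous verb has a single mixed distribution here, so it lands in one group only, which is the limitation an evaluation that accounts for polysemy has to deal with.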
Deriving Concept Hierarchies from Text by Smooth Formal Concept Analysis
2003
Cited by 10 (4 self)
We present a novel approach to the automatic acquisition of taxonomies or concept hierarchies from texts based on Formal Concept Analysis.
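To make the idea of Formal Concept Analysis concrete, the following sketch enumerates the formal concepts (extent, intent pairs) of a tiny object-attribute context by closing every attribute subset. It is plain, unsmoothed FCA with a brute-force enumeration, so it does not reproduce the smoothing the title refers to; the nouns, the attributes and the function names are assumptions made up for illustration.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects that carry every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing each attribute subset."""
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), r):
            objs = extent(context, frozenset(subset))
            concepts.add((objs, intent(context, objs, all_attrs)))
    return concepts


# Hypothetical context: nouns described by the verbs they occur as objects of.
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
}
concepts = formal_concepts(context)
```

Ordering the concepts by extent inclusion yields the concept lattice, which can be read as a hierarchy: here the concept with intent {bookable} sits above the concept with intent {bookable, rentable}, and so on. The enumeration is exponential in the number of attributes, so it only suits toy contexts.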
Crosslinguistic Transfer in Automatic Verb Classification
In Proceedings of COLING 2002, 2002
Cited by 6 (3 self)
We investigate the use of multilingual data in the automatic classification of English verbs, and show that there is a useful transfer of information across languages. Specifically, we experiment with three lexical semantic classes of English verbs. We collect statistical features over a sample of English verbs from each of the classes, as well as over Chinese translations of those verbs. We use the English and Chinese data, alone and in combination, as training data for a machine learning algorithm whose output is an automatic verb classifier. We demonstrate that Chinese data is indeed useful in helping to classify the English verbs (at 82% accuracy), and furthermore that a multilingual combination of data outperforms the English data alone (85% accuracy). Moreover, our results using monolingual corpora show that it is not necessary to use a parallel corpus to extract the translations in order for this technique to be successful.
Token-level Disambiguation of VerbNet Classes. (Erk et al
2005
Cited by 6 (0 self)
The automatic disambiguation of verbs in domain-independent text is becoming increasingly important for applications such as Machine Translation, Text Summarization, and Question Answering, mainly because verbs play a key role in the syntactic and semantic interpretation of sentences. In this paper we present a system for the automatic classification of token verbs in context based on VerbNet classes. A supervised machine learning classifier is trained and tested on a portion of PropBank using a set of lexical and syntactic features.
Conceptual Knowledge Processing with Formal Concept Analysis and Ontologies
2004
Cited by 6 (1 self)
Among many other knowledge representation formalisms, Ontologies and Formal Concept Analysis (FCA) aim at modeling ‘concepts’. We discuss how these two formalisms may complement one another from an application point of view. In particular, we will see how FCA can be used to support Ontology Engineering, and how ontologies can be exploited in FCA applications. The interplay of FCA and ontologies is studied along the life cycle of an ontology: (i) FCA can support the building of the ontology as a learning technique. (ii) The established ontology can be analyzed and navigated by using techniques of FCA. (iii) Last but not least, the ontology may be used to improve an FCA application.
Data analysis of conceptual similarities of Finnish verbs
2002
Cited by 4 (1 self)
The study of the conceptual representations that underlie the use of language is a problem motivated from both a cognitive research point of view and that of construing language models for various language processing tasks. In this work, we organized 600 Finnish verbs using the SOM algorithm. Three experiments were conducted using different features to encode the verbs: morphosyntactic properties, individual nouns, and noun categories in the context of the verb. In general, the morphosyntactic properties seem to draw attention to semantic roles, whereas nouns as features seem to highlight clusters formed on grounds of topics in the text.
Verb class discovery from rich syntactic data
In 9th International Conference on Intelligent Text Processing and Computational Linguistics, 2008
Cited by 4 (3 self)
Previous research has shown that syntactic features are the most informative features in automatic verb classification. We investigate their optimal characteristics by comparing a range of feature sets extracted from data where the proportion of verbal arguments and adjuncts is controlled. The data are obtained from different versions of VALEX [1] – a large SCF lexicon for English which was acquired automatically from several corpora and the Web. We evaluate the feature sets thoroughly using four supervised classifiers and one unsupervised method. The best performing feature set includes rich syntactic information about both arguments and adjuncts of verbs. When combined with our best performing classifier (a novel Gaussian classifier), it yields the promising accuracy of 64.2% in classifying 204 verbs to 17 Levin (1993) classes. We discuss the impact of our results on the state of the art and propose avenues for future work.
Combining Labeled and Unlabeled Data in Statistical Natural Language Parsing
2002
Cited by 2 (1 self)
Prof. Aravind Joshi, my dissertation advisor, has been my guide and mentor for the entire time that I spent at Penn. I thank him for all his academic help and personal kindness. The external member on my dissertation committee was Steven Abney, whose suggestions and advice have made the ideas presented here stronger. My dissertation committee members from Penn, Mitch Marcus, Mark Liberman and Martha Palmer, provided questions whose answers shaped my dissertation proposal into the finished form in front of you. Many thanks to my academic collaborators: the work on prefix probabilities was done with Mark-Jan Nederhof and Giorgio Satta when they visited IRCS in 1998, and the work on subcategorization frame learning was done in collaboration with Daniel Zeman when he visited IRCS in 2000. Thanks to B. Srinivas, whose previous work provided the path to the experimental work in this dissertation. Thanks also to Paola Merlo and Suzanne Stevenson for discussions on their work on verb alternation classes. I also acknowledge the help of Woottiporn Tripasai in the extension of their work presented in this dissertation. Thanks to

