WordNet: An on-line lexical database
- International Journal of Lexicography, 1990
Cited by 1302 (7 self)
WordNet is an on-line lexical reference system whose design is inspired by current
Class-Based Construction of a Verb Lexicon
- 2000
Cited by 115 (8 self)
We present an approach to building a verb lexicon compatible with WordNet but with explicitly stated syntactic and semantic information, using Levin verb classes to systematically construct lexical entries. By using verb classes we capture generalizations about verb behavior and reduce the effort needed to construct the lexicon. The syntactic frames for the verb classes are represented by a Lexicalized Tree Adjoining Grammar augmented with semantic predicates, which allows a compositional interpretation.
Introduction
Despite many different approaches to lexicon development (Pustejovsky 1991), (Copestake & Sanfilippo 1993), (Lowe, Baker, & Fillmore 1997), (Dorr 1997), the field of Natural Language Processing (NLP) has yet to develop a clear consensus on guidelines for computational verb lexicons, which has severely limited their utility in NLP applications. Many approaches make no attempt to associate the semantics of a verb with its possible syntactic frames. Others list too...
Acquisition of Semantic Lexicons: Using Word Sense Disambiguation to Improve Precision
- 2000
Cited by 21 (7 self)
lexicons from machine-readable resources. We describe semantic filters designed to reduce the number of incorrect assignments (i.e., improve precision) made by a purely syntactic technique. We demonstrate that it is possible to use these filters to build broad-coverage lexicons with minimal effort, at a depth of knowledge that lies at the syntax-semantics interface. We report on our results of disambiguating the verbs in the semantic filters by adding WordNet sense annotations. We then show the results of our classification on unknown words and we evaluate these results.
A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering
- Proceedings of The Third IEEE International Conference on Data Mining (ICDM’03), 2003
Cited by 12 (4 self)
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption about the data distribution. In this paper, we propose for this task a new competitive Self-Organising Map (SOM) model, namely the Dynamic Adaptive Self-Organising Hybrid model (DASH). The features of DASH are a dynamic structure, hierarchical clustering, non-stationary data learning and parameter self-adjustment. All features are data-oriented: DASH adjusts its behaviour not only by modifying its parameters but also through an adaptive structure. The hierarchical growing architecture is a useful facility for such a competitive neural model designed for text clustering. In this paper, we have presented a new type of self-organising dynamic growing neural network which can deal with non-uniform data distributions and non-stationary data sets and represent the inner data structure in a hierarchical view.
Selforganizing classification on the Reuters news corpus
- In Proceedings of the 19th International Conference on Computational Linguistics, volume 1. Association of Computing Machinery, 2002
Cited by 9 (3 self)
In this paper we propose an integration of a selforganizing map and semantic networks from WordNet for a text classification task using the new Reuters news corpus. This neural model is based on significance vectors and benefits from the presentation of document clusters. The Hypernym relation in WordNet supplements the neural model in classification. We also analyse the relationships between news headlines and their contents in the new Reuters corpus through a series of experiments. This hybrid approach of neural selforganization and symbolic hypernym relationships succeeds in achieving good classification rates on 100,000 full-text news articles. These results demonstrate that this approach can scale up to a large real-world task and show considerable potential for text classification.
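As a rough illustration of how a hypernym relation can generalise the words of a headline before classification, the sketch below maps each known word to a broader concept and counts the results. It is not the method of the paper above: the `TOY_HYPERNYMS` table, the function name `hypernym_counts`, and the sample headline are all hypothetical, and a real system would look hypernyms up in WordNet instead of a hand-written dictionary.

```python
from collections import Counter

# Hypothetical, hand-written hypernym table standing in for WordNet lookups.
TOY_HYPERNYMS = {
    "dollar": "currency",
    "yen": "currency",
    "euro": "currency",
    "gold": "metal",
    "copper": "metal",
}


def hypernym_counts(tokens, hypernyms=TOY_HYPERNYMS):
    """Count concepts after replacing each known word with its hypernym.

    Words missing from the table are kept (lower-cased) as they are.
    """
    return Counter(hypernyms.get(tok.lower(), tok.lower()) for tok in tokens)


# Invented example headline, not taken from the Reuters corpus.
headline = "Dollar falls against yen as gold and copper rally".split()
counts = hypernym_counts(headline)
print(counts)
```

Merging "dollar" and "yen" into one concept means two headlines with no words in common can still share features, which is the kind of generalisation a hypernym relation offers to a classifier.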
Hybrid Neural Document Clustering Using Guided Self-organisation and WordNet
- IEEE Intelligent Systems, 2004
Cited by 8 (0 self)
Copyright © 2004 IEEE. Reprinted from the March/April 2004 issue of IEEE Intelligent Systems.
Senses and Texts
- special issue of Computers and the Humanities, 1997
Cited by 4 (3 self)
This paper addresses the question of whether it is possible to sense-tag systematically, and on a large scale, and how we should assess progress so far. That is to say, how to attach each occurrence of a word in a text to one and only one sense in a dictionary (a particular dictionary of course, and that is part of the problem). The paper does not propose a solution to the question, though we have reported empirical findings elsewhere (Cowie et al. 1992 and Wilks et al. 1996), and intend to continue and refine that work. The point of this paper is to examine two well-known contributions critically: one (Kilgarriff 1993) which is widely taken as showing that the task, as defined, cannot be carried out systematically by humans, and a second (Yarowsky 1995) which claims strikingly good results at doing exactly that.
Introduction
Empirical, corpus-based, computational linguistics has by now reached into almost every crevice of the subject, and perhaps pragmatics will soon succumb. Semantics...
Coder Lexicon: The Collins English Dictionary and its Adverb Definitions
- 1986
Cited by 3 (0 self)
The CODER (COmposite Document Expert/extended/effective Retrieval) project is an investigation of the applicability of artificial intelligence techniques to the information retrieval task of analyzing, storing, and retrieving heterogeneous collections of “composite documents.” In order to support some of the processing desired, and to allow experimentation in information retrieval and natural language processing, a lexicon was constructed from the machine-readable Collins Dictionary of the English Language. After giving background, motivation, and a survey of related work, the Collins lexicon is discussed. This is followed by a description of the conversion process, the format of the resulting Prolog database, and characteristics of the dictionary and relations. To illustrate what is present and to explain how it relates to the files produced from Webster's Seventh New Collegiate Dictionary, a number of comparative charts are given. Finally, a grammar for adverb definitions is presented, together with a description of the defining formulae that usually indicate the type of the adverb. Ultimately it is hoped that definitions for adverbs and other words will be parsed so that the relational lexicon being constructed will include many additional relationships and other knowledge about words and their usage.
On the Use of Linguistic Ontologies for Accessing and Indexing Distributed Digital Libraries
- in Proceedings of the First Annual Conference on the Theory and Practice of Digital Libraries, 1994
Cited by 3 (0 self)
In this paper, we review some of the previous approaches and then present our approach, based on the use of a large and sophisticated linguistic ontology to generate suitable keyword matches. We compare our approach to the previous ones and discuss the advantages and disadvantages of these approaches in the context of large, full text, digital libraries distributed across a wide area network. We conclude by discussing directions for future work.
A Self-Organising Hybrid Model for Dynamic Text Clustering
- Proceedings of The Twenty-third SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, 2003
Cited by 3 (1 self)
A text clustering neural model is traditionally assumed to cluster static text information and represent its inner structure on a flat map. However, the quantity of text information is continuously growing and the relationships between texts are usually complicated. Therefore, the information is not static and a flat map may not be enough to describe the relationships in the input data. In this paper, for a real-world text clustering task we propose a new competitive Self-Organising Map (SOM) model, namely the Dynamic Adaptive Self-Organising Hybrid model (DASH). The features of DASH are a dynamic structure, hierarchical clustering, non-stationary data learning and parameter self-adjustment. All features are data-oriented: DASH adjusts its behaviour not only by modifying its parameters but also through an adaptive structure. We test the performance of our model using the larger new Reuters news corpus based on the criteria of classification accuracy and mean quantization error.
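The mean quantization error mentioned above is commonly taken to be the average distance between each input vector and the weight vector of its best-matching unit. A minimal sketch under that assumption follows. It is not the DASH model: the function names, the toy data, and the winner-only update with no neighbourhood function are illustrative simplifications.

```python
import math


def best_matching_unit(x, units):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(units)), key=lambda i: math.dist(x, units[i]))


def mean_quantization_error(data, units):
    """Average Euclidean distance from each input to its best-matching unit."""
    total = sum(math.dist(x, units[best_matching_unit(x, units)]) for x in data)
    return total / len(data)


def winner_update(x, units, lr=0.5):
    """Return new units with only the winner moved a fraction lr toward x.

    A full SOM would also move the winner's map neighbours; that is omitted.
    """
    win = best_matching_unit(x, units)
    moved = [w + lr * (xi - w) for w, xi in zip(units[win], x)]
    return [moved if i == win else list(u) for i, u in enumerate(units)]


# Toy 2-D inputs and two units; all values are invented for illustration.
data = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0)]
units = [[0.0, 1.0], [4.0, 1.0]]

mqe_before = mean_quantization_error(data, units)
units_after = winner_update(data[2], units)
mqe_after = mean_quantization_error(data, units_after)
print(mqe_before, mqe_after)
```

In this toy run the error falls after one update because the second unit moves closer to the only input it represents. A single update does not guarantee a lower error on larger data sets.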

