Results 1 - 10 of 46
Automatic Word Sense Discrimination
- Journal of Computational Linguistics
, 1998
Cited by 272 (0 self)
This paper presents context-group discrimination, a disambiguation algorithm based on clustering. Senses are interpreted as groups (or clusters) of similar contexts of the ambiguous word. Words, contexts, and senses are represented in Word Space, a high-dimensional, real-valued space in which closeness corresponds to semantic similarity. Similarity in Word Space is based on second-order co-occurrence: two tokens (or contexts) of the ambiguous word are assigned to the same sense cluster if the words they co-occur with in turn occur with similar words in a training corpus. The algorithm is automatic and unsupervised in both training and application: senses are induced from a corpus without labeled training instances or other external knowledge sources. The paper demonstrates good performance of context-group discrimination for a sample of natural and artificial ambiguous words.
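The second-order co-occurrence idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy corpus, the stop list, the function names, the similarity threshold, and the greedy grouping step, which is only a stand-in and not the clustering procedure used in the paper. Each context of the ambiguous word is represented by the sum of the co-occurrence vectors of its words, so two contexts can be similar even when they share no words directly.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy training corpus; each sentence is one co-occurrence window.
CORPUS = [
    "river water flows past the muddy bank",
    "the boat drifted on the river water",
    "muddy water covered the boat",
    "the bank approved the loan and the deposit",
    "interest on the loan and the deposit grew",
    "money deposit earns interest",
]
STOP = {"the", "on", "and", "past"}  # assumed stop list

def word_vectors(corpus):
    """First-order vectors: for each word, counts of words sharing a sentence."""
    vecs = defaultdict(Counter)
    for sentence in corpus:
        words = [w for w in sentence.split() if w not in STOP]
        for w in words:
            for c in words:
                if c != w:
                    vecs[w][c] += 1
    return vecs

def context_vector(context, vecs, target):
    """Second-order vector: sum of the word vectors of the context's words."""
    total = Counter()
    for w in context.split():
        if w != target and w in vecs:
            total.update(vecs[w])
    return total

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_contexts(contexts, vecs, target, threshold=0.3):
    """Greedy grouping (illustrative stand-in for a real clustering algorithm):
    join the first group whose seed is similar enough, else start a new group."""
    seeds, labels = [], []
    for ctx in contexts:
        vec = context_vector(ctx, vecs, target)
        for i, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(i)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels

vecs = word_vectors(CORPUS)
contexts = ["boat near the bank", "water by the bank", "interest at the bank"]
print(group_contexts(contexts, vecs, "bank"))
```

In this toy setup the first two contexts share no content word, yet they land in the same group because "boat" and "water" co-occur with similar words in the corpus; the third context starts its own group.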
Word-Sense Disambiguation Using Statistical Models of Roget's Categories Trained on Large Corpora
, 1992
Cited by 265 (10 self)
This paper describes a program that disambiguates English word senses in unrestricted text using statistical models of the major Roget's Thesaurus categories. Roget's categories serve as approximations of conceptual classes. The categories listed for a word in Roget's index tend to correspond to sense distinctions; thus selecting the most likely category provides a useful level of sense disambiguation. The selection of categories is accomplished by identifying and weighting words that are indicative of each category when seen in context, using a Bayesian theoretical framework. Other
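As a rough illustration of weighting context words by how indicative they are of a category, the sketch below scores each word with a smoothed log ratio of its probability within a category's contexts to its overall probability, then picks the category with the highest summed score. The category names, training contexts, add-one smoothing, and function names are all hypothetical choices for this example and are not taken from the paper.

```python
from collections import Counter
from math import log

# Hypothetical contexts collected around words listed under each category.
CATEGORY_CONTEXTS = {
    "ANIMAL": "the crane flew over the marsh with other birds and a heron nest".split(),
    "MACHINE": "the crane lifted steel beams at the construction site with a heavy engine".split(),
}

def train_weights(category_contexts):
    """Weight of word w for a category: log( P(w | category) / P(w) ), add-one smoothed."""
    all_counts = Counter()
    for words in category_contexts.values():
        all_counts.update(words)
    vocab = len(all_counts)
    total = sum(all_counts.values())
    weights = {}
    for cat, words in category_contexts.items():
        counts = Counter(words)
        n = len(words)
        weights[cat] = {
            w: log(((counts[w] + 1) / (n + vocab)) / ((all_counts[w] + 1) / (total + vocab)))
            for w in all_counts
        }
    return weights

def classify(context, weights):
    """Sum the weights of the context words per category; unseen words contribute 0."""
    scores = {
        cat: sum(w.get(tok, 0.0) for tok in context.split())
        for cat, w in weights.items()
    }
    return max(scores, key=scores.get), scores

weights = train_weights(CATEGORY_CONTEXTS)
best, scores = classify("the crane lifted a heavy beam near the site", weights)
print(best, scores)
```

Words that occur equally often in both categories (here "the", "crane", "with", "a") receive the same weight everywhere and so do not affect the decision; only the category-specific words move the scores.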
Introduction to the special issue on word sense disambiguation
- Computational Linguistics J
, 1998
Word sense disambiguation: The state of the art
- Computational Linguistics
, 1998
Cited by 92 (3 self)
The automatic disambiguation of word senses has been an interest and concern since the earliest days of computer treatment of language in the 1950s. Sense disambiguation is an “intermediate task” (Wilks and Stevenson, 1996) which is not an end in itself, but rather is necessary at one level or another to accomplish most natural language processing tasks. It is
Information Retrieval Based on Word Senses
, 1995
Cited by 65 (0 self)
This paper proposes an algorithm for word sense disambiguation based on a vector representation of word similarity derived from lexical co-occurrence. It differs from standard approaches by allowing for as fine-grained distinctions as is warranted by the information at hand, rather than supposing a fixed number of senses per word, and by allowing for more than one sense to be assigned to a given word occurrence. The algorithm is applied to the standard vector-space information retrieval model and an evaluation is performed over the Category B TREC-1 corpus (WSJ subcollection). Results show that this sense disambiguation algorithm improves performance by between 7o and 1o on average.
Learning to Segment Speech Using Multiple Cues: A Connectionist Model
- LANGUAGE AND COGNITIVE PROCESSES
, 1998
Word Space
- Advances in Neural Information Processing Systems 5
, 1993
Cited by 53 (0 self)
Representations for semantic information about words are necessary for many applications of neural networks in natural language processing. This paper describes an efficient, corpus-based method for inducing distributed semantic representations for a large number of words (50,000) from lexical co-occurrence statistics by means of a large-scale linear regression. The representations are successfully applied to word sense disambiguation using a nearest neighbor method.
Advances in SHRUTI - A neurally motivated model of relational knowledge representation and rapid inference using temporal synchrony
- Applied Intelligence
, 1999
Cited by 50 (15 self)
We are capable of drawing a variety of inferences effortlessly, spontaneously, and with remarkable efficiency, as though these inferences are a reflex response of our cognitive apparatus. This remarkable human ability poses a challenge for cognitive science and computational neuroscience: How can a network of slow neuron-like elements represent a large body of systematic knowledge and perform a wide range of inferences with such speed? The connectionist model Shruti attempts to address this challenge by demonstrating how a neurally plausible network can encode a large body of semantic and episodic facts, systematic rules, and knowledge about entities and types, and yet perform a wide range of explanatory and predictive inferences within a few hundred milliseconds. Relational structures (frames, schemas) are represented in Shruti by clusters of cells, and inference in Shruti corresponds to a transient propagation of rhythmic activity over such cell-clusters wherein dynamic bindings are represented by the synchronous firing of appropriate cells. Shruti encodes mappings across relational structures using high-efficacy links that enable the propagation of rhythmic activity, and it encodes items in long-term memory as coincidence and coincidence-error detector circuits that become active in response to the occurrence (or non-occurrence) of appropriate coincidences in the ongoing flux of rhythmic activity.
"I Don't Believe in Word Senses"
, 1999
Cited by 50 (2 self)
Word sense disambiguation assumes word senses. Within the lexicography and linguistics literature, they are known to be very slippery entities. The paper looks at problems with existing accounts of 'word sense' and describes the various kinds of ways in which a word's meaning can deviate from its core meaning. An analysis is presented in which word senses are abstractions from clusters of corpus citations, in accordance with current lexicographic practice. The corpus citations, not the word senses, are the basic objects in the ontology. The corpus citations will be clustered into senses according to the purposes of whoever or whatever does the clustering. In the absence of such purposes, word senses do not exist. Word sense disambiguation also needs a set of word senses to disambiguate between. In most recent work, the set has been taken from a general-purpose lexical resource, with the assumption that the lexical resource describes the word senses of English/French/. . . , between whi...

