Results 1 -
6 of
6
Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space
"... We propose an approach to adjective-noun composition (AN) for corpus-based distributional semantics that, building on insights from theoretical linguistics, represents nouns as vectors and adjectives as data-induced (linear) functions (encoded as matrices) over nominal vectors. Our model significant ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
We propose an approach to adjective-noun composition (AN) for corpus-based distributional semantics that, building on insights from theoretical linguistics, represents nouns as vectors and adjectives as data-induced (linear) functions (encoded as matrices) over nominal vectors. Our model significantly outperforms the rivals on the task of reconstructing AN vectors not seen in training. A small post-hoc analysis further suggests that, when the model-generated AN vector is not similar to the corpus-observed AN vector, this is due to anomalies in the latter. We show moreover that our approach provides two novel ways to represent adjective meanings, alternative to its representation via corpus-based co-occurrence vectors, both outperforming the latter in an adjective clustering task. 1
Measuring Distributional Similarity in Context
"... The computation of meaning similarity as operationalized by vector-based models has found widespread use in many tasks ranging from the acquisition of synonyms and paraphrases to word sense disambiguation and textual entailment. Vector-based models are typically directed at representing words in iso ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
The computation of meaning similarity as operationalized by vector-based models has found widespread use in many tasks ranging from the acquisition of synonyms and paraphrases to word sense disambiguation and textual entailment. Vector-based models are typically directed at representing words in isolation and thus best suited for measuring similarity out of context. In his paper we propose a probabilistic framework for measuring similarity in context. Central to our approach is the intuition that word meaning is represented as a probability distribution over a set of latent senses and is modulated by context. Experimental results on lexical substitution and word similarity show that our algorithm outperforms previously proposed models. 1
A Semantic Similarity Language Model to Improve Automatic Image Annotation
"... Abstract — In recent years, with the rapid proliferation of digital images, the need to search and retrieve the images accurately, efficiently, and conveniently is becoming more acute. Automatic image annotation with image semantic content has attracted increasing attention, as it is the preprocess ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract — In recent years, with the rapid proliferation of digital images, the need to search and retrieve the images accurately, efficiently, and conveniently is becoming more acute. Automatic image annotation with image semantic content has attracted increasing attention, as it is the preprocess of annotation based image retrieval which provides users accurate, efficient, and convenient image retrieval with image understanding. Different machine learning approaches have been used to tackle the problem of automatic image annotation; however, most of them focused on exploring the relationship between images and annotation words and neglected the relationship among the annotation words. In this paper, we propose a framework of using language models to represent the word-to-word relation and thus to improve the performance of existing image annotation approaches utilizing probabilistic models. We also propose a specific language model- the semantic similarity language model to estimate the semantic similarity among the annotation words so that annotations that are more semantically coherent will have higher probability to be chosen to annotate the image. To illustrate the general idea of using language model to improve current image annotation systems, we added the language model on top of the two specific image annotation models- the translation model (TM) and the cross media relevance model (CMRM). We tested the improved models on a widely used image annotation corpus- the Corel 5K dataset. Our results show that by adding the semantic similarity language model, the performance of image annotation improves significantly in comparison with the original models. Our proposed language model can also be applied to other image annotation approaches using word probability conditioned on image or word-image joint probability as well. I.
Mayo
"... Distributed models of semantics assume that word meanings can be discovered from “the company they keep. ” Many such approaches learn semantics from large corpora, with each document considered to be unstructured bags of words, ignoring syntax and compositionality within a document. In contrast, thi ..."
Abstract
- Add to MetaCart
Distributed models of semantics assume that word meanings can be discovered from “the company they keep. ” Many such approaches learn semantics from large corpora, with each document considered to be unstructured bags of words, ignoring syntax and compositionality within a document. In contrast, this paper proposes a structured vectorial semantic framework, in which semantic vectors are defined and composed in syntactic context. As such, syntax and semantics are fully interactive; composition of semantic vectors necessarily produces a hypothetical syntactic parse. Evaluations show that using relationally-clustered headwords as a semantic space in this framework improves on a syntax-only model in perplexity and parsing accuracy.
(Linear) Maps of the Impossible: Capturing semantic anomalies in distributional space
"... In this paper, we present a first attempt to characterize the semantic deviance of composite expressions in distributional semantics. Specifically, we look for properties of adjective-noun combinations within a vectorbased semantic space that might cue their lack of meaning. We evaluate four differe ..."
Abstract
- Add to MetaCart
In this paper, we present a first attempt to characterize the semantic deviance of composite expressions in distributional semantics. Specifically, we look for properties of adjective-noun combinations within a vectorbased semantic space that might cue their lack of meaning. We evaluate four different compositionality models shown to have various levels of success in representing the meaning of AN pairs: the simple additive and multiplicative models of Mitchell and Lapata (2008), and the linear-map-based models of Guevara (2010) and Baroni and Zamparelli (2010). For each model, we generate composite vectors for a set of AN combinations unattested in the source corpus and which have been deemed either acceptable or semantically deviant. We then compute measures that might cue semantic anomaly, and compare each model’s results for the two classes of ANs. Our study shows that simple, unsupervised cues can indeed significantly tell unattested but acceptable ANs apart from impossible, or deviant, ANs, and that the simple additive and multiplicative models are the most effective in this task. 1
Evaluating Distributional Models of Semantics for Syntactically Invariant Inference
"... A major focus of current work in distributional models of semantics is to construct phrase representations compositionally from word representations. However, the syntactic contexts which are modelled are usually severely limited, a fact which is reflected in the lexical-level WSD-like evaluation me ..."
Abstract
- Add to MetaCart
A major focus of current work in distributional models of semantics is to construct phrase representations compositionally from word representations. However, the syntactic contexts which are modelled are usually severely limited, a fact which is reflected in the lexical-level WSD-like evaluation methods used. In this paper, we broaden the scope of these models to build sentence-level representations, and argue that phrase representations are best evaluated in terms of the inference decisions that they support, invariant to the particular syntactic constructions used to guide composition. We propose two evaluation methods in relation classification and QA which reflect these goals, and apply several recent compositional distributional models to the tasks. We find that the models outperform a simple lemma overlap baseline slightly, demonstrating that distributional approaches can already be useful for tasks requiring deeper inference. 1

