Results 1  10
of
25
Vectorbased models of semantic composition
 In Proceedings of ACL08: HLT
, 2008
"... This paper proposes a framework for representing the meaning of phrases and sentences in vector space. Central to our approach is vector composition which we operationalize in terms of additive and multiplicative functions. Under this framework, we introduce a wide range of composition models which ..."
Abstract

Cited by 216 (5 self)
 Add to MetaCart
This paper proposes a framework for representing the meaning of phrases and sentences in vector space. Central to our approach is vector composition which we operationalize in terms of additive and multiplicative functions. Under this framework, we introduce a wide range of composition models which we evaluate empirically on a sentence similarity task. Experimental results demonstrate that the multiplicative models are superior to the additive alternatives when compared against human judgments.
Composition in distributional models of semantics
, 2010
"... Distributional models of semantics have proven themselves invaluable both in cognitive modelling of semantic phenomena and also in practical applications. For example, they have been used to model judgments of semantic similarity (McDonald, 2000) and association (Denhire and Lemaire, 2004; Griffit ..."
Abstract

Cited by 141 (3 self)
 Add to MetaCart
(Show Context)
Distributional models of semantics have proven themselves invaluable both in cognitive modelling of semantic phenomena and also in practical applications. For example, they have been used to model judgments of semantic similarity (McDonald, 2000) and association (Denhire and Lemaire, 2004; Griffiths et al., 2007) and have been shown to achieve human level performance on synonymy tests (Landuaer and Dumais, 1997; Griffiths et al., 2007) such as those included in the Test of English as Foreign Language (TOEFL). This ability has been put to practical use in automatic thesaurus extraction (Grefenstette, 1994). However, while there has been a considerable amount of research directed at the most effective ways of constructing representations for individual words, the representation of larger constructions, e.g., phrases and sentences, has received relatively little attention. In this thesis we examine this issue of how to compose meanings within distributional models of semantics to form representations of multiword structures. Natural language data typically consists of such complex structures, rather than
A ContextTheoretic Framework for Compositionality in Distributional Semantics
"... Formalizing “meaning as context ” mathematically leads to a new, algebraic theory of meaning, in which composition is bilinear and associative. These properties are shared by other methods that have been proposed in the literature, including the tensor product, vector addition, pointwise multiplicat ..."
Abstract

Cited by 26 (2 self)
 Add to MetaCart
Formalizing “meaning as context ” mathematically leads to a new, algebraic theory of meaning, in which composition is bilinear and associative. These properties are shared by other methods that have been proposed in the literature, including the tensor product, vector addition, pointwise multiplication, and matrix multiplication. Entailment can be represented by a vector lattice ordering, inspired by a strengthened form of the distributional hypothesis, and a degree of entailment is defined in the form of a conditional probability. Approaches to the task of recognizing textual entailment, including the use of subsequence matching, lexical entailment probability, and latent Dirichlet allocation, can be described within our framework. 1.
Language models based on semantic composition
 In Proceedings of EMNLP
, 2009
"... In this paper we propose a novel statistical language model to capture longrange semantic dependencies. Specifically, we apply the concept of semantic composition to the problem of constructing predictive history representations for upcoming words. We also examine the influence of the underlying se ..."
Abstract

Cited by 17 (3 self)
 Add to MetaCart
(Show Context)
In this paper we propose a novel statistical language model to capture longrange semantic dependencies. Specifically, we apply the concept of semantic composition to the problem of constructing predictive history representations for upcoming words. We also examine the influence of the underlying semantic space on the composition task by comparing spatial semantic representations against topicbased ones. The composition models yield reductions in perplexity when combined with a standard ngram language model over the ngram model alone. We also obtain perplexity reductions when integrating our models with a structured language model. 1
Compositional matrixspace models of language
 In ACL
, 2010
"... We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP a ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
Contexttheoretic Semantics for Natural Language: an Overview
"... We present the contexttheoretic framework, which provides a set of rules for the nature of composition of meaning based on the philosophy of meaning as context. Principally, in the framework the composition of the meaning of words can be represented as multiplication of their representative vectors ..."
Abstract

Cited by 11 (2 self)
 Add to MetaCart
We present the contexttheoretic framework, which provides a set of rules for the nature of composition of meaning based on the philosophy of meaning as context. Principally, in the framework the composition of the meaning of words can be represented as multiplication of their representative vectors, where multiplication is distributive with respect to the vector space. We discuss the applicability of the framework to a range of techniques in natural language processing, including subsequence matching, the lexical entailment model of Dagan et al. (2005), vectorbased representations of taxonomies, statistical parsing and the representation of uncertainty in logical semantics. 1
A formalization of logical imaging for information retrieval using quantum theory
 In DEXA’08
, 2008
"... In this paper we introduce a formalization of Logical Imaging applied to IR in terms of Quantum Theory through the use of an analogy between states of a quantum system and terms in text documents. Our formalization relies upon the Schrödinger Picture, creating an analogy between the dynamics of a p ..."
Abstract

Cited by 9 (5 self)
 Add to MetaCart
(Show Context)
In this paper we introduce a formalization of Logical Imaging applied to IR in terms of Quantum Theory through the use of an analogy between states of a quantum system and terms in text documents. Our formalization relies upon the Schrödinger Picture, creating an analogy between the dynamics of a physical system and the kinematics of probabilities generated by Logical Imaging. By using Quantum Theory, it is possible to model more precisely contextual information in a seamless and principled fashion within the Logical Imaging process. While further work is needed to empirically validate this, the foundations for doing so are provided. 1
Domain and function: A dualspace model of semantic relations and compositions
 Journal of Artificial Intelligence Research
"... Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
(Show Context)
Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the relations are analogous). Likewise, with representations of dog, house, and kennel, an algorithm should be able to recognize that the semantic composition of dog and house, dog house, is highly similar to kennel (dog house and kennel are synonymous). It seems that these two tasks, recognizing relations and compositions, are closely connected. However, up to now, the best models for relations are significantly different from the best models for compositions. In this paper, we introduce a dualspace model that unifies these two tasks. This model matches the performance of the best previous models for relations and compositions. The dualspace model consists of a space for measuring domain similarity and a space for measuring function similarity. Carpenter and wood share the same domain, the domain of carpentry. Mason and stone share the same domain, the domain of masonry. Carpenter and mason share the same function, the function of artisans. Wood and stone share the same function, the function of materials. In the composition dog house, kennel has some domain overlap with both dog and house (the domains of pets and buildings). The function of kennel is similar to the function of house (the function of shelters). By combining domain and function similarities in various ways, we can model relations, compositions, and other aspects of semantics. 1.
Semantic Composition with Quotient Algebras
 Proceedings of Geometric Models of Natural Language Semantics (GEMS2010
, 2010
"... We describe an algebraic approach for computing with vector based semantics. The tensor product has been proposed as a method of composition, but has the undesirable property that strings of different length are incomparable. We consider how a quotient algebra of the tensor algebra can allow such co ..."
Abstract

Cited by 5 (2 self)
 Add to MetaCart
(Show Context)
We describe an algebraic approach for computing with vector based semantics. The tensor product has been proposed as a method of composition, but has the undesirable property that strings of different length are incomparable. We consider how a quotient algebra of the tensor algebra can allow such comparisons to be made, offering the possibility of datadriven models of semantic composition. 1
Lambek vs. Lambek: Functorial vector space semantics and string diagrams for Lambek calculus. Annals of Pure and Applied Logic
, 2013
"... The Distributional Compositional Categorical (DisCoCat) model is a mathematical framework that provides compositional semantics for meanings of natural language sentences. It consists of a computational procedure for constructing meanings of sentences, given their grammatical structure in terms of c ..."
Abstract

Cited by 5 (4 self)
 Add to MetaCart
(Show Context)
The Distributional Compositional Categorical (DisCoCat) model is a mathematical framework that provides compositional semantics for meanings of natural language sentences. It consists of a computational procedure for constructing meanings of sentences, given their grammatical structure in terms of compositional typelogic, and given the empirically derived meanings of their words. For the particular case that the meaning of words is modelled within a distributional vector space model, its experimental predictions, derived from real large scale data, have outperformed other empirically validated methods that could build vectors for a full sentence. This success can be attributed to a conceptually motivated mathematical underpinning, something which the other methods lack, by integrating qualitative compositional typelogic and quantitative modelling of meaning within a categorytheoretic mathematical framework. The typelogic used in the DisCoCat model is Lambek’s pregroup grammar. Pregroup types form a posetal compact closed category, which can be passed, in a functorial manner, on to the compact closed structure of vector spaces, linear maps and tensor product. The diagrammatic versions of the equational reasoning in compact closed categories can be interpreted as the flow of word meanings within sentences. Pregroups simplify Lambek’s previous typelogic, the Lambek calculus. The latter and its extensions have been extensively used to formalise and reason about various linguistic phenomena. Hence, the apparent reliance of the DisCoCat on pregroups has been seen as a shortcoming. This paper addresses this concern, by pointing out that one may as well realise a functorial passage from the original typelogic of Lambek, a monoidal biclosed category, to vector spaces, or to any other model of meaning organised within a monoidal biclosed category. The corresponding string diagram calculus, due to Baez and Stay, now depicts the flow of word meanings, and also reflects the structure of the parse trees of the Lambek calculus. 1 ar