Dimensions of Meaning (1992) [102 citations — 3 self]

Abstract:

The representation of documents and queries as vectors in a high-dimensional space is well established in information retrieval [1]. This paper proposes representing the semantics of words and contexts in a text as vectors. The dimensions of the space are words, and the initial vector for an entity is determined by the words occurring close to it, which implies that the space has several thousand dimensions (words). This makes the vector representations (which are dense) too cumbersome to use directly. Therefore, dimensionality reduction by means of a singular value decomposition is employed. The paper analyzes the structure of the vector representations and applies them to word sense disambiguation and thesaurus induction.
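
The pipeline the abstract describes (co-occurrence vectors followed by a singular value decomposition) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `cooccurrence_matrix`, `reduce_dimensions`, and `cosine`, the window size, the target dimensionality `k`, and the toy sentence are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def cooccurrence_matrix(tokens, window=2):
    """Count, for each word, the words occurring within `window` positions.

    The window size is an illustrative assumption, not a value from the paper.
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for pos, word in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for other in range(lo, hi):
            if other != pos:
                counts[index[word], index[tokens[other]]] += 1
    return vocab, counts


def reduce_dimensions(counts, k):
    """Project each row onto the top-k singular directions via SVD."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical toy corpus; a real corpus would yield thousands of dimensions.
tokens = "the judge ruled on the suit the tailor made the suit".split()
vocab, counts = cooccurrence_matrix(tokens)
vectors = reduce_dimensions(counts, k=3)
```

Under these assumptions, each row of `vectors` is a reduced representation of one vocabulary word, and `cosine` can compare two rows, for example to rank candidate neighbors when inducing a thesaurus.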

Citations

2329 Introduction to modern information retrieval – Salton - 1983
1636 Indexing by latent semantic analysis – Deerwester, Dumais, et al. - 1990
970 Principal Component Analysis – Jolliffe - 1986
464 Word association norms, mutual information, and lexicography – Church, Hanks - 1989
430 Scatter/gather: a cluster-based approach to browsing large document collections – Cutting, Karger, et al. - 1992
351 Building Large Knowledge-Based Systems – Lenat, Guha - 1990
228 Word sense disambiguation using statistical models of Roget's categories trained on large corpora – Yarowsky - 1992
210 AutoClass: a Bayesian classification system – Cheeseman, Kelly, et al. - 1988
146 Word-sense disambiguation using statistical methods – Brown, Pietra, et al. - 1991
63 Methods for Statistical Data Analysis of Multivariate Observations – Gnanadesikan - 1977
57 Using Bilingual Materials to Develop Word Sense Disambiguation Methods – Gale, Church, et al. - 1992