Results 1  10
of
474
Indexing by latent semantic analysis
 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higherorder structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract

Cited by 3460 (33 self)
 Add to MetaCart
(Show Context)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higherorder structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singularvalue decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudodocument vectors formed from weighted combinations of terms, and documents with suprathreshold cosine values are returned. initial tests find this completely automatic method for retrieval to be promising.
Tensor Decompositions and Applications
 SIAM REVIEW
, 2009
"... This survey provides an overview of higherorder tensor decompositions, their applications, and available software. A tensor is a multidimensional or N way array. Decompositions of higherorder tensors (i.e., N way arrays with N â¥ 3) have applications in psychometrics, chemometrics, signal proce ..."
Abstract

Cited by 667 (17 self)
 Add to MetaCart
(Show Context)
This survey provides an overview of higherorder tensor decompositions, their applications, and available software. A tensor is a multidimensional or N way array. Decompositions of higherorder tensors (i.e., N way arrays with N â¥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higherorder extensions of the matrix singular value decompo
sition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rankone tensors, and the Tucker decomposition is a higherorder form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The Nway Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.
Attention, similarity, and the identificationCategorization Relationship
, 1986
"... A unified quantitative approach to modeling subjects ' identification and categorization of multidimensional perceptual stimuli is proposed and tested. Two subjects identified and categorized the same set of perceptually confusable stimuli varying on separable dimensions. The identification dat ..."
Abstract

Cited by 612 (28 self)
 Add to MetaCart
(Show Context)
A unified quantitative approach to modeling subjects ' identification and categorization of multidimensional perceptual stimuli is proposed and tested. Two subjects identified and categorized the same set of perceptually confusable stimuli varying on separable dimensions. The identification data were modeled using Sbepard's (1957) multidimensional scalingchoice framework. This framework was then extended to model the subjects ' categorization performance. The categorization model, which generalizes the context theory of classification developed by Medin and Schaffer (1978), assumes that subjects store category exemplars in memory. Classification decisions are based on the similarity of stimuli to the stored exemplars. It is assumed that the same multidimensional perceptual representation underlies performance in both the identification and Categorization paradigms. However, because of the influence of selective attention, similarity relationships change systematically across the two paradigms. Some support was gained for the hypothesis that subjects distribute attention among component dimensions so as to optimize categorization performance. Evidence was also obtained that subjects may have augmented their category representations with inferred exemplars. Implications of the results for theories of multidimensional scaling and categorization are discussed.
From frequency to meaning : Vector space models of semantics
 Journal of Artificial Intelligence Research
, 2010
"... Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are begi ..."
Abstract

Cited by 311 (3 self)
 Add to MetaCart
(Show Context)
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field. 1.
Hierarchical singular value decomposition of tensors
 SIAM Journal on Matrix Analysis and Applications
"... Abstract. We define the hierarchical singular value decomposition (SVD) for tensors of order d ≥ 2. This hierarchical SVD has properties like the matrix SVD (and collapses to the SVD in d = 2), and we prove these. In particular, one can find low rank (almost) best approximations in a hierarchical fo ..."
Abstract

Cited by 168 (9 self)
 Add to MetaCart
(Show Context)
Abstract. We define the hierarchical singular value decomposition (SVD) for tensors of order d ≥ 2. This hierarchical SVD has properties like the matrix SVD (and collapses to the SVD in d = 2), and we prove these. In particular, one can find low rank (almost) best approximations in a hierarchical format (HTucker) which requires only O((d − 1)k3 + dnk) parameters, where d is the order of the tensor, n the size of the modes and k the (hierarchical) rank. The HTucker format is a specialization of the Tucker format and it contains as a special case all (canonical) rank k tensors. Based on this new concept of a hierarchical SVD we present algorithms for hierarchical tensor calculations allowing for a rigorous error analysis. The complexity of the truncation (finding lower rank approximations to hierarchical rank k tensors) is in O((d−1)k4+dnk2) and the attainable accuracy is just 2–3 digits less than machine precision.
A Framework for Robust Subspace Learning
 International Journal of Computer Vision
, 2003
"... Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multilinear models. These models have been widely used for the representation of shape, appearance, motion, etc, in computer vision applications. ..."
Abstract

Cited by 157 (10 self)
 Add to MetaCart
(Show Context)
Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multilinear models. These models have been widely used for the representation of shape, appearance, motion, etc, in computer vision applications.
Blind PARAFAC receivers for DSCDMA systems
 IEEE TRANS. SIGNAL PROCESSING
, 2000
"... This paper links the directsequence codedivision multiple access (DSCDMA) multiuser separationequalizationdetection problem to the parallel factor (PARAFAC) model, which is an analysis tool rooted in psychometrics and chemometrics. Exploiting this link, it derives a deterministic blind PARAFAC ..."
Abstract

Cited by 118 (20 self)
 Add to MetaCart
This paper links the directsequence codedivision multiple access (DSCDMA) multiuser separationequalizationdetection problem to the parallel factor (PARAFAC) model, which is an analysis tool rooted in psychometrics and chemometrics. Exploiting this link, it derives a deterministic blind PARAFAC DSCDMA receiver with performance close to nonblind minimum meansquared error (MMSE). The proposed PARAFAC receiver capitalizes on code, spatial, and temporal diversitycombining, thereby supporting small sample sizes, more users than sensors, and/or less spreading than users. Interestingly, PARAFAC does not require knowledge of spreading codes, the specifics of multipath (interchip interference), DOAcalibration information, finite alphabet/constant modulus, or statistical independence/whiteness to recover the informationbearing signals. Instead, PARAFAC relies on a fundamental result regarding the uniqueness of lowrank threeway array decomposition due to Kruskal (and generalized herein to the complexvalued case) that guarantees identifiability of all relevant signals and propagation parameters. These and other issues are also demonstrated in pertinent simulation experiments.
Principal component analysis of threemode data by means of alternating least squares algorithms
 Springer Complete Collection
, 1980
"... A new method to estimate the parameters of Tucker's threemode principal component model is discussed, and the convergence properties of the alternating least squares algorithm to solve the estimation problem are considered. A special case ofthe general Tucker model, in which the principal comp ..."
Abstract

Cited by 109 (0 self)
 Add to MetaCart
A new method to estimate the parameters of Tucker's threemode principal component model is discussed, and the convergence properties of the alternating least squares algorithm to solve the estimation problem are considered. A special case ofthe general Tucker model, in which the principal component analysis is only performed over two of the three modes is briefly outlined as well. The Miller & Nicely data on the confusion of English consonants are used to illustrate the programs TUCKALS3 and TUCKALS2 which incorporate the algorithms for the two models described. Key words: threemode principal component analysis, alternating least squares, factor analysis, multidimensional scaling, individual differences scaling, simultaneous iteration, confusion of consonants. 1. ThreeMode Models and Their Solutions The threemode modelhere r ferred to as the Tucker3 modelwas first formulated by Tucker [1963], and subsequently extended in articles by Tucker [1964, 1966], and Levin [1963, Note 5] especially with respect to the mathematical description and program
Perceptualcognitive universals as reflections of the world
 Behavioral and Brain Sciences
, 2001
"... SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peerreviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for pap ..."
Abstract

Cited by 108 (2 self)
 Add to MetaCart
SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peerreviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu
Beyond streams and graphs: Dynamic tensor analysis
 In KDD
, 2006
"... How do we find patterns in authorkeyword associations, evolving over time? Or in DataCubes, with productbranchcustomer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection, rule ..."
Abstract

Cited by 106 (16 self)
 Add to MetaCart
(Show Context)
How do we find patterns in authorkeyword associations, evolving over time? Or in DataCubes, with productbranchcustomer sales information? Matrix decompositions, like principal component analysis (PCA) and variants, are invaluable tools for mining, dimensionality reduction, feature selection, rule identification in numerous settings like streaming data, text, graphs, social networks and many more. However, they have only two orders, like author and keyword, in the above example. We propose to envision such higher order data as tensors, and tap the vast literature on the topic. However, these methods do not necessarily scale up, let alone operate on semiinfinite streams. Thus, we introduce the dynamic tensor analysis (DTA) method, and its variants. DTA provides a compact summary for highorder and highdimensional data, and it also reveals the hidden correlations. Algorithmically, we designed DTA very carefully so that it is (a) scalable, (b) space efficient (it does not need to store the past) and (c) fully automatic with no need for user defined parameters. Moreover, we propose STA, a streaming tensor analysis method, which provides a fast, streaming approximation to DTA. We implemented all our methods, and applied them in two real settings, namely, anomaly detection and multiway latent semantic indexing. We used two real, large datasets, one on network flow data (100GB over 1 month) and one from DBLP (200MB over 25 years). Our experiments show that our methods are fast, accurate and that they find interesting patterns and outliers on the real datasets. 1.