Results 1 - 10
of
32
Tensor Decompositions and Applications
- SIAM REVIEW
, 2009
"... This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N -way array. Decompositions of higher-order tensors (i.e., N -way arrays with N ⥠3) have applications in psychometrics, chemometrics, signal proce ..."
Abstract
-
Cited by 95 (13 self)
- Add to MetaCart
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N -way array. Decompositions of higher-order tensors (i.e., N -way arrays with N ⥠3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decompo-
sition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.
Unsupervised multiway data analysis: A literature survey
- IEEE Transactions on Knowledge and Data Engineering
, 2008
"... Multiway data analysis captures multilinear structures in higher-order datasets, where data have more than two modes. Standard two-way methods commonly applied on matrices often fail to find the underlying structures in multiway arrays. With increasing number of application areas, multiway data anal ..."
Abstract
-
Cited by 24 (7 self)
- Add to MetaCart
Multiway data analysis captures multilinear structures in higher-order datasets, where data have more than two modes. Standard two-way methods commonly applied on matrices often fail to find the underlying structures in multiway arrays. With increasing number of application areas, multiway data analysis has become popular as an exploratory analysis tool. We provide a review of significant contributions in literature on multiway models, algorithms as well as their applications in diverse disciplines including chemometrics, neuroscience, computer vision, and social network analysis. 1.
Scalable tensor decompositions for multi-aspect data mining
- In ICDM 2008: Proceedings of the 8th IEEE International Conference on Data Mining
, 2008
"... Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multi-way arrays) provide a natural representation for such data. Consequently, tensor decompositi ..."
Abstract
-
Cited by 17 (1 self)
- Add to MetaCart
Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multi-way arrays) provide a natural representation for such data. Consequently, tensor decompositions such as Tucker become important tools for summarization and analysis. One major challenge is how to deal with highdimensional, sparse data. In other words, how do we compute decompositions of tensors where most of the entries of the tensor are zero. Specialized techniques are needed for computing the Tucker decompositions for sparse tensors because standard algorithms do not account for the sparsity of the data. As a result, a surprising phenomenon is observed by practitioners: Despite the fact that there is enough memory to store both the input tensors and the factorized output tensors, memory overflows occur during the tensor factorization process. To address this intermediate blowup problem, we propose Memory-Efficient Tucker (MET). Based on the available memory, MET adaptively selects the right execution strategy during the decomposition. We provide quantitative and qualitative evaluation of MET on real tensors. It achieves over 1000X space reduction without sacrificing speed; it also allows us to work with much larger tensors that were too big to handle before. Finally, we demonstrate a data mining case-study using MET. 1
A NEWTON-GRASSMANN METHOD FOR COMPUTING THE BEST MULTI-LINEAR RANK-(R_1, R_2, R_3) APPROXIMATION OF A Tensor
"... We derive a Newton method for computing the best rank-(r_1, r_2, r_3) approximation of a given J × K × L tensor A. The problem is formulated as an approximation problem on a product of Grassmann manifolds. Incorporating the manifold structure into Newton’s method ensures that all iterates generated ..."
Abstract
-
Cited by 13 (6 self)
- Add to MetaCart
We derive a Newton method for computing the best rank-(r_1, r_2, r_3) approximation of a given J × K × L tensor A. The problem is formulated as an approximation problem on a product of Grassmann manifolds. Incorporating the manifold structure into Newton’s method ensures that all iterates generated by the algorithm are points on the Grassmann manifolds. We also introduce a consistent notation for matricizing a tensor, for contracted tensor products and some tensor-algebraic manipulations, which simplify the derivation of the Newton equations and enable straightforward algorithmic implementation. Experiments show a quadratic convergence rate for the Newton-Grassmann algorithm.
Cross-language information retrieval using PARAFAC2
- Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, 2007
"... Approved for public release; further dissemination unlimited. ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
Approved for public release; further dissemination unlimited.
An Optimization Approach for Fitting Canonical Tensor Decompositions
, 2009
"... Tensor decompositions are higher-order analogues of matrix decompositions and have proven to be powerful tools for data analysis. In particular, we are interested in the canonical tensor decomposition, otherwise known as the CANDECOMP/PARAFAC decomposition (CPD), which expresses a tensor as the sum ..."
Abstract
-
Cited by 8 (4 self)
- Add to MetaCart
Tensor decompositions are higher-order analogues of matrix decompositions and have proven to be powerful tools for data analysis. In particular, we are interested in the canonical tensor decomposition, otherwise known as the CANDECOMP/PARAFAC decomposition (CPD), which expresses a tensor as the sum of component rank-one tensors and is used in a multitude of applications such as chemometrics, signal processing, neuroscience, and web analysis. The task of computing the CPD, however, can be difficult. The typical approach is based on alternating least squares (ALS) optimization, which can be remarkably fast but is not very accurate. Previously, nonlinear least squares (NLS) methods have also been recommended; existing NLS methods are accurate but slow. In this paper, we propose the use
of gradient-based optimization methods. We discuss the mathematical calculation of the derivatives and further show that they can be computed efficiently, at the same cost as one iteration of ALS. Computational experiments demonstrate that the gradient-based optimization methods are much more accurate than ALS and orders of magnitude faster than NLS.
Multivis: Content-based social network exploration through multi-way visual analysis
- In SDM
, 2009
"... With the explosion of social media, scalability becomes a key challenge. There are two main aspects of the problems that arise: 1) data volume: how to manage and analyze huge datasets to efficiently extract patterns, 2) data understanding: how to facilitate understanding of the patterns by users? To ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
With the explosion of social media, scalability becomes a key challenge. There are two main aspects of the problems that arise: 1) data volume: how to manage and analyze huge datasets to efficiently extract patterns, 2) data understanding: how to facilitate understanding of the patterns by users? To address both aspects of the scalability challenge, we present a hybrid approach that leverages two complementary disciplines, data mining and information visualization. In particular, we propose 1) an analytic data model for content-based networks using tensors; 2) an efficient high-order clustering framework for analyzing the data; 3) a scalable context-sensitive graph visualization to present the clusters. We evaluate the proposed methods using both synthetic and real datasets. In terms of computational efficiency, the proposed methods are an order of magnitude faster compared to the baseline. In terms of effectiveness, we present several case studies of real corporate social networks. 1
Descent methods for Nonnegative Matrix Factorization arXiv:0801.3199v3 [cs.NA
, 2009
"... In this paper, we present several descent methods that can be applied to nonnegative matrix factorization and we analyze a recently developed fast block coordinate method. We also give a comparison of these different methods and show that the new block coordinate method has better properties in term ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
In this paper, we present several descent methods that can be applied to nonnegative matrix factorization and we analyze a recently developed fast block coordinate method. We also give a comparison of these different methods and show that the new block coordinate method has better properties in terms of approximation error and complexity. By interpreting this method as a rank-one approximation of the residue matrix, we also extend it to the nonnegative tensor factorization and introduce some variants of the method by imposing some additional controllable constraints such as: sparsity, discreteness and smoothness. 1
Link Prediction on Evolving Data using Matrix and Tensor Factorizations
- IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS
, 2009
"... The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for time periods 1 through T, can we predict the l ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for time periods 1 through T, can we predict the links in time period T +1? Specifically, we look at bipartite graphs changing over time and consider matrix- and tensorbased methods for predicting links. We present a weight-based method for collapsing multi-year data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural threedimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrixand tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem.
MACH: Fast Randomized Tensor Decompositions
, 2010
"... Tensors naturally model many real world processes which generate multi-aspect data. Such processes appear in many different research disciplines, e.g, chemometrics, computer vision, psychometrics and neuroimaging analysis. Tensor decompositions such as the Tucker decomposition are used to analyze mu ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Tensors naturally model many real world processes which generate multi-aspect data. Such processes appear in many different research disciplines, e.g, chemometrics, computer vision, psychometrics and neuroimaging analysis. Tensor decompositions such as the Tucker decomposition are used to analyze multi-aspect data and extract latent factors, which capture the multilinear data structure. Such decompositions are powerful mining tools for extracting patterns from large data volumes. However, most frequently used algorithms for such decompositions involve the computationally expensive Singular Value Decomposition. In this paper we propose MACH, a new sampling algorithm to compute such decompositions. Our method is of significant practical value for tensor streams, such as environmental monitoring systems, IP traffic matrices over time, where large amounts of data are accumulated and the analysis is computationally intensive but also in “post-mortem ” data analysis cases where the tensor does not fit in the available memory. We provide the theoretical analysis of our proposed method and verify its efficacy on synthetic data and two real world monitoring system applications. Categories and Subject Descriptors:

