Tensor Decompositions and Applications
SIAM Review, 2009
Cited by 705 (17 self)
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N-way array. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.
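The phrase "a sum of rank-one tensors" in the abstract above can be illustrated with a small sketch. It is not code from the survey or from any of the toolboxes it names. The function name `cp_reconstruct`, the dictionary-of-index-tuples layout, and the separate `weights` vector are assumptions made only for this example.

```python
from itertools import product
from math import prod


def cp_reconstruct(weights, factors):
    """Assemble a dense N-way array from CP components.

    weights[r] scales the r-th rank-one term (hypothetical layout).
    factors[n][i][r] is entry (i, r) of the mode-n factor matrix.
    The result maps each index tuple to its value.
    """
    rank = len(weights)
    shape = [len(factor) for factor in factors]
    tensor = {}
    for idx in product(*(range(size) for size in shape)):
        # Each entry is a sum over R rank-one terms; each term is a
        # product of one factor entry per mode.
        tensor[idx] = sum(
            weights[r] * prod(factors[n][i][r] for n, i in enumerate(idx))
            for r in range(rank)
        )
    return tensor
```

Building the full array this way costs time and memory proportional to the product of the mode sizes, so it is only practical for small examples.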
GraphScope: parameter-free mining of large time-evolving graphs
In KDD ’07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007
Cited by 151 (12 self)
How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How can we spot discontinuity time-points in such streams of graphs, in an online, anytime fashion? We propose GraphScope, that addresses both problems, using information theoretic principles. Contrary to the majority of earlier methods, it needs no user-defined parameters. Moreover, it is designed to operate on large graphs, in a streaming fashion. We demonstrate the efficiency and effectiveness of our GraphScope on real datasets from several diverse domains. In all cases it produces meaningful time-evolving patterns that agree with human intuition.
Efficient MATLAB computations with sparse and factored tensors
SIAM Journal on Scientific Computing, 2007
Cited by 80 (15 self)
In this paper, the term tensor refers simply to a multidimensional or $N$-way array, and we consider how specially structured tensors allow for efficient storage and computation. First, we study sparse tensors, which have the property that the vast majority of the elements are zero. We propose storing sparse tensors using coordinate format and describe the computational efficiency of this scheme for various mathematical operations, including those typical to tensor decomposition algorithms. Second, we study factored tensors, which have the property that they can be assembled from more basic components. We consider two specific types: A Tucker tensor can be expressed as the product of a core tensor (which itself may be dense, sparse, or factored) and a matrix along each mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We are interested in the case where the storage of the components is less than the storage of the full tensor, and we demonstrate that many elementary operations can be computed using only the components. All of the efficiencies described in this paper are implemented in the Tensor Toolbox for MATLAB.
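The coordinate-format idea in the abstract above can be sketched in a few lines. This is an illustration in Python, not the Tensor Toolbox for MATLAB. The class name `CooTensor` and the method names `norm` and `times_vector` are hypothetical. The sketch assumes each subscript appears at most once.

```python
from collections import defaultdict
from math import sqrt


class CooTensor:
    """Sparse tensor stored as a list of subscripts plus a list of values."""

    def __init__(self, shape, subs, vals):
        self.shape = tuple(shape)
        self.subs = [tuple(s) for s in subs]  # one index tuple per nonzero
        self.vals = list(vals)                # value of each nonzero

    def norm(self):
        # Frobenius norm; only stored nonzeros contribute
        # (assumes no duplicate subscripts).
        return sqrt(sum(v * v for v in self.vals))

    def times_vector(self, vector, mode):
        # Contract one mode with a vector, touching only the nonzeros.
        acc = defaultdict(float)
        for sub, val in zip(self.subs, self.vals):
            rest = sub[:mode] + sub[mode + 1:]
            acc[rest] += val * vector[sub[mode]]
        new_shape = self.shape[:mode] + self.shape[mode + 1:]
        items = [(s, v) for s, v in sorted(acc.items()) if v != 0.0]
        return CooTensor(new_shape, [s for s, _ in items], [v for _, v in items])
```

Both operations run in time proportional to the number of stored nonzeros rather than the full size of the array, which is the kind of saving the abstract refers to.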
Unsupervised multiway data analysis: A literature survey
IEEE Transactions on Knowledge and Data Engineering, 2008
Cited by 80 (10 self)
Multiway data analysis captures multilinear structures in higher-order datasets, where data have more than two modes. Standard two-way methods commonly applied on matrices often fail to find the underlying structures in multiway arrays. With increasing number of application areas, multiway data analysis has become popular as an exploratory analysis tool. We provide a review of significant contributions in literature on multiway models, algorithms as well as their applications in diverse disciplines including chemometrics, neuroscience, computer vision, and social network analysis.
A Three-Way Model for Collective Learning on Multi-Relational Data
Cited by 64 (13 self)
Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a three-way tensor. We show that unlike other tensor approaches, our method is able to perform collective learning via the latent components of the model and provide an efficient algorithm to compute the factorization. We substantiate our theoretical considerations regarding the collective learning capabilities of our model by the means of experiments on both a new dataset and a dataset commonly used in entity resolution. Furthermore, we show on common benchmark datasets that our approach achieves better or on-par results, if compared to current state-of-the-art relational learning solutions, while it is significantly faster to compute.
Scalable tensor decompositions for multi-aspect data mining
In ICDM 2008: Proceedings of the 8th IEEE International Conference on Data Mining, 2008
Cited by 59 (1 self)
Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multiway arrays) provide a natural representation for such data. Consequently, tensor decompositions such as Tucker become important tools for summarization and analysis. One major challenge is how to deal with high-dimensional, sparse data. In other words, how do we compute decompositions of tensors where most of the entries of the tensor are zero. Specialized techniques are needed for computing the Tucker decompositions for sparse tensors because standard algorithms do not account for the sparsity of the data. As a result, a surprising phenomenon is observed by practitioners: Despite the fact that there is enough memory to store both the input tensors and the factorized output tensors, memory overflows occur during the tensor factorization process. To address this intermediate blowup problem, we propose Memory-Efficient Tucker (MET). Based on the available memory, MET adaptively selects the right execution strategy during the decomposition. We provide quantitative and qualitative evaluation of MET on real tensors. It achieves over 1000X space reduction without sacrificing speed; it also allows us to work with much larger tensors that were too big to handle before. Finally, we demonstrate a data mining case study using MET.
Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations
2008
Cited by 49 (13 self)
Nonnegative matrix factorization (NMF) and its extensions such as Nonnegative Tensor Factorization (NTF) have become prominent techniques for blind sources separation (BSS), analysis of image databases, data mining and other information retrieval and clustering applications. In this paper we propose a family of efficient algorithms for NMF/NTF, as well as sparse nonnegative coding and representation, that has many potential applications in computational neuroscience, multisensory processing, compressed sensing and multidimensional data analysis. We have developed a class of optimized local algorithms which are referred to as Hierarchical Alternating Least Squares (HALS) algorithms. For these purposes, we have performed sequential constrained minimization on a set of squared Euclidean distances. We then extend this approach to robust cost functions using the Alpha and Beta divergences and derive flexible update rules. Our algorithms are locally stable and work well for NMF-based blind source separation (BSS) not only for the overdetermined case but also for an underdetermined (overcomplete) case (i.e., for a system which has less sensors than sources) if data are sufficiently sparse. The NMF learning rules are extended and generalized for N-th order nonnegative tensor factorization (NTF). Moreover, these algorithms can be tuned to different noise statistics by adjusting a single parameter. Extensive experimental results confirm the accuracy and computational performance of the developed algorithms, especially, with usage of multilayer hierarchical NMF approach [3].
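The sequential minimization over squared Euclidean distances mentioned in the abstract above can be sketched as a HALS-style update for plain matrix NMF. This is not the authors' implementation and does not cover the Alpha and Beta divergences or the tensor case. The function names, the `eps` floor, and the seeded random initialisation are assumptions of this example.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def sq_error(x, w, h):
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))


def update_columns(x, w, h, eps):
    """One sweep over the columns of w for the model x ~ w h (w changes in place)."""
    xht = matmul(x, transpose(h))  # shape m x r
    hht = matmul(h, transpose(h))  # shape r x r
    rank = len(h)
    for k in range(rank):
        denom = max(hht[k][k], eps)
        for i in range(len(w)):
            # Residual correlation for entry (i, k), using the latest w.
            grad = xht[i][k] - sum(w[i][j] * hht[j][k] for j in range(rank))
            # Closed-form one-column least-squares step, clipped at eps.
            w[i][k] = max(eps, w[i][k] + grad / denom)


def hals_nmf(x, rank, sweeps=50, eps=1e-12, seed=0):
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    ht = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    xt = transpose(x)
    for _ in range(sweeps):
        update_columns(x, w, transpose(ht), eps)   # update w with h fixed
        update_columns(xt, ht, transpose(w), eps)  # update h via x^T ~ h^T w^T
    return w, transpose(ht)
```

Each column update solves a one-column least-squares problem in closed form and then clips the result. This is why such updates are called local.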
Factorizing YAGO: scalable machine learning for linked data
In WWW, 2012
Cited by 45 (13 self)
Vast amounts of structured information have been published in the Semantic Web’s Linked Open Data (LOD) cloud and their size is still growing rapidly. Yet, access to this information via reasoning and querying is sometimes difficult, due to LOD’s size, partial data inconsistencies and inherent noisiness. Machine Learning offers an alternative approach to exploiting LOD’s data with the advantages that Machine Learning algorithms are typically robust to both noise and data inconsistencies and are able to efficiently utilize nondeterministic dependencies in the data. From a Machine Learning point of view, LOD is challenging due to its relational nature and its scale. Here, we present an efficient approach to relational learning on LOD data, based on the factorization of a sparse tensor that scales to data consisting
Proximity Tracking on Time-Evolving Bipartite Graphs
Cited by 35 (5 self)
Given an author-conference network that evolves over time, which are the conferences that a given author is most closely related with, and how do they change over time? Large time-evolving bipartite graphs appear in many settings, such as social networks, co-citations, market-basket analysis, and collaborative filtering. Our goal is to monitor (i) the centrality of an individual node (e.g., who are the most important authors?); and (ii) the proximity of two nodes or sets of nodes (e.g., who are the most important authors with respect to a particular conference?) Moreover, we want to do this efficiently and incrementally, and to provide “anytime” answers. We propose pTrack and cTrack, which are based on random walk with restart, and use powerful matrix tools. Experiments on real data show that our methods are effective and efficient: the mining results agree with intuition; and we achieve up to 15∼176 times speedup, without any quality loss.
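The abstract above names random walk with restart as the basis of pTrack and cTrack. The sketch below shows only that static building block on a fixed graph, computed by power iteration. It is not the incremental tracking method of the paper. The function name, the dense adjacency-list input, and the default restart probability of 0.15 are assumptions of this example.

```python
def random_walk_with_restart(adj, start, restart=0.15, iters=100):
    """Proximity of every node to `start` on a weighted graph.

    adj[u][v] is the non-negative edge weight from u to v. For a bipartite
    author-conference graph, authors and conferences share one node
    numbering and edges only connect the two groups.
    """
    n = len(adj)
    out_weight = [sum(row) for row in adj]
    if any(total == 0 for total in out_weight):
        raise ValueError("every node needs at least one edge in this sketch")
    scores = [0.0] * n
    scores[start] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for u in range(n):
            for v, weight in enumerate(adj[u]):
                if weight:
                    # Follow an edge with probability (1 - restart).
                    nxt[v] += (1.0 - restart) * scores[u] * weight / out_weight[u]
        # Jump back to the start node with probability `restart`.
        nxt[start] += restart
        scores = nxt
    return scores
```

A higher score means a node is reached more often by a walker that keeps returning to the start node, which is one common way to rank the conferences closest to a given author.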
Probabilistic models for incomplete multidimensional arrays
In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics, 2009
Cited by 31 (2 self)
In multiway data, each sample is measured by multiple sets of correlated attributes. We develop a probabilistic framework for modeling structural dependency from partially observed multidimensional array data, known as pTucker. Latent components associated with individual array dimensions are jointly retrieved while the core tensor is integrated out. The resulting algorithm is capable of handling large-scale data sets. We verify the usefulness of this approach by comparing against classical models on applications to modeling amino acid fluorescence, collaborative filtering and a number of benchmark multiway array data.