Results 1–10 of 17
Tensor Decompositions and Applications
 SIAM Review, 2009
Abstract

Cited by 228 (14 self)
This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N-way array. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, etc. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal components analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2, as well as nonnegative variants of all of the above. The N-way Toolbox and Tensor Toolbox, both for MATLAB, and the Multilinear Engine are examples of software packages for working with tensors.
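The CP model mentioned in this abstract can be sketched in a few lines of NumPy. The alternating-least-squares (ALS) loop below is a minimal illustration, not the N-way Toolbox or Tensor Toolbox implementation; the function names and unfolding conventions are our own.

```python
import numpy as np

def khatri_rao(P, Q):
    # Column-wise Kronecker product; rows ordered with Q's index varying fastest.
    R = P.shape[1]
    return (P[:, None, :] * Q[None, :, :]).reshape(-1, R)

def cp_als(X, rank, n_iter=200, seed=0):
    """Rank-`rank` CP decomposition of a 3-way array X via alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-n unfoldings matched to the Khatri-Rao row ordering above.
    X0 = X.transpose(0, 2, 1).reshape(I, -1)
    X1 = X.transpose(1, 2, 0).reshape(J, -1)
    X2 = X.transpose(2, 1, 0).reshape(K, -1)
    for _ in range(n_iter):
        A = X0 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X1 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = X2 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
    return A, B, C

def cp_reconstruct(A, B, C):
    # X[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r]
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

On a tensor that is exactly low-rank, this loop typically recovers the model to high accuracy; production toolboxes add normalization, convergence checks, and better initialization.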
Unsupervised multiway data analysis: A literature survey
 IEEE Transactions on Knowledge and Data Engineering, 2008
Abstract

Cited by 42 (8 self)
Multiway data analysis captures multilinear structures in higher-order datasets, where data have more than two modes. Standard two-way methods commonly applied to matrices often fail to find the underlying structures in multiway arrays. With an increasing number of application areas, multiway data analysis has become popular as an exploratory analysis tool. We provide a review of significant contributions in the literature on multiway models and algorithms, as well as their applications in diverse disciplines including chemometrics, neuroscience, computer vision, and social network analysis.
RELATIVE-ERROR CUR MATRIX DECOMPOSITIONS
 SIAM J. Matrix Anal. Appl., 2008
Abstract

Cited by 39 (9 self)
Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of “components.” Typically, these components are linear combinations of the rows and columns of the matrix, and are thus difficult to interpret in terms of the original features of the input data. In this paper, we propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and are thereby more amenable to interpretation in terms of the original data. Our main algorithmic results are two randomized algorithms which take as input an m × n matrix A and a rank parameter k. In our first algorithm, C is chosen, and we let A′ = CC⁺A, where C⁺ is the Moore–Penrose generalized inverse of C. In our second algorithm C, U, R are chosen, and we let A′ = CUR. (C and R are matrices that consist of actual columns and rows, respectively, of A, and U is a generalized inverse of their intersection.) For each algorithm, we show that with probability at least 1 − δ, ‖A − A′‖_F ≤ (1 + ε)‖A − A_k‖_F, where A_k is the “best” rank-k approximation provided by truncating the SVD of A, and where ‖X‖_F is the Frobenius norm of the matrix X. The number of columns of C and rows of R is a low-degree polynomial in k, 1/ε, and log(1/δ). Both the Numerical Linear Algebra community and the Theoretical Computer Science community have studied variants of these matrix decompositions.
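Both constructions in this abstract, A′ = CC⁺A and A′ = CUR, are easy to state in NumPy once the column and row indices are fixed. The sketch below is illustrative only: the paper's actual contribution is *how* the indices are sampled (via subspace-based probabilities), which is omitted here.

```python
import numpy as np

def cur_decomposition(A, col_idx, row_idx):
    """Build A' = C @ U @ R, where C and R are actual columns/rows of A
    and U is the Moore-Penrose pseudoinverse of their intersection W."""
    C = A[:, col_idx]                              # m x c, actual columns
    R = A[row_idx, :]                              # r x n, actual rows
    U = np.linalg.pinv(A[np.ix_(row_idx, col_idx)])  # c x r
    return C, U, R

def column_projection(A, col_idx):
    """The simpler variant A' = C @ C^+ @ A (projection onto span(C))."""
    C = A[:, col_idx]
    return C @ np.linalg.pinv(C) @ A
```

For a matrix of exact rank k, choosing any k independent columns and k independent rows makes both approximations exact; for general matrices the sampling distribution is what drives the (1 + ε) relative-error guarantee.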
An Improved Approximation Algorithm for the Column Subset Selection Problem
Abstract

Cited by 28 (2 self)
We consider the problem of selecting the “best” subset of exactly k columns from an m × n matrix A. In particular, we present and analyze a novel two-stage algorithm that runs in O(min{mn², m²n}) time and returns as output an m × k matrix C consisting of exactly k columns of A. In the first stage (the randomized stage), the algorithm randomly selects O(k log k) columns according to a judiciously chosen probability distribution that depends on information in the top-k right singular subspace of A. In the second stage (the deterministic stage), the algorithm applies a deterministic column-selection procedure to select and return exactly k columns from the set of columns selected in the first stage. Let C be the m × k matrix containing those k columns, let P_C denote the projection matrix onto the span of those columns, and let A_k denote the “best” rank-k approximation to the matrix A as computed with the singular value decomposition. Then, we prove that ‖A − P_C A‖₂ ≤ O(k^(3/4) log^(1/2)(k) (n − k)^(1/4)) ‖A − A_k‖₂.
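The two-stage idea can be sketched schematically. In the version below, the randomized stage samples columns by leverage scores from the top-k right singular subspace, as the abstract describes; the deterministic stage uses a simple greedy residual pivot as a stand-in for the rank-revealing QR procedure of the paper, and the oversampling factor is an arbitrary illustrative choice.

```python
import numpy as np

def leverage_scores(A, k):
    """Normalized leverage scores from the top-k right singular subspace."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return (Vt[:k, :] ** 2).sum(axis=0) / k      # sums to 1

def greedy_pivoted_columns(C, k):
    """Deterministic stage (illustrative stand-in for a rank-revealing QR):
    repeatedly pick the column with the largest residual norm."""
    C = C.astype(float).copy()
    chosen = []
    for _ in range(k):
        j = int(np.argmax((C ** 2).sum(axis=0)))
        chosen.append(j)
        q = C[:, j] / np.linalg.norm(C[:, j])
        C -= np.outer(q, q @ C)                  # project the pivot out
    return chosen

def two_stage_css(A, k, oversample=4, seed=0):
    rng = np.random.default_rng(seed)
    p = leverage_scores(A, k)
    c = min(A.shape[1], oversample * k)          # ~O(k log k) in the paper
    stage1 = rng.choice(A.shape[1], size=c, replace=False, p=p)
    local = greedy_pivoted_columns(A[:, stage1], k)
    return [int(stage1[j]) for j in local]
```

On a matrix of exact rank k, any k independent columns span the column space, so the selected C satisfies P_C A = A; the paper's bound concerns the much harder general case.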
Unsupervised Feature Selection for Principal Components Analysis [Extended Abstract]
Abstract

Cited by 12 (1 self)
Principal Components Analysis (PCA) is the predominant linear dimensionality reduction technique, and has been widely applied on datasets in all scientific domains. We consider, both theoretically and empirically, the topic of unsupervised feature selection for PCA, by leveraging algorithms for the so-called Column Subset Selection Problem (CSSP). In words, the CSSP seeks the “best” subset of exactly k columns from an m × n data matrix A, and has been extensively studied in the Numerical Linear Algebra community. We present a novel two-stage algorithm for the CSSP. From a theoretical perspective, for small to moderate values of k, this algorithm significantly improves upon the best previously existing results [24, 12] for the CSSP. From an empirical perspective, we evaluate this algorithm as an unsupervised feature selection strategy in three application domains of modern statistical data analysis: finance, document-term data, and genetics. We pay particular attention to how this algorithm may be used to select representative or landmark features from an object-feature matrix in an unsupervised manner. In all three application domains, we are able to identify k landmark features, i.e., columns of the data matrix, that capture nearly the same amount of information as does the subspace that is spanned by the top k “eigenfeatures.”
A Unified Framework for Providing Recommendations in Social Tagging Systems Based on Ternary Semantic Analysis
Abstract

Cited by 6 (3 self)
Abstract—Social tagging is the process by which many users add metadata, in the form of keywords, to annotate and categorize items (songs, pictures, web links, products, etc.). Social tagging systems (STSs) can provide three different types of recommendations: they can recommend 1) tags to users, based on what tags other users have used for the same items, 2) items to users, based on tags they have in common with other similar users, and 3) users with common social interests, based on common tags on similar items. However, users may have different interests for an item, and items may have multiple facets. In contrast to the current recommendation algorithms, our approach develops a unified framework to model the three types of entities that exist in a social tagging system: users, items, and tags. These data are modeled by a 3-order tensor, on which multiway latent semantic analysis and dimensionality reduction are performed using both the Higher Order Singular Value Decomposition (HOSVD) method and the Kernel-SVD smoothing technique. We perform an experimental comparison of the proposed method against state-of-the-art recommendation algorithms with two real data sets (Last.fm and BibSonomy). Our results show significant improvements in terms of effectiveness measured through recall/precision. Index Terms—Social tags, recommender systems, tensors, HOSVD.
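The HOSVD step at the core of this pipeline can be sketched directly in NumPy: compute the top singular vectors of each mode-n unfolding, then form the core tensor. This is a minimal version without the Kernel-SVD smoothing or the recommendation layer; all names are our own.

```python
import numpy as np

def unfold(X, mode):
    # Mode-n unfolding: rows indexed by the chosen mode.
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated HOSVD of a 3-way tensor (e.g. user x item x tag):
    factor U_n = top-r_n left singular vectors of the mode-n unfolding,
    core G = X multiplied by U_n^T along each mode."""
    Us = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        Us.append(U[:, :r])
    G = np.einsum('ijk,ia,jb,kc->abc', X, Us[0], Us[1], Us[2])
    return G, Us

def reconstruct(G, Us):
    # Low-multilinear-rank approximation; in the paper this smoothed tensor
    # is what the tag/item/user recommendations are read off from.
    return np.einsum('abc,ia,jb,kc->ijk', G, Us[0], Us[1], Us[2])
```

When the tensor has exact multilinear rank equal to `ranks`, the truncated HOSVD reconstructs it exactly; on real tagging data the truncation acts as the dimensionality reduction described above.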
MACH: Fast Randomized Tensor Decompositions
, 2010
Abstract

Cited by 5 (2 self)
Tensors naturally model many real-world processes which generate multi-aspect data. Such processes appear in many different research disciplines, e.g., chemometrics, computer vision, psychometrics, and neuroimaging analysis. Tensor decompositions such as the Tucker decomposition are used to analyze multi-aspect data and extract latent factors, which capture the multilinear data structure. Such decompositions are powerful mining tools for extracting patterns from large data volumes. However, the most frequently used algorithms for such decompositions involve the computationally expensive Singular Value Decomposition. In this paper we propose MACH, a new sampling algorithm to compute such decompositions. Our method is of significant practical value for tensor streams, such as environmental monitoring systems and IP traffic matrices over time, where large amounts of data are accumulated and the analysis is computationally intensive, but also in “post-mortem” data analysis cases where the tensor does not fit in the available memory. We provide the theoretical analysis of our proposed method and verify its efficacy on synthetic data and two real-world monitoring system applications.
Random walks in time-graphs
 in Proc. 2nd Int. Workshop on Mobile Opportunistic Networking, 2010
Abstract

Cited by 4 (0 self)
Dynamic networks are characterized by topologies that vary with time and are represented by time-graphs. The notion of connectivity in time-graphs is fundamentally different from that in static graphs. End-to-end connectivity is achieved opportunistically by the store-forward-carry paradigm if the network is so sparse that source-destination pairs are usually not connected by complete paths. In static graphs, it is well known that network connectivity is tied to the spectral gap of the underlying adjacency matrix of the topology: if the gap is large, the network is well connected and a random walk on this graph has a small hitting time. In this paper, we investigate a similar metric for time-graphs, one that indicates how quickly opportunistic methods deliver packets to destinations, how fast estimates of a quantity converge, how quickly protocol parameters can be optimized online, etc. To this end, a time-graph is represented by a 3-mode reachability tensor which indicates whether a vertex is reachable from another vertex within t steps. Our observations from an extensive set of simulations show that the correlation between the expected hitting time of a random walk in the time-graph (following a non-homogeneous Markov chain) and the second singular value of the matrix obtained by unfolding the reachability tensor is significant, above 90%.
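The 3-mode reachability tensor described here can be built from a sequence of per-step adjacency matrices. The sketch below uses store-carry-forward semantics (a node may hold a packet across steps, so each step multiplies by I + A_t); the names and the unfolding choice are our own illustrative conventions, not necessarily the paper's.

```python
import numpy as np

def reachability_tensor(adjs):
    """3-mode tensor R[t, i, j] = 1 if node j is reachable from node i
    within the first t+1 snapshots, under store-carry-forward semantics."""
    n = adjs[0].shape[0]
    T = len(adjs)
    R = np.zeros((T, n, n))
    P = np.eye(n)                                # reachable in 0 steps: self
    for t, A in enumerate(adjs):
        P = (P @ (np.eye(n) + A) > 0).astype(float)  # hold or forward
        R[t] = P
    return R

def second_singular_value(R):
    # Unfold along the time mode and take the second singular value,
    # the quantity the paper correlates with expected hitting time.
    s = np.linalg.svd(R.reshape(R.shape[0], -1), compute_uv=False)
    return s[1]
```

As a tiny example, with an edge 0-1 in the first snapshot and 1-2 in the second, node 2 is unreachable from node 0 within one step but reachable within two (node 1 carries the packet).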
Dynamic Texture Analysis and Synthesis using Tensor Decomposition
Abstract

Cited by 1 (1 self)
Abstract. Dynamic textures are sequences of images showing temporal regularity, such as smoke, flames, flowing water, or moving grass. Despite being a multidimensional signal, existing models reshape the dynamic texture into a 2D signal for analysis. In this article, we propose to directly decompose the multidimensional (tensor) signal, free from reshaping operations. We show that decomposition techniques originally applied to study psychometric or chemometric data can be used for this purpose. Since spatial, time, and color information are analyzed at the same time, such techniques yield more compact models. Only one third or fewer of the model coefficients are needed for the same quality and synthesis cost as 2D-based models, as illustrated by experiments on real dynamic textures.