The why and how of nonnegative matrix factorization
 REGULARIZATION, OPTIMIZATION, KERNELS, AND SUPPORT VECTOR MACHINES. CHAPMAN & HALL/CRC
, 2014
Rank regularization and Bayesian inference for tensor completion and extrapolation. arXiv preprint arXiv:1301.7619
, 2013
Cited by 6 (4 self)
... factors capturing the tensor’s rank is proposed in this paper, as the key enabler for completion of three-way data arrays with missing entries. Set in a Bayesian framework, the tensor completion method incorporates prior information to enhance its smoothing and prediction capabilities. This probabilistic approach can naturally accommodate general models for the data distribution, lending itself to various fitting criteria that yield optimum estimates in the maximum-a-posteriori sense. In particular, two algorithms are devised for Gaussian- and Poisson-distributed data, that minimize the rank-regularized least-squares error and Kullback-Leibler divergence, respectively. The proposed technique is able to recover the “ground-truth” tensor rank when tested on synthetic data, and to complete brain imaging and yeast gene expression datasets with 50% and 15% of missing entries respectively, resulting in recovery errors at and. Index Terms: Bayesian inference, low-rank, missing data, Poisson process, tensor.
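As a rough illustration of regularized low-rank completion of a three-way array, the sketch below fits PARAFAC (CP) factors to the observed entries only, with a squared Frobenius-norm penalty on the factors, using alternating ridge regressions. This is a swapped-in simplification and not the paper's Bayesian algorithm; the function names, the penalty weight `lam`, and the default rank are assumptions made for the example.

```python
import numpy as np

def cp_reconstruct(factors):
    """Rebuild a three-way tensor from its CP/PARAFAC factor matrices."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def objective(X, mask, factors, lam):
    """Squared error on observed entries plus a ridge penalty on all factors."""
    resid = mask * (X - cp_reconstruct(factors))
    return np.sum(resid ** 2) + lam * sum(np.sum(F ** 2) for F in factors)

def update_mode(X, mask, factors, mode, lam):
    """Exactly minimize the objective over one factor, one row at a time."""
    first, second = [F for m, F in enumerate(factors) if m != mode]
    rank = first.shape[1]
    # Khatri-Rao product of the other two factors; row order matches the unfolding below.
    Z = np.einsum("jr,kr->jkr", first, second).reshape(-1, rank)
    X_unf = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
    M_unf = np.moveaxis(mask, mode, 0).reshape(X.shape[mode], -1)
    new = np.empty((X.shape[mode], rank))
    for i in range(X.shape[mode]):
        w = M_unf[i]  # 1.0 where the entry is observed, 0.0 where it is missing
        G = Z.T @ (w[:, None] * Z) + lam * np.eye(rank)
        new[i] = np.linalg.solve(G, Z.T @ (w * X_unf[i]))
    factors[mode] = new

def complete_tensor(X, mask, rank=3, lam=0.1, sweeps=50, seed=0):
    """Return CP factors and the objective value after each sweep (assumes lam > 0)."""
    rng = np.random.default_rng(seed)
    mask = np.asarray(mask, dtype=float)
    X = np.where(mask > 0, X, 0.0)  # values stored at missing entries are ignored
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    history = [objective(X, mask, factors, lam)]
    for _ in range(sweeps):
        for mode in range(3):
            update_mode(X, mask, factors, mode, lam)
        history.append(objective(X, mask, factors, lam))
    return factors, history
```

Calling `cp_reconstruct(factors)` on the returned factors gives an estimate of the full array, including the entries that were masked out. Because each row update is an exact ridge solve, the recorded objective cannot increase from one sweep to the next.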
Giannakis, “Inference of Poisson count processes using low-rank tensor data,”
 in Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference on
, 2013
Cited by 4 (0 self)
A novel regularizer capturing the tensor rank is introduced in this paper as the key enabler for completion of three-way data arrays with missing entries. The novel regularized imputation approach induces sparsity in the factors of the tensor’s PARAFAC decomposition, thus reducing its rank. The focus is on count processes which emerge in diverse applications ranging from genomics to computer and social networking. Based on Poisson count data, a maximum a posteriori (MAP) estimator is developed using the Kullback-Leibler divergence criterion. This probabilistic approach also facilitates incorporation of correlated priors regularizing the rank, while endowing the tensor imputation method with extra smoothing and prediction capabilities. Tests on simulated and real datasets corroborate the sparsifying regularization effect, and demonstrate recovery of 15% missing RNA-sequencing data with an inference error of −12 dB. Index Terms: Tensor, low-rank, missing data, Poisson processes.
Zero-Truncated Poisson Tensor Factorization for Massive Binary Tensors
Cited by 2 (1 self)
We present a scalable Bayesian model for low-rank factorization of massive tensors with binary observations. The proposed model has the following key properties: (1) in contrast to the models based on the logistic or probit likelihood, using a zero-truncated Poisson likelihood for binary data allows our model to scale up in the number of ones in the tensor, which is especially appealing for massive but sparse binary tensors; (2) side-information in the form of binary pairwise relationships (e.g., an adjacency network) between objects in any tensor mode can also be leveraged, which can be especially useful in “cold-start” settings; and (3) the model admits simple Bayesian inference via batch, as well as online MCMC; the latter allows scaling up even for dense binary data (i.e., when the number of ones in the tensor/network is also massive). In addition, nonnegative factor matrices in our model provide easy interpretability, and the tensor rank can be inferred from the data. We evaluate our model on several large-scale real-world binary tensors, achieving excellent computational scalability, and also demonstrate its usefulness in leveraging side-information provided in the form of mode-network(s).
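To make the first property concrete, here is a minimal sketch of one common way to link binary data to a Poisson model, assumed here for illustration: each binary entry is 1 exactly when a latent Poisson count is at least one, so P(b = 1) = 1 - exp(-rate), and the latent count behind a one follows a zero-truncated Poisson. The function names and the CP parameterization of the rates are assumptions for the example and are not taken from the paper. The sparse version shows why the likelihood only needs to visit the ones: the total rate over all cells factorizes across modes.

```python
import numpy as np

def cp_rates(A, B, C):
    """Poisson rates rate[i,j,k] = sum_r A[i,r] * B[j,r] * C[k,r] (positive factors)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def loglik_dense(b, rates):
    """Sum over every cell: log(1 - exp(-rate)) for ones, -rate for zeros."""
    log_p1 = np.log(-np.expm1(-rates))
    return np.sum(np.where(b == 1, log_p1, -rates))

def loglik_sparse(ones_idx, A, B, C):
    """Same value as loglik_dense, touching only the cells equal to one."""
    i, j, k = ones_idx
    rate_at_ones = np.sum(A[i] * B[j] * C[k], axis=1)
    # The total rate over all cells factorizes, so no dense tensor is ever formed.
    total_rate = np.sum(A.sum(axis=0) * B.sum(axis=0) * C.sum(axis=0))
    return np.sum(np.log(-np.expm1(-rate_at_ones)) + rate_at_ones) - total_rate

def sample_zero_truncated_poisson(rate, size, rng):
    """Draw counts from Poisson(rate) conditioned on being at least one (rejection)."""
    counts = rng.poisson(rate, size)
    zero = counts == 0
    while zero.any():
        counts[zero] = rng.poisson(rate, int(zero.sum()))
        zero = counts == 0
    return counts
```

For a tensor stored as a list of coordinates of its ones, `loglik_sparse(np.nonzero(b), A, B, C)` costs time proportional to the number of ones times the rank, rather than to the full tensor size.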
Bayesian Poisson tensor factorization for inferring multilateral relations from sparse dyadic event counts.
 In KDD,
, 2015
Cited by 1 (0 self)
We present a Bayesian tensor factorization model for inferring latent group structures from dynamic pairwise interaction patterns. For decades, political scientists have collected and analyzed records of the form "country i took action a toward country j at time t" (known as dyadic events) in order to form and test theories of international relations. We represent these event data as a tensor of counts and develop Bayesian Poisson tensor factorization to infer a low-dimensional, interpretable representation of their salient patterns. We demonstrate that our model's predictive performance is better than that of standard nonnegative tensor factorization methods. We also provide a comparison of our variational updates to their maximum likelihood counterparts. In doing so, we identify a better way to form point estimates of the latent factors than that typically used in Bayesian Poisson matrix factorization. Finally, we showcase our model as an exploratory analysis tool for political scientists. We show that the inferred latent factor matrices capture interpretable multilateral relations that both conform to and inform our knowledge of international affairs. Keywords: Poisson tensor factorization, Bayesian inference, dyadic data, international relations
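For orientation, the sketch below shows a generic maximum-likelihood Poisson factorization of a three-way count tensor using multiplicative updates that decrease the generalized Kullback-Leibler divergence. It is not the paper's variational Bayesian method; it only illustrates the kind of non-Bayesian baseline the abstract compares against. The function names, the three-way restriction, and the small constant `eps` are assumptions for the example.

```python
import numpy as np

def reconstruct(A, B, C):
    """Expected counts under a rank-R CP model with nonnegative factors."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def generalized_kl(X, X_hat, eps=1e-12):
    """Generalized KL divergence D(X || X_hat), with 0 * log 0 taken as 0."""
    safe_X = np.where(X > 0, X, 1.0)
    log_term = np.where(X > 0, X * np.log(safe_X / (X_hat + eps)), 0.0)
    return np.sum(log_term - X + X_hat)

def poisson_cp(X, rank=2, iters=100, seed=0, eps=1e-12):
    """Multiplicative updates for Poisson (KL) CP factorization of a count tensor."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, rank)) + 0.1 for n in X.shape)
    history = []
    for _ in range(iters):
        # Each factor is rescaled by the model-weighted ratio of data to reconstruction.
        ratio = X / (reconstruct(A, B, C) + eps)
        A = A * np.einsum("ijk,jr,kr->ir", ratio, B, C) / (B.sum(0) * C.sum(0) + eps)
        ratio = X / (reconstruct(A, B, C) + eps)
        B = B * np.einsum("ijk,ir,kr->jr", ratio, A, C) / (A.sum(0) * C.sum(0) + eps)
        ratio = X / (reconstruct(A, B, C) + eps)
        C = C * np.einsum("ijk,ir,jr->kr", ratio, A, B) / (A.sum(0) * B.sum(0) + eps)
        history.append(generalized_kl(X, reconstruct(A, B, C)))
    return (A, B, C), history
```

The updates keep every factor entry nonnegative by construction, which is what makes the columns readable as additive components of the count data.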
Walk’n’Merge: A Scalable Algorithm for Boolean Tensor Factorization
 In ICDM ’13
, 2013
Cited by 1 (1 self)
Tensors are becoming increasingly common in data mining, and consequently, tensor factorizations are becoming more important tools for data miners. When the data is binary, it is natural to ask if we can factorize it into binary factors while simultaneously making sure that the reconstructed tensor is still binary. Such factorizations, called Boolean tensor factorizations, can provide improved interpretability and find Boolean structure that is hard to express using normal factorizations. Unfortunately the algorithms for computing Boolean tensor factorizations do not usually scale well. In this paper we present a novel algorithm for finding Boolean CP and Tucker decompositions of large and sparse binary tensors. In our experimental evaluation we show that our algorithm can handle large tensors and accurately reconstructs the latent Boolean structure. Keywords: Tensor factorizations; Boolean tensors; Random walks; MDL principle
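A small sketch may help show what keeps the reconstructed tensor binary. In a Boolean CP decomposition the rank-1 terms are combined with logical OR instead of addition, so a cell covered by two terms is still 1 rather than 2. The function names below are hypothetical, and the code only evaluates a given decomposition; it does not implement the paper's algorithm for finding one.

```python
import numpy as np

def boolean_cp(A, B, C):
    """Boolean CP: X[i,j,k] = OR over r of (A[i,r] AND B[j,r] AND C[k,r])."""
    A, B, C = (np.asarray(F, dtype=bool) for F in (A, B, C))
    # Count the rank-1 terms covering each cell, then threshold so that 1 + 1 = 1.
    cover = np.einsum("ir,jr,kr->ijk", A.astype(int), B.astype(int), C.astype(int))
    return cover > 0

def reconstruction_error(X, A, B, C):
    """Number of cells where the Boolean reconstruction disagrees with X."""
    return int(np.sum(np.asarray(X, dtype=bool) != boolean_cp(A, B, C)))
```

Here each factor matrix has one row per index of its mode and one column per rank-1 term, and `reconstruction_error` counts the mismatched cells that a factorization algorithm would try to minimize.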
Scalable Boolean tensor factorizations using random walks. arXiv
, 2013
Cited by 1 (1 self)
Tensors are becoming increasingly common in data mining, and consequently, tensor factorizations are becoming more and more important tools for data miners. When the data is binary, it is natural to ask if we can factorize it into binary factors while simultaneously making sure that the reconstructed tensor is still binary. Such factorizations, called Boolean tensor factorizations, can provide improved interpretability and find Boolean structure that is hard to express using normal factorizations. Unfortunately the algorithms for computing Boolean tensor factorizations do not usually scale well. In this paper we present a novel algorithm for finding Boolean CP and Tucker decompositions of large and sparse binary tensors. In our experimental evaluation we show that our algorithm can handle large tensors and accurately reconstructs the latent Boolean structure.
Sparse Robust Matrix Tri-factorization with Application to Cancer Genomics
"... Nonnegative matrix tri-factorization (NMTF) ..."
Rubik: Knowledge Guided Tensor Factorization and Completion for Health Data Analytics
Computational phenotyping is the process of converting heterogeneous electronic health records (EHRs) into meaningful clinical concepts. Unsupervised phenotyping methods have the potential to leverage a vast amount of labeled EHR data for phenotype discovery. However, existing unsupervised phenotyping methods do not incorporate current medical knowledge and cannot directly handle missing or noisy data. We propose Rubik, a constrained nonnegative tensor factorization and completion method for phenotyping. Rubik incorporates 1) guidance constraints to align with existing medical knowledge, and 2) pairwise constraints for obtaining distinct, non-overlapping phenotypes. Rubik also has built-in tensor completion that can significantly alleviate the impact of noisy and missing data. We apply the Alternating Direction Method of Multipliers (ADMM) framework to tensor factorization and completion, which can be easily scaled through parallel computing. We evaluate Rubik on two EHR datasets, one of which contains 647,118 records for 7,744 patients from an outpatient clinic, the other of which is a public dataset containing 1,018,614 CMS claims records for 472,645 patients. Our results show that Rubik can discover more meaningful and distinct phenotypes than the baselines. In particular, by using knowledge guidance constraints, Rubik can also discover sub-phenotypes for several major diseases. Rubik also runs around seven times faster than current state-of-the-art tensor methods. Finally, Rubik is scalable to large datasets containing millions of EHR records.
Analysis of Large-scale Traffic Dynamics in an Urban Transportation Network using Nonnegative Tensor Factorization
In this paper, we present our work on clustering and prediction of the temporal evolution of global congestion configurations in a large-scale urban transportation network. Instead of looking into temporal variations of traffic flow states of individual links, we focus on the temporal evolution of the complete spatial configuration of congestion over the network. We aim to describe the typical temporal patterns of the global traffic states and to achieve long-term prediction of the large-scale traffic evolution in a unified data-mining framework. To this end, we formulate this joint task using regularized Nonnegative Tensor Factorization, which has been shown to be a useful analysis tool for spatio-temporal data sequences. Clustering and prediction are performed based on the compact tensor factorization results. The validity of the proposed spatio-temporal traffic data analysis method is shown in experiments using simulated realistic traffic data.