Results 1–10 of 19
Finding Approximate POMDP Solutions Through Belief Compression
, 2003
Abstract

Cited by 63 (2 self)
Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control, even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional manifold embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this manifold can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta, & Schapire, 2002) to represent sparse, high-dimensional belief spaces using low-dimensional sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.
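The belief-compression step described in this abstract can be sketched in a few lines. The sketch below substitutes plain PCA for the Exponential family PCA the paper actually uses, so it only illustrates the dimensionality-reduction idea; every size and variable name is invented for illustration.

```python
import numpy as np

# Sketch of belief compression with plain PCA standing in for the
# Exponential family PCA used in the paper; belief vectors are
# probability distributions over states, and every size and name
# here is invented for illustration.
rng = np.random.default_rng(0)
n_states, n_beliefs, n_features = 100, 500, 3

# Sparse beliefs concentrated on a few states (each row sums to 1).
raw = rng.dirichlet(alpha=np.full(n_states, 0.05), size=n_beliefs)

# Center the data and keep the top principal directions via SVD.
mean = raw.mean(axis=0)
U, S, Vt = np.linalg.svd(raw - mean, full_matrices=False)
components = Vt[:n_features]        # learned belief-space features

# Each belief is now summarized by n_features numbers, not n_states,
# and planning would proceed in this low-dimensional space.
compressed = (raw - mean) @ components.T
reconstructed = compressed @ components + mean
```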
Relational Learning via Collective Matrix Factorization
, 2008
Abstract

Cited by 60 (3 self)
Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations would then encode users' ratings of movies, movies' genres, and actors' roles in movies. A common prediction technique given one pairwise relation, for example a #users × #movies ratings matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; so, we allow nonlinear relationships between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to our model, and derive an efficient Newton update for the projection. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems. Our model can handle any pairwise relational schema and a ...
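As a rough illustration of the shared-factor idea (not the paper's Bregman/Newton formulation), the sketch below jointly factors two relations under squared loss with alternating ridge-regularized least squares, sharing the factor of the entity that appears in both relations; the shapes and the regularizer `lam` are arbitrary choices.

```python
import numpy as np

# Illustrative collective matrix factorization under squared loss:
# X (users x movies) ~ U V^T and Y (movies x genres) ~ V W^T share
# the movie factor V. Alternating ridge-regularized least squares;
# all shapes and the regularizer lam are invented for this sketch.
rng = np.random.default_rng(1)
n_users, n_movies, n_genres, k, lam = 30, 40, 8, 5, 0.1

X = rng.random((n_users, n_movies))
Y = rng.random((n_movies, n_genres))
U = rng.standard_normal((n_users, k))
V = rng.standard_normal((n_movies, k))
W = rng.standard_normal((n_genres, k))
I = lam * np.eye(k)

def loss():
    return (np.linalg.norm(X - U @ V.T) ** 2
            + np.linalg.norm(Y - V @ W.T) ** 2)

loss_before = loss()
for _ in range(50):
    U = X @ V @ np.linalg.inv(V.T @ V + I)
    # V appears in both relations, so its update pools both of them.
    V = (X.T @ U + Y @ W) @ np.linalg.inv(U.T @ U + W.T @ W + I)
    W = Y.T @ V @ np.linalg.inv(V.T @ V + I)
loss_after = loss()
```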
Learning with Matrix Factorization
, 2004
Abstract

Cited by 39 (4 self)
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent ...
A Unified View of Matrix Factorization Models
Abstract

Cited by 33 (0 self)
We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix co-clustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
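One concrete instance of the modeling choices this abstract describes is factoring a binary matrix under the logistic (Bernoulli) matching loss. The sketch below uses plain gradient descent rather than the paper's alternating Newton projections; the step size, shapes, and iteration count are all illustrative.

```python
import numpy as np

# Illustrative instance of the unified view: a binary matrix is
# factored as X ~ sigmoid(U V^T) under logistic (Bernoulli) loss,
# here by plain gradient descent rather than Newton projections.
# Shapes, step size, and iteration count are arbitrary choices.
rng = np.random.default_rng(4)
m, n, k, lr = 25, 20, 3, 0.5

X = (rng.random((m, n)) < 0.5).astype(float)   # binary data
U = 0.01 * rng.standard_normal((m, k))
V = 0.01 * rng.standard_normal((n, k))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logloss():
    P = np.clip(sigmoid(U @ V.T), 1e-9, 1 - 1e-9)
    return -np.mean(X * np.log(P) + (1 - X) * np.log(1 - P))

loss_before = logloss()
for _ in range(300):
    G = sigmoid(U @ V.T) - X    # gradient of the loss w.r.t. U V^T
    U, V = U - lr * (G @ V) / n, V - lr * (G.T @ U) / m
loss_after = logloss()
```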
Model-based overlapping clustering
 In KDD
, 2005
Abstract

Cited by 29 (6 self)
While the vast majority of clustering algorithms are partitional, many real-world datasets have inherently overlapping clusters. Several approaches to finding overlapping clusters have come from work on analysis of biological datasets. In this paper, we interpret an overlapping clustering model proposed by Segal et al. [23] as a generalization of Gaussian mixture models, and we extend it to an overlapping clustering model based on mixtures of any regular exponential family distribution and the corresponding Bregman divergence. We provide the necessary algorithm modifications for this extension, and present results on synthetic data as well as subsets of the 20 Newsgroups and EachMovie datasets.
Nonnegative matrix approximation: algorithms and applications
, 2006
Abstract

Cited by 16 (3 self)
Low-dimensional data representations are crucial to numerous applications in machine learning, statistics, and signal processing. Nonnegative matrix approximation (NNMA) is a method for dimensionality reduction that respects the nonnegativity of the input data while constructing a low-dimensional approximation. NNMA has been used in a multitude of applications, though without commensurate theoretical development. In this report we describe generic methods for minimizing generalized divergences between the input and its low-rank approximant. Some of our general methods are even extensible to arbitrary convex penalties. Our methods yield efficient multiplicative iterative schemes for solving the proposed problems. We also consider interesting extensions such as the use of penalty functions, nonlinear relationships via "link" functions, weighted errors, and multifactor approximations. We present some experiments as an illustration of our algorithms. For completeness, the report also includes a brief literature survey of the various algorithms and the applications of NNMA.
Keywords: Nonnegative matrix factorization, weighted approximation, Bregman divergence, multiplicative ...
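The best-known multiplicative scheme of this kind is the Lee–Seung update for squared error, a special case of the generalized divergences the report covers. The sketch below implements that simplest instance; the dimensions and iteration count are arbitrary.

```python
import numpy as np

# The classic multiplicative updates for non-negative matrix
# approximation under squared (Frobenius) error, in the style of
# Lee & Seung; the report treats more general Bregman divergences.
# Sizes are arbitrary; eps guards against division by zero.
rng = np.random.default_rng(2)
m, n, k, eps = 20, 30, 4, 1e-9

A = rng.random((m, n))              # non-negative input data
W = rng.random((m, k))
H = rng.random((k, n))

for _ in range(200):
    # Multiplying by a ratio of positive quantities keeps every
    # entry of W and H non-negative throughout.
    H *= (W.T @ A) / (W.T @ W @ H + eps)
    W *= (A @ H.T) / (W @ H @ H.T + eps)

residual = np.linalg.norm(A - W @ H)
```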
Planning under uncertainty for reliable health care robotics
 in: The Fourth International Conference on Field and Service Robots (FSR)
, 2003
Abstract

Cited by 14 (0 self)
We describe a mobile robot system designed to assist residents of a retirement facility. This system is being developed to respond to an aging population and a predicted shortage of nursing professionals. In this paper, we discuss the task of finding and escorting people from place to place in the facility, a task containing uncertainty throughout the problem. Planning algorithms that model uncertainty well, such as Partially Observable Markov Decision Processes (POMDPs), do not scale tractably to most real-world problems. We demonstrate an algorithm for representing real-world POMDP problems compactly, which allows us to find good policies in reasonable amounts of time. We show that our algorithm is able to find moving people in close to optimal time, where the optimal policy would start with knowledge of the person's location.
Topic extraction from item-level grades
 American Association for Artificial Intelligence 2005 Workshop on Educational Datamining
, 2005
Abstract

Cited by 8 (2 self)
The most common form of dataset within the educational domain is likely the course gradebook. Data mining on the assignment-level scores is unlikely to provide meaningful results, but a matrix recording scores for every student and every question may provide hidden insight into the workings of a course. Here we will investigate collaborative filtering techniques applied to such data in an attempt to discover what the fundamental topics of a course are and the proficiencies of each student in those topics. Nearly every university-level course offering creates a new education-related dataset in the process of recording student grades. While the vast majority of gradebooks record each student's aggregate score on each assignment, ...
Weighted LowRank Approximations
 In 20th International Conference on Machine Learning
, 2003
Abstract

Cited by 3 (0 self)
We study the common problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient (EM) algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise in this context, demonstrate the utility of accommodating the weights in reconstructing the underlying low-rank representation, and extend the formulation to non-Gaussian noise models such as logistic models.
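A common way to realize such an EM algorithm for weights in [0, 1] is to alternate between blending the observed entries with the current reconstruction (E-step) and taking an unweighted truncated SVD of the blended matrix (M-step). The sketch below follows that pattern as an illustration, not as the authors' exact procedure; all sizes are invented.

```python
import numpy as np

# EM-style iteration for weighted low-rank approximation with
# weights in [0, 1]: blend observed entries with the current
# reconstruction (E-step), then take an unweighted rank-k SVD of
# the blended matrix (M-step). Sizes here are illustrative only.
rng = np.random.default_rng(3)
m, n, k = 15, 12, 3

A = rng.standard_normal((m, n))     # target matrix
Wt = rng.random((m, n))             # per-entry weights in [0, 1]
X = np.zeros((m, n))                # current low-rank estimate

for _ in range(100):
    filled = Wt * A + (1.0 - Wt) * X        # E-step: weighted blend
    U, S, Vt = np.linalg.svd(filled, full_matrices=False)
    X = (U[:, :k] * S[:k]) @ Vt[:k]         # M-step: best rank-k fit

weighted_err = np.sum(Wt * (A - X) ** 2)
```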
Efficient global optimization for exponential family PCA and low-rank matrix factorization
 In Allerton Conf. on Commun., Control, and Computing
, 2008
Abstract

Cited by 2 (1 self)
We present an efficient global optimization algorithm for exponential family principal component analysis (PCA) and associated low-rank matrix factorization problems. Exponential family PCA has been shown to improve the results of standard PCA on non-Gaussian data. Unfortunately, the widespread use of exponential family PCA has been hampered by the existence of only local optimization procedures. The prevailing assumption has been that the non-convexity of the problem prevents an efficient global optimization approach from being developed. Fortunately, this pessimism is unfounded. We present a reformulation of the underlying optimization problem that preserves the identity of the global solution while admitting an efficient optimization procedure. The algorithm we develop involves only a subgradient optimization of a convex objective plus associated eigenvector computations. (No general-purpose semidefinite programming solver is required.) The low-rank constraint is exactly preserved, while the method can be kernelized through a consistent approximation to admit a fixed nonlinearity. We demonstrate improved solution quality with the global solver, and also add to the evidence that exponential family PCA produces superior results to standard PCA on non-Gaussian data.