Results 1 - 10 of 15
Finding Approximate POMDP Solutions Through Belief Compression
, 2003
Cited by 46 (2 self)
Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional manifold embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this manifold can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta, & Schapire, 2002) to represent sparse, high-dimensional belief spaces using low-dimensional sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.
Relational Learning via Collective Matrix Factorization
, 2008
Cited by 34 (3 self)
Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations among entities. An example of relational learning is movie rating prediction, where entities could include users, movies, genres, and actors. Relations would then encode users’ ratings of movies, movies’ genres, and actors’ roles in movies. A common prediction technique given one pairwise relation, for example a #users × #movies ratings matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; so, we allow nonlinear relationships between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to our model, and derive an efficient Newton update for the projection. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model generalizes several existing matrix factorization methods, and therefore yields new large-scale optimization algorithms for these problems. Our model can handle any pairwise relational schema and a
Learning with Matrix Factorization
, 2004
Cited by 20 (3 self)
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent
A Unified View of Matrix Factorization Models
Cited by 19 (0 self)
We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix co-clustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
Model-based overlapping clustering
- In KDD
, 2005
Cited by 18 (4 self)
While the vast majority of clustering algorithms are partitional, many real world datasets have inherently overlapping clusters. Several approaches to finding overlapping clusters have come from work on analysis of biological datasets. In this paper, we interpret an overlapping clustering model proposed by Segal et al. [23] as a generalization of Gaussian mixture models, and we extend it to an overlapping clustering model based on mixtures of any regular exponential family distribution and the corresponding Bregman divergence. We provide the necessary algorithm modifications for this extension, and present results on synthetic data as well as subsets of 20-Newsgroups and EachMovie datasets.
Planning under uncertainty for reliable health care robotics
- in: The Fourth International Conference on Field and Service Robots (FSR
, 2003
Cited by 11 (0 self)
We describe a mobile robot system, designed to assist residents of a retirement facility. This system is being developed to respond to an aging population and a predicted shortage of nursing professionals. In this paper, we discuss the task of finding and escorting people from place to place in the facility, a task containing uncertainty throughout the problem. Planning algorithms that model uncertainty well such as Partially Observable Markov Decision Processes (POMDPs) do not scale tractably to most real-world problems. We demonstrate an algorithm for representing real-world POMDP problems compactly, which allows us to find good policies in reasonable amounts of time. We show that our algorithm is able to find moving people in close to optimal time, where the optimal policy would start with knowledge of the person’s location.
Nonnegative matrix approximation: algorithms and applications
, 2006
Cited by 11 (3 self)
Low-dimensional data representations are crucial to numerous applications in machine learning, statistics, and signal processing. Nonnegative matrix approximation (NNMA) is a method for dimensionality reduction that respects the nonnegativity of the input data while constructing a low-dimensional approximation. NNMA has been used in a multitude of applications, though without commensurate theoretical development. In this report we describe generic methods for minimizing generalized divergences between the input and its low-rank approximant. Some of our general methods are even extensible to arbitrary convex penalties. Our methods yield efficient multiplicative iterative schemes for solving the proposed problems. We also consider interesting extensions such as the use of penalty functions, non-linear relationships via “link” functions, weighted errors, and multi-factor approximations. We present some experiments as an illustration of our algorithms. For completeness, the report also includes a brief literature survey of the various algorithms and the applications of NNMA. Keywords: Nonnegative matrix factorization, weighted approximation, Bregman divergence, multiplicative
Topic extraction from item-level grades
- American Association for Artificial Intelligence 2005 Workshop on Educational Datamining
, 2005
Cited by 7 (2 self)
The most common form of dataset within the educational domain is likely the course gradebook. Data mining on the assignment-level scores is unlikely to provide meaningful results, but a matrix recording scores for every student and every question may provide hidden insight into the workings of a course. Here we will investigate collaborative filtering techniques applied to such data in an attempt to discover what the fundamental topics of a course are and the proficiencies of each student in those topics. Nearly every university-level course offering creates a new education-related dataset in the process of recording student grades. While the vast majority of gradebooks record each student’s aggregate score on each assignment,
Efficient global optimization for exponential family PCA and low-rank matrix factorization
- In Allerton Conf. on Commun., Control, and Computing
, 2008
Cited by 2 (1 self)
We present an efficient global optimization algorithm for exponential family principal component analysis (PCA) and associated low-rank matrix factorization problems. Exponential family PCA has been shown to improve the results of standard PCA on non-Gaussian data. Unfortunately, the widespread use of exponential family PCA has been hampered by the existence of only local optimization procedures. The prevailing assumption has been that the non-convexity of the problem prevents an efficient global optimization approach from being developed. Fortunately, this pessimism is unfounded. We present a reformulation of the underlying optimization problem that preserves the identity of the global solution while admitting an efficient optimization procedure. The algorithm we develop involves only a sub-gradient optimization of a convex objective plus associated eigenvector computations. (No general purpose semidefinite programming solver is required.) The low-rank constraint is exactly preserved, while the method can be kernelized through a consistent approximation to admit a fixed non-linearity. We demonstrate improved solution quality with the global solver, and also add to the evidence that exponential family PCA produces superior results to standard PCA on non-Gaussian data.
Weighted Low-Rank Approximations
- In 20th International Conference on Machine Learning
, 2003
Cited by 1 (0 self)
We study the common problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient (EM) algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise in this context, demonstrate the utility of accommodating the weights in reconstructing the underlying low-rank representation, and extend the formulation to non-Gaussian noise models such as logistic models.
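The abstract above does not spell out the EM update, so the following is only a minimal sketch of one common EM-style iteration for weighted low-rank approximation, assuming the weights lie in [0, 1]. Each step fills in the down-weighted entries with the current reconstruction and then takes an ordinary truncated SVD. The function names `weighted_low_rank` and `weighted_error` are hypothetical, chosen for illustration, and the zero initialization and fixed iteration count are assumptions rather than details from the paper.

```python
import numpy as np

def weighted_low_rank(A, W, rank, n_iter=200):
    """EM-style weighted low-rank approximation (sketch; assumes 0 <= W <= 1)."""
    X = np.zeros_like(A, dtype=float)  # assumed starting point
    for _ in range(n_iter):
        # E-step: blend the targets with the current reconstruction,
        # trusting each target entry in proportion to its weight.
        filled = W * A + (1.0 - W) * X
        # M-step: best unweighted rank-k fit, via truncated SVD.
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return X

def weighted_error(A, W, X):
    """Weighted squared error, the objective the iteration tries to reduce."""
    return float(np.sum(W * (A - X) ** 2))
```

With all weights equal to one, the loop reduces to a plain truncated SVD after a single iteration. With a 0/1 mask, it behaves like iterative imputation of the missing entries.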

