• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Gaussian processes for machine learning (2006)

by Carl Edward Rasmussen
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 137
Next 10 →

Fast random walk with restart and its applications

by Hanghang Tong - In ICDM ’06: Proceedings of the 6th IEEE International Conference on Data Mining , 2006
"... How closely related are two nodes in a graph? How to compute this score quickly, on huge, disk-resident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been successfully used in numerous settings, like automatic captionin ..."
Abstract - Cited by 65 (12 self) - Add to MetaCart
How closely related are two nodes in a graph? How to compute this score quickly, on huge, disk-resident, real graphs? Random walk with restart (RWR) provides a good relevance score between two nodes in a weighted graph, and it has been successfully used in numerous settings, like automatic captioning of images, generalizations to the “connection subgraphs”, personalized PageRank, and many more. However, the straightforward implementations of RWR do not scale for large graphs, requiring either quadratic space and cubic pre-computation time, or slow response time on queries. We propose fast solutions to this problem. The heart of our approach is to exploit two important properties shared by many real graphs: (a) linear correlations and (b) blockwise, community-like structure. We exploit the linearity by using low-rank matrix approximation, and the community structure by graph partitioning, followed by the Sherman-Morrison lemma for matrix inversion. Experimental results on the Corel image and the DBLP dabasets demonstrate that our proposed methods achieve significant savings over the straightforward implementations: they can save several orders of magnitude in pre-computation and storage cost, and they achieve up to 150x speed up with 90%+ quality preservation. 1

The pyramid match kernel: Efficient learning with sets of features

by Kristen Grauman, Trevor Darrell, Pietro Perona - Journal of Machine Learning Research , 2007
"... In numerous domains it is useful to represent a single example by the set of the local features or parts that comprise it. However, this representation poses a challenge to many conventional machine learning techniques, since sets may vary in cardinality and elements lack a meaningful ordering. Kern ..."
Abstract - Cited by 55 (6 self) - Add to MetaCart
In numerous domains it is useful to represent a single example by the set of the local features or parts that comprise it. However, this representation poses a challenge to many conventional machine learning techniques, since sets may vary in cardinality and elements lack a meaningful ordering. Kernel methods can learn complex functions, but a kernel over unordered set inputs must somehow solve for correspondences—generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function called the pyramid match that measures partial match similarity in time linear in the number of features. The pyramid match maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in order to find implicit correspondences based on the finest resolution histogram cell where a matched pair first appears. We show the pyramid match yields a Mercer kernel, and we prove bounds on its error relative to the optimal partial matching cost. We demonstrate our algorithm on both classification and regression tasks, including object recognition, 3-D human pose inference, and time of publication estimation for documents, and we show that the proposed method is accurate and significantly more efficient than current approaches.

Gaussian Processes for Signal Strength-Based Location Estimation

by Brian Ferris, Dirk Hähnel, Dieter Fox - In Proc. of Robotics Science and Systems , 2006
"... Abstract — Estimating the location of a mobile device or a robot from wireless signal strength has become an area of highly active research. The key problem in this context stems from the complexity of how signals propagate through space, especially in the presence of obstacles such as buildings, wa ..."
Abstract - Cited by 40 (6 self) - Add to MetaCart
Abstract — Estimating the location of a mobile device or a robot from wireless signal strength has become an area of highly active research. The key problem in this context stems from the complexity of how signals propagate through space, especially in the presence of obstacles such as buildings, walls or people. In this paper we show how Gaussian processes can be used to generate a likelihood model for signal strength measurements. We also show how parameters of the model, such as signal noise and spatial correlation between measurements, can be learned from data via hyperparameter estimation. Experiments using WiFi indoor data and GSM cellphone connectivity demonstrate the superior performance of our approach. I.

Confidence-weighted linear classification

by Mark Dredze, Koby Crammer - In ICML ’08: Proceedings of the 25th international conference on Machine learning , 2008
"... We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distributi ..."
Abstract - Cited by 31 (8 self) - Add to MetaCart
We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks show that our algorithm improves over other state of the art online and batch methods, learns faster in the online setting, and lends itself to better classifier combination after parallel training. 1.

WiFi-SLAM Using Gaussian Process Latent Variable Models

by Brian Ferris, Dieter Fox, Neil Lawrence - In Proceedings of IJCAI 2007 , 2007
"... WiFi localization, the task of determining the physical location of a mobile device from wireless signal strengths, has been shown to be an accurate method of indoor and outdoor localization and a powerful building block for location-aware applications. However, most localization techniques require ..."
Abstract - Cited by 28 (4 self) - Add to MetaCart
WiFi localization, the task of determining the physical location of a mobile device from wireless signal strengths, has been shown to be an accurate method of indoor and outdoor localization and a powerful building block for location-aware applications. However, most localization techniques require a training set of signal strength readings labeled against a ground truth location map, which is prohibitive to collect and maintain as maps grow large. In this paper we propose a novel technique for solving the WiFi SLAM problem using the Gaussian Process Latent Variable Model (GP-LVM) to determine the latent-space locations of unlabeled signal strength data. We show how GP-LVM, in combination with an appropriate motion dynamics model, can be used to reconstruct a topological connectivity graph from a signal strength sequence which, in combination with the learned Gaussian Process signal strength model, can be used to perform efficient localization. 1

A Hilbert space embedding for distributions

by Alex Smola, Arthur Gretton, Le Song, Bernhard Schölkopf - In Algorithmic Learning Theory: 18th International Conference , 2007
"... Abstract. We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space. Applications of this technique can be found in two-sample tests, which are used for ..."
Abstract - Cited by 27 (15 self) - Add to MetaCart
Abstract. We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a reproducing kernel Hilbert space. Applications of this technique can be found in two-sample tests, which are used for determining whether two sets of observations arise from the same distribution, covariate shift correction, local learning, measures of independence, and density estimation. Kernel methods are widely used in supervised learning [1, 2, 3, 4], however they are much less established in the areas of testing, estimation, and analysis of probability distributions, where information theoretic approaches [5, 6] have long been dominant. Recent examples include [7] in the context of construction of graphical models, [8] in the context of feature extraction, and [9] in the context of independent component analysis. These methods have by and large a common issue: to compute quantities such as the mutual information, entropy, or Kullback-Leibler divergence, we require sophisticated space partitioning and/or

Non-linear Matrix Factorization with Gaussian Processes

by Neil D. Lawrence, Raquel Urtasun
"... A popular approach to collaborative filtering is matrix factorization. In this paper we develop a non-linear probabilistic matrix factorization using Gaussian process latent variable models. We use stochastic gradient descent (SGD) to optimize the model. SGD allows us to apply Gaussian processes to ..."
Abstract - Cited by 20 (1 self) - Add to MetaCart
A popular approach to collaborative filtering is matrix factorization. In this paper we develop a non-linear probabilistic matrix factorization using Gaussian process latent variable models. We use stochastic gradient descent (SGD) to optimize the model. SGD allows us to apply Gaussian processes to data sets with millions of observations without approximate methods. We apply our approach to benchmark movie recommender data sets. The results show better than previous state-of-theart performance. 1.

Gaussian processes and reinforcement learning for identification and control of an autonomous blimp

by Jonathan Ko, Daniel J. Klein - in IEEE Intl. Conf. on Robotics and Automation (ICRA , 2007
"... Abstract — Blimps are a promising platform for aerial robotics and have been studied extensively for this purpose. Unlike other aerial vehicles, blimps are relatively safe and also possess the ability to loiter for long periods. These advantages, however, have been difficult to exploit because blimp ..."
Abstract - Cited by 17 (4 self) - Add to MetaCart
Abstract — Blimps are a promising platform for aerial robotics and have been studied extensively for this purpose. Unlike other aerial vehicles, blimps are relatively safe and also possess the ability to loiter for long periods. These advantages, however, have been difficult to exploit because blimp dynamics are complex and inherently non-linear. The classical approach to system modeling represents the system as an ordinary differential equation (ODE) based on Newtonian principles. A more recent modeling approach is based on representing state transitions as a Gaussian process (GP). In this paper, we present a general technique for system identification that combines these two modeling approaches into a single formulation. This is done by training a Gaussian process on the residual between the non-linear model and ground truth training data. The result is a GP-enhanced model that provides an estimate of uncertainty in addition to giving better state predictions than either ODE or GP alone. We show how the GP-enhanced model can be used in conjunction with reinforcement learning to generate a blimp controller that is superior to those learned with ODE or GP models alone. I.

Nonmyopic active learning of gaussian processes: An exploration-exploitation approach

by Andreas Krause, Carlos Guestrin - IN ICML , 2007
"... When monitoring spatial phenomena, such as the ecological condition of a river, deciding where to make observations is a challenging task. In these settings, a fundamental question is when an active learning, or sequential design, strategy, where locations are selected based on previous measurements ..."
Abstract - Cited by 16 (3 self) - Add to MetaCart
When monitoring spatial phenomena, such as the ecological condition of a river, deciding where to make observations is a challenging task. In these settings, a fundamental question is when an active learning, or sequential design, strategy, where locations are selected based on previous measurements, will perform significantly better than sensing at an a priori specified set of locations. For Gaussian Processes (GPs), which often accurately model spatial phenomena, we present an analysis and efficient algorithms that address this question. Central to our analysis is a theoretical bound which quantifies the performance difference between active and a priori design strategies. We consider GPs with unknown kernel parameters and present a nonmyopic approach for trading off exploration, i.e., decreasing uncertainty about the model parameters, and exploitation, i.e., near-optimally selecting observations when the parameters are (approximately) known. We discuss several exploration strategies, and present logarithmic sample complexity bounds for the exploration phase. We then extend our algorithm to handle nonstationary GPs exploiting local structure in the model. A variational approach allows us to perform efficient inference in this class of nonstationary models. We also present extensive empirical evaluation on several real-world problems.

Beam Sampling for the Infinite Hidden Markov Model

by Jurgen Van Gael, Yunus Saatci, Yee Whye Teh, Zoubin Ghahramani
"... The infinite hidden Markov model is a nonparametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite Hidden Markov model called beam sampling. Beam sampling combines slice sampling, which limits the number of states considered at each ..."
Abstract - Cited by 14 (2 self) - Add to MetaCart
The infinite hidden Markov model is a nonparametric extension of the widely used hidden Markov model. Our paper introduces a new inference algorithm for the infinite Hidden Markov model called beam sampling. Beam sampling combines slice sampling, which limits the number of states considered at each time step to a finite number, with dynamic programming, which samples whole state trajectories efficiently. Our algorithm typically outperforms the Gibbs sampler and is more robust. We present applications of iHMM inference using the beam sampler on changepoint detection and text prediction problems. 1.
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University