A tutorial on support vector regression
, 2004
Cited by 865 (3 self)
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
Experiences with an Interactive Museum Tour-Guide Robot
, 1998
Cited by 329 (72 self)
This article describes the software architecture of an autonomous, interactive tour-guide robot. It presents a modular and distributed software architecture, which integrates localization, mapping, collision avoidance, planning, and various modules concerned with user interaction and Web-based telepresence. At its heart, the software approach relies on probabilistic computation, online learning, and anytime algorithms. It enables robots to operate safely, reliably, and at high speeds in highly dynamic environments, and does not require any modifications of the environment to aid the robot's operation. Special emphasis is placed on the design of interactive capabilities that appeal to people's intuition. The interface provides new means for human-robot interaction with crowds of people in public places, and it also provides people all around the world with the ability to establish a "virtual telepresence" using the Web. To illustrate our approach, results are reported obtained in mid...
Nonlinear Black-Box Modeling in System Identification: a Unified Overview
 Automatica
, 1995
Cited by 225 (16 self)
A nonlinear black box structure for a dynamical system is a model structure that is prepared to describe virtually any nonlinear dynamics. There has been considerable recent interest in this area with structures based on neural networks, radial basis networks, wavelet networks, hinging hyperplanes, as well as wavelet transform based methods and models based on fuzzy sets and fuzzy rules. This paper describes all these approaches in a common framework, from a user's perspective. It focuses on what are the common features in the different approaches, the choices that have to be made and what considerations are relevant for a successful system identification application of these techniques. It is pointed out that the nonlinear structures can be seen as a concatenation of a mapping from observed data to a regression vector and a nonlinear mapping from the regressor space to the output space. These mappings are discussed separately. The latter mapping is usually formed as a basis function e...
Learning with Labeled and Unlabeled Data
, 2001
Cited by 202 (3 self)
In this paper, on the one hand, we aim to give a review on literature dealing with the problem of supervised learning aided by additional unlabeled data. On the other hand, being a part of the author's first year PhD report, the paper serves as a frame to bundle related work by the author as well as numerous suggestions for potential future work. Therefore, this work contains more speculative and partly subjective material than the reader might expect from a literature review. We give a rigorous definition of the problem and relate it to supervised and unsupervised learning. The crucial role of prior knowledge is put forward, and we discuss the important notion of input-dependent regularization. We postulate a number of baseline methods, being algorithms or algorithmic schemes which can more or less straightforwardly be applied to the problem, without the need for genuinely new concepts. However, some of them might serve as basis for a genuine method. In the literature revi...
The Evidence Framework Applied to Classification Networks
, 1992
Cited by 189 (13 self)
Three Bayesian ideas are presented for supervised adaptive classifiers. First, it is argued that the output of a classifier should be obtained by marginalizing over the posterior distribution of the parameters; a simple approximation to this integral is proposed and demonstrated. This involves a "moderation" of the most probable classifier's outputs, and yields improved performance. Second, it is demonstrated that the Bayesian framework for model comparison described for regression models in MacKay (1992a,b) can also be applied to classification problems. This framework successfully chooses the magnitude of weight decay terms, and ranks solutions found using different numbers of hidden units. Third, an information-based data selection criterion is derived and demonstrated within this framework.
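The "moderation" idea in the abstract above can be illustrated with a small sketch. The abstract does not state the formula, so the closed form used here is an assumption: it is the approximation commonly attributed to this line of work, in which the most probable activation a_mp is scaled by kappa = 1 / sqrt(1 + pi * s2 / 8), where s2 is the posterior variance of the activation. The function names are hypothetical, and the Gauss-Hermite quadrature is included only as a numerical reference for the marginalized output.

```python
import numpy as np


def sigmoid(a):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-a))


def moderated_output(a_mp, s2):
    """Approximate the marginalized output by shrinking the activation.

    Assumed form (not given in the abstract): sigmoid(kappa * a_mp) with
    kappa = 1 / sqrt(1 + pi * s2 / 8).
    """
    kappa = 1.0 / np.sqrt(1.0 + np.pi * s2 / 8.0)
    return sigmoid(kappa * a_mp)


def marginal_output(a_mp, s2, n_nodes=60):
    """Reference value: E[sigmoid(a)] for a ~ N(a_mp, s2), by quadrature."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    a = a_mp + np.sqrt(2.0 * s2) * x
    return float(np.sum(w * sigmoid(a)) / np.sqrt(np.pi))
```

With zero posterior variance the moderated output reduces to the ordinary sigmoid output; as the variance grows, the prediction is pulled towards 0.5, which is the qualitative behaviour the abstract describes.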
Fast Sparse Gaussian Process Methods: The Informative Vector Machine
 Advances in Neural Information Processing Systems 15
, 2003
Cited by 173 (30 self)
We present a framework for sparse Gaussian process (GP) methods which uses forward selection with criteria based on information-theoretic principles, previously suggested for active learning. Our goal is not only to learn d-sparse predictors (which can be evaluated in O(d) rather than O(n), d << n, n the number of training points), but also to perform training under strong restrictions on time and memory requirements. The scaling of our method is at most O(n d^2), and in large real-world classification experiments we show that it can match prediction performance of the popular support vector machine (SVM), yet can be significantly faster in training. In contrast to the SVM, our approximation produces estimates of predictive probabilities ('error bars'), allows for Bayesian model selection and is less complex in implementation.
Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension.
 Machine Learning
, 1994
Gaussian processes for machine learning
 International Journal of Neural Systems
, 2004
Cited by 92 (14 self)
Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countably or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible nonparametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback of heavy computational scaling has recently been alleviated by the introduction of generic sparse approximations [13, 78, 31]. The mathematical literature on GPs is large and often uses deep
Online Bayes Point Machines
Cited by 83 (3 self)
We present a new and simple algorithm for learning large margin classifiers that works in a truly online manner. The algorithm generates a linear classifier by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm's performance on online and batch learning settings.
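A minimal sketch of the scheme described in this abstract is given below. It is not the authors' implementation. The abstract only says that several perceptron-like learners run in parallel on random subsamples of the stream and that their weights are averaged, so the remaining details are assumptions made for illustration: the plain perceptron update, the per-example coin flip with probability keep_prob, the unweighted mean, and all function and parameter names.

```python
import numpy as np


def online_bayes_point(stream, dim, n_perceptrons=10, keep_prob=0.5, seed=0):
    """Average several perceptrons trained on random subsamples of a stream.

    stream yields (x, y) pairs with y in {-1, +1}. All defaults are
    illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    weights = np.zeros((n_perceptrons, dim))
    for x, y in stream:
        x = np.asarray(x, dtype=float)
        # Each perceptron sees the example only with probability keep_prob,
        # which keeps the individual solutions diverse.
        seen = rng.random(n_perceptrons) < keep_prob
        # Standard perceptron rule: update only on a mistake.
        mistaken = seen & (y * (weights @ x) <= 0)
        weights[mistaken] += y * x
    # The averaged weight vector stands in for the Bayes point.
    return weights.mean(axis=0)


def predict(w, X):
    """Classify rows of X with the averaged linear classifier."""
    return np.where(np.asarray(X, dtype=float) @ w >= 0, 1, -1)
```

For batch data, the same function can be used by iterating over the training set several times, for example by passing a list of (x, y) pairs repeated for a number of epochs.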