Results 1 - 4 of 4
Risk bounds for random regression graphs
 Foundations of Computational Mathematics
Abstract

Cited by 2 (0 self)
Abstract. We consider the regression problem and describe an algorithm approximating the regression function by estimators piecewise constant on the elements of an adaptive partition. The partitions are iteratively constructed by suitable random merges and splits, using cuts of arbitrary geometry. We give a risk bound under the assumption that a "weak learning hypothesis" holds, and characterize this hypothesis in terms of a suitable RKHS.
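The basic estimator the abstract describes can be sketched in a few lines. The partition below is a fixed, hypothetical single-cut example (the paper builds its partitions adaptively by random merges and splits with cuts of arbitrary geometry), so this only illustrates the piecewise-constant fit on a given partition, not the paper's construction:

```python
import numpy as np

def piecewise_constant_fit(cell_of, X, y):
    """Least-squares piecewise-constant estimator on a fixed partition:
    each cell predicts the mean response of the training points it
    contains. `cell_of` maps a point to its cell's label."""
    cells = np.array([cell_of(x) for x in X])
    means = {c: y[cells == c].mean() for c in np.unique(cells)}
    default = y.mean()  # fallback for cells with no training data
    return lambda x: means.get(cell_of(x), default)

# Toy partition of the line into two cells cut at 0 (a hypothetical
# geometry; the paper allows far more general cuts).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = np.where(X < 0, -1.0, 1.0) + 0.1 * rng.standard_normal(200)
f_hat = piecewise_constant_fit(lambda x: int(x >= 0), X, y)
```

Within each cell the mean is the least-squares-optimal constant, which is why partition refinement (splits) trades approximation error against estimation error.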
Efficient Estimators for Generalized Additive Models
2005
Abstract

Cited by 2 (0 self)
Generalized additive models are a powerful generalization of linear and logistic regression models. In this paper we show that a natural regression graph learning algorithm efficiently learns generalized additive models. Efficiency is proven in two senses: the estimator's future prediction accuracy approaches optimality at a rate inverse polynomial in the size of the training data, and its runtime is polynomial in the size of the training data. Furthermore, the guarantees are nearly linear in the dimensionality (number of regressors) of the problem, and hence the algorithm does not suffer from the "curse of dimensionality." The algorithm is a simple generalization of Mansour and McAllester's classification algorithm that generates decision graphs, i.e., decision trees with merges. Our analysis can also be viewed as defining a natural extension of the original classification boosting theorems (Schapire, 1990) to the regression setting. Loosely speaking, we define a weak correlator to be a real-valued predictor whose correlation coefficient with the target function is bounded away from zero. We show how to efficiently boost weak correlators to get predictions with correlation arbitrarily close to 1 (error arbitrarily close to 0). Our boosting analysis is a natural extension of the classification boosting analysis of Kearns and Mansour (1999) and Mansour and McAllester (2002).
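The boosting idea in this abstract, repeatedly fitting a weak real-valued predictor to the current residuals and adding a scaled copy of it, can be sketched as follows. The `stump_fit` weak learner (best single coordinate by correlation) and all data are illustrative assumptions, not the paper's regression-graph learner:

```python
import numpy as np

def boost_correlators(weak_fit, X, y, rounds=50):
    """Sketch of real-valued boosting: each round fits a weak predictor
    h to the current residuals, then adds a step alpha*h with alpha
    chosen by least squares along h."""
    f = np.zeros(len(y))
    hs = []
    for _ in range(rounds):
        r = y - f                        # current residuals
        if (r * r).sum() < 1e-10:        # essentially fit; stop
            break
        h = weak_fit(X, r)
        hx = np.array([h(x) for x in X])
        denom = (hx * hx).sum()
        if denom == 0:
            break
        alpha = (r * hx).sum() / denom   # least-squares step size
        f += alpha * hx
        hs.append((alpha, h))
    return lambda x: sum(a * h(x) for a, h in hs)

# Hypothetical weak correlator: the single coordinate most correlated
# with the residuals.
def stump_fit(X, r):
    corrs = [abs(np.corrcoef(X[:, j], r)[0, 1]) for j in range(X.shape[1])]
    j = int(np.argmax(corrs))
    return lambda x: x[j]

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 3))
y = 2 * X[:, 0] - X[:, 1]                # a toy additive target
F = boost_correlators(stump_fit, X, y)
```

As long as each round's weak predictor keeps a nonzero correlation with the residuals, each least-squares step strictly reduces the squared error, which is the mechanism behind boosting correlation toward 1.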
Learning Nested Halfspaces and Uphill Decision Trees
Abstract

Cited by 1 (0 self)
Abstract. Predicting class probabilities and other real-valued quantities is often more useful than binary classification, but comparatively little work in PAC-style learning addresses this issue. We show that two rich classes of real-valued functions are learnable in the probabilistic-concept framework of Kearns and Schapire. Let X be a subset of Euclidean space and f be a real-valued function on X. We say f is a nested halfspace function if, for each real threshold t, the set {x ∈ X : f(x) ≤ t} is a halfspace. This broad class of functions includes binary halfspaces with a margin (e.g., SVMs) as a special case. We give an efficient algorithm that provably learns (Lipschitz-continuous) nested halfspace functions on the unit ball. The sample complexity is independent of the number of dimensions. We also introduce the class of uphill decision trees, which are real-valued decision trees (sometimes called regression trees) in which the sequence of leaf values is nondecreasing. We give an efficient algorithm for provably learning uphill decision trees whose sample complexity is polynomial in the number of dimensions but independent of the size of the tree (which may be exponential). Both of our algorithms employ a real-valued extension of Mansour and McAllester's boosting algorithm.
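A simple concrete instance of a nested halfspace function: any non-decreasing link applied to a linear projection, since then {x : g(w·x) ≤ t} = {x : w·x ≤ s} for the appropriate threshold s, which is a halfspace. The weight vector and logistic link below are hypothetical choices for illustration, not taken from the paper:

```python
import numpy as np

# f(x) = g(w.x) with g strictly increasing is a nested halfspace
# function: its sublevel set at t = g(0) = 0.5 is exactly the
# halfspace {x : w.x <= 0}.  w and g are illustrative assumptions.
w = np.array([2.0, -1.0, 0.5])
g = lambda z: 1.0 / (1.0 + np.exp(-z))   # logistic link
f = lambda x: g(w @ x)

rng = np.random.default_rng(2)
pts = rng.standard_normal((1000, 3))
t = 0.5                                   # g(0) = 0.5
in_sublevel = np.array([f(x) <= t for x in pts])
in_halfspace = pts @ w <= 0.0             # the claimed halfspace
```

This also shows why binary halfspaces with a margin are the special case where g is a (Lipschitz approximation of a) step function.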
Microsoft Research One Memorial Drive
Abstract
The Perceptron algorithm elegantly solves binary classification problems that have a margin between positive and negative examples. Isotonic regression (fitting an arbitrary increasing function in one dimension) is also a natural problem with a simple solution. By combining the two, we get a new but very simple algorithm with strong guarantees. Our ISOTRON algorithm provably learns Single Index Models (SIM), a generalization of linear and logistic regression, generalized linear models, as well as binary classification by linear threshold functions. In particular, it provably learns SIMs with unknown mean functions that are non-decreasing and Lipschitz-continuous, thereby generalizing linear and logistic regression and linear-threshold functions (with a margin). Like the Perceptron, it is straightforward to implement and kernelize. Hence, the ISOTRON provides a very simple yet flexible and principled approach to regression.
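A minimal sketch of the combination the abstract describes, as we read it: alternate a one-dimensional isotonic regression of y against the current scores w·x (via the standard Pool Adjacent Violators routine) with a Perceptron-like additive update of w. The toy data, sigmoid mean function, and iteration count are illustrative assumptions:

```python
import numpy as np

def pav(y):
    """Pool Adjacent Violators: non-decreasing least-squares fit to y."""
    blocks = []  # (pooled value, weight) pairs
    for v in map(float, y):
        blocks.append([v, 1.0])
        # Merge while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for v, w in blocks:
        out.extend([v] * int(w))
    return np.array(out)

def isotron(X, y, iters=200):
    """Isotron-style loop: isotonic fit of y against scores w.x,
    then a Perceptron-like step on the residuals."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        order = np.argsort(X @ w, kind="stable")
        u = np.empty(n)
        u[order] = pav(y[order])        # fitted mean-function values
        w += X.T @ (y - u) / n          # additive Perceptron-like step
    return w

# Toy SIM: y = sigmoid(v.x); the learned w should align with v.
rng = np.random.default_rng(3)
v = np.array([1.0, -2.0])
X = rng.standard_normal((400, 2))
y = 1.0 / (1.0 + np.exp(-(X @ v)))
w_hat = isotron(X, y)
```

The appeal is exactly what the abstract claims: both ingredients are elementary, there is no learning rate to tune, and since the update only touches X through inner products the loop kernelizes the same way the Perceptron does.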