Results 1–10 of 10
An introduction to boosting and leveraging
Advanced Lectures on Machine Learning, LNCS, 2003
Constructing Boosting Algorithms from SVMs: An Application to One-class Classification, 2002
Theory and implementation of numerical methods based on Runge-Kutta integration for solving optimal control problems, 1996
Barrier Boosting
Cited by 19 (7 self)
Abstract: Boosting algorithms like AdaBoost and Arc-GV are iterative strategies for minimizing a constrained objective function, equivalent to barrier algorithms.
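To make the "iterative strategy to minimize an objective" view concrete, here is a minimal AdaBoost sketch on hypothetical 1-D toy data with decision-stump base learners (the data, thresholds, and round count are illustrative choices, not from the paper): each round reweights the examples and adds the stump with smallest weighted error, which greedily reduces the exponential loss.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 200)
y = np.where(X > 0.2, 1, -1)           # noiseless toy labels

weights = np.full(len(X), 1.0 / len(X))
F = np.zeros(len(X))                   # ensemble margin f(x_i)

for _ in range(20):
    # best decision stump h(x) = s * sign(x - t) over candidate thresholds
    best = None
    for t in np.linspace(-1, 1, 41):
        for s in (1, -1):
            h = s * np.where(X > t, 1, -1)
            err = weights[h != y].sum()
            if best is None or err < best[0]:
                best = (err, h)
    err, h = best
    err = max(err, 1e-12)                    # guard against log(0)
    alpha = 0.5 * np.log((1 - err) / err)    # AdaBoost step size
    F += alpha * h
    weights *= np.exp(-alpha * y * h)        # exponential-loss reweighting
    weights /= weights.sum()

exp_loss = np.mean(np.exp(-y * F))           # objective being driven down
accuracy = np.mean(np.sign(F) == y)
```

Each round is one step of coordinate-wise descent on the exponential loss; the barrier view in the abstract reinterprets the same iteration in terms of a constrained margin objective.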
Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces, 2000
Cited by 18 (9 self)
Abstract: We examine methods for constructing regression ensembles based on a linear program (LP). The ensemble regression function consists of linear combinations of base hypotheses generated by some boosting-type base learning algorithm. Unlike the classification case, for regression the set of possible hypotheses producible by the base learning algorithm may be infinite. We explicitly tackle the issue of how to define and solve ensemble regression when the hypothesis space is infinite. Our approach is based on a semi-infinite linear program that has an infinite number of constraints and a finite number of variables. We show that the regression problem is well posed for infinite hypothesis spaces in both the primal and dual spaces. Most importantly, we prove that there exists an optimal solution to the infinite-hypothesis-space problem consisting of a finite number of hypotheses. We propose two algorithms for solving the infinite and finite hypothesis problems: one uses a column-generation simplex-type algorithm, and the other adopts an exponential barrier approach. Furthermore, we give sufficient conditions on the base learning algorithm and the hypothesis set to be used for infinite regression ensembles. Computational results show that these methods are extremely promising.
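For the finite-hypothesis case, an LP ensemble of this flavor can be sketched directly (a hedged illustration, not the authors' exact formulation): given the predictions H[i, j] of m fixed base hypotheses on n points, find sparse nonnegative weights a minimizing the L1 training error plus an L1 penalty C on the weights, by splitting each residual into nonnegative parts.

```python
import numpy as np
from scipy.optimize import linprog

#   min  sum_i (xp_i + xm_i) + C * sum_j a_j
#   s.t. H a + xp - xm = y,   a, xp, xm >= 0
rng = np.random.default_rng(1)
n, m = 60, 12
H = rng.normal(size=(n, m))              # base-hypothesis predictions
y = H[:, 0] * 2.0 + H[:, 3] * 0.5        # target uses only 2 hypotheses
C = 0.1

# variable vector z = [a (m), xp (n), xm (n)]
c = np.concatenate([np.full(m, C), np.ones(n), np.ones(n)])
A_eq = np.hstack([H, np.eye(n), -np.eye(n)])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
a = res.x[:m]                            # recovered sparse ensemble weights
```

The L1 penalty drives most weights to exactly zero at a vertex of the LP, which is the sparsity the title refers to; the column-generation algorithm in the paper solves the same LP while generating the columns of H on demand.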
Comparison of Entropy and Mean Square Error Criteria in Adaptive System Training Using Higher Order Statistics
Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation, 2000
Cited by 13 (6 self)
Abstract: The error-entropy-minimization approach in adaptive system training is investigated, together with the effect of Parzen windowing on the location of the global minimum of entropy. An analytical proof shows that the global minimum of the entropy is a local minimum, possibly the global minimum, of the entropy estimated nonparametrically by Parzen windowing with Gaussian kernels. The performances of the error-entropy-minimization and mean-square-error-minimization criteria are compared in short-term prediction of a chaotic time series. The statistical behavior of the estimation errors and the higher-order central moments of the time-series data and its predictions are used as the comparison criteria. 1. INTRODUCTION: Starting with the early work of Wiener [1] on adaptive filters, mean square error (MSE) has been almost exclusively employed in the training of all adaptive systems, including artificial neural networks. There were mainly two reasons behind this choice: Analyti...
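The Parzen-window entropy estimate the abstract relies on can be sketched as follows (a hedged illustration; the kernel width sigma and data are hypothetical choices): Renyi's quadratic entropy of the errors is estimated from the pairwise Gaussian kernel sum, the "information potential" V, via H2 = -log V.

```python
import numpy as np

def renyi_quadratic_entropy(e, sigma=0.5):
    """Parzen estimate of Renyi's quadratic entropy of samples e."""
    d = e[:, None] - e[None, :]                        # pairwise differences
    # Gaussian kernel of variance 2*sigma^2 (convolution of two Parzen kernels)
    k = np.exp(-d**2 / (4 * sigma**2)) / np.sqrt(4 * np.pi * sigma**2)
    V = k.mean()                                       # information potential
    return -np.log(V)

rng = np.random.default_rng(2)
tight = rng.normal(scale=0.1, size=500)    # concentrated errors
wide = rng.normal(scale=2.0, size=500)     # spread-out errors
h_tight = renyi_quadratic_entropy(tight)
h_wide = renyi_quadratic_entropy(wide)
```

Concentrated errors yield a larger information potential and hence a smaller entropy, which is why minimizing this estimate concentrates the error distribution rather than just its second moment, as MSE does.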
On the convergence of leveraging
In Advances in Neural Information Processing Systems (NIPS), 2002
Cited by 10 (2 self)
Abstract: We give a unified convergence analysis of ensemble learning methods, including, e.g., AdaBoost, Logistic Regression, and the Least-Square-Boost algorithm for regression. These methods have in common that they iteratively call a base learning algorithm which returns hypotheses that are then linearly combined. We show that these methods are related to the Gauss-Southwell method known from numerical optimization and state non-asymptotic convergence results for all of them. Our analysis includes ℓ1-norm regularized cost functions, leading to a clean and general way to regularize ensemble learning.
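The Gauss-Southwell connection can be made concrete in a toy setting (a hedged sketch, not the paper's analysis; the data, fixed step size, and loss are illustrative): leveraging over a finite pool of base hypotheses is coordinate descent on the combined loss, where each round selects the coordinate (hypothesis) whose partial derivative is largest in magnitude — here on the exponential loss.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 100, 8
H = np.sign(rng.normal(size=(n, m)))       # +/-1 base-hypothesis outputs
y = np.sign(H[:, 1] + H[:, 4] + 0.1)       # labels correlated with 2 of them

alpha = np.zeros(m)
losses = []
for _ in range(200):
    margins = y * (H @ alpha)
    w = np.exp(-margins)                   # exponential-loss weights
    losses.append(w.mean())                # L(alpha) = mean exp(-y f(x))
    grad = -(H * (w * y)[:, None]).mean(axis=0)   # dL/d alpha_j
    j = np.argmax(np.abs(grad))            # Gauss-Southwell coordinate choice
    alpha[j] -= 0.1 * grad[j]              # fixed step (hypothetical choice)
```

Picking the steepest coordinate corresponds to the base learner returning the hypothesis with the largest edge on the reweighted sample, which is what ties the boosting iteration to this classical optimization scheme.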
From Adaptive Linear to Information Filtering
Cited by 1 (0 self)
Abstract: Adaptive signal processing theory was born and has lived by exclusively exploiting the mean square error criterion. Considering the goal of least squares without restrictions of Gaussianity, one has to wonder why an information-theoretic error criterion is not utilized instead. After all, the goal of adaptive filtering should be to find the linear projection that best captures the information in the desired response. In this paper we summarize our efforts to extend adaptive linear filtering to information filtering. We briefly review Renyi's entropy definition and Parzen windows, and put them together in a framework to estimate entropy directly from samples (nonparametrically). Once this criterion is developed, we can train linear or nonlinear adaptive networks for entropy maximization or minimization. We present results on the properties of the nonparametric Renyi entropy estimator and show how it performs in chaotic time-series prediction.
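A minimal sketch of the training scheme described here, under toy assumptions (the step size, kernel width sigma, and data are hypothetical choices, not from the paper): a linear filter is adapted by minimizing Renyi's quadratic entropy of the error, implemented as gradient ascent on the Parzen-estimated information potential V(e) = mean of kappa(e_i - e_j) with Gaussian kernels. Since entropy ignores the error mean, this toy omits a bias term.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X = rng.normal(size=(n, 3))
w_true = np.array([1.0, -2.0, 0.5])
d = X @ w_true                               # desired response

dx = X[:, None, :] - X[None, :, :]           # pairwise input differences
w = np.zeros(3)
sigma, lr = 2.0, 1.0
for _ in range(300):
    e = d - X @ w
    de = e[:, None] - e[None, :]             # pairwise error differences
    k = np.exp(-de**2 / (4 * sigma**2))      # Gaussian kernel (unnormalized)
    # dV/dw; ascending V descends the quadratic Renyi entropy -log V
    grad = (k[:, :, None] * de[:, :, None] * dx).mean(axis=(0, 1)) / (2 * sigma**2)
    w += lr * grad
```

Maximizing the information potential concentrates the error samples around a single value, so for this zero-mean toy system the filter weights converge to the true projection.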
iii TABLE OF CONTENTS
Abstract: 2002. This work is dedicated to all scientists and researchers who have lived in pursuit of knowledge and have dedicated themselves to the advancement of science. ACKNOWLEDGMENTS: I would like to start by thanking my supervisor, Dr. Jose C. Principe, for his encouraging and inspiring style, which made possible the completion of this work. Without his guidance, imagination, and enthusiasm, which I admire, this dissertation would not have been possible. I also wish to thank the members of my committee, Dr. John G. Harris, Dr. Tan F. Wong, and Dr. Mark C.K. Yang, for their valuable time and interest in serving on my supervisory committee, as well as their comments, which helped improve the quality of this dissertation. Throughout the course of my PhD research, I have been in interaction with many CNEL colleagues, and I have benefited from the valuable discussions we had together