Results 1-10 of 1,149
A tutorial on support vector regression
, 2004
Cited by 828 (3 self)
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
When Networks Disagree: Ensemble Methods for Hybrid Neural Networks
, 1993
Cited by 347 (3 self)
This paper presents a general theoretical framework for ensemble methods of constructing significantly improved regression estimates. Given a population of regression estimators, we construct a hybrid estimator which is as good or better in the MSE sense than any estimator in the population. We argue that the ensemble method presented has several properties: 1) It efficiently uses all the networks of a population: none of the networks need be discarded. 2) It efficiently uses all the available data for training without overfitting. 3) It inherently performs regularization by smoothing in functional space which helps to avoid overfitting. 4) It utilizes local minima to construct improved estimates whereas other neural network algorithms are hindered by local minima. 5) It is ideally suited for parallel computation. 6) It leads to a very useful and natural measure of the number of distinct estimators in a population. 7) The optimal parameters of the ensemble estimator are given in clo...
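As a rough illustration of the ensemble idea in this abstract, the sketch below averages the outputs of several regression estimators and compares mean squared errors. It is a minimal sketch, not the paper's method: the function names (`mse`, `average_ensemble`) and the toy numbers are assumptions, and it uses a plain unweighted average. For that simple average, convexity of the squared error only guarantees an MSE no worse than the mean MSE of the individual estimators; the stronger "as good or better than any estimator" claim refers to the paper's hybrid estimator, which is not reproduced here.

```python
def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def average_ensemble(member_predictions):
    """Unweighted average of several estimators' predictions.

    member_predictions: list of prediction lists, one per estimator,
    all evaluated on the same inputs (hypothetical toy setup).
    """
    n_members = len(member_predictions)
    return [sum(column) / n_members for column in zip(*member_predictions)]


# Toy data (assumed for illustration only).
targets = [1.0, 2.0, 3.0]
members = [
    [1.5, 2.5, 2.0],  # estimator A
    [0.5, 1.0, 3.5],  # estimator B
]

ensemble = average_ensemble(members)
member_errors = [mse(m, targets) for m in members]
ensemble_error = mse(ensemble, targets)
mean_member_error = sum(member_errors) / len(member_errors)
```

In this toy case the two estimators err in different directions, so averaging cancels much of the error; estimators with highly correlated errors would benefit far less.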
Constructive Incremental Learning from Only Local Information
, 1998
Cited by 206 (39 self)
... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
Exploring the Relationships between Design Measures and Software Quality in Object-Oriented Systems
, 1998
Cited by 184 (10 self)
The first goal of this paper is to empirically explore the relationships between existing object-oriented coupling, cohesion, and inheritance measures and the probability of fault detection in system classes during testing. In other words, we wish to better understand the relationship between existing design measurement in OO systems and the quality of the software developed. The second goal is to propose an investigation and analysis strategy to make these kinds of studies more repeatable and comparable, a problem which is pervasive in the literature on quality measurement. Results show that many of the measures capture similar dimensions in the data set, thus reflecting the fact that many of them are based on similar principles and hypotheses. However, it is shown that by using a subset of measures, accurate models can be built to predict which classes contain most of the existing faults. When predicting fault-prone classes, the best model shows a percentage of correct clas...
Evaluating the predictive performance of habitat models developed using logistic regression
 Ecological Modelling
, 2000
Cited by 182 (3 self)
The use of statistical models to predict the likely occurrence or distribution of species is becoming an increasingly important tool in conservation planning and wildlife management. Evaluating the predictive performance of models using independent data is a vital step in model development. Such evaluation assists in determining the suitability of a model for specific applications, facilitates comparative assessment of competing models and modelling techniques, and identifies aspects of a model most in need of improvement. The predictive performance of habitat models developed using logistic regression needs to be evaluated in terms of two components: reliability or calibration (the agreement between predicted probabilities of occurrence and observed proportions of sites occupied), and discrimination capacity (the ability of a model to correctly distinguish between occupied and unoccupied sites). Lack of reliability can be attributed to two systematic sources, calibration bias and spread. Techniques are described for evaluating both of these sources of error. The discrimination capacity of logistic regression models is often measured by cross-classifying observations and predictions in a two-by-two table, and calculating indices of classification performance. However, this approach relies on the essentially arbitrary choice of a threshold probability to determine whether or not a site is predicted to be occupied. An alternative approach is described which measures discrimination capacity in terms of the area under a relative operating characteristic (ROC) curve relating relative proportions of correctly and incorrectly classified predictions over a wide and continuous range of threshold levels. Wider application of the techniques promoted in this paper could greatly improve understanding of the usefulness, and potential limitations, of habitat models developed for use in conservation planning and wildlife management. © 2000 Elsevier
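The threshold-free discrimination measure described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it relies on the standard equivalence between the area under the ROC curve and the probability that a randomly chosen occupied site receives a higher predicted probability than a randomly chosen unoccupied site (ties counted as one half). The function name `roc_auc` and the sample values are assumptions.

```python
def roc_auc(predicted_probs, occupied):
    """Area under the ROC curve via pairwise comparison.

    predicted_probs: model output per site (e.g. from logistic regression).
    occupied: truthy where the species was observed, falsy otherwise.
    """
    pos = [p for p, y in zip(predicted_probs, occupied) if y]
    neg = [p for p, y in zip(predicted_probs, occupied) if not y]
    if not pos or not neg:
        raise ValueError("need at least one occupied and one unoccupied site")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # occupied site ranked above unoccupied site
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical sites: perfect ranking, then a ranking no better than chance.
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
chance = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 0, 0, 1])
```

Because no cut-off probability appears anywhere in the computation, the result does not depend on the arbitrary threshold that a two-by-two classification table requires. The double loop is quadratic in the number of sites; a rank-based formulation would scale better on large surveys.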
Error Correlation And Error Reduction In Ensemble Classifiers
, 1996
Cited by 181 (24 self)
Using an ensemble of classifiers, instead of a single classifier, can lead to improved generalization. The gains obtained by combining, however, are often affected more by the selection of what is presented to the combiner than by the actual combining method that is chosen. In this paper we focus on data selection and classifier training methods, in order to "prepare" classifiers for combining. We review a combining framework for classification problems that quantifies the need for reducing the correlation among individual classifiers. Then, we discuss several methods that make the classifiers in an ensemble more complementary. Experimental results are provided to illustrate the benefits and pitfalls of reducing the correlation among classifiers, especially when the training data is in limited supply.
1 Introduction
A classifier's ability to meaningfully respond to novel patterns, or generalize, is perhaps its most important property (Levin et al., 1990; Wolpert, 1990). In...
The Stationary Wavelet Transform and some Statistical Applications
, 1995
Cited by 174 (19 self)
Wavelets are of wide potential use in statistical contexts. The basics of the discrete wavelet transform are reviewed using a filter notation that is useful subsequently in the paper. A `stationary wavelet transform', where the coefficient sequences are not decimated at each stage, is described. Two different approaches to the construction of an inverse of the stationary wavelet transform are set out. The application of the stationary wavelet transform as an exploratory statistical method is discussed, together with its potential use in nonparametric regression. A method of local spectral density estimation is developed. This involves extensions to the wavelet context of standard time series ideas such as the periodogram and spectrum. The technique is illustrated by its application to data sets from astronomy and veterinary anatomy.
1 Introduction
In this paper we discuss some aspects of wavelets with a particular view to their statistical application. In particular we shall be conce...
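To make the phrase "not decimated at each stage" concrete, here is a minimal sketch of a single level of an undecimated transform. It is an illustration under stated assumptions rather than the paper's construction: it uses the Haar filter pair, periodic (circular) boundary handling, and one particular sign convention for the detail coefficients, and the name `haar_swt_level` is hypothetical.

```python
import math


def haar_swt_level(signal, step=1):
    """One level of an undecimated Haar transform with periodic boundaries.

    Returns (smooth, detail), each the same length as the input: no
    coefficients are discarded, unlike the ordinary discrete wavelet
    transform. `step` is the gap between the two filter taps; using
    step = 2**j at level j corresponds to padding the filter with zeros.
    """
    n = len(signal)
    scale = math.sqrt(2.0)
    smooth = [(signal[k] + signal[(k + step) % n]) / scale for k in range(n)]
    detail = [(signal[k] - signal[(k + step) % n]) / scale for k in range(n)]
    return smooth, detail


# Hypothetical data: the transform commutes with circular shifts.
x = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0]
smooth, detail = haar_swt_level(x)

shifted = x[1:] + x[:1]
smooth_shifted, detail_shifted = haar_swt_level(shifted)
```

Shifting the input simply shifts the coefficients, a property the decimated transform lacks and one reason the undecimated form suits exploratory analysis. Deeper levels would apply the function again to `smooth` with `step` doubled each time.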
Estimating Portfolio and Consumption Choice: A Conditional Euler Equations Approach
 JOURNAL OF FINANCE
, 1999
Cited by 163 (16 self)
This paper develops a nonparametric approach to examine how portfolio and consumption choice depends on variables that forecast time-varying investment opportunities. I estimate single-period and multiperiod portfolio and consumption rules of an investor with constant relative risk aversion and a one-month to 20-year horizon. The investor allocates wealth to the NYSE index and a 30-day Treasury bill. I find that the portfolio choice varies significantly with the dividend yield, default premium, term premium, and lagged excess return. Furthermore, the optimal decisions depend on the investor's horizon and rebalancing frequency.
RSVM: Reduced support vector machines
 Data Mining Institute, Computer Sciences Department, University of Wisconsin
, 2001
Cited by 160 (19 self)
An algorithm is proposed which generates a nonlinear kernel-based separating surface that requires as little as 1% of a large dataset for its explicit evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint in an optimization problem with very few variables corresponding to the 1%
The Power of Decision Tables
 Proceedings of the European Conference on Machine Learning
, 1995
Cited by 158 (5 self)
We evaluate the power of decision tables as a hypothesis space for supervised learning algorithms. Decision tables are one of the simplest hypothesis spaces possible, and usually they are easy to understand. Experimental results show that on artificial and real-world domains containing only discrete features, IDTM, an algorithm inducing decision tables, can sometimes outperform state-of-the-art algorithms such as C4.5. Surprisingly, performance is quite good on some datasets with continuous features, indicating that many datasets used in machine learning either do not require these features, or that these features have few values. We also describe an incremental method for performing cross-validation that is applicable to incremental learning algorithms including IDTM. Using incremental cross-validation, it is possible to cross-validate a given dataset and IDTM in time that is linear in the number of instances, the number of features, and the number of label values. The time for incre...