Results 1 - 10 of 135
Locality Preserving Projections, 2002
Cited by 142 (15 self)
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA) -- a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and more crucially is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
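As a rough illustration of the projection described in this abstract, the sketch below follows the commonly cited LPP recipe. It builds a k-nearest-neighbor graph with heat-kernel weights, forms the graph Laplacian, and solves the generalized eigenproblem for the smallest eigenvalues. It is not taken from the paper: the function name `lpp`, the parameters `n_neighbors`, `t`, and `ridge`, the row-per-sample layout, and the random data are all assumptions made for the example.

```python
import numpy as np

def lpp(X, n_components=2, n_neighbors=5, t=1.0, ridge=1e-9):
    """Sketch of Locality Preserving Projections; X has one sample per row.

    Returns a (d, n_components) projection matrix and the matching
    generalized eigenvalues in ascending order.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape

    # Pairwise squared Euclidean distances; the diagonal is excluded from
    # the neighbor search by setting it to infinity.
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(dist2, np.inf)

    # Symmetrized k-nearest-neighbor graph with heat-kernel weights.
    idx = np.argsort(dist2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-dist2[rows, cols] / t)
    W = np.maximum(W, W.T)

    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # graph Laplacian

    # Generalized eigenproblem (X^T L X) a = lambda (X^T D X) a.
    # The small ridge term (an assumption) keeps B positive definite.
    A = X.T @ L @ X
    B = X.T @ D @ X + ridge * np.eye(d)

    # Reduce to a standard symmetric problem through the Cholesky factor of B.
    C = np.linalg.cholesky(B)
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T
    M = (M + M.T) / 2.0
    eigvals, U = np.linalg.eigh(M)   # eigenvalues in ascending order

    P = C_inv.T @ U[:, :n_components]
    return P, eigvals[:n_components]

# Hypothetical data: 60 samples in 5 dimensions.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 5))
P, eigvals = lpp(X_train, n_components=2, n_neighbors=5, t=10.0)
Y_train = X_train @ P              # embedding of the training points
y_new = rng.normal(size=5) @ P     # the same linear map applies to an unseen point
```

The last line reflects the point made in the abstract: because the map is linear, it is defined for any point in the ambient space, not only for the training data.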
Estimating the Generalization Performance of an SVM Efficiently, 2000
Cited by 79 (1 self)
This paper proposes and analyzes an approach to estimating the generalization performance of a support vector machine (SVM) for text classification. Without any computation-intensive resampling, the new estimators are computationally much more efficient than cross-validation or bootstrap, since they can be computed immediately from the form of the hypothesis returned by the SVM. Moreover, the estimators developed here address the special performance measures needed for text classification. While they can be used to estimate error rate, one can also estimate the recall, the precision, and the F1. A theoretical analysis and experiments on three text classification collections show that the new method can effectively estimate the performance of SVM text classifiers in a very efficient way.
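For reference, the performance measures named in this abstract can be written out as below. This sketch only states the standard definitions of precision, recall, and F1 on labeled predictions. It is not the paper's estimator, which works without a held-out set. The function name and the toy labels are assumptions for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Each ratio is defined as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```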
The bootstrap - In Handbook of Econometrics, 2001
Cited by 38 (1 self)
The bootstrap is a method for estimating the distribution of an estimator or test statistic by resampling one’s data. It amounts to treating the data as if they were the population for the purpose of evaluating the distribution of interest. Under mild regularity conditions, the bootstrap yields an approximation to the distribution of an estimator or test statistic that is at least as accurate as the
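The resampling idea in this abstract can be sketched in a few lines: treat the observed sample as the population, draw from it with replacement, and recompute the statistic each time. The sketch below is a minimal nonparametric bootstrap with a simple percentile interval. The function names, the number of resamples, and the sample values are assumptions for illustration, and the percentile interval is only one of several bootstrap interval constructions.

```python
import random
import statistics

def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Approximate the sampling distribution of `statistic` by resampling
    `data` with replacement, treating the sample as the population."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]

def percentile_interval(values, alpha=0.05):
    """Simple percentile interval from the bootstrap replicates."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper

# Hypothetical sample; the statistic of interest here is the mean.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
replicates = bootstrap_distribution(sample, statistics.mean)
low, high = percentile_interval(replicates)
```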
The grid bootstrap and the autoregressive model - Review of Economics and Statistics, 1999
Cited by 38 (0 self)
A “grid” bootstrap method is proposed for confidence-interval construction, which has improved performance over conventional bootstrap methods when the sampling distribution depends upon the parameter of interest. The basic idea is to calculate the bootstrap distribution over a grid of values of the parameter of interest and form the confidence interval by the no-rejection principle. Our primary motivation is given by autoregressive models, where it is known that conventional bootstrap methods fail to provide correct first-order asymptotic coverage when an autoregressive root is close to unity. In contrast, the grid bootstrap is first-order correct globally in the parameter space. Simulation results verify these insights, suggesting that the grid bootstrap provides an important improvement over conventional methods. Gauss code that calculates the grid bootstrap intervals (and replicates the empirical work reported in this paper) is available from the author’s Web page at www.ssc.wisc.edu/~bhansen.
Testing for Linearity - Journal of Economic Surveys, 1999
Cited by 23 (1 self)
The problem of testing for linearity and the number of regimes in the context of self-exciting threshold autoregressive (SETAR) models is reviewed. We describe least-squares methods of estimation and inference. The primary complication is that the testing problem is non-standard, due to the presence of parameters which are only defined under the alternative, so the asymptotic distribution of the test statistics is non-standard. Simulation methods to calculate asymptotic and bootstrap distributions are presented. As the sampling distributions are quite sensitive to conditional heteroskedasticity in the error, careful modeling of the conditional variance is necessary for accurate inference on the conditional mean. We illustrate these methods with two applications: annual sunspot means and monthly U.S. industrial production. We find that annual sunspots and monthly industrial production are SETAR(2) processes. Keywords. SETAR models; Thresholds; Non-standard asymptotic theory; Bootstrap
Monte Carlo test methods in econometrics - Companion to Theoretical Econometrics, Blackwell Companions to Contemporary Economics, 2001
Cited by 15 (11 self)
The authors thank three anonymous referees and the Editor Badi Baltagi for several useful comments. This work was supported by the Bank of Canada and by grants from the Canadian Network of Centres of Excellence [program on Mathematics
Accelerated Degradation Tests: Modeling Analysis, 1999
Cited by 15 (10 self)
High reliability systems generally require individual system components having extremely high reliability over long periods of time. Short product development times require reliability tests to be conducted with severe time constraints. Frequently few or no failures occur during such tests, even with acceleration. Thus, it is difficult to assess reliability with traditional life tests that record only failure times. For some components, degradation measures can be taken over time. A relationship between component failure and amount of degradation makes it possible to use degradation models and data to make inferences and predictions about a failure-time distribution. This paper describes degradation reliability models that correspond to physical-failure mechanisms. We explain the connection between degradation reliability models and failure-time reliability models. Acceleration is modeled by having an acceleration model that describes the effect that temperature (or another accelerating vari...
Bayesian Statistics - in WWW', Computing Science and Statistics, 1989
Cited by 13 (0 self)
This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection and the second one is a convex hull peeling depth approach to nonparametric massive multivariate data analysis. The second topic includes simulations and applications on massive astronomical data. First, we present a model selection criterion, minimizing the Kullback-Leibler distance by using the jackknife method. Various model selection methods have been developed to choose a model of minimum Kullback-Leibler distance to the true model, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), minimum description length (MDL), and the bootstrap information criterion. Likewise, the jackknife method chooses a model of minimum Kullback-Leibler distance through bias reduction. This bias, which is inevitable in model
Neural network regularization and ensembling using multi-objective evolutionary algorithms - In: Congress on Evolutionary Computation (CEC’04), IEEE, 2004
Cited by 12 (2 self)
Regularization is an essential technique to improve generalization of neural networks. Traditionally, regularization is conducted by including an additional term in the cost function of a learning algorithm. One main drawback of these regularization techniques is that a hyperparameter that determines the extent to which the regularization influences the learning algorithm must be determined beforehand. This paper addresses the neural network regularization problem from a multi-objective optimization point of view. During the optimization, both structure and parameters of the neural network will be optimized. Slightly modified versions of two multi-objective optimization algorithms, the dynamic weighted aggregation (DWA) method and the elitist non-dominated sorting genetic algorithm (NSGA-II), are used and compared. An evolutionary multi-objective approach to neural network regularization has a number of advantages compared to the traditional methods. First, a number of models with a spectrum of model complexity can be obtained in one optimization run instead of only one single solution. Second, an efficient new regularization term can be introduced, which is not applicable to gradient-based learning algorithms. As a natural by-product of the multi-objective optimization approach to neural network regularization, neural network ensembles can be easily constructed using the obtained networks with different levels of model complexity. Thus, the model complexity of the ensemble can be adjusted by adjusting the weight of each member network in the ensemble. Simulations are carried out on a test function to illustrate the feasibility of the proposed ideas.

