Results 1 - 10 of 455
Locality Preserving Projections
, 2002
Abstract

Cited by 405 (16 self)
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA), a classical linear technique that projects the data along the directions of maximal variance. When the high-dimensional data lies on a low-dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and, more crucially, is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high-dimensional data sets.
The bootstrap
 In Handbook of Econometrics
, 2001
Abstract

Cited by 175 (2 self)
The bootstrap is a method for estimating the distribution of an estimator or test statistic by resampling one’s data. It amounts to treating the data as if they were the population for the purpose of evaluating the distribution of interest. Under mild regularity conditions, the bootstrap yields an approximation to the distribution of an estimator or test statistic that is at least as accurate as the
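The resampling idea in this abstract can be illustrated with a minimal sketch. It is not taken from the handbook chapter: the function names, the made-up sample, the number of resamples, and the choice of a simple percentile interval are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Approximate the sampling distribution of `statistic` by treating
    `data` as the population and resampling it with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(draws, level=0.90):
    """Read a simple percentile interval off the sorted bootstrap draws."""
    ordered = sorted(draws)
    alpha = (1.0 - level) / 2.0
    lower = ordered[int(alpha * (len(ordered) - 1))]
    upper = ordered[int((1.0 - alpha) * (len(ordered) - 1))]
    return lower, upper


# Made-up illustrative sample (an assumption, not data from the source).
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
draws = bootstrap_distribution(sample, statistics.mean)
low, high = percentile_interval(draws)
print(f"bootstrap 90% percentile interval for the mean: ({low:.2f}, {high:.2f})")
```

Any statistic that maps a list of observations to a number can be passed in place of `statistics.mean`. The percentile interval is only one of several ways to summarize the bootstrap draws.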
K.L.: Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664
 Information Systems and Data Analysis
, 1997
Abstract

Cited by 127 (4 self)
Population stratification has long been recognized as a confounding factor in genetic association studies. Estimated ancestries, derived from multilocus genotype data, can be used as covariates to correct for population stratification. One popular technique for estimation of ancestry is the model-based approach embodied by the widely applied program structure. Another approach, implemented in the program eigenstrat, relies on principal component analysis rather than model-based estimation and does not directly deliver admixture fractions. eigenstrat has gained in popularity in part due to its remarkable speed in comparison to structure. We present a new algorithm and a program, admixture, for model-based estimation of ancestry in unrelated individuals. admixture adopts the likelihood model embedded in structure. However, admixture runs considerably faster, solving problems in minutes that take structure hours. In many of our experiments we have found that admixture is almost as fast as eigenstrat. The runtime improvements of admixture rely on a fast block relaxation scheme using sequential quadratic programming for block updates, coupled with a novel quasi-Newton acceleration of convergence. Our algorithm also runs faster and with greater accuracy than the implementation of an Expectation-Maximization (EM) algorithm incorporated in the program frappe. Our simulations show that admixture’s maximum likelihood estimates of the underlying admixture coefficients and ancestral allele frequencies are as accurate as structure’s Bayesian estimates. On real-world datasets, admixture’s estimates are directly comparable to those from structure and eigenstrat. Taken together, our results show that admixture’s computational speed opens up the possibility of using a much larger set of markers in model-based ancestry estimation and that its estimates are suitable for use in correcting for population stratification in association studies.
Estimating the Generalization Performance of an SVM Efficiently
, 2000
Abstract

Cited by 115 (1 self)
This paper proposes and analyzes an approach to estimating the generalization performance of a support vector machine (SVM) for text classification. Without any computation-intensive resampling, the new estimators are computationally much more efficient than cross-validation or bootstrap, since they can be computed immediately from the form of the hypothesis returned by the SVM. Moreover, the estimators developed here address the special performance measures needed for text classification. While they can be used to estimate error rate, one can also estimate the recall, the precision, and the F1. A theoretical analysis and experiments on three text classification collections show that the new method can effectively estimate the performance of SVM text classifiers in a very efficient way.
The grid bootstrap and the autoregressive model
 Review of Economics and Statistics
, 1999
Abstract

Cited by 103 (0 self)
A "grid" bootstrap method is proposed for confidence-interval construction, which has improved performance over conventional bootstrap methods when the sampling distribution depends upon the parameter of interest. The basic idea is to calculate the bootstrap distribution over a grid of values of the parameter of interest and form the confidence interval by the no-rejection principle. Our primary motivation is given by autoregressive models, where it is known that conventional bootstrap methods fail to provide correct first-order asymptotic coverage when an autoregressive root is close to unity. In contrast, the grid bootstrap is first-order correct globally in the parameter space. Simulation results verify these insights, suggesting that the grid bootstrap provides an important improvement over conventional methods. Gauss code that calculates the grid bootstrap intervals, and replicates the empirical work reported in this paper, is available from the author’s Web page at www.ssc.wisc.edu/~bhansen.
Testing for Linearity
 Journal of Economic Surveys
, 1999
Abstract

Cited by 89 (1 self)
The problem of testing for linearity and the number of regimes in the context of self-exciting threshold autoregressive (SETAR) models is reviewed. We describe least-squares methods of estimation and inference. The primary complication is that the testing problem is non-standard, due to the presence of parameters which are only defined under the alternative, so the asymptotic distribution of the test statistics is non-standard. Simulation methods to calculate asymptotic and bootstrap distributions are presented. As the sampling distributions are quite sensitive to conditional heteroskedasticity in the error, careful modeling of the conditional variance is necessary for accurate inference on the conditional mean. We illustrate these methods with two applications: annual sunspot means and monthly U.S. industrial production. We find that annual sunspots and monthly industrial production are SETAR(2) processes. Keywords: SETAR models; Thresholds; Non-standard asymptotic theory; Bootstrap
Stepwise multiple testing as formalized data snooping
 Econometrica
, 2005
Abstract

Cited by 75 (7 self)
It is common in econometric applications that several hypothesis tests are carried out at the same time. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. In this paper, we suggest a stepwise multiple testing procedure which asymptotically controls the familywise error rate at a desired level. Compared to related single-step methods, our procedure is more powerful in the sense that it often will reject more false hypotheses. In addition, we advocate the use of studentization when it is feasible. Unlike some stepwise methods, our method implicitly captures the joint dependence structure of the test statistics, which results in increased ability to detect alternative hypotheses. We prove our method asymptotically controls the familywise error rate under minimal assumptions. We present our methodology in the context of comparing several strategies to a common benchmark and deciding which strategies actually beat the benchmark. However, our ideas can easily be extended and/or modified to other contexts, such as making inference for the individual regression coefficients in a multiple regression framework. Some simulation studies show the improvements of our methods over previous proposals. We also provide an application to a set of real data.
Analysis of Complex Survey Samples
 Journal of Statistical Software
, 2004
Abstract

Cited by 71 (0 self)
I present software for analysing complex survey samples in R. The sampling scheme can be explicitly described or represented by replication weights. Variance estimation uses either replication or linearisation.
Task-dependent viscoelasticity of human multijoint arm and its spatial characteristics for interaction with environments
 J Neurosci
, 1999
Abstract

Cited by 61 (6 self)
Human arm viscoelasticity is important in stabilizing posture, movement, and in interacting with objects. Viscoelastic spatial characteristics are usually indexed by the size, shape, and orientation of a hand stiffness ellipse. It is well known that arm posture is a dominant factor in determining the properties of the stiffness ellipse. However, it is still unclear how much joint stiffness can change under different conditions, and the effects of that change on the spatial characteristics of hand stiffness are poorly examined. To investigate the dexterous control mechanisms of the human arm, we studied the controllability and spatial characteristics of viscoelastic properties of human multijoint arm during different cocontractions and force interactions in various directions and amplitudes in a horizontal plane. We found that different cocontraction ratios between shoulder and elbow joints can produce changes in the shape and orien
Maximum Likelihood and the Bootstrap for Nonlinear Dynamic Models
, 2002
Abstract

Cited by 48 (5 self)
We provide a unified framework for analyzing bootstrapped extremum estimators of nonlinear dynamic models for heterogeneous dependent stochastic processes. We apply our results to the moving blocks bootstrap of Künsch (1989) and Liu and Singh (1992) and prove the first-order asymptotic validity of the bootstrap approximation to the true distribution of quasi-maximum likelihood estimators. We also consider bootstrap testing. In particular, we prove the first-order asymptotic validity of the bootstrap distribution of suitable bootstrap analogs of Wald and Lagrange Multiplier statistics for testing hypotheses.
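For readers unfamiliar with the moving blocks bootstrap named in this abstract, a generic sketch of the resampling step follows. It is an illustration, not the paper's procedure: the function name, the block length, the made-up series, and the decision to trim the final block to the original length are all assumptions.

```python
import random


def moving_blocks_resample(series, block_length, rng):
    """Draw one pseudo-series of the same length as `series` by
    concatenating randomly chosen overlapping blocks of consecutive
    observations, which keeps short-range dependence within each block."""
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    n_starts = n - block_length + 1  # number of overlapping blocks
    resampled = []
    while len(resampled) < n:
        start = rng.randrange(n_starts)
        resampled.extend(series[start:start + block_length])
    return resampled[:n]  # trim the last block if it overshoots


# Made-up illustrative series (an assumption, not data from the source).
rng = random.Random(0)
series = [0.5, 0.7, 0.4, 0.9, 1.1, 0.8, 0.6, 1.0, 1.2, 0.9, 0.7, 0.5]
pseudo = moving_blocks_resample(series, block_length=3, rng=rng)
print(pseudo)
```

An estimator would then be recomputed on many such pseudo-series to approximate its sampling distribution. The block length controls how much of the serial dependence is retained.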