Sliced inverse regression for dimension reduction
 J. AMER. STATIST. ASSOC
, 1991
Nonparametric regression using Bayesian variable selection
 Journal of Econometrics
, 1996
This paper estimates an additive model semiparametrically, while automatically selecting the significant independent variables and the appropriate power transformation of the dependent variable. The nonlinear variables are modeled as regression splines, with significant knots selected from a large number of candidate knots. The estimation is made robust by modeling the errors as a mixture of normals. A Bayesian approach is used to select the significant knots, the power transformation, and to identify outliers using the Gibbs sampler to carry out the computation. Empirical evidence is given that the sampler works well on both simulated and real examples and that in the univariate case it compares favorably with a kernel-weighted local linear smoother. The variable selection algorithm in the paper is substantially faster than previous Bayesian variable selection algorithms. Key words: Additive model; Power transformation; Robust estimation
Model-Based Clustering and Data Transformations for Gene Expression Data
, 2001
Motivation: Clustering is a useful exploratory technique for the analysis of gene expression data. Many different heuristic clustering algorithms have been proposed in this context. Clustering algorithms based on probability models offer a principled alternative to heuristic algorithms. In particular, model-based clustering assumes that the data is generated by a finite mixture of underlying probability distributions such as multivariate normal distributions. The issues of selecting a 'good' clustering method and determining the 'correct' number of clusters are reduced to model selection problems in the probability framework. Gaussian mixture models have been shown to be a powerful tool for clustering in many applications.
The bootstrap
 In Handbook of Econometrics
, 2001
The bootstrap is a method for estimating the distribution of an estimator or test statistic by resampling one’s data. It amounts to treating the data as if they were the population for the purpose of evaluating the distribution of interest. Under mild regularity conditions, the bootstrap yields an approximation to the distribution of an estimator or test statistic that is at least as accurate as the
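The resampling idea described in this abstract can be sketched as follows. This is a minimal illustration, not code from the cited chapter: the function names, the toy sample, the number of resamples, and the choice of a percentile interval are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_distribution(data, statistic, n_resamples=2000, seed=0):
    """Evaluate `statistic` on resamples drawn from `data` with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    results = []
    for _ in range(n_resamples):
        # Treat the observed data as the population: draw n points with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        results.append(statistic(resample))
    return results


def percentile_interval(values, level=0.95):
    """Simple percentile interval read off the sorted bootstrap values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


# Hypothetical toy sample, used only to exercise the functions.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
boot_means = bootstrap_distribution(sample, statistics.mean)
low, high = percentile_interval(boot_means)
print(f"sample mean = {statistics.mean(sample):.3f}")
print(f"95% percentile interval = ({low:.3f}, {high:.3f})")
```

The list `boot_means` approximates the sampling distribution of the mean; any other statistic can be passed in place of `statistics.mean`.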
Sampling and Bayes inference in scientific modeling and robustness (with discussion)
 Journal of the Royal Statistical Society, Series A
, 1980
There is a risk-return tradeoff after all
 Journal of Financial Economics
, 2005
This paper studies the ICAPM intertemporal relation between conditional mean and conditional variance of the aggregate stock market return. We introduce a new estimator that forecasts monthly variance with past daily squared returns: the Mixed Data Sampling (or MIDAS) approach. Using MIDAS, we find that there is a significantly positive relation between risk and return in the stock market. This finding is robust in subsamples, to asymmetric specifications of the variance process, and to controlling for variables associated with the business cycle. We compare the MIDAS results with other tests of the ICAPM based on alternative conditional variance specifications and explain the conflicting results in the literature. Finally, we offer new insights about the dynamics of conditional variance.
A variance-stabilizing transformation for gene-expression microarray data
 Bioinformatics
, 2002
Motivation: A variance stabilizing transformation for microarray data was recently introduced independently by several research groups. This transformation has sometimes been called the generalized logarithm or glog transformation. In this paper, we derive several alternative approximate variance stabilizing transformations that may be easier to use in some applications. Results: We demonstrate that the started-log and the log-linear-hybrid transformation families can produce approximate variance stabilizing transformations for microarray data that are nearly as good as the generalized logarithm (glog) transformation. These transformations may be more convenient in some applications.
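The three transformation families named in this abstract can be sketched as below. The abstract gives no formulas, so the parameterizations here are assumptions based on commonly used forms: glog as ln(z + sqrt(z^2 + lam)), started log as ln(z + c), and a log-linear hybrid that is linear below a cutoff k and logarithmic above it. The parameter values are hypothetical.

```python
import math


def glog(z, lam):
    """Generalized logarithm (assumed form): ln(z + sqrt(z^2 + lam)), lam > 0."""
    return math.log(z + math.sqrt(z * z + lam))


def started_log(z, c):
    """Started logarithm (assumed form): ln(z + c), defined for z > -c."""
    return math.log(z + c)


def log_linear_hybrid(z, k):
    """Assumed hybrid: linear for z <= k, logarithmic for z > k.

    The linear piece z/k + ln(k) - 1 matches ln(z) in value and slope at z = k.
    """
    if z <= k:
        return z / k + math.log(k) - 1.0
    return math.log(z)


# Hypothetical intensities and parameters, chosen only for illustration.
for z in (0.0, 5.0, 50.0, 5000.0):
    print(z, glog(z, lam=4.0), started_log(z, c=2.0), log_linear_hybrid(z, k=10.0))
```

Under these assumed forms, all three stay finite at zero intensity, and for large z the glog differs from ln(z) by approximately the constant ln(2).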
Nonparametric analysis of a generalized regression model: the maximum rank correlation estimator
 Journal of the Royal Statistical Society
, 1977
The paper considers estimation of a model $y_i = D \circ F(x_i'\beta_0, u_i)$, where the composite transformation $D \circ F$ is only specified that $D: \mathbb{R} \to \mathbb{R}$ is nondegenerate monotonic and $F: \mathbb{R}^2 \to \mathbb{R}$ is strictly monotonic in each of its variables. The paper thus generalizes standard data analysis which assumes that the functional form of $D \circ F$ is known and additive. The estimator which it proposes is the maximum rank correlation estimator which is nonparametric in the functional form of $D \circ F$ and nonparametric in the distribution of the error terms, $u_i$. The estimator is shown to be strongly consistent for the parameters $\beta_0$ up to a scale coefficient.
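A rank-correlation estimator of this kind can be sketched as follows. This is an illustrative toy, not the paper's procedure: the objective counts pairs whose ranking in y agrees with their ranking in the linear index, and the grid search over unit vectors, the two-regressor setup, the exponential transformation, and all parameter values are assumptions made for the example. Because the coefficients are identified only up to scale, candidates are restricted to the unit circle.

```python
import math
import random


def linear_index(beta, row):
    """Inner product x'beta for one observation."""
    return sum(b * x for b, x in zip(beta, row))


def rank_correlation(beta, xs, ys):
    """Count ordered pairs (i, j) ranked the same way by y and by x'beta."""
    index = [linear_index(beta, row) for row in xs]
    n = len(ys)
    return sum(
        1
        for i in range(n)
        for j in range(n)
        if ys[i] > ys[j] and index[i] > index[j]
    )


def fit_direction(xs, ys, n_angles=180):
    """Grid search over unit vectors (cos t, sin t) for the best rank agreement."""
    candidates = [
        (math.cos(2 * math.pi * k / n_angles), math.sin(2 * math.pi * k / n_angles))
        for k in range(n_angles)
    ]
    return max(candidates, key=lambda beta: rank_correlation(beta, xs, ys))


# Simulated data: y is an unknown-to-the-estimator monotone transform of x'beta + u.
rng = random.Random(1)
true_beta = (0.6, 0.8)  # hypothetical unit-length coefficient vector
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(80)]
ys = [math.exp(linear_index(true_beta, row) + rng.gauss(0, 0.2)) for row in xs]

estimate = fit_direction(xs, ys)
cosine = linear_index(true_beta, estimate)  # both vectors have unit length
print("estimated direction:", estimate)
print("cosine with true direction:", round(cosine, 3))
```

The estimator never uses the exponential form of the transformation; only the ordering of the y values enters the objective.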
Empirical Analysis of CK Metrics for Object-Oriented Design Complexity: Implications for Software Defects
 IEEE Trans. Software Eng
, 2003
To produce high quality object-oriented (OO) applications, a strong emphasis on design aspects, especially during the early phases of software development, is necessary. Design metrics play an important role in helping developers understand design aspects of software and, hence, improve software quality and developer productivity. In this paper, we provide empirical evidence supporting the role of OO design complexity metrics, specifically a subset of the Chidamber and Kemerer suite, in determining software defects. Our results, based on industry data from software developed in two popular programming languages used in OO development, indicate that, even after controlling for the size of the software, these metrics are significantly associated with defects. In addition, we find that the effects of these metrics on defects vary across the samples from two programming languages, C++ and Java. We believe that these results have significant implications for designing high-quality software products using the OO approach.
All in the Family: Nesting Symmetric and Asymmetric GARCH Models
 Journal of Financial Economics
, 1995
This paper develops a parametric family of models of generalized autoregressive heteroskedasticity (GARCH). The family nests the most popular symmetric and asymmetric GARCH models, thereby highlighting the relation between the models and their treatment of asymmetry. Furthermore, the structure permits nested tests of different types of asymmetry and functional forms. Daily U.S. stock return data reject all standard GARCH models in favor of a model in which, roughly speaking, the conditional standard deviation depends on the shifted absolute value of the shocks raised to the power three halves and past standard deviations.
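The verbal description of the preferred model can be illustrated with a simplified recursion. This is a hedged sketch, not the paper's parameterization: the functional form sigma_t = omega + alpha * sigma_{t-1} * |z_{t-1} - shift|^1.5 + beta * sigma_{t-1}, the names, and every parameter value are assumptions chosen only to show how a shifted absolute shock produces asymmetry.

```python
def next_sigma(sigma_prev, shock_prev, omega=0.05, alpha=0.10, beta=0.85,
               shift=0.3, power=1.5):
    """One step of an assumed standard-deviation recursion.

    The standardized shock is shifted before taking the absolute value, so
    negative and positive shocks of equal size move volatility differently.
    """
    news = abs(shock_prev - shift) ** power
    return omega + alpha * sigma_prev * news + beta * sigma_prev


def simulate_sigma(shocks, sigma0=1.0, **params):
    """Iterate the recursion over a sequence of standardized shocks."""
    path = [sigma0]
    for z in shocks:
        path.append(next_sigma(path[-1], z, **params))
    return path


# Hypothetical shock sequence: one large negative shock followed by calm.
path = simulate_sigma([-2.0, 0.0, 0.0, 0.0])
print([round(s, 4) for s in path])
print("after -1 shock:", next_sigma(1.0, -1.0))
print("after +1 shock:", next_sigma(1.0, 1.0))
```

With a positive `shift`, a negative shock raises the next-period standard deviation more than a positive shock of the same magnitude; setting `shift=0` makes the response symmetric.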