Results 1–6 of 6
When Second Best Is Good Enough: A Comparison Between a True Experiment and a Regression Discontinuity Quasi-Experiment, 2009
Cited by 1 (1 self)
Abstract: In this paper, we compare the results from a randomized clinical trial to the results from a regression discontinuity quasi-experiment when both designs are implemented in the same setting. We find that the results from the two approaches are effectively identical.
Forecasts of Violence to Inform Sentencing Decisions. Journal of Quantitative Criminology, forthcoming, 2013
Cited by 1 (1 self)
Abstract: Behavioral forecasts have informed parole decisions in the United States since the 1920s (Borden, 1928; Burgess, 1928). Over the decades, these forecasts have increasingly relied on quantitative methods that some would call actuarial (Messinger and Berk, 1987; Feeley and Simon, 1994).
What You Can and Can’t Properly Do With Regression, 2010
Abstract: Regression analysis, broadly construed, has over the past 60 years become the dominant statistical paradigm within the social sciences and criminology. In its most canonical and popular form, a regression analysis becomes a “structural equation model” from which “causal effects” can be estimated.
Valid Post-Selection Inference. Submitted to the Annals of Statistics
Abstract: It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to one of simultaneous inference. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing “simultaneity insurance” for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.
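The simultaneous-inference idea in this abstract can be illustrated with a small simulation: under a fixed design, every coefficient estimate in every submodel is a linear function of the response, so a simultaneous critical constant can be approximated by simulating the maximum |t| over all those directions. This is only an illustrative sketch of the reduction the abstract describes, not the paper's method or code; the design, the dimensions, and the known-variance simplification are assumptions made here for brevity.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p, n_sim = 50, 4, 2000
X = rng.normal(size=(n, p))

# For each submodel M and each j in M, the OLS coefficient estimate is a
# linear function l' y of the response.  Collect the unit-norm directions
# l for every coefficient in every nonempty submodel.
dirs = []
for k in range(1, p + 1):
    for M in itertools.combinations(range(p), k):
        # rows of the pseudo-inverse of X[:, M] give the linear functionals
        for row in np.linalg.pinv(X[:, M]):
            dirs.append(row / np.linalg.norm(row))
dirs = np.array(dirs)

# With sigma known and y ~ N(0, I), each t-statistic is just l' y for a
# unit-norm l.  Simulate the max |t| over all submodel coefficients and
# take its 95th percentile as the simultaneous constant K.
maxt = np.abs(dirs @ rng.normal(size=(n, n_sim))).max(axis=0)
K = np.quantile(maxt, 0.95)
print(K > 1.96)  # True: simultaneity widens the interval beyond the usual z
```

Because K covers only the directions that can arise as submodel coefficients, it stays below the Scheffé constant, which protects all directions at once; this is the sense in which the procedure is "always less conservative than full Scheffé protection."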
Stochastic Gradient Boosting, 2010
Abstract: In many metropolitan areas, efforts are made to count the homeless to ensure proper provision of social services. Some areas are very large, which makes spatial sampling a viable alternative to an enumeration of the entire terrain. Counts are observed in sampled regions but must be imputed in unvisited areas. Along with the imputation process, the costs of underestimating and overestimating may differ. For example, if precise estimation in areas with large homeless counts is critical, then underestimation should be penalized more than overestimation in the loss function. We analyze data from the 2004–2005 Los Angeles County homeless study using an augmentation of L1 stochastic gradient boosting that can weight overestimates and underestimates asymmetrically. We discuss our choice of stochastic gradient boosting over other function estimation procedures. In-sample fitted and out-of-sample imputed values, as well as relationships between the response and predictors, are analyzed for various cost functions. We gratefully acknowledge numerous helpful discussions with Greg Ridgeway.
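The asymmetric weighting described in this abstract corresponds to boosting under a pinball (quantile) loss, which generalizes L1 loss by charging underestimates and overestimates at different rates. Below is a minimal sketch using scikit-learn's `GradientBoostingRegressor`; the data are synthetic stand-ins, not the Los Angeles County counts, and the parameter values are illustrative assumptions, not those of the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical stand-in data for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 2))
y = 5 * X[:, 0] + rng.exponential(2, size=500)

# Pinball loss with parameter alpha charges an underestimate alpha per
# unit and an overestimate (1 - alpha) per unit, so alpha = 0.8 penalizes
# underestimation four times as heavily; alpha = 0.5 recovers symmetric
# L1 (median) boosting.  subsample < 1 gives the "stochastic" part.
model = GradientBoostingRegressor(loss="quantile", alpha=0.8,
                                  n_estimators=200, subsample=0.5,
                                  random_state=0)
model.fit(X, y)
pred = model.predict(X)

# With the asymmetric loss, well under half the fitted values undershoot.
undershoot_rate = np.mean(pred < y)
```

Setting `alpha = 0.5` and refitting would restore the symmetric L1 behavior, which makes the asymmetry a single tunable knob on the cost function, in the spirit of the augmentation the abstract describes.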
Package hdlm: Regression Tables for High Dimensional Linear Model Estimation
Abstract: We present the R package hdlm, created to facilitate the study of high-dimensional datasets. Our emphasis is on the production of regression tables and a class ‘hdlm’ for which new extensions can easily be written. We model our work on the functionality provided for linear and generalized linear models by the functions lm and glm in the recommended package stats. Reasonable default options have been selected so that the package may be used immediately by anyone familiar with the low-dimensional variants; however, a generic procedure for using alternative point estimators is also provided. Two techniques are given for constructing high-dimensional regression tables. The first uses the two-stage approach of Wasserman and Roeder (2009), with the generalization proposed by Meinshausen, Meier, and Bühlmann (2009) to increase robustness, to calculate high-dimensional p-values. We introduce and implement a novel method for generalizing these p-value methods to confidence intervals. The second technique constructs regression tables using a hierarchical Bayesian approach solved via Gibbs-sampling MCMC. In this article, we focus on design choices made in the package, relevant computational issues, and approaches to changing the default options.
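The two-stage approach credited above to Wasserman and Roeder rests on data splitting: screen variables on one half of the data, then run classical inference on the held-out half so the p-values are not contaminated by selection. The sketch below illustrates that idea in Python rather than R; it is not the hdlm implementation, and the data, dimensions, and use of lasso screening are assumptions made here for illustration.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LassoCV

# Synthetic sparse regression problem (assumed for illustration).
rng = np.random.default_rng(1)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = 2.0                      # only the first 3 variables matter
y = X @ beta + rng.normal(size=n)

# Stage 1: variable screening on the first half of the data.
half = n // 2
sel = np.flatnonzero(LassoCV(cv=5).fit(X[:half], y[:half]).coef_ != 0)

# Stage 2: ordinary least squares on the held-out half, restricted to the
# screened variables, with standard t-based p-values.
X2, y2 = X[half:, sel], y[half:]
XtX_inv = np.linalg.inv(X2.T @ X2)
bhat = XtX_inv @ X2.T @ y2
resid = y2 - X2 @ bhat
df = len(y2) - len(sel)
s2 = resid @ resid / df
se = np.sqrt(s2 * np.diag(XtX_inv))
pvals = 2 * stats.t.sf(np.abs(bhat / se), df)
```

Because stage 2 never sees the data used for selection, the resulting p-values are valid conditional on the screened set, which is what makes a defensible high-dimensional regression table possible in the first place.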