Two Estimators of the Mean of a Counting Process with Panel Count Data
, 1998
Cited by 18 (11 self)
We study two estimators of the mean function of a counting process based on "panel count data". The setting for "panel count data" is one in which n independent subjects, each with a counting process with common mean function, are observed at several possibly different times during a study. Following a model proposed by Schick and Yu (1997), we allow the number of observation times, and the observation times themselves, to be random variables. Our goal is to estimate the mean function of the counting process. We show that the estimator of the mean function proposed by Sun and Kalbfleisch (1995) can be viewed as a pseudo-maximum likelihood estimator when a nonhomogeneous Poisson process model is assumed for the counting process. We establish consistency of both the nonparametric pseudo-maximum likelihood estimator of Sun and Kalbfleisch (1995) and the full maximum likelihood estimator, even if the underlying counting process is not a Poisson process. We also derive the asymptotic distribution of both estimators at a fixed time t, and compare the resulting theoretical relative efficiency with finite sample relative efficiency by way of a limited Monte Carlo study.
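The pseudo-maximum likelihood estimator mentioned in this abstract is commonly described as a weighted isotonic regression: pool all (time, count) pairs across subjects, average the counts at each distinct observation time, and fit a nondecreasing sequence by weighted least squares. The sketch below illustrates that reading under stated assumptions; the function names `pava` and `pseudo_mle_mean` and the input layout are hypothetical, and the abstract itself does not give an algorithm.

```python
from collections import defaultdict


def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def pseudo_mle_mean(panel):
    """Estimate the mean function at the distinct observation times.

    panel: list of subjects, each a list of (time, cumulative_count) pairs
    (an assumed input layout, chosen for illustration).
    """
    total = defaultdict(float)
    n_obs = defaultdict(int)
    for subject in panel:
        for t, count in subject:
            total[t] += count
            n_obs[t] += 1
    times = sorted(total)
    means = [total[t] / n_obs[t] for t in times]   # average count at each time
    weights = [n_obs[t] for t in times]            # number of observations there
    return times, pava(means, weights)


if __name__ == "__main__":
    panel = [[(1, 3)], [(2, 1)], [(3, 4)]]
    print(pseudo_mle_mean(panel))
```

Pooling ignores the dependence between counts from the same subject, which is why the result is a pseudo-likelihood rather than a full likelihood estimator.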
Manual Controls For High-Dimensional Data Projections
- Journal of Computational and Graphical Statistics
, 1997
Cited by 17 (12 self)
Projections of high-dimensional data onto low-dimensional subspaces provide insightful views for understanding multivariate relationships. In this paper we discuss how to manually control the variable contributions to the projection. The user has control of the way a particular variable contributes to the viewed projection and can interactively adjust the variable's contribution. These manual controls complement the automatic views provided by a grand tour, or a guided tour, and give greatly improved flexibility to data analysts.
1 Introduction
This paper builds on dynamic visualization methods for high-dimensional data using low-dimensional projections. Among these methods, the most familiar are 3-D data rotations, generated by displaying a continuous sequence of 2-D projections of 3-D data. From a statistical perspective it is rare to have data that are strictly 3-D, and so, unlike most computer graphics applications, the more useful methods for data analysis show projections from a...
On the cost of data analysis
- Journal of Computational and Graphical Statistics
, 1992
Cited by 15 (2 self)
A regression analysis usually consists of several stages such as variable selection, transformation and residual diagnosis. Inference is often made from the selected model without regard to the model selection methods that preceded it. This can result in overoptimistic and biased inferences. We first characterize data analytic actions as functions acting on regression models. We investigate the extent of the problem and test bootstrap, jackknife and sample splitting methods for ameliorating it. We also demonstrate an interactive LISP-STAT system for assessing the cost of the data analysis while it is taking place.
Penalized Regression with Model-Based Penalties
, 2000
Cited by 11 (2 self)
Nonparametric regression techniques such as spline smoothing and local fitting depend implicitly on a parametric model. For instance, the cubic smoothing spline estimate of a regression function $\mu$ based on observations $t_i, Y_i$ is the minimizer of $\sum_i \{Y_i - \mu(t_i)\}^2 + \lambda \int (\mu'')^2$. Since $\int (\mu'')^2$ is zero when $\mu$ is a line, the cubic smoothing spline estimate favors the parametric model $\mu(t) = \alpha_0 + \alpha_1 t$. Here the authors consider replacing $\int (\mu'')^2$ with the more general expression $\int (L\mu)^2$, where $L$ is a linear differential operator with possibly nonconstant coefficients. The resulting estimate of $\mu$ performs well, particularly if $L\mu$ is small. They present an $O(n)$ algorithm for the computation of $\mu$. This algorithm is applicable to a wide class of $L$'s. They also suggest a method for the estimation of $L$. They study their estimates via simulation and apply them to several data sets. Résumé: Nonparametric regression techniques such as local fitting or ...
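To make the penalized least-squares criterion in this abstract concrete, here is a minimal discrete analogue, assuming equally spaced observation times and replacing the integrated squared second derivative with a sum of squared second differences. This is not the authors' O(n) algorithm: the function name `second_difference_smoother` is hypothetical, and the dense solve below costs O(n^3), chosen only for readability.

```python
def second_difference_smoother(y, lam):
    """Minimize sum_i (y_i - m_i)^2 + lam * sum_i (m_{i-1} - 2*m_i + m_{i+1})^2.

    Setting the gradient to zero gives the linear system (I + lam * D^T D) m = y,
    where D is the (n-2) x n second-difference matrix.
    """
    n = len(y)
    # Start from the identity and add lam * D^T D one stencil row at a time.
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        stencil = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in stencil.items():
            for j, dj in stencil.items():
                A[i][j] += lam * di * dj
    b = [float(v) for v in y]
    # Gaussian elimination without pivoting; safe here because A is
    # symmetric positive definite for lam >= 0.
    for k in range(n):
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    # Back substitution.
    m = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * m[j] for j in range(i + 1, n))
        m[i] = s / A[i][i]
    return m


if __name__ == "__main__":
    # Exactly linear data incur zero penalty, so the fit reproduces them.
    print(second_difference_smoother([1, 3, 5, 7, 9], lam=10.0))
```

Because the second differences of a straight line vanish, linear data pass through unchanged for any penalty weight, which mirrors the abstract's remark that the cubic smoothing spline favors the linear parametric model.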
Two likelihood-based semiparametric estimation methods for panel count data with covariates
, 2005
Cited by 11 (7 self)
We consider estimation in a particular semiparametric regression model for the mean of a counting process with “panel count” data. The basic model assumption is that the conditional mean function of the counting process is of the form $E\{N(t) \mid Z\} = \exp(\beta_0^T Z)\,\Lambda_0(t)$, where $Z$ is a vector of covariates and $\Lambda_0$ is the baseline mean function. The “panel count” observation scheme involves observation of the counting process $N$ for an individual at a random number $K$ of random time points; both the number and the locations of these time points may differ across individuals. We study semiparametric maximum pseudo-likelihood and maximum likelihood estimators of the unknown parameters $(\beta_0, \Lambda_0)$ derived on the basis of a nonhomogeneous Poisson process assumption. The pseudo-likelihood estimator is fairly easy to compute, while the maximum likelihood estimator poses more challenges from the computational perspective. We study asymptotic properties of both estimators assuming that the proportional mean model holds, but dropping the Poisson process assumption used to derive the estimators. In particular we establish asymptotic normality for the estimators of the regression parameter $\beta_0$ under appropriate hypotheses. The results show that our estimation procedures are robust in the sense that the estimators converge to the truth regardless of the underlying counting process.
A Quantification Of Distance-Bias Between Evaluation Metrics In Classification
- In Proceedings of the 17th International Conference on Machine Learning
, 2000
Cited by 10 (2 self)
This paper provides a characterization of bias for evaluation metrics in classification (e.g., Information Gain, Gini, χ², etc.). Our characterization provides a uniform representation for all traditional evaluation metrics. Such representation leads naturally to a measure for the distance between the bias of two evaluation metrics. We give a practical value to our measure by observing if the distance between the bias of two evaluation metrics correlates with differences in predictive accuracy when we compare two versions of the same learning algorithm that differ in the evaluation metric only. Experiments on real-world domains show how the expectations on accuracy differences generated by the distance-bias measure correlate with actual differences when the learning algorithm is simple (e.g., search for the best single-feature or the best single-rule). The correlation, however, weakens with more complex algorithms (e.g., learning decision trees). Our results sh...
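For readers unfamiliar with the metrics named in this abstract, the sketch below computes two of them, information gain and Gini gain, for a candidate split using their standard textbook definitions. It does not reproduce the paper's uniform representation or its distance-bias measure; the function names and the class-count input layout are assumptions made for illustration.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)


def gini(counts):
    """Gini impurity of a list of class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def split_gain(impurity, parent, children):
    """Impurity of the parent minus the size-weighted impurity of the children.

    parent: class counts before the split.
    children: one list of class counts per branch of the split.
    """
    n = sum(parent)
    weighted = sum(sum(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted


if __name__ == "__main__":
    parent = [5, 5]
    perfect = [[5, 0], [0, 5]]   # each branch is pure
    useless = [[3, 3], [2, 2]]   # each branch keeps the parent's class mix
    for name, metric in (("information gain", entropy), ("Gini gain", gini)):
        print(name, split_gain(metric, parent, perfect), split_gain(metric, parent, useless))
```

Swapping the `impurity` argument while holding the learner fixed is the kind of comparison the abstract describes, where two versions of the same algorithm differ only in the evaluation metric.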
On Locally Uniformly Linearizable High Breakdown Location and Scale Functionals
, 1998
Cited by 8 (3 self)
this paper and the standard one model situation of robust statistics. They consider a finite number of models or challenges and look for a procedure which performs well at all of them. The hope is that such a procedure will also perform reasonably well for challenges which lie between. For a given sample a likelihood based compromise between the two challenges is made. The use of likelihood means that the method of Morgenthaler and Tukey does not satisfy DP5. In Section 6 we show how it is possible to "coarsen" a large class of distributions by reducing them to a finite sample of m points which themselves satisfy DP5. These points can be used to decide between a finite set of challenges and hence to make the weights of the weighted mean depend on the shape of the sample but in a differentiable manner. 3 Local uniform linearity
Some Dynamic Graphics for Spatial Data (with Multiple Attributes) in a GIS
, 1994
Cited by 6 (6 self)
This paper discusses some multivariate exploratory spatial data analysis tools for detecting spatial dependence. The ideas explored are related to canonical correlation analysis and the graphical tools are related to the dynamic method called the grand tour. The work is implemented with a link between a Geographic Information System, ARC/INFO, and software for exploring multivariate data, XGobi.

