Results 1–10 of 27
Calibration and Empirical Bayes Variable Selection
Biometrika, 1997
Abstract
Cited by 184 (21 self)
this paper, is that with F = 2 log p. This choice was proposed by Foster & George (1994), where it was called the Risk Inflation Criterion (RIC) because it asymptotically minimises the maximum predictive risk inflation due to selection when X is orthogonal. This choice and its minimax property were also discovered independently by Donoho & Johnstone (1994) in the wavelet regression context, where they refer to it as the universal hard thresholding rule.
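The F = 2 log p rule described above corresponds, in an orthogonal design with unit noise, to hard-thresholding the coefficient estimates at sqrt(2 log p). A minimal sketch of that rule, assuming known noise level sigma; the function name and setup are ours, not the paper's:

```python
import numpy as np

def universal_hard_threshold(z, sigma=1.0):
    """Hard-threshold coefficient estimates at t = sigma * sqrt(2 log p).

    Illustrative sketch of the RIC / universal-threshold rule for an
    orthogonal design with known noise level sigma.
    """
    z = np.asarray(z, dtype=float)
    p = len(z)
    t = sigma * np.sqrt(2.0 * np.log(p))
    return np.where(np.abs(z) > t, z, 0.0)
```

With p = 4 the threshold is sqrt(2 ln 4) ≈ 1.67, so `universal_hard_threshold([0.1, 5.0, -4.0, 0.0])` keeps only the two large coefficients and zeroes the rest.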
The variable selection problem
Journal of the American Statistical Association, 2000
Abstract
Cited by 62 (3 self)
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments which have led to the wide variety of approaches for this problem.
Bayesian Deviance, the Effective Number of Parameters, and the Comparison of Arbitrarily Complex Models
1998
Abstract
Cited by 51 (8 self)
We consider the problem of comparing complex hierarchical models in which the number of parameters is not clearly defined. We follow Dempster in examining the posterior distribution of the log-likelihood under each model, from which we derive measures of fit and complexity (the effective number of parameters). These may be combined into a Deviance Information Criterion (DIC), which is shown to have an approximate decision-theoretic justification. Analytic and asymptotic identities reveal the measure of complexity to be a generalisation of a wide range of previous suggestions, with particular reference to the neural network literature. The contributions of individual observations to fit and complexity can give rise to a diagnostic plot of deviance residuals against leverages. The procedure is illustrated in a number of examples, and throughout it is emphasised that the required quantities are trivial to compute in a Markov chain Monte Carlo analysis, and require no analytic work for new...
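The abstract's claim that the required quantities are trivial to compute from MCMC output can be sketched directly: DIC = D̄ + p_D with p_D = D̄ − D(θ̄), where D(θ) = −2 log L(θ). A hedged sketch, with a toy conjugate-normal check; variable names are ours:

```python
import numpy as np

def dic(deviance_draws, deviance_at_mean):
    """DIC from MCMC output: D_bar + p_D, with p_D = D_bar - D(theta_bar).

    deviance_draws: D(theta_i) = -2 log L(theta_i) at each posterior draw.
    deviance_at_mean: deviance evaluated at the posterior mean.
    Returns (DIC, p_D), p_D being the effective number of parameters.
    """
    d_bar = float(np.mean(deviance_draws))
    p_d = d_bar - deviance_at_mean
    return d_bar + p_d, p_d

# Toy check: normal mean model, flat prior, known unit variance.
rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=50)
mu_draws = rng.normal(y.mean(), 1.0 / np.sqrt(len(y)), size=4000)

def deviance(mu):
    return -2.0 * np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (y - mu) ** 2)

devs = np.array([deviance(m) for m in mu_draws])
dic_value, p_d = dic(devs, deviance(mu_draws.mean()))
```

Here `p_d` comes out close to 1, matching the single free parameter of the toy model.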
Estimating the integrated likelihood via posterior simulation using the harmonic mean identity
Bayesian Statistics, 2007
Abstract
Cited by 48 (2 self)
The integrated likelihood (also called the marginal likelihood or the normalizing constant) is a central quantity in Bayesian model selection and model averaging. It is defined as the integral over the parameter space of the likelihood times the prior density. The Bayes factor for model comparison and Bayesian testing is a ratio of integrated likelihoods, and the model weights in Bayesian model averaging are proportional to the integrated likelihoods. We consider the estimation of the integrated likelihood from posterior simulation output, aiming at a generic method that uses only the likelihoods from the posterior simulation iterations. The key is the harmonic mean identity, which says that the reciprocal of the integrated likelihood is equal to the posterior harmonic mean of the likelihood. The simplest estimator based on the identity is thus the harmonic mean of the likelihoods. While this is an unbiased and simulation-consistent estimator, its reciprocal can have infinite variance and so it is unstable in general. We describe two methods for stabilizing the harmonic mean estimator. In the first one, the parameter space is reduced in such a way that the modified estimator involves a harmonic mean of heavier-tailed densities, thus resulting in a finite-variance estimator. The resulting...
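The harmonic mean identity the abstract describes, 1/p(y) = E_posterior[1/p(y|θ)], can be sketched in a few lines; this is the raw estimator whose instability the paper addresses, not the stabilized versions it proposes, and the function name is ours:

```python
import numpy as np

def log_marginal_harmonic(log_lik_draws):
    """Log integrated likelihood via the harmonic mean identity,
    computed in log space with a log-sum-exp trick for stability.

    Caveat (the paper's point): the estimator's reciprocal can have
    infinite variance, so this is illustrative, not recommended practice.
    """
    a = -np.asarray(log_lik_draws, dtype=float)   # log of 1 / likelihood
    m = a.max()
    log_mean_inverse_lik = m + np.log(np.mean(np.exp(a - m)))
    return -log_mean_inverse_lik
```

Sanity check: when every posterior draw has the same log-likelihood c, the harmonic mean of the likelihoods is exp(c), so the function returns c exactly.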
Clustering time series from ARMA models with clipped data
2004
Cited by 23 (10 self)
Spline adaptation in extended linear models
Statistical Science, 2002
Abstract
Cited by 19 (2 self)
Abstract. In many statistical applications, nonparametric modeling can provide insight into the features of a dataset that are not obtainable by other means. One successful approach involves the use of (univariate or multivariate) spline spaces. As a class, these methods have inherited much from classical tools for parametric modeling. For example, stepwise variable selection with spline basis terms is a simple scheme for locating knots (breakpoints) in regions where the data exhibit strong, local features. Similarly, candidate knot configurations (generated by this or some other search technique) are routinely evaluated with traditional selection criteria like AIC or BIC. In short, strategies typically applied in parametric model selection have proved useful in constructing flexible, low-dimensional models for nonparametric problems. Until recently, greedy, stepwise procedures were most frequently suggested in the literature. Research into Bayesian variable selection, however, has given rise to a number of new spline-based methods that primarily rely on some form of Markov chain Monte Carlo to identify promising knot locations. In this paper, we consider various alternatives to greedy, deterministic schemes, and present a Bayesian framework for studying adaptation in the context of an extended linear model (ELM). Our major test cases are Logspline density estimation and (bivariate) Triogram regression models. We selected these because they illustrate a number of computational and methodological issues concerning model adaptation that arise in ELMs.
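The greedy, stepwise knot-selection strategy the abstract contrasts with the Bayesian alternatives can be sketched as forward addition of truncated-line basis terms scored by BIC. A toy sketch under our own assumptions (linear spline basis, Gaussian BIC up to a constant); names and basis choice are ours, not the paper's:

```python
import numpy as np

def bic_score(y, fitted, n_params):
    """Gaussian BIC up to an additive constant: n log(RSS/n) + k log n."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + n_params * np.log(n)

def greedy_knot_search(x, y, candidates, max_knots=5):
    """Forward stepwise knot addition for a linear spline, scored by BIC.

    At each step, add the candidate knot that most improves BIC;
    stop when no candidate improves it or max_knots is reached.
    """
    def design(knots):
        cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
        return np.column_stack(cols)

    knots = []
    X = design(knots)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    best = bic_score(y, X @ beta, X.shape[1])
    improved = True
    while improved and len(knots) < max_knots:
        improved, best_c = False, None
        for c in candidates:
            if c in knots:
                continue
            Xc = design(knots + [c])
            b = np.linalg.lstsq(Xc, y, rcond=None)[0]
            score = bic_score(y, Xc @ b, Xc.shape[1])
            if score < best:
                best, best_c, improved = score, c, True
        if improved:
            knots.append(best_c)
    return sorted(knots)
```

On data with a single kink, the search places a knot at (or near) the true breakpoint, which is exactly the "strong, local feature" behaviour the abstract describes.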
Hierarchical multilinear models for multiway data
2009
Abstract
Cited by 14 (1 self)
Reduced-rank decompositions provide descriptions of the variation among the elements of a matrix or array. In such decompositions, the elements of an array are expressed as products of low-dimensional latent factors. This article presents a model-based version of such a decomposition, extending the scope of reduced-rank methods to accommodate a variety of data types such as longitudinal social networks and continuous multivariate data that are cross-classified by categorical variables. The proposed model-based approach is hierarchical, in that the latent factors corresponding to a given dimension of the array are not a priori independent, but exchangeable. Such a hierarchical approach allows more flexibility in the types of patterns that can be represented. Matrix-valued data are prevalent in many scientific disciplines. Studies in social and health sciences often gather social network data that can be represented by square, binary matrices with undefined diagonals. Numerical results from gene expression studies are recorded in matrices with rows...
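The reduced-rank idea the abstract builds on, each matrix entry expressed as an inner product of low-dimensional latent factors, has a simple non-Bayesian analogue via the SVD; the paper's hierarchical, model-based version replaces this least-squares fit with priors on the factors. A sketch under that simplification:

```python
import numpy as np

def rank_r_factors(Y, r):
    """Least-squares rank-r decomposition Y ~ U @ V.T via the SVD,
    so each entry Y[i, j] is the inner product of a row factor U[i]
    and a column factor V[j] (each of dimension r)."""
    Us, s, Vt = np.linalg.svd(np.asarray(Y, dtype=float), full_matrices=False)
    U = Us[:, :r] * np.sqrt(s[:r])   # row latent factors, scaled
    V = Vt[:r, :].T * np.sqrt(s[:r]) # column latent factors, scaled
    return U, V
```

For a matrix that is exactly rank r, the factors reconstruct it exactly; for noisy data they give the best rank-r approximation in the least-squares sense (Eckart–Young).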
Mechanisms underlying spatial representation revealed through studies of hemispatial neglect
Journal of Cognitive Neuroscience, 2002
Abstract
Cited by 14 (7 self)
The representations that mediate the coding of spatial position were examined by comparing the behavior of patients with left hemispatial neglect with that of non-neurological control subjects. To determine the spatial coordinate system(s) used to define "left" and "right," eye movements were measured for targets that appeared at 5°, 10°, and 15° to the relative left or right defined with respect to the midline of the eyes, head, or midsagittal plane of the trunk. In the baseline condition, in which the various egocentric midlines were all aligned with the environmental midline, patients were disproportionately slower at initiating saccades to left than right targets, relative to the controls. When either the trunk or the head was rotated and the midline aligned with the most peripheral position while the eyes remained aligned with the midline of the environment, the results did not differ from the baseline condition. However, when the eyes were rotated and the midline aligned with the peripheral position, saccadic reaction time (SRT) differed significantly from the baseline, especially when the eyes were rotated to the right. These findings suggest that target position is coded relative to the current position of gaze (oculocentrically) and that this eye-centered coding is modulated by orbital position (eye-in-head signal). The findings dovetail well with results from existing neurophysiological studies and shed further light on the spatial representations mediated by the human parietal cortex.
Mixture of latent trait analyzers for model-based clustering of categorical data
Statistics and Computing, 2013
Abstract
Cited by 5 (0 self)
Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone.
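The analytically intractable integral the abstract mentions can be made concrete with plain Monte Carlo integration over the latent trait; the paper itself fits the model variationally, so this sketch is only for illustration and all names and shapes are ours:

```python
import numpy as np

def mlta_loglik(y, weights, intercepts, loadings, n_draws=4000, seed=0):
    """Monte Carlo log-likelihood of one binary response vector y under a
    mixture of latent trait analyzers:

        p(y) = sum_g pi_g * Int prod_m Bern(y_m | s(b_gm + w_gm . z)) N(z; 0, I) dz

    with s the logistic function. Shapes: weights (G,), intercepts (G, M),
    loadings (G, M, D); y is a length-M 0/1 vector.
    """
    rng = np.random.default_rng(seed)
    G, M = intercepts.shape
    D = loadings.shape[2]
    z = rng.standard_normal((n_draws, D))              # draws from N(0, I)
    comp_logs = []
    for g in range(G):
        logits = intercepts[g] + z @ loadings[g].T     # (n_draws, M)
        # Bernoulli log-probs y*eta - log(1 + e^eta), computed stably.
        ll = y * logits - np.logaddexp(0.0, logits)
        per_draw = ll.sum(axis=1)
        m = per_draw.max()                             # log-sum-exp over draws
        comp_logs.append(np.log(weights[g]) + m
                         + np.log(np.mean(np.exp(per_draw - m))))
    comp_logs = np.array(comp_logs)                    # log-sum-exp over classes
    m = comp_logs.max()
    return m + np.log(np.sum(np.exp(comp_logs - m)))
```

With zero loadings the trait drops out and the model reduces to latent class analysis, which gives an exact value to check against: a single class with zero intercepts assigns probability 0.5 to each of M items.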