Sequential Monte Carlo Methods for Dynamic Systems
- Journal of the American Statistical Association
, 1998
Cited by 340 (4 self)
A general framework for using Monte Carlo methods in dynamic systems is provided and its wide applications indicated. Under this framework, several currently available techniques are studied and generalized to accommodate more complex features. All of these methods are partial combinations of three ingredients: importance sampling and resampling, rejection sampling, and Markov chain iterations. We deliver a guideline on how they should be used and under what circumstance each method is most suitable. Through the analysis of differences and connections, we consolidate these methods into a generic algorithm by combining desirable features. In addition, we propose a general use of Rao-Blackwellization to improve performances. Examples from econometrics and engineering are presented to demonstrate the importance of Rao-Blackwellization and to compare different Monte Carlo procedures. Keywords: Blind deconvolution; Bootstrap filter; Gibbs sampling; Hidden Markov model; Kalman filter; Markov...
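The "importance sampling and resampling" ingredient named in this abstract can be illustrated with a bootstrap filter, which is listed among the keywords. The following is a minimal sketch, not the paper's generic algorithm: the AR(1) state model, unit noise variances, coefficient `a`, and the function name `bootstrap_filter` are all assumptions chosen for illustration.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, a=0.9, seed=0):
    """Bootstrap particle filter for an assumed toy model:
         x_t = a * x_{t-1} + N(0, 1)   (state)
         y_t = x_t + N(0, 1)           (observation)
    Returns the filtered mean estimate of x_t at each time step."""
    rng = random.Random(seed)
    # Initial particles drawn from an assumed N(0, 1) prior.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtered_means = []
    for y in observations:
        # Importance sampling: propagate each particle through the state model.
        particles = [a * x + rng.gauss(0.0, 1.0) for x in particles]
        # Importance weights: Gaussian observation likelihood, up to a constant.
        weights = [math.exp(-0.5 * (y - x) ** 2) for x in particles]
        total = sum(weights)
        # Weighted average approximates E[x_t | y_1..y_t].
        filtered_means.append(
            sum(w * x for w, x in zip(weights, particles)) / total
        )
        # Resampling: draw particles with probability proportional to weight.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means


if __name__ == "__main__":
    means = bootstrap_filter([5.0] * 20)
    print(round(means[-1], 2))
```

Resampling at every step is the simplest choice; it keeps the particle set concentrated where the likelihood is high at the cost of extra Monte Carlo noise.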
Posterior Predictive Assessment of Model Fitness Via Realized Discrepancies
- Statistica Sinica
, 1996
Cited by 124 (25 self)
Abstract: This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior predictive assessment, in a framework that also includes conditioning on auxiliary statistics. The Bayesian formulation facilitates the construction and calculation of a meaningful reference distribution not only for any (classical) statistic, but also for any parameter-dependent “statistic” or discrepancy. The latter allows us to propose the realized discrepancy assessment of model fitness, which directly measures the true discrepancy between data and the posited model, for any aspect of the model which we want to explore. The computation required for the realized discrepancy assessment is a straightforward byproduct of the posterior simulation used for the original Bayesian analysis. We illustrate with three applied examples. The first example, which serves mainly to motivate the work, illustrates the difficulty of classical tests in assessing the fitness of a Poisson model to a positron emission tomography image that is constrained to be nonnegative. The second and third examples illustrate the details of the posterior predictive approach in two problems: estimation in a model with inequality constraints on the parameters, and estimation in a mixture model. In all three examples, standard test statistics (either a χ² or a likelihood ratio) are not pivotal: the difficulty is not just how to compute the reference distribution for the test, but that in the classical framework no such distribution exists, independent of the unknown model parameters. Key words and phrases: Bayesian p-value, χ² test, discrepancy, graphical assessment, mixture model, model criticism, posterior predictive p-value, prior predictive
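The realized discrepancy assessment described in this abstract compares a parameter-dependent discrepancy on the observed data with the same discrepancy on replicated data, draw by draw from the posterior. A minimal sketch follows, under assumptions that are not from the paper: a normal model with known unit variance, a flat prior on the mean (so the posterior is N(ȳ, 1/n)), a sum-of-squares discrepancy, and the hypothetical function name `realized_discrepancy_pvalue`.

```python
import random


def realized_discrepancy_pvalue(y, n_draws=2000, seed=0):
    """Posterior predictive p-value for an assumed model y_i ~ N(theta, 1)
    with a flat prior on theta, so theta | y ~ N(mean(y), 1/n).
    Discrepancy (assumed): D(y, theta) = sum_i (y_i - theta)^2."""
    rng = random.Random(seed)
    n = len(y)
    y_bar = sum(y) / n

    def discrepancy(data, theta):
        return sum((v - theta) ** 2 for v in data)

    exceed = 0
    for _ in range(n_draws):
        # One posterior draw of the parameter.
        theta = rng.gauss(y_bar, (1.0 / n) ** 0.5)
        # Replicated data set generated under that draw.
        y_rep = [rng.gauss(theta, 1.0) for _ in range(n)]
        # Compare replicated and realized discrepancies at the same theta.
        if discrepancy(y_rep, theta) >= discrepancy(y, theta):
            exceed += 1
    return exceed / n_draws


if __name__ == "__main__":
    # Data far more dispersed than unit variance: expect a p-value near 0.
    print(realized_discrepancy_pvalue([-10.0, 10.0, -10.0, 10.0]))
```

Because both discrepancies are evaluated at the same posterior draw, the loop reuses draws that a posterior simulation would already produce, which is the "byproduct" property the abstract mentions.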
Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
- American Political Science Review
, 2000
Cited by 88 (35 self)
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The reason for this discrepancy lies with the fact that the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise. In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, ...
Learning from incomplete data
, 1994
Cited by 49 (0 self)
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977)---both for the estimation of mixture components and for coping with the missing data.
Parameter Expansion for Data Augmentation
- Journal of the American Statistical Association
, 1999
Cited by 48 (1 self)
Viewing the observed data of a statistical model as incomplete and augmenting its missing parts are useful for clarifying concepts and central to the invention of two well-known statistical algorithms: expectation-maximization (EM) and data augmentation. Recently, Liu, Rubin, and Wu (1998) demonstrate that expanding the parameter space along with augmenting the missing data is useful for accelerating iterative computation in an EM algorithm. The main purpose of this article is to rigorously define a parameter expanded data augmentation (PX-DA) algorithm and to study its theoretical properties. The PX-DA is a special way of using auxiliary variables to accelerate Gibbs sampling algorithms and is closely related to reparameterization techniques. Theoretical results concerning the convergence rate of the PX-DA algorithm and the choice of prior for the expansion parameter are obtained. In order to understand the role of the expansion parameter, we establish a new theory for iterative condi...
A Foundation for Capturing and Querying Complex Multidimensional Data
- Information Systems
, 2001
Cited by 41 (10 self)
On-line analytical processing (OLAP) systems considerably improve data analysis and are finding wide-spread use. OLAP systems typically employ multidimensional data models to structure their data. This paper identifies 11 modeling requirements for multidimensional data models. These requirements are derived from an assessment of complex data found in real-world applications. A survey of 14 multidimensional data models reveals shortcomings in meeting some of the requirements. Existing models do not support many-to-many relationships between facts and dimensions, lack built-in mechanisms for handling change and time, lack support for imprecision, and are generally unable to insert data with varying granularities. This paper defines an extended multidimensional data model and algebraic query language that address all 11 requirements. The model reuses the common multidimensional concepts of dimension hierarchies and granularities to capture imprecise data. For queries that cannot be answere...
Parameter expansion to accelerate EM: The PX-EM algorithm
, 1998
Cited by 32 (6 self)
The EM algorithm and its extensions are popular tools for modal estimation but are often criticised for their slow convergence. We propose a new method that can often make EM much faster. The intuitive idea is to use a 'covariance adjustment' to correct the analysis of the M step, capitalising on extra information captured in the imputed complete data. The way we accomplish this is by parameter expansion; we expand the complete-data model while preserving the observed-data model and use the expanded complete-data model to generate EM. This parameter-expanded EM, PX-EM, algorithm shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis. The PX-EM algorithm is illustrated for the multivariate t distribution, a random effects model, factor analysis, probit regression and a Poisson imaging model.
Multiple imputation for multivariate missing-data problems: a data analyst's perspective
- Multivariate Behavioral Research
, 1998
Cited by 30 (0 self)
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m>1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from
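The "simple rules" for combining the m complete-data analyses are presumably the standard combining rules from Rubin (1987): average the point estimates, and add the average within-imputation variance to an inflated between-imputation variance. A minimal sketch for a scalar estimand, where the function name `pool_rubin` and the example numbers are assumptions for illustration:

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data results for a scalar quantity.
    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m > 1 estimates with matching variances")
    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # within-imputation variance
    b = variance(estimates)   # between-imputation variance (m - 1 denominator)
    # Total variance: T = U_bar + (1 + 1/m) * B
    total = u_bar + (1.0 + 1.0 / m) * b
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical results from m = 3 imputed data sets.
    estimate, std_err = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(estimate, round(std_err, 4))
```

The (1 + 1/m) factor accounts for using a finite number of imputations; p-values would additionally need the reference t distribution's degrees of freedom, which this sketch omits.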
Multiple Imputation in the Survey of Consumer Finances
- Proceedings of the Section on Business and Economic Statistics, 1998 Annual Meetings of the American Statistical Association
, 1998
Cited by 27 (3 self)
The views presented in this paper are those of the author alone and do not necessarily reflect
“Imputation of the 1989 Survey of Consumer Finances: Stochastic Relaxation and Multiple Imputation,” mimeo, Board of Governors of the Federal Reserve System
- 1991 Proceedings of the Section on Survey Research Methods, Annual Meetings of the American Statistical Association
, 1991
Cited by 22 (7 self)
acknowledges the support for this work by staff in the Division of Research and Statistics including

