Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation
American Political Science Review, 2000
Cited by 90 (35 self)
We propose a remedy for the discrepancy between the way political scientists analyze data with missing values and the recommendations of the statistics community. Methodologists and statisticians agree that "multiple imputation" is a superior approach to the problem of missing data scattered through one's explanatory and dependent variables than the methods currently used in applied data analysis. The reason for this discrepancy lies with the fact that the computational algorithms used to apply the best multiple imputation models have been slow, difficult to implement, impossible to run with existing commercial statistical packages, and demanding of considerable expertise. In this paper, we adapt an existing algorithm, and use it to implement a general-purpose, multiple imputation model for missing data. This algorithm is considerably faster and easier to use than the leading method recommended in the statistics literature. We also quantify the risks of current missing data practices, ...
Robust Portfolio Selection Problems
Mathematics of Operations Research, 2001
Cited by 61 (7 self)
In this paper we show how to formulate and solve robust portfolio selection problems. The objective of these robust formulations is to systematically combat the sensitivity of the optimal portfolio to statistical and modeling errors in the estimates of the relevant market parameters. We introduce "uncertainty structures" for the market parameters and show that the robust portfolio selection problems corresponding to these uncertainty structures can be reformulated as second-order cone programs and, therefore, the computational effort required to solve them is comparable to that required for solving convex quadratic programs. Moreover, we show that these uncertainty structures correspond to confidence regions associated with the statistical procedures used to estimate the market parameters. We demonstrate a simple recipe for efficiently computing robust portfolios given raw market data and a desired level of confidence.
Clarify: Software for Interpreting and Presenting Statistical Results (2001)
Journal of Statistical Software
Cited by 30 (0 self)
and distribute this program provided that no charge is made and the copy is identical to the original. To request an exception, please contact Michael Tomz.
Multiple imputation for multivariate missing-data problems: a data analyst's perspective
Multivariate Behavioral Research, 1998
Cited by 30 (0 self)
Analyses of multivariate data are frequently hampered by missing values. Until recently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and computational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simulation technique that replaces each missing datum with a set of m>1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from
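The "simple rules" for combining results across the m completed datasets are usually taken to be Rubin's combining rules. The sketch below illustrates them for a single scalar estimate; it is not code from the article, and the function name `pool_rubin` and its argument names are hypothetical. It assumes each completed dataset has already been analyzed, yielding one point estimate and one squared standard error per dataset.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m complete-data results for one scalar quantity (Rubin's rules).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")

    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, math.sqrt(total)
```

The between-imputation term is what carries the missing-data uncertainty into the standard error: if all m imputations gave the same estimate, the pooled standard error would reduce to the ordinary complete-data one.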
Multiple Imputation and Disclosure Protection: The Case of the 1995 Survey of Consumer Finances
1997, presented at ‘98
Cited by 19 (4 self)
Recent developments in record linkage technology together with vast increases in the amount of personally identified information available in machine readable form raise serious concerns about the future of public use datasets. One possibility raised by Rubin [1993] is to release only simulated data created by multiple imputation techniques using the actual data. This paper uses the multiple imputation software developed for the Survey of Consumer Finances (Kennickell [1991]) to develop a series of experimental simulated versions of the 1995 survey data.
Learning Gaussian process kernels via hierarchical Bayes
Advances in Neural Information Processing Systems (NIPS), 2004
Cited by 19 (0 self)
We present a novel method for learning with Gaussian process regression in a hierarchical Bayesian framework. In a first step, kernel matrices on a fixed set of input points are learned from data using a simple and efficient EM algorithm. This step is nonparametric, in that it does not require a parametric form of covariance function. In a second step, kernel functions are fitted to approximate the learned covariance matrix using a generalized Nyström method, which results in a complex, data-driven kernel. We evaluate our approach as a recommendation engine for art images, where the proposed hierarchical Bayesian method leads to excellent prediction performance.
What Determines Individual Trade Policy Preferences
Stata 8 User’s Guide and Base Reference Manual. Stata, 2001
Cited by 17 (3 self)
This paper provides new evidence on the determinants of individual trade-policy preferences using individual-level survey data for the United States. There are two main empirical results. First, we find that factor type dominates industry of employment in explaining support for trade barriers. Second, we find that home ownership also matters for individuals' trade-policy preferences. Independent of factor type, home ownership in counties with a manufacturing mix concentrated in comparative-disadvantage industries is correlated with support for trade barriers. This finding suggests that in addition to current factor incomes driving preferences as in standard trade models, preferences also depend on asset values.
Multiple Imputation in Practice: Comparison of Software Packages for Regression Models With Missing Variables
Cited by 17 (0 self)
This article reviews multiple imputation, describes assumptions that it requires, and reviews software packages that implement this procedure. We apply the methods and compare the results using two examples: a child psychopathology dataset with missing outcomes and an artificial dataset with missing covariates. We conclude with some discussion of the strengths and weaknesses of these implementations as well as advantages and limitations of imputation
Not Asked Or Not Answered: Multiple Imputation for Multiple Surveys
Journal of the American Statistical Association, 1998
Cited by 16 (7 self)
We present a method of analyzing a series of independent cross-sectional surveys in which some questions are not answered in some surveys and some respondents do not answer some of the questions posed. The method is also applicable to a single survey in which different questions are asked, or different sampling methods used, in different strata or clusters. Our method involves multiply-imputing the missing items and questions by adding to existing methods of imputation designed for single surveys a hierarchical regression model that allows covariates at the individual and survey levels. Information from survey weights is exploited by including in the analysis the variables on which the weights were based, and then reweighting individual responses (observed and imputed) to estimate population quantities. We also develop diagnostics for checking the fit of the imputation model based on comparing imputed to nonimputed data. We illustrate with the example that motivated this project --- a ...
Cross-sell: A Fast Promotion-Tunable Customer-item Recommendation Method Based on Conditionally Independent Probabilities
Proceedings of ACM SIGKDD International Conference, 2000
"... We develop a method for recommending products to customers with applications to both on-line and surface mail promotional offers. Our method differs from previous work in collaborative filtering [8] and imputation [18], in that we assume probabilities are conditionally independent. This assumption, ..."
Abstract
-
Cited by 13 (0 self)
- Add to MetaCart
We develop a method for recommending products to customers with applications to both on-line and surface mail promotional offers. Our method differs from previous work in collaborative filtering [8] and imputation [18], in that we assume probabilities are conditionally independent. This assumption, which is also made in Naïve Bayes [5], enables us to pre-compute probabilities and store them in main memory, enabling very fast performance on millions of customers. The algorithm supports a variety of tunable parameters so that the method can address different promotional objectives. We tested the algorithm at an on-line hardware retailer, with 17,400 customers divided randomly into control and experimental groups. In the experimental group, clickthrough increased by +40% (p<0.01), revenue by +38% (p<0.07), and units sold by +61% (p<0.01). By changing the algorithm's parameter settings we found that these results could be improved even further. This work demonstrates the considerable potent...
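To make the conditional-independence idea concrete, here is a minimal Naive-Bayes-style scoring sketch. It is not the paper's algorithm, whose details and tunable parameters the abstract does not give. The functions `fit`, `score`, and `recommend`, the smoothing constant `alpha`, and the basket data format are all assumptions made for illustration. Counts are computed once up front, so scoring a customer only needs table lookups.

```python
from collections import Counter
from itertools import permutations


def fit(baskets):
    """Precompute item counts and ordered co-purchase counts from baskets."""
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = set(basket)
        item_count.update(items)
        pair_count.update(permutations(items, 2))  # all ordered pairs (a, b)
    return len(baskets), item_count, pair_count


def score(model, owned, candidate, alpha=1.0):
    """Unnormalized P(candidate) * prod_i P(i | candidate), Laplace-smoothed."""
    n, item_count, pair_count = model
    s = (item_count[candidate] + alpha) / (n + 2 * alpha)
    for item in owned:
        # Owned items are treated as conditionally independent given the candidate.
        s *= (pair_count[(candidate, item)] + alpha) / (
            item_count[candidate] + 2 * alpha
        )
    return s


def recommend(model, owned, k=3):
    """Return the k highest-scoring items the customer does not already own."""
    _, item_count, _ = model
    candidates = [item for item in item_count if item not in owned]
    ranked = sorted(candidates, key=lambda c: (-score(model, owned, c), c))
    return ranked[:k]
```

Because the per-item and per-pair counts are fixed after `fit`, they can be held in memory and reused for every customer, which is the property the abstract credits for fast performance.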

